8709 Commits

Author SHA1 Message Date
Carl Worth
0da7d59ac2 docs: Add MD5 sums for the 10.0.5 release.
These can be generated only after the release has been tarred up and tagged.
2014-04-18 17:02:17 -07:00
Carl Worth
c941373838 docs: Add release notes for 10.0.5 2014-04-18 16:51:02 -07:00
Carl Worth
c78a676998 Update version to 10.0.5
In preparation for the 10.0.5 release, of course.
2014-04-18 16:48:06 -07:00
Eric Anholt
5e718c11c6 i965: Fix buffer overruns in MSAA MCS buffer clearing.
This manifested as rendering failures or sometimes GPU hangs in
compositors when they accidentally got MSAA visuals due to a bug in the X
Server.  Today we decided that the problem in compositors was equivalent
to a corruption bug we'd noticed recently in resizing MSAA-visual
glxgears, and debugging got a lot easier.

When we allocate our MCS MT, libdrm takes the size we request, aligns it
to Y tile size (blowing it up from 300x300=90000 bytes to 384*320=122880
bytes, 30 pages), then puts it into a power-of-two-sized BO (131072 bytes,
32 pages).  Because it's Y tiled, we attach a 384-byte-stride fence to it.
When we memset by the BO size in Mesa, between bytes 122880 and 131072 the
data gets stored to the first 20 or so scanlines of each of the 3 tiled
pages in that row, even though only 2 of those pages were allocated by
libdrm.  In the glxgears case, the missing 3rd page happened to
consistently be the static VBO that got mapped right after the first MCS
allocation, so corruption only appeared once window resize made us throw
out the old MCS and then allocate the same BO to back the new MCS.

Instead, just memset the amount of data we actually asked libdrm to
allocate for, which will be smaller (more efficient) and not overrun.
Thanks go to Kenneth for doing most of the hard debugging to eliminate a
lot of the search space for the bug.
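
The size arithmetic in this message can be checked with a short sketch. It assumes a Y tile of 128 bytes by 32 rows and a 4096-byte page; the helper names are illustrative and are not libdrm or Mesa functions.

```python
# Illustrative only: these helpers model the arithmetic, not libdrm's API.
Y_TILE_WIDTH_BYTES = 128   # assumed Y-tile width in bytes
Y_TILE_HEIGHT_ROWS = 32    # assumed Y-tile height in rows
PAGE_SIZE = 4096


def align(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def next_power_of_two(n):
    """Smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p


def mcs_sizes(width_bytes, height_rows):
    """Return (requested, tile_aligned, bo_size) in bytes."""
    requested = width_bytes * height_rows
    tile_aligned = (align(width_bytes, Y_TILE_WIDTH_BYTES)
                    * align(height_rows, Y_TILE_HEIGHT_ROWS))
    return requested, tile_aligned, next_power_of_two(tile_aligned)


requested, tile_aligned, bo_size = mcs_sizes(300, 300)
# Bytes that a BO-sized memset writes beyond the tile-aligned surface.
overrun = bo_size - tile_aligned
```

Under these assumptions the 300x300 request is 90000 bytes, the tile-aligned surface is 122880 bytes (30 pages), and the BO is 131072 bytes (32 pages), so a memset over the whole BO touches 8192 bytes past the surface.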

Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77207
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 7ae870211d)
2014-04-16 10:21:09 -07:00
Paul Berry
b2f14a6284 i965/gen7: Prefer vertical alignment of 4 when possible.
Gen6+ allows for color buffers to use a vertical alignment of either 4
or 2.  Previously we defaulted to 2.  This may have caused problems on
Gen7 because Y-tiled render targets are not allowed to use a vertical
alignment of 2.

This patch changes the vertical alignment to 4 on Gen7, except for the
few formats where a vertical alignment of 2 is required.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 6b40dd17cf)
2014-04-16 10:10:19 -07:00
Emil Velikov
498853b9fd glx: drop obsolete _XUnlock_Mutex in __glXInitialize error path
With commit 1f1928db001 (glx: Drop _Xglobal_lock while we create and
initialize glx display) we split the big _Xglobal_lock handling into
finer-grained sections.

Unfortunately we forgot to drop the unlock_mutex on the error paths,
leading to undefined behaviour as the mutex is already unlocked.
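
A rough model of the locking pattern, assuming the lock is held only around the cache lookup and the final publish step. The names `global_lock`, `displays` and `glx_initialize` are hypothetical stand-ins, not the libGL symbols.

```python
import threading

# Hypothetical stand-ins for _Xglobal_lock and the per-display cache.
global_lock = threading.Lock()
displays = {}


def glx_initialize(name, create_display):
    # Step 1: consult the cache while holding the lock.
    with global_lock:
        if name in displays:
            return displays[name]

    # Step 2: the lock is not held while the display is created.
    dpy = create_display(name)
    if dpy is None:
        # Error path: the lock is already released, so there is nothing
        # to unlock. A stray global_lock.release() here would raise
        # RuntimeError in Python; in C the behaviour is undefined.
        return None

    # Step 3: retake the lock only to publish the result.
    with global_lock:
        return displays.setdefault(name, dpy)
```

The point of the sketch is the error path in step 2: once the lock is scoped to the lookup and publish steps, a failure in between must return without touching it.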

Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: "9.2 10.0 10.1"  <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit f9832f960f)
2014-04-14 15:05:36 -07:00
Brian Paul
45fd1d336a svga: move LIST_INITHEAD(dirty_buffers) earlier in svga_context_create()
Fixes a crash in svga_context_flush_buffers() if we use the 'draw' module
for AA lines (when the device doesn't support that feature).  We need to
initialize this list before we set up the swtnl pieces.

Found/fixed by Charmaine Lee.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
(cherry picked from commit e853ade544)

Conflicts:
	src/gallium/drivers/svga/svga_context.c
2014-04-14 15:05:35 -07:00
Courtney Goeltzenleuchter
cbaaf8fe42 mesa: add bounds checking to eliminate buffer overrun
Decompressing ETC2 textures was causing intermittent segfaults
by copying the resulting 4x4 texel block to the destination texture
regardless of the size of the destination texture. Issue found
via an application crash in GLBenchmark 3.0's Manhattan test.

v2: add more detailed comment. Compute limit outside inner loops.
v3: add bugzilla reference
v4: Correct cc syntax in commit log
v5: really grab the right patch
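
The bounds check can be sketched as follows, assuming a flat row-major destination and a decoded 4x4 block. `copy_block` is a hypothetical helper, not the Mesa function; it computes the copy limits once, outside the inner loops, as the v2 note describes.

```python
BLOCK = 4  # ETC2 blocks are 4x4 texels


def copy_block(dst, dst_width, dst_height, block, x0, y0):
    """Copy a decoded 4x4 block into dst, clipped to the texture size."""
    # Limits are computed once, outside the inner loops.
    rows = min(BLOCK, dst_height - y0)
    cols = min(BLOCK, dst_width - x0)
    for j in range(rows):
        for i in range(cols):
            dst[(y0 + j) * dst_width + (x0 + i)] = block[j * BLOCK + i]


# A 6x6 texture: the block at (4, 4) only has room for 2x2 texels.
dst = [0] * 36
block = list(range(1, 17))
copy_block(dst, 6, 6, block, 4, 4)
```

Without the two `min()` limits the loops would index past the end of `dst`, which is the Python analogue of the overrun described in the message.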

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74988
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1, suggested v2-3]
(cherry picked from commit cb4ad13685)
2014-04-14 15:05:35 -07:00
Brian Paul
7b580a567f svga: replace sampler assertion with conditional
For TEX instructions, the set of samplers and sampler views should
be consistent.  The XA state tracker sometimes passes an inconsistent
set of samplers and sampler views.  Rather than assert and die, issue
a warning.

v2: add debugging code to detect inconsistent state.
v3: also check for null sampler in svga_state_tss.c

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
(cherry picked from commit 9bb2ec6fd1)

Conflicts:
	src/gallium/drivers/svga/svga_state_fs.c
2014-04-14 15:05:35 -07:00
Ilia Mirkin
69a777dd21 nouveau: fix firmware check on nvd7/nvd9
The kernel driver expects the class to be based on chipset generation
rather than VP generation. Make sure to pass 90b1 for NVDX chipsets
instead of 95b1.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77102
Fixes: 40dd777b33
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@ubunutu.com>
(cherry picked from commit 89c5b56be6)
2014-04-14 15:05:35 -07:00
Johannes Nixdorf
76f33938dd configure.ac: fix the detection of expat with pkg-config
The pkg-config module was called "EXPAT" instead of "expat" in
PKG_CHECK_EXISTS. This seems to have been wrong because the wrong
argument was copied from PKG_CHECK_MODULES.

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 476db98e03)
2014-04-14 15:05:35 -07:00
Brian Paul
68fef3983e cso: fix sampler view count in cso_set_sampler_views()
We want to call pipe->set_sampler_views() with count being the
maximum of the old number of sampler views and the new number.
This makes sure we null-out any old sampler views.

We already do the same thing for sampler states in single_sampler_done().
Fixes some assertions seen in the VMware driver with XA tracker.
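
A minimal sketch of the count rule, assuming the driver receives a count plus a list of views. `sampler_views_to_set` is a hypothetical helper that only models what would be handed to pipe->set_sampler_views().

```python
def sampler_views_to_set(old_count, new_views):
    """Return (count, views) so that stale slots are explicitly nulled."""
    count = max(old_count, len(new_views))
    # Slots beyond the new views are padded with None (NULL in C).
    views = list(new_views) + [None] * (count - len(new_views))
    return count, views
```

Going from three bound views to one therefore passes a count of three, with the last two entries null, rather than a count of one that would leave the old views bound.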

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Tested-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 2355a64414)
2014-04-14 15:05:35 -07:00
Brian Paul
5ab0f978b7 mesa: fix glMultiDrawArrays inside a display list
The underlying glDrawArrays() calls weren't getting compiled into
the display list.  We simply need to use the current dispatch table
so the CALL_DrawArrays() is routed to the display list save function.
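
The routing idea can be modelled with two dispatch tables. Everything here is a simplified stand-in, assuming a table is just a mapping from entry-point name to function; it is not Mesa's dispatch machinery.

```python
def multi_draw_arrays(dispatch, mode, firsts, counts):
    """Issue one DrawArrays per (first, count) pair through `dispatch`."""
    for first, count in zip(firsts, counts):
        dispatch["DrawArrays"](mode, first, count)


display_list = []   # commands recorded while compiling a list
executed = []       # commands sent straight to the driver

exec_table = {
    "DrawArrays": lambda m, f, c: executed.append((m, f, c)),
}
save_table = {
    "DrawArrays": lambda m, f, c: display_list.append(("DrawArrays", m, f, c)),
}

# While a display list is being compiled, the current table is the save table.
current_table = save_table
multi_draw_arrays(current_table, "GL_TRIANGLES", [0, 6], [3, 3])
```

Because the loop goes through the current table, both draws end up recorded in `display_list` and nothing is executed immediately.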

This patch also fixes glMultiModeDrawArraysIBM and
glMultiModeDrawElementsIBM.

Fixes the new piglit gl-1.4-dlist-multidrawarrays test.

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit e341856294)
2014-04-14 15:05:35 -07:00
Brian Paul
9cd2daa0ef st/mesa: add null pointer checking in query object functions
Don't pass null query object pointers into gallium functions.
This avoids segfaulting in the VMware driver (and others?) if the
pipe_context::create_query() call fails and returns NULL.

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 488d4c4826)
2014-04-14 15:05:35 -07:00
Brian Paul
15b2587334 mesa: fix unpack_Z32_FLOAT_X24S8() / unpack_Z32_FLOAT() mix-up
And use the z32f_x24s8 helper struct in unpack_Z32_FLOAT_X24S8().
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 1f4ebfaa88)
2014-04-14 15:05:35 -07:00
Christian König
6cc6c921b1 st/mesa: fix sampler view handling with shared textures v4
Release the references to the sampler views before
destroying the pipe context.

v2: remove TODO and unrelated change
v3: move to st_texture.[ch], rename callback, add comment
v4: fix rebase mess up and add further cleanups

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d117ddbe31)
2014-04-14 15:05:35 -07:00
José Fonseca
132df6a9a5 draw: Duplicate TGSI tokens in draw_pipe_pstipple module.
As done in draw_pipe_aaline and draw_pipe_aapoint modules.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ee89432a47)
2014-04-14 15:05:35 -07:00
Christian König
0c37a0b94d st/mesa: recreate sampler view on context change v3
With shared glx contexts it is possible that a texture is created and used
in one context and then used in another one, resulting in incorrect
sampler view usage.

v2: avoid template copy
v3: add XXX comment

Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 92e543c45d)
2014-04-14 15:05:35 -07:00
Ilia Mirkin
1184293f40 nouveau: there may not have been a texture if the fbo was incomplete
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e58071355e)
2014-04-14 15:05:34 -07:00
Ilia Mirkin
2bd3830197 nouveau: add forgotten GL_COMPRESSED_INTENSITY to texture format list
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b676df9abf)

Conflicts:
	src/mesa/drivers/dri/nouveau/nouveau_texture.c
2014-04-14 15:05:34 -07:00
Ilia Mirkin
f15356c70a mesa/main: condition GL_DEPTH_STENCIL on ARB_depth_texture
EXT_packed_depth_stencil is supported by all drivers, but
ARB_depth_texture isn't (notably nouveau_vieux). This should avoid
passing unexpected values down to ChooseTextureFormat.

The EXT_packed_depth_stencil spec does not make any explicit references
to requiring ARB_depth_texture in order to allow textures with that
format, however if there is no dependency, ARB_depth_texture would be
practically implied by the extension.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>

Note for 10.0 backport: This will produce a conflict, the solution is to
move the surrounding if as well.

(cherry picked from commit 18690995a6)

Conflicts:
	src/mesa/main/teximage.c
2014-04-14 15:05:34 -07:00
Emil Velikov
40a05673a7 mesa: return v.value_int64 when the requested type is TYPE_INT64
Fixes "Operands don't affect result" defect reported by Coverity.

Cc: "9.2 10.0 10.1"  <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit a9cf3aa208)
2014-04-14 15:05:34 -07:00
Jonathan Gray
437f291d64 gallium: add endian detection for OpenBSD
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 40214267ab)
2014-04-14 15:05:34 -07:00
Ilia Mirkin
5c4f80dca6 nv50: adjust blit_3d handling of ms output textures
This fixes some unwanted scaling when the output is multisampled.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 253314d487)

Also squashed in the following:

Revert nvc0 part of "nv50: adjust blit_3d handling of ms output textures"

The nvc0 bits don't appear to work, and I thought I had removed them
from the commit. Oops.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 897f40f25d)
2014-04-14 15:05:34 -07:00
Ilia Mirkin
860ee22480 nouveau: fix fence waiting logic in screen destroy
nouveau_fence_wait has the expectation that an external entity is
holding onto the fence being waited on, not that it is merely held onto
by the current pointer. Fixes a use-after-free in nouveau_fence_wait
when used on the screen's current fence.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75279
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 507f0230d4)

Conflicts:
	src/gallium/drivers/nouveau/nv30/nv30_screen.c
2014-04-14 15:05:34 -07:00
Matt Turner
5ad6062ee6 mesa: Wrap SSE4.1 code in #ifdef __SSE4_1__.
Because people insist on doing things like explicitly disabling SSE 4.1.

Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberger <david.heidelberger@ixit.cz>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71547
(cherry picked from commit 8d3f739383)
2014-04-14 15:05:34 -07:00
Brian Paul
063f9c6aef mesa: fix copy & paste bugs in pack_ubyte_SRGB8()
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 1e25aa4cdb)
2014-04-14 15:05:34 -07:00
Brian Paul
6490bdf358 mesa: fix copy & paste bugs in pack_ubyte_SARGB8()
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 9493fc729e)
2014-04-14 15:05:34 -07:00
Carl Worth
1062066fad Ignore patches which don't apply.
These patches all fix bugs in code that is not present in the 10.0 branch.
2014-04-14 15:05:34 -07:00
Brian Paul
fd5f0644af mesa: add unpacking code for MESA_FORMAT_Z32_FLOAT_S8X24_UINT
Fixes glGetTexImage() when converting from MESA_FORMAT_Z32_FLOAT_S8X24_UINT
to GL_UNSIGNED_INT_24_8.  Hit by the piglit
ext_packed_depth_stencil-getteximage test.

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit a12d4d0398)
2014-04-14 15:05:34 -07:00
Alex Deucher
ef793cbf6d radeon: reverse DBG_NO_HYPERZ logic
Change the flag to DBG_HYPERZ and reverse the logic
so setting the flag enables the feature.  This disables
hyperz on r600g and radeonsi by default.  It can be
enabled by setting the env var.  There are just too
many issues with certain apps so leave it disabled for
now until we sort out the issues with the problematic
apps.
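
The inversion can be illustrated with a pair of predicates. The bit values below are hypothetical; only the flag names come from the message.

```python
# Hypothetical bit values; the real definitions live in the driver.
DBG_NO_HYPERZ = 1 << 0   # old flag: set to turn HyperZ off
DBG_HYPERZ = 1 << 0      # new flag: set to turn HyperZ on


def hyperz_enabled_old(debug_flags):
    """Old logic: HyperZ is on unless the flag is set."""
    return not (debug_flags & DBG_NO_HYPERZ)


def hyperz_enabled_new(debug_flags):
    """New logic: HyperZ is off unless the flag is set."""
    return bool(debug_flags & DBG_HYPERZ)
```

With no debug flags set, the old predicate returns true and the new one returns false, which is the change in default described above.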

Bugs:
https://bugs.freedesktop.org/show_bug.cgi?id=58660
https://bugs.freedesktop.org/show_bug.cgi?id=64471
https://bugs.freedesktop.org/show_bug.cgi?id=66352
https://bugs.freedesktop.org/show_bug.cgi?id=68799
https://bugs.freedesktop.org/show_bug.cgi?id=72685
https://bugs.freedesktop.org/show_bug.cgi?id=73088
https://bugs.freedesktop.org/show_bug.cgi?id=74428
https://bugs.freedesktop.org/show_bug.cgi?id=74803
https://bugs.freedesktop.org/show_bug.cgi?id=74863
https://bugs.freedesktop.org/show_bug.cgi?id=74892
https://bugzilla.kernel.org/show_bug.cgi?id=70411

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org>
Acked-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 01e6371149)
2014-04-14 14:05:02 -07:00
Carl Worth
f6ce0eba76 docs: Add md5sums for the 10.0.4 release.
After tar files were generated.
2014-03-12 09:16:31 -07:00
Carl Worth
2cfd35186e docs: Add release notes for 10.0.4
Just prior to release.
2014-03-12 08:55:46 -07:00
Carl Worth
7494e2e50c Update version to 10.0.4
In preparation for a stable-branch release.
2014-03-12 08:52:06 -07:00
Carl Worth
c29c9947a3 get-pick-list: Update to only find patches nominated for the 10.0 branch
In early February, the 10.1 branch was created. From then on, patches that
don't specifically say "10.0" are intended for 10.1, not 10.0.
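
The nomination rule can be sketched as a filter over commit messages. This is an illustration only, assuming nominations appear on a Cc line that mentions mesa-stable; `nominated_for` is a hypothetical helper and not the get-pick-list script.

```python
import re


def nominated_for(message, branch):
    """True if a mesa-stable Cc line names `branch` explicitly."""
    # Match the branch as a whole version token, so "10.0" does not
    # match inside "10.0.5" or "110.0".
    token = re.compile(r"(?<![\d.])" + re.escape(branch) + r"(?![\d.])")
    for line in message.splitlines():
        if re.match(r"\s*cc:.*mesa-stable", line, re.IGNORECASE):
            if token.search(line):
                return True
    return False
```

A bare `Cc: mesa-stable@lists.freedesktop.org` line names no branch, so under this rule it is not picked for 10.0.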
2014-03-11 11:49:52 -07:00
Hans
518526700e mesa: don't define c99 math functions for MSVC >= 1800
Signed-off-by: Brian Paul <brianp@vmware.com>
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 837da9bdae)
2014-03-11 11:49:52 -07:00
Hans
2f9e7f6394 util: don't define isfinite(), isnan() for MSVC >= 1800
Signed-off-by: Brian Paul <brianp@vmware.com>
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bf25660325)
2014-03-11 11:49:52 -07:00
Brian Paul
9cdb86a1da softpipe: use 64-bit arithmetic in softpipe_resource_layout()
To avoid 32-bit integer overflow for large textures.  Note: we're
already doing this in llvmpipe.
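
A small model of the overflow, assuming the size is a product of width, height and bytes per pixel. Python integers do not overflow, so the masks below stand in for wrapping 32-bit and 64-bit unsigned arithmetic; the function names are illustrative.

```python
UINT32_MASK = 0xFFFFFFFF
UINT64_MASK = 0xFFFFFFFFFFFFFFFF


def layout_size_32(width, height, bytes_per_pixel):
    """Size as it would come out of wrapping 32-bit arithmetic."""
    return (width * height * bytes_per_pixel) & UINT32_MASK


def layout_size_64(width, height, bytes_per_pixel):
    """Size computed with 64-bit arithmetic."""
    return (width * height * bytes_per_pixel) & UINT64_MASK
```

For a 32768x32768 texture at 4 bytes per pixel the true size is 2^32 bytes, which wraps to 0 in the 32-bit model and is preserved in the 64-bit one.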

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 465b2c42bc)
2014-03-11 11:49:52 -07:00
Julien Cristau
4588a32dee glx/dri2: fix build failure on HURD
Patch from Debian package.

Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 6f0e2731e8)
2014-03-11 11:49:52 -07:00
Chris Forbes
aba40445c2 i965: Validate (and resolve) all the bound textures.
BRW_MAX_TEX_UNIT is the static limit on the number of textures we
support per-stage, not in total.

Core's `Unit` array is sized by MAX_COMBINED_TEXTURE_IMAGE_UNITS, which
is significantly larger, and across the various shader stages, up to
ctx->Const.MaxCombinedTextureImageUnits elements of it may be actually
used.

Fixes invisible bad behavior in piglit's max-samplers test (although
this escalated to an assertion failure on HSW with texture_view, since
non-immutable textures only have _Format set by validation.)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit befbda56a2)
2014-03-11 11:49:51 -07:00
Emil Velikov
6a81f2bc9b dri/i9*5: correctly calculate the amount of system memory
The variable name states megabytes, while we calculate the amount in
kilobytes. Correct this by dividing by the correct amount.
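
The unit mix-up comes down to the divisor. A minimal sketch, assuming the input is a byte count; the function names are illustrative and do not appear in the driver.

```python
def memory_in_kilobytes(total_bytes):
    """What the old code effectively computed."""
    return total_bytes // 1024


def memory_in_megabytes(total_bytes):
    """What a variable named in megabytes should hold."""
    return total_bytes // (1024 * 1024)


eight_gib = 8 * 1024 ** 3
```

For 8 GiB of system memory the first function yields 8388608 and the second 8192, a factor of 1024 apart.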

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit fc25956bad)
2014-03-11 11:49:51 -07:00
Tom Stellard
a02c50ef4e r600g/compute: PIPE_CAP_COMPUTE should be false for pre-evergreen GPUs
This prevents clover from using unsupported devices.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

CC: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f61e382f0a)
2014-03-10 15:34:15 -07:00
Brian Paul
af1831d003 mesa: do depth/stencil format conversion in glGetTexImage
glGetTexImage(GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8) was just
using memcpy() instead of _mesa_unpack_uint_24_8_depth_stencil_row()
to convert texels from the hardware format to the GL format.

Fixes issue reported by David Meng at Intel.  The new piglit
ext_packed_depth_stencil-getteximage test checks for this bug.

Also, add some format/type assertions.  We don't yet handle the
GL_FLOAT_32_UNSIGNED_INT_24_8_REV type.  That should be fixed in
a follow-on patch.

Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 43dee0295e)
2014-03-10 15:34:14 -07:00
Anuj Phogat
59ab5bf0a0 i965: Fix the region's pitch condition to use blitter
intelEmitCopyBlit uses a signed 16-bit integer to represent
buffer pitch, so it can only handle buffer pitches < 32k.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit b3094d9927)
2014-03-10 15:34:14 -07:00
Fredrik Höglund
85e04ad280 glx: Fix the GLXFBConfig attrib sort priorities
The sort priorities for GLX_SAMPLES and GLX_SAMPLE_BUFFERS are
not defined in GL_ARB_multisample, but they are defined in
the GLX 1.4 specification.

Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 3616e862f2)
2014-03-10 15:34:14 -07:00
Fredrik Höglund
6b2cf05192 glx: Fix the default values for GLXFBConfig attributes
The default values for GLX_DRAWABLE_TYPE and GLX_RENDER_TYPE are
GLX_WINDOW_BIT and GLX_RGBA_BIT respectively, as specified in
the GLX 1.4 specification.
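
How the defaults apply to a caller's attribute list can be sketched as below. The constants are the values from glx.h; `effective_attribs` is a hypothetical helper, and the flat zero-terminated list mirrors the form passed to glXChooseFBConfig.

```python
GLX_DRAWABLE_TYPE = 0x8010
GLX_RENDER_TYPE = 0x8011
GLX_WINDOW_BIT = 0x00000001
GLX_RGBA_BIT = 0x00000001

DEFAULTS = {
    GLX_DRAWABLE_TYPE: GLX_WINDOW_BIT,
    GLX_RENDER_TYPE: GLX_RGBA_BIT,
}


def effective_attribs(attrib_list):
    """Merge a flat [name, value, ..., 0] list over the defaults."""
    attribs = dict(DEFAULTS)
    i = 0
    while i + 1 < len(attrib_list) and attrib_list[i] != 0:
        attribs[attrib_list[i]] = attrib_list[i + 1]
        i += 2
    return attribs
```

An empty request, `[0]`, therefore asks for window-capable RGBA configs, and an explicit attribute overrides only its own default.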

This fixes the glx-choosefbconfig-defaults piglit test.

Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit f41c2f6c33)
2014-03-10 15:34:14 -07:00
Emil Velikov
3fc389efeb nv50: correctly calculate the number of vertical blocks during transfer map
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 882070cc81)
2014-03-10 15:34:14 -07:00
Kenneth Graunke
bab122c320 i965: Create a hardware context before initializing state module.
brw_init_state() calls brw_upload_initial_gpu_state().  If hardware
contexts are enabled (brw->hw_ctx != NULL), this will upload some
initial invariant state for the GPU.  Without hardware contexts, we
rely on this state being uploaded via atoms that subscribe to the
BRW_NEW_CONTEXT bit.

Commit 46d3c2bf4d accidentally moved
the call to brw_init_state() before creating a hardware context.
This meant brw_upload_initial_gpu_state would always early return.
Except on Gen6+, we stopped uploading the initial GPU state via
state atoms, so it never happened.

Fixes a regression since 46d3c2bf4d.

Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 3663bbe773)
2014-03-04 13:24:48 -08:00
Ian Romanick
cf7daac483 glsl: Only warn for macro names containing __
From page 14 (page 20 of the PDF) of the GLSL 1.10 spec:

    "In addition, all identifiers containing two consecutive underscores
     (__) are reserved as possible future keywords."

The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos.  Names simply containing __ are dangerous to use, but should
be allowed.

Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Cc: Tapani Pälli <lemody@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702
(cherry picked from commit 2c85fd5a96)
2014-03-04 13:24:04 -08:00
Ian Romanick
de6068a218 glcpp: Only warn for macro names containing __
Section 3.3 (Preprocessor) of the GLSL 1.30 spec (and later) and the
GLSL ES spec (all versions) say:

    "All macro names containing two consecutive underscores ( __ ) are
    reserved for future use as predefined macro names. All macro names
    prefixed with "GL_" ("GL" followed by a single underscore) are also
    reserved."

The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos.  Since every extension adds a name prefixed with GL_ (i.e.,
the name of the extension), that should be an error.  Names simply
containing __ are dangerous to use, but should be allowed.  In similar
cases, the C++ preprocessor specification says, "no diagnostic is
required."
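
The resulting policy for macro names can be written as a small classifier. This is a sketch of the rule as stated in this message, not the glcpp implementation; `check_macro_name` is a hypothetical name.

```python
def check_macro_name(name):
    """Classify a macro name as 'error', 'warning' or 'ok'."""
    if name.startswith("GL_"):
        return "error"      # reserved for Khronos
    if "__" in name:
        return "warning"    # reserved for the implementation, but allowed
    return "ok"
```

A name such as `MY__MACRO` draws only a warning, while `GL_my_extension` is rejected.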

Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Cc: Tapani Pälli <lemody@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702
(cherry picked from commit 0bd7892630)
2014-03-04 13:23:46 -08:00
Anuj Phogat
c074e34745 glsl: Fix condition to generate shader link error
GL_ARB_ES2_compatibility doesn't say anything about shader linking
when one of the shaders (vertex or fragment shader) is absent. So,
the extension shouldn't change the behavior specified in GLSL
specification.

Tested the behavior on proprietary Linux drivers of NVIDIA and AMD.
Both of them allow linking a version 100 shader program in OpenGL
context, when one of the shaders is absent.

Makes the following Khronos CTS tests pass:
successfulcompilevert_linkprogram.test
successfulcompilefrag_linkprogram.test

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 03597cf802)
2014-03-04 13:22:43 -08:00
Anuj Phogat
00d1daf2a8 mesa: Add GL_TEXTURE_CUBE_MAP_ARRAY to legal_get_tex_level_parameter_target()
Fixes failing Khronos CTS test packed_depth_stencil_init.test

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 6bd2472a8b)
2014-03-04 13:22:07 -08:00
Kusanagi Kouichi
09a346a1c1 targets/vdpau: Always use c++ to link
If built without llvm, the following error occurs with mplayer:

Failed to open VDPAU backend .../libvdpau_r600.so: undefined symbol: _ZTVN10__cxxabiv117__class_type_infoE
[vo/vdpau] Error when calling vdp_device_create_x11: 1

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 61f6cddef7)
2014-03-04 13:20:34 -08:00
Carl Worth
5202312160 main: Avoid double-free of shader Label
As documented, the _mesa_free_shader_program_data function:

	"Frees all the data that hangs off a shader program object, but not
	the object itself."

This means that this function may be called multiple times on the same object
(and has been observed to be). Meanwhile, the shProg->Label field was not being
set to NULL after its free(), so the second call to this function passed the
same address to free() again.

Fix this by setting this field to NULL after free(), (just as with all other
calls to free() in this function).
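
The pattern can be modelled with a toy allocator that detects double frees. `Heap` and `ShaderProgram` are illustrative stand-ins for malloc/free and the shader program object; only the reset-after-free step reflects the fix.

```python
class Heap:
    """Toy allocator that raises on a double free."""

    def __init__(self):
        self.live = set()
        self.next_id = 1

    def alloc(self):
        ptr = self.next_id
        self.next_id += 1
        self.live.add(ptr)
        return ptr

    def free(self, ptr):
        if ptr is None:          # like free(NULL): a no-op
            return
        if ptr not in self.live:
            raise RuntimeError("double free")
        self.live.remove(ptr)


class ShaderProgram:
    def __init__(self, heap):
        self.label = heap.alloc()


def free_shader_program_data(heap, prog):
    heap.free(prog.label)
    prog.label = None   # the fix: a repeated call now frees NULL
```

Calling `free_shader_program_data` twice on the same program is then harmless; without the reset to `None`, the second call would raise.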

Reviewed-by: Brian Paul <brianp@vmware.com>

CC: mesa-stable@lists.freedesktop.org
(cherry picked from commit a92581acf2)
2014-03-04 13:20:01 -08:00
Ilia Mirkin
d6e83e9a7a nouveau: fix chipset checks for nv1a by using the oclass instead
Commit f4ebcd133b ("dri/nouveau: NV17_3D class is not available for
NV1a chipset") fixed this partially by using the correct 3d class.
However there were a lot of checks left over comparing against the
chipset.

Reported-and-tested-by: John F. Godfrey <jfgodfrey@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 9.2 10.0 10.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 0c8b165366)
2014-03-04 13:15:11 -08:00
Fredrik Höglund
ad54c842fa mesa: Preserve the NewArrays state when copying a VAO
Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org>

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72895
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 9afbd04d89)
2014-03-04 13:14:51 -08:00
Emil Velikov
a4719eff1a dri/nouveau: Pass the API into _mesa_initialize_context
Currently we create an OPENGL_COMPAT context regardless of
what was requested by the program. Correct that by retaining
the program's request and passing it into _mesa_initialize_context.

Based on a similar commit for radeon/r200 by Ian Romanick.

Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 76d9f6d972)
2014-03-04 13:13:26 -08:00
Daniel Kurtz
5f35078700 glsl: Add locking to builtin_builder singleton
Consider a multithreaded program with two contexts A and B, and the
following scenario:

1. Context A calls initialize(), which allocates mem_ctx and starts
   building built-ins.
2. Context B calls initialize(), which sees mem_ctx != NULL and assumes
   everything is already set up.  It returns.
3. Context B calls find(), which fails to find the built-in since it
   hasn't been created yet.
4. Context A finally finishes initializing the built-ins.

This will break at step 3.  Adding a lock ensures that subsequent
callers of initialize() will wait until initialization is actually
complete.

Similarly, if any thread calls release() while another thread is still
initializing or calling find(), the mem_ctx/shader would get freed out
from under it, leading to corruption or use-after-free crashes.
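
The scenario and the fix can be sketched with a lock around all three entry points. `BuiltinBuilder` below is a simplified model, assuming a dictionary in place of mem_ctx and the built-in shader; it is not the Mesa class.

```python
import threading


class BuiltinBuilder:
    def __init__(self):
        self._lock = threading.Lock()
        self._builtins = None   # plays the role of mem_ctx

    def initialize(self):
        with self._lock:
            if self._builtins is not None:
                return          # already fully built
            table = {}
            for name in ("sin", "cos", "mix"):
                table[name] = "<signature of %s>" % name
            # Published only once complete, still under the lock.
            self._builtins = table

    def find(self, name):
        with self._lock:
            if self._builtins is None:
                return None
            return self._builtins.get(name)

    def release(self):
        with self._lock:
            self._builtins = None
```

A second caller of `initialize()` blocks until the first has finished building, so a following `find()` never observes a half-initialized table, and `release()` cannot run in the middle of either.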

Fixes sporadic failures in Piglit's glx-multithread-shader-compile.

Bugzilla: https://bugs.freedesktop.org/69200
Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b47d231526)
2014-03-04 13:12:27 -08:00
Ilia Mirkin
aa1f7b4237 nouveau/video: make sure that firmware is present when checking caps
Apparently some players are ill-prepared for us claiming that a decoder
exists only to have creating it fail, and express this poor preparation
with crashes (e.g. flash). Check that firmware is there to increase the
chances of there being a high correlation between reported capabilities
and ability to create a decoder.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 10.0 10.1 <mesa-stable@lists.freedesktop.org>
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 40dd777b33)
2014-03-04 13:12:02 -08:00
Ilia Mirkin
d141825eff nv30: report 8 maximum inputs
nvfx_fragprog_assign_generic only allows for up to 10/8 texcoords for
nv40/nv30. This fixes compilation of the varying-packing tests.
Furthermore it appears that the last 2 inputs on nv4x don't seem to
work in those tests, so just report 8 everywhere for now.

Tested on NV42, NV44. NV4B appears to have additional problems.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 9.1 9.2 10.0 10.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 356aff3a5c)
2014-03-04 13:10:35 -08:00
Brian Paul
d13adcae22 mesa: update assertion in detach_shader() for geom shaders
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74723
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
(cherry picked from commit c325ec8965)
2014-03-04 13:09:17 -08:00
Kenneth Graunke
e91dd3661c glsl: Don't lose precision qualifiers when encountering "centroid".
Mesa fails to retain the precision qualifier when parsing:

   #version 300 es
   centroid in mediump vec2 v;

Consider how the parser's type_qualifier production is applied.
First, the precision_qualifier rule creates a new ast_type_qualifier:

    <precision: mediump>

Then the storage_qualifier rule creates a second one:

    <flags: in>

and calls merge_qualifier() to fold in any previous qualifications,
returning:

    <flags: in, precision: mediump>

Finally, the auxiliary_storage_qualifier creates one for "centroid":

    <flags: centroid>

it then does $$ = $1 and $$.flags |= $2.flags, resulting in:

    <flags: centroid, in>

Since precision isn't stored in the flags bitfield, it is lost.  We need
to instead call merge_qualifier to combine all the fields.
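
A minimal C sketch of the difference, assuming a simplified qualifier
with only a flags bitfield and a separate precision field. The struct,
the flag values and both helper functions are hypothetical stand-ins for
illustration, not Mesa's ast_type_qualifier or merge_qualifier():

```c
#include <stdbool.h>

/* Hypothetical, simplified qualifier; not Mesa's ast_type_qualifier. */
enum { Q_IN = 1u << 0, Q_CENTROID = 1u << 1 };
enum { PREC_NONE = 0, PREC_MEDIUMP = 2 };

struct qual {
   unsigned flags;   /* storage/auxiliary bits */
   int precision;    /* stored outside the bitfield */
};

/* Mimics "$$ = $1; $$.flags |= $2.flags": only the flag bits of b survive. */
static struct qual combine_flags_only(struct qual a, struct qual b)
{
   struct qual out = a;
   out.flags |= b.flags;
   return out;
}

/* Merges every field; returns false if the two precisions conflict. */
static bool combine_all_fields(struct qual a, struct qual b, struct qual *out)
{
   if (a.precision != PREC_NONE && b.precision != PREC_NONE &&
       a.precision != b.precision)
      return false;
   out->flags = a.flags | b.flags;
   out->precision = a.precision != PREC_NONE ? a.precision : b.precision;
   return true;
}
```

With a = <flags: centroid> and b = <flags: in, precision: mediump>, the
flags-only variant yields no precision, while the field-by-field merge
keeps mediump.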

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reported-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 2062f40d81)
2014-03-04 13:07:32 -08:00
Brian Paul
490b810d0e st/mesa: avoid sw fallback for getting/decompressing textures
If st_GetTexImage() is to decompress the texture, avoid the fallback
path even if prefer_blit_based_texture_transfer = false.  For drivers
that returned PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER = 0, we
were always taking the fallback path for texture decompression rather
than rendering a quad.  The latter is a lot faster.

Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit f47e596288)
2014-03-04 13:07:13 -08:00
Matt Turner
d37086c6fc glsl: Initialize ubo_binding_mask flags to zero.
Missed in commit e63bb298. Caused sporadic test failures, like
incorrect-in-layout-qualifier-repeated-prim.geom.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit e2ef93cf94)
2014-03-04 13:05:05 -08:00
Marek Olšák
cfd8aed240 st/mesa: fix crash when a shader uses a TBO and it's not bound
This binds a NULL sampler view in that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74251

Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit c6dbcf10df)
2014-03-04 13:04:37 -08:00
Paul Berry
1b6aec4b5a glsl: Fix continue statements in do-while loops.
From the GLSL 4.40 spec, section 6.4 (Jumps):

    The continue jump is used only in loops. It skips the remainder of
    the body of the inner most loop of which it is inside. For while
    and do-while loops, this jump is to the next evaluation of the
    loop condition-expression from which the loop continues as
    previously defined.

Previously, we incorrectly treated a "continue" statement as jumping
to the top of a do-while loop.

This patch fixes the problem by replicating the loop condition when
converting the "continue" statement to IR.  (We already do a similar
thing in "for" loops, to ensure that "continue" causes the loop
expression to be executed).
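
C's do-while has the same rule for "continue", so the intended behavior
can be illustrated with a small C function. This is only an analogy; the
function name is made up for this sketch and nothing here is Mesa code:

```c
/* Counts how many times the body of a do-while runs when every
 * iteration hits "continue".  If "continue" skipped the condition,
 * the decrement would never execute and the loop would not terminate. */
static int count_do_while_iterations(int n)
{
   int iterations = 0;
   do {
      iterations++;
      continue;          /* jumps to the condition below, not back to "do" */
   } while (--n > 0);
   return iterations;
}
```

Because the condition (and its side effect) is evaluated after each
"continue", the loop still terminates after n iterations for n >= 1.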

Fixes piglit tests:
- glsl-fs-continue-inside-do-while.shader_test
- glsl-vs-continue-inside-do-while.shader_test
- glsl-fs-continue-in-switch-in-do-while.shader_test
- glsl-vs-continue-in-switch-in-do-while.shader_test

Cc: mesa-stable@lists.freedesktop.org

Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 7f5740899f)
2014-03-04 13:04:15 -08:00
Paul Berry
6d6bdd88e7 glsl: Make condition_to_hir() callable from outside ast_iteration_statement.
In addition to making it public, we also need to change its first
argument from an ir_loop * to an exec_list *, so that it can be used
to insert the condition anywhere in the IR (rather than just in the
body of the loop).

This will be necessary in order to make continue statements work
properly in do-while loops.

Cc: mesa-stable@lists.freedesktop.org

Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 56790856b3)
2014-03-04 13:01:20 -08:00
Topi Pohjolainen
2a19186953 i965/blorp: do not use unnecessary hw-blending support
This is really not needed as blorp blit programs already sample
XRGB normally and get alpha channel set to 1.0 automatically by
the sampler engine. This is simply copied directly to the payload
of the render target write message and hence there is no need for
any additional blending support from the pixel processing pipeline.

The blending formula is anyway broken for color components, it
multiplies the color component with itself (blend factor is the
component itself).
Alpha blending, in turn, would not force the alpha to one independently
of the source; it would simply use the source alpha as is
(1.0 * src_alpha + 0.0 * dst_alpha).

Quoting Eric:

 "If we want to actually make the no-alpha-bits-present thing work,
  we need to override the bits in the surface state or in the
  generated code.  In the normal draw path, it's done for sampling
  by the swizzling code in brw_wm_surface_state.c, and the blending
  overrides is just to fix up the alpha blending stage which
  doesn't pay attention to that for the destination surface."

If one modifies piglit test gl-3.2-layered-rendering-blit to use
color component values other than zero or one, this change will
kick in on IVB. No regressions on IVB.

This is effectively a revert of c0554141a9:

    i965/blorp: Support overriding destination alpha to 1.0.

    Currently, Blorp requires the source and destination formats to be
    equal.  However, we'd really like to be able to blit between XRGB and
    ARGB formats; our BLT engine paths have supported this for a long time.

    For ARGB -> XRGB, nothing needs to occur: the missing alpha is already
    interpreted as 1.0.  For XRGB -> ARGB, we need to smash the alpha
    channel to 1.0 when writing the destination colors.  This is fairly
    straightforward with blending.

    For now, this code is never used, as the source and destination formats
    still must be equal.  The next patch will relax that restriction.

    NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 933be19cdf)
2014-03-04 13:00:47 -08:00
Christian König
dc0053b33f radeon/uvd: fix feedback buffer handling v2
Without the correct feedback buffer size UVD runs
into an error on each frame, reducing the maximum FPS.

v2: address Michel's comments

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "10.1" "10.0" "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c3c24c3acc)
2014-03-04 13:00:16 -08:00
Brian Paul
e70e368af5 draw: fix incorrect color of flat-shaded clipped lines
When we clipped a line, we weren't copying the provoking vertex
color to the second vertex.  We also weren't checking for
first vs. last provoking vertex.
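
A rough C sketch of the idea, assuming a simplified vertex layout. The
struct and function below are hypothetical and are not the draw module's
clipper code:

```c
#include <stdbool.h>

/* Hypothetical vertex layout for this sketch. */
struct vert {
   float pos[4];
   float color[4];
};

/* After clipping a flat-shaded line, make both output endpoints carry
 * the color of the provoking vertex of the original, unclipped line. */
static void flat_shade_clipped_line(struct vert *v0, struct vert *v1,
                                    const struct vert *orig0,
                                    const struct vert *orig1,
                                    bool first_vertex_provokes)
{
   const struct vert *prov = first_vertex_provokes ? orig0 : orig1;
   for (int i = 0; i < 4; i++) {
      v0->color[i] = prov->color[i];
      v1->color[i] = prov->color[i];
   }
}
```

Copying to both endpoints keeps the result independent of which end of
the line was clipped.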

Fixes failures found with the new piglit line-flat-clip-color test.

Cc: "10.0, 10.1" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit fc3fcd1e01)
2014-03-04 12:59:27 -08:00
Brian Paul
e10b0e0f50 gallium/auxiliary/indices: replace free() with FREE()
To match the CALLOC_STRUCT() call.

Cc: "10.0, 10.1" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 307fd76053)
2014-03-04 12:58:30 -08:00
Ian Romanick
b74da80b71 meta: Consistently use non-Apple VAO functions
For these objects, meta was already using the non-Apple function to
delete the objects.  Everywhere else in the file uses
_mesa_GenVertexArrays and _mesa_BindVertexArrays.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit abfa65ca81)
2014-03-04 12:58:10 -08:00
Ian Romanick
89c6473ff0 meta: Fallback to software for GetTexImage of compressed GL_TEXTURE_CUBE_MAP_ARRAY
The hardware decompression path isn't even close to being able to handle
this.  This converts the crash (assertion failure) in
"EXT_texture_compression_s3tc/getteximage-targets S3TC CUBE_ARRAY" to a
plain old failure.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 070f55d893)
2014-03-04 12:57:43 -08:00
Ian Romanick
a4a8af4cbb meta: Release resources used by _mesa_meta_DrawPixels
_mesa_meta_DrawPixels creates a VAO and (potentially) two fragment
programs, but none of them are ever released.  Leaking piles of memory
is generally frowned upon.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fcb498302b)
2014-03-04 12:57:01 -08:00
Ian Romanick
c1bcdcde1c meta: Release resources used by decompress_texture_image
decompress_texture_image creates an FBO, an RBO, a VBO, a VAO, and a
sampler object, but none of them are ever released.  Later patches will
add program objects, exacerbating the problem.  Leaking piles of memory
is generally frowned upon.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2d3f92e881)
2014-03-04 12:56:23 -08:00
Brian Paul
d18b182134 radeon: move driContextSetFlags(ctx) call after ctx var is initialized
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f51ca46f0c)
2014-03-04 12:55:24 -08:00
Brian Paul
5297fdc0c8 r200: move driContextSetFlags(ctx) call after ctx var is initialized
Otherwise, ctx was a garbage value.

CC: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2d6d69bab6)
2014-03-04 12:55:07 -08:00
Anuj Phogat
edf066f385 mesa: Generate correct error code in glDrawBuffers()
OpenGL 3.3 spec expects GL_INVALID_OPERATION:
 "For both the default framebuffer and framebuffer objects, the
  constants FRONT, BACK, LEFT, RIGHT, and FRONT_AND_BACK are not
  valid in the bufs array passed to DrawBuffers, and will result
  in the error INVALID_OPERATION."

But OpenGL 4.0 spec changed the error code to GL_INVALID_ENUM:
 "For both the default framebuffer and framebuffer objects, the
  constants FRONT, BACK, LEFT, RIGHT, and FRONT_AND_BACK are not
  valid in the bufs array passed to DrawBuffers, and will result
  in the error INVALID_ENUM."

This patch changes the behaviour to match the OpenGL 4.0 spec.
Fixes Khronos OpenGL CTS draw_buffers_api.test.

V2: Update the comment in code.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 3303475558)
2014-03-04 12:54:41 -08:00
Carl Worth
593484a1c4 docs: Add md5sums for 10.0.3 release
Which we couldn't do until after tagging the release, of course.
2014-02-03 12:19:49 -08:00
Carl Worth
d8225ac67a docs: Add release notes for 10.0.3 release.
Just before making the actual release.
2014-02-03 11:21:23 -08:00
Carl Worth
3eac4b550d Update version to 10.0.3
In preparation for the upcoming 10.0.3 release.
2014-02-03 11:17:06 -08:00
Paul Seidler
cb7caac053 build: move ARCH_LIBS definition outside of ASM definition
_mesa_streaming_load_memcpy is also needed even if assembling is disabled

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 1cdeeef6c4)
2014-02-03 09:59:52 -08:00
Lauri Kasanen
0461451dcd mesa: Fix build to properly check for supported compiler flags
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72708
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lauri Kasanen <cand@gmx.com>
(cherry picked from commit fcefdc9a59)
2014-02-03 09:59:52 -08:00
Matt Turner
0657a6a6ae glx: Update glxext.h to revision 24777.
It readds the GLXContextID typedef, but under #ifndef GLX_VERSION_1_3.

Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=11454

(Backported from commit 3f3aafbfee)
2014-02-03 09:59:37 -08:00
Anuj Phogat
559d9b894e i965: Ignore 'centroid' interpolation qualifier in case of persample shading
This patch handles the use of 'centroid' qualifier with 'in' variables
in a fragment shader when persample shading is enabled. Per sample
shading for the whole fragment shader can be enabled by:
glEnable(GL_SAMPLE_SHADING) or using {gl_SamplePosition, gl_SampleID}
builtin variables in fragment shader. Explaining it below in more
detail.

/* Enable sample shading using OpenGL API */
glEnable(GL_SAMPLE_SHADING);
glMinSampleShading(1.0);

Example fragment shader:
in vec4 a;
centroid in vec4 b;
main()
{
  ...
}

Variable 'a' will be interpolated at the sample location. But what
interpolation should we use for variable 'b'?

ARB_sample_shading recommends interpolation at sample position for
all the variables. GLSL 400 (and earlier) spec says that:

"When an interpolation qualifier is used, it overrides settings
established through the OpenGL API."
But, this text got deleted in later versions of GLSL.

NVIDIA's and AMD's proprietary Linux drivers (at OpenGL 4.3)
interpolate at sample position. This convinces me to use
a similar approach on Intel hardware.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit f5cfb4ae21)

and

i965: Ignore 'centroid' interpolation qualifier in case of persample shading

I missed this change in commit f5cfb4a. It fixes the incorrect
rendering caused in Dolphin Emulator.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73915

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Markus Wick <wickmarkus@web.de>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit dc2f94bc78)
2014-01-31 13:01:44 -08:00
Anuj Phogat
765e3d373b i965: Use sample barycentric coordinates with per sample shading
Current implementation of arb_sample_shading doesn't set 'Barycentric
Interpolation Mode' correctly. We use pixel barycentric coordinates
for per sample shading. Instead we should select perspective sample
or non-perspective sample barycentric coordinates.

It also enables using sample barycentric coordinates in case of a
fragment shader variable declared with 'sample' qualifier.
e.g. sample in vec4 pos;

A piglit test to verify the implementation has been posted on piglit
mailing list for review.

V2: Do not interpolate all the 'in' variables at sample position
    if fragment shader uses 'sample' qualifier with one of them.
    For example we have a fragment shader:
    #version 330
    #extension ARB_gpu_shader5: require
    sample in vec4 a;
    in vec4 b;
    main()
    {
      ...
    }

    Only 'a' should be sampled at sample location, not 'b'.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit a92e5f7cf6)
2014-01-31 13:01:34 -08:00
Carl Worth
ae286af09d cherry-ignore: Ignore 4 patches at the request of the author (Anuj).
For 3 of the 4, I was already ignoring them since they were not picking
cleanly. Now, Anuj has explicitly requested they be ignored since they all
depend on a series that is not yet on the 10.0 branch.
2014-01-31 12:38:10 -08:00
José Fonseca
ed437df208 mesa: Use IROUND instead of roundf.
roundf is not available on MSVC.

(cherry picked from commit bba8f10598)
2014-01-31 12:37:11 -08:00
Chad Versace
f7848574b3 i965/gen6/blorp: Emit more flushes to workaround hangs
This is a squash of three related cherry-picks from master.

[PATCH 1/3]

  i965/gen6/blorp: Set need_workaround_flush immediately after primitive

  This patch makes the workaround code in gen6 blorp follow the pattern
  established in the regular draw path. It shouldn't result in any
  behavioral change.

  On gen6, there are two places where we emit 3D_CMD_PRIM: brw_emit_prim()
  and gen6_blorp_emit_primitive().  brw_emit_prim() sets
  need_workaround_flush immediately after emitting the primitive, but
  blorp does not. Blorp sets need_workaround_flush at the bottom of
  brw_blorp_exec().

  This patch moves the need_workaround_flush from brw_blorp_exec() to
  gen6_blorp_emit_primitive().  There is no need to set
  need_workaround_flush in gen7_blorp_emit_primitive() because the
  workaround applies only to gen6.

  Reviewed-by: Paul Berry <stereotype441@gmail.com>
  Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
  (cherry picked from commit 5e0cd58de4)

[PATCH 2/3]

  i965/gen6/blorp: Set need_workaround_flush at top of blorp

  Unconditionally set brw->need_workaround_flush at the top of gen6 blorp
  state emission.

  The art of emitting workaround flushes on Sandybridge is mysterious and
  not fully understood. Ken and I believe that
  intel_emit_post_sync_nonzero_flush() may be required when switching from
  regular drawing to blorp.  This is an extra safety measure to prevent
  undiscovered difficult-to-diagnose gpu hangs.

  I verified that on ChromeOS, pre-patch, need_workaround_flush was not
  set at the top of blorp, as Paul expected. To verify, I inserted the
  following debug code at the top of gen6_blorp_exec(), restarted the ui,
  and inspected the logs in /var/log/ui. The abort gets triggered so early
  that the browser never appears on the display.

      static void
      gen6_blorp_exec(...)
      {
          if (!brw->need_workaround_flush) {
              fprintf(stderr, "chadv: %s:%d\n", __FILE__, __LINE__);
              abort();
          }
          ...
      }

  CC: Kenneth Graunke <kenneth@whitecape.org>
  CC: Stéphane Marchesin <marcheu@chromium.org>
  Reviewed-by: Paul Berry <stereotype441@gmail.com>
  Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
  (cherry picked from commit 6a5c86f486)

[PATCH 3/3]

  i965/gen6/blorp: Remove redundant HiZ workaround

  Commit 1a92881 added extra flushes to fix a HiZ hang in
  WebGL Google Maps. With the extra flushes emitted by the previous two
  patches, the flushes added by 1a92881 are redundant.

  Tested with the same criteria as in 1a92881: by zooming in and out
  continuously for 2 hours on Sandybridge Chrome OS (codename
  Stumpy) without a hang.

  CC: Kenneth Graunke <kenneth@whitecape.org>
  CC: Stéphane Marchesin <marcheu@chromium.org>
  Reviewed-by: Paul Berry <stereotype441@gmail.com>
  Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
  (cherry picked from commit 90368875e7)

  Conflicts:
  	src/mesa/drivers/dri/i965/gen6_blorp.cpp
2014-01-31 12:21:25 -08:00
Ian Romanick
319d6d6067 radeon / r200: Pass the API into _mesa_initialize_context
Otherwise an application that requested an OpenGL ES 1.x context would
actually get a desktop OpenGL context.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 33214679bb)
2014-01-28 13:21:55 -08:00
Tom Stellard
6f27353c20 r600g/compute: Emit DEALLOC_STATE on cayman after dispatching a compute shader.
This is necessary to prevent the next SURFACE_SYNC packet from
hanging the GPU.

https://bugs.freedesktop.org/show_bug.cgi?id=73418

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d51dbe048a)
2014-01-28 13:21:25 -08:00
Emil Velikov
99f695f716 gallium/rtasm: handle mmap failures appropriately
mmap can fail for a variety of reasons (selinux and pax, to name
a few). With the current code, this will result in a crash in
the driver, if not worse.
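
As an illustration of the kind of check involved, here is a small C
sketch that treats MAP_FAILED as an allocation failure. The helper name
is hypothetical, and the mapping is read/write only to keep the sketch
portable, whereas rtasm needs executable memory; this is not the rtasm
code itself:

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on strict glibc builds */
#include <stddef.h>
#include <sys/mman.h>

/* Returns a writable anonymous mapping of `size` bytes, or NULL if
 * mmap fails. */
static void *alloc_pool(size_t size)
{
   void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

   /* mmap reports failure with MAP_FAILED, not NULL. */
   return p == MAP_FAILED ? NULL : p;
}
```

Callers can then handle a NULL return like any other failed allocation
instead of dereferencing an invalid pointer.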

This has been the case since the inception of the
gallium copy of rtasm.

Cc: 9.1 9.2 10.0 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73473
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
(cherry picked from commit 4dd445f1cf)
2014-01-28 13:20:53 -08:00
Carl Worth
ef75bf0777 Drop another couple of patches.
These depend on code which does not exist on the stable branch.
2014-01-28 13:18:40 -08:00
Matt Turner
0cd3d50f07 glcpp: Define GL_EXT_shader_integer_mix in both GL and ES.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 66ef8feb4d)

Conflicts:
	src/glsl/glcpp/glcpp-parse.y
2014-01-28 13:03:08 -08:00
Carl Worth
31b2e73a2d cherry-ignore: Ignore several patches not yet ready for the stable branch
The comments describe the reasons for each being excluded.
2014-01-28 12:51:53 -08:00
Brian Paul
df62691a02 draw: fix incorrect vertex size computation in LLVM drawing code
We were calling draw_total_vs_outputs() too early.  The call to
draw_pt_emit_prepare() could result in the vertex size changing.
So call draw_total_vs_outputs() after draw_pt_emit_prepare().

This fix would seem to be needed for the non-LLVM code as well,
but it's not obvious.  Instead, I added an assertion there to
try to catch this problem if it were to occur there.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72926
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit ad814d04ca)

Conflicts:
	src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline.c
2014-01-27 16:15:10 -08:00
Kenneth Graunke
fe2678accd glsl: Fix chained assignments of vector channels.
Simple shaders such as:

    void splat(vec2 v, float f) {
        v[0] = v[1] = f;
    }

failed to compile with the following error:
error: value of type vec2 cannot be assigned to variable of type float

First, we would process v[1] = f, and transform:
LHS: (expression float vector_extract (var_ref v) (constant int (1)))
RHS: (var_ref f)
into:
LHS: (var_ref v)
RHS: (expression vec2 vector_insert (var_ref v) (constant int (1))
                 (var_ref f))

Note that the LHS type is now vec2, not a float.  This is surprising,
but not the real problem.

After emitting assignments, this ultimately becomes:
(declare (temporary) vec2 assignment_tmp)
(assign (xy)
  (var_ref assignment_tmp)
  (expression vec2 vector_insert (var_ref v) (constant int (1))
              (var_ref f)))
  (assign (xy) (var_ref v) (var_ref assignment_tmp))

We would then return (var_ref assignment_tmp) as the rvalue, which has
the wrong type---it should be float, but is instead a vec2.

To fix this, we simply return (vector_extract (var_ref assignment_tmp)
<the appropriate channel>) to pull out the desired float value.

Fixes Piglit's chained-assignment-with-vector-constant-index.vert and
chained-assignment-with-vector-dynamic-index.vert tests.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74026
Reported-by: Dan Ginsburg <dang@valvesoftware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 44a86e2b4f)
2014-01-25 16:55:24 -08:00
Kenneth Graunke
83e9eb81be glsl: Rename "expr" to "lhs_expr" in vector_extract munging code.
When processing assignments, we have both an LHS and RHS.  At a glance,
"lhs_expr" clearly refers to the LHS, while a generic name like "expr"
is ambiguous.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 6c158e110c)
2014-01-25 16:55:15 -08:00
Anuj Phogat
8c467b825f glsl: Disable ARB_texture_rectangle in shader version 100.
OpenGL with ARB_ES2_compatibility allows shaders that specify #version
100.

This fixes the Khronos OpenGL test (Texture_Rectangle_Samplers_frag.test)
failure.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit c907595ba7)
2014-01-25 16:53:05 -08:00
Brian Paul
79ef990ef8 st/mesa: fix glReadBuffer(GL_NONE) segfault
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73956
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
Tested-by: Ahmed Allam <ahmabdabd@hotmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit f7c118ffbf)
2014-01-25 16:52:34 -08:00
Marek Olšák
b1694c9f87 gallium/util: util_format_srgb should not return FORMAT_NONE for sRGB formats
This fixes a serious regression introduced
in 4e549ddb50.

Cc: 9.2 10.0 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit d40532f260)
2014-01-25 16:52:17 -08:00
Ilia Mirkin
e2b6834c87 st/vdpau: don't return a device if the screen doesn't support NPOT
NV3x cards don't support NPOT textures. Technically this restriction
could be worked around, but since it also doesn't expose any video
decoding hw, just turn it off entirely.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 00e4314f6d)
2014-01-25 16:46:11 -08:00
Emil Velikov
04e5f2e94f nv50: access only the available amount of constbuf
The constbuf array is defined as a number of NV50_MAX_PIPE_CONSTBUFS
per shader stage. Currently the nv50 driver handles only 3 shader
stages, thus we wreak havoc when accessing the array out of bounds.

Cc: 9.1 9.2 10.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 12e744abbb)
2014-01-25 16:45:49 -08:00
Emil Velikov
a3f259e404 nv50: access only the available amount of textures
The textures array is defined as a number of PIPE_MAX_SAMPLERS per shader stage.
Currently the nv50 driver handles only 3 shader stages, thus we wreak havoc when
accessing the array out of bounds.

Fixes a segfault in piglit/bin/arb_texture_buffer_object-data-sync -fbo -auto

Cc: 9.1 9.2 10.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit d606ca37eb)
2014-01-25 16:45:36 -08:00
Ilia Mirkin
705da42130 mesa: fix GL_COLOR_SUM enum for drivers without ARB_vertex_program
Commit c13970808 (mesa: GL_EXT_secondary_color is not optional) changed

CHECK_EXTENSION2(EXT_secondary_color, ARB_vertex_program, cap)

to

CHECK_EXTENSION(ARB_vertex_program, cap)

However CHECK_EXTENSION2 checks that either extension is available, not
both. Remove the extension check entirely since the intent was for it to
always be enabled.
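
The three behaviors can be compared with a small C sketch. The struct
and function names are hypothetical and only model the logic described
here; they are not Mesa's CHECK_EXTENSION macros:

```c
#include <stdbool.h>

/* Hypothetical extension table for this sketch; not Mesa's gl_extensions. */
struct ext_table {
   bool EXT_secondary_color;
   bool ARB_vertex_program;
};

/* Original check: accepted when either extension is present. */
static bool color_sum_ok_either(const struct ext_table *e)
{
   return e->EXT_secondary_color || e->ARB_vertex_program;
}

/* Check after c13970808: only ARB_vertex_program is consulted. */
static bool color_sum_ok_single(const struct ext_table *e)
{
   return e->ARB_vertex_program;
}

/* This fix: no extension check at all. */
static bool color_sum_ok_always(const struct ext_table *e)
{
   (void)e;
   return true;
}
```

For a driver without ARB_vertex_program, only the middle variant rejects
GL_COLOR_SUM, which is the regression being fixed.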

v2: Fix glGet*(GL_COLOR_SUM) too.  Suggested by Ian.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: 9.2 10.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 739dc95e67)
2014-01-25 16:45:16 -08:00
Aaron Watry
b646441307 st/dri: prevent leak of dri option default values
v2: Change comment style

CC: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit ce3528896b)
2014-01-25 16:44:23 -08:00
Aaron Watry
0ec1ae90ef radeon: Move gfx/dma cs cleanup to r600_common_context_cleanup
The radeonsi code was not cleaning up either of these items leading to
leaked memory.

v2: Move cleanup to r600_common_context_cleanup instead of duplicating
    the logic for SI

CC: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 5ac3229f76)

Conflicts:
	src/gallium/drivers/radeon/r600_pipe_common.c
2014-01-25 16:43:47 -08:00
Ian Romanick
0fd4cf4bf8 mesa: Add COMPRESSED_RGBA_S3TC_DXT1_EXT to COMPRESSED_TEXTURE_FORMATS for GLES
The ES and desktop GL specs diverge here.  Yay!

In desktop OpenGL, the driver can perform online compression of
uncompressed texture data.  GL_NUM_COMPRESSED_TEXTURE_FORMATS and
GL_COMPRESSED_TEXTURE_FORMATS give the application a list of formats
that it could ask the driver to compress with some expectation of
quality.  The GL_ARB_texture_compression spec calls this "suitable for
general-purpose usage."  As noted above, this means
GL_COMPRESSED_RGBA_S3TC_DXT1_EXT is not included in the list.

In OpenGL ES, the driver never performs compression.
GL_NUM_COMPRESSED_TEXTURE_FORMATS and GL_COMPRESSED_TEXTURE_FORMATS give
the application a list of formats that the driver can receive from the
application.  It is the *complete* list of formats.  The
GL_EXT_texture_compression_s3tc spec says:

    "New State for OpenGL ES 2.0.25 and 3.0.2 Specifications

        The queries for NUM_COMPRESSED_TEXTURE_FORMATS and
        COMPRESSED_TEXTURE_FORMATS include COMPRESSED_RGB_S3TC_DXT1_EXT,
        COMPRESSED_RGBA_S3TC_DXT1_EXT, COMPRESSED_RGBA_S3TC_DXT3_EXT,
        and COMPRESSED_RGBA_S3TC_DXT5_EXT."

Note that the addition is only to the OpenGL ES specification!

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
See-also: http://lists.freedesktop.org/archives/mesa-dev/2013-October/047439.html
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0a75909b3f)
2014-01-25 16:39:13 -08:00
Emil Velikov
45f0736aa5 st/mesa: use signed temporary variable to store _ColorDrawBufferIndexes
The temporary variable used to store _ColorDrawBufferIndexes must be
signed (GLint), otherwise the following conditional will be incorrectly
evaluated, leading to crashes in the driver/mesa or to reads/writes at
arbitrary memory locations. The bug dates back to 2009.

Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit bfcf78c110)
2014-01-25 16:38:37 -08:00
Emil Velikov
b513c66a4e mesa: use signed temporary variable to store _ColorDrawBufferIndexes
_ColorDrawBufferIndexes is defined as GLint*, and using a GLuint*
will result in the first part of the conditional always evaluating
to true.
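
A minimal C sketch of the pitfall, using plain int/unsigned in place of
GLint/GLuint. The function names are made up for illustration, and some
compilers may warn about the tautological comparison at higher warning
levels, which is exactly the point:

```c
#include <stdbool.h>

/* Buggy: the index is copied into an unsigned temporary, so the
 * ">= 0" test can never fail. */
static bool index_ok_unsigned(int stored_index)
{
   unsigned idx = (unsigned)stored_index;   /* -1 becomes UINT_MAX */
   return idx >= 0;                          /* always true */
}

/* Fixed: a signed temporary lets the negative sentinel be detected. */
static bool index_ok_signed(int stored_index)
{
   int idx = stored_index;
   return idx >= 0;
}
```

With a stored index of -1, the unsigned variant reports the index as
valid and the signed variant rejects it.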

Unintentionally introduced by the following commit, this will result
in a driver segfault if one is using an old version of the piglit test

    bin/clearbuffer-mixed-format -auto -fbo

commit 03d848ea10
Author: Marek Olšák <marek.olsak@amd.com>
Date:   Wed Dec 4 00:27:20 2013 +0100

    mesa: fix interpretation of glClearBuffer(drawbuffer)

    The corresponding piglit tests supported this incorrect behavior instead of
    pointing at it.

Cc: Marek Olšák <marek.olsak@amd.com>
Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 10368e1446)
2014-01-25 16:38:12 -08:00
Michał Górny
dbc0ae1079 Use AC_PATH_TOOL instead of AC_PATH_PROG for llvm-config.
This should help with cross-compiling and multilib when $CHOST-specific
llvm-config is expected rather than build host default one.

It will help us a bit in Gentoo where we've started using
i686-pc-linux-gnu-llvm-config for 32-bit multilib LLVM.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Michał Górny <mgorny@gentoo.org>
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=73100

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5ea2376334)
2014-01-25 16:37:41 -08:00
Paul Berry
9ca4c8f6a2 i965: Ensure that all necessary state is re-emitted if we run out of aperture.
Prior to this patch, if we ran out of aperture space during
brw_try_draw_prims(), we would rewind the batch buffer pointer
(potentially throwing away some state that may have been emitted by
brw_upload_state()), flush the batch, and then try again.  However, we
wouldn't reset the dirty bits to the state they had before the call to
brw_upload_state().  As a result, when we tried again, there was a
danger that we wouldn't re-emit all the necessary state.  (Note: prior
to the introduction of hardware contexts, this wasn't a problem
because flushing the batch forced all state to be re-emitted).

This patch fixes the problem by leaving the dirty bits set at the end
of brw_upload_state(); we only clear them after we have determined
that we don't need to rewind the batch buffer.

Cc: 10.0 9.2 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit fb6d9798a0)
2014-01-25 16:37:19 -08:00
Marek Olšák
502d89b260 st/mesa: use sRGB formats for MSAA resolving if destination is sRGB
Copied from the i965 driver, including the big comment.

Cc: 9.2 10.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4e549ddb50)
2014-01-25 16:35:57 -08:00
Eric Anholt
3a6271890c i965: Don't do the temporary-and-blit-copy for INVALIDATE_RANGE maps.
We definitely want to fall through to the unsynchronized map case, instead
of wasting bandwidth on a copy.  Prevents a -43.2407% +/- 1.06113% (n=49)
performance regression on aa10perf when teaching glamor to provide the
GL_INVALIDATE_RANGE_BIT information.

This is a performance fix, which I usually wouldn't cherry-pick to stable.
But this really was just a bug in the code, its presence would
discourage developers from giving us the best information they can, and I
think we've got fairly high confidence in the unsynchronized map path
already.

Cc: 10.0 9.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f46563fe1c)
2014-01-09 12:24:44 -08:00
Eric Anholt
9b3ed4c8c2 i965: Fix handling of MESA_pack_invert in blit (PBO) readpixels.
Fixes piglit GL_MESA_pack_invert/readpixels and GPU hangs with glamor and
cairo-gl.

Cc: 10.0 9.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit e186b927b8)
2014-01-09 12:24:03 -08:00
Thomas Sondergaard
38235d2923 mesa: Namespace qualify fma to override ambiguity with fma from math.h
MSVC 2013 version of math.h includes an fma() function.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit e8ff08edd8)
2014-01-09 12:23:32 -08:00
Thomas Sondergaard
0df489f0e0 mesa: Work around internal compiler error
This small rearrangement avoids MSVC 2013 ICE. Also, this should be
a better memory access order.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 8fcddd325c)
2014-01-09 12:23:14 -08:00
Thomas Sondergaard
31e2824d99 mesa: Fix compile error with MSVC 2013
This fixes the following compile error:
src\glsl\ir_constant_expression.cpp(1405) : error C2666: 'copysign' : 3
overloads have similar conversions

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 067ad6e53e)
2014-01-09 12:22:40 -08:00
Thomas Sondergaard
700b916da1 mesa: Preliminary support for MSVC_VERSION=12.0
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 20e65c92c7)
2014-01-09 12:22:19 -08:00
Chris Forbes
c24489b0ef i965: fold offset into coord for textureOffset(gsampler2DRect)
The hardware is broken with nonzero texel offsets and unnormalized
coordinates; instead of doing correct offsetting, we get garbage.

This just extends the existing workaround for ir_txf and
ir_tg4+gsampler2DRect to also consider ir_tex+gsampler2DRect.

Fixes broken rendering in 'tesseract' when 'mesa_texrectoffset_bug' is
not enabled; also fixes the new piglit test
'tests/spec/glsl-1.30/execution/fs-textureOffset-Rect'.

Has been broken ~forever; suggesting including this in only 10.0 because
the lowering pass doesn't exist in 9.2 or earlier so would require quite
a different patch.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Lee Salzman <lsalzman@gmail.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9e99735f30)
2014-01-09 12:20:23 -08:00
Andreas Fänger
2b205f2864 swrast: fix delayed texel buffer allocation regression for OpenMP
Commit 9119269ca1 moved the texel
buffer allocation to _swrast_texture_span(), however, when compiled
with OpenMP support this code already runs multi-threaded so a
critical section is required to prevent multiple allocations and
rendering errors.
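
A minimal sketch of guarding a lazy allocation with an OpenMP critical
section. The struct and function names are hypothetical and are not the
swrast code; when built without OpenMP the pragma is ignored and the code
simply runs serially.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-context span storage. */
struct span_ctx {
   float *texel_buffer;   /* allocated lazily on first use */
   size_t texel_count;
   int allocations;       /* successful allocations, for checking */
};

/* Return the shared buffer, allocating it on first use. */
float *get_texel_buffer(struct span_ctx *sc)
{
   assert(sc != NULL);

   /* Only one thread at a time may run the check-and-allocate step. */
   #pragma omp critical (texel_buffer_alloc)
   {
      if (!sc->texel_buffer) {
         sc->texel_buffer = calloc(sc->texel_count, sizeof(float));
         if (sc->texel_buffer)
            sc->allocations++;
      }
   }
   return sc->texel_buffer;
}
```

Without the critical section, two threads could both see a NULL pointer
and both allocate, leaking one buffer and rendering into the wrong one.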

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 2a0fb946e1)
2014-01-09 12:19:01 -08:00
Brian Paul
b1ff3f6270 mesa: implement missing glGet(GL_RGBA_SIGNED_COMPONENTS_EXT) query
This is part of the GL_EXT_packed_float extension.

  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
  (cherry picked from commit 3486f6f31b)

Also squashed in a subsequent bug fix:

  mesa: check for MESA_FORMAT_RGB9_E5_FLOAT in _mesa_is_format_signed()

  This packed floating point format only stores positive values.

  Reviewed-by: Marek Olšák <marek.olsak@amd.com>
  Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
  Reviewed-by: Roland Scheidegger <sroland@vmware.com>
  (cherry picked from commit 0fc8d7c66e)

Also squashed in a second, subsequent bug fix:

  mesa: check bits per channel for GL_RGBA_SIGNED_COMPONENTS_EXT query

  If a channel has zero bits it's not signed.

  v2: also check for luminance and intensity format bits.  Bruce
  Merry's proposed piglit test hits the luminance case.

  Reviewed-by: Matt Turner <mattst88@gmail.com>
  (cherry picked from commit d046fd731a)

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=73096
Cc: 10.0 <mesa-stable@lists.freedesktop.org>

Conflicts:
	src/mesa/main/get.c
2014-01-09 12:15:17 -08:00
Carl Worth
5310a8cc20 Add md5sums for 10.0.2 release.
Which can be added only after the tag, of course.
2014-01-09 11:59:08 -08:00
Carl Worth
108e50c3bc docs: Add release notes for 10.0.2 release.
Which will happen today.
2014-01-09 11:49:28 -08:00
Carl Worth
44dfcf6e88 Update version to 10.0.2
In preparation for the upcoming 10.0.2 release.
2014-01-09 11:45:18 -08:00
Alexander von Gluck IV
e833368e04 Haiku: Add in public GL kit headers
* These make up the base of what C++ GL Haiku applications
  use for 3D rendering.
* Not placed in includes/GL to prevent Haiku headers from
  getting installed on non-Haiku systems.

Acked-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 56d920a5c1)
2014-01-02 17:11:17 -08:00
Ilia Mirkin
3efc2bbf07 nv50: fix a small leak on context destroy
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit f50a45452a)
2014-01-02 17:11:17 -08:00
Paul Berry
d46a58703a glsl: Fix inconsistent assumptions about ir_loop::counter.
The compiler back-ends (i965's fs_visitor and brw_visitor,
ir_to_mesa_visitor, and glsl_to_tgsi_visitor) assume that when
ir_loop::counter is non-null, it points to a fresh ir_variable that
should be used as the loop counter (as opposed to an ir_variable that
exists elsewhere in the instruction stream).

However, previous to this patch:

(1) loop_control_visitor did not create a new variable for
    ir_loop::counter; instead it re-used the existing ir_variable.
    This caused the loop counter to be double-incremented (once
    explicitly by the body of the loop, and once implicitly by
    ir_loop::increment).

(2) ir_clone did not clone ir_loop::counter properly, resulting in the
    cloned ir_loop pointing to the source ir_loop's counter.

(3) ir_hierarchical_visitor did not visit ir_loop::counter, resulting
    in the ir_variable being missed by reparenting.

Additionally, most optimization passes (e.g. loop unrolling) assume
that the variable mentioned by ir_loop::counter is not accessed in the
body of the loop (an assumption which (1) violates).

The combination of these factors caused a perfect storm in which the
code worked properly nearly all of the time: for loops that got
unrolled, (1) would introduce a double-increment, but loop unrolling
would fail to notice it (since it assumes that ir_loop::counter is not
accessed in the body of the loop), so it would unroll the loop the
correct number of times.  For loops that didn't get unrolled, (1)
would introduce a double-increment, but then later when the IR was
cloned for linking, (2) would prevent the loop counter from being
cloned properly, so it would look to further analysis stages like an
independent variable (and hence the double-increment would stop
occurring).  At the end of linking, (3) would prevent the loop counter
from being reparented, so it would still belong to the shader object
rather than the linked program object.  Provided that the client
program didn't delete the shader object, the memory would never get
reclaimed, and so the shader would function properly.

However, for loops that didn't get unrolled, if the client program did
delete the shader object, and the memory belonging to the loop counter
got re-used, this could cause a use-after-free bug, leading to a
crash.

This patch fixes loop_control_visitor, ir_clone, and
ir_hierarchical_visitor to treat ir_loop::counter the same way the
back-ends treat it: as a freshly allocated ir_variable that needs to
be visited and cloned independently of other ir_variables.
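
The cloning part of the fix can be illustrated in plain C. The types
below are hypothetical and much simpler than the real ir_loop and
ir_variable classes; the sketch only shows a clone that owns its own
counter instead of pointing at the source's.

```c
#include <assert.h>
#include <stdlib.h>

struct variable { int value; };
struct loop { struct variable *counter; };   /* counter may be NULL */

/* Deep clone: the copy gets its own counter variable, so freeing the
 * source loop cannot leave the clone with a dangling pointer. */
struct loop *clone_loop(const struct loop *src)
{
   assert(src != NULL);

   struct loop *dst = malloc(sizeof *dst);
   if (!dst)
      return NULL;

   dst->counter = NULL;
   if (src->counter) {
      dst->counter = malloc(sizeof *dst->counter);
      if (!dst->counter) {
         free(dst);
         return NULL;
      }
      *dst->counter = *src->counter;
   }
   return dst;
}

void free_loop(struct loop *l)
{
   if (l) {
      free(l->counter);
      free(l);
   }
}
```

A shallow copy (dst->counter = src->counter) would reproduce problem (2):
the clone would keep using memory owned by the source object.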

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72026

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d6eb4321d0)
2014-01-02 17:10:39 -08:00
Paul Berry
8eee788bd6 glsl: Teach ir_variable_refcount about ir_loop::counter variables.
If an ir_loop has a non-null "counter" field, the variable referred to
by this field is implicitly read and written by the loop.  We need to
account for this in ir_variable_refcount, otherwise there is a danger
we will try to dead-code-eliminate the loop counter variable.

Note: at the moment the dead code elimination bug doesn't occur due to
a bug in ir_hierarchical_visitor: it doesn't visit the "counter"
field, so dead code elimination doesn't treat it as a candidate for
elimination.  But the patch to follow will fix that bug, so we need to
fix ir_variable_refcount first in order to avoid breaking dead code
elimination.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 9d2951ea0a)
2014-01-02 17:10:21 -08:00
Chad Versace
9ccb6cc7b7 i965/gen6: Fix HiZ hang in WebGL Google Maps
Emitting flushes before depth and hiz resolves at the top of blorp's
state emission fixes the hang. Marchesin and I found the fix
experimentally, as opposed to adhering to a documented hardware
workaround.  A more minimal fix likely exists, but this gets the job
done.

Fixes HiZ hangs in the new WebGL Google maps on Sandybridge Chrome OS.
Tested by zooming in and out continuously for 2 hours.

This patch is based on
8bc07bb701

CC: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70740
Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 1a928816a1)
2014-01-02 15:59:44 -08:00
Marek Olšák
4d7961e95e st/mesa: fix glClear with multiple colorbuffers and different formats
Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0612005aa6)
2014-01-02 15:57:41 -08:00
Erik Faye-Lund
b8be00e5f2 glcpp: error on multiple #else/#elif directives
The preprocessor currently accepts multiple else/elif-groups
per if-section. The GLSL-preprocessor is defined by the C++
specification, which defines the following parse-rule:

if-section:
	if-group elif-groups(opt) else-group(opt) endif-line

This clearly only allows a single else-group, that has to come
after any elif-groups.

So let's modify the code to follow the specification. Add test
to prevent regressions.
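
The rule can be sketched as a small validator over a directive sequence.
This is not how glcpp implements it; the enum, MAX_DEPTH and
directives_valid() are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum directive { D_IF, D_ELIF, D_ELSE, D_ENDIF };

#define MAX_DEPTH 32   /* hypothetical nesting limit for this sketch */

/* Returns true if the sequence forms well-formed if-sections: within
 * each one, at most one #else, and no #elif or #else after it. */
bool directives_valid(const enum directive *d, size_t n)
{
   bool seen_else[MAX_DEPTH];
   size_t depth = 0;

   assert(d != NULL || n == 0);

   for (size_t i = 0; i < n; i++) {
      switch (d[i]) {
      case D_IF:
         if (depth == MAX_DEPTH)
            return false;
         seen_else[depth++] = false;
         break;
      case D_ELIF:
         if (depth == 0 || seen_else[depth - 1])
            return false;   /* #elif outside #if, or after #else */
         break;
      case D_ELSE:
         if (depth == 0 || seen_else[depth - 1])
            return false;   /* #else outside #if, or a second #else */
         seen_else[depth - 1] = true;
         break;
      case D_ENDIF:
         if (depth == 0)
            return false;
         depth--;
         break;
      }
   }
   return depth == 0;
}
```

Tracking one "seen else" flag per nesting level is enough to reject both
a repeated #else and an #elif that follows an #else.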

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Carl Worth <cworth@cworth.org>

Cc: 10.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit eb212c5a30)
2014-01-02 15:57:41 -08:00
Kenneth Graunke
347f149332 Revert "mesa: Remove GLXContextID typedef from glx.h."
This reverts commit 136a12ac98.

According to belak51 on IRC, this commit broke Allegro, which would no
longer compile.  Applications apparently expect the GLXContextID typedef
to exist in glx.h; removing it breaks them.  A bit of searching around
the internet revealed other complaints since upgrading to Mesa 10.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f425d56ba4)
2014-01-02 15:57:41 -08:00
Alex Deucher
49c865180a r600g: fix SUMO2 pci id
0x9649 is sumo2, not sumo.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e2d53fac1c)
2014-01-02 15:57:41 -08:00
Aaron Watry
765ceb6a36 r600/pipe: Stop leaking context->start_compute_cs_cmd.buf on EG/CM
Found while tracking down memory leaks in VDPAU playback

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3ddabe0d52)
2014-01-02 15:57:41 -08:00
Aaron Watry
7a7166f832 st/vdpau: Destroy context when initialization fails
Prevents a potential memory leak found when tracking down something else.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 20446d0e53)
2014-01-02 15:57:41 -08:00
Aaron Watry
a4a2f239d7 radeon/llvm: Free target data at end of optimization
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 767b0f82c3)
2014-01-02 15:57:41 -08:00
Aaron Watry
23d290d102 r600/compute: Use the correct FREE macro when deleting compute state
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0bd858d7ff)
2014-01-02 15:57:41 -08:00
Aaron Watry
2a20bf3ed2 r600/compute: Free compiled kernels when deleting compute state
v2: Remove unnecessary null pointer check

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e19717d075)
2014-01-02 15:57:41 -08:00
Aaron Watry
87cdd13324 radeon/compute: Stop leaking LLVMContexts in radeon_llvm_parse_bitcode
Previously we were creating a new LLVMContext every time that we called
radeon_llvm_parse_bitcode, which caused us to leak the context every time
that we compiled a CL program.

Sadly, we can't dispose of the LLVMContext at the point that it was being
created because evergreen_launch_grid (and possibly the SI equivalent) was
assuming that the context used to compile the kernels was still available.

Now, we'll create a new LLVMContext when creating EG/SI compute state, store
it there, and pass it to all of the places that need it.

The LLVM Context gets destroyed when we delete the EG/SI compute state.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8c9a9205d9)
2014-01-02 15:57:41 -08:00
Aaron Watry
b2ea582679 pipe_loader/sw: close dev->lib when initialization fails
Prevents a memory leak.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a7653c19a3)
2014-01-02 15:57:41 -08:00
Aaron Watry
0057a2b0e7 clover: Remove unused variable
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 862f55c29c)
2014-01-02 15:57:40 -08:00
Jonathan Liu
8518b6360d llvmpipe: use pipe_sampler_view_release() to avoid segfault
This fixes another case of faulting when freeing a pipe_sampler_view
that belongs to a previously destroyed context.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Liu <net147@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 7990ab58fa)
2014-01-02 15:57:40 -08:00
Jonathan Liu
ffd89b27a7 st/mesa: use pipe_sampler_view_release()
This fixes a crash where old_view->context was already freed in the
pipe_sampler_view_reference function contained in
src/gallium/auxiliary/util/u_inlines.h. As a result, the
sampler_view_destroy function pointer contained 0xfeeefeee indicating
freed heap memory.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Liu <net147@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 670be71bd8)
2014-01-02 15:57:40 -08:00
Henri Verbeet
b0ee1b1748 i915: Add support for gl_FragData[0] reads.
Similar to 556a47a262, without this reading from
gl_FragData[0] would cause a software fallback.

Bugzilla: https://bugs.winehq.org/show_bug.cgi?id=33964
Signed-off-by: Henri Verbeet <hverbeet@gmail.com>
Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit b094b3b9f4)
2014-01-02 15:57:40 -08:00
Kenneth Graunke
8dd89b8ad8 i965: Fix 3DSTATE_PUSH_CONSTANT_ALLOC_PS packet creation.
When adding geometry shader support, we accidentally reversed the size
and offset parameters.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 51c9cfc296)
2014-01-02 15:57:40 -08:00
Kevin Rogovin
ec80a279a5 Use line number information from entire function expression
This patch changes the error reporting behavior for incorrect function
invocation (triggered by match_function_by_name() unable to find a
matching function call) from using the line number information
associated to the function name term to using the line number
information of the entire function expression. Fixes bug #72264.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72264
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 23d294bb60)
2014-01-02 15:57:40 -08:00
Anuj Phogat
f6ea5b7bd7 mesa: Fix error code generation in glBeginConditionalRender()
This patch changes the error condition to satisfy the statement below
from the OpenGL 4.3 core specification:
"An INVALID_OPERATION error is generated if id is the name of a query
object with a target other than SAMPLES_PASSED, ANY_SAMPLES_PASSED, or
ANY_SAMPLES_PASSED_CONSERVATIVE, or if id is the name of a query
currently in progress."
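
The quoted condition can be written as a small check. The enums and
struct below are local stand-ins with arbitrary values, not the GL
headers or Mesa's query object; they only illustrate the two error cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for the GL enums; values are arbitrary here. */
enum query_target {
   Q_SAMPLES_PASSED,
   Q_ANY_SAMPLES_PASSED,
   Q_ANY_SAMPLES_PASSED_CONSERVATIVE,
   Q_TIME_ELAPSED
};

enum gl_error { ERR_NONE, ERR_INVALID_OPERATION };

struct query {
   enum query_target target;
   bool active;   /* true while the query is in progress */
};

/* Error to report for beginning conditional rendering on q. */
enum gl_error check_begin_conditional_render(const struct query *q)
{
   assert(q != NULL);

   bool target_ok = q->target == Q_SAMPLES_PASSED ||
                    q->target == Q_ANY_SAMPLES_PASSED ||
                    q->target == Q_ANY_SAMPLES_PASSED_CONSERVATIVE;

   if (!target_ok || q->active)
      return ERR_INVALID_OPERATION;
   return ERR_NONE;
}
```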

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 7a73c6acb0)
2014-01-02 15:57:40 -08:00
Kristian Høgsberg
db0dc5c008 dri_util: Don't assume __DRIcontext->driverPrivate is a gl_context
The driverPrivate pointer is opaque to the driver and we can't assume
it's a struct gl_context in dri_util.c.  Instead provide a helper function
to set the struct gl_context flags from the incoming DRI context flags.

v2 (idr): Modify the other classic drivers to also use
driContextSetFlags.  I ran all the piglit GLX_ARB_create_context tests
with i965 and classic swrast without regressions.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> [v1 on Gallium nouveau]
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 38366c0c6e)
2014-01-02 15:57:40 -08:00
Marek Olšák
c2940d11d0 mesa: fix interpretation of glClearBuffer(drawbuffer)
The corresponding piglit tests supported this incorrect behavior instead of
pointing it out.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 03d848ea10)
2014-01-02 15:57:40 -08:00
Vadim Girlin
27623f2645 r600g/sb: fix stack size computation on evergreen
On evergreen we have to reserve 1 stack element in some additional cases
besides the ones mentioned in the docs, but stack size computation was
recently reimplemented exactly as described in the docs by the patch that
added workarounds for stack issues on EG/CM, resulting in regressions
with some apps (Serious Sam 3).

This patch fixes it by restoring previous behavior.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=72369

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Andre Heider <a.heider@gmail.com>
(cherry picked from commit 00faf82832)
2014-01-02 14:40:47 -08:00
Carl Worth
6f7da0188a docs: Add md5sums for the 10.0.1 release. 2013-12-12 22:16:28 -08:00
Carl Worth
12484d2582 Update version for the 10.0.1 release.
It's so nice that this is updated in just a single place now. Thanks, Emil!
2013-12-12 21:34:55 -08:00
Carl Worth
d573899b93 Makefile: Add bin/test-driver to EXTRA_FILES
I'm not sure why this change is necessary. When I've built previous tar files
(such as 9.2.4) with the "make tarballs" target, they include the
bin/test-driver file. But at my first attempt to build the tar files for the
10.0.1 release this file was not being included and the build failed.
2013-12-12 21:33:02 -08:00
Carl Worth
142144e7fd docs: Add release notes for 10.0.1 2013-12-12 21:16:37 -08:00
Ilia Mirkin
a717ae1b2d nv50: report 15 max inputs for fragment programs
First off, nv50_program only has 16 in/out varyings. However reporting
16 makes 'm' become 68 in nv50_fp_linkage_validate with the
varying-packing-simple piglit test. (Subverting the assert makes it
compile but fail.) With this patch, varying-packing-simple passes.

See: https://bugs.freedesktop.org/show_bug.cgi?id=69155

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bad8871e52)
2013-12-12 15:35:57 -08:00
Maarten Lankhorst
a876ea4b76 nouveau: Fix compiler warning regression
cfg is now unused, remove it.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5576ad11ed)
2013-12-12 15:35:34 -08:00
Dave Airlie
d7a71b7181 swrast: fix readback regression since inversion fix
Readback from the frontbuffer with swrast was already broken; the
inversion fix just made the breakage more obvious. Fix it by inverting
the sub image gets. Also fixes a few other piglits.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=72327
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=72325

(for 9.2 the patches this depends on were asked to be backported separately
 in an email).
Cc: "9.2" "10.0" mesa-stable@lists.fedoraproject.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

(cherry picked from commit 0b16042377)
2013-12-12 15:35:04 -08:00
Axel Davy
2776a496d4 Enable throttling in SwapBuffers
flush_with_flags, when available, allows the driver to throttle.
Using this suppresses input lag issues that can be observed in heavy
rendering situations on non-intel cards.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.0" mesa-stable@lists.freedesktop.org
(cherry picked from commit afcce46fd5)
2013-12-12 15:34:27 -08:00
Kristian Høgsberg
1919ec6ba4 egl/wayland: Send commit after flushing the driver context
This typically won't make a difference, since we only send the requests at
wl_display_flush() time.  There might be a small race
with another thread calling wl_display_flush() after our commit request,
but before we flush the DRI driver.  Moving the commit below the DRI
driver flush call looks more natural and eliminates the small race.

Cc: "10.0" mesa-stable@lists.freedesktop.org
(cherry picked from commit 33eb5eabee)
2013-12-12 15:33:59 -08:00
Axel Davy
188c60143b egl/wayland: Flush the wl_display at the end of SwapBuffers
We would like the compositor to receive the committed buffer
as soon as possible, so it has time to process it and
release old ones. We shouldn't rely on the client
to flush the queue for us.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.0" mesa-stable@lists.freedesktop.org
(cherry picked from commit 402bf6e8d0)
2013-12-12 15:33:33 -08:00
Kristian Høgsberg
d0f606ffbd egl/wayland: Damage INT32_MAX x INT32_MAX region for eglSwapBuffers
If we're not using EGL_EXT_swap_buffers_with_damage, we have to
damage the full extent.  EGL operates on buffer coordinates, but
wl_surface.damage takes surface coordinates.  EGL doesn't know the
buffer transformation (rotated or scaled) and can't post accurate
damage in surface coordinates.  The damage event however is clipped to
the surface extents so we can just damage the maximum rectangle.
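
A sketch of why a maximal damage rectangle is harmless once it is clipped
to the surface extents. How a given compositor actually clips is an
assumption here; struct rect and clip_damage() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct rect { int32_t x, y, width, height; };

/* Clip damage to a surface of the given size whose origin is (0, 0).
 * Edges are computed in 64 bits so x + width cannot overflow. */
struct rect clip_damage(struct rect d, int32_t surf_w, int32_t surf_h)
{
   assert(surf_w >= 0 && surf_h >= 0);

   int64_t x0 = d.x > 0 ? d.x : 0;
   int64_t y0 = d.y > 0 ? d.y : 0;
   int64_t x1 = (int64_t)d.x + d.width;
   int64_t y1 = (int64_t)d.y + d.height;
   if (x1 > surf_w)
      x1 = surf_w;
   if (y1 > surf_h)
      y1 = surf_h;

   struct rect out = { 0, 0, 0, 0 };   /* empty if nothing overlaps */
   if (x1 > x0 && y1 > y0) {
      out.x = (int32_t)x0;
      out.y = (int32_t)y0;
      out.width = (int32_t)(x1 - x0);
      out.height = (int32_t)(y1 - y0);
   }
   return out;
}
```

Damaging (0, 0, INT32_MAX, INT32_MAX) therefore ends up covering exactly
the surface, whatever buffer transform is in effect.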

In case of EGL_EXT_swap_buffers_with_damage, the application knows
the buffer transform and is expected to pass in rectangles in
surface space.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70250
Cc: "10.0" mesa-stable@lists.freedesktop.org
Tested-by: U. Artie Eoff <ullysses.a.eoff@intel.com>
(cherry picked from commit bce64c6c83)
2013-12-09 17:41:23 -08:00
Jordan Justen
fdede18275 dri megadriver_stub: add compatibility for older DRI loaders
To help the transition period when DRI loaders are being updated
to support the newer __driDriverExtensions_foo mechanism,
we populate __driDriverExtensions with the extensions returned
by __driDriverExtensions_foo during a library constructor
function.

We find the name of driver foo by using the dladdr function,
which gives the path of the dynamic library that
was being loaded.
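
A sketch of the string handling only: given a library path of the kind
dladdr() reports in dli_fname, extract the driver name. The "_dri.so"
suffix convention and driver_name_from_path() are assumptions made for
illustration; this is not the megadriver_stub code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the driver name out of a path such as ".../foo_dri.so".
 * Returns 0 on success, -1 if the path does not match or out is
 * too small. */
int driver_name_from_path(const char *path, char *out, size_t out_size)
{
   static const char suffix[] = "_dri.so";
   const size_t suffix_len = sizeof(suffix) - 1;

   assert(path != NULL && out != NULL);

   /* Strip any directory part. */
   const char *base = strrchr(path, '/');
   base = base ? base + 1 : path;

   size_t len = strlen(base);
   if (len <= suffix_len ||
       strcmp(base + len - suffix_len, suffix) != 0)
      return -1;

   size_t name_len = len - suffix_len;
   if (name_len + 1 > out_size)
      return -1;

   memcpy(out, base, name_len);
   out[name_len] = '\0';
   return 0;
}
```

The resulting name could then be used to build the symbol name to look
up, for example with dlsym().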

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Keith Packard <keithp@keithp.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4859d492b2)
2013-12-09 17:28:20 -08:00
Tom Stellard
4cbd424631 r300/compiler/tests: Fix line length check in test parser
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9a5ce0c4c9)
2013-12-09 17:28:15 -08:00
Tom Stellard
331a8a3586 r300/compiler/tests: Fix segfault
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1896431f79)
2013-12-09 17:28:09 -08:00
Ilia Mirkin
f528981f1a nouveau/video: update a few more h264 picparm field names
Based on comments by Benjamin Morris <bmorris@nvidia.com> in
http://lists.freedesktop.org/archives/nouveau/2013-December/015328.html

This adds setting of is_long_term, and updates a few field names we were
unclear about.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2cd2b9705e)
2013-12-09 17:28:07 -08:00
Ilia Mirkin
d5f1a270ef nouveau/video: update h264 picparm field names based on usage
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 78525dae8a)
2013-12-09 17:28:04 -08:00
Ilia Mirkin
f4f1159716 nv50: enable h264 and mpeg4 for nv98+ (vp3, vp4.0)
Create the ref_bo without any storage type flags set for now. The issue
probably arises from our use of the additional buffer space at the end
of the ref_bo. It should probably be split up in the future.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Martin Peres <martin.peres@labri.fr>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e01ba9d6b0)
2013-12-09 17:28:01 -08:00
Ian Romanick
b531dcaec4 glsl: Don't emit empty declaration warning for a struct specifier
The intention is that things like

   int;

will generate a warning.  However, we were also accidentally emitting
the same warning for things like

  struct Foo { int x; };

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68838
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Aras Pranckevicius <aras@unity3d.com>
Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 758658850b)
2013-12-09 17:27:40 -08:00
Ilia Mirkin
b160fea306 nv50: wait on the buf's fence before sticking it into pushbuf
This resolves some rendering issues in source games.
See https://bugs.freedesktop.org/show_bug.cgi?id=64323

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0e5bf85651)
2013-12-06 10:51:49 -08:00
Ilia Mirkin
05d2a796a0 nouveau: avoid leaking fences while waiting
This fixes a memory leak in some situations. Also avoids emitting an
extra fence if the kick handler does the call to nouveau_fence_next
itself.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ce6dd69697)
2013-12-06 10:51:45 -08:00
Ilia Mirkin
de517d2bb3 nv50: Fix GPU_READING/WRITING bit removal
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
CC: "9.1, 9.2, 10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c45cf6199f)
2013-12-06 10:51:18 -08:00
Ian Romanick
8991193f70 Remove a057b83 from the pick list
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-12-06 09:42:58 -08:00
Ilia Mirkin
6c00504a8a mesa: don't leak performance monitors on context destroy
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 267679be84)
2013-12-06 08:09:03 -08:00
Emil Velikov
e6710f4217 automake: include only one copy of VERSION in tarball
The VERSION file is tracked by git (git ls-files), thus
adding it to EXTRA_FILES will result in a duplicate copy
within the final tarball.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72230
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reported-by: Patrick Steinhardt <ps@pks.im>
Tested-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 507c2356e3)
2013-12-06 08:08:09 -08:00
Chad Versace
31751bd40b i965: Add extra-alignment for non-msrt fast color clear for all hw (v2)
The BSpec states that the alignment for the non-msrt clear rectangle must
be doubled; the BSpec does not restrict the workaround to specific
hardware.

Commit 9a1a67b applied the workaround to Haswell GT3.  Commit 8b659ce
expanded the workaround to all Haswell variants. This commit expands it
to all hardware.
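
A sketch of what doubling the alignment of a clear rectangle means. The
base alignment values, struct clear_rect and the helper names are
hypothetical; the real values depend on the hardware and surface layout.

```c
#include <assert.h>

struct clear_rect { unsigned x0, y0, x1, y1; };   /* x1/y1 exclusive */

unsigned round_down(unsigned v, unsigned a)
{
   assert(a != 0);
   return v - v % a;
}

unsigned round_up(unsigned v, unsigned a)
{
   return round_down(v + a - 1, a);
}

/* Expand the rectangle outward to multiples of twice the base
 * alignment in each direction. */
struct clear_rect align_clear_rect(struct clear_rect r,
                                   unsigned x_align, unsigned y_align)
{
   x_align *= 2;   /* the extra alignment the workaround asks for */
   y_align *= 2;

   r.x0 = round_down(r.x0, x_align);
   r.y0 = round_down(r.y0, y_align);
   r.x1 = round_up(r.x1, x_align);
   r.y1 = round_up(r.y1, y_align);
   return r;
}
```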

No Piglit regressions on Ivybridge 0x0166. No fixes either.

I know of no Ivybridge or Baytrail bug related to this workaround.
However, the BSpec says the extra alignment is required, so let's do it.

v2: Apply to all hardware, not just gen7.

CC: "9.2, 10.0" <mesa-stable@lists.freedesktop.org>
CC: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 998018d7be)
2013-12-06 08:08:09 -08:00
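
The doubled alignment described in commit 31751bd40b can be sketched roughly as follows. This is an illustration only: the `ClearRect` type, the helper names, and the example alignment values are assumptions, not the driver's actual code or the BSpec's real per-format numbers.

```cpp
#include <cstdint>

// Hypothetical rectangle type: [x0, x1) x [y0, y1) in pixels.
struct ClearRect {
    uint32_t x0, y0, x1, y1;
};

// Round v down / up to a multiple of a (a must be non-zero).
static uint32_t align_down(uint32_t v, uint32_t a) { return v - (v % a); }
static uint32_t align_up(uint32_t v, uint32_t a) { return align_down(v + a - 1, a); }

// Expand a clear rectangle to the doubled alignment unit.
// x_align and y_align stand for the documented per-format alignment;
// the real values are not given in the commit message.
ClearRect align_fast_clear_rect(ClearRect r, uint32_t x_align, uint32_t y_align)
{
    x_align *= 2;  // workaround: double the documented alignment
    y_align *= 2;
    return ClearRect{align_down(r.x0, x_align), align_down(r.y0, y_align),
                     align_up(r.x1, x_align), align_up(r.y1, y_align)};
}
```

With an assumed base alignment of 16x8, the rectangle (5,3)-(100,50) grows to (0,0)-(128,64), because the effective unit becomes 32x16.
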
Chad Versace
edca52e6e7 i965/hsw: Apply non-msrt fast color clear w/a to all HSW GTs
Pre-patch, the workaround was applied to only HSW GT3. However, the
workaround also fixes render corruption on the HSW GT1 Chromebook,
codenamed Falco.

Also, update the BSpec quote that discusses the workaround to reflect
the latest BSpec.

The BSpec states that the workaround is required for Ivybridge and
Baytrail as well as Haswell. But, we apply the workaround to only
Haswell because (a) we suspect that is the only hardware where it is
actually required and (b) we haven't yet validated the workaround for
the other hardware.

CC: "9.2, 10.0" <mesa-stable@lists.freedesktop.org>
CC: Anuj Phogat <anuj.phogat@gmail.com>
OTC-Tracker: CHRMOS-812
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 8b659cef3a)
2013-12-06 08:08:09 -08:00
Paul Berry
2457b5bfa4 i965/gen6: Fix multisample resolve blits for luminance/intensity 32F formats.
On gen6, multisample resolve blits use the SAMPLE message to blend
together the 4 samples for each texel.  For some reason, SAMPLE
doesn't blend together the proper samples when the source format is
L32_FLOAT or I32_FLOAT, resulting in blocky artifacts.

To work around this problem, sample from the source surface using
R32_FLOAT.  This shouldn't affect rendering correctness, because when
doing these resolve blits, the destination format is R32_FLOAT, so the
channel replication done by L32_FLOAT and I32_FLOAT is unnecessary.

Fixes piglit tests on Sandy Bridge:
- spec/ARB_texture_float/multisample-formats 2 GL_ARB_texture_float
- spec/ARB_texture_float/multisample-formats 4 GL_ARB_texture_float

No piglit regressions on Sandy Bridge.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70601

Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c4cf487315)
2013-12-06 08:08:09 -08:00
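
The format override described in commit 2457b5bfa4 might look roughly like this. The enum and function name are hypothetical; only the format names and the gen6 restriction come from the commit message.

```cpp
// Hypothetical subset of surface formats, named after those in the commit message.
enum class SurfaceFormat { R32_FLOAT, L32_FLOAT, I32_FLOAT, OTHER };

// Pick the format used to sample the source of a multisample resolve blit.
SurfaceFormat resolve_source_format(int gen, SurfaceFormat fmt)
{
    const bool replicating = fmt == SurfaceFormat::L32_FLOAT ||
                             fmt == SurfaceFormat::I32_FLOAT;
    // On gen6 only: read luminance/intensity 32F sources as plain R32_FLOAT.
    return (gen == 6 && replicating) ? SurfaceFormat::R32_FLOAT : fmt;
}
```

Other generations and other formats pass through unchanged in this sketch.
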
Thomas Hellstrom
edb4956932 st/xa: Bump major version number to 2
For some reason this was left out when the version was changed...

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
2013-12-06 06:14:37 -08:00
Ian Romanick
643f986942 docs: Add 10.0 release md5sums
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-30 23:29:21 -08:00
Ian Romanick
724c07ff12 mesa: Bump version to 10.0 (final)
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-30 23:25:47 -08:00
Ian Romanick
56d1ba17f1 docs: Update release notes for 10.0
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-30 23:25:28 -08:00
Kenneth Graunke
44e38a878a i965: Always reserve binding table space for at least one render target.
In brw_update_renderbuffer_surfaces(), if there are no color draw
buffers, we always set up a null render target at surface index 0 so we
have something to use with the FB write marking the end of thread.

However, when we recently began computing surface indexes dynamically,
we failed to reserve space for it.  This meant that the first texture
would be assigned surface index 0, and our closing FB write would
clobber the texture.

Fixes Piglit's EXT_packed_depth_stencil/fbo-blit-d24s8 test on Gen4-5,
which regressed as of commit 4e5306453d
("i965/fs: Dynamically set up the WM binding table offsets.")

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70605
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Tested-by: lu hua <huax.lu@intel.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c4815f6cd6)
2013-11-28 08:37:40 -08:00
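
A minimal sketch of the reservation described in commit 44e38a878a, assuming render targets occupy the first binding table entries and textures follow directly after them. Both function names are hypothetical.

```cpp
#include <algorithm>

// Number of binding table entries to reserve for render targets.
// Even with zero color draw buffers, one entry is kept for the null
// render target used by the end-of-thread FB write.
unsigned render_target_entries(unsigned num_color_draw_buffers)
{
    return std::max(num_color_draw_buffers, 1u);
}

// Assumed layout: textures start right after the render target entries.
unsigned first_texture_index(unsigned num_color_draw_buffers)
{
    return render_target_entries(num_color_draw_buffers);
}
```

With no color draw buffers the first texture lands at index 1 rather than 0, so the closing FB write no longer overwrites it.
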
Ian Romanick
93dfd0522f dri: Allow __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS in driCreateContextAttribs
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reported-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 73e9aa9e3f)
2013-11-28 08:37:39 -08:00
Ian Romanick
a5f78c4025 i965: Only enable __DRI2_ROBUSTNESS if kernel support is available
This is a squash of the following two cherry-picked patches:

    i965: Only enable __DRI2_ROBUSTNESS if kernel support is available

    Rather than always advertising the extension but failing to create a
    context with reset notification, just don't advertise it.  I don't know
    why it didn't occur to me to do it this way in the first place.

    NOTE: Kristian requested that I provide a follow-up for master that
    dynamically generates the list of DRI extensions instead of selecting
    between two hardcoded lists.

    Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
    Suggested-by: Kristian Høgsberg <krh@bitplanet.net>
    Reviewed-by: Matt Turner <mattst88@gmail.com>
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
    Cc: "10.0" <mesa-stable@lists.freedesktop.org>
    (cherry picked from commit 9b1c68638d)

and

    i965: Properly reject __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS when __DRI2_ROBUSTNESS is not enabled

    Only allow __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS in brwCreateContext if
    intelInitScreen2 also enabled __DRI2_ROBUSTNESS (thereby enabling
    GLX_ARB_create_context).

    This fixes a regression in the piglit test
    "glx/GLX_ARB_create_context/invalid flag"

    v2: Remove commented debug code.  Noticed by Jordan.

    Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
    Reported-by: Paul Berry <stereotype441@gmail.com>
    Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
    Reviewed-by: Matt Turner <mattst88@gmail.com>
    Cc: "10.0" <mesa-stable@lists.freedesktop.org>
    (cherry picked from commit 53a65e547c)
2013-11-28 08:36:51 -08:00
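
The two squashed changes in commit a5f78c4025 can be pictured together as below. This is a sketch under stated assumptions: the flag bit, the extension names, and the `Screen` type are placeholders, not Mesa's real definitions.

```cpp
#include <string>
#include <vector>

// Placeholder flag bit and extension names; not Mesa's real definitions.
constexpr unsigned kFlagRobustBufferAccess = 1u << 2;

struct Screen {
    bool robustness_enabled;
    std::vector<std::string> extensions;
};

// Pick one of two hardcoded extension lists, depending on kernel support.
Screen init_screen(bool kernel_supports_reset_stats)
{
    static const std::vector<std::string> base = {"ext_a", "ext_b"};
    static const std::vector<std::string> with_robustness = {"ext_a", "ext_b", "robustness"};
    return Screen{kernel_supports_reset_stats,
                  kernel_supports_reset_stats ? with_robustness : base};
}

// Reject the robust-buffer-access flag unless robustness was advertised.
bool context_flags_ok(const Screen &screen, unsigned flags)
{
    if ((flags & kFlagRobustBufferAccess) && !screen.robustness_enabled)
        return false;
    return true;
}
```

Keeping the advertised list and the flag check tied to the same boolean is what prevents the two from disagreeing.
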
Ian Romanick
9ec00c187c i965: Bump libdrm requirement
drm_intel_get_reset_stats is only available in libdrm-2.4.48, and
libdrm-2.4.49 contains an important bug fix in that function.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cb728bb028)
2013-11-28 08:35:28 -08:00
Francisco Jerez
5ec641bbc9 glsl: Initialize _mesa_glsl_parse_state::atomic_counter_offsets before using it.
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 6b2b4cc885)
2013-11-26 21:04:52 -08:00
Paul Berry
444a621e55 glsl: Fix lowering of direct assignment in lower_clip_distance.
In commit 065da16 (glsl: Convert lower_clip_distance_visitor to be an
ir_rvalue_visitor), we failed to notice that since
lower_clip_distance_visitor overrides visit_leave(ir_assignment *),
ir_rvalue_visitor::visit_leave(ir_assignment *) wasn't getting called.
As a result, clip distance dereferences appearing directly on the
right hand side of an assignment (not in a subexpression) weren't
getting properly lowered.  This caused an ir_dereference_variable node
to be left in the IR that referred to the old gl_ClipDistance
variable.  However, since the lowering pass replaces gl_ClipDistance
with gl_ClipDistanceMESA, this turned into a dangling pointer when the
IR got reparented.

Prior to the introduction of geometry shaders, this bug was unlikely
to arise, because (a) reading from gl_ClipDistance[i] in the fragment
shader was rare, and (b) when it happened, it was likely that it would
either appear in a subexpression, or be hoisted into a subexpression
by tree grafting.

However, in a geometry shader, we're likely to see a statement like
this, which would trigger the bug:

    gl_ClipDistance[i] = gl_in[j].gl_ClipDistance[i];

This patch causes
lower_clip_distance_visitor::visit_leave(ir_assignment *) to call the
base class visitor, so that the right hand side of the assignment is
properly lowered.

Fixes piglit test:
- spec/glsl-1.50/execution/geometry/clip-distance-itemized-copy

Cc: Ian Romanick <idr@freedesktop.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 9dfcb05fa6)
2013-11-26 21:04:52 -08:00
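
The override problem described in commit 444a621e55 is a general C++ pitfall. The sketch below uses simplified stand-in classes (not Mesa's IR types or real signatures) to show why the derived `visit_leave` has to call the base-class version.

```cpp
// Simplified stand-ins for the IR classes; not Mesa's actual definitions.
struct Assignment {
    bool rhs_lowered = false;
    bool lhs_lowered = false;
};

struct RvalueVisitor {
    virtual ~RvalueVisitor() = default;
    // Base behaviour: process the rvalue on the right-hand side.
    virtual void visit_leave(Assignment &a) { a.rhs_lowered = true; }
};

struct LowerClipDistanceVisitor : RvalueVisitor {
    void visit_leave(Assignment &a) override
    {
        a.lhs_lowered = true;  // pass-specific handling of the assignment
        // Without this call the override hides the base-class work and
        // the right-hand side is left untouched.
        RvalueVisitor::visit_leave(a);
    }
};
```

Removing the explicit `RvalueVisitor::visit_leave(a)` call reproduces the shape of the bug: the assignment is visited, but its right-hand side is never lowered.
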
Paul Berry
756b4f9a8c i965/gs: Set GS prog_data to NULL if there is no GS program.
The previous commit fixes a bug wherein we would incorrectly refer to
stale geometry shader prog_data when no geometry shader was active.

This patch reduces the likelihood of that sort of bug occurring in the
future by setting prog_data to NULL whenever there is no GS program.

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 37bdde1087)
2013-11-26 21:04:52 -08:00
Paul Berry
d963daa380 i965/gs: Properly skip GS binding table upload when no GS active.
Previously, in brw_gs_upload_binding_table(), we checked whether
brw->gs.prog_data was NULL in order to determine whether a geometry
shader was active.  This didn't work: brw->gs.prog_data starts off as
NULL, but it is set to non-NULL when a geometry shader program is
built, and then never set to NULL again.  As a result, if we called
brw_gs_upload_binding_table() while there was no geometry shader
active, but a geometry shader had previously been active, it would
refer to a stale (and possibly freed) prog_data structure.

This patch fixes the problem by modifying
brw_gs_upload_binding_table() to use the proper technique to determine
whether a geometry shader is active: by checking whether
brw->geometry_program is NULL.

This fixes the crash reported in comment 2 of bug 71870 (the incorrect
rendering remains, however).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 2714ca81b9)
2013-11-26 21:04:51 -08:00
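
A rough sketch of the check described in commit d963daa380. The structs are simplified stand-ins; only the field names `geometry_program` and `gs.prog_data` come from the commit message.

```cpp
// Simplified stand-ins for the driver state.
struct GsState {
    const void *prog_data = nullptr;
};

struct Context {
    const void *geometry_program = nullptr;  // currently bound GS, or null
    GsState gs;
};

// A geometry shader is active only if a GS program is currently bound.
// gs.prog_data is not a reliable signal: it can outlive the program.
bool gs_active(const Context &ctx) { return ctx.geometry_program != nullptr; }

// Returns true if a binding table would be uploaded.
bool upload_gs_binding_table(const Context &ctx)
{
    if (!gs_active(ctx))
        return false;  // skip: no geometry shader is active
    // ... a real implementation would read ctx.gs.prog_data here ...
    return true;
}
```

The follow-up commit 756b4f9a8c additionally resets `prog_data` to NULL when no GS program is present, which makes a stale pointer fail fast instead of silently being read.
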
Tom Stellard
bab6f40b29 radeon/compute: Unconditionally inline all functions v2
We need to do this until function calls are supported.

v2:
  - Fix loop conditional

https://bugs.freedesktop.org/show_bug.cgi?id=64225

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ddc77c5092)
2013-11-26 13:10:29 -08:00
Kenneth Graunke
c0c3fa564b i965: Use __attribute__((flatten)) on fast tiled teximage code.
The fast tiled texture upload code does not compile with GCC 4.8's -Og
optimization flag.

memcpy() has the always_inline attribute set.  This poses a problem,
since {x,y}tile_copy_faster calls it indirectly via {x,y}tile_copy,
and {x,y}tile_copy normally aren't inlined at -Og.

Using __attribute__((flatten)) tells GCC to inline every function call
inside the function, which I believe was the author's intent.

Fix suggested by Alexander Monakov.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ad542a10c5)
2013-11-26 13:09:41 -08:00
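
For readers unfamiliar with the attribute used in commit c0c3fa564b, here is a small illustration. The function names are made up and this is not the tiled upload code itself; it assumes a compiler that supports the GNU `flatten` attribute (GCC, Clang).

```cpp
#include <cstddef>
#include <cstring>

// Small helper that the optimizer may decline to inline at low
// optimization levels.
static void copy_span(char *dst, const char *src, std::size_t bytes)
{
    std::memcpy(dst, src, bytes);
}

// flatten asks the compiler to inline, where possible, every call made
// inside this function.
__attribute__((flatten))
void copy_rows(char *dst, const char *src, std::size_t row_bytes, std::size_t rows)
{
    for (std::size_t r = 0; r < rows; ++r)
        copy_span(dst + r * row_bytes, src + r * row_bytes, row_bytes);
}
```

The attribute goes on the outermost caller, so the helpers themselves need no annotation.
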
Maarten Lankhorst
ec013f809b gbm/dri: hide extension loader symbols
They should not be exposed.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5455c818b5)
2013-11-26 13:09:29 -08:00
Ian Romanick
866ce39ca0 mesa: Bump version to 10.0.0-rc2
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-23 17:23:00 -08:00
Ian Romanick
48e4daf977 Remove 068a073 from the pick list
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-23 17:20:36 -08:00
Eric Anholt
1efe2ef620 i965: Fix streamed state dumping/annotation after the blorp-flush change.
I think I was thinking of the batch command packet cache when I pasted
this in, but this counter is only used for dumping out streamed state for
INTEL_DEBUG=batch and for putting annotations in our aub files.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5891f98145)
2013-11-23 12:55:04 -08:00
Paul Berry
47ff55fa86 mesa: Implement GL_FRAMEBUFFER_ATTACHMENT_LAYERED query.
From section 6.1.18 (Renderbuffer Object Queries) of the GL 3.2 spec,
under the heading "If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE
is TEXTURE, then":

    If pname is FRAMEBUFFER_ATTACHMENT_LAYERED, then params will
    contain TRUE if an entire level of a three-dimensional texture,
    cube map texture, or one- or two-dimensional array texture is
    attached. Otherwise, params will contain FALSE.

Fixes piglit tests:
- spec/!OpenGL 3.2/layered-rendering/framebuffer-layered-attachments
- spec/!OpenGL 3.2/layered-rendering/framebuffertexture-defaults

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

v2: Don't include "EXT" in the error message, since this query only
makes sense in context versions that have adopted
glGetFramebufferAttachmentParameteriv().

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit ec79c05cbf)
2013-11-23 12:55:04 -08:00
Paul Berry
8f4d95d41c mesa: Fix texture target validation for glFramebufferTexture()
Previously we were using the code path for validating
glFramebufferTextureLayer().  But glFramebufferTexture() allows
additional texture types.

Fixes piglit tests:
- spec/!OpenGL 3.2/layered-rendering/gl-layer-cube-map
- spec/!OpenGL 3.2/layered-rendering/framebuffertexture

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

v2: Clarify comment above framebuffer_texture().

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit af1471dc04)
2013-11-23 12:55:04 -08:00
Paul Berry
79d727e063 i965: Fix fast clear of depth buffers.
From section 4.4.7 (Layered Framebuffers) of the GL 3.2 spec:

    When the Clear or ClearBuffer* commands are used to clear a
    layered framebuffer attachment, all layers of the attachment are
    cleared.

This patch fixes the fast depth clear path.

Fixes piglit test "spec/!OpenGL 3.2/layered-rendering/clear-depth".

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 0831523350)
2013-11-23 12:55:04 -08:00
Paul Berry
7f99ae72c4 i965: Fix blorp clear of layered framebuffers.
From section 4.4.7 (Layered Framebuffers) of the GL 3.2 spec:

    When the Clear or ClearBuffer* commands are used to clear a
    layered framebuffer attachment, all layers of the attachment are
    cleared.

This patch fixes the blorp clear path for color buffers.

Fixes piglit test "spec/!OpenGL 3.2/layered-rendering/clear-color".

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit c1019670ea)
2013-11-23 12:55:04 -08:00
Paul Berry
e934782b2a i965: refactor blorp clear code in preparation for layered clears.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 1ec5365429)
2013-11-23 12:55:04 -08:00
Paul Berry
ffa073ec72 mesa: Track number of layers in layered framebuffers.
In order to properly clear layered framebuffers, we need to know how
many layers they have.  The easiest way to do this is to record it in
the gl_framebuffer struct when we check framebuffer completeness.

This patch replaces the gl_framebuffer::Layered boolean with a
gl_framebuffer::NumLayers integer, which is 0 if the framebuffer is
not layered, and equal to the number of layers otherwise.

v2: Remove gl_framebuffer::Layered and make gl_framebuffer::NumLayers
always have a defined value.  Fix factor of 6 error in the number of
layers in a cube map array.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 95140740ad)
2013-11-23 12:47:05 -08:00
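
One way a clear path could consume the `NumLayers` field introduced in commit ffa073ec72 is sketched below. The `Framebuffer` struct is a stand-in, and treating a non-layered framebuffer as a single layer 0 is an assumption made for this illustration.

```cpp
#include <vector>

// Simplified stand-in for gl_framebuffer: only the field from the commit.
struct Framebuffer {
    unsigned NumLayers;  // 0 if not layered, else the number of layers
};

// Returns the list of layer indices a clear would touch.
std::vector<unsigned> layers_to_clear(const Framebuffer &fb)
{
    // Assumption: a non-layered framebuffer is treated as one layer, index 0.
    const unsigned count = fb.NumLayers == 0 ? 1 : fb.NumLayers;
    std::vector<unsigned> layers;
    for (unsigned i = 0; i < count; ++i)
        layers.push_back(i);
    return layers;
}
```

An integer count replaces the old boolean because a clear has to visit every layer, not just know that layers exist.
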
Tom Stellard
620d11aed4 radeonsi/compute: Fix LDS size calculation
We need to include the number of LDS bytes allocated by the state tracker.

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1bdb99330a)
2013-11-23 12:46:59 -08:00
Tom Stellard
c8cf5dc401 r600g/compute: Add a work-around for flushing issues on Cayman
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

https://bugs.freedesktop.org/show_bug.cgi?id=69321

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7a30cd7085)
2013-11-23 12:46:22 -08:00
Paul Berry
a645df0134 glsl: Fix interstage uniform interface block link error detection.
Previously, we checked for interstage uniform interface block link
errors in validate_interstage_interface_blocks(), which is only called
on pairs of adjacent shader stages.  Therefore, we failed to detect
uniform interface block mismatches between non-adjacent shader stages.

Before the introduction of geometry shaders, this wasn't a problem,
because the only supported shader stages were vertex and fragment
shaders, therefore they were always adjacent.  However, now that we
allow a program to contain vertex, geometry, and fragment shaders,
that is no longer the case.

Fixes piglit test "skip-stage-uniform-block-array-size-mismatch".

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

v2: Rename validate_interstage_interface_blocks() to
validate_interstage_inout_blocks() to reflect the fact that it no
longer validates uniform blocks.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

v3: Make validate_interstage_inout_blocks() skip uniform blocks.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 544e3129c5)
2013-11-23 12:45:16 -08:00
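
The difference between pairwise and whole-program validation in commit a645df0134 can be illustrated with a small sketch. The `UniformBlock` type reduces a block to a name and an array size, which is an assumption for brevity; the real linker compares full definitions.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical reduced view of a uniform block: name and array size.
struct UniformBlock {
    std::string name;
    unsigned array_size;
};

using Stage = std::vector<UniformBlock>;

// Check every stage against one shared table, so a mismatch between
// non-adjacent stages (e.g. VS and FS with a GS in between) is caught.
bool uniform_blocks_consistent(const std::vector<Stage> &stages)
{
    std::map<std::string, unsigned> seen;
    for (const Stage &stage : stages) {
        for (const UniformBlock &b : stage) {
            auto it = seen.find(b.name);
            if (it == seen.end())
                seen[b.name] = b.array_size;
            else if (it->second != b.array_size)
                return false;  // same block name, different definition
        }
    }
    return true;
}
```

A check restricted to adjacent pairs (VS/GS, then GS/FS) would accept a program whose GS never mentions the block, which is the gap this commit closes.
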
Paul Berry
3470916d6a glsl: Fix cross-version linking between VS and GS.
Previously, when attempting to link a vertex shader and a geometry
shader that use different GLSL versions, we would sometimes generate a
link error due to the implicit declaration of gl_PerVertex being
different between the two GLSL versions.

This patch fixes that problem by only requiring interface block
definitions to match when they are explicitly declared.

Fixes piglit test "shaders/version-mixing vs-gs".

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

v2: In the interface_block_definition constructor, move the assignment
to explicitly_declared after the existing if block.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0f4cacbb53)
2013-11-23 12:44:18 -08:00
Paul Berry
320f2fa45d glsl: Prohibit illegal mixing of redeclarations inside/outside gl_PerVertex.
From section 7.1 (Built-In Language Variables) of the GLSL 4.10
spec:

    Also, if a built-in interface block is redeclared, no member of
    the built-in declaration can be redeclared outside the block
    redeclaration.

We have been regarding this text as a clarification to the behaviour
established for gl_PerVertex by GLSL 1.50, so we apply it regardless
of GLSL version.

This patch enforces the rule by adding an enum to ir_variable to track
how the variable was declared: implicitly, normally, or in an
interface block.

Fixes piglit tests:
- gs-redeclares-pervertex-out-after-global-redeclaration.geom
- vs-redeclares-pervertex-out-after-global-redeclaration.vert
- gs-redeclares-pervertex-out-after-other-global-redeclaration.geom
- vs-redeclares-pervertex-out-after-other-global-redeclaration.vert
- gs-redeclares-pervertex-out-before-global-redeclaration
- vs-redeclares-pervertex-out-before-global-redeclaration

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

v2: Don't set "how_declared" redundantly in builtin_variables.cpp.
Properly clone "how_declared".

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 2bbcf19aca)
2013-11-23 12:42:47 -08:00
Tapani Pälli
2747e72036 mesa: enable GL_TEXTURE_LOD_BIAS set/get
Earlier comments suggest this was removed from GL core spec but it is
still there. Enabling makes 'texture_lod_bias_getter' Khronos
conformance tests pass, also removes some errors from Metro Last Light
game which is using this API.

v2: leave NOTE comment (Ian)

Cc: "9.0 9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 7e61b44dcd)
2013-11-23 12:41:46 -08:00
Dave Airlie
d4b7ff7fe0 glx: don't fail out when no configs if we have visuals
GLX 1.2 servers with no SGIX_fbconfigs exist (some citrix thing),
and we fail glxinfo completely in those cases.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b01a3a9b72)
2013-11-23 12:41:41 -08:00
Dave Airlie
63b02533f0 mesa/swrast: fix inverted front buffer rendering with old-school swrast
I've no idea when this broke, but we have some people who wanted it fixed,
so here's my attempt.

Reproducer: run readpix with swrast and hit f, or run trivial tri -sb.
Things are upside down; after this patch they aren't.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62142
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66213

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a43b49dfb1)
2013-11-23 12:41:35 -08:00
Matt Turner
19f05b26ba i965: Link -ldl after libmesa.la
DLOPEN_LIBS is part of DRI_LIB_DEPS.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71512
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 1f9092958d)
2013-11-23 12:40:34 -08:00
Brian Paul
11da04e1bb st/mesa: fix GL_FEEDBACK mode inverted Y coordinate bug
We need to check the drawbuffer's orientation before inverting Y
coordinates.  Fixes piglit feedback tests when running with the
-fbo option.

Cc: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 15d8e05e1e)
2013-11-23 12:40:29 -08:00
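
The orientation check from commit 11da04e1bb amounts to making the Y flip conditional. A minimal sketch, with a hypothetical function name and a boolean standing in for the draw buffer's orientation:

```cpp
// invert_y stands for "the draw buffer's orientation requires flipping";
// how that is determined is driver-specific and not shown here.
float feedback_window_y(float y, float fb_height, bool invert_y)
{
    return invert_y ? fb_height - y : y;
}
```

Flipping unconditionally gives wrong coordinates for any draw buffer whose orientation does not call for it, which is what the -fbo piglit runs exposed.
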
Paul Berry
989d650090 i965/vec4: Fix broken IR annotation in debug output.
Commit 70953b5 (i965: Initialize all member variables of
vec4_instruction on construction) inadvertently added a line to the
vec4_instruction constructor setting this->ir to NULL, wiping out the
previously set value.  As a result, ever since then, the output of
INTEL_DEBUG=vs and INTEL_DEBUG=gs has been missing IR annotations.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 60b1a118e1)
2013-11-23 12:40:20 -08:00
Tom Stellard
9495fb4fff r600g/compute: Fix handling of global buffers in r600_resource_copy_region()
Global buffers do not have an associated cs_buf handle, so
we can't copy them using r600_copy_buffer().

https://bugs.freedesktop.org/show_bug.cgi?id=64226

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1b9511d7ce)
2013-11-23 12:39:45 -08:00
Tom Stellard
521c59f132 gallium: Pass version scripts to linker using --version-script=
This fixes build failures with the gold linker.

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 17930a66aa)
2013-11-23 12:36:07 -08:00
Tom Stellard
eafb9f6756 clover: Optionally return context's devices from clGetProgramInfo()
The spec allows clGetProgramInfo() to return information about either
the devices associated with the program or the devices associated
with the context.  If there are no devices associated with the program,
then we return devices associated with the context.

https://bugs.freedesktop.org/show_bug.cgi?id=52171

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a84dd2398f)
2013-11-23 12:34:16 -08:00
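
The fallback described in commit eafb9f6756 is simple enough to sketch directly. `DeviceId` is a stand-in for `cl_device_id`, and the function name is hypothetical; this is not clover's actual code.

```cpp
#include <vector>

using DeviceId = int;  // stand-in for cl_device_id

// Devices to report for a program: its own devices if it has any,
// otherwise the devices of the context it belongs to.
std::vector<DeviceId> program_info_devices(const std::vector<DeviceId> &program_devices,
                                           const std::vector<DeviceId> &context_devices)
{
    return program_devices.empty() ? context_devices : program_devices;
}
```
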
Paul Berry
5af1fb5324 i965/gen7: Emit workaround flush when changing GS enable state.
v2: Don't go to extra work to avoid extraneous flushes.  (Previous
experiments in the kernel have suggested that flushing the pipeline
when it is already empty is extremely cheap).

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 7dfb4b2d00)
2013-11-23 12:33:17 -08:00
Emil Velikov
0040edcf9d docs: indicate GLX_MESA_query_renderer's completion
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit d33d260b90)
2013-11-23 12:32:38 -08:00
Emil Velikov
defff44e1c docs: add a note about removed state tracker/targets
The X.Org state tracker is gone, as well as the xvmc/vdpau
r300 and softpipe targets.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit ca9794658e)
2013-11-23 12:32:34 -08:00
Vadim Girlin
8f78b06dca r600g/sb: work around hw issues with stack on eg/cm
v2: make it actually work, improve condition

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68503
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
(cherry picked from commit 4cb04aa0df)
2013-11-23 12:32:28 -08:00
Vinson Lee
367241ec64 i965: Add missing break in SHADER_OPCODE_GEN7_SCRATCH_READ case.
Fixes "Missing break in switch" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b570c4229f)
2013-11-23 12:23:08 -08:00
Ian Romanick
15118b45a0 mesa: Bump version to 10.0.0-rc1
2013-11-18 12:23:56 -08:00
Aaron Watry
3fd32619d7 radeon/llvm: Free elf_buffer after use
Prevents a memory leak.

v2: Remove null check

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2be85e2492)
2013-11-15 13:39:41 -08:00
Aaron Watry
7a87dc278e r600/llvm: Free binary.code/binary.config in r600_llvm_compile
radeon_llvm_compile allocates memory for binary.code, binary.config,
or neither depending on what's being done.

We need to make sure to free that memory after it's no longer needed.

v2: Don't bother checking for null before FREE()

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 01f3622c74)
2013-11-15 13:39:41 -08:00
Aaron Watry
f843604b6a r600/llvm: initialize radeon_llvm_binary
Use memset to zero-initialize the struct; otherwise code_size and
config_size could be uninitialized when read later in this method.

It's also hard to do NULL checks on uninitialized pointers.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

v2: Fix indentation

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dd73b99420)
2013-11-15 13:39:41 -08:00
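
The zero-initialization in commit f843604b6a and the unconditional frees in commit 7a87dc278e fit together as sketched below. The `Binary` struct is a stand-in carrying only the fields named in those commit messages, and `std::free` stands in for Mesa's FREE() macro. The sketch also assumes the usual platform property that all-zero bytes represent a null pointer.

```cpp
#include <cstdlib>
#include <cstring>

// Stand-in for radeon_llvm_binary; only the fields named in the commits.
struct Binary {
    unsigned char *code;
    unsigned code_size;
    unsigned char *config;
    unsigned config_size;
};

// Zero every field so sizes read as 0 and pointers read as null.
void binary_init(Binary *b) { std::memset(b, 0, sizeof(*b)); }

// Release whatever was allocated; no null checks are needed because
// free(nullptr) is defined to do nothing.
void binary_release(Binary *b)
{
    std::free(b->code);
    std::free(b->config);
    binary_init(b);
}
```

Without the initial memset, the frees would operate on indeterminate pointers whenever the compile step allocated only one of the two buffers, or neither.
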
Brian Paul
e9f8b78278 svga: mark dest image as defined in svga_surface_copy()
After we blit/copy to a dest texture image we need to mark it as
being defined.  This fixes broken mipmap generation for quite a
few texture formats.  Mipgen involves making texture views and
svga_texture_view_surface() skips texture images that are undefined.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 3969330b47)
2013-11-15 13:39:41 -08:00
Brian Paul
dfff838429 svga: do primitive trimming in translate_indices()
The index translation code expects the number of indexes to be
consistent with the primitive type (ex: a multiple of 3 for
PIPE_PRIM_TRIANGLES).  If it's not, we can write out of bounds
in the destination buffer.

Fixes failed assertions in the pipebuffer debug code found with
Piglit primitive-restart-draw-mode test.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 79984b9928)
2013-11-15 13:39:41 -08:00
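
Primitive trimming as described in commit dfff838429 can be sketched for a few primitive types. The enum is a hypothetical subset and the function is not the svga driver's code; it only shows the rule that the index count must describe whole primitives.

```cpp
// Hypothetical subset of primitive types.
enum class Prim { Points, Lines, LineStrip, Triangles, TriangleStrip };

// Drop trailing indices that cannot form a complete primitive.
unsigned trim_index_count(Prim prim, unsigned count)
{
    switch (prim) {
    case Prim::Points:        return count;
    case Prim::Lines:         return count - count % 2;
    case Prim::LineStrip:     return count >= 2 ? count : 0;
    case Prim::Triangles:     return count - count % 3;
    case Prim::TriangleStrip: return count >= 3 ? count : 0;
    }
    return 0;
}
```

Trimming before translation means the destination buffer can be sized from the trimmed count, so the translator never writes past it.
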
Aaron Watry
11982ca08d gallium/pipe_loader: un-reference udev resources when we're done with them.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 598f61ba28)
2013-11-15 13:39:41 -08:00
Aaron Watry
713966c82f radeonsi/compute: Dispose of LLVM module after compiling kernels
v2: Fix indentation

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4c6ac9e614)
2013-11-15 13:39:41 -08:00
Aaron Watry
3a98fc6abe radeonsi/compute: Free program and program.kernels on shutdown
v2: Fix indentation

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 35dad4a1e2)
2013-11-15 13:39:41 -08:00
Aaron Watry
531637feee radeon/llvm: Free created llvm memory buffer
v2: Fix indentation

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d41b10f811)
2013-11-15 13:39:41 -08:00
Aaron Watry
02807c06b8 radeon/llvm: Free libelf resources
v2: Fix indentation

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a2b93da84b)
2013-11-15 13:39:40 -08:00
Aaron Watry
9ed0452740 radeon/llvm: fix spelling error
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit df482fe02f)
2013-11-15 13:39:40 -08:00
Tom Stellard
ef8fcfc9cf clover: Support multiple devices in clCreateContextFromType() v2
v2:
  - Use clGetDeviceIDs to query devices.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

CC: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 17af4dd52b)
2013-11-15 13:39:40 -08:00
Paul Berry
1b45f255b5 glsl: Rework interface block linking.
Previously, when doing intrastage and interstage interface block
linking, we only checked the interface type; this prevented us from
catching some link errors.

We now check the following additional constraints:

- For intrastage linking, the presence/absence of interface names must
  match.

- For shader ins/outs, the interface names themselves must match when
  doing intrastage linking (note: it's not clear from the spec whether
  this is necessary, but Mesa's implementation currently relies on
  it).

- Array vs. nonarray must be consistent, taking into account the
  special rules for vertex-geometry linkage.

- Array sizes must be consistent (exception: during intrastage
  linking, an unsized array matches a sized array).

Note: validate_interstage_interface_blocks currently handles both
uniforms and in/out variables.  As a result, if all three shader types
are present (VS, GS, and FS), and a uniform interface block is
mentioned in the VS and FS but not the GS, it won't be validated.  I
plan to address this in later patches.

Fixes the following piglit tests in spec/glsl-1.50/linker:
- interface-blocks-vs-fs-array-size-mismatch
- interface-vs-array-to-fs-unnamed
- interface-vs-unnamed-to-fs-array
- intrastage-interface-unnamed-array

v2: Simplify logic in intrastage_match() for handling array sizes.
Make extra_array_level const.  Use an unnamed temporary
interface_block_definition in validate_interstage_interface_blocks()'s
first call to definitions->store().

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit f38ac41ed4)
2013-11-15 13:39:40 -08:00
Paul Berry
1a163c0b34 i965: Fix vertical alignment for multisampled buffers.
From the Sandy Bridge PRM, Vol 1 Part 1 7.18.3.4 (Alignment Unit
Size):

    j [vertical alignment] = 4 for any render target surface is
    multisampled (4x)

From the Ivy Bridge PRM, Vol 4 Part 1 2.12.2.1 (SURFACE_STATE for most
messages), under the "Surface Vertical Alignment" heading:

    This field is intended to be set to VALIGN_4 if the surface was
    rendered as a depth buffer, for a multisampled (4x) render target,
    or for a multisampled (8x) render target, since these surfaces
    support only alignment of 4.

Back in 2012 when we added multisampling support to the i965 driver,
we forgot to update the logic for computing the vertical alignment, so
we were often using a vertical alignment of 2 for multisampled
buffers, leading to subtle rendering errors.

Note that the specs also require a vertical alignment of 4 for all
Y-tiled render target surfaces; I plan to address that in a separate
patch.
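
As a rough illustration of the rule quoted above, the alignment choice
can be sketched as follows.  This is not the driver's actual code:
struct surface_desc and sketch_vertical_alignment() are hypothetical
names, and the fallback value of 2 is an assumption taken from the
"alignment of 2" mentioned in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical surface description; not Mesa's real miptree struct. */
struct surface_desc {
   unsigned num_samples;   /* 0 or 1 = single-sampled, 4 or 8 = MSAA */
   bool is_depth;          /* surface is rendered as a depth buffer */
};

/* Return the vertical alignment (in rows) implied by the PRM text. */
static unsigned
sketch_vertical_alignment(const struct surface_desc *surf)
{
   assert(surf != NULL);

   /* Depth buffers and multisampled render targets only support 4. */
   if (surf->is_depth || surf->num_samples > 1)
      return 4;

   /* Assumed default for every other surface in this sketch. */
   return 2;
}
```

The point is only that the sample count has to feed into the decision;
the Y-tiled case mentioned in the note is deliberately left out.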

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53077
Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit b4c3b833ec)
2013-11-15 13:39:40 -08:00
Paul Berry
53e681f2fe main: Fix MaxUniformComponents for geometry shaders.
For both vertex and fragment shaders we default MaxUniformComponents
to 4 * MAX_UNIFORMS.  It makes sense to do this for geometry shaders
too; if back-ends have different limits they can override them as
necessary.

Fixes piglit test:
spec/glsl-1.50/built-in constants/gl_MaxGeometryUniformComponents

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 46e9f78efc)
2013-11-15 13:39:40 -08:00
Fredrik Höglund
10c25e58ca mesa: Fix derived vertex state not being updated in glCallList()
AEcontext::NewState is not always set when the vertex array state
is changed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71492
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit ff353c218a)
2013-11-15 13:39:40 -08:00
Ian Romanick
0558e10160 dri: Change value param to unsigned
This silences some compiler warnings in i915 and i965.  See also
75982a5.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a15a19f0d1)
2013-11-15 13:39:40 -08:00
Ian Romanick
1e51d3a668 i965: Use drm_intel_get_aperture_sizes instead of hard-coded 2GiB
Systems with little physical memory installed will report less than
2GiB, and some systems may (hypothetically?) have a larger address space
for the GPU.  My IVB still reports 1534.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cb6182bdfa)
2013-11-15 13:39:39 -08:00
Ian Romanick
e5839c2397 i915: Use drm_intel_get_aperture_sizes instead of drmAgpSize
Send the zombie back to the grave before it infects the townsfolk.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9fe108db09)
2013-11-15 13:39:39 -08:00
Kristian Høgsberg
7d2187176a dri: Remove redundant createNewContext function from __DRIimageDriverExtension
createContextAttribs is a superset of what createNewContext provides.
Also remove the function typedef, since createNewContext is deprecated
and no longer used in multiple interfaces.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e048953145)
2013-11-15 13:39:39 -08:00
Kristian Høgsberg
329a75511f wayland: Use __DRIimage based getBuffers implementation when available
This lets us allocate color buffers as __DRIimages and pass them into
the driver instead of having to create a __DRIbuffer with the flink
that requires.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 68bb26bead)
2013-11-15 13:39:39 -08:00
Kristian Høgsberg
76434775e0 gbm: Add support for __DRIimage based getBuffers when available
This lets us allocate color buffers as __DRIimages and pass them into
the driver instead of having to create a __DRIbuffer with the flink
that requires.

With this patch, we can now run gbm on render-nodes.  A render-node is a
drm device that doesn't support modesetting and all the legacy DRI ioctls.
flink is also not supported, but now that gbm doesn't need flink, we can
run piglit on head-less gbm or head-less GPGPU.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 04e3ef00db)
2013-11-15 13:39:39 -08:00
Ander Conselvan de Oliveira
2365244302 dri/i915, dri/i965: Fix support for planar images
Planar images have format __DRI_IMAGE_FORMAT_NONE, but the patch that
moved the conversion from dri_format to the mesa format made it
impossible to allocate an image with that format.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5ba6be2617)
2013-11-15 13:39:39 -08:00
Eric Anholt
3e6f200250 i965/fs: Try a different pre-scheduling heuristic if the first spills.
Since LIFO fails on some shaders in one particular way, and non-LIFO
systematically fails in another way on different kinds of shaders, try
them both, and pick whichever one successfully register allocates first.
Slightly prefer non-LIFO in case we produce extra dependencies in register
allocation, since it should start out with fewer stalls than LIFO.
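
The "try both, keep the first that allocates" idea might look roughly
like the sketch below.  All names here (sketch_sched_mode,
sketch_try_alloc_fn, sketch_pick_schedule) are hypothetical, and the
callback stands in for "schedule with this mode, then attempt register
allocation"; the real compiler is structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for the two pre-scheduling modes. */
enum sketch_sched_mode { SKETCH_SCHED_NON_LIFO, SKETCH_SCHED_LIFO };

/* Schedule with `mode`, try to allocate, return true if nothing spilled. */
typedef bool (*sketch_try_alloc_fn)(enum sketch_sched_mode mode, void *data);

/*
 * Try non-LIFO first (slightly preferred), then LIFO.  Returns true and
 * stores the winning mode in *chosen; false means the caller must spill.
 */
static bool
sketch_pick_schedule(sketch_try_alloc_fn try_alloc, void *data,
                     enum sketch_sched_mode *chosen)
{
   static const enum sketch_sched_mode order[] = {
      SKETCH_SCHED_NON_LIFO,
      SKETCH_SCHED_LIFO,
   };

   assert(try_alloc != NULL && chosen != NULL);

   for (size_t i = 0; i < sizeof(order) / sizeof(order[0]); i++) {
      if (try_alloc(order[i], data)) {
         *chosen = order[i];
         return true;
      }
   }
   return false;
}

/* Example callback for illustration: pretends only LIFO avoids spilling. */
static bool
sketch_only_lifo_fits(enum sketch_sched_mode mode, void *data)
{
   (void)data;
   return mode == SKETCH_SCHED_LIFO;
}
```

With sketch_only_lifo_fits as the callback, the loop falls through the
preferred mode and settles on LIFO, which is the fallback behavior
described above.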

This is madness, but I haven't come up with another way to get unigine
tropics to not spill while keeping other programs from spilling and
retaining the non-unigine performance wins from texture-grf.

total instructions in shared programs: 1626728 -> 1626288 (-0.03%)
instructions in affected programs:     1015 -> 575 (-43.35%)
GAINED:                                50
LOST:                                  0

Improves Unigine Tropics performance by 14.5257% +/- 0.241838% (n=38)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70445
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit e9daead784)
2013-11-15 13:39:39 -08:00
Eric Anholt
99c62ff2ea i965/fs: Do instruction pre-scheduling just before register allocation.
Long ago, the HW_REG usages in assign_curb/urb_setup() were scheduling
barriers, so we had to run the scheduler before them in order for it to be
able to do basically anything.  Now that that's fixed, we can delay the
scheduling until we go to allocate (which will make the next change less
scary).

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit fbd8303a94)
2013-11-15 13:39:39 -08:00
Eric Anholt
a5a6ef9702 i965/fs: Ignore actual latency pre-reg-alloc.
We care about depth-until-program-end, as a proxy for "make sure I
schedule those early instructions that open up the other things that can
make progress while keeping register pressure low", not actual latency
(since we're relying on the post-register-alloc scheduling to actually
schedule for the hardware).

total instructions in shared programs: 1609931 -> 1609931 (0.00%)
instructions in affected programs:     0 -> 0
GAINED:                                55
LOST:                                  43

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit f72a0d99fe)
2013-11-15 13:39:39 -08:00
Eric Anholt
6640147463 i965/fs: Fix message setup for SIMD8 spills.
In the SIMD16 spilling changes, I replaced a "1" in the spill path with
"mlen", but obviously it wasn't mlen before because spills have the g0
header along with the payload. The interface I was trying to use was
asking for how many physical regs we're writing, so we're looking for "1"
or "2".

I'm guessing this actually passed piglit because the high 8 bits of the
execution mask in SIMD8 mode are all 0s.
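
For illustration, the distinction between the register count and the
message length can be written out as below.  The function names are
hypothetical, and the mapping of SIMD8 to one register and SIMD16 to
two is an inference from the "1" or "2" above, assuming 32-bit channels
and 32-byte registers.

```c
#include <assert.h>

/* Physical registers written by one spill of 32-bit channels. */
static unsigned
sketch_spill_regs_written(unsigned dispatch_width)
{
   assert(dispatch_width == 8 || dispatch_width == 16);
   return dispatch_width / 8;
}

/* Message length: the g0 header plus the payload registers. */
static unsigned
sketch_spill_mlen(unsigned dispatch_width)
{
   return 1 + sketch_spill_regs_written(dispatch_width);
}
```

Passing the message length where the register count is expected is off
by the one header register, which is the mix-up described above.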

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
(cherry picked from commit 7c90947a0b)
2013-11-15 13:39:38 -08:00
Eric Anholt
229ee20460 i965/fs: Prefer things we know reduce reg pressure when pre-scheduling.
Previously, the best thing we had was to schedule the things unblocked by
the last chosen instruction, on the hope that it would be consuming two
values at the end of their live intervals while only producing one new
value.  But that's just a guess, and we can do counting of usage of
registers to know when an instruction would (almost surely) reduce
register pressure.
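
A minimal sketch of that counting, under simplifying assumptions: each
instruction has at most three sources and one destination, and the
caller tracks how many unscheduled reads remain per virtual register.
struct sketch_inst and sketch_pressure_benefit() are hypothetical and
not the scheduler's real data structures.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_SRCS 3

/* Hypothetical instruction record; -1 marks an unused operand slot. */
struct sketch_inst {
   int dst;
   int src[SKETCH_MAX_SRCS];
};

/*
 * remaining_reads[r] is the number of not-yet-scheduled reads of virtual
 * register r.  The result is (values that die here) minus (values that
 * start here); a positive result suggests lower register pressure.
 */
static int
sketch_pressure_benefit(const struct sketch_inst *inst,
                        const unsigned *remaining_reads)
{
   int benefit = 0;

   assert(inst != NULL && remaining_reads != NULL);

   for (size_t i = 0; i < SKETCH_MAX_SRCS; i++) {
      const int reg = inst->src[i];
      unsigned reads_here = 0;
      int seen_before = 0;

      if (reg < 0)
         continue;

      /* Count each distinct source register only once. */
      for (size_t j = 0; j < i; j++)
         seen_before |= (inst->src[j] == reg);
      if (seen_before)
         continue;

      for (size_t j = 0; j < SKETCH_MAX_SRCS; j++)
         reads_here += (inst->src[j] == reg);

      /* Every outstanding read is in this instruction: the value dies. */
      if (remaining_reads[reg] == reads_here)
         benefit++;
   }

   /* Writing a destination starts a new live value. */
   if (inst->dst >= 0)
      benefit--;

   return benefit;
}
```

An instruction that consumes two dying values and produces one scores
+1 here, matching the "consuming two, producing one" case that the old
heuristic could only guess at.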

The only failure mode I know of in this new dominant heuristic is that
inside of a loop when scheduling the iterator (for example), choosing the
last use of the iterator doesn't actually reduce the live interval of the
iterator.  But it doesn't seem to matter in shader-db:

total instructions in shared programs: 1618700 -> 1618700 (0.00%)
instructions in affected programs:     0 -> 0
GAINED:                                13
LOST:                                  0

Note: The new functions are made virtual because I expect we'll soon lift
the pre-regalloc scheduling heuristic over to the vec4 backend.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit bc0e3bb4d0)
2013-11-15 13:39:38 -08:00
Eric Anholt
dbddd86cc2 i965: Fix undefined value usage in ABO setup.
Fixes a compiler warning.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 9b3e1592c2)
2013-11-15 13:39:38 -08:00
Francisco Jerez
c702f5eead clover: Fix the const variant of adaptor_range::end to deal with mismatching range sizes.
Fixes infinite loop in find_grid_optimal_factor() in cases where the
user specifies a grid size with fewer dimensions than the device
supports.

Reported-by: Tom Stellard <thomas.stellard@amd.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 99d447cc5d)
2013-11-15 13:39:38 -08:00
Cyril Brulebois
c4cc166abc gallium: fix build on GNU/Hurd due to missing PIPE_OS_HURD detection
Thanks to Pino Toscano.  Patch from Debian package.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 2d77e4f922)
2013-11-15 13:39:38 -08:00
Petr Sebor
2a3dcece72 meta: enable vertex attributes in the context of the newly created array object
Otherwise, the function would enable generic vertex attributes 0
and 1 of the array object it does not own. This was causing crashes
in Euro Truck Simulator 2, since the incorrectly enabled generic
attribute 0 in the foreign context took precedence over the vertex
position attribute at a later time, leading to a NULL pointer dereference.

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Petr Sebor <petr@scssoft.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit f2b844f59d)
2013-11-15 13:39:38 -08:00
Brian Paul
accc276df2 mesa: call update_array_format() after error checking
We try to do all error checking before changing any GL state.
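
The general pattern, sketched with hypothetical names (this is not the
real update_array_format() code, and the limits checked are only
examples): validate every argument first, and write to the state only
once nothing can fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slice of vertex array state. */
struct sketch_array_state {
   int size;
   int stride;
};

/* Returns false and leaves *state untouched if any argument is invalid. */
static bool
sketch_set_array(struct sketch_array_state *state, int size, int stride)
{
   assert(state != NULL);

   /* All error checks come first... */
   if (size < 1 || size > 4)
      return false;
   if (stride < 0)
      return false;

   /* ...and only then is any state modified. */
   state->size = size;
   state->stride = stride;
   return true;
}
```

A rejected call therefore leaves the previous size and stride in place,
which is what an application expects after a GL error.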

Cc: "10.0" <mesa-stable@lists.freedesktop.org>

Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit ce193d4f01)
2013-11-15 13:39:38 -08:00
Ilia Mirkin
afbdcdcaaf nouveau/video: mark bitstream-level acceleration as unsupported
Adding a vl_mpeg-based helper didn't seem to work, as it produced data
that the card couldn't handle. (And I didn't investigate further.) This
makes the decoding functionality only accessible via XvMC and avoids
crashes when attempting to use VDPAU.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 08122e151a)
2013-11-15 13:39:38 -08:00
Ilia Mirkin
6f2877c40d nouveau/video: don't try on nv3x
It doesn't work, I don't know why, but no point in hanging people's
displays until it gets figured out.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e8d5d3409c)
2013-11-15 13:39:38 -08:00
Tom Stellard
02d9e1be87 egl-static: Only export necessary symbols v3
This fixes a crash in glamor when mesa links against static LLVM.

v2:
  - Inline LINKER_SCRIPT variable

v3: Kai Wasserbäch
  - Fix out-of-tree builds

Tested-by: Kai Wasserbäch <kai@dev.carbon-project.or>
(cherry picked from commit 594fa4a208)
2013-11-15 13:39:37 -08:00
Tom Stellard
095d583e52 configure.ac: Don't require shared LLVM when building OpenCL
This works now that pipe_*.so is no longer exporting LLVM symbols.

Tested-by: Kai Wasserbäch <kai@dev.carbon-project.or>
(cherry picked from commit cb080a10b6)
2013-11-15 13:39:37 -08:00
Tom Stellard
8af132fca9 pipe-loader: Only export necessary symbols v3
This makes it possible to use clover with statically linked LLVM.

v2:
  - Inline LINKER_SCRIPT variable

v3: Kai Wasserbäch
  - Fix out-of-tree builds

Tested-by: Kai Wasserbäch <kai@dev.carbon-project.or>
(cherry picked from commit 6d6c749215)
2013-11-15 13:39:37 -08:00
Tom Stellard
ade312cd8a radeonsi/compute: Add Sea Islands support
(cherry picked from commit a859131003)
2013-11-15 13:39:37 -08:00
Rico Schüller
f9a74a0b4c tests: Fix make check for out of tree builds.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Rico Schüller <kgbricola@web.de>
(cherry picked from commit 23afe71f44)
2013-11-15 13:39:37 -08:00
Brian Paul
0e3f5999b9 osmesa: fix broken triangle/line drawing when using float color buffer
Doesn't seem to help with bug 71363 but it fixed a failure I found in
my testing.

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a66a008b17)
2013-11-15 13:39:37 -08:00
Chris Forbes
ebc460bc5f i965: convert brw_lower_offset_array_visitor to ir_rvalue_visitor
Previously, we would bogusly replace the entire statement containing the
ir_texture node with an ir_dereference_variable.

Correct this to just replace the ir_texture node itself as intended.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5442c0eae3)
2013-11-15 13:39:37 -08:00
Chris Forbes
0010bdd54a glsl: fix missing breaks in equals(ir_texture,..)
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d257350949)
2013-11-15 12:28:34 -08:00
Matt Turner
b8a631295a i965/fs: Don't perform CSE on inst HW_REG dests (unless it's null)
Commit b16b3c87 began performing CSE on CMP instructions with null
destinations. I relaxed the restrictions a bit too much, thereby
allowing CSE to be performed on instructions with, for instance, an
explicit accumulator destination.

This broke the arb_gpu_shader5/fs-imulExtended shader tests because
they emit MUL instructions with the accumulator as the destination. CSE
would instead cause the MUL to write to a GRF, which is lower precision
than the accumulator.
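
The restored restriction amounts to a small predicate on the
destination.  The sketch below uses hypothetical types (sketch_file,
sketch_dst) rather than the backend's real register classes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register files: ordinary GRFs vs. fixed hardware regs. */
enum sketch_file { SKETCH_GRF, SKETCH_HW_REG };

struct sketch_dst {
   enum sketch_file file;
   bool is_null;   /* the null register: the result is discarded */
};

/* CSE may rewrite the destination, so fixed HW regs must be left alone. */
static bool
sketch_dst_allows_cse(const struct sketch_dst *dst)
{
   assert(dst != NULL);
   return dst->file != SKETCH_HW_REG || dst->is_null;
}
```

An explicit accumulator destination is a non-null HW_REG and is
rejected, while the null-destination CMP case stays eligible.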

Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 68349e5219)
2013-11-15 12:28:34 -08:00
Ian Romanick
c94ed272eb Add .cherry-ignore file
Since we've disabled DRI3 completely in 10.0, commit f0f202e is no
longer necessary.
2013-11-15 12:28:33 -08:00
Eric Anholt
47139b0233 glx: Back DRI3 enablement out of the stable branch.
After more testing (everyone else trying to build the stack is having as
much trouble as I had, even after the problems I had were fixed), it
really feels like dri3 is not something we're ready to support in this
stable branch.  The .c/.h code will remain here to enable easier
cherry-picking from master, and everything stays on master so we can ship
a solid DRI3 in 3 months.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-11-15 12:28:33 -08:00
Brian Paul
94251281b4 glx: change query_renderer_integer() value param to unsigned
When this function was added, the returned value was signed in some
places, unsigned in others.

v2: also add unsigned in the unit test, per Ian.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 75982a5df4)
2013-11-15 11:58:06 -08:00
José Fonseca
03a29306b5 glx: Fix scons build.
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 6c6f4aa6fd)
2013-11-15 11:58:06 -08:00
Brian Paul
d37ea6dfec swrast: add missing notify_reset parameter to dri_create_context()
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit f41c01c688)
2013-11-15 11:57:10 -08:00
José Fonseca
84ee00c1b2 scons: Add dri2_query_renderer.c to sources.
(cherry picked from commit cb3c57df3a)
2013-11-15 11:57:10 -08:00
José Fonseca
3ffcc96abc st/dri: Fix dri_create_context declaration prototype.
(cherry picked from commit caf1d96862)
2013-11-15 11:57:10 -08:00
Alexander von Gluck IV
bc94bf08c4 haiku/swrast: Inherit gl_config, fix flush
* Inherit gl_context so we always have access to it
* Thanks curro for the idea.
* Last Haiku candidate for 10.0.0

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2013-11-14 12:38:27 -06:00
Alexander von Gluck IV
ce904c4caf haiku: add swrast driver
* This is pretty small and upkeep should be minimal.
* Currently fully working.
* Candidate for 10.0.0 branch

Acked-by: Brian Paul <brianp@vmware.com>
2013-11-13 13:35:58 -06:00
Keith Packard
035cce83f7 dri3: Fix pixmap buf_id computation
Looks like some kind of rebase damage to me...

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-07 19:08:09 -08:00
Eric Anholt
4b5d0d10f1 glx: Add a more informative debug message in a DRI3 error path. 2013-11-07 19:08:09 -08:00
Keith Packard
2d94601582 Add DRI3+Present loader
Uses the __DRIimage loader interfaces.

v2: Fix _XIOErrors when DRI3 isn't present (change by anholt).  Apparently
    XCB just terminates your connection if you don't check for extensions
    before using them, instead of returning an error like you'd expect.

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-07 19:08:09 -08:00
Keith Packard
442442026e dri: add __DRIimageLoaderExtension and __DRIimageDriverExtension
These provide an interface between the driver and the loader to allocate
color buffers through the DRIimage extension interface rather than through a
loader-specific extension (as is used by DRI2, for instance).

The driver uses the loader 'getBuffers' interface to allocate color buffers.

The loader uses the createNewScreen2, createNewDrawable, createNewContext,
getAPIMask and createContextAttribs APIs (mostly shared with DRI2).

This interface will work with the DRI3 loader, and should also work with GBM
and other loaders so that drivers need not be customized for each new loader
interface, as long as they provide this image interface.

v2: Fix build of i915 and i965 together (by anholt)

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-07 19:08:09 -08:00
Keith Packard
1f085ba18f dri/i915,dri/i965: Use driGLFormatToImageFormat and driImageFormatToGLFormat
Remove private versions of these functions

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2013-11-07 19:08:09 -08:00
Keith Packard
b7818b8c36 dri/common: Add functions mapping MESA_FORMAT_* <-> __DRI_IMAGE_FORMAT_*
The __DRI_IMAGE_FORMAT codes are used by the image extension, drivers need to
be able to translate between them. Instead of duplicating this translation in
each driver, create a shared version.

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2013-11-07 19:08:09 -08:00
Keith Packard
aba6b84ce5 Define __DRI_IMAGE_FORMAT_SARGB8
This format will be used by the i965 driver

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-11-07 19:08:09 -08:00
Keith Packard
bf6591e948 dri/intel: Add explicit size parameter to intel_region_alloc_for_fd
Instead of assuming that the size will be height * pitch, have the caller pass
in the size explicitly.

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2013-11-07 19:08:09 -08:00
Keith Packard
888533dcd6 dri/intel: Split out DRI2 buffer update code to separate function
Make an easy place to splice in a DRI3 version of this function

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-11-07 19:08:09 -08:00
Keith Packard
f66a6c5fe7 drivers/dri/common: A few dri2 functions are not actually DRI2 specific
This just renames them so that they can be used with the DRI3 extension
without causing too much confusion.

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-11-07 19:08:09 -08:00
Roland Scheidegger
ea1f7d2894 gallivm: deduplicate some indirect register address code
There's only one minor functional change, for immediates the pixel offsets
are no longer added since the values are all the same for all elements in
any case (it might be better if those weren't stored as soa vectors in the
first place maybe).

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-11-08 03:38:32 +01:00
Ian Romanick
8c5330226f glx/tests: Add unit tests for the DRI2 part of GLX_MESA_query_renderer
After adding $(DEFINES) to AM_CPPFLAGS, the __glXGetCurrentContext
wrapper function is no longer needed and causes compile errors.  Using
the correct defines causes it to be a macro!

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-07 18:12:33 -08:00
Ian Romanick
0cce553867 glx/tests: Add unit tests for the GLX part of GLX_MESA_query_renderer
These tests primarily ensure that the functions added by this extension
don't abuse other interfaces (e.g., glx_screen::query_renderer_integer)
when provided bad data.

These tests helped me find a couple small bugs in the initial
implementation.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-07 18:12:33 -08:00
Ian Romanick
d4cc186937 glx/tests: Add GetGLXScreenConfigs_called flag
Tests for the GLX_MESA_query_context extension will use this flag.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-07 18:12:33 -08:00
Ian Romanick
ee6c9fcbca docs: Import extension spec for GLX_MESA_query_renderer
The enumerated values are currently allocated from Intel's range.

v2: Fix a typo.  Update the list of functions to which the new enums can
be passed.  The "Current" versions were previously missing.  Both things
noticed by Marek.

v3: Fix typo in return type of glXQueryRendererIntegerMESA in the spec
body (noticed by Ken).  Fix typo in issue #14 referencing itself instead
of issue #13 (noticed by Dave).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2013-11-07 18:12:33 -08:00
Ian Romanick
4680d237c5 glx/dri2: Add DRI2 support for GLX_MESA_query_renderer
The new functions for this extension were added to a separate file
(dri2_query_renderer.c) to facilitate unit testing.  I tried putting
them in dri2_glx.c, and it resulted in an unending chain of
dependencies.  It was the proverbial thread hanging from a sweater.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 18:12:33 -08:00
Ian Romanick
419684091c glx/dri2: Pull some internal structures out to a separate header file
These structures will be accessed by internal functions that will be
added in a file separate from dri2_glx.c.  The new code will be added to
a new file to facilitate unit testing.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 18:12:32 -08:00
Ian Romanick
4944588cfd glx/tests: Silence warnings after adding fields to glx_screen_vtable
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-07 18:12:32 -08:00
Ian Romanick
6c28c037c4 glx: Add functions and GLX plumbing for GLX_MESA_query_renderer
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 18:12:32 -08:00
Ian Romanick
38a1d8b14c glx: Add GLX_MESA_query_renderer
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 18:12:32 -08:00
Ian Romanick
b3ffc5b6f4 glx: Add extension tracking GLX_MESA_query_renderer
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 18:12:32 -08:00
Ian Romanick
1e4ce08f38 i965: Wire up initial support for DRI_RENDERER_QUERY extension
v2: Use sysconf instead of sysinfo for improved portability.  Suggested
by Ken.
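
A sketch of what a sysconf-based query can look like, assuming the
value of interest is total system memory (this message does not say
which values are queried).  sketch_query_system_memory_mib() is a
hypothetical name; _SC_PHYS_PAGES is a widely available extension
(glibc among others) rather than something POSIX requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Store total physical memory in MiB; false if the query is unsupported. */
static bool
sketch_query_system_memory_mib(uint64_t *mib)
{
   assert(mib != NULL);

   const long pages = sysconf(_SC_PHYS_PAGES);
   const long page_size = sysconf(_SC_PAGE_SIZE);

   /* sysconf() returns -1 when a name is unsupported or has no limit. */
   if (pages <= 0 || page_size <= 0)
      return false;

   *mib = ((uint64_t)pages * (uint64_t)page_size) / (1024 * 1024);
   return true;
}
```

Unlike sysinfo(), which is Linux-specific, this only depends on
<unistd.h>, which is the portability argument made above.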

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 18:12:27 -08:00
Ian Romanick
2fe6fbd19f i915: Wire up initial support for DRI_RENDERER_QUERY extension
v2: Use sysconf instead of sysinfo for improved portability.  Suggested
by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 18:08:15 -08:00
Ian Romanick
9dbc14abcf dri: Add function to implement queries common to all Mesa drivers
v2: Add assertions that the version string has the expected format.
This will catch build errors (or changes to the version string format)
in debug builds without exposing release builds to buffer over-runs.
Suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 18:08:15 -08:00
Ian Romanick
83ffe47be0 i965: Refactor the renderer string creation out of intelGetString
This will soon be used in intel_screen.c from a function that doesn't
have a gl_context.

v2: Delete local variables that are now unused.  This matches v1 of the
changes to the i915 driver.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 18:08:15 -08:00
Ian Romanick
339f36fc5e i915: Refactor the renderer string creation out of intelGetString
This will soon be used in intel_screen.c from a function that doesn't
have a gl_context.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 18:08:15 -08:00
Ian Romanick
18291251ec i965: Refactor the vendor string out of intelGetString
This will soon be used in intel_screen.c from a function that doesn't
have a gl_context.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 18:08:15 -08:00
Ian Romanick
135b7e7260 i915: Refactor the vendor string out of intelGetString
This will soon be used in intel_screen.c from a function that doesn't
have a gl_context.

v2: Remove spurious break after return.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 18:08:15 -08:00
Ian Romanick
64bb1e857a dri: Add interface definition for DRI_RENDERER_QUERY extension
This will be used to let apps query hardware and driver limits before
creating a GL context.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 18:08:15 -08:00
Ian Romanick
1f712bdd38 i965: Enable DRI_Robustness extension
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 17:40:25 -08:00
Ian Romanick
e8dac9632d i965: Propagate the GPU reset notification strategy down into the driver
If the application requests reset notification, connect up the reset
status query method and set gl_context::ResetStrategy.

v2: Update based on kernel interface / libdrm changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 17:40:25 -08:00
Ian Romanick
8f2c93ff75 i965: Add function to query the GPU reset status for a context
v2: Update based on kernel interface / libdrm changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 17:40:25 -08:00
Ian Romanick
15c3bac3d0 i965: Handle __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS flag
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 17:40:25 -08:00
Ian Romanick
7b140d1bda mesa/dri: Move context flag validation down into the drivers
Soon some drivers will support a different set of flags than other
drivers.  If some flags have to be filtered in the driver, we might as
well filter all of them in the driver.

The changes in nouveau use tabs because nouveau seems to have its own
indentation rules.

v2: Fix some rebase failures noticed by Ken (returning the wrong types,
etc.).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 17:40:05 -08:00
Ian Romanick
17c94de33b mesa/dri: Add basic plumbing for GLX_ARB_robustness reset notification strategy
No drivers advertise the DRI2 extension yet, so no driver should ever
see a value other than false for notify_reset.

The changes in nouveau use tabs because nouveau seems to have its own
indentation rules.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 17:31:16 -08:00
Ian Romanick
916bc4491a mesa: Implement proper tracking logic for glGetGraphicsResetStatusARB
Drivers still have to implement dd_function_table::GetGraphicsResetStatus.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-11-07 16:41:38 -08:00
Ian Romanick
a6eb04c3d8 mesa: Add gl_shared_state::ShareGroupReset and gl_context::ShareGroupReset
These will be used to determine whether to signal a GPU reset after
another context in the share group has observed a reset.

v2: Change ShareGroupReset from GLboolean to bool.  Suggested by Brian.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-11-07 16:41:38 -08:00
Ian Romanick
2fdc0ee19f mesa: Add dd_function_table::GetGraphicsResetStatus
This allows drivers to determine whether a GPU reset has occurred.  It
should return non-zero status if a reset was observed by the specified
context.  Another mechanism will be used to observe resets occurring in
other contexts in the share group.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-11-07 16:41:38 -08:00
Ian Romanick
114d360dfa mesa: Remove gl_context::ResetStatus
This isn't going to be used in the actual implementation of
glGetGraphicsResetStatus.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-11-07 16:41:38 -08:00
Matt Turner
69b425efae st/xorg: Delete.
Acked-by: Lucas Stach <l.stach@pengutronix.de>
2013-11-07 16:14:25 -08:00
Matt Turner
48f4f59dc6 xorg-nouveau: Delete. 2013-11-07 16:14:25 -08:00
Matt Turner
11ff1725cc xorg-i915: Delete.
Acked-by: Jakob Bornecrantz <wallbraker@gmail.com>
Acked-by: Stéphane Marchesin <stephane.marchesin@gmail.com>
2013-11-07 16:14:25 -08:00
Ian Romanick
cf0da87917 docs: Mark off ARB_shader_atomic_counters for i965
...and update relnotes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-07 16:02:03 -08:00
Francisco Jerez
597634556e i965/gen7: Expose ARB_shader_atomic_counters.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-07 15:56:57 -08:00
Francisco Jerez
5c114939b4 glsl: Linker support for ARB_shader_atomic_counters.
v2: Add comments on the purpose of the auxiliary data structures.
    Check for atomic counter overlaps.  Use the contains_atomic()
    convenience method.  Add static assert with the number of expected
    shader stages.
v3: Don't resize atomic arrays.
v4: Add comment on the reason why we don't resize atomic counter
    arrays.  Use 'strcmp(...) == 0' instead of '!strcmp(...)'.
v5 (idr):  Don't use STL in the linker.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-07 15:56:57 -08:00
Francisco Jerez
e63bb29853 glsl: Implement parser support for atomic counters.
v2: Mark atomic counters as read-only variables.  Move offset overlap
    code to the linker.  Use the contains_atomic() convenience method.
v3: Use pointer to integer instead of non-const reference.  Add
    comment so we remember to add a spec quotation from the next GLSL
    release once the issue of atomic counter aggregation within
    structures is clarified.
v4 (idr): Don't use std::map because it's overkill.  Add an assertion
    that ctx->Const.MaxAtomicBufferBindings <= MAX_COMBINED_ATOMIC_BUFFERS.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-07 15:56:57 -08:00
Kenneth Graunke
30f61c471d Revert "i965: Add support for GL_AMD_performance_monitor on Ironlake."
This reverts most of commit 0f2da77307.
(I chose to leave the additions to brw_defines.h.)

My previous Ironlake implementation was somewhat broken: counter data
was global, rather than per-context.  This meant that performance
monitors captured data from your compositor, 2D driver, and other 3D
programs.

Originally, I believed that Sandybridge and later had an easy way to
avoid this problem (setting per-context flags in OACONTROL), while
Ironlake did not.  So I'd intended to leave it as a known limitation of
performance monitoring support on Ironlake.  However, this turned out
not to be true.

Unfortunately, our hardware only has one set of aggregating performance
counters shared between all 3D programs, and their values are not saved
or restored by hardware contexts.  Also, at least on Sandybridge and
Ivybridge, the counters lose their values if the GPU goes to sleep.

To work around both of these problems, we have to snapshot the
performance counters at the beginning and end of each batch, similar to
how we handle query objects on platforms that don't support hardware
contexts.
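
The bookending idea can be sketched roughly as follows.  This is only an
illustration and not the driver code: hw_counter, struct monitor,
batch_begin() and batch_end() are made-up names, and a real implementation
would read the counters from the GPU rather than from a variable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a free-running hardware counter that is
 * shared by every client of the GPU. */
static uint64_t hw_counter;

struct monitor {
   uint64_t start_snapshot; /* counter value when our batch began */
   uint64_t total;          /* work attributed to this context only */
   int in_batch;
};

/* "Begin" bookend: remember the counter value at batch start. */
static void batch_begin(struct monitor *m)
{
   assert(!m->in_batch);
   m->start_snapshot = hw_counter;
   m->in_batch = 1;
}

/* "End" bookend: accumulate only the delta seen during our own batch. */
static void batch_end(struct monitor *m)
{
   assert(m->in_batch);
   m->total += hw_counter - m->start_snapshot;
   m->in_batch = 0;
}
```

Whatever other clients add to the counter between two of our batches never
lands in m->total, because only begin/end deltas are summed.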

For occlusion queries, this batch bookending approach is fairly simple:
only one occlusion query can be active at a time, and the result is a
single integer.  Performance monitors are more complex: an arbitrary
number of monitors can be active at a time, each monitoring some subset
of our ~30 observability counters.  Individual monitors can be started
and stopped at any point during the batch.  Tracking where each monitor
started/ended relative to batch flushes ends up being a pain.  And you
can run out of space in the buffer.

Properly supporting this required some serious rearchitecting of the
code.  Rather than writing patches to try and morph a broken system into
a working one (which operates quite differently), I decided it would be
simplest to revert the old code and start fresh.  Parts will look
familiar, but other parts are new.

I also decided it would be best to include Sandybridge and Ivybridge
support from the start, since the newer platforms have added complexity
that I wanted to make sure worked.  They're also what most people care
about these days.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-07 15:52:02 -08:00
Kenneth Graunke
1bd6233169 glsl: Enable dFdx, dFdy, and fwidth by default in GLSL ES 3.00.
Previously, we only exposed them in desktop GL or with:

   #extension GL_OES_standard_derivatives : enable

GLSL ES 3.00 includes these without an extension, so we need to expose
them by default.

Note that the above #extension line results in an error on desktop GL,
so we don't need to worry about this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-11-07 15:52:02 -08:00
Fredrik Höglund
c9ac891fa4 docs: Mark off ARB_vertex_type_10f_11f_11f_rev for r600g
...and update relnotes.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-11-07 23:55:46 +01:00
Fredrik Höglund
e420fb887f r600g: Add support for PIPE_FORMAT_R11G11B10_FLOAT vertex elements
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-11-07 23:51:44 +01:00
Fredrik Höglund
bfc28e4aff st/mesa: Add support for ARB_vertex_type_10f_11f_11f_rev
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-11-07 23:51:24 +01:00
Brian Paul
fe9284a7bf mesa: fix return statements in varray.c
Return false, not GL_FALSE.  Add missing return value.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71359
2013-11-07 15:23:36 -07:00
Brian Paul
6592a6d065 svga: always return 4 for PIPE_MAX_COLOR_BUFS
Even if the query returns 8, only 4 really work.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-11-07 15:21:40 -07:00
Brian Paul
055dbd5c3e svga: return true for the PIPE_CAP_SM3 query
This just tells the state tracker to turn on the GL_ARB_shader_texture_lod
extension.  This simply allows the GLSL compiler to emit TXL and TXD
instructions for both vertex and fragment shaders.  We already support
these opcodes in the svga driver.  However, the shadow2DGrad() Piglit
tests are failing.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-11-07 15:21:40 -07:00
Matt Turner
6b990a7474 i965: Add an implementation of intel_miptree_map using streaming loads.
Improves performance of RoboHornet's 2D Canvas toDataURL benchmark
[http://www.robohornet.org/#e=canvastodataurl] by approximately 5x
on Baytrail on ChromiumOS.

Elapsed time drops by -81.4861% +/- 1.22619% (n=3 s=14.9105, confidence=95%).

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-11-07 13:18:03 -08:00
Matt Turner
6f2e81ce4c mesa: Add a streaming load memcpy implementation.
Uses SSE 4.1's MOVNTDQA instruction (streaming load) to read from
uncached memory without polluting the cache.
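
A minimal sketch of what such a copy loop might look like, assuming a
16-byte-aligned source and a length that is a multiple of 16.  The function
name streaming_load_copy() is hypothetical and this is not the code added by
the patch.  The memcpy() fallback is only there so the sketch builds when
SSE 4.1 is not enabled at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE4_1__)
#include <smmintrin.h>
#endif

/* Copy len bytes from src to dst.  Assumes src is 16-byte aligned and
 * len is a multiple of 16. */
static void streaming_load_copy(void *dst, void *src, size_t len)
{
   assert(((uintptr_t)src & 15) == 0 && (len & 15) == 0);
#if defined(__SSE4_1__)
   {
      char *d = dst;
      char *s = src;
      size_t i;

      for (i = 0; i < len; i += 16) {
         /* _mm_stream_load_si128 compiles to MOVNTDQA. */
         __m128i v = _mm_stream_load_si128((__m128i *)(s + i));
         _mm_storeu_si128((__m128i *)(d + i), v);
      }
   }
#else
   memcpy(dst, src, len);  /* portable fallback without SSE 4.1 */
#endif
}
```

The non-temporal hint only matters when the source is write-combining
(uncached) memory; on ordinary cacheable memory the load behaves like a
regular aligned load.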

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-11-07 13:18:03 -08:00
Chris Forbes
d41084a63d docs: Mark off some more things.
These have been supported on i965/Gen7+ for a while, and are listed
in the 10.0 release notes.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2013-11-08 09:57:29 +13:00
Anuj Phogat
735a777842 i965: Fix 'SIMD16 only' dispatch of fragment shader in case of sample shading
This patch makes changes to correctly set up the Dispatch GRF Start
Register in case of 'SIMD16 only' FS dispatch.

This fixes an issue of incorrect rendering on dolphin emulator with
GL_SAMPLE_SHADING enabled.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-07 12:20:33 -08:00
Chris Forbes
4871e7b91f docs: update relnotes 2013-11-08 09:10:06 +13:00
Chris Forbes
2973f38f1c docs: Mark off ARB_vertex_type_10f_11f_11f_rev.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-08 09:10:06 +13:00
Chris Forbes
5e61c746d5 i965: Enable ARB_vertex_type_10f_11f_11f_rev on Gen6+.
This theoretically works on earlier hardware as well, but the extension
requires at least GL3.0.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-08 09:10:06 +13:00
Chris Forbes
7a95bb0a80 i965: add support for UNSIGNED_INT_10F_11F_11F_REV vertex attribs
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-08 09:10:06 +13:00
Chris Forbes
48b6d70bef vbo: add 10_11_11 support to vbo_attrib_tmp
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-08 09:10:06 +13:00
Chris Forbes
fa14f8afa0 mesa: Add support to _mesa_bytes_per_vertex_attrib for 10_11_11 format.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-08 09:10:06 +13:00
Chris Forbes
1f092a9594 mesa: add varray support for UNSIGNED_INT_10F_11F_11F_REV type
V2: fix interaction with VertexAttribFormat, since that landed after
this was originally written
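
For reference, a sketch of how one UNSIGNED_INT_10F_11F_11F_REV value can be
unpacked on the CPU.  It assumes the usual layout (red in bits 0-10, green
in bits 11-21, blue in bits 22-31, each an unsigned float with a 5-bit
exponent biased by 15).  The helper names are invented for this example and
are not Mesa functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static float bits_to_float(uint32_t bits)
{
   float f;
   memcpy(&f, &bits, sizeof f);
   return f;
}

/* Decode an unsigned small float with 5 exponent bits (bias 15) and
 * mant_bits mantissa bits (6 for the 11-bit fields, 5 for the 10-bit one). */
static float decode_small_float(uint32_t v, int mant_bits)
{
   uint32_t exponent = v >> mant_bits;
   uint32_t mantissa = v & ((1u << mant_bits) - 1);

   if (exponent == 0)   /* zero or denormal: mantissa * 2^-(14 + mant_bits) */
      return (float)mantissa / (float)(1u << (14 + mant_bits));
   if (exponent == 31)  /* infinity or NaN */
      return bits_to_float(0x7f800000u | (mantissa << (23 - mant_bits)));
   /* Normal number: rebias the exponent and widen the mantissa. */
   return bits_to_float(((exponent + 127 - 15) << 23) |
                        (mantissa << (23 - mant_bits)));
}

static void unpack_10f_11f_11f_rev(uint32_t packed, float rgb[3])
{
   rgb[0] = decode_small_float(packed & 0x7ff, 6);
   rgb[1] = decode_small_float((packed >> 11) & 0x7ff, 6);
   rgb[2] = decode_small_float((packed >> 22) & 0x3ff, 5);
}
```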

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-08 09:09:43 +13:00
Chris Forbes
aba355b463 mesa: Add extension scaffolding for ARB_vertex_type_10f_11f_11f_rev
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-08 09:00:47 +13:00
Matthew McClure
f9e2c24326 draw,llvmpipe,util: add depth bias calculation for arb_depth_buffer_float
With this patch, the llvmpipe and draw modules will calculate the depth bias
according to floating point depth buffer semantics described in the
arb_depth_buffer_float specification, when the driver has a z buffer bound
with a format type of UTIL_FORMAT_TYPE_FLOAT.

By default, the driver will use the existing UNORM calculation for depth bias.

A new function, draw_set_zs_format, was added to calculate the Minimum
Resolvable Depth value and floating point depth sense for the draw module.
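
As an illustration of the floating-point rule (not the code in this patch):
for a 32-bit float depth buffer the minimum resolvable difference is
r = 2^(e - 23), where e is the exponent of the largest z value in the
primitive, and the bias is then max_slope * factor + r * units.  The helper
names below are hypothetical, and the sketch only handles normalized values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Unbiased IEEE-754 exponent of a positive, normalized 32-bit float. */
static int float_exponent(float z)
{
   uint32_t bits;
   memcpy(&bits, &z, sizeof bits);
   return (int)((bits >> 23) & 0xff) - 127;
}

/* Minimum resolvable depth r = 2^(e - 23) for the given maximum z. */
static float mrd_float32(float max_z)
{
   int e = float_exponent(max_z) - 23;
   uint32_t bits;
   float r;

   assert(e >= -126);  /* keep the sketch to normalized results */
   bits = (uint32_t)(e + 127) << 23;
   memcpy(&r, &bits, sizeof r);
   return r;
}

/* Polygon offset for a float depth buffer: m * factor + r * units. */
static float depth_bias_float32(float max_z, float max_slope,
                                float factor, float units)
{
   return max_slope * factor + mrd_float32(max_z) * units;
}
```

Unlike the UNORM case, r here depends on the depth of the primitive, which
is why it cannot be a single per-format constant.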

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-11-07 18:32:54 +00:00
Eric Anholt
185b5a54c9 i965: Avoid flushing the batch for every blorp op.
This brings over the batch-wrap-prevention and aperture space checking
code from the normal brw_draw.c path, so that we don't need to flush the
batch every time.

There's a risk here if the intel_emit_post_sync_nonzero_flush() call isn't
high enough up in the state emit sequences -- before, we implicitly had
one at the batch flush before any state was emitted, so Mesa's workaround
emits didn't really matter.  Since the SNB fixes by Ken, I didn't see any
regressions after 3 piglit runs.

Improves cairo-gl performance by 13.7733% +/- 1.74876% (n=30/32)
Improves minecraft apitrace performance by 1.03183% +/- 0.482297% (n=90).
Reduces low-resolution GLB 2.7 performance by 1.17553% +/- 0.432263% (n=88)
Reduces Lightsmark performance by 3.70246% +/- 0.322432% (n=126)
No statistically significant performance difference on unigine tropics
(n=10)
No statistically significant performance difference on openarena (n=755)

The two apps that are hurt happen to include stalls on busy buffer
objects, so I think this is an effect of missing out on an opportune
flush.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-07 10:20:33 -08:00
Matt Turner
fd03dd6ddd build: Build gen_matypes and matypes.h from src/mesa.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-07 10:00:25 -08:00
Matt Turner
d8abd6710e build: Change HAVE_X86_ASM to mean x86 or x86-64 asm.
I want a conditional that says generally "we have x86 assembly" in the
next patch.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-07 10:00:25 -08:00
Matt Turner
957c7570ea configure.ac: Test $asm_arch directly.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-07 10:00:25 -08:00
Fredrik Höglund
23e69ad6ec docs: Mark ARB_vertex_attrib_binding as done, update relnotes
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-07 16:21:43 +01:00
Fredrik Höglund
d2ac5d9a13 mesa: Enable ARB_vertex_attrib_binding
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-07 16:20:45 +01:00
Fredrik Höglund
193e8b4b93 mesa: Optimize rebinding the same VBO
Check if the new buffer object has the same name as the current
buffer object before looking it up.
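
A rough sketch of the shortcut, with a made-up object table and function
names (bind_buffer(), lookup_buffer()) that do not correspond to the real
Mesa entry points:

```c
#include <assert.h>
#include <stddef.h>

struct buffer_object { unsigned name; };

/* Hypothetical object table and lookup counter, for illustration only. */
static struct buffer_object table[4] = { {1}, {2}, {3}, {4} };
static unsigned lookups;

/* Stand-in for the (comparatively expensive) name-to-object lookup. */
static struct buffer_object *lookup_buffer(unsigned name)
{
   size_t i;

   lookups++;
   for (i = 0; i < sizeof table / sizeof table[0]; i++)
      if (table[i].name == name)
         return &table[i];
   return NULL;
}

/* Return the object to bind, skipping the lookup when the requested
 * name is already the current binding. */
static struct buffer_object *bind_buffer(struct buffer_object *current,
                                         unsigned name)
{
   if (current != NULL && current->name == name)
      return current;
   return lookup_buffer(name);
}
```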

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-07 16:20:45 +01:00
Fredrik Höglund
965900e830 mesa: Handle zero-stride arrays in _mesa_update_array_max_element()
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-07 16:20:45 +01:00
Fredrik Höglund
fb370f89db mesa: Add Get* support for ARB_vertex_attrib_binding
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-07 16:20:45 +01:00
Fredrik Höglund
59b01ca252 mesa: Add ARB_vertex_attrib_binding
update_array() and update_array_format() are changed to update the new
attrib and binding states, and the client arrays become derived state.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-07 16:20:45 +01:00
Fredrik Höglund
bb2d02c7b5 glapi: Add infrastructure for ARB_vertex_attrib_binding
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-07 16:20:45 +01:00
Fredrik Höglund
ccb6286707 mesa: Make handle_bind_buffer_gen() non-static
...and rename it to _mesa_bind_buffer_gen().

This is so the function can be called from _mesa_BindVertexBuffer().

This patch also adds a caller parameter so we can report the right
entry point in error messages.

Based on a patch by Eric Anholt.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-07 16:20:45 +01:00
Fredrik Höglund
12cbe995ed mesa: Rename gl_array_object::VertexAttrib to _VertexAttrib
This will become derived state as part of the ARB_vertex_attrib_binding
support.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-07 16:20:45 +01:00
Fredrik Höglund
d5543213f2 mesa: Split out the format code from update_array()
Split out the code for updating the array format into a new function
called update_array_format(). This function will be called by both
update_array() and the new glVertexAttrib*Format() entry points in
ARB_vertex_attrib_binding.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-07 16:20:44 +01:00
Fredrik Höglund
6a650fa787 mesa: Restore gl_array_object::NewArray
This will be used by the ARB_vertex_attrib_binding implementation.
This reverts commit db38e9a0e1.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-07 16:20:44 +01:00
Kenneth Graunke
c6a3fb69c6 i965: Use has_surface_tile_offset in depth/stencil alignment workaround.
Currently, has_surface_tile_offset is equivalent to gen == 4 && !is_g4x.

We already use it for related checks in brw_wm_surface_state.c, so it
makes sense to use it here too.  It's simpler and more future-proof.

Broadwell also lacks surface tile offsets.  With this patch, I won't
need to update any generation checking; I can simply not set the flag.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-07 00:17:53 -08:00
Fabio Pedretti
110009302b gallium: fix build on GNU/kFreeBSD
Patch from Debian package

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-11-06 22:08:26 +01:00
Fabio Pedretti
4f4da81dc8 configure.ac: fix build on GNU/kFreeBSD
Based on existing patch from Debian package.

Debian bug: http://bugs.debian.org/524690

Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-11-06 22:08:26 +01:00
Fabio Pedretti
9d805c96eb mesa: add arm64 support
Patch from Ubuntu package

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-11-06 22:08:26 +01:00
Fabio Pedretti
da7daade92 r600/compute: silence unused var warning
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-11-06 22:07:58 +01:00
Paul Berry
2fd785ac49 i965/gen6: Don't allow SIMD16 dispatch in 4x PERPIXEL mode with computed depth.
Hardware docs say we can only use SIMD8 dispatch in this condition.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-11-06 11:58:42 -08:00
Matt Turner
4e84f394e9 configure.ac: Drop no-out-of-tree notice.
We do support out of tree builds now.

Tested-by: Colin Walters <walters@verbum.org>
2013-11-06 11:26:19 -08:00
Matt Turner
5ca3926442 mesa: Build program as part of libmesa. 2013-11-06 11:26:19 -08:00
Matt Turner
b0bfb7c41e mesa: Clean up use of top_srcdir/top_builddir. 2013-11-06 11:26:19 -08:00
Matt Turner
8bc126cd37 i965: Use unreachable() to silence a compiler warning.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2013-11-06 11:26:18 -08:00
Matt Turner
3a5223c24c mesa: Add unreachable() macro.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2013-11-06 11:26:18 -08:00
Roland Scheidegger
b35ea09349 gallivm: fix indirect addressing of inputs
We weren't adding the soa offsets when constructing the indices
for the gather functions. That meant that we were always returning
the data in the first element.
(Copied straight from the same fix for temps.)
While here fix up a couple of broken comments in the fetch functions,
plus don't name a straight float type float4 which is just confusing.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-11-06 18:20:54 +01:00
Vincent Lejeune
08556073d1 r600/llvm: Fix isampleBuffer on preEG 2013-11-06 17:36:22 +01:00
Vincent Lejeune
1184f8fd34 r600/llvm: Fix texbuf for pre EG gen 2013-11-06 17:36:22 +01:00
Brian Paul
36f1c6e3db mesa: for GLSL_DUMP_ON_ERROR, also dump the info log
Since it's helpful to know why the shader did not compile.
Also, call fflush() for Windows.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-11-06 09:04:16 -07:00
Grigori Goronzy
5580ff818e st/vdpau: resolve delayed rendering for GL interop v2
Otherwise OutputSurface interop has funny results sometimes.
This fixes interop with the mpv media player.

v2 (chk): add proper locking

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-11-06 08:45:57 +01:00
Chris Forbes
3785fe2715 docs: Mark off ARB_sample_shading; minor tidyup.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2013-11-06 19:36:27 +13:00
Chris Forbes
f7e15fcf56 i965/fs: Gen4-5: Implement alpha test in shader for MRT
V2: Add comment explaining what emit_alpha_test() is for;
    fix spurious temp and bogus whitespace.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-06 19:29:52 +13:00
Chris Forbes
ca82ba90dd i965/fs: Gen4-5: Setup discard masks for MRT alpha test
The same setup is required here as when the user-provided shader
explicitly uses KIL or discard.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-06 19:29:49 +13:00
Chris Forbes
1080fc610e i965: Gen4-5: Include alpha func/ref in program key
V2: Better explanation of the rationale for doing this.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-06 19:29:46 +13:00
Chris Forbes
dbcd633040 i965: Gen4-5: Don't enable hardware alpha test with MRT
We have to do this in the shader instead, since these gens lack an
independent RT0 alpha value in their render target write messages.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-06 19:29:36 +13:00
Kenneth Graunke
39ebb72e52 i965: Combine {brw,gen7}_update_texture_buffer_surface() functions.
Now that brw_update_texture_buffer_surface() uses the virtual
emit_buffer_surface_state() function, it works for Gen7+ too.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-05 17:59:53 -08:00
Kenneth Graunke
7a974a645e i965: Unvirtualize brw_create_constant_surface; delete Gen7+ variant.
Now that brw_create_constant_surface uses a virtual function internally,
it doesn't need to be virtual itself.  We can delete the Gen7+ variant
and simplify things.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-05 17:59:51 -08:00
Kenneth Graunke
ee23dd139a i965: Use the new emit_buffer_surface_state() vtable entry.
This will allow us to combine the Gen4-6 and Gen7 variants of these
functions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-05 17:59:50 -08:00
Kenneth Graunke
ba836e02a3 i965: Virtualize emit_buffer_surface_state().
This entails adding "mocs" and "rw" parameters to the Gen4-5 version.
I made it actually pay attention to the rw flag (even though it is
always false), but mocs is always ignored.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-05 17:59:39 -08:00
Courtney Goeltzenleuchter
e3854fe194 i965: Fix compiler warning.
fix: intel_screen.c:1320:4: warning: initialization from
incompatible pointer type [enabled by default]

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-05 17:59:38 -08:00
Eric Anholt
ff337bc800 i965: Tell the unit states how many binding table entries we have.
Before the series with 3c9dc2d31b to
dynamically assign our binding table indices, we didn't really track our
binding table count per shader, so we never filled in these fields.

Affects cairo-gl trace runtime by -2.47953% +/- 1.07281% (n=20)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-05 15:39:45 -08:00
Eric Anholt
3f319eef76 i965: Fix context initialization after 2f89662717
You can't return stack-initialized values and expect anything good to
happen.
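
The general hazard, sketched with invented names (struct config and
get_config() are not from the driver, and the actual fix may look quite
different):

```c
#include <assert.h>
#include <stddef.h>

struct config { int major; int minor; };

/* Broken pattern (do not use): the returned pointer refers to a local
 * whose storage is gone as soon as the function returns.
 *
 *    struct config *get_config(void)
 *    {
 *       struct config c = { 3, 1 };
 *       return &c;
 *    }
 */

/* Safe alternative: the caller owns the storage and we only fill it in. */
static void get_config(struct config *out)
{
   assert(out != NULL);
   out->major = 3;
   out->minor = 1;
}
```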

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-11-05 15:39:44 -08:00
Roland Scheidegger
5ae31d7e1d gallivm: optimize lp_build_minify for sse
SSE can't handle true vector shifts (with variable shift count),
so llvm is turning them into a mess of extracts, scalar shifts and inserts.
It is however possible to emulate them in lp_build_minify with float muls,
which should be way faster (saves over 20 instructions per 8-wide
lp_build_minify). This wouldn't work for "generic" 32bit shifts though
since we've got only 24bits of mantissa (actually for left shifts it would
work by using sse41 int mul instead of float mul but not for right shifts).
Note that this has very limited scope for now, since this is only used with
per-pixel lod (otherwise we're avoiding the non-constant shift count by doing
per-quad shifts manually), and only 1d textures even then (though the latter
should change).
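
A scalar sketch of the trick, assuming sizes below 2^24 so the integer is
exactly representable as a float.  The function names are made up for this
example; the real code emits LLVM IR for whole vectors rather than C.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reference: max(size >> level, 1) using an integer shift. */
static uint32_t minify_shift(uint32_t size, uint32_t level)
{
   uint32_t s = size >> level;
   return s > 0 ? s : 1;
}

/* Same result using a float multiply by 2^-level instead of a shift. */
static uint32_t minify_fmul(uint32_t size, uint32_t level)
{
   uint32_t bits, s;
   float scale;

   assert(size < (1u << 24) && level < 32);
   bits = (127u - level) << 23;          /* IEEE-754 encoding of 2^-level */
   memcpy(&scale, &bits, sizeof scale);
   s = (uint32_t)((float)size * scale);  /* conversion truncates toward zero */
   return s > 0 ? s : 1;
}
```

Multiplying by a power of two is exact, and the float-to-int conversion
truncates, so for non-negative sizes the result matches the right shift.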

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-11-05 23:32:24 +01:00
Ian Romanick
7df7e730fb nouveau: Use _NEW_SCISSOR instead of hooking through dd_function_table
This will enable removing the dd_function_table::Scissor hook in the
near future.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2013-11-05 07:50:19 -08:00
Ian Romanick
3f30425424 nouveau: Use _NEW_VIEWPORT instead of hooking through dd_function_table
This will enable removing the dd_function_table::DepthRange hook in the
near future.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2013-11-05 07:50:19 -08:00
Ian Romanick
3a5b84cece radeon / r200: Don't pass unused parameters to radeon_viewport
The x, y, width, and height parameters aren't used by radeon_viewport,
so don't pass them.  This should make future changes to the
dd_function_table::Viewport interface a little easier.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jljusten@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Courtney Goeltzenleuchter <courtney@lunarg.com>
2013-11-05 07:50:12 -08:00
Ian Romanick
619a9bee7d i915: Bring sanity to the Viewport function
The i830 and the i915 driver have the same dd_function_table::Viewport
function... it just has two names and lives in two places.  Using a
single implementation allows cleaning up the saved_viewport nonsense
too.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jljusten@gmail.com>
Cc: Courtney Goeltzenleuchter <courtney@lunarg.com>
2013-11-05 07:50:04 -08:00
Ian Romanick
abd962f1d5 i965: Eliminate the saved_viewport wrapper
The i965 driver never installed a dd_function_table::Viewport function,
so this wrapper never actually did anything.

No piglit regressions on IVB on DRI2.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jljusten@gmail.com>
Cc: Courtney Goeltzenleuchter <courtney@lunarg.com>
2013-11-05 07:49:54 -08:00
Alexander von Gluck IV
1c7605685d mesa: Remove last BEOS checks
* Goodbye BeOS, we hardly knew thee
* As BeOS was gcc2 only, there was little chance
  of this being useful.
* Doesn't affect Haiku in any meaningful way

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-11-05 09:37:58 -06:00
José Fonseca
c883ee4498 util/u_format: take normalized flag in consideration in util_format_is_rgba8_variant
Just happened to notice it was missing while looking at it.
2013-11-05 14:05:41 +00:00
Paul Berry
86cdff5635 glsl: Don't generate misleading debug names when packing gs inputs.
Previously, when packing geometry shader input varyings like this:

    in float foo[3];
    in float bar[3];

lower_packed_varyings would declare a packed varying like this:

    (declare (shader_in flat) (array ivec4 3) packed:foo[0],bar[0])

That's confusing, since the packed varying actually stores all three
values of foo and all three values of bar.

This patch causes it to generate the more sensible declaration:

    (declare (shader_in flat) (array ivec4 3) packed:foo,bar)

Note that there should be no functional change for users of geometry
shaders, since the packed name is only used for generating debug
output.  But this should reduce confusion when using INTEL_DEBUG=gs.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-04 19:04:29 -08:00
Vinson Lee
749cb89097 gallivm: Remove llvm::DisablePrettyStackTrace for LLVM >= 3.4.
LLVM 3.4 r193971 removed llvm::DisablePrettyStackTrace and made the
pretty stack trace opt-in rather than opt-out.

The default value of DisablePrettyStackTrace has changed to true in LLVM
3.4 and newer.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60929
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-11-04 18:22:04 -08:00
Alexander von Gluck IV
e759f1c111 target/haiku-softpipe: Fix viewport issues
* Call mesa viewport call on window resize
* Add initial postprocessing code
* Pass hgl_context to private statetracker
  as it is more useful than GalliumContext
* Use Lock and Unlock functions to standardize
  GalliumContext locking
* Create texture resources in texture validation

Acked-by: Brian Paul <brianp@vmware.com>
2013-11-05 01:17:55 +00:00
Brian Paul
faaf568cfb mesa: remove __alpha__ && CCPML check
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-11-04 18:09:57 -07:00
Brian Paul
2671b576b2 mesa: remove OPENSTEP stuff
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-11-04 18:09:57 -07:00
Brian Paul
32577fc0ad mesa: remove macintosh preprocessor stuff
IIRC, this is MacOS 9.x stuff.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-11-04 18:09:57 -07:00
Brian Paul
5a5d2d2db8 mesa: remove __QUICKDRAW__ tests
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-11-04 18:09:57 -07:00
Brian Paul
9bdc94b94d mesa: remove WGLAPI macro
WGLAPI was defined in glheader.h but wasn't used anywhere.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-11-04 18:09:57 -07:00
Kenneth Graunke
7b4b94a956 i965: Expose brw_reg_from_fs_reg() to other files.
This will be useful for Broadwell code as well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-11-04 16:51:22 -08:00
Kenneth Graunke
10cb91d7fb i965: Combine gen6_clip_state.c and gen7_clip_state.c.
The changes between Gen6-7 are minimal, and can easily be solved with
an extra generation check.  This cuts a lot of duplicated code.

It also helps prevent even more duplication for Broadwell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-04 16:44:42 -08:00
Francisco Jerez
67b8f4c569 dri/nouveau: Fix nouveau_init_screen2 breakage.
Fix incorrect init ordering in nouveau_init_screen2 caused by
083f66fdd6.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71172
2013-11-04 12:17:37 -08:00
Francisco Jerez
35fe7ed7d3 i965/gen7: Add instruction latency estimates for untyped atomics and reads.
The latency information has been obtained empirically from
measurements taken on Haswell and Ivy Bridge.

Acked-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-11-04 12:12:38 -08:00
Francisco Jerez
ba885c30c7 i965/gen7: Handle atomic instructions from the VEC4 back-end.
This can deal with all the 15 32-bit untyped atomic operations the
hardware supports, but only INC and PREDEC are going to be exposed
through the API for now.

v2: Represent atomics as GLSL intrinsics.  Add support for variably
    indexed atomic counter arrays.
v3: Add comment on why we don't need to assign uniform storage for
    atomic counters.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-04 12:12:38 -08:00
Francisco Jerez
764f40d92e i965/gen7: Handle atomic instructions from the FS back-end.
This can deal with all the 15 32-bit untyped atomic operations the
hardware supports, but only INC and PREDEC are going to be exposed
through the API for now.

v2: Represent atomics as GLSL intrinsics.  Add support for variably
    indexed atomic counter arrays.  Fix interaction with fragment
    discard.
v3: Add comment on why we don't need to assign uniform storage for
    atomic counters.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-04 12:12:37 -08:00
Francisco Jerez
34fe051e21 i965: Add a 'has_side_effects' back-end instruction predicate.
This patch fixes the three dead code elimination passes and the
VEC4/FS instruction scheduling passes so they leave instructions with
side effects alone.
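
The idea, sketched on a toy instruction list (the opcodes, struct inst and
the precomputed dst_live flag are assumptions made for this example, not the
i965 IR):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum opcode { OP_ADD, OP_MOV, OP_ATOMIC_INC };  /* hypothetical opcodes */

struct inst {
   enum opcode op;
   int dst;        /* destination register number */
   bool dst_live;  /* is the result read later? (assumed precomputed) */
};

/* Atomics modify memory, so they matter even if their result is unused. */
static bool has_side_effects(const struct inst *i)
{
   return i->op == OP_ATOMIC_INC;
}

/* Compact the list in place, dropping instructions whose result is dead
 * and which have no side effects.  Returns the new instruction count. */
static size_t dead_code_eliminate(struct inst *insts, size_t count)
{
   size_t in, out = 0;

   assert(insts != NULL || count == 0);
   for (in = 0; in < count; in++) {
      if (!insts[in].dst_live && !has_side_effects(&insts[in]))
         continue;
      insts[out++] = insts[in];
   }
   return out;
}
```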

At some point it might be interesting to have the instruction
scheduler calculate the exact memory dependencies between atomic ops,
but they're rare enough that it seems unlikely that it will make any
practical difference.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-04 12:12:37 -08:00
Francisco Jerez
bf045bf9b4 clover: Calculate optimal work group size when it's not specified by the user.
Inspired by a patch sent to the mailing list by Tom Stellard, but
using a different algorithm to calculate the optimal block size that
has been found to be considerably more effective.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-11-04 12:12:37 -08:00
Francisco Jerez
67a3037444 clover: Constify some command_queue arguments. 2013-11-04 12:12:37 -08:00
Francisco Jerez
6e9206bdcc clover: Workaround compiler bug present in GCC 4.7.0-4.7.2.
Variadic template aliases make these versions of GCC very confused;
write down the full type spec instead.
2013-11-04 12:12:37 -08:00
Emil Velikov
0a2bdbb76f st/xorg: handle updates to DamageUnregister API
xserver 1.14.99.2 simplified the DamageUnregister API, by
dropping the drawable argument.
Follow xf86-video-intel and xf86-video-vmware approach and
handle the new API by checking XORG_VERSION_CURRENT.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71110
Reported-by: Michał Górny <mgorny@gentoo.org>
Reported-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-11-04 19:49:26 +00:00
Brian Paul
4e0ed59959 mesa: remove Watcom C support
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-04 12:23:09 -07:00
Brian Paul
2a1f74e7d9 mesa: remove Centerline C support from gl.h
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-04 12:23:09 -07:00
Brian Paul
61ec037c61 mesa: remove BUILD_FOR_SNAP bits
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-04 12:23:09 -07:00
Brian Paul
5d5d63d63c mesa: remove SciTech stuff from gl.h
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-04 12:23:09 -07:00
Marek Olšák
6463b94973 r600g: properly unbind a DSA state being deleted in r600_delete_dsa_state
Tested-by: Christian König <christian.koenig@amd.com>
2013-11-04 19:07:57 +01:00
Marek Olšák
f0733479f0 docs/GL3: document radeonsi support, minor cleanup
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-04 19:07:57 +01:00
Marek Olšák
a767f57a7d radeonsi: implement ARB_vertex_type_2_10_10_10_rev 2013-11-04 19:07:57 +01:00
Marek Olšák
6a250877ea r600g,radeonsi: properly expose texture buffer formats
This exposes GL_ARB_texture_buffer_object_rgb32.
2013-11-04 19:07:57 +01:00
Marek Olšák
dbeedbb7ab radeonsi: implement texture buffer objects
GLSL 1.40 is done.
2013-11-04 19:07:57 +01:00
Marek Olšák
164de0d2a5 radeonsi: report our border color behavior 2013-11-04 19:07:57 +01:00
Marek Olšák
4569bf9199 radeonsi: bind a dummy constant buffer in place of NULL buffers 2013-11-04 19:07:57 +01:00
Marek Olšák
2fd4200123 radeonsi: implement uniform buffer objects 2013-11-04 19:07:57 +01:00
Marek Olšák
d0cf73a408 tgsi/scan: set maximum index for each constant buffer 2013-11-04 19:07:57 +01:00
Marek Olšák
e5f0080d91 radeonsi: try to fix IA_MULTI_VGT_PARAM programming
This doesn't make any difference on Bonaire, but it might help on Hawaii.
2013-11-04 19:07:57 +01:00
Marek Olšák
5e43819475 winsys/radeon: use type-3 NOPs for CS padding on CIK
The type-2 NOPs are said to be unstable. It doesn't make a difference here.
2013-11-04 19:07:56 +01:00
Aaron Watry
1b2c6cd205 clover: fix build with LLVM 3.4
dso_list was added as an argument for createInternalizePass in 3.4, and then
it was removed again in the same llvm version.

Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-11-04 08:51:57 -08:00
Brian Paul
9fc41e2eea draw: move type construction out of loop
We can create clip_ptr_type once instead of n times inside the loop.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-11-04 07:12:14 -07:00
Chad Versace
2f89662717 i965: Add driconf option clamp_max_samples
The new option clamps GL_MAX_SAMPLES to a hardware-supported MSAA mode.
If negative, then no clamping occurs.

v2: (for Paul)
  - Add option to i965 only, not to all DRI drivers.
  - Do not rely on int->uint cast to convert negative
    values to large positive values. Explicitly check for
    clamp_max_samples < 0.
v3: (for Ken)
   - Don't allow clamp_max_samples to alter context version.
   - Use clearer for-loop and correct comment.
   - Rename variables.
v4: (for Ken)
   - Merge identical if-branches.
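
A minimal sketch of the clamping rule described above, not the driver's actual code. The function name, the descending mode list, and the fallback to 0 are assumptions made for illustration; the explicit check for a negative value follows the v2 note.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: supported_modes must be sorted in descending
 * order and end with 0 (no MSAA), e.g. {8, 4, 0}. */
static int
clamp_max_samples_sketch(const int *supported_modes, size_t num_modes,
                         int clamp_max_samples)
{
   assert(num_modes > 0);

   /* Negative means "do not clamp": report the largest mode.
    * Checked explicitly instead of relying on an int->uint cast. */
   if (clamp_max_samples < 0)
      return supported_modes[0];

   /* Pick the largest supported mode that does not exceed the clamp. */
   for (size_t i = 0; i < num_modes; i++) {
      if (supported_modes[i] <= clamp_max_samples)
         return supported_modes[i];
   }

   return 0;
}
```

With the assumed mode list {8, 4, 0}, a clamp of 6 yields 4, because 6 is not itself a supported mode.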

Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-11-03 15:55:18 -08:00
Vinson Lee
68f1b274b0 i965: Fix logic_op check.
Fixes "Macro compares unsigned to 0" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-03 14:45:59 -08:00
Vinson Lee
9943b6612b i915: Fix logic_op check.
Fixes "Macro compares unsigned to 0" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-03 14:45:56 -08:00
Vinson Lee
14ddc83346 i965: Initialize vec4_visitor member variables.
Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-03 14:36:27 -08:00
Marek Olšák
fa8b1514d3 gallium/targets: remove vdpau-softpipe
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-11-02 23:34:01 +01:00
Marek Olšák
7c2531847f gallium/targets: remove xvmc-softpipe
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-11-02 23:34:01 +01:00
Marek Olšák
0e17c12fa7 gallium/targets: remove r300/vdpau
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-11-02 23:34:01 +01:00
Marek Olšák
5f7233c8ea gallium/targets: remove r300/xvmc
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-11-02 23:34:00 +01:00
Marek Olšák
be331e82d1 gallium/targets: remove radeonsi/xorg
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-11-02 23:34:00 +01:00
Marek Olšák
da82d7b6ba gallium/targets: remove r600/xorg
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-11-02 23:34:00 +01:00
Rob Clark
f407ea1f1c freedreno/a3xx/texture: min/max lod
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-01 20:22:40 -04:00
Rob Clark
2d10e22f8b freedreno/a3xx: update envytools headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-01 20:22:28 -04:00
Rob Clark
f16b084bb9 freedreno/a3xx: fix VS out / FS in linking
Actually link VS out / FS in based on semantic info, keeping in mind
that position/pointsize can also be an input to the FS.  This fixes a
few fragment shaders which were using gl_Position.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-01 20:20:47 -04:00
Rob Clark
83318d6511 freedreno/a3xx: allow num_samplers != num_textures
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-01 20:20:29 -04:00
Rob Clark
a53fe2221c freedreno/a3xx/compiler: highp frag shader
Fixes use of full-precision in fragment shader (ie. don't clobber r0.x
since that can be used by future bary instructions for varying fetch).
And makes use of full-precision the default in fragment shader (but can
be overridden via FD_MESA_DEBUG=fraghalf).

Seems like half precision is often not enough for texture coordinates.
The blob compiler is clever enough to keep texture coords in full
precision registers while using half precision for everything else.  But
we aren't quite that clever yet, so better to default to full precision.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-01 20:19:42 -04:00
Rob Clark
310fd5839c freedreno/a3xx/compiler: relative addressing fixes.
Handle some relative addressing constraints: cannot handle const or
relative in cat5 and src2 of cat3.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-01 20:18:44 -04:00
Rob Clark
4ddd4e83c7 freedreno: we do actually support sqrt
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-01 20:17:56 -04:00
Anuj Phogat
625a631383 i965: Enable ARB_sample_shading on intel hardware >= gen6
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ken Graunke <kenneth@whitecape.org>
2013-11-01 16:01:49 -07:00
Anuj Phogat
e7393260be i965/gen7: Enable the features required for GL_ARB_sample_shading
- Enable GEN7_WM_MSDISPMODE_PERSAMPLE, GEN7_WM_POSOFFSET_SAMPLE,
  GEN7_WM_OMASK_TO_RENDER_TARGET as per extension's specification.
- Only enable one of GEN7_WM_8_DISPATCH_ENABLE or GEN7_WM_16_DISPATCH_ENABLE
  when GEN7_WM_MSDISPMODE_PERSAMPLE is enabled. Refer to IVB PRM Vol. 2, Part 1,
  Page 288 for details.

V2:
    - Use shared function _mesa_get_min_invocations_per_fragment().
    - Use brw_wm_prog_data variables: uses_pos_offset, uses_omask.

V3:
    - Enable simd16 dispatch with per sample shading.
    - Make changes to give preference to 'simd16 only' mode over
      'simd8 only' mode in case of non 1x per sample shading.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-01 16:01:49 -07:00
Anuj Phogat
8d7a934d09 i965/gen6: Enable the features required for GL_ARB_sample_shading
- Enable GEN6_WM_MSDISPMODE_PERSAMPLE, GEN6_WM_POSOFFSET_SAMPLE,
  GEN6_WM_OMASK_TO_RENDER_TARGET as per extension's specification.
- Only enable one of GEN6_WM_8_DISPATCH_ENABLE or GEN6_WM_16_DISPATCH_ENABLE
  when GEN6_WM_MSDISPMODE_PERSAMPLE is enabled.
  Refer to SNB PRM Vol. 2, Part 1, Page 279 for details.

V2:
    - Use shared function _mesa_get_min_invocations_per_fragment().
    - Use brw_wm_prog_data variables: uses_pos_offset, uses_omask.

V3:
    - Enable simd16 dispatch with per sample shading.
    - Make changes to give preference to 'simd16 only' mode over
      'simd8 only' mode in case of non 1x per sample shading.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-01 16:01:48 -07:00
Anuj Phogat
e26bdf56a4 i965: Add FS backend for builtin gl_SampleMask[]
V2:
   - Update comments
   - Add a special backend instruction to compute sample_mask.
   - Add a new variable uses_omask in brw_wm_prog_data.

V3:
   - Make changes to support simd16 mode.
   - Delete redundant AND instruction and handle the register
     stride in FS backend instruction.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-01 16:01:48 -07:00
Anuj Phogat
e12bbb503f i965: Add FS backend for builtin gl_SampleID
V2:
   - Update comments
   - Add compute_sample_id variables in brw_wm_prog_key
   - Add a special backend instruction to compute sample_id.

V3:
   - Make changes to support simd16 mode.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-01 16:01:48 -07:00
Anuj Phogat
65d0452bbc i965: Add FS backend for builtin gl_SamplePosition
V2:
   - Update comments.
   - Add compute_pos_offset variable in brw_wm_prog_key.
   - Add variable uses_pos_offset in brw_wm_prog_data.

V3:
   - Make changes to support simd16 mode.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-01 16:01:48 -07:00
Anuj Phogat
81f5fb352a i965: Don't do vector splitting for ir_var_system_value
This is required while adding builtin system value vec{2, 3, 4}
variables. For example:
(declare (sys) vec2 gl_SamplePosition)

Without this patch, the GLSL IR above is split into:
(declare (temporary) float gl_SamplePosition_x)
(declare (temporary) float gl_SamplePosition_y)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-11-01 16:01:48 -07:00
Anuj Phogat
627b2692e9 mesa: Add a helper function _mesa_get_min_invocations_per_fragment()
This function is used to test if we need to do per sample shading or
per fragment shading.

V2: Use MAX2() to make sure the function returns a number >= 1.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-01 16:01:48 -07:00
Anuj Phogat
e849511c78 glsl: Add new builtins required by GL_ARB_sample_shading
New builtins added by GL_ARB_sample_shading:
in vec2 gl_SamplePosition
in int gl_SampleID
in int gl_NumSamples
out int gl_SampleMask[]

V2: - Use SWIZZLE_XXXX for STATE_NUM_SAMPLES.
    - Use "result.samplemask" in arb_output_attrib_string.
    - Add comment to explain the size of gl_SampleMask[] array.
    - Make gl_SampleID and gl_SamplePosition system values.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-11-01 16:01:48 -07:00
Anuj Phogat
0d69e8c813 mesa: Pass number of samples as a program state variable
Number of samples will be required in fragment shader program by new
GLSL builtin uniform "gl_NumSamples".

V2: Use "state.numsamples" in place of "state.num.samples"
    Use _NEW_BUFFERS flag in place of _NEW_MULTISAMPLE

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Reviewed-by: Ken Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-01 16:01:47 -07:00
Anuj Phogat
77b440e42d mesa: Add new functions and enums required by GL_ARB_sample_shading
New functions added by GL_ARB_sample_shading:
glMinSampleShadingARB()

New enums:
GL_SAMPLE_SHADING_ARB
GL_MIN_SAMPLE_SHADING_VALUE_ARB

V2: Update comments.
    Create new GL4x.xml.
    Remove redundant code in get.c.
    Update the API_XML list in Makefile.am.
    Add extra_gl40_ARB_sample_shading predicate to get.c.

V3:
   Fix make check failure.
   Add checks for desktop GL.
   Use GLfloat in place of GLclampf in glMinSampleShading().
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ken Graunke <kenneth@whitecape.org>
2013-11-01 16:01:47 -07:00
Anuj Phogat
e919e5ee4e mesa: Add infrastructure for GL_ARB_sample_shading
This patch implements the common support code required for the
GL_ARB_sample_shading extension.

V2: Move GL_ARB_sample_shading to ARB extension list.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Reviewed-by: Ken Graunke <kenneth@whitecape.org>
2013-11-01 16:01:47 -07:00
Matt Turner
3c28b2c09f i965/fs: Optimize saturating SEL.G(E) with imm val <= 0.0f.
Only one program's instruction count is changed, but a shader in Tropics
is also affected.

instructions in affected programs:     326 -> 320 (-1.84%)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-01 15:21:07 -07:00
Matt Turner
ca675b73d3 i965/fs: Optimize saturating SEL.L(E) with imm val >= 1.0.
total instructions in shared programs: 1409124 -> 1406971 (-0.15%)
instructions in affected programs:     158376 -> 156223 (-1.36%)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-01 15:21:07 -07:00
Matt Turner
a8f76d829b i965/fs: Optimize OR with identical sources into a MOV.
Helps a lot of Steam games.

total instructions in shared programs: 1409360 -> 1409124 (-0.02%)
instructions in affected programs:     20842 -> 20606 (-1.13%)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-01 15:21:07 -07:00
Eric Anholt
fd05ede0d0 glsl: Add a CSE pass.
This only operates on constant/uniform values for now, because otherwise I'd
have to deal with killing my available CSE entries when assignments happen,
and getting even this working in the tree ir was painful enough.

As is, it has the following effect in shader-db:

total instructions in shared programs: 1524077 -> 1521964 (-0.14%)
instructions in affected programs:     50629 -> 48516 (-4.17%)
GAINED:                                0
LOST:                                  0

And, for tropics, that accounts for most of the effect, the FPS
improvement is 11.67% +/- 0.72% (n=3).

v2: Use read_only field of the variable, manually check the lod_info union
    members, use get_num_operands(), rename cse_operands_visitor to
    is_cse_candidate_visitor, move all is-a-candidate logic to that
    function, and call it before checking for CSE on a given rvalue, more
    comments, use private keyword.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-01 10:25:33 -07:00
Eric Anholt
3641b97bdc i965/vec4: Don't overwrite op[1] when doing a UBO load.
Prior to the GLSL CSE pass, all of our testing happened to have a freshly
computed temporary in op[1], from the multiply by 16 to get a byte offset.
As of CSE you'll get var_refs of a reused value when you've got multiple
loads from the same offset.

Make a proper temporary for computing our temporary value, to avoid
shifting the value farther and farther down.  Avoids a regression in
gs-float-array-variable-index

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-11-01 10:25:33 -07:00
Brian Paul
2197967cd4 st/mesa: fix _mesa_init_transform_feedback_object() argument
Need to pass a pointer of the base type, not the st type.
Fixes a compiler warning.
2013-11-01 08:43:25 -06:00
Kenneth Graunke
723f047a3b i965: Fix brw_store_register_mem64 to stay within a single batch.
Previously, the write of each 32-bit half might land in separate batch
buffers, which is insane.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-10-31 12:11:52 -07:00
Kenneth Graunke
5eb0835b91 docs: List transform_feedback{2,3,instanced} for i965 in release notes. 2013-10-31 11:11:01 -07:00
Kenneth Graunke
0eeaf11edf i965: Enable the ARB_transform_feedback_instanced extension on Gen7+.
This depends on ARB_transform_feedback2, so I've predicated it on the
ability to do register writes.

It also depends on ARB_transform_feedback3, which is the only reason we
couldn't expose it previously.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-31 11:04:37 -07:00
Kenneth Graunke
c4ec0ad8a9 i965: Enable the ARB_transform_feedback3 extension on Gen7+.
This extension is written a bit strangely.  Although it introduces the
concept of multiple transform feedback streams, it doesn't actually
provide more than a single stream.

The ARB_gpu_shader5 extension is what introduces the ability to write to
streams other than stream #0 and increases the required number of streams.

Since we don't yet support ARB_gpu_shader5, we can safely enable
ARB_transform_feedback3 even though we only support a single stream.
This does provide some useful functionality: applications can now use
more than one interleaved transform feedback buffer.

v2: Only expose the extension if ARB_transform_feedback2 is also
    available, to avoid confusing applications (suggested by Ian).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-31 11:04:37 -07:00
Kenneth Graunke
066fb237e6 i965: Add support for gl_SkipComponents[1234].
ARB_transform_feedback3 allows applications to insert blank space
between interleaved varyings by adding fake 1, 2, 3, or 4-component
varyings named gl_SkipComponents[1234].

Mesa's core data structures don't explicitly track these, instead simply
tracking the buffer offset for each real varying.  If there is padding
due to gl_SkipComponents, these will not be contiguous.

Our hardware takes the specification quite literally.  Instead of
specifying offsets for each varying, it assumes they're all contiguous
and requires you to program fake varyings for each "hole".

This patch adds support for emitting SO_DECL structures for these holes.
Although we've lost the information about exactly how the application
specified their padding (i.e. gl_SkipComponents2, gl_SkipComponents2
vs. a single gl_SkipComponents4), it shouldn't matter.  We just need to
emit the right amount of space.  This patch emits the minimal number of
hole SO_DECL structures.
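
As a rough illustration of "the minimal number of hole SO_DECL structures", the sketch below splits a gap into the fewest possible holes. It assumes a single hole can cover at most four components, mirroring gl_SkipComponents[1234]; the function name and the output array are hypothetical and do not reflect the driver's real data structures.

```c
#include <assert.h>

/* Hypothetical helper: split a gap of skip_components into hole sizes,
 * each at most 4 components wide.  Returns the number of holes written
 * to hole_sizes[]. */
static unsigned
split_gap_into_holes(unsigned skip_components, unsigned *hole_sizes,
                     unsigned max_holes)
{
   unsigned num_holes = 0;

   while (skip_components > 0) {
      /* Take as much as one hole can cover (assumed limit: 4). */
      unsigned size = skip_components < 4 ? skip_components : 4;

      assert(num_holes < max_holes);
      hole_sizes[num_holes++] = size;
      skip_components -= size;
   }

   return num_holes;
}
```

A gap of four components comes out as one hole whether the application wrote gl_SkipComponents2 twice or gl_SkipComponents4 once, which is why losing that detail is harmless.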

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-31 11:04:37 -07:00
Kenneth Graunke
7232e8bea7 i965: Explicitly maintain a count of SO_DECL structures emitted.
Currently, we emit one SO_DECL structure per output, so we use the index
in the Outputs[] array as the index into the so_decl[] array as well.

In order to support the fake "gl_SkipComponents[1234]" varyings from
ARB_transform_feedback3, we'll need to emit SO_DECLs to fill in the
holes between successive outputs.  This means we'll likely emit more
SO_DECLs than there are outputs, so we need to count it explicitly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-31 11:04:37 -07:00
Kenneth Graunke
e095434e52 i965: Create a temporary for transform feedback output components.
This is a bit shorter.

v2: Mark the temporary const (requested by Ian).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-31 11:04:37 -07:00
Kenneth Graunke
129da5b1c8 i965: Enable ARB_transform_feedback2 on Gen7+ if register writes work.
With Linux 3.12, register writes work on Ivybridge and Baytrail, but not
Haswell.  That will be fixed in a future kernel revision, at which point
this extension will automatically be enabled.

v2: Use I915_GEM_DOMAIN_INSTRUCTION for the register read, and also
    correctly set the writeable flag when mapping (caught by Eric).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-31 11:04:37 -07:00
Kenneth Graunke
46d3c2bf4d i965: Initialize batchbuffer and state modules before extensions.
We only want to enable ARB_transform_feedback2 if we can write to
registers from batchbuffers.  In order to test that, we need to be able
to submit batches.  And for batches to work, we need to program the
initial pipeline state (like PIPELINE_SELECT), which is done from
brw_state_init().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-31 11:04:37 -07:00
Kenneth Graunke
82a5ee6be4 i965: Implement glDrawTransformFeedback().
Implementing the GetTransformFeedbackVertexCount() driver hook allows
the VBO module to call us with the right number of vertices.

The hardware doesn't directly count the number of vertices written by
SOL, so we instead use the SO_NUM_PRIMS_WRITTEN(n) counters and multiply
by the number of vertices per primitive.

Unfortunately, counting the number of primitives generated is tricky:
a program might pause a transform feedback operation, start a second one
with a different object, then switch back and resume.  Both transform
feedback operations share the SO_NUM_PRIMS_WRITTEN counters.

To work around this, we save the counter values at Begin, Pause, Resume,
and End.  This "bookends" each section where transform feedback is
active for the current object.  Adding up differences of pairs gives
us the number of primitives generated.  (This is similar to what we
do for occlusion queries on platforms without hardware contexts.)
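
The bookend arithmetic can be sketched as follows. This is an illustration under assumptions, not the driver's code: the snapshots are taken to be stored as (start, end) pairs in one array, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: snapshots[] holds counter values saved in pairs,
 * a start value (at Begin or Resume) followed by an end value (at Pause
 * or End).  Summing end - start over the pairs gives the primitives
 * written while this object was active. */
static uint64_t
prims_from_bookends(const uint64_t *snapshots, size_t count)
{
   uint64_t total = 0;

   assert(count % 2 == 0);
   for (size_t i = 0; i + 1 < count; i += 2)
      total += snapshots[i + 1] - snapshots[i];

   return total;
}

/* The hardware counts primitives, so scale by vertices per primitive
 * (for example 3 for triangles) to get a vertex count. */
static uint64_t
vertices_from_bookends(const uint64_t *snapshots, size_t count,
                       unsigned verts_per_prim)
{
   return prims_from_bookends(snapshots, count) * verts_per_prim;
}
```

Primitives written by another object between a Pause and the matching Resume fall outside every pair, so they are never counted.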

v2: Fix missing parenthesis in assertion (caught by Eric Anholt).
v3: Reuse prim_count_bo rather than freeing it and immediately
    allocating a new one (suggested by Topi Pohjolainen).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-31 11:04:37 -07:00
Kenneth Graunke
b2ff11618f i965: Mark brw_draw_prims tfb_vertcount parameter as unused.
Renaming it makes it obvious that it isn't used, and the assertion
verifies that the VBO module never passes us such an object.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-31 11:04:37 -07:00
Kenneth Graunke
ded34f65ad mesa: Add a new GetTransformFeedbackVertexCount() driver hook.
DrawTransformFeedback() needs to obtain the number of vertices written
to a particular stream during the last Begin/EndTransformFeedback block.
The new driver hook returns exactly that information.

Gallium drivers already implement this by passing the transform feedback
object to the drawing function, counting the number of vertices written
on the GPU, and using draw indirect.  This is efficient, but doesn't
always work:

If vertex data comes from user arrays, then the VBO module needs to
know how many vertices to upload, so we need to synchronously count.
Gallium drivers are currently broken in this case.

It also doesn't work if primitive restart is done in software.  For
normal drawing, vbo_draw_arrays() performs software primitive restart,
splitting the draw call in two.  vbo_draw_transform_feedback() currently
doesn't because it has no idea how many vertices need to be drawn.

The new driver hook gives it that information, allowing us to reuse
the existing vbo_draw_arrays() code to do everything right.

On Intel hardware (at least Ivybridge), using the draw indirect approach
is difficult since the hardware counts primitives, rather than vertices,
which requires doing some simple math.  So we always use this hook.

Gallium drivers will likely want to use this hook in some cases, but
want to use the existing draw indirect approach where possible.  Hence,
I've added a flag to allow drivers to opt-in to this call.

v2: Make it possible to implement this hook but only use this path
    when necessary (suggested by Marek).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-10-31 11:04:37 -07:00
Kenneth Graunke
684958d1e7 i965: Implement Pause/ResumeTransformFeedback driver hooks on Gen7+.
The ARB_transform_feedback2 extension introduces the ability to pause
and resume transform feedback sessions.  Although only one can be active
at a time, it's possible to switch between multiple transform feedback
objects while paused.

In order to facilitate this, we need to save/restore the SO_WRITE_OFFSET
registers so that after resuming, the GPU continues writing where it
left off.

This functionality also exists in ES 3.0, but somehow we completely
forgot to implement it.

v2: Reduce alignment from 4096 to 64 (it seemed excessive).
v3: Use I915_GEM_DOMAIN_INSTRUCTION instead of RENDER, for consistency
    with other writes.  It shouldn't matter on IVB+.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-31 11:04:37 -07:00
Kenneth Graunke
0d7033c394 i965: Create a new brw_transform_feedback_object subclass.
This adds the basic driver hooks to allocate/free the brw variant.
It doesn't contain any additional information yet, but it will soon.

v2: Use the new _mesa_init_transform_feedback_object helper function
    (requested by Eric and Ian).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-31 11:04:37 -07:00
Kenneth Graunke
be6227d29d st/mesa: Use the new _mesa_init_transform_feedback_object() helper.
This picks up a missing obj->EverBound = GL_FALSE line, and will catch
any new fields that get added in the future.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-31 11:04:37 -07:00
Kenneth Graunke
f02ee3044f mesa: Separate transform feedback object initialization from allocation.
Both Gallium and i965 subclass gl_transform_feedback_object, which
requires implementing a custom NewTransformFeedback hook.  Creating a
helper function to initialize the fields avoids code duplication and
divergence.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-31 11:04:36 -07:00
Brian Paul
0e2f0baa43 vbo: fix MSVC double->float conversion warnings 2013-10-31 08:21:58 -06:00
Brian Paul
83f276ab05 swrast: fix MSVC double->float conversion warnings 2013-10-31 08:21:58 -06:00
Brian Paul
717621acff mesa: fix some MSVC signed/unsigned compiler warnings 2013-10-31 08:21:58 -06:00
Brian Paul
010f8762e8 meta: fix assorted MSVC int/float conversion warnings 2013-10-31 08:21:58 -06:00
Brian Paul
e4d4ec9ddf glsl: fix MSVC int->bool conversion warning 2013-10-31 08:21:58 -06:00
Brian Paul
3c11bc6a5a st/draw: silence Mingw warning in pointer_to_offset()
Fixes "warning: cast from pointer to integer of different size" for
64-bit builds.
2013-10-31 08:21:58 -06:00
Matt Turner
b16b3c8703 i965/fs: Perform CSE on CMP(N) instructions.
Optimizes

      cmp.ge.f0(8)  null     g45<8,8,1>F  0F
(+f0) sel(8)        g50<1>F  g40<8,8,1>F  g10<8,8,1>F
      cmp.ge.f0(8)  null     g45<8,8,1>F  0F
(+f0) sel(8)        g51<1>F  g41<8,8,1>F  g11<8,8,1>F
      cmp.ge.f0(8)  null     g45<8,8,1>F  0F
(+f0) sel(8)        g52<1>F  g42<8,8,1>F  g12<8,8,1>F
      cmp.ge.f0(8)  null     g45<8,8,1>F  0F
(+f0) sel(8)        g53<1>F  g43<8,8,1>F  g13<8,8,1>F

into

      cmp.ge.f0(8)  null     g45<8,8,1>F  0F
(+f0) sel(8)        g50<1>F  g40<8,8,1>F  g10<8,8,1>F
(+f0) sel(8)        g51<1>F  g41<8,8,1>F  g11<8,8,1>F
(+f0) sel(8)        g52<1>F  g42<8,8,1>F  g12<8,8,1>F
(+f0) sel(8)        g53<1>F  g43<8,8,1>F  g13<8,8,1>F

total instructions in shared programs: 1644938 -> 1638181 (-0.41%)
instructions in affected programs:     574955 -> 568198 (-1.18%)

Two more 16-wide programs (in L4D2). Some large (-9%) decreases in
instruction count in some of Valve's Source Engine games. No
regressions.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 19:49:27 -07:00
Matt Turner
219b43c612 i965/fs: Don't emit null MOVs in CSE.
We'd like to CSE some instructions, like CMP, that often have null
destinations. Instead of replacing them with MOVs to null, just don't
emit the MOV.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 19:49:27 -07:00
Matt Turner
a93d54eb68 i965/fs: Use reads_flag and writes_flag methods in the scheduler.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 19:49:27 -07:00
Matt Turner
20d0297ff2 i965/fs: Add reads_flag() and writes_flag() to fs_inst.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 19:49:27 -07:00
Matt Turner
f768f998e0 i965/fs: Add is_null() method to fs_reg.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 19:49:27 -07:00
Eric Anholt
8dfc9f038e i965/fs: Use the gen7 scratch read opcode when possible.
This avoids a lot of message setup we had to do otherwise.  Improves
GLB2.7 performance with register spilling force enabled by 1.6442% +/-
0.553218% (n=4).

v2: Use BRW_PREDICATE_NONE, improve a comment (by Paul).

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 17:51:19 -07:00
Eric Anholt
6032261682 i965: Merge together opcodes for SHADER_OPCODE_GEN4_SCRATCH_READ/WRITE
I'm going to be introducing gen7 variants, and the previous naming was
going to get confusing.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 17:51:17 -07:00
Eric Anholt
32182bb004 i965/fs: Fix register unspills from a reg_offset.
We were clearing the reg_offset before trying to use it.  Oops.  Fixes
glsl-fs-texture2drect with the reg spilling debug enabled.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 17:51:15 -07:00
Eric Anholt
0e20051f54 i965/fs: Fix register spilling for 16-wide.
Things blew up when I enabled the debug register spill code without
disabling 16-wide, so I decided to just fix 16-wide spilling.

We still don't generate 16-wide when register spilling happens as part of
allocation (since we expect it to be slower), but now we can experiment
with allowing it in some cases in the future.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 17:51:10 -07:00
Eric Anholt
537f183fe6 i965/fs: Exit the compile if spilling would overwrite in-use MRFs.
I believe this will never happen in SIMD8 mode, but it could for SIMD16
when we fix it.

v2: Fix off-by-one in my register counting comment (caught by Paul).

Reviewed-by: Paul Berry <stereotype441@gmail.com> (v1)
2013-10-30 17:51:02 -07:00
Eric Anholt
44ec2f1751 i965/fs: Fix broken register spilling debug code.
Now that reg spilling generates new vgrfs, we were looping forever if you
ever turned it on.

Instead, move the debug code into the register allocator right near where
we'd be doing spilling anyway, which should more accurately reflect how
register spilling occurs in the wild.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 17:50:59 -07:00
Eric Anholt
b3f6690406 i965/fs: Split "find what MRFs were used" to a helper function.
I'm going to need to reuse this for fixing register spilling on SIMD16.
Note that BRW_MAX_MRF is 16, which is the same as BRW_MAX_GRF -
GEN7_MRF_HACK_START.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 17:50:56 -07:00
Eric Anholt
32ac5634d6 i965/fs: Update an ancient, wrong comment about reg_offset.
This hasn't been true since SIMD16 mode was added.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 17:50:51 -07:00
Kai Wasserbäch
bbb77fc2f1 radeonsi: Allow longer intrinsic names
Fixes a boat load of Piglit tests for me, which crashed like fdo#70913
before.

Thanks to Michel Dänzer for the tip.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70913
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-30 16:40:06 -07:00
Tom Stellard
193594a1b8 clover: Don't install headers when using the icd
The ICD loader should be responsible for installing headers.

Reviewed and Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2013-10-30 16:40:06 -07:00
Tom Stellard
6f3465f340 radeon/llvm: Specify the DataLayout when running optimizations
Without DataLayout, a lot of optimization passes aren't run and the ones
that are don't work as well.
2013-10-30 16:40:06 -07:00
Eric Anholt
20dbeadd83 i965/fs: Prefer more-critical instructions of the same age in LIFO scheduling.
When faced with a million instructions that all became candidates at the
same time (none of which individually reduce register pressure), the ones
on the critical path are more likely to be the ones that will free up some
candidates soon.

shader-db:
total instructions in shared programs: 1681070 -> 1681070 (0.00%)
instructions in affected programs:     0 -> 0
GAINED:                                40
LOST:                                  74

Fixes indistinguishable-from-hanging behavior in GLES3conform's
uniform_buffer_object_max_uniform_block_size test, regressed by
c3c9a8c857.  Given that
93bd627d5a was unlocked by that commit, the
net effect on 16-wide program count is still quite positive, and I think
this should give us more stable scheduling (less dependency on original
instruction emit order).

v2: Comment suggestions by Paul

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70943
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 15:46:54 -07:00
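
The tie-break described in 20dbeadd83 above can be sketched roughly as follows. This is only an illustration: the `Candidate` struct and its `cand_generation` and `delay` fields are hypothetical names, not the scheduler's actual data structures. The sketch assumes each candidate records when it became schedulable and how long the critical path from it to the end of the block is.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical candidate record; field names are assumptions for illustration.
struct Candidate {
    int cand_generation; // higher = became a candidate more recently
    int delay;           // critical-path length from this instruction to block end
};

// Returns the index of the candidate to schedule next, or -1 if there is none.
// LIFO: prefer the newest generation; among equals, prefer the larger delay.
int choose_lifo(const std::vector<Candidate> &cands)
{
    int best = -1;
    for (std::size_t i = 0; i < cands.size(); i++) {
        if (best < 0) {
            best = static_cast<int>(i);
            continue;
        }
        const Candidate &b = cands[best];
        const Candidate &c = cands[i];
        if (c.cand_generation > b.cand_generation ||
            (c.cand_generation == b.cand_generation && c.delay > b.delay))
            best = static_cast<int>(i);
    }
    return best;
}
```

Because the secondary key depends only on the dependency graph, the choice among same-age candidates no longer depends on the order in which they happen to sit in the list.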
Eric Anholt
017361dd37 i965: Compute the node's delay time for scheduling.
This is a step in doing scheduling as described in Muchnick (p538).  A
difference is that our latency function is only specific to one
instruction (it doesn't describe, for example, the different latency
between WAR of a send's arguments and RAW of a send's destination), but
that's changeable later.  We also don't separately compute the postorder
traversal of the graph, since we can use the setting of the delay field as
the "visited" flag.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 15:46:48 -07:00
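
A minimal sketch of the delay computation described in 017361dd37 above, under stated assumptions: the `Node` struct is hypothetical, latency is a single per-instruction number (as the message says), and every latency is positive so that a zero `delay` can stand in for "not visited yet".

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical dependency-graph node; names are assumptions for illustration.
struct Node {
    int latency = 1;              // must be > 0 so that delay == 0 means "unvisited"
    int delay = 0;                // 0 until computed
    std::vector<Node *> children; // instructions that depend on this one
};

// delay(n) = latency(n) for a leaf; otherwise
// delay(n) = max over children c of (latency(n) + delay(c)).
void compute_delay(Node *n)
{
    if (n->children.empty()) {
        n->delay = n->latency;
        return;
    }
    for (Node *child : n->children) {
        if (child->delay == 0) // the delay field doubles as the visited flag
            compute_delay(child);
        n->delay = std::max(n->delay, n->latency + child->delay);
    }
}
```

Calling `compute_delay()` on each root of the graph fills in every reachable node once, without a separate postorder pass.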
Emil Velikov
9eb3de1ce7 automake: handle expat version pre 2.1
Commit aec20d66d9
(automake: properly handle non-default expat installation),
assumed that up-to-date distributions use a recent version
of expat that handles security vulnerabilities CVE-2012-1147
and CVE-2012-1148. Seems like this is not always the case
and they prefer to backport only the fix, rather than use
the updated library.

This commit adds a default case -lexpat whenever expat is
not found, while properly handling expat.pc if present.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71022
Reported-By: Bryce Harrington <b.harrington@samsung.com>
Reported-By: Vinson Lee <vlee@freedesktop.org>
Tested-by: Bryce Harrington <b.harrington@samsung.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-30 22:05:42 +00:00
Ian Romanick
5cb80f0314 glsl: Move layout(location) checks to AST-to-HIR conversion
This will simplify the addition of layout(location) qualifiers for
separate shader objects.  This was validated with new piglit tests
arb_explicit_attrib_location/1.30/compiler/not-enabled-01.vert and
arb_explicit_attrib_location/1.30/compiler/not-enabled-02.vert.

v2: Refactor error checking to check_explicit_attrib_location_allowed
and eliminate the gotos.  Suggested by Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 13:49:30 -07:00
Ian Romanick
9d6294f5a2 glsl: Slightly restructure error generation in validate_explicit_location
Use mode_string to get the name of the variable mode.  Slightly change
the control flow.  Both of these changes make it easier to support
separate shader object location layouts.

The format of the message changed because mode_string can return a
string like "shader output".  This would result in an awkward message
like "vertex shader shader output..."

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 13:49:30 -07:00
Ian Romanick
f8c579dc0f glsl: Make mode_string function globally available
I made this a function (instead of a method of ir_variable) because it
made the change set smaller, and I expect that there will be an overload
that takes an ir_var_mode enum.  Having both functions used the same way
seemed better.

v2: Add missing case for ir_var_system_value.

v3: Change the ir_var_mode_count case to just break.  Move the assertion
and the return outside the switch statement.  In the unlikely event that
var->mode is an invalid value other than ir_var_mode_count, the
assertion will still fire, and in release builds we won't wind up
returning a garbage pointer.  Suggested by Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 13:49:30 -07:00
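
The control flow described in v3 of f8c579dc0f above might look roughly like this. The enum and the returned strings are hypothetical stand-ins (the real `ir_variable_mode` has more entries); only the shape of the switch is the point.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the variable-mode enum.
enum var_mode {
    var_auto,
    var_uniform,
    var_shader_in,
    var_shader_out,
    var_mode_count // sentinel, not a valid mode
};

const char *mode_string(var_mode mode)
{
    switch (mode) {
    case var_auto:       return "temporary";
    case var_uniform:    return "uniform";
    case var_shader_in:  return "shader input";
    case var_shader_out: return "shader output";
    case var_mode_count: break; // fall out to the assertion below
    }

    // Reached for the sentinel and for any other invalid value. Debug builds
    // assert; release builds still return a valid pointer.
    assert(!"Should not get here.");
    return "invalid variable";
}
```

Leaving out a `default:` label keeps compiler warnings about unhandled enumerators available when a new mode is added later.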
Ian Romanick
2cb760d994 glsl: Eliminate the global check in validate_explicit_location
Since the separation of ir_var_function_in and ir_var_shader_in (similar
for out), this check is no longer necessary.  Previously, global_scope
was the only way to tell which was which.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 13:49:29 -07:00
Ian Romanick
8f00a77fbc glsl: Extract explicit location code from apply_type_qualifier_to_variable
Future patches will add some extra code to this path, and some of that
code will want to exit from the explicit location code early.

v2: Change a geometry shader "break" to a "return" so that we don't try to
apply a bogus geometry shader location qualifier (which could cause
cascading errors).  Suggested by Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 13:49:29 -07:00
Gregory Hainaut
0059d1948e mesa: Drop unused return value from use_shader_program
The return value has been unused since commit d348b0c.  This was
originally included in another patch, but it was split out by Ian
Romanick.

v2: Drop unnecessary final return.  Suggested by Paul.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
2013-10-30 13:49:29 -07:00
Fabio Pedretti
103824dc24 wayland: silence unused var warning
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-10-30 12:50:09 -07:00
Johannes Obermayr
5e162566db ilo: Fix out-of-tree build.
[olv: use $(srcdir) instead of $(top_srcdir)]
2013-10-30 21:17:10 +08:00
José Fonseca
26a8f76ba1 scons: Add missing dependencies to src/mapi/glapi/gen/*.xml
Incremental builds were failing because some generated source files
were missing dependencies on src/mapi/glapi/gen/*.xml.

Hopefully this change will be the end of these incremental build
failures.
2013-10-30 12:21:54 +00:00
Marek Olšák
e929e27737 glsl: fix crash introduced by the previous commit 2013-10-30 00:14:35 +01:00
Marek Olšák
7e414b5864 glsl: break the gl_FragData array into separate gl_FragData[i] variables
This avoids a defect in lower_output_reads.

The problem is lower_output_reads treats the gl_FragData array as a single
variable. It first redirects all output writes to a temporary variable (array)
and then writes the whole temporary variable to the output, generating
assignments to all elements of gl_FragData.

BTW this pass can be modified to lower all arrays, not just inputs and outputs.
The question is whether it is worth it.

Reviewed-by: Paul Berry <stereotype441@gmail.com>

v2: addressed Paul Berry's comments
2013-10-29 23:50:01 +01:00
Emil Velikov
aec20d66d9 automake: properly handle non-default expat installation
Use PKG_CHECK_MODULES rather than asking the user to set up the
option at configure time. Drop unused EXPAT_INCLUDE and
update all targets.

NOTE: This commit removes the --with-expat configure
option. One should ensure that the expat they wish to use
has expat.pc file accessible by pkg-config.

v2:
* Add note about the removal of --with-expat
(per Tom Stellard)
* Drop EXPAT_CFLAGS for targets that do not build DRI_COMMON
(spotted by Matt Turner)
v3:
* Rebase on top of megadrivers (drop EXPAT_CFLAGS from swrast)

Acked-by: Matt Turner <mattst88@gmail.com> (v2)
Reviewed-by: Tom Stellard <thomas.stellard@amd.com> (v2)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	configure.ac
	src/mesa/drivers/dri/common/Makefile.am
2013-10-29 21:14:41 +00:00
Emil Velikov
0828ad4e63 configure: use PKG_CONFIG variable over hardcoded pkg-config
Already available and used in other places of configure.ac.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-29 21:04:38 +00:00
Emil Velikov
2a87647c6a targets/xorg-nouveau: drop usage of dri1 function DRICreatePCIBusID
The function should never have used it in the first place as it was
a leftover from the DRI1 days of the nouveau ddx. While we're around
check if KMS is supported before opening the nouveau device, and
add support for Fermi & Kepler cards.

Compile tested only due to the lack of a Fermi/Kepler card.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-29 21:04:38 +00:00
Emil Velikov
c9e6e6382f gallium/targets/xorg: drop set but unused variable entity
The function xf86GetEntityInfo() retrieves the entity rather than
doing any changes. Remove this no-op code.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-29 21:04:38 +00:00
Emil Velikov
ba3efd6b42 st/xorg: drop set but unused variables dxo, dyo
Commit a9f8baf00b removed the first and only use of the variables
but forgot to remove them.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-29 21:04:38 +00:00
Emil Velikov
2b7ffde8bd st/xorg: add sanity checks after malloc
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-29 21:04:37 +00:00
Emil Velikov
5c398e243c st/xorg: remove unnecessary headers
v2: Remove xf86PciInfo.h, all drivers provide their own PCI ID list

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-29 21:04:37 +00:00
Rob Clark
2bc1fc2fb6 freedreno: emulate unsupported primitive types
Use u_primconvert to convert unsupported primitives into supported
primitive plus index buffer.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-10-29 16:49:43 -04:00
Rob Clark
b881917088 gallium/auxiliary/indices: add u_primconvert
A convenient front end to indices generate/translate code, for emulating
primitives which are not supported natively by the driver.

This handles saving/restoring index buffer state, etc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-29 16:49:43 -04:00
Rob Clark
28f3f8d413 gallium/auxiliary/indices: add start param
Add 'start' parameter to generator/translator.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-29 16:49:43 -04:00
Rob Clark
5127436a4a freedreno: update generated headers
pull in some fixes to draw-initiator/prim-type.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-10-29 16:49:43 -04:00
Eric Anholt
774b787d6b i965/fs: Drop our dead push constants before overflowing to pull constants.
The idea of the original order was that you'd dead code eliminate accesses
to push constants.  But I've never seen a case of that (nor has
shader-db), while we frequently see sparse accesses of large constant
arrays that would overflow into pull constants.

Cuts pull constant use on csgo, serious sam, planeshift, and the cave:

total instructions in shared programs: 1695103 -> 1688795 (-0.37%)
instructions in affected programs:     92024 -> 85716 (-6.85%)
GAINED:                                339
LOST:                                  0

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-29 13:43:01 -07:00
Alexander von Gluck IV
9a9fb94ca9 haiku-softpipe: Minor cleanup and color space fixes
* Use more consistent data sources
* Fix improper color space assignments
* Remove unnecessary comments and code
* Drop unnecessary round_up function (this was leftover
  from moving winsys code out of renderer)

Acked-by: Brian Paul <brianp@vmware.com>
2013-10-29 15:27:43 -05:00
Alexander von Gluck IV
439dd0e20a winsys: Correct Haiku winsys display target code
* Instead of assuming the displaytarget is the same
  stride / colorspace as the destination, let's
  actually check the source bitmap.
* Fixes random stride issues in rendering

Acked-by: Brian Paul <brianp@vmware.com>
2013-10-29 15:27:40 -05:00
Francisco Jerez
b8f89fc5cb clover: Use context device list for error checking in clGetProgramBuildInfo.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=70891.

Reported-by: Bruno Jiménez <brunojimen@gmail.com>
2013-10-29 12:40:56 -07:00
Francisco Jerez
e515dcbf96 i965: Simplify the shader time code by using atomic counter helpers.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 12:40:56 -07:00
Francisco Jerez
d58bd75263 i965: Add brw_reg constructors taking a dynamically determined vector width.
The MRF variant is going to be used extensively by the atomic counter
intrinsics to assemble untyped atomic and surface read messages
easily.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 12:40:56 -07:00
Francisco Jerez
5e621cb9fe i965/gen7: Implement code generation for untyped surface read instructions. 2013-10-29 12:40:56 -07:00
Francisco Jerez
cfaaa9bbb7 i965/gen7: Implement code generation for untyped atomic instructions.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 12:40:56 -07:00
Francisco Jerez
5809512b17 i965: Implement ABO surface state emission.
The maximum number of atomic buffer objects is somewhat arbitrary; we
can easily change it in the future if it turns out not to be enough.

v2: Add comments with the relevant mesa dirty bits.  Fix usage of
    BRW_NEW_UNIFORM_BUFFER in the GS ABO state atom.
v3: Update binding table layout diagrams.
v4: Resolve conflicts with the recent dynamic surface index assignment changes.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 12:40:56 -07:00
Francisco Jerez
c4e730e218 i965: Define vtbl method that initializes an untyped R/W surface.
And add Gen7 implementation.

v2: Fix off-by-one error in buffer size calculation.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
7a54db9ce5 glsl: Fix the function inlining pass to deal with general opaque arguments.
Almost a trivial change, it boils down to renaming a few identifiers
so their names still make sense for opaque types other than sampler.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
bbded5b5fe glsl: Add built-in functions and constants required for ARB_shader_atomic_counters.
v2: Represent atomics as GLSL intrinsics.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
9562922376 glsl: Basic support for built-in intrinsics.
Fix the linker to deal with intrinsic functions which are undefined
all the way down to the driver back-end, and introduce intrinsic
definition helpers in the built-in generator.

We still need to figure out what kind of interface we want for drivers
to communicate to the GLSL front-end which of the supported intrinsics
should use a default GLSL implementation and which should use a
hardware-specific override.  As there's no default GLSL implementation
for atomic ops, this seems like something we can worry about later on.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

v2: Define local helper function to generate ir_call nodes in the
    builtin generator.
2013-10-29 12:40:55 -07:00
Francisco Jerez
cc744a0947 glsl: Add type predicate to check whether a type contains any opaque types.
And use it to forbid comparisons of opaque operands.  According to the
GL 4.2 specification:

> Except for array indexing, structure member selection, and
> parentheses, opaque variables are not allowed to be operands in
> expressions.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
26db3b933f glsl: Add new atomic_uint built-in GLSL type.
v2: Fix GLSL version in which the type became available.  Add
    contains_atomic() convenience method.  Split off atomic counter
    comparison error checking to a separate patch that will handle all
    opaque types.  Include new ir_variable fields for atomic types.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
0bed1ab73b glsl: Add extension enables for ARB_shader_atomic_counters.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
1c7dcfed7c mesa: Add support for ARB_shader_atomic_counters.
This patch implements the common support code required for the
ARB_shader_atomic_counters extension.  It defines the necessary data
structures for tracking atomic counter buffer objects (from now on
"ABOs") associated with some specific context or shader program, it
implements support for binding buffers to an ABO binding point and
querying the existing atomic counters and buffers declared by GLSL
shaders.

v2: Fix extension checks.  Drop unused MAX_ATOMIC_BUFFERS constant.

Acked-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
e3fd31dc41 glapi: Add support for ARB_shader_atomic_counters.
Add XML file for the dispatch code generator, update the
dispatch_sanity test and add stub definition for the new entry point.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
db47074ac0 i965: Handle deallocation of some private ralloc contexts explicitly.
These ralloc contexts belong to a specific object and are being
deallocated manually from the class destructor.  Now that we've hooked
up destructors to ralloc there's no reason for them to be children of
any other context, and doing so might lead to double frees under
some circumstances.  The class destructor has all the responsibility
of freeing class memory resources now.
2013-10-29 12:40:55 -07:00
Francisco Jerez
d18477deea ralloc: Hook up C++ destructors to ralloc when necessary.
This patch makes sure that class destructors are called as they should
be when a C++ object allocated by ralloc is released.

Based on a previous patch by Kenneth Graunke, but it doesn't exhibit
the ~0.8% performance regression in shader compilation times because
we now use the HAS_TRIVIAL_DESTRUCTOR() macro to detect the typical
case where the indirect function call can be avoided because the
object's destructor doesn't need to do anything.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
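
To illustrate the idea in d18477deea above, here is a toy allocation context that records a destructor callback only for types that need one. It is a sketch under stated assumptions, not ralloc's API: `Context`, `create()` and `Tracked` are hypothetical, and the standard `std::is_trivially_destructible` trait is swapped in where the commit uses its own `HAS_TRIVIAL_DESTRUCTOR()` macro.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Toy allocation context; everything here is hypothetical.
class Context {
    struct Block {
        void *ptr;
        void (*dtor)(void *); // nullptr when no destructor call is needed
    };
    std::vector<Block> blocks;

    template <typename T>
    static void destroy(void *p) { static_cast<T *>(p)->~T(); }

public:
    int registered_dtors = 0;

    template <typename T, typename... Args>
    T *create(Args &&...args)
    {
        void *mem = std::malloc(sizeof(T));
        if (!mem)
            throw std::bad_alloc();
        T *obj = new (mem) T(std::forward<Args>(args)...);

        void (*dtor)(void *) = nullptr;
        // Skip the indirect call entirely for trivially destructible types.
        if (!std::is_trivially_destructible<T>::value) {
            dtor = &Context::destroy<T>;
            registered_dtors++;
        }
        blocks.push_back({mem, dtor});
        return obj;
    }

    ~Context()
    {
        // Release in reverse allocation order, running destructors first.
        for (auto it = blocks.rbegin(); it != blocks.rend(); ++it) {
            if (it->dtor)
                it->dtor(it->ptr);
            std::free(it->ptr);
        }
    }
};

// Example type with a non-trivial destructor, used to observe the behaviour.
struct Tracked {
    static int live;
    Tracked() { live++; }
    ~Tracked() { live--; }
};
int Tracked::live = 0;
```

Creating an `int` and a `Tracked` in the same `Context` registers a single callback; when the context goes out of scope, `Tracked::live` drops back to zero.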
Francisco Jerez
98ab905af0 mesa: Define introspection macro to determine whether a type is trivially destructible.
Only implemented on GCC and Clang for now.  Other compilers use a
dummy implementation that always returns false, which should be a safe
[but slightly inefficient] assumption in all cases.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Paul Berry
be63803b0c glsl: Generalize MSVC fix for strcasecmp().
This will let us use strcasecmp() from anywhere inside Mesa without
having to worry about the fact that it doesn't exist in MSVC.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 11:10:56 -07:00
Roland Scheidegger
e4195acab5 llvmpipe: fix bogus layer clamping in setup
The layer coming from GS needs to be clamped (not sure if that's actually
the correct error behavior but we need something) as the number can be higher
than the amount of layers in the fb. However, this code was using the layer
calculation from the scene, and this was actually calculated in
lp_scene_begin_rasterization() hence too late (so setup was using the value
from the _previous_ scene or just zero if it was the first scene).
Since the value is used in both rasterization and setup, move calculation up
to lp_scene_begin_binning() though it's a bit more inconvenient to calculate
there. (Theoretically could move _all_ code which was in
lp_scene_begin_rasterization() to there, because ever since we got rid of
swizzled render/depth buffers our "map" functions preparing the fb data for
render don't actually change the data in there at all, but it feels like
it would be a hack.)

v2: improve comments

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-10-29 17:54:03 +01:00
Matthew McClure
be0b67a143 util,llvmpipe: correctly set the minimum representable depth value
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-10-29 15:53:48 +00:00
Brian Paul
d0eaf6752d st/mesa: move out of memory check in st_draw_vbo()
Before we were only checking the st->vertex_array_out_of_memory flag
after updating array state.  But if there are two consecutive glDrawArrays
calls and the first one is skipped because of OOM, the second one should
be skipped too.

Cc: 9.2 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-10-29 08:09:34 -06:00
Brian Paul
ea9fe9ebdb svga: reindent drawing code 2013-10-29 08:09:34 -06:00
Eric Anholt
415d6dc5bd i965/vec4: Reduce working set size of live variables computation.
Orbital Explorer was generating a 4000 instruction geometry shader, which
was taking 275 trips through dead code elimination and register
coalescing, each of which updated live variables to get its work done, and
invalidated those live variables afterwards.

By using bitfields instead of bools (reducing the working set size by a
factor of 8) in live variables analysis, it drops from 88% of the profile
to 57%, and reduces overall runtime from I-got-bored-and-killed-it (Paul
says 3+ minutes) to 10.5 seconds.

Compare to f179f419d1 on the FS side.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 00:27:35 -07:00
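
The bools-to-bitfields change in 415d6dc5bd above can be pictured with a small bitset like the one below. `LiveSet` and its methods are hypothetical names chosen for illustration; the sketch assumes both sets passed to `merge()` were created with the same variable count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical liveness set: one bit per variable instead of one bool
// (typically one byte) per variable.
class LiveSet {
    std::vector<uint32_t> words;

public:
    explicit LiveSet(std::size_t num_vars) : words((num_vars + 31) / 32, 0) {}

    void set(std::size_t var) { words[var / 32] |= uint32_t(1) << (var % 32); }

    bool test(std::size_t var) const
    {
        return ((words[var / 32] >> (var % 32)) & 1u) != 0;
    }

    // this |= other. Returns true if any bit changed, which is what keeps a
    // dataflow fixed-point loop iterating.
    bool merge(const LiveSet &other)
    {
        bool changed = false;
        for (std::size_t i = 0; i < words.size(); i++) {
            uint32_t merged = words[i] | other.words[i];
            changed |= (merged != words[i]);
            words[i] = merged;
        }
        return changed;
    }

    std::size_t storage_bytes() const { return words.size() * sizeof(uint32_t); }
};
```

For 4000 variables this stores 500 bytes per set rather than 4000, and a merge touches 32 variables per word operation.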
Vadim Girlin
8bd4476010 r600g/sb: fix value::is_fixed()
This prevents unnecessary (and wrong) register allocation in the
scheduler for preloaded values in fixed registers.

Fixes interpolation-mixed.shader_test on rv770
(and probably on all other pre-evergreen chips).

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-10-29 05:49:21 +04:00
Eric Anholt
08bf52712e glsl: Drop no-op shifts involving 0.
I noticed this in a shader in Unigine Heaven that was spilling.  While it
doesn't really reduce register pressure, it shaves a few instructions
anyway (7955 -> 7882).

v2: Fix turning "0 >> x" into "x" instead of "0" (caught by Erik
    Faye-Lund).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-28 14:07:31 -07:00
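
The rule in 08bf52712e above, including the v2 correction, can be summarized in a few lines. The `Operand` and `Result` types are hypothetical; the real pass works on GLSL IR expressions rather than on this miniature representation.

```cpp
#include <cassert>

// Hypothetical miniature operand: either a known constant or an opaque value.
struct Operand {
    bool is_const;
    unsigned value; // only meaningful when is_const is true
};

enum class Result { Unchanged, UseLhs, Zero };

// Decide how "lhs << rhs" or "lhs >> rhs" simplifies when a zero is involved:
//   x << 0, x >> 0  ->  x
//   0 << x, 0 >> x  ->  0   (not x, which is the mistake fixed in v2)
Result simplify_shift(const Operand &lhs, const Operand &rhs)
{
    if (lhs.is_const && lhs.value == 0)
        return Result::Zero;
    if (rhs.is_const && rhs.value == 0)
        return Result::UseLhs;
    return Result::Unchanged;
}
```

Checking the left operand first means `0 << 0` also folds to zero, which agrees with both rules.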
Eric Anholt
3a0fdf2ab6 glsl: Use ir_builder more in opt_algebraic.
While ir_builder is slightly less efficient, we're only increasing the
work when there's actual optimization being done, and it's way more
readable code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-28 14:07:31 -07:00
Eric Anholt
27bcb5063f glsl: Move common code out of opt_algebraic's handle_expression().
Matt and I had each screwed up these common required patterns recently, in
ways that wouldn't have been noticed for a long time if not for code
review.  Just enforce it in the caller so that we don't rely on code
review catching these bugs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-28 14:07:31 -07:00
Carl Worth
29996e2199 Remove error when calling glGenQueries/glDeleteQueries while a query is active
There is nothing in the OpenGL specification which prevents the user from
calling glGenQueries to generate a new query object while another object is
active. Neither is there anything in the Mesa implementation which prevents
this. So remove the INVALID_OPERATION errors in this case.

Similarly, it is explicitly allowed by the OpenGL specification to delete an
active query, so remove the assertion for that case, replacing it with the
necessary state updates to end the query (clear the bindpt pointer and call
into the driver's EndQuery hook).

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2013-10-28 12:56:49 -07:00
Kenneth Graunke
5563dfabc8 i965: Also emit HiZ and Stencil packets when disabling depth on Gen6.
The normal drawing path does this, and it's necessary on Ivybridge,
so let's try it on Sandybridge too.  It's not explicitly documented
as necessary, but might help with hangs.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Xinkai Chen <yeled.nova@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-10-28 11:29:36 -07:00
Kenneth Graunke
29e5d5db51 i965: Also emit HIER_DEPTH and STENCIL packets when disabling depth.
From the documentation:
"[DevIVB] 3DSTATE_DEPTH_BUFFER must always be programmed along with the
 other Depth/Stencil state commands(i.e. 3DSTATE_CLEAR_PARAMS,
 3DSTATE_STENCIL_BUFFER, or 3DSTATE_HIER_DEPTH_BUFFER)."

We normally do this, but BLORP was failing to do so in the case where it
disables depth.

Not observed to fix anything yet.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Xinkai Chen <yeled.nova@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-10-28 11:29:33 -07:00
Kenneth Graunke
65b1f642ac i965: Move post-sync non-zero flush for 3DSTATE_MULTISAMPLE.
For some reason, we put the flush in the caller, rather than just before
emitting the packet.  This is more than a cosmetic problem: BLORP calls
gen6_emit_3dstate_multisample() directly, and so it missed the flush.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Xinkai Chen <yeled.nova@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-10-28 11:29:32 -07:00
Kenneth Graunke
10a918e52c i965: Also guard 3DSTATE_DRAWING_RECTANGLE with a flush in blorp.
Non-pipelined commands need this flush.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Xinkai Chen <yeled.nova@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-10-28 11:29:31 -07:00
Kenneth Graunke
3aef1fefb4 i965: Emit post-sync non-zero flush before 3DSTATE_DRAWING_RECTANGLE.
This is another non-pipelined command that needs a flush on Sandybridge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Xinkai Chen <yeled.nova@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-10-28 11:29:29 -07:00
Kenneth Graunke
436e815a25 i965: Emit post-sync non-zero flush before 3DSTATE_GS_SVB_INDEX.
From the comments above intel_emit_post_sync_nonzero_flush:
"[DevSNB-C+{W/A}] Before any depth stall flush (including those
 produced by non-pipelined state commands), software needs to first
 send a PIPE_CONTROL with no bits set except Post-Sync Operation != 0."

This suggests that every non-pipelined (0x79xx) command needs a
post-sync non-zero flush before it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Xinkai Chen <yeled.nova@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-10-28 11:29:27 -07:00
Daniel Vetter
32a3f5f6d7 i965: CS writes/reads should use I915_GEM_INSTRUCTION
Otherwise the gen6 w/a in the kernel won't kick in and the write will
land nowhere.

Inspired by a patch Ken pointed me at which had the same issue (but
isn't yet merged and also for a gen7+ feature). An audit of the entire
driver didn't reveal any other case than the one in the write_reg
helper used by the gen6 queryobj code.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: Xinkai Chen <yeled.nova@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-10-28 11:29:15 -07:00
Anuj Phogat
f278d49c4b i965: Do not set bilinear_filter flag in case of multisample blits
Setting bilinear_filter flag in case of multisample blits with
GL_LINEAR filter causes incorrect behavior in translate_dst_to_src()
function. This broke Modern Warfare (1, 2 and 3) on SNB, IVB and HSW.

Tested on SNB and IVB, no Piglit regressions. Trace file of the game
(taken with apitrace) works fine with this patch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69078
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reported-by: Armin K <krejzi@email.com>
Tested-by: Armin K <krejzi@email.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-28 09:33:01 -07:00
Rico Schüller
14f02cdee8 mesa: Remove trailing whitespace in texparam.c
Signed-off-by: Rico Schüller <kgbricola@web.de>
Signed-off-by: Brian Paul <brianp@vmware.com>
2013-10-28 08:43:40 -06:00
Brian Paul
0ce3bfbd40 mesa: use void in _mesa_VDPAUFiniNV() as in the header file 2013-10-28 08:37:39 -06:00
Timothy Arceri
b59c5926cb glsl: Add check for unsized arrays to glsl types
The main purpose of this patch is to increase readability of
the array code by introducing is_unsized_array() to glsl_types.
Some redundant is_array() checks are also removed, along with a small
number of other related cleanups.

The introduction of is_unsized_array() should also make the
ARB_arrays_of_arrays code simpler and more readable when it arrives.

V2: Also replace code that checks for unsized arrays directly with the
length variable

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

v3 (Paul Berry <stereotype441@gmail.com>): clean up formatting.
Separate whitespace cleanups to their own patch.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-28 06:06:04 -07:00
Timothy Arceri
5cd7eb9f07 glsl: whitespace cleanups.
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

v2 (Paul Berry <stereotype441@gmail.com>): Separate from "glsl: Add
check for unsized arrays to glsl types".

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-28 06:06:04 -07:00
Timothy Arceri
e14abf566b glsl: Fix comment
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-28 06:05:51 -07:00
Christian König
925ffa8c4a vl/h264: split fields into SPS/PPS
Add a lot of missing fields as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-10-28 11:08:12 +01:00
Christian König
6f2410c9aa radeon/uvd: fix H264 chroma format handling
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-10-28 11:06:37 +01:00
Christian König
cc49baeedc vl: add 400 chroma format as well
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-10-28 11:06:18 +01:00
Chia-I Wu
d2fdc0d634 ilo: minor cleanups for recent interface changes
Kill ilo_bind_sampler_states2 and ilo_set_sampler_views2.  Map
PIPE_FORMAT_R10G10B10A2_UINT to BRW_SURFACEFORMAT_R10G10B10A2_UINT.
2013-10-28 11:40:41 +08:00
Timothy Arceri
d1d3b1e361 glsl: Move error message inside validation check reducing duplicate message handling
v2 (Paul Berry <stereotype441@gmail.com>): Fix precedence error in call
to _mesa_glsl_error().

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-27 10:23:52 -07:00
Paul Berry
e79e6c5911 i965: Make fs gl_PrimitiveID input work even when there's no gs.
When a geometry shader is present, the fragment shader gl_PrimitiveID
input acts like an ordinary varying, receiving data from the gs
gl_PrimitiveID output.  When there's no geometry shader, we have to
ask the fixed function SF hardware to provide the primitive ID to the
fragment shader instead.

Previously, the SF setup code would handle this situation by
recognizing that the FS gl_PrimitiveID input didn't match to any VS
output; since normally an FS input with no corresponding VS output
leads to undefined data, the SF setup code used to just arbitrarily
assign it to receive data from attribute 0.

This patch changes the SF setup code so that instead of arbitrarily
using attribute 0, it assigns the unmatched FS input to receive
gl_PrimitiveID.  In the case where the FS input really is
gl_PrimitiveID, this produces the intended result.  In all other
cases, no harm is done since GL specifies that the behaviour is
undefined.

Fixes piglit test primitive-id-no-gs.

v2: If an attribute is already being overridden with point
coordinates, don't try to also override it with gl_PrimitiveID.  This
is necessary to avoid regressing piglit tests such as
shaders/glsl-fs-pointcoord.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-27 10:23:39 -07:00
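
The SF setup rule described in e79e6c5911 can be sketched roughly as follows. This is a toy model under stated assumptions, not the driver code: `sf_source_for`, the string tags and the list-based attribute lookup are all hypothetical names chosen for illustration.

```python
def sf_source_for(fs_input, vs_outputs, point_coord_inputs=()):
    """Decide where the SF unit should source one FS input from.

    Hypothetical model: vs_outputs is an ordered list of VS output names,
    point_coord_inputs lists FS inputs already overridden with point
    coordinates.
    """
    # v2 of the patch: a point-coordinate override wins, so it is checked
    # first and never replaced by gl_PrimitiveID.
    if fs_input in point_coord_inputs:
        return "POINT_COORD"
    # Ordinary case: the FS input matches a VS output slot.
    if fs_input in vs_outputs:
        return ("ATTRIBUTE", vs_outputs.index(fs_input))
    # Unmatched input: previously this fell back to attribute 0; the patch
    # routes it to the primitive ID instead, which is harmless for inputs
    # whose value GL leaves undefined.
    return "PRIMITIVE_ID"
```

For example, with `vs_outputs = ["pos", "color"]`, an FS input named `gl_PrimitiveID` has no match and is sourced from the primitive ID.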
Vinson Lee
7f76368305 mesa: Add GL_NV_vdpau_interop functions to dispatch_sanity.cpp.
Fixes 'make check' failures introduced with commit
80964226e9.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70900
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-10-26 23:13:51 -07:00
Brian Paul
bc23944091 mesa: add vdpau.c and st_vdpau.c to src/mesa/SConscript
Fixes SCons build.
2013-10-26 07:24:17 -06:00
Christian König
80964226e9 implement NV_vdpau_interop v7
v2: Actually implement interop between the gallium
    state tracker and the VDPAU backend.

v3: Make it also available in non legacy contexts,
    fix video buffer sharing.

v4: deny interop if we don't have the same screen object

v5: rebased on upstream changes

v6: implemented VDPAUGetSurfaceivNV, improved error handling,
    unregister all surfaces in VDPAUFiniNV

v7: squash merge with Marek's changes

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-10-26 12:13:36 +02:00
Christian König
3d3a0b9b67 winsys/radeon: make radeon_drm_winsys_create public
Otherwise OpenGL/VDPAU interop won't work as expected.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-10-26 12:13:36 +02:00
Chris Forbes
598ca510b8 i965: Remove ir_txf coord+offset special case in visitors
Just let it be handled by the lowering pass.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-26 22:56:27 +13:00
Chris Forbes
06de9f8ff1 i965: Generalize coord+offset lowering pass for ir_txf
ir_txf expects an ivec* coordinate, and may be larger than ivec2;
shuffle things around so that this will work.

V2: Fix style nits, use ir_builder

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-26 22:56:25 +13:00
Chris Forbes
72b5e9c42a i965: Add lowering pass to fold offset into unnormalized coords
It turns out that nonzero offsets with gsampler2DRect don't work -- they
just return garbage. Work around this by folding the offset into the
coord.

Done as an IR pass rather than yet another hack in the visitors because
it's clear what's going on this way. Can possibly reuse this to replace
the existing txf coord+offset hacks.

V2: Use ir_builder

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-26 22:56:09 +13:00
Chris Forbes
a936000db6 i965: Add lowering pass for splitting textureGatherOffsets
Rewrites textureGatherOffsets(s, p, offsets) into

   gvec4(
      textureGatherOffset(s, p, offsets[0]).w,
      textureGatherOffset(s, p, offsets[1]).w,
      textureGatherOffset(s, p, offsets[2]).w,
      textureGatherOffset(s, p, offsets[3]).w
      )

V2: Use ir_builder to be slightly clearer.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-26 22:28:26 +13:00
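
The rewrite performed by a936000db6 can be illustrated with a small toy model. Everything here is an assumption made for illustration: the texture is a 2D list of scalars, coordinates are integer texel positions, edges are clamped, and `texture_gather_offset` is a hypothetical stand-in for the GLSL built-in rather than anything from Mesa.

```python
def texel(tex, x, y):
    """Fetch one texel, clamping coordinates to the texture edge."""
    h, w = len(tex), len(tex[0])
    return tex[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

def texture_gather_offset(tex, p, offset):
    """Return the 2x2 footprint at p + offset as (x, y, z, w).

    Component order follows the GLSL gather convention:
    x = (i0, j1), y = (i1, j1), z = (i1, j0), w = (i0, j0).
    """
    i0, j0 = p[0] + offset[0], p[1] + offset[1]
    i1, j1 = i0 + 1, j0 + 1
    return (texel(tex, i0, j1), texel(tex, i1, j1),
            texel(tex, i1, j0), texel(tex, i0, j0))

def texture_gather_offsets(tex, p, offsets):
    """Lowered form: four single-offset gathers, keeping only .w of each."""
    return tuple(texture_gather_offset(tex, p, off)[3] for off in offsets)

# 4x4 texture whose texel value encodes its position as 10*y + x.
tex = [[10 * y + x for x in range(4)] for y in range(4)]
result = texture_gather_offsets(tex, (1, 1), [(0, 0), (1, 0), (0, 1), (1, 1)])
```

Each component of `result` is the texel at `p + offsets[i]`, since `.w` of a gather is the (i0, j0) sample of its footprint.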
Chris Forbes
4c1eae5395 i965: Add asserts to ensure that ir_tg4 offset arrays are lowered
We don't have a message that does 4 independent offsets; a lowering
pass needs to lower it to 4 normal gather4s before reaching this
point.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-26 22:28:05 +13:00
Chris Forbes
de8948a0b6 glsl: add signatures for textureGatherOffsets()
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-26 22:28:03 +13:00
Chris Forbes
a9de744a26 glsl: add support for texture functions with offset arrays
This is needed for textureGatherOffsets()

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-26 22:27:37 +13:00
Chris Forbes
3c98d77460 i965/fs: Add support for shadow comparitors with gather4
Note that gather4_po_c's parameters are too long for SIMD16. It might be
worth emitting 2xSIMD8 messages in this case at some point.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-26 22:16:32 +13:00
Chris Forbes
32f898a71c i965/vs: Add support for shadow comparitors with gather4
gather4_c's argument layout is straightforward -- refz just goes on the
end.

gather4_po_c's layout, however, is different: the array index is replaced with refz.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-26 22:16:28 +13:00
Chris Forbes
070c841111 i965: Add Gen7 gather4_c and gather4_po_c message types
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-26 22:16:27 +13:00
Chris Forbes
43e3ae112f glsl: Add new textureGather[Offset]() overloads for shadow samplers
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-26 22:16:24 +13:00
Chris Forbes
af1dfd99b7 glsl: Add support for separate reference Z for shadow samplers
ARB_gpu_shader5's textureGather*() functions which take shadow samplers
have a separate `refz` parameter rather than adding it to the
coordinate.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-26 22:16:19 +13:00
Chris Forbes
fb08769bb6 i965/vs: add support for gather4 with nonconstant offsets
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2013-10-26 22:10:02 +13:00
Chris Forbes
938d909894 i965/fs: add support for gather4 with nonconstant offsets
V3: fixup crazy check for whether we need to emit the coordinate after
    custom handling.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-26 22:08:51 +13:00
Chris Forbes
bdcacaed9c i965: relax brw_texture_offset assert
Some texturing ops are about to have nonconstant offset support; the
offset in the header in these cases should be zero.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-26 21:54:15 +13:00
Chris Forbes
6bb2cf2107 i965: Add SHADER_OPCODE_TG4_OFFSET for gather with nonconstant offsets.
The generator code ends up clearer this way than if we had to sniff
via the message length. Implemented via the gather4_po message in
hardware, which is present in Gen7 and later.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-26 21:54:15 +13:00
Chris Forbes
cd8505bfb8 i965: add missing tg4 case in brw_instruction_name
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-26 21:54:15 +13:00
Chris Forbes
4fa123deac glsl: relax const offset requirement for textureGatherOffset
Prior to ARB_gpu_shader5 / GLSL 4.0, the offset is required to be
a constant expression.

With that extension, it is relaxed to be an arbitrary expression.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-26 21:54:15 +13:00
Chris Forbes
00235402a0 glsl: Add ARB_gpu_shader5 textureGatherOffset signatures
- gsampler2DRect
- optional `comp` parameter

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-26 21:54:15 +13:00
Kenneth Graunke
d07d38e696 i965: Weaken the flushing in gen7_end_transform_feedback().
Since 062317d667 (i965: Go back to using the kernel SOL reset feature.)
we've been flushing the batch on BeginTransformFeedback().  So it's not
necessary to do it on EndTransformFeedback().  A PIPE_CONTROL will work.

This makes gen7_end_transform_feedback() exactly the same as the gen6
variant.  However, they'll diverge again shortly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-25 22:25:38 -07:00
Eric Anholt
93bd627d5a i965/fs: Stop trying to hack around MRF dep chains on gen7+ LIFO scheduling.
This was a hack to avoid choosing to schedule all texturing before
consumption of any texture results due to the way dependency chains worked
out in the presence of MRFs.  On gen7, we don't have MRFs, so the problem
doesn't apply, and this was just badly constraining our scheduling.

total instructions in shared programs: 1615306 -> 1612534 (-0.17%)
instructions in affected programs:     9958 -> 7186 (-27.84%)
GAINED:                                259
LOST:                                  9

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-25 16:45:30 -07:00
Eric Anholt
c3c9a8c857 i965: Try not to reverse-schedule things when doing LIFO scheduling.
The LIFO plan was simple: Take the most recently made available
instructions, and pick those first.

But because of the order we were pushing things onto our list of
available-to-schedule instructions, it meant that when a set of
instructions was made available at the same time (for example, everything
at the start of the program that didn't depend on other instructions) we'd
schedule them in reverse order.

If you had 10 texture calls in a row in your program, each with
independent argument setup, we'd set up the last texture call's args and
execute it first, even though we wouldn't be able to consume its results
until we'd finished the other 9 texture calls (assuming consumption of
texture results happens near each texture call, and combines it with
another texture result, which is normal for a convolution shader).

To fix this, walk the list for doing LIFO in the order that instructions
were originally generated in the program, but choose to push
newly-made-available instructions to the other end of the list instead.

total instructions in shared programs: 1587242 -> 1586290 (-0.06%)
instructions in affected programs:     7801 -> 6849 (-12.20%)
GAINED:                                76
LOST:                                  67

Thanks to Chia-I Wu for pointing out the bug in my first version of the
patch that made it a huge loss.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-25 16:45:30 -07:00
Ilia Mirkin
a7ce1fef27 mesa/st: disable ARB_framebuffer_object when no driver support.
When PIPE_CAP_MIXED_FRAMEBUFFER_SIZES is not provided, parts of
ARB_framebuffer_object can't be supported, such as on NV30.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2013-10-26 01:36:07 +02:00
Ilia Mirkin
12d39b4fa8 gallium: add PIPE_CAP_MIXED_FRAMEBUFFER_SIZES
This CAP will determine whether ARB_framebuffer_object can be enabled.
The nv30 driver does not allow mixing swizzled and linear zsbuf/cbuf
textures.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2013-10-26 01:36:07 +02:00
Adam Jackson
1090eb5755 glx: Fix return value from indirect_bind_context
_XReply returns 1 on success, but indirect_bind_context returns 0 on
success.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70486
Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-10-25 16:49:28 -04:00
Matt Turner
64c081e8b7 glsl: Optimize (not A) and (not B) into not (A or B).
No shader-db changes, but seems like a good idea.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-25 10:35:18 -07:00
Matt Turner
65a600f58a glsl: Optimize (not A) or (not B) into not (A and B).
A few Serious Sam 3 shaders affected:

instructions in affected programs:     4384 -> 4344 (-0.91%)

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-25 10:35:13 -07:00
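
Both this rewrite and the one in 64c081e8b7 above are applications of De Morgan's laws. A minimal check, written here in Python rather than in the GLSL IR the optimizer actually works on, confirms the rewritten forms agree with the originals for every input; the function names are illustrative only.

```python
from itertools import product

def rewrite_or(a, b):
    """(not A) or (not B)  ->  not (A and B): one NOT instead of two."""
    return not (a and b)

def rewrite_and(a, b):
    """(not A) and (not B)  ->  not (A or B)."""
    return not (a or b)

# Exhaustively compare the original and rewritten forms.
equivalent = all(
    ((not a) or (not b)) == rewrite_or(a, b)
    and ((not a) and (not b)) == rewrite_and(a, b)
    for a, b in product([False, True], repeat=2)
)
```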
Matt Turner
e52959e961 i965/fs: Match commutative expressions with reversed arguments.
total instructions in shared programs: 1645011 -> 1644938 (-0.00%)
instructions in affected programs:     17543 -> 17470 (-0.42%)

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-25 10:34:02 -07:00
Matt Turner
503fe278b0 i965: s/Muchnik/Muchnick/.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-25 10:34:02 -07:00
Marek Olšák
9807556e86 r600g,radeonsi: use fences provided by the winsys 2013-10-25 11:55:55 +02:00
Marek Olšák
6067a30838 winsys/radeon: add the implementation of fences from r300g 2013-10-25 11:55:55 +02:00
Marek Olšák
48784f3591 radeonsi: add the vertex shader position output if it's missing
This fixes a lockup in piglit/spec/glsl-1.40/execution/tf-no-position.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-10-25 11:55:55 +02:00
Marek Olšák
94715130e6 radeonsi: respect semantic indices for COLOR[i] fragment shader outputs
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-10-25 11:55:55 +02:00
Paul Berry
e8f6f244bb glsl: When disabling gl_PerVertex variables, check that mode matches.
In commit 1b4a737 (glsl: Support redeclaration of VS and GS
gl_PerVertex output), I added code to ensure that when an unnamed
gl_PerVertex interface block is redeclared, any ir_variables that
weren't included in the redeclaration are removed from the IR (and the
symbol table).  This ensures that only those variables that were
explicitly redeclared may be used.

However, when I wrote this code, I neglected to match the variable
mode when finding variables to remove.  This meant that redeclaring a
built-in output block might cause the built-in input gl_in to be
accidentally removed.

Fixes piglit test gs-redeclares-pervertex-out-only.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 22:01:30 -07:00
Paul Berry
719bf30165 glsl: Remove unused gl_PerVertex interface blocks.
The GLSL 4.10 rules for redeclaration of built-in interface blocks
(which we've chosen to regard as clarifications of GLSL 1.50) only
require gl_PerVertex blocks to match in shaders that actually use
those blocks.  The easiest way to implement this is to detect
situations where a compiled shader doesn't refer to any elements of
gl_PerVertex, and remove all the associated ir_variables from the
shader at the end of ast-to-ir conversion.

Fixes piglit tests
linker/interstage-{pervertex,pervertex-in,pervertex-out}-redeclaration-unneeded.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 22:01:27 -07:00
Paul Berry
37d97668ae glsl: Call check_builtin_array_max_size when redeclaring gl_in.
Normally when a built-in array (such as gl_ClipDistance) is
redeclared, we call get_variable_being_redeclared() to do the
redeclaration, and it in turn calls check_builtin_array_max_size() to
make sure that the redeclared array size isn't too large.

However when a built-in array is redeclared as part of redeclaring
gl_in, we don't call get_variable_being_redeclared() (since the
individual built-ins aren't each represented by their own ir_variable
anymore).  So we need to add an explicit call to
check_builtin_array_max_size() to make sure the new array size isn't
too large.

Note: at the moment this is redundant with a test that's done at link
time, so there's no change to piglit results.  But the patch that
follows will prevent link errors from being reported if gl_PerVertex
isn't used, so in order to prevent that patch from causing
regressions, we need to add the compile check now.  Besides, it's
nicer to report this error at compile time anyhow.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 22:01:24 -07:00
Paul Berry
156b31c5be mesa: Fix geometry shader program queries.
The queries GEOMETRY_VERTICES_OUT, GEOMETRY_INPUT_TYPE, and
GEOMETRY_OUTPUT_TYPE (defined by GL 3.2) differ from the corresponding
queries in ARB_geometry_shader4 in the following ways:

- They use different enum values

- They can only be queried; they cannot be set.

- Attempting to query them yields INVALID_OPERATION if the program is
  not linked, or lacks a geometry shader.

This patch switches us over from the ARB_geometry_shader4 behaviour to
the GL 3.2 behaviour.

Fixes piglit test query-gs-prim-types.

v2: Improve comment above has_core_gs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 22:01:22 -07:00
Paul Berry
a49830b8f5 glsl: Account for interface block lowering in program_resource_visitor.
When program_resource_visitor visits variables that were created by
lower_named_interface_blocks, it needs to do extra work to un-do the
effects of lower_named_interface_blocks and construct the proper API
names.

Fixes piglit test
spec/glsl-1.50/execution/interface-blocks-api-access-members.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 22:01:19 -07:00
Paul Berry
4b97c581b4 glsl: mark variables produced by lower_named_interface_blocks.
These variables will need to be treated specially by
program_resource_visitor, so that they can be addressed through the
API using their interface block name (and array index, for interface
block arrays).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 22:01:14 -07:00
Paul Berry
99512dc40d glsl: Keep track of centroid/interpolation mode for interface block members.
Fixes piglit tests:
- interface-block-interpolation-{array,named,unnamed}
- glsl-1.50-interface-block-centroid {array,named,unnamed}

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 22:01:10 -07:00
Paul Berry
e17d671d9f glsl: Pass variable mode into ast_process_structure_or_interface_block().
Later patches will use this information to do proper error checking of
interpolation qualifiers that appear inside of interface blocks.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 22:01:07 -07:00
Paul Berry
81a5067966 glsl: Extract interpretation of interpolation to its own function.
In future patches, we will need this in order to interpret
interpolation qualifiers that appear inside interface blocks.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 22:01:04 -07:00
Paul Berry
f65feb5335 glsl: Pull interpolation_string() out of ir_variable.
Future patches will need to call this function when there isn't an
ir_variable present to refer to.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 22:00:59 -07:00
Paul Berry
1e3e72e305 i965: Reduce gl_MaxGeometryInputComponents to 64.
Although in principle there is no hardware limitation that prevents
gl_MaxGeometryInputComponents from being set to 128 on Gen7, we have
the following limitations in the vec4 compiler back end:

- Registers assigned to geometry shader inputs can't be spilled or
  later re-used for any other purpose.

- The last 16 registers are set aside for the "MRF hack", meaning they
  can only be used to send messages, and not for general purpose
  computation.

- Up to 32 registers may be reserved for push constants, even if there
  is sufficient register pressure to make this impractical.

A shader using 128 geometry input components, and having an input type
of triangles_adjacency, would use up:

- 1 register for r0 (which holds URB handles and various pieces of
  control information).

- 1 register for gl_PrimitiveID.

- 102 registers for geometry shader inputs (17 registers per input
  vertex, assuming DUAL_INSTANCED dispatch mode and allowing for one
  register of overhead for gl_Position and gl_PointSize, which are
  present in the URB map even if they are not used).

- Up to 32 registers for push constants.

- 16 registers for the "MRF hack".

That's a total of 152 registers, which is well over the 128 registers
the hardware supports.

Fortunately, the GLSL 1.50 spec allows us to reduce
gl_MaxGeometryInputComponents to 64.  Doing that frees up 48
registers, bringing the total down to 104 registers, leaving 24
registers available to do computation.

Fixes piglit test
spec/glsl-1.50/execution/geometry/max-input-components.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 22:00:57 -07:00
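
The register budget in 1e3e72e305 can be re-derived with a few lines of arithmetic. The per-vertex formula below is an inference from the figures in the message (17 registers per vertex for 128 components, i.e. 8 components per register plus one register of overhead), not something taken from the driver source, and the constant names are illustrative.

```python
GRF_TOTAL = 128          # registers the hardware supports
VERTICES = 6             # input vertices for triangles_adjacency
R0 = 1                   # URB handles and control information
PRIMITIVE_ID = 1         # gl_PrimitiveID
PUSH_CONSTANTS = 32      # worst case
MRF_HACK = 16            # registers reserved for sending messages

def registers_needed(max_input_components):
    # Assumed from the message: 8 input components fit in one register in
    # DUAL_INSTANCED mode, plus one register per vertex of overhead for
    # gl_Position and gl_PointSize.
    per_vertex = max_input_components // 8 + 1
    return R0 + PRIMITIVE_ID + VERTICES * per_vertex + PUSH_CONSTANTS + MRF_HACK

before = registers_needed(128)        # limit of 128 components
after = registers_needed(64)          # limit reduced to 64 components
headroom = GRF_TOTAL - after          # registers left for computation
```

With these assumptions `before` exceeds `GRF_TOTAL`, while `after` leaves `headroom` registers free, matching the totals quoted in the message.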
Paul Berry
3c2feb1969 i965/gs: If a DUAL_OBJECT gs would spill, fall back to DUAL_INSTANCED.
This is similar to what we do for 16-wide vs 8-wide fragment shaders.
First we try compiling the geometry shader in DUAL_OBJECT mode.  If we
can't do that without spilling, we fall back on DUAL_INSTANCED mode,
which should require less spilling (since it uses an interleaved
layout of payload registers).

In an ideal world we'd fall back to SINGLE mode, which would allow us
to interleave general-purpose registers too (resulting in even less
likelihood of spilling).  But at the moment, the vec4 generator and
visitor classes don't have the infrastructure to interleave general
purpose registers, so DUAL_INSTANCED is the best we can do.

As a side benefit this paves the way for implementing instanced
geometry shaders (which are incompatible with DUAL_OBJECT mode).

Since most geometry shaders used in piglit testing are small,
DUAL_INSTANCED mode won't get exercised very much in a normal piglit
run.  To force DUAL_INSTANCED mode to be used for all geometry
shaders, set INTEL_DEBUG=nodualobj.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-24 22:00:53 -07:00
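
The fallback strategy in 3c2feb1969 might be sketched like this. It is only an outline under assumptions: `try_compile` is a hypothetical callback standing in for the vec4 back end, the mode names are plain strings, and allowing spilling in the fallback path is inferred from the message rather than confirmed by it.

```python
def compile_gs(shader, try_compile, no_dual_object=False):
    """Pick a dispatch mode for a geometry shader.

    try_compile(shader, mode, allow_spilling) is a hypothetical callback
    that returns a compiled program, or None when register allocation
    fails with spilling disabled.
    """
    # no_dual_object models INTEL_DEBUG=nodualobj, which skips the first try.
    if not no_dual_object:
        program = try_compile(shader, "DUAL_OBJECT", allow_spilling=False)
        if program is not None:
            return program
    # Interleaved payload registers make spilling less likely in this mode;
    # spilling is permitted here as a last resort (assumption).
    return try_compile(shader, "DUAL_INSTANCED", allow_spilling=True)
```

The shape mirrors the 16-wide versus 8-wide choice for fragment shaders mentioned in the message: try the more efficient mode first, and keep the more conservative one as the fallback.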
Paul Berry
03ac2c7223 i965/gs: Fix up gl_PointSize input swizzling for DUAL_INSTANCED gs.
Geometry shaders that run in "DUAL_INSTANCED" mode store their inputs
in vec4's.  This means that when compiling gl_PointSize input
swizzling (a MOV instruction which uses a geometry shader input as
both source and destination), we need to do two things:

- Set force_writemask_all to ensure that the MOV happens regardless of
  which channels are enabled.

- Set the source register region to <4;4,1> (instead of <0;4,1>) to
  satisfy register region restrictions.

v2: move the source register region fixup to the top of
vec4_generator::generate_vec4_instruction(), so that it applies to all
instructions rather than just MOV.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-24 22:00:50 -07:00
Paul Berry
a05589ea0b i965/gs: Add the ability to compile a DUAL_INSTANCED geometry shader.
Not yet enabled.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-24 22:00:46 -07:00
Paul Berry
34cba13ef8 i965/vec4: Add the ability to suppress register spilling.
In future patches, this will allow us to first try compiling a
geometry shader in DUAL_OBJECT mode (which is more efficient but uses
more registers) and then if spilling is required, fall back on
DUAL_INSTANCED mode.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-24 22:00:43 -07:00
Paul Berry
89647cffb3 i965/vec4: if register allocation fails, don't try to schedule.
Otherwise the scheduler would be invoked with prog_data->total_grf ==
0, causing havoc.

In a future patch, this will allow us to try compiling a geometry
shader in DUAL_OBJECT mode with spilling disabled, and then fall back
to DUAL_INSTANCED mode if that failed.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-24 22:00:40 -07:00
Paul Berry
8bb15813e3 i965/vec4: Add the ability for attributes to be interleaved.
When geometry shaders are operated in "single" or "dual instanced"
mode, a single set of geometry shader inputs is interleaved into the
thread payload (with each payload register containing a pair of
inputs) in order to save register space.

This patch modifies vec4_visitor::lower_attributes_to_hw_regs so that
it can handle the interleaved format.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-24 22:00:37 -07:00
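
One way to picture the interleaved layout from 8bb15813e3 is as a mapping from attribute index to payload position. The sketch below assumes 8 float channels per register and a hypothetical `first_payload_reg` parameter; neither the function nor its return convention comes from the driver.

```python
def attribute_location(attr, first_payload_reg, interleaved):
    """Return (register, starting float channel) for a vec4 attribute.

    Assumption: a register holds 8 floats, so an interleaved register
    carries two vec4 inputs, one in each half.
    """
    if interleaved:
        return (first_payload_reg + attr // 2, 4 * (attr % 2))
    # Non-interleaved: one attribute per register, starting at channel 0.
    return (first_payload_reg + attr, 0)
```

Under this model, interleaving halves the number of payload registers a set of inputs occupies, which is the register saving the message refers to.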
Paul Berry
3da2c5123d i965/gs: Set force_writemask_all when setting up g0.
All geometry shaders begin with this instruction:

    mov(1) g0.2<1>:ud 0x0:ud { align1 }

which sets up GRF0 properly for scratch reads and writes.  Since this
instruction has a SIMD size of 1, it will only have an effect if the
first channel is enabled.  In practice, the hardware seems to always
dispatch geometry shaders with the first channel enabled, but I can't
find anything in the docs to guarantee that.

So to be on the safe side, set force_writemask_all on the instruction,
which guarantees that it will have the desired effect regardless of
which channels are enabled.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-24 22:00:37 -07:00
Paul Berry
172aec281d glsl: set explicit_location correctly in lower_named_interface_blocks.
When lower_named_interface_blocks lowers a built-in interface block
member to an ir_variable, it needs to set explicit_location in the
ir_variable.  Otherwise the linker gets confused and treats the
variable as a generic varying.

Fixes the following piglit tests, which were regressed by commit
63974c0 (glsl: Simplify the interface to
link_invalidate_variable_locations):
- clip-distance-bulk-copy
- clip-distance-in-bulk-read
- clip-distance-in-explicitly-sized
- clip-distance-in-param
- clip-distance-in-values
- core-inputs
- gs-redeclares-both-pervertex-blocks
- gs-redeclares-pervertex-in-only
- redeclare-pervertex-subset-vs-to-gs
- unsized-in-named-interface-block-gs
- unsized-in-named-interface-block-multiple
- unsized-in-unnamed-interface-block-gs
- unsized-in-unnamed-interface-block-multiple

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70820

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 22:00:32 -07:00
Paul Berry
85db1326a2 i965/gs: Precompile geometry shaders.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 22:00:28 -07:00
Paul Berry
e0f34301b2 i965/vec4: Extract function to set up vec4 prog key for precompiling.
This will allow us to re-use it for precompiling geometry shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 22:00:25 -07:00
Paul Berry
068df64ba6 i965/vec4: Remove uses_clip_distance from program key.
This should never have been in the program key in the first place,
since it's determined by the shader source, not by GL state.  Change
the code to just refer to gl_program::UsesClipDistanceOut directly.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 22:00:22 -07:00
Paul Berry
11634e491b glsl: Move UsesClipDistance from gl_{vertex,geometry}_program into gl_program.
This will make it easier for back-ends to share code between geometry
shader and vertex shader compilation.  Also, it is renamed to
"UsesClipDistanceOut" to clarify that (a) in geometry shaders, it
refers to the gl_ClipDistance output rather than the gl_ClipDistance
input, and (b) it is irrelevant in fragment shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 22:00:13 -07:00
Paul Berry
44b7ebe52d glsl/gs: Fix transform feedback of gl_ClipDistance.
Since gl_ClipDistance is lowered from an array of floats to an array
of vec4's during compilation, transform feedback has special logic to
keep track of the pre-lowered array size so that attempting to perform
transform feedback on gl_ClipDistance produces a result with the
correct size.

Previously, this special logic always consulted the vertex shader's
size for gl_ClipDistance.  This patch fixes it so that it uses the
geometry shader's size for gl_ClipDistance when a geometry shader is
in use.

Fixes piglit test spec/glsl-1.50/transform-feedback-type-and-size.

v2: Change the type of LastClipDistanceArraySize to "unsigned", and
clarify the comment above it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 21:59:39 -07:00
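
The size bookkeeping that 44b7ebe52d fixes can be illustrated with two small helpers. Both names are hypothetical, and treating a missing geometry shader as `None` is a simplification made for this sketch.

```python
def lowered_vec4_count(num_floats):
    """Number of vec4s a float array occupies after lowering."""
    return (num_floats + 3) // 4

def clip_distance_feedback_size(vs_size, gs_size=None):
    """Pre-lowering float count that transform feedback should report.

    gs_size is None when no geometry shader is in use (assumption of
    this sketch); otherwise the geometry shader's size takes precedence.
    """
    return gs_size if gs_size is not None else vs_size
```

A gl_ClipDistance array of 6 floats lowers to 2 vec4s, but transform feedback still has to report 6, and when a geometry shader is present that 6 has to come from the geometry shader's declaration.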
Paul Berry
fe36154ff3 i965: Fix gl_MaxCombinedTextureImageUnits.
We've always overridden
ctx->Const.{Vertex,Fragment}Program.MaxTextureImageUnits to reflect
the number of texture image units supported by the hardware (rather
than using the default values assigned by Mesa core) so it seems
sensible to do that for GeometryProgram.MaxTextureImageUnits too.  We
set it to 0 if geometry shaders aren't supported.

Once that is done, we can just unconditionally add
GeometryProgram.MaxTextureImageUnits to MaxCombinedTextureImageUnits.

Fixes piglit test "spec/glsl-1.50/built-in
constants/gl_MaxCombinedTextureImageUnits".

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-24 21:14:26 -07:00
Rob Clark
a453242fda freedreno/a3xx/compiler: relative addressing
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-10-24 20:21:08 -04:00
Rob Clark
4317c4e6e0 freedreno/a3xx: fix const/rel/const-rel encoding
The encoding of constant, relative, and relative-const src registers is
a bit more complex than originally thought, which gives an extra bit to
encode const reg # at expense of taking a bit from relative offset.

In most cases a3xx seems to actually use a scheme whereby it can encode
an extra bit for const register.  You have three possible encodings in
thirteen bits:

   register:  (11 bits for N.c)
     00........... rN.c

   relative:  (10 bits for N)
     010.......... r<a0.x + N>
     011.......... c<a0.x + N>

   const:     (12 bits for N.c)
     1............ cN.c

Which means we can deal w/ more consts than previously thought.
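
The three encodings above could be decoded roughly as follows.  This is
only a sketch: the type and function names are hypothetical, the payload
is returned as raw bits, and only the bit layout comes from the table
above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; not the freedreno compiler's own types. */
enum src_kind { SRC_REG, SRC_REL_REG, SRC_REL_CONST, SRC_CONST };

struct src_decoded {
   enum src_kind kind;
   uint16_t value;   /* N.c for register/const, N for relative */
};

static struct src_decoded
decode_src13(uint16_t bits)
{
   struct src_decoded d;

   assert((bits & ~0x1fffu) == 0);       /* only 13 bits are meaningful */

   if (bits & 0x1000) {                  /* 1............  cN.c, 12 bits */
      d.kind = SRC_CONST;
      d.value = bits & 0x0fff;
   } else if ((bits & 0x0800) == 0) {    /* 00...........  rN.c, 11 bits */
      d.kind = SRC_REG;
      d.value = bits & 0x07ff;
   } else {                              /* 01x..........  relative, 10 bits */
      d.kind = (bits & 0x0400) ? SRC_REL_CONST : SRC_REL_REG;
      d.value = bits & 0x03ff;
   }
   return d;
}
```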

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-10-24 20:21:08 -04:00
Rob Clark
bfd30935c9 freedreno/a3xx: add blend state
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-10-24 20:21:08 -04:00
Rob Clark
0a1e4361e8 freedreno/resource: fail more gracefully
Fail more gracefully when buffer allocation/import fails.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-10-24 20:21:08 -04:00
Roland Scheidegger
2b2fc03beb gallivm: implement fully accurate corner filtering for seamless cube maps
d3d10 requires that cube corners are filtered with accurate weights (that
is, the weight of the non-existing corner texel should be evenly distributed
to the other 3 texels). OpenGL does not require this (but recommends it).
This requires us to use different filtering code, since we need per-texel
weights which our 2d lerp doesn't (and can't) do. And of course the (now
per element) weights need to be adjusted too for it to work.
Invoke the new filtering code whenever there's an edge to keep things simpler,
as it will work for edges too not just corners but of course it's only needed
with corners.
More ugly code for not much gain but at least a hacked up cubemap demo
shows very nice corners now... Not sure yet if and how this should be
configurable...
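
As a rough scalar illustration of the weight adjustment (not the gallivm
code, which generates vectorized LLVM IR), the weight of the missing
corner texel can be split evenly across the other three.  The function
name and the four-element weight layout are assumptions:

```c
#include <assert.h>

/* w[] holds the four bilinear weights of a 2x2 footprint; `missing` is
 * the index of the texel that does not exist at a cube corner.  Its
 * weight is handed out in equal thirds to the remaining texels. */
static void
redistribute_corner_weight(float w[4], int missing)
{
   assert(missing >= 0 && missing < 4);

   float share = w[missing] / 3.0f;
   for (int i = 0; i < 4; i++) {
      if (i != missing)
         w[i] += share;
   }
   w[missing] = 0.0f;
}
```

The weights still sum to one afterwards, which is why a plain 2d lerp
with two shared fractions cannot express the result.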

v2: incorporate feedback from Jose, only use special corner filtering code
when there's a corner not when there's only an edge (as corner filtering code
is slower, though a perf difference was only measurable when always
forcing edge code). Plus some minor style fixes.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-10-25 01:29:14 +02:00
Eric Anholt
dde9260fdc mesa: Remove dricore from the build.
No driver uses it any more, and it's been replaced by megadrivers.

v2: Remove always-on conditional for NEED_LIBPROGRAM (review by Emil)

Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-24 14:13:09 -07:00
Eric Anholt
bdcee13ca3 swrast: Build the driver into the shared mesa_dri_drivers.so.
v2: drop dridir now that it's unused.
v3: Fix linking after rebase when building just swrast from classic but a
    drm-using gallium driver.
v4: Consistently put spaces around += in the updated Makefile.am block.
v5: Set a global driverAPI variable so loaders don't have to update to
    createNewScreen2() (though they may want to for thread safety).

Reviewed-by: Matt Turner <mattst88@gmail.com> (v3)
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-24 14:13:09 -07:00
Eric Anholt
86d50c2f15 radeon: Build the driver into the shared mesa_dri_drivers.so.
This required some reordering of headers to ensure that the symbol name
redefines happened before any prototypes.

v2: drop dridir now that it's unused.
v3: Consistently put spaces around += in the updated Makefile.am blocks.
v4: Set a global driverAPI variable so loaders don't have to update to
    createNewScreen2() (though they may want to for thread safety).

Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-24 14:13:09 -07:00
Eric Anholt
6665b71b22 i915: Build the driver into the shared mesa_dri_drivers.so.
i915 has symbols for formerly-shared code that conflict with i965, so we
define them away using gen-symbol-redefs.py.  Options considered:

- This option.  Downsides: The symbols in profiling and debugging don't
  match the source.  The symbol list may change in the future and we won't
  notice without manually running the tool again.

- Use objcopy --localize-hidden to automatically demote our symbols to
  locals.  This didn't work on i965 due to c++ weak symbols (which can't
  be localized), but could work on i915.  We could do it on i915 only, but
  it does produce libtool warnings at link time due to libtool not knowing
  if the resulting .o file is safe to link (stupid libtool).  Plus you end
  up with different symbols of the same name, which is confusing for
  debugging too.  On the other hand, no future symbol conflicts long term.

- Write our own libelf tool that handles c++ weak symbols like we want and
  apply it to all drivers.  All the downsides of above, but applies
  uniformly across drivers.

- Edit the files to just rename all the i915 or i965 symbols that
  conflict.  There are on the order of 100 that have a prefix we used to
  share, so it would take a bit of typing.  Fewest downsides, but still
  can have conflicts long term.

Ultimately, this is the least invasive change at the moment, and we can
see if the "more symbol conflicts appear later" thing is a real concern or
not.

Note that the ability to compile a version of i915 without INTEL_DEBUG env
support is dropped.  It's too useful.

v2: drop dridir now that it's unused.
v3: Consistently put spaces around += in the updated Makefile.am block.
v4: Set a global driverAPI variable so loaders don't have to update to
    createNewScreen2() (though they may want to for thread safety).

Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-24 14:13:09 -07:00
Eric Anholt
ba10d79cca dri: Add a tool for generating #defines to namespace driver global symbols.
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-24 14:13:09 -07:00
Eric Anholt
ead86e378f nouveau: Build the driver into the shared mesa_dri_drivers.so.
v2: drop dridir now that it's unused.
v3: Consistently put spaces around += in the updated Makefile.am block.
v4: Set a global driverAPI variable so loaders don't have to update to
    createNewScreen2() (though they may want to for thread safety).
v5: Fix missed public symbol in nouveau. (caught by Emil)

Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-24 14:13:08 -07:00
Eric Anholt
1925a9aebd i965: Build the driver into a shared mesa_dri_drivers.so.
Previously, we've split things such that mesa core is in libdricore,
exposing the whole Mesa core interface in the global namespace, and the
i965_dri.so code all links against that.  Along with polluting application
namespace terribly, it requires extra PLT indirections and prevents LTO.

Instead, we can build all of the driver contents into the same .so with
just a few symbols exposed to be referenced from the actual driver .so
file, allowing LTO and reducing our exposed symbol count massively.

FPS improvement on GLB2.7 with INTEL_NO_HW=1: 2.61061% +/- 1.16957% (n=50)
(without LTO, just the PLT reductions from this commit)

Note that the X Server requires commit
7ecfab47eb221dbb996ea6c033348b8eceaeb893 to successfully load this driver!

v2: Set a global driverAPI variable so loaders don't have to update to
    createNewScreen2() (though they may want to for thread safety).
v3: Drop AM_CPPFLAGS addition (Emil pointed out I'd missed some cflags
    that would be necessary, though only if we actually relied on them).
v4: Fix install with DESTDIR set.

Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> (v2)
2013-10-24 14:12:58 -07:00
Eric Anholt
4e54751624 dri: Implement a DRI vtable extension to replace the global driDriverAPI.
As we move to megadrivers, we are unable to build multiple drivers with
the same public global symbol per driver (Think an X Server with an intel
and a nouveau driver, and the X Server implementing indirect for both --
we have to actually talk to the right driver).  By slipping the
driDriverAPI vtable into the driver's extension list, we can replace the
usage of the global symbol with usage of the loader-dlsym()ed driver
information.
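
The lookup this implies can be sketched as a walk over a NULL-terminated
extension list.  The struct and function names below are hypothetical
stand-ins, not the real DRI interface definitions:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a named, versioned driver extension. */
struct ext {
   const char *name;
   int version;
};

/* Return the extension called `name`, or NULL if the driver exposes no
 * extension list at all or the list does not contain it. */
static const struct ext *
find_extension(const struct ext *const *list, const char *name)
{
   if (list == NULL)
      return NULL;

   for (size_t i = 0; list[i] != NULL; i++) {
      if (strcmp(list[i]->name, name) == 0)
         return list[i];
   }
   return NULL;
}
```

A loader that gets NULL back would fall back to whatever older mechanism
it supports.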

v2: Pull in the hunk to avoid crashing on null driver_extensions.  Thanks,
    Emil!

Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-24 14:04:20 -07:00
Eric Anholt
f93533d118 dri: Pass in the dlsym()ed driver extension to screen creation.
This will allow a megadrivers build to reference the actual driver being
loaded from the shared dri_util screen creation code.

v2: Fix indentation, fallback case in EGL (review by Emil).

Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v1)
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-24 14:04:20 -07:00
Eric Anholt
67caf36489 gbm: Add support for the new __driDriverGetExtensions interface.
v2: Fix uninitialized variable use in the old-ABI case.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v1)
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-24 14:04:20 -07:00
Eric Anholt
a64bb7553a egl: Add an optional function call for getting the DRI driver interface.
v2: Fix asprintf error checking.

Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-24 14:04:20 -07:00
Eric Anholt
fcb57a8210 glx: Add an optional function call for getting the DRI driver interface.
The previous interface relied on a static struct, which meant that the
driver didn't get a chance to edit the struct before the struct got used.
For megadrivers, I want a struct specific to the driver being loaded.

v2: Fix the prototype in the docs (caught by Marek).  Since the driver
    name was in the function, we didn't need to also pass it in.
v3: Fix asprintf error checking (caught by Matt's gcc).

Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-24 14:04:20 -07:00
Eric Anholt
6868923702 dri: Move driver config options to dri driver extensions.
This way they aren't all sitting in the global namespace (with the same
name per driver).

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-24 14:04:20 -07:00
Eric Anholt
cf5d8fc310 dri: Allow config options to be passed to the loader through extensions.
Turns out we already have this nice mechanism for providing optional
things from the driver to the loader, and I was going to have to rename
the public global symbol to avoid conflicts when doing megadrivers.

While the former __driConfigOptions is technically loader interface, this
is the only loader that made use of that symbol.  Continue paying
attention to it if we can't find the new option, to retain compatibility
with old drivers.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-24 14:04:20 -07:00
Eric Anholt
80806c98ef glx: Move the driver extension-loading to a helper function.
I'm planning on doing driver extension parsing from 3 places, and making
the extension loading step a bit longer.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-24 14:04:20 -07:00
Francisco Jerez
7463abd37d clover: Query maximum kernel block size from the device instead of the kernel object.
Based on a similar fix from Aaron Watry.  It seems unlikely that we
will ever need a kernel-specific setting for this, and the Gallium API
doesn't support it.  Remove kernel::max_block_size() altogether.
2013-10-24 13:33:41 -07:00
Brian Paul
b8d7a97fad glsl: silence unused 'var' variable warning
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-24 10:45:47 -06:00
Brian Paul
8d7b913e4e svga: remove user-space vertex/index buffer code
The gallium vbuf module, which we've been using for some time now, takes
care of uploading user-space vertex/index data into real buffers.  The
upload code in the svga driver was unused.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-10-24 10:45:47 -06:00
Chad Versace
2f6a315085 i965: Print more debuginfo in intel_texsubimage_memcpy()
Print info about packing, format, type, and tiling. This will help debug
future issues with this fastpath.

Reviewed-by: Frank Henigman <fjhenigman@google.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-10-24 09:25:45 -07:00
Chad Versace
c4205590e7 i965: Fix glTexImage when packing alignment != cpp
Fixes texture corruption of Weston clients on cairo-glesv2 backend.
Commit 49ed599 introduced the bug.

Corruption occurred when glTexSubImage called
intel_texsubimage_tiled_memcpy() with:
  x,y=10,9
  w,h=7,7
  format=GL_ALPHA(0x1906)
  type=GL_UNSIGNED_BYTE(0x1401)
  gl_format=MESA_FORMAT_A8(0x18)
  packing.alignment=4

The function miscalculated the source image's stride as w*cpp=7 without
taking into account the packing alignment. The actual stride was 8.
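
The arithmetic can be sketched as below, assuming a power-of-two packing
alignment and ignoring other pixel-store state such as a custom row
length.  The helper names are hypothetical:

```c
#include <assert.h>

/* Round v up to the next multiple of a, where a is a power of two. */
static unsigned
align_pot(unsigned v, unsigned a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Bytes between the starts of consecutive source rows. */
static unsigned
src_row_stride(unsigned width, unsigned cpp, unsigned pack_alignment)
{
   return align_pot(width * cpp, pack_alignment);
}
```

With w=7, cpp=1 and an alignment of 4 this gives 8 rather than 7,
matching the case above.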

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70435
Reported-by: U. Artie Eoff <ullysses.a.eoff@intel.com>
Tested-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Frank Henigman <fjhenigman@google.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-10-24 09:25:24 -07:00
Rob Clark
a6e45b6a17 freedreno: fix compile error
Small typo introduced in a3ed98f.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-10-23 18:38:05 -06:00
Paul Berry
4df56177ed i965/fs: Only unroll high-accuracy dFdy() from SIMD16 to SIMD8 on gen4 and IVB.
In commit 800610f (i965/fs: Improve accuracy of dFdy() to match
dFdx()) I unrolled the high-accuracy dFdy() computation from a single
SIMD16 instruction to two SIMD8 instructions because of text I found
in the i965 (gen4) PRM saying that instruction compression could not
be used in align16 mode.  I couldn't find similar text in later
hardware docs, and I observed problems trying to use instruction
compression on align16 mode on Ivy Bridge, so I assumed that the
restriction still applied and the associated documentation had simply
been lost.

After consultation with the hardware engineers, it turns out this is
not the case.  In point of fact, the restriction was dropped in gen5,
re-introduced in Ivy Bridge, and dropped again in Haswell.  The reason
I didn't notice this is that in the Ivy Bridge documentation, the
restriction was in a different section, and described using different
language.

Now that we know that the restriction only applies to Gen4 and Ivy
Bridge, we can limit the unrolling to those platforms.

Tested on gen5, gen6, and gen7 (both Ivy Bridge and Haswell).

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-23 16:51:15 -07:00
Paul Berry
8e15207c9d glsl/gs: Prevent illegal input/output primitive types.
From the GLSL 1.50 spec, section 4.3.8.1 (Input Layout Qualifiers):

    The layout qualifier identifiers for geometry shader inputs are

        layout-qualifier-id
            points
            lines
            lines_adjacency
            triangles
            triangles_adjacency

And from section 4.3.8.2 (Output Layout Qualifiers)

    The layout qualifier identifiers for geometry shader outputs are

        layout-qualifier-id
            points
            line_strip
            triangle_strip
            max_vertices = integer-constant

We were erroneously allowing line_strip and triangle_strip to be used
as input qualifiers, and we were allowing lines, lines_adjacency,
triangles, and triangles_adjacency to be used as output qualifiers.
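
The two lists quoted from the spec amount to the following checks.  This
is an illustrative sketch with hypothetical enum and function names, not
the code in the GLSL front end:

```c
#include <stdbool.h>

/* Hypothetical enumeration of the primitive-type layout qualifiers. */
enum gs_prim {
   GS_POINTS,
   GS_LINES,
   GS_LINES_ADJACENCY,
   GS_TRIANGLES,
   GS_TRIANGLES_ADJACENCY,
   GS_LINE_STRIP,
   GS_TRIANGLE_STRIP
};

/* Qualifiers allowed on geometry shader inputs (section 4.3.8.1). */
static bool
gs_input_prim_ok(enum gs_prim p)
{
   switch (p) {
   case GS_POINTS:
   case GS_LINES:
   case GS_LINES_ADJACENCY:
   case GS_TRIANGLES:
   case GS_TRIANGLES_ADJACENCY:
      return true;
   default:
      return false;
   }
}

/* Qualifiers allowed on geometry shader outputs (section 4.3.8.2). */
static bool
gs_output_prim_ok(enum gs_prim p)
{
   switch (p) {
   case GS_POINTS:
   case GS_LINE_STRIP:
   case GS_TRIANGLE_STRIP:
      return true;
   default:
      return false;
   }
}
```

Only "points" is legal in both directions.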

Fixes piglit tests "glsl-1.50-gs-{input,output}-layout-qualifiers *".

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-23 16:51:05 -07:00
Eric Anholt
867d0cc1fe i965: Add perf debug hint when the app makes us do index buffer scanning.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-23 15:33:46 -07:00
Eric Anholt
c298f5ff56 i965: Try to avoid stalls on the GPU when doing glBufferSubData().
On DOTA2, framerate on dota2-de1.dem in windowed mode on my laptop
improves by 7.69854% +/- 0.909163% (n=3).  In a microbenchmark hitting
this code path (wall time of piglit vbo-subdata-many), runtime decreases
from 0.8 to 0.05 seconds.

v2: Use out of range start/end instead of separate bool for the active
    flag (suggestion by Jordan), fix double-upload in the stalling path.
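
The "out of range start/end" idea from v2 can be sketched like this.
The struct, the function names, and the overlap test are assumptions
made for illustration; the commit message only states that an
out-of-range start/end replaces a separate bool:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte range [start, end) that the GPU may still be using.  An empty
 * range is stored as start > end, so no "active" flag is needed. */
struct gpu_range {
   uint32_t start;
   uint32_t end;
};

static void
range_reset(struct gpu_range *r)
{
   r->start = UINT32_MAX;
   r->end = 0;
}

/* Grow the range to cover [offset, offset + size). */
static void
range_mark(struct gpu_range *r, uint32_t offset, uint32_t size)
{
   assert(size <= UINT32_MAX - offset);
   if (offset < r->start)
      r->start = offset;
   if (offset + size > r->end)
      r->end = offset + size;
}

/* True if writing [offset, offset + size) could touch tracked data. */
static bool
range_overlaps(const struct gpu_range *r, uint32_t offset, uint32_t size)
{
   assert(size <= UINT32_MAX - offset);
   return offset < r->end && offset + size > r->start;
}
```

A write that does not overlap the tracked range could then go straight
to the buffer without waiting for the GPU.  Note that the empty encoding
makes range_overlaps() return false without any special case.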

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-23 15:33:19 -07:00
Eric Anholt
3b58e0ed64 i965: Be sure to reset brw->vb.buffers[] when trying to redo vertex setup.
The brw_prepare_vertices that sets up buffers[] depends on these
parameters, so don't let brw_prepare_vertices() skip it.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-23 15:33:16 -07:00
Eric Anholt
a5e2e7f9a4 i965: Add support for GL_ARB_texture_buffer_range.
Supporting this extension turns out to simplify our code a bit over not
supporting this extension, once the glBufferSubData() synchronization code
lands.

v2: Use 16 byte alignment like we do for uniform buffers, due to unaligned
    access penalties.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
2013-10-23 15:33:10 -07:00
Eric Anholt
b37f7e0160 i965: Add a note about the late-allocation in intel_bufferobj_buffer().
This was mostly for the i915 system-memory VBO code, which we don't have
any more, but since that existed we've ended up producing dependencies on
it being there.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-23 15:33:06 -07:00
Eric Anholt
060a49a896 i965: Drop intel_bufferobj_source().
Since src_offset was always 0, it wasn't doing anything for us beyond
intel_bufferobj_buffer().

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-23 15:33:03 -07:00
Eric Anholt
c0a9436d19 i965: Fix texture buffer rendering after a whole buffer replacement.
If glBufferData(), glBufferSubData(0, obj->Size), or similar happens, we
get a new drm_intel_bo for the buffer object, and thus need to re-upload
texture buffer state so we point at the new data.

Fixes the new piglit GL_ARB_texture_buffer_object/data-sync

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-23 15:31:44 -07:00
David Heidelberger
2901e2efcd clover: fix build after a3ed98f7aa 2013-10-23 13:13:36 -07:00
Brian Paul
c1345720c8 nv50: clamp PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS to PIPE_MAX_SAMPLERS
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70212
Tested-by: Aaron Watry <awatry@gmail.com>
2013-10-23 13:43:18 -06:00
Brian Paul
ef98e2ee61 radeonsi: remove unused si_set_cs_sampler_view()
Fixes build breakage.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70804

Tested-by: Vinson Lee <vlee@freedesktop.org>
2013-10-23 13:42:51 -06:00
Brian Paul
a3ed98f7aa gallium: new, unified pipe_context::set_sampler_views() function
The new function replaces four old functions: set_fragment/vertex/
geometry/compute_sampler_views().

Note: at this time, it's expected that the 'start' parameter will
always be zero.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-23 10:15:38 -06:00
Brian Paul
b11fc226e6 svga: remove unneeded include of u_double_list.h 2013-10-23 10:15:38 -06:00
Kenneth Graunke
30bb170479 i965: Expose write_reg() as brw_store_register_mem64().
Writing a 64-bit register value to memory is sufficiently complicated
that it makes sense to reuse this function rather than duplicating it.

Exposing it outside of gen6_queryobj.c means it needs a more descriptive
function name.  It could probably be moved to brw_util.c or somewhere
else, but this works too.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-23 01:06:26 -07:00
Kenneth Graunke
d5db3ece0a i965: Move flushing out of write_reg and into the callers.
The current callers just want to write a single register, so combining
the register read with a pipeline flush made sense.  However, in the
future we'll want to do multiple register reads back to back, and we'll
only want to flush once.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-23 01:06:26 -07:00
Ian Romanick
63974c0f5b glsl: Simplify the interface to link_invalidate_variable_locations
The unit tests added in the previous commits prove some things about the
state of some internal data structures.  The most important of these is
that all built-in input and output variables have explicit_location
set.  This means that link_invalidate_variable_locations doesn't need to
know the range of non-generic shader inputs or outputs.  It can simply
reset location state depending on whether explicit_location is set.

There are two additional assumptions that were already implicit in the
code that comments now document.

  - ir_variable::is_unmatched_generic_inout is only used by the linker
    when connecting outputs from one shader stage to inputs of another
    shader stage.

  - Any varying that has explicit_location set must be a built-in.  This
    will be true until GL_ARB_separate_shader_objects is supported.

As a result, the input_base and output_base parameters to
link_invalidate_variable_locations are no longer necessary, and the code
for resetting locations and setting is_unmatched_generic_inout can be
simplified.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-22 15:23:30 -07:00
Ian Romanick
1eee0a9f01 glsl/tests: Unit test vertex shader in / out with link_invalidate_variable_locations
Validates:

  - ir_variable::explicit_location should not be modified.

  - If ir_variable::explicit_location is not set, ir_variable::location,
    ir_variable::location_frac, and
    ir_variable::is_unmatched_generic_inout must be reset to 0.

  - If ir_variable::explicit_location is set, ir_variable::location
    should not be modified.  ir_variable::location_frac, and
    ir_variable::is_unmatched_generic_inout must be reset to 0.
    Previous unit tests have shown that all non-generic inputs / outputs
    have explicit_location set.

v2: Split the link_invalidate_variable_locations interface change out to
a separate patch.  Remove the vertex_in_builtin_without_explicit and
vertex_out_builtin_without_explicit tests.  There was a lot of good
discussion about this on the mailing list to which I refer the
interested reader.  Both changes suggested by Paul.

    http://lists.freedesktop.org/archives/mesa-dev/2013-October/046652.html

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-22 15:23:30 -07:00
Ian Romanick
cf8b14ce6d glsl: Modify interface to link_invalidate_variable_locations
This will make it easier to unit test this function in successive
patches.  Also, correct the prototype in linker.h.  It was... wrong.

v2: Split the interface change from adding the unit tests.  Suggested by
Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-22 15:23:30 -07:00
Ian Romanick
af229c94e3 glsl/tests: Verify geometry shader built-ins generated by _mesa_glsl_initialize_variables
Checks that the variables generated meet certain criteria.

 - Geometry shader inputs have an explicit location.

 - Geometry shader outputs have an explicit location.

 - Fragment shader-only varying locations are not used.

 - Geometry shader uniforms and system values don't have an explicit
   location.

 - Geometry shader constants don't have an explicit location and are
   read-only.

 - No other kinds of geometry variables exist.

It does not verify that any specific variables exist.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-22 15:23:30 -07:00
Ian Romanick
f094a0f825 glsl/tests: Verify fragment shader built-ins generated by _mesa_glsl_initialize_variables
Checks that the variables generated meet certain criteria.

 - Fragment shader inputs have an explicit location.

 - Fragment shader outputs have an explicit location.

 - Vertex / geometry shader-only varying locations are not used.

 - Fragment shader uniforms and system values don't have an explicit
   location.

 - Fragment shader constants don't have an explicit location and are
   read-only.

 - No other kinds of fragment variables exist.

It does not verify that any specific variables exist.

v2: Use _mesa_varying_slot_in_fs in
fragment_builtin.inputs_have_explicit_location.  Suggested by Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-22 15:23:30 -07:00
Ian Romanick
d05202900b glsl/tests: Verify vertex shader built-ins generated by _mesa_glsl_initialize_variables
Checks that the variables generated meet certain criteria.

 - Vertex shader inputs have an explicit location.

 - Vertex shader outputs have an explicit location.

 - Fragment shader-only varying locations are not used.

 - Vertex shader uniforms and system values don't have an explicit
   location.

 - Vertex shader constants don't have an explicit location and are
   read-only.

 - No other kinds of vertex variables exist.

It does not verify that any specific variables exist.

v2: Fix memory management mistakes in
common_builtin::string_starts_with_prefix.  Clean up error message
reporting in common_builtin::no_invalid_variable_modes.  Both suggested
by Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-22 15:23:30 -07:00
Ian Romanick
78b70ceae1 glsl: When constructing a variable with an interface type, set interface_type
Ever since the addition of interface blocks with instance names, we have
had an implicit invariant:

    var->type->is_interface() ==
        (var->type == var->interface_type)

The odd use of == here is intentional because !var->type->is_interface()
implies var->type != var->interface_type.

Further, if var->type->is_array() is true, we have a related implicit
invariant:

    var->type->fields.array->is_interface() ==
        (var->type->fields.array == var->interface_type)

However, the ir_variable constructor doesn't maintain either invariant.
That seems kind of silly... and I tripped over it while writing some
other code.  This patch makes the constructor do the right thing, and it
introduces some tests to verify that behavior.

v2: Add general-ir-test to .gitignore.  Update the description of the
ir_variable invariant for arrays in the commit message.  Both suggested
by Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-22 15:23:30 -07:00
Ian Romanick
09ceed7587 mesa/tests: Add simple, dumb test for _mesa_program_state_string
After some discussions about the correct way to update
_mesa_program_state_string, I decided to make a unit test for the
function.  It turns out that the function didn't work quite the way I
thought.  The unit test proves that the code was already correct.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Anuj Phogat <anuj.phogat@gmail.com>
2013-10-22 15:23:30 -07:00
Ander Conselvan de Oliveira
98b359bd1b wayland: Don't leak wl_drm global when unbinding display 2013-10-22 14:57:03 -07:00
Scott Graham
dafa97fed9 mesa: fixes for MSVC 2013
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-22 08:39:40 -06:00
Brian Paul
65ee044a97 st/mesa: minor whitespace, comment changes in st_draw.c 2013-10-22 08:20:45 -06:00
Brian Paul
f166fbae36 st/dri: minor formatting clean-ups in dri_context.c 2013-10-22 08:20:45 -06:00
Brian Paul
f0d4636d9c mesa: fix a couple issues with U_FIXED, I_FIXED macros
Silence a bunch of MSVC type conversion warnings.

Changed return type of S_FIXED to int32_t (signed).  The result
is the same.  It just seems more intuitive that a signed conversion
function should return a signed value.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-10-22 08:20:45 -06:00
Brian Paul
6767c56e6d mesa: remove GL_MESA_program_debug bits from gl.h
The code for this was removed from Mesa some time ago.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-22 08:20:45 -06:00
Brian Paul
971c74309e mesa: remove remnants of GL_MESA_shader_debug
This extension never saw any real use so remove it.

v2: also update tests/num_strings.cpp for 'make check'

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-22 08:20:45 -06:00
Kenneth Graunke
43b05b8fac i965: Only emit interpolation setup if there are actual FS inputs.
Dead code elimination would get rid of the extra instructions, but
skipping this saves iterations through the optimization loop.

From shader-db:

      N     Min     Max        Median           Avg        Stddev
x 14672       3      16             3     3.1334515    0.59904168
+ 14672       1      16             3     2.8955153    0.77732963
Difference at 95.0% confidence
        -0.237936 +/- 0.0158798
        -7.59342% +/- 0.506783%
        (Student's t, pooled s = 0.693935)

Embarrassingly, the classic shadow mapping shader:

   void main() { }

used to require three iterations through the optimization loop.
With this patch, it only requires one (which makes no progress).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-21 23:31:15 -07:00
Chris Forbes
c4de86fd26 i965/fs: Fix accidental type conversion in header setup
Previously one side could be UD while the other was float.

V2: Prefer float; apparently IVB can dispatch float ops faster. (Thanks
Eric)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-22 18:56:14 +13:00
Chris Forbes
b38af01ccf i965/fs: Fix handling of sampler messages with header but zero offset
Gather unconditionally uses a header, but in some cases the
texture_offset value will be zero.

V2: Don't introduce a bogus conversion.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-22 18:56:14 +13:00
Matt Turner
f1e605f1ad glsl: Optimize -(-expr) into expr.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-21 22:53:36 -07:00
Matt Turner
963df4d37d glsl: Optimize abs(-expr) and abs(abs(expr)) into abs(expr).
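These rewrites, together with the -(-expr) one, can be illustrated on a
toy expression tree.  The node type and function below are hypothetical
and much simpler than the GLSL IR:

```c
#include <stddef.h>

/* Toy unary expression tree: a variable, a negation, or an abs(). */
enum op { OP_VAR, OP_NEG, OP_ABS };

struct expr {
   enum op op;
   struct expr *operand;   /* NULL for OP_VAR */
};

/* Simplify bottom-up and return the node that replaces e. */
static struct expr *
simplify(struct expr *e)
{
   if (e->op == OP_VAR)
      return e;

   e->operand = simplify(e->operand);

   /* -(-x) -> x */
   if (e->op == OP_NEG && e->operand->op == OP_NEG)
      return e->operand->operand;

   if (e->op == OP_ABS) {
      /* abs(-x) -> abs(x) */
      while (e->operand->op == OP_NEG)
         e->operand = e->operand->operand;
      /* abs(abs(x)) -> abs(x) */
      if (e->operand->op == OP_ABS)
         return e->operand;
   }
   return e;
}
```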
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-21 22:53:36 -07:00
Matt Turner
5b3aec412e glsl: Use saved values instead of recomputing them.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-21 22:53:36 -07:00
Matt Turner
6aeb7514c3 docs: Mark GLSL 1.50, 3.30, and geometry shaders done for i965.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-21 22:53:36 -07:00
Rico Schüller
aab03f75f3 docs: Update docs for ARB_texture_mirror_clamp_to_edge.
Signed-off-by: Rico Schüller <kgbricola@web.de>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-21 21:02:51 -07:00
Kenneth Graunke
2d3282188e i965: Implement ARB_texture_mirror_clamp_to_edge.
This passes Piglit's texwrap tests.

v2: Remove _EXT suffix.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Rico Schüller <kgbricola@web.de>
2013-10-21 21:02:51 -07:00
Kenneth Graunke
cc2f87891b i965: Drop unused simple_list.h includes.
These don't appear to be necessary.  Everything compiles just fine.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-21 21:02:51 -07:00
Kristian Høgsberg
1a2a30ba20 gbm-dri: Support importing RGB565 buffers 2013-10-21 20:56:17 -07:00
Paul Berry
672fab0b1b glsl/linker: Allow mixing of desktop GLSL versions.
Previously, Mesa followed the linkage rules outlined in the GLSL
1.20-1.40 specs, which (collectively) said that GLSL versions 1.10 and
1.20 could be linked together, but no other versions could be linked.

In GLSL 4.30, the linkage rules were relaxed so that any two desktop
GLSL versions can be linked together.  This change was made because it
reflected the behaviour of nearly all existing implementations (see
Khronos bug 8463).  Mesa was one of the few (perhaps the only)
exceptions to prohibit cross-linking of some GLSL versions.

Since the GLSL linkage rules were deliberately relaxed in order to
match the behaviour of existing implementations, it seems appropriate
to relax the rules in Mesa too (even though Mesa doesn't support GLSL
4.30 yet).

Note that linking ES and desktop shaders is still prohibited, as is
linking ES shaders having different GLSL versions.

Fixes piglit tests "shaders/version-mixing {interstage,intrastage}".

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-21 17:27:41 -07:00
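The relaxed linkage rule described in the commit above can be written as
a small predicate. This is a sketch of the rule as stated in the commit
message only; struct shader_version and versions_can_link() are
hypothetical names, not the linker's actual code.

```c
#include <stdbool.h>

/* Hypothetical description of one shader's #version line. */
struct shader_version {
   bool is_es;    /* true for GLSL ES, false for desktop GLSL */
   int version;   /* e.g. 110, 120, 330, or 100/300 for ES */
};

/* Returns true if two shaders may be linked into one program. */
static bool
versions_can_link(struct shader_version a, struct shader_version b)
{
   /* Mixing ES and desktop shaders is still prohibited. */
   if (a.is_es != b.is_es)
      return false;

   /* ES shaders must all have the same GLSL version. */
   if (a.is_es)
      return a.version == b.version;

   /* Any two desktop GLSL versions may be linked together. */
   return true;
}
```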
Francisco Jerez
e26ed75066 clover: Improve region and pitch argument handling in memory transfer APIs.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:04 -07:00
Francisco Jerez
adefa84d66 clover: Add a pixel_size() method to the image class.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:04 -07:00
Francisco Jerez
6230f77232 clover: Implement support for the ICD extension.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:03 -07:00
Francisco Jerez
9a5afd0dbd clover: Make sure hidden is the default symbol visibility.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:03 -07:00
Tom Stellard
07567c17f1 clover: Prepare the build system for ICD support.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
2013-10-21 10:47:03 -07:00
Francisco Jerez
9e0b7f76f9 clover: Fix memory leak when initializing a device object fails.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:03 -07:00
Francisco Jerez
1d741e3ac0 clover: Tidy up resource::mapping.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:03 -07:00
Francisco Jerez
6db102597a clover: Simplify command_queue::flush().
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:03 -07:00
Francisco Jerez
7a9bbff7d6 clover: Clean up the kernel and program object interface.
[ Tom Stellard: Make sure to bind global arguments before retrieving handles. ]
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:03 -07:00
Francisco Jerez
10284b1d2d clover: Clean up the interface of the context object slightly.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:03 -07:00
Francisco Jerez
5226eacf8d clover: Delete copy constructors and assignment operators in all non-copiable objects.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:03 -07:00
Francisco Jerez
369419f761 clover: Define a few convenience equality operators.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:03 -07:00
Francisco Jerez
c6e7a0d0d3 clover: Simplify the platform object by using util/range.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:03 -07:00
Francisco Jerez
e5fc61fa3f clover: Add property list helpers with a syntax consistent with other API objects.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:03 -07:00
Francisco Jerez
04d0ab9f64 clover: Switch samplers to the new model.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:03 -07:00
Francisco Jerez
d6f7afc3ed clover: Switch memory objects to the new model.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:03 -07:00
Francisco Jerez
35307f540f clover: Switch kernel and program objects to the new model.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:03 -07:00
Francisco Jerez
9968d9daf2 clover: Switch command queues to the new model.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:03 -07:00
Francisco Jerez
257781f243 clover: Switch event objects to the new model.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:02 -07:00
Francisco Jerez
9d06fb8fa8 clover: Switch context objects to the new model.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:02 -07:00
Francisco Jerez
c9e009b74d clover: Switch device objects to the new model.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:02 -07:00
Francisco Jerez
49a49e0742 clover: Switch platform objects to the new model.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:02 -07:00
Francisco Jerez
bff60c894a clover: Define helper classes for the new object model.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:02 -07:00
Francisco Jerez
d8b4994281 clover: Clean up property query functions by using a new property_buffer helper class.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:02 -07:00
Francisco Jerez
7d61769e44 clover: Switch to the new utility code.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:02 -07:00
Francisco Jerez
099d281b38 clover: Name include guards consistently.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:02 -07:00
Francisco Jerez
8e14b82fd2 clover: Replace a bunch of double underscores with single underscores.
Identifiers with double underscores are reserved, and using them has
undefined behavior according to the C++ spec.  It's unlikely to make
any difference, but...

Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:02 -07:00
Francisco Jerez
ebfdce079b clover: Clean up the event profiling code.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:02 -07:00
Francisco Jerez
e93efa0d50 clover: Import new utility library.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:02 -07:00
Francisco Jerez
7baad4b996 clover: Require GCC 4.7 or higher to build.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-21 10:47:02 -07:00
Tom Stellard
4f49c97afe clover: Use std::numeric_limits<std::size_t>::max() instead of SIZE_MAX
This prevents a build failure on some systems.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2013-10-21 10:47:02 -07:00
Roland Scheidegger
ac81b6f2be llvmpipe: enable seamless cube filtering
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-21 15:42:04 +02:00
Roland Scheidegger
3bdd1074e1 gallivm: implement seamless cube filtering
For seamless cube filtering it is necessary to determine new faces and new
coords per sample. The logic for this is _seriously_ complex (what needs
to happen is very "asymmetric" wrt face, x/y under/overflow), further
complicated by the fact that if the 4 samples are in a corner (meaning we
only have actually 3 samples, and all 3 are on different faces) then
falling off the edge is happening _both_ on x and y axis simultaneously.
There was a noticeable performance hit in mesa's cubemap demo when seamless
filtering was forced on and the logic was always executed (just below 10
percent or so in a debug build with all filtering hacks disabled; otherwise it
would probably be a bit more). Hence use a branch so the logic only runs if any
of the pixels in a quad (or in two quads) actually hit this. With that there
was no measurable performance hit in the cubemap demo (neither in a debug nor
release build), but this will vary (cubemap demo very rarely hits edges).
Might also be different on other cpus, as this forces SoA sampling path which
potentially can be quite a bit slower.
Note that as for corners, this code gets all the 3 samples which actually
exist right, and the 4th texel will simply be the same as one of the others,
meaning that filter weights will be a bit wrong. This however should be
enough for full OpenGL (but not d3d10) compliance.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-21 15:42:04 +02:00
Christian König
21a57f9040 winsys/radeon: cleanup CS offloading
Using atomic function for ncs is superfluous since it is
protected by a mutex anyway. Also lock the mutex only once
while retrieving the next CS for submission.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-10-21 10:20:18 +02:00
Rico Schüller
14429295e1 radeon: Enable ARB_texture_mirror_clamp_to_edge.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Rico Schüller <kgbricola@web.de>
2013-10-20 20:12:39 -07:00
Rico Schüller
5da618c20e r200: Enable ARB_texture_mirror_clamp_to_edge.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Rico Schüller <kgbricola@web.de>
2013-10-20 20:12:39 -07:00
Rico Schüller
e487948bef gallium: Enable ARB_texture_mirror_clamp_to_edge.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Rico Schüller <kgbricola@web.de>
2013-10-20 20:12:39 -07:00
Rico Schüller
a59ae25d81 swrast: Enable ARB_texture_mirror_clamp_to_edge.
v2: fix commit message

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Rico Schüller <kgbricola@web.de>
2013-10-20 20:12:39 -07:00
Rico Schüller
1bbd3bb98a mesa: Add infrastructure for GL_ARB_texture_mirror_clamp_to_edge.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rico Schüller <kgbricola@web.de>
2013-10-20 20:12:08 -07:00
Alexander von Gluck IV
50370e483b scons: Fix Haiku missing library
* The softpipe add-on needs libtranslation
  due to the use of BTranslatorRoster

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-20 19:20:59 -05:00
Alexandre Demers
24fd074ce7 docs: Updating forgotten GL feature completion for r600 2013-10-21 01:35:08 +02:00
David Heidelberger
c948aab96c r300g/compiler: Fix unsigned comparison with less than zero
rc_find_free_temporary_list() returns signed integer
(in case of lack of free temporary registers returns -1),
so new_index in radeon_rename_regs() should be signed.

https://bugs.freedesktop.org/show_bug.cgi?id=54867

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2013-10-21 01:31:51 +02:00
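The class of bug fixed above can be shown in isolation. In this sketch,
find_free_slot() is a hypothetical stand-in for a function that returns
-1 when nothing is free; it is not the r300 compiler's code.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical allocator: index of the first free slot, or -1 if none. */
static int
find_free_slot(const bool *used, int count)
{
   for (int i = 0; i < count; i++) {
      if (!used[i])
         return i;
   }
   return -1;
}

/* Storing the result in an unsigned variable turns -1 into UINT_MAX,
 * so a later "index < 0" check on that variable can never be true. */
static unsigned
slot_as_unsigned(const bool *used, int count)
{
   return (unsigned)find_free_slot(used, count);
}

/* Keeping the result signed lets the failure check work as intended. */
static bool
allocation_failed(const bool *used, int count)
{
   int index = find_free_slot(used, count);
   return index < 0;
}
```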
Vinson Lee
c325aa5d80 r600g/sb: Initialize shader::dce_flags.
Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-10-20 00:38:40 -07:00
Kenneth Graunke
00b5d8aeae i965: Mark G45 as having surface tile offset support.
Fixes a regression since 02b632d8e8.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-19 18:43:09 -07:00
Vinson Lee
37cd9ac6df glsl: Initialize per_vertex_accumulator::fields.
Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-18 18:29:18 -07:00
Vinson Lee
136a12ac98 mesa: Remove GLXContextID typedef from glx.h.
Fixes this build error.

  CC     clientattrib.lo
In file included from ../../include/GL/glx.h:333,
                 from glxclient.h:45,
                 from clientattrib.c:32:
../../include/GL/glxext.h:275: error: redefinition of typedef ‘GLXContextID’
../../include/GL/glx.h:171: note: previous declaration of ‘GLXContextID’ was here

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70591
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-10-18 18:08:31 -07:00
Carl Worth
bf7b425083 docs: Import 9.2.2 release notes, add news item. 2013-10-18 17:19:31 -07:00
Kenneth Graunke
653cc008a8 docs: Note that we support OpenGL 3.3 in the release notes.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-18 15:24:18 -07:00
Kenneth Graunke
567445e2b9 i965: Enable OpenGL 3.3 and GLSL 3.30.
Everything necessary for these appears to be implemented.  We'll want to
add more tests to guard against bugs, but it should be functionally
complete.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-18 15:24:18 -07:00
Jon TURNEY
cedfd79be2 translate_sse: Fix generated code argument handling for msabi on x86_64
translate_sse.c contains code for msabi on x86_64, but it appears to be
untested.

Currently arguments 1 and 2 passed to the generated code are moved as 32-bit
quantities into the registers used by sysvabi, irrespective of the architecture.
Since these may be pointers, they must be moved as 64-bit quantities to avoid
truncation.

Commit f4dd099171 disabled translate_sse.c on MinGW
x86_64, I don't know if it was due to this issue, or a different one...

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-18 14:17:15 +01:00
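The truncation described above can be illustrated with plain integer
arithmetic. This sketch only models the effect of a 32-bit versus a
64-bit register move on a pointer-sized value; the function names are
hypothetical and no machine code is generated.

```c
#include <stdint.h>

/* Models a 32-bit move: only the low 32 bits of the value survive. */
static uint64_t
move_as_32bit(uint64_t value)
{
   return (uint32_t)value;
}

/* Models a 64-bit move: the full value is preserved. */
static uint64_t
move_as_64bit(uint64_t value)
{
   return value;
}
```

A typical 64-bit user-space address has non-zero upper bits, so the
32-bit move yields a different (invalid) address.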
Jon TURNEY
72a0f832ec rtasm: Cygwin uses the msabi calling convention on x86_64
Cygwin also uses the msabi calling convention on x86_64, not the sysvabi calling
convention

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-18 14:16:56 +01:00
Jon TURNEY
87e84acbfd rtasm: The heap is NX on 64-bit Cygwin, so use the rtasm_exec_malloc() implementation which uses mmap()
The heap is NX on 64-bit Cygwin, so use the rtasm_exec_malloc() implementation
which uses mmap() to allocate an anonymous page with execute permission, rather
than the one which just uses malloc().

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-18 14:16:27 +01:00
Alexander von Gluck IV
9aad1ba70f scons: Simplified fix of llvm cxxflags for rtti
* Based on ideas of Jose Fonseca
* A rework of ce8eadb6e8

Tested-by: Vinson Lee <vlee@freedesktop.org>
2013-10-17 20:33:05 -05:00
Paul Berry
b08195faec glsl: Fix MSVC build (missing strcasecmp())
MSVC doesn't have a strcasecmp() function; it uses _stricmp() instead.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-10-17 18:11:22 -07:00
Kenneth Graunke
b3360d23ac i965: Fold brwInitVtbl() into brwCreateContext().
With most of the virtual functions gone, brwInitVtbl() is now tiny.

Merging it into the caller allows us to delete the entire file.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-17 14:27:03 -07:00
Kenneth Graunke
f8fef8ee92 i965: Merge brw_destroy_context() into intelDestroyContext().
Now that i915 and i965 have been split, the separation between
intelDestroyContext and brw_destroy_context is kind of arbitrary.

This patch replaces the only brw->vtbl.destroy() call with the body
of brw_destroy_context (the only implementation of that virtual
function).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-17 14:27:03 -07:00
Kenneth Graunke
7601ba649f i965: Replace dri_bo_release with drm_intel_bo_unreference.
dri_bo_release is a helper function that calls drm_intel_bo_unreference
but then also sets the pointer to NULL.  This is unnecessary, since
brw_destroy_context is called from intelDestroyContext, which also frees
brw completely.

If you're still trying to access them, you've got bigger problems.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-17 14:27:03 -07:00
Kenneth Graunke
5f76bc37ab i965: Unindent the body of intelDestroyContext.
Having almost the entire body of the function indented one level for a
check that should never happen seems silly.  Just early return.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-17 14:27:03 -07:00
Kenneth Graunke
80a9c42e9e i965: Un-virtualize brw_new_batch().
Since the i915/i965 split, there's only one implementation of this
virtual function.  We may as well just call it directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-17 14:27:03 -07:00
Kenneth Graunke
6613f346ac i965: Un-virtualize brw_finish_batch().
Since the i915/i965 split, there's only one implementation of this
virtual function.  We may as well just call it directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-17 14:27:03 -07:00
Paul Berry
e2d1eaa32a glsl: In update_max_array_access, fix interface instance check.
In commit f878d20 (glsl: Update ir_variable::max_ifc_array_access
properly), I accidentally used the wrong kind of check to determine
whether the variable being accessed was an interface instance (I used
var->get_interface_type() != NULL when I should have used
var->is_interface_instance()).  As a result, if an unnamed interface
block contained a struct which contained an array,
update_max_array_access() would mistakenly interpret the struct as a
named interface block and try to dereference a null
var->max_ifc_array_access.

This patch corrects the check, fixing the null dereference.

Fixes piglit test interface-block-struct-nesting.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70368

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-17 11:51:06 -07:00
Paul Berry
79e835a712 glsl: Treat layout-qualifier-id's as case-insensitive in desktop GLSL.
In desktop GLSL, location qualifiers are case-insensitive.  In GLSL
ES, they are case-sensitive.  This patch handles the difference by
using a new function to match layout qualifiers,
match_layout_qualifier(), which calls either strcmp() or strcasecmp()
as appropriate.

Fixes piglit tests:
- layout-not-case-sensitive-in.geom
- layout-not-case-sensitive-max-vert.geom
- layout-not-case-sensitive-out.geom
- layout-not-case-sensitive.frag

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-17 11:51:01 -07:00
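A minimal sketch of the comparison described above, assuming a POSIX
strcasecmp(). The function name match_qualifier() and its is_es
parameter are hypothetical; the real match_layout_qualifier() may take
different arguments.

```c
#include <stdbool.h>
#include <string.h>
#include <strings.h>   /* strcasecmp() on POSIX systems */

/* Returns 0 when the two qualifier names match, like strcmp().
 * GLSL ES compares case-sensitively, desktop GLSL does not. */
static int
match_qualifier(const char *s1, const char *s2, bool is_es)
{
   if (is_es)
      return strcmp(s1, s2);
   else
      return strcasecmp(s1, s2);
}
```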
Brian Paul
a36f7e651e mesa: remove PFNGLBLENDCOLORPROC, PFNGLBLENDEQUATIONPROC typedefs in gl.h
Fixes error about duplicated typedefs (also in glext.h) reported on
NetBSD 6.1

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70546
Tested-by:  Vinson Lee <vlee@freedesktop.org>
2013-10-17 12:10:39 -06:00
Brian Paul
282bb87366 st/mesa: add a few comments in st_create_context_priv() 2013-10-17 09:28:17 -06:00
Dave Airlie
530afc82a1 st/mesa: handle layer and primitive id output and point size input
This fixes a number of piglit crashes when running on a hacked up llvmpipe.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-17 08:35:42 +01:00
Dave Airlie
038a9aab33 st/mesa: add geometry shader ubo support
This just adds the missing bits so the ubo tests don't crash.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-17 08:35:42 +01:00
Fabian Bieler
20cad7fd6f mesa/st: Allow geometry shaders without gl_Position export.
From the ARB_geometry_shader4 spec (section Geometry Shader outputs):
"The built-in special variable gl_Position is intended to hold the
homogeneous vertex position. Writing gl_Position is optional."

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-17 08:35:42 +01:00
Bryan Cain
9bfa475684 st/mesa, glsl_to_tgsi: add support for geometry shaders
v2 (Bryan Cain <bryancain3@gmail.com>): fix 2D array indexing order.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-17 08:35:42 +01:00
Bryan Cain
6b0df34ae5 mesa/st: Add VARYING_SLOT_TEX[1-7] to st_translate_geometry_program().
v2 (Paul Berry <stereotype441@gmail.com>): Split out to separate patch
(previously this was part of "glsl: add builtins for geometry
shaders.")

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-17 08:35:42 +01:00
Kristian Høgsberg
4ef1c8fb4c Revert "i965: Create ARGB2101010 DRI configs"
Exposing 10-bit color configs confuses too many applications that try to
use the chooser to pick an 8 bit config.  The chooser considers an fbconfig
with more bits a better match and will thus give a 10 bit config when an
application asks for a config with GLX_RED_SIZE 1 or 8.

One key example is glxinfo, which does this, and then doesn't specify that
it needs a config where GLX_DRAWABLE_TYPE has the GLX_WINDOW_BIT set.
This way it ends up with a 10 bit config that it can't use to create a
GLX window and fails to log extensions.

This reverts commit f354bcc177.

https://bugs.freedesktop.org/show_bug.cgi?id=70557
2013-10-16 22:22:45 -07:00
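The chooser behaviour that motivated the revert above can be modelled
with a toy selection function. This is a greatly simplified sketch:
struct fbconfig and choose_config() are hypothetical, and the real
chooser sorts on many more attributes than the red channel size.

```c
/* Hypothetical, stripped-down framebuffer config. */
struct fbconfig {
   int red_size;         /* bits in the red channel */
   int window_capable;   /* non-zero if usable for a window */
};

/* Among configs with at least min_red red bits, prefer the one with
 * the most red bits.  Returns its index, or -1 if none qualifies. */
static int
choose_config(const struct fbconfig *configs, int count, int min_red)
{
   int best = -1;

   for (int i = 0; i < count; i++) {
      if (configs[i].red_size < min_red)
         continue;
      if (best < 0 || configs[i].red_size > configs[best].red_size)
         best = i;
   }
   return best;
}
```

With an 8-bit window config and a 10-bit non-window config available, a
request for a red size of 1 or 8 selects the 10-bit entry, which the
caller then cannot use for a window.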
Vadim Girlin
62c8149472 r600g/sb: fix issue with DCE between GVN and GCM (v2)
We can't perform DCE using the liveness pass between GVN and GCM
because it relies on the correct schedule, but GVN doesn't care about
preserving correctness - it's rescheduled later by GCM.

This patch makes dce_cleanup pass perform simple DCE
between GVN and GCM instead of relying on liveness pass.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=70088

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-10-17 07:57:49 +04:00
Matt Turner
38fe3bd5f2 glapi: Add missing XML files to Makefile dependencies.
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-10-16 20:49:43 -07:00
Matt Turner
a360ca7476 glsl: Optimize mul(a, -1) into neg(a).
Two extra instructions in some heroesofnewerth shaders, but a win for
everything else.

total instructions in shared programs: 1531352 -> 1530815 (-0.04%)
instructions in affected programs:     121898 -> 121361 (-0.44%)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-16 20:49:43 -07:00
Matt Turner
197f3a33fb i965/fs: Handle printing HW_REGS in dump_instruction().
Scheduling debugging now prints:

Instructions before scheduling (reg_alloc 1)
0: linterp vgrf20, hw_reg2, hw_reg3, hw_reg4,
1: linterp vgrf21, hw_reg2, hw_reg3, hw_reg4+16,

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-16 20:49:43 -07:00
Matt Turner
7d0519c082 i965: Print instructions' children during scheduling debugging.
Useful for tracking down problems in dependency calculations.

Scheduling debugging now prints:

clock    2, scheduled: linterp vgrf5, hw_reg2, hw_reg3, hw_reg0,
        child 0, 53 parents: fb_write (null), (null), (null), (null),
        child 1, 2 parents: tex vgrf4, vgrf5, (null), (null),
        child 2, 52 parents: placeholder_halt (null), (null), (null), (null),
clock    4, scheduled: linterp vgrf5+1, hw_reg2, hw_reg3, hw_reg0+16,
        child 0, 52 parents: fb_write (null), (null), (null), (null),
        child 1, 1 parents: tex vgrf4, vgrf5, (null), (null),
                now available
        child 2, 51 parents: placeholder_halt (null), (null), (null), (null),

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-16 20:49:43 -07:00
José Fonseca
40ddd8b659 Revert "scons: Fix build when rtti is disabled"
This reverts commit 94d05bf87a as it has a
few problems:

- it breaks windows builds because env[LLVM_CXXFLAGS] is never set there

- it is merging not only rtti, but the whole cxxflags (defines etc)
  which has proven to be a source of troubles (breaks debugging etc.)
2013-10-16 15:05:51 -07:00
Tom Stellard
9da4021626 radeonsi: Use 'SI' as the LLVM processor for CIK on LLVM <= 3.3
LLVM 3.3 does not know about CIK processors, and the code paths for SI
and CIK are the same.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-10-16 12:55:30 -04:00
Tom Stellard
13ac38b4ef r600g/compute Improve debugging output 2013-10-16 09:39:31 -07:00
Tom Stellard
de1de88dfc clover: Link libclc before running any optimizations
This is required in order for clang to correctly handle the OpenCL C
barrier() builtin which has the following restrictions according to
the OpenCL 1.1 Specification:

If barrier is inside a conditional statement, then all work-items must
enter the conditional if any work-item enters the conditional statement
and executes the barrier.

If barrier is inside a loop, all work-items must execute the barrier for
each iteration of the loop before any are allowed to continue execution
beyond the barrier.

By linking before optimizations, we can replace calls to barrier() with
calls to a target specific intrinsic which has the noduplicate attribute.
This attribute prevents clang from performing optimizations which could
violate the above rules.

This attribute must be applied to the call instruction that invokes
the function, so it is not enough to add this attribute to the barrier()
declaration.

As a bonus this will probably speed up compile times since we will no
longer need to run link-time optimizations.
2013-10-16 09:39:15 -07:00
Brian Paul
2273b04c61 mesa: change glTexImage[23]DMultisample() internalformat to GLenum
To match glext.h and the GL_ARB_texture_multisample extension.
However, the GL 4.0 spec and man page say it's GLint.
An OpenGL spec bug will be filed.
2013-10-16 08:43:23 -06:00
Brian Paul
4f08cdefda svga: minor fix-ups in svga_get_shader_param()
Fix debug error message.  Add switch case for PIPE_SHADER_COMPUTE.
Trivial.
2013-10-16 08:26:45 -06:00
Brian Paul
e96c55ff49 cso: fix incorrect sampler view count in cso_restore_sampler_views()
During the recent bind_sampler_states() interface change in gallium
we changed the CSO single_sampler_done() function so that if we were
decreasing the number of sampler states bound in the driver, we'd
null-out the "extra/old" sampler states to unbind them.  See commit
1e2fbf265.

However, we didn't make the corresponding fix for sampler views.
This caused an assertion to fail in the svga driver which checked
that the number of sampler views matched the number of sampler states.

This patch fixes cso_restore_sampler_views() so that it nulls-out
the extra/old sampler views if the number of new views is less than
the number of current/old views.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-10-16 08:13:47 -06:00
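The fix described above, nulling out the extra/old entries when fewer
views are restored than are currently bound, can be sketched as follows.
struct view_state, MAX_VIEWS and restore_views() are hypothetical names
for illustration and do not mirror the CSO module's data structures.

```c
#include <stddef.h>

#define MAX_VIEWS 8

/* Hypothetical bound-view table. */
struct view_state {
   const void *views[MAX_VIEWS];
   unsigned count;
};

/* Copy the saved views back and unbind any slots that were in use
 * before the restore but are not covered by the saved set. */
static void
restore_views(struct view_state *current, const struct view_state *saved)
{
   unsigned old_count = current->count;
   unsigned i;

   for (i = 0; i < saved->count; i++)
      current->views[i] = saved->views[i];

   /* Null out the extra/old views so the counts stay consistent. */
   for (; i < old_count; i++)
      current->views[i] = NULL;

   current->count = saved->count;
}
```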
Brian Paul
0d1011638b mesa: update glxext.h to version 20131008
The diff is huge but the actual changes are few:
* Whitespace changes
* Items are reordered
* extern qualifiers dropped
2013-10-16 08:13:46 -06:00
Brian Paul
4d9e61c046 mesa: update glext.h to version 20131008
Only two notable changes in this revision:
* GLvoid has been replaced by void.
* Added the GL_NV_blend_equation_advanced extension.
2013-10-16 08:13:45 -06:00
Brian Paul
3c074e4d4d vbo: access VBO memory more efficiently when building display lists
Use GL_MAP_INVALIDATE_RANGE, UNSYNCHRONIZED and FLUSH_EXPLICIT flags
when mapping VBOs during display list compilation.  This mirrors what
we do for immediate-mode VBO building in vbo_exec_vtx_map().

This improves performance for applications which interleave display
list compilation with execution.  For example:

glNewList(A);
glBegin/End prims;
glEndList();
glCallList(A);
glNewList(B);
glBegin/End prims;
glEndList();
glCallList(B);

Mesa's vbo module tries to combine the vertex data from lists A and B
into the same VBO when there's room.  Before, when we mapped the VBO for
building list B, we did so with GL_MAP_WRITE_BIT only.  Even though we
were writing to an unused part of the buffer, the map would stall until
the preceding drawing call finished.

Use the extra map flags and FlushMappedBufferRange() to avoid the stall.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-10-16 08:13:45 -06:00
Brian Paul
fa9c702164 mesa: consolidate cube width=height error checking
Instead of checking width==height in four places, just do it in
_mesa_legal_texture_dimensions() where we do the other width, height,
depth checks.  Similarly, move the check that cube map array depth is
a multiple of 6.

This change also fixes some missing cube dimension checks for the
glTexStorage[23]D() functions.

Remove width==height assertion in _mesa_get_tex_max_num_levels() since
that's called before the other size checks for glTexStorage.

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-10-16 08:13:45 -06:00
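The two consolidated checks mentioned above (cube faces must be square,
and cube map array depth must be a multiple of 6) amount to a predicate
like the one below. legal_cube_dimensions() is a hypothetical name; the
real _mesa_legal_texture_dimensions() also validates sizes against
implementation limits, which this sketch omits.

```c
#include <stdbool.h>

/* Returns true if the dimensions are acceptable for a cube map
 * (is_array == false) or a cube map array (is_array == true). */
static bool
legal_cube_dimensions(int width, int height, int depth, bool is_array)
{
   /* Cube faces must be square. */
   if (width != height)
      return false;

   /* A cube map array stores 6 faces per layer. */
   if (is_array && depth % 6 != 0)
      return false;

   return true;
}
```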
Kristian Høgsberg
6e444a72c1 gbm: Add support for gbm bos and surfaces using GBM_FORMAT_ARGB2101010
We can now add GBM support for the 10 bit/channel formats which lets us
create a gbm surface that we can use with KMS for display hardware that
support the format.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2013-10-15 22:07:52 -07:00
Kristian Høgsberg
3160ec353e dri: Add __DRIimage support for the ARGB2101010 format
We add support for the ARGB2101010 color format to the DRI image extension,
which allows DRI loaders to create a __DRIimage with this color format.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2013-10-15 22:07:52 -07:00
Kristian Høgsberg
f354bcc177 i965: Create ARGB2101010 DRI configs
This commit enables ARGB2101010 system framebuffers (that is, DRI drawables)
for the i965 drivers.  This is done by generating DRI configs that advertise
this color format as well as teaching intelCreateBuffer to pick the right
color format when it sees such a DRI config.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2013-10-15 22:07:52 -07:00
Kristian Høgsberg
afda76cc0d dri/common: Add support for creating ARGB2101010 configs
This extends the common dri driver infrastructure with the ability to create
__DRIconfigs for 10 bits/channel + 2 bit alpha formats.  This still has
to be supported and requested by a driver, so this doesn't enable anything yet.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2013-10-15 22:07:52 -07:00
Kristian Høgsberg
df479cffcc egl_dri2: Set NativeVisualID to the matching GBM config for the gbm platform
The EGLConfig doesn't have the rgba masks, only the rgba sizes.  To
make sure a config is usable with a given GBM/KMS format, we need a way
to make sure the formats really match.
2013-10-15 22:07:52 -07:00
Kristian Høgsberg
44e584a73a egl_dri2: Remove depth argument from dri2_add_config()
All callers now use the more correct rgba mask mechanism for filtering
out matching DRI configs.  Even if depth and buffer size match, the
color component layout can be different, or in the case of ARGB8888 and
ARGB2101010 the color components can even be different sizes.

Since anything that the depth check would reject is also rejected by
the rgba mask comparison, the depth parameter is redundant and not
specific enough.  We should probably have removed it when the rgba
masks argument was introduced, but better late than never.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2013-10-15 22:07:52 -07:00
Kristian Høgsberg
e3d0a0eac7 egl_dri2: Match X11 visuals using rgba masks instead of depth
Matching on visual depth to buffer size makes 8 bpc RGBA look similar to
10 bit RGB with 2 bit alpha - both have buffer size 32.  Instead, build
the rgba masks from the visual data and use that for finding matching
DRI configs.

We need to keep the special case that allows us to match 24 bit visuals
to DRI configs with buffer size 32.  We do that by creating an alpha
mask of "all the non-rgb bits" for 24 bit visuals and matching a second
time with that.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2013-10-15 22:06:46 -07:00
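The two-pass mask matching described in the commit above can be sketched roughly as follows. This is an illustration only, not the egl_dri2 code: `struct visual_info`, `struct config_masks`, and `visual_matches_config()` are hypothetical names, and 32-bit packed channel masks are assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the data read from an X11 visual. */
struct visual_info {
   int depth;
   uint32_t red_mask, green_mask, blue_mask;
};

/* Hypothetical stand-in for the channel masks of a DRI config. */
struct config_masks {
   uint32_t red, green, blue, alpha;
};

static bool
masks_equal(const struct config_masks *a, const struct config_masks *b)
{
   return a->red == b->red && a->green == b->green &&
          a->blue == b->blue && a->alpha == b->alpha;
}

bool
visual_matches_config(const struct visual_info *v,
                      const struct config_masks *cfg)
{
   assert(v != NULL && cfg != NULL);

   /* First pass: the masks taken from the visual, with no alpha bits. */
   struct config_masks want = {
      v->red_mask, v->green_mask, v->blue_mask, 0
   };
   if (masks_equal(&want, cfg))
      return true;

   /* Second pass for 24 bit visuals only: treat all the non-rgb bits
    * as alpha so the visual can match a config with buffer size 32. */
   if (v->depth == 24) {
      want.alpha = ~(v->red_mask | v->green_mask | v->blue_mask);
      return masks_equal(&want, cfg);
   }
   return false;
}
```

With 8 bpc masks, a 24 bit visual matches an ARGB8888 config on the second pass but never an ARGB2101010 config, even though both configs have buffer size 32.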
Singh, Satyeshwar
e2620c1a74 i965: Add support for RGB565 __DRIimage
Add information for RGB565 to the table of image formats so that we can
create a __DRIimage for that format.  This in turn enables RGB565
wayland clients.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2013-10-15 21:30:49 -07:00
Singh, Satyeshwar
2efc97d513 egl-wayland: Add support for RGB565 pixel format for Wayland clients
With this patch Wayland clients can now ask EGL for RGB 565 format buffers
and attach them to a Wayland compositor.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2013-10-15 21:26:56 -07:00
Alexander von Gluck IV
94d05bf87a scons: Fix build when rtti is disabled
* The rtti fix actually dug up a bug in the scons build scripts.
* Autotools took the LLVM cpp and cxx flags, while scons only took
  the cpp flags.
* This grabs the cxx flags and applies them where needed. We may
  want to make the same change for the llvm cpp flags in scons.
* The only linux platform I can find with LLVM no-rtti is Ubuntu.
* Fixes bug #70471

Tested-by: Vinson Lee <vlee@freedesktop.org>
2013-10-15 22:12:18 -05:00
José Fonseca
85d7f6779f llvmpipe: Advertise PIPE_CAP_DEPTH_CLIP_DISABLE.
Actually implemented by draw module.

Tested piglit ARB_depth_clamp tests, which pass 100%.

Trivial.
2013-10-15 18:22:57 -07:00
José Fonseca
3b3591cd15 draw: make vs_slot signed.
Otherwise (vs_slot < 0) will never be true.

Trivial.
2013-10-15 18:22:57 -07:00
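The problem fixed by the commit above is a general C pitfall and can be shown in isolation. The sketch below is not the draw module code; `find_slot()`, `slot_as_unsigned()`, and `slot_missing()` are hypothetical helpers, assuming a lookup that reports "not found" as -1.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical lookup that returns -1 when the value is not found. */
int
find_slot(const int *slots, int count, int wanted)
{
   assert(count >= 0);
   for (int i = 0; i < count; i++) {
      if (slots[i] == wanted)
         return i;
   }
   return -1;
}

/* Storing the result in an unsigned variable turns -1 into UINT_MAX,
 * so a later "slot < 0" test could never be true. */
unsigned
slot_as_unsigned(const int *slots, int count, int wanted)
{
   return (unsigned)find_slot(slots, count, wanted);
}

/* With a signed variable the "not found" check works as intended. */
bool
slot_missing(const int *slots, int count, int wanted)
{
   int slot = find_slot(slots, count, wanted);
   return slot < 0;
}
```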
Emil Velikov
b1e7cd037e configure.ac: drop obsolete variable HAVE_COMMON_DRI
The original intent of the variable was to prevent adding
libdrm dependency for non drm drivers (swrast). This is
already handled with __NOT_HAVE_DRM_H, and with the recent
merge of the dri_util and drisw_util code this variable has
started causing build issues.

Eg. the following will fail
$ ./autogen.sh --with-dri-drivers=swrast --with-gallium-drivers=
$ make

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-10-15 21:54:20 +02:00
Emil Velikov
cd3fa176a8 swrast: add correct include for out-of-tree builds
The xmlpool/options.h file was not accessible when building
out-of-tree leading to failure.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70378
Reported-by: Fabio Pedretti <fabio.ped@libero.it>
Tested-by: Fabio Pedretti <fabio.ped@libero.it>
Tested-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-10-15 21:50:09 +02:00
Bryan Cain
467e3aa3de mesa: fix transform feedback when a geometry shader is active.
When a geometry shader is active, the transform feedback primitive
type ("mode") needs to be validated against the geometry shader output
primitive type, not the primitive type passed to the glDraw*()
function.

Fixes the following piglit tests:
- glsl-1.50-geometry-primitive-types GL_LINES
- glsl-1.50-geometry-primitive-types GL_LINES_ADJACENCY
- glsl-1.50-geometry-primitive-types GL_LINE_STRIP
- glsl-1.50-geometry-primitive-types GL_LINE_STRIP_ADJACENCY
- glsl-1.50-geometry-primitive-types GL_TRIANGLES
- glsl-1.50-geometry-primitive-types GL_TRIANGLES_ADJACENCY
- glsl-1.50-geometry-primitive-types GL_TRIANGLE_FAN

Exposes previously hidden failures in the following piglit tests:
- glsl-1.50-geometry-primitive-id-restart GL_LINES other
- glsl-1.50-geometry-primitive-id-restart GL_LINES_ADJACENCY other
- glsl-1.50-geometry-primitive-id-restart GL_LINE_LOOP ffs
- glsl-1.50-geometry-primitive-id-restart GL_LINE_LOOP other
- glsl-1.50-geometry-primitive-id-restart GL_LINE_STRIP other
- glsl-1.50-geometry-primitive-id-restart GL_LINE_STRIP_ADJACENCY other
- glsl-1.50-geometry-primitive-id-restart GL_TRIANGLES other
- glsl-1.50-geometry-primitive-id-restart GL_TRIANGLES_ADJACENCY other
- glsl-1.50-geometry-primitive-id-restart GL_TRIANGLE_FAN ffs
- glsl-1.50-geometry-primitive-id-restart GL_TRIANGLE_FAN other
- glsl-1.50-geometry-primitive-id-restart GL_TRIANGLE_STRIP other
- glsl-1.50-geometry-primitive-id-restart GL_TRIANGLE_STRIP_ADJACENCY other

(These failures were previously hidden due to a flaw in the test: it
doesn't check for GL errors.  I'll fix the test shortly).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-15 11:40:43 -07:00
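The rule described in the commit above (validate the transform feedback mode against what the geometry shader emits, not against the draw call) might be sketched like this. The enum and function names are hypothetical, and the real Mesa check covers more primitive types (adjacency variants, line loops) than this reduced sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced set of primitive types. */
enum prim {
   PRIM_POINTS,
   PRIM_LINES,
   PRIM_LINE_STRIP,
   PRIM_TRIANGLES,
   PRIM_TRIANGLE_STRIP,
   PRIM_TRIANGLE_FAN
};

/* Reduce a primitive type to its basic class: points, lines or triangles. */
static enum prim
base_class(enum prim p)
{
   switch (p) {
   case PRIM_POINTS:
      return PRIM_POINTS;
   case PRIM_LINES:
   case PRIM_LINE_STRIP:
      return PRIM_LINES;
   default:
      return PRIM_TRIANGLES;
   }
}

bool
xfb_mode_valid(enum prim xfb_mode, enum prim draw_mode,
               bool gs_active, enum prim gs_output)
{
   assert(xfb_mode == PRIM_POINTS || xfb_mode == PRIM_LINES ||
          xfb_mode == PRIM_TRIANGLES);

   /* With a geometry shader bound, the primitives that reach transform
    * feedback are the shader's outputs, not the draw call's inputs. */
   enum prim emitted = gs_active ? gs_output : draw_mode;
   return base_class(emitted) == xfb_mode;
}
```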
Paul Berry
afccf3d8e7 i965/gs: Set the REORDER bit in 3DSTATE_GS.
Ivy Bridge's "reorder enable" bit gives us a binary choice for the
order in which vertices from triangle strips are delivered to the
geometry shader.  Neither choice follows the OpenGL spec, but setting
the bit is better, because it gets triangle orientation correct.

Haswell replaces the "reorder enable" bit with a new "reorder mode"
bit (which occupies the same location in the command packet).  This
bit gives us a different binary choice, which affects both triangle
strips and triangle strips with adjacency.  Setting the bit ("reorder
trailing") gives the proper order according to the OpenGL spec.

So in either case we want to set the bit.

On Ivy Bridge, fixes piglit test "triangle-strip-orientation".

On Haswell, fixes piglit tests "glsl-1.50-geometry-primitive-types
{GL_TRIANGLE_STRIP,GL_TRIANGLE_STRIP_ADJACENCY}" and
"glsl-1.50-geometry-tri-strip-ordering-with-prim-restart *".

v2: Rename the bit to "REORDER_TRAILING" for consistency with Haswell
docs.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-15 11:40:32 -07:00
Paul Berry
caf9cef7ee i965/fs: Remove bogus field prog_data->dispatch_width.
Despite the name, this field wasn't being set to the dispatch width at
all; it was always 8.  The only place it was used was that the
constant buffer read length was aligned to it, and as far as I can
tell from the docs, there is no need to align this value to the
dispatch width; aligning it to a multiple of 8 is sufficient.  So I've
just replaced it with a hardcoded 8.

v2: In gen6_wm_state, use brw->wm.base.push_const_size for consistency
with VS and GS state upload.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-15 11:34:30 -07:00
Paul Berry
2910a82eb4 glsl: Add new GLSL 1.50 constants.
This patch populates the following built-in GLSL 1.50 variables based
on constants stored in ctx->Const:

- gl_MaxVertexOutputComponents
- gl_MaxGeometryInputComponents
- gl_MaxGeometryOutputComponents
- gl_MaxFragmentInputComponents
- gl_MaxGeometryTextureImageUnits
- gl_MaxGeometryOutputVertices
- gl_MaxGeometryTotalOutputComponents
- gl_MaxGeometryUniformComponents
- gl_MaxGeometryVaryingComponents

On i965/gen7, fixes all Piglit tests in "spec/glsl-1.50/built-in
constants/*" except for gl_MaxCombinedTextureImageUnits and
gl_MaxGeometryUniformComponents.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-15 11:34:30 -07:00
Eric Anholt
705a90e304 i965: Move the common binding table offset code to brw_shader.cpp.
Now that both vec4 and fs are dynamically assigning offsets, a lot of the
code is the same.

v2: Avoid passing around the next offset through the class.  (Review by
    Paul)

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-15 10:18:50 -07:00
Eric Anholt
d395485e1d i965/vec4: Dynamically assign the VS/GS binding table offsets.
Note that the dropped comment in brw_context.h is mostly (better written)
in brw_binding_table.c as well.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-15 10:18:48 -07:00
Eric Anholt
4e5306453d i965/fs: Dynamically set up the WM binding table offsets.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-15 10:18:45 -07:00
Eric Anholt
3c9dc2d31b i965: Make a brw_stage_prog_data for storing the SURF_INDEX information.
It would be nice to be able to pack our binding table so that programs
that use 1 render target don't upload an extra BRW_MAX_DRAW_BUFFERS - 1
binding table entries.  To do that, we need the compiled program to have
information on where its surfaces go.

v2: Rename size to size_bytes to be more explicit.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-15 10:18:42 -07:00
Eric Anholt
5463b5bbbd i965: Always have the struct gl_program * in the backend visitor.
vec4 already had it, so put it in the FS, too.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-15 10:18:40 -07:00
Eric Anholt
2788798388 i965: Drop a couple of unused defines.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-15 10:18:37 -07:00
Eric Anholt
fbc088ee49 i965: Remove dead arguments from prog_data_compare.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-15 10:18:32 -07:00
Alexander von Gluck IV
ce8eadb6e8 build: remove forced -fno-rtti
* As discussed on the mailing list,
  forced no-rtti breaks C++ public
  API's such as the Haiku C++ libGL.so
* -fno-rtti *can* be still set however
  instead of blindly forcing -fno-rtti,
  we can rely on the llvm-config
  --cppflags output.
  If the system llvm is built without
  rtti (default), the no-rtti flag will be
  present in llvm-config --cppflags
  (which we pick up on)
  If llvm is built with rtti
  (REQUIRES_RTTI=1), then -fno-rtti is
  removed from llvm-config --cppflags.
* We could selectively add / remove rtti
  from various components, however mixing
  rtti and non-rtti code is tricky and
  could introduce missing symbols.
* This needs impact testing.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2013-10-14 23:00:55 -05:00
Matt Turner
7a2e9f9778 configure.ac: Don't check for awk, grep, nm.
Not used since d53901c6.
2013-10-14 11:13:09 -07:00
Matt Turner
9ae1f0bad6 configure.ac: Don't check for cross compiling.
Dead since c845140a.
2013-10-14 11:13:09 -07:00
Matt Turner
a5ec01fb1b i965: Don't copy prop source mods into instructions that can't take them. 2013-10-14 11:13:09 -07:00
Constantin Baranov
53904c64da mesa: Add missing switch break in invalidate_framebuffer_storage()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70411
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-14 09:06:07 -06:00
Grigori Goronzy
e6c2afa9ce st/vdpau: add format conversions for GetBitsYCbCr
Add simple plain C routines for NV12<->YV12 and YUYV<->UYVY
conversions. The NV12->YV12 conversion is commonly used, for instance
by VLC.

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-10-13 20:09:38 +02:00
Grigori Goronzy
f250fd59c4 radeon: use staging for mapping linear textures
Textures that likely reside in VRAM, are mapped for reading and
don't require direct mapping should be staged into GTT, to avoid bad
performance. This fixes readback performance of VDPAU surfaces.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-10-13 20:09:34 +02:00
Grigori Goronzy
270fab5164 radeon/uvd: use PIPE_BIND_LINEAR for video surfaces
This new bind flag forces linear storage, but does not have other
side effects like R600_RESOURCE_FLAG_TRANSFER.

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-10-13 20:09:02 +02:00
Vincent Lejeune
6e51c2a941 radeonsi: Allow Sinking pass to move preloaded const/res/sampl
This fixes a crash in Unigine Heaven 3.0, and probably in some
other apps.
2013-10-13 20:03:42 +02:00
Vadim Girlin
453ea2d309 radeonsi: pass alpha_ref value to PS in the user sgpr
Currently it's hardcoded in the shader, so every change requires
compilation of the shader variant, killing the performance
in Serious Sam 3 and probably other apps.

This patch passes alpha_ref in the user sgpr and removes it from
the shader key.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-10-13 20:03:35 +04:00
Vadim Girlin
10ddeb910b r600g: fix tgsi_op2_s with trans-only instructions
This fixes the issue when dst and src are the same reg and operation on one
channel overwrites the source for other channels, e.g.:

UMUL TEMP[2].xyz, TEMP[0].xyzz, TEMP[2].xxxx

In this example the result of the operation on channel x is written in
TEMP[2].x and then used as a second source operand for channels y and z
instead of original value in TEMP[2].x.

This patch stores the results in temp reg and moves them to
dst after performing operation on all channels.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=70327

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-10-13 20:03:35 +04:00
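The aliasing hazard in the commit above, and the staging fix, can be reproduced with a small model of a per-channel multiply. This is a sketch under assumptions, not the r600g code: registers are modeled as four-element arrays, and `umul_in_place()` and `umul_staged()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

enum { CHAN_X, CHAN_Y, CHAN_Z, CHAN_W };

/* Writes each channel straight to dst.  If dst is also a source, later
 * channels read values that earlier channels already overwrote. */
void
umul_in_place(uint32_t dst[4],
              const uint32_t src0[4], const int swz0[4],
              const uint32_t src1[4], const int swz1[4],
              unsigned writemask)
{
   assert(writemask <= 0xf);
   for (int c = 0; c < 4; c++) {
      if (writemask & (1u << c))
         dst[c] = src0[swz0[c]] * src1[swz1[c]];
   }
}

/* Computes every channel into a temporary first, then moves the
 * results to dst, so all channels see the original source values. */
void
umul_staged(uint32_t dst[4],
            const uint32_t src0[4], const int swz0[4],
            const uint32_t src1[4], const int swz1[4],
            unsigned writemask)
{
   uint32_t tmp[4] = { 0 };

   assert(writemask <= 0xf);
   for (int c = 0; c < 4; c++) {
      if (writemask & (1u << c))
         tmp[c] = src0[swz0[c]] * src1[swz1[c]];
   }
   for (int c = 0; c < 4; c++) {
      if (writemask & (1u << c))
         dst[c] = tmp[c];
   }
}
```

Calling both with the same array as `dst` and `src1`, an `xxxx` swizzle on `src1`, and an `xyz` writemask mirrors the UMUL example: only the staged version multiplies every channel by the original x value.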
Kenneth Graunke
8958741e5a i965: Merge intel_context.h into brw_context.h.
v2: Keep the random 32-bit only version of memcpy, since Ian says I
    can't delete it without data proving it isn't useful.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:44 -07:00
Kenneth Graunke
3dda3ebec9 i965: Delete our copy of likely/unlikely macros.
brw_context.h includes imports.h which includes compiler.h which already
defines these.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:44 -07:00
Kenneth Graunke
67601da24c mesa: Move U_FIXED/S_FIXED macros from i965 to macros.h.
These make it easy to convert a floating point value to a fixed point
number.  The second parameter is the number of bits used for the
fractional part of the number.

It looks like core Mesa has similar functions already, but none that
allows an arbitrary number of fractional bits.  The more generic version
is probably useful to everyone.

r600g apparently has an identical copy of the S_FIXED macro, but doesn't
include this file.  I'm not sure what to do about that, so I'm just
going to leave it for now.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:44 -07:00
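The conversion described in the commit above (scale by two to the power of the fractional bit count, then truncate) might look like the sketch below. The function names `float_to_ufixed()` and `float_to_sfixed()` are hypothetical and are written as functions rather than macros; the actual Mesa macros may differ in rounding and range handling.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a non-negative float to unsigned fixed point with frac_bits
 * fractional bits.  The result is truncated toward zero. */
uint32_t
float_to_ufixed(float value, unsigned frac_bits)
{
   assert(value >= 0.0f && frac_bits < 31);
   return (uint32_t)(value * (float)(1u << frac_bits));
}

/* Signed variant: same scaling, truncated toward zero. */
int32_t
float_to_sfixed(float value, unsigned frac_bits)
{
   assert(frac_bits < 31);
   return (int32_t)(value * (float)(1u << frac_bits));
}
```

For example, 1.5 with 8 fractional bits becomes 1.5 * 256 = 384.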
Kenneth Graunke
1a82081db6 mesa: Move ROUND_DOWN_TO() macro from i915/i965 to macros.h.
This seems generally useful, so it may as well live in core Mesa.

In fact, the comment for ALIGN() in macros.h actually says to "see also"
ROUND_DOWN_TO, which...was in a driver somewhere.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:44 -07:00
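For readers unfamiliar with this pair of helpers, rounding down and aligning up to a power-of-two boundary are commonly done with a bit mask. The sketch below uses hypothetical function names (`round_down_to()`, `align_up()`) and assumes a power-of-two alignment; it is not copied from macros.h.

```c
#include <assert.h>
#include <stdint.h>

int
is_power_of_two(uint32_t v)
{
   return v != 0 && (v & (v - 1)) == 0;
}

/* Largest multiple of alignment that is <= value. */
uint32_t
round_down_to(uint32_t value, uint32_t alignment)
{
   assert(is_power_of_two(alignment));
   return value & ~(alignment - 1);
}

/* Smallest multiple of alignment that is >= value. */
uint32_t
align_up(uint32_t value, uint32_t alignment)
{
   assert(is_power_of_two(alignment));
   return (value + alignment - 1) & ~(alignment - 1);
}
```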
Kenneth Graunke
50c9f04c5f i965: Move need_workaround_flush = true to intel_batchbuffer_init.
intel_batchbuffer_init() sets up initial batchbuffer state; it seems
like a reasonable place to initialize this flag.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:44 -07:00
Kenneth Graunke
ddc8decdb2 i965: Move DriverFlag initialization to brw_init_state().
Configuring which dirty flags we want sounds like a job for
brw_init_state().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:44 -07:00
Kenneth Graunke
ba0cc79ab9 i965: Merge intelInitContext into brwCreateContext.
The split here was completely arbitrary.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:44 -07:00
Kenneth Graunke
90d52d2c76 i965: Move viewport driver hook setup to brw_init_driver_functions.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:44 -07:00
Kenneth Graunke
f118fc26e1 i965: Make brwInitFunctions take brw_context rather than intel_screen.
It actually just wants generation checking, and brw->gen is the usual
way of doing that.  In the future, we'll also want to check brw->hw_ctx,
which isn't available from the screen.

While we're changing the function signature, convert from camel case to
our usual naming conventions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:44 -07:00
Kenneth Graunke
9848a42287 i965: Merge intelInitFunctions() and brwInitFunctions().
They do exactly the same thing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:44 -07:00
Kenneth Graunke
0138fd4610 i965: Merge intel_context.c into brw_context.c.
There's no point in having two files for context functions.  This patch
moves the code from intel_context.c into brw_context.c unmodified
(other than whitespace fixes).

Right now, this looks silly; future patches will merge functions and
tidy things up.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-13 00:10:44 -07:00
Kenneth Graunke
8d315b2583 i965: Move memset of TextureFormatSupported to brw_init_surface_formats.
brw_init_surface_formats already sets entries in TextureFormatSupported
to true; it may as well take care of initializing it to false too.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:44 -07:00
Kenneth Graunke
fc5b865cec i965: Remove has_aa_line_parameters.
This flag is only used in one place, and is only set on one platform.
Just check for original Gen4 in the relevant function.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:44 -07:00
Kenneth Graunke
220c1e5610 i965: Move state setup from brwCreateContext to brw_init_state().
This seems like a better place for it, and helps clean up
brwCreateContext (which is full of a lot of random stuff).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-13 00:10:44 -07:00
Kenneth Graunke
d31b928b93 i965: Remove the brw_context::emit_state_always flag.
This was always set to false, and is only used for debugging.
To enable it, simply change the if (0) block and recompile.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:44 -07:00
Kenneth Graunke
02b632d8e8 i965: Move hardware feature flags to brw_device_info.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:44 -07:00
Kenneth Graunke
ea890c031d i965: Move device quirks to brw_device_info.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:43 -07:00
Kenneth Graunke
d76f6c7ae4 i965: Move hardware limits to brw_device_info.
Since each kind of device has its own brw_device_info structure, we can
simply store the URB and thread limits there.  This eliminates all the
large if-ladders, and simplifies the context initialization code quite a
bit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-13 00:10:43 -07:00
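The idea in the commit above, replacing if-ladders with one static structure per device family looked up through the PCI ID table, can be sketched as follows. Everything here is hypothetical: the struct layout, the family names, the PCI IDs, and the thread counts are placeholders, not real hardware data or the actual brw_device_info definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-family description of static hardware limits. */
struct device_info {
   int gen;
   unsigned max_vs_threads;
   unsigned max_wm_threads;
};

/* Placeholder numbers, not real hardware limits. */
static const struct device_info family_a_info = { 6, 10, 20 };
static const struct device_info family_b_info = { 7, 30, 40 };

/* Each (made-up) PCI ID points at the structure for its family. */
struct pci_entry {
   unsigned pci_id;
   const struct device_info *info;
};

static const struct pci_entry pci_table[] = {
   { 0x0001, &family_a_info },
   { 0x0002, &family_a_info },
   { 0x0010, &family_b_info },
};

/* Returns the family description, or NULL for an unknown device. */
const struct device_info *
lookup_device_info(unsigned pci_id)
{
   for (size_t i = 0; i < sizeof(pci_table) / sizeof(pci_table[0]); i++) {
      if (pci_table[i].pci_id == pci_id)
         return pci_table[i].info;
   }
   return NULL;
}
```

Context setup then reads limits from the returned structure instead of branching on the device generation at each use.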
Kenneth Graunke
afe05e7193 i965: Replace some intel_screen fields with brw_device_info references.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:43 -07:00
Kenneth Graunke
9d490c172b i965: Delete the INTEL_SEPARATE_STENCIL override.
This option was useful during initial development, but it's been ages
since I've heard of anyone using it.  Plus, Gen7+ mandates separate
stencil, so it was really only useful on Sandybridge anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:43 -07:00
Kenneth Graunke
6e9f427ed8 i965: Add a new brw_device_info structure.
The idea is that struct brw_device_info should store statically-known
information about hardware features.  Using the new family name in the
PCI ID table, we can easily grab the right structure.

This is basically the equivalent of intel_device_info in the kernel.

This patch also makes the new structure available from intel_screen, but
nothing uses it.  Right now, it looks very redundant with existing
fields, but that will change.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:43 -07:00
Kenneth Graunke
4a29b9a066 i965: Add the family name to the PCI ID table.
I removed this a while ago, since we never used it, but I'm finally
resurrecting the idea in the next commits.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:43 -07:00
Kenneth Graunke
8d4ecbccd6 i965: Remove #define name from PCI ID table.
Nothing uses the #define name, and it's not terribly useful - the
numerical ID serves the same purpose.  The only thing we could really do
with it is generate slightly prettier preprocessed code.  But who looks
at that?

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:43 -07:00
Kenneth Graunke
90511faedd i965: Pull most driconf option handling into a centralized function.
Using a helper function clarifies the context initialization code.

I would've liked to completely centralize it, but moving the optionCache
code from intelInitExtensions into here would've required setting flags
in the context, which seems like a waste.

v2: Rebase for the introduction of disable_derivative_optimization.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:43 -07:00
Kenneth Graunke
0fb525b87c i965: Move a bunch of code from intelInitContext to brwCreateContext.
Now that intelInitContext isn't shared between i915 and i965, the split
is fairly arbitrary.  This patch moves a bunch of the basic context
creation and generation checking code up to the top-level function
(and slightly earlier).

More will follow.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:43 -07:00
Kenneth Graunke
a25caad9e4 i965: Update the comment about viewport hacks.
It wasn't clear that this was necessary for EGL, or why.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:43 -07:00
Kenneth Graunke
832bcc3613 i965: Pull out INTEL_DEBUG handling into new intel_debug.[ch] files.
Now that there isn't an intel_context structure, the split between
brw_context.[ch] and intel_context.[ch] is rather awkward and arbitrary.
Removing intel_context.[ch] seems desirable, but not everything really
belongs in brw_context.[ch], either.

Moving INTEL_DEBUG handling into separate intel_debug.[ch] files should
make them relatively easy to find.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:43 -07:00
Kenneth Graunke
3f7b4e5d04 i965: Rename brwCreateContext's error parameter to dri_ctx_error.
"error" is a very generic name.  dri_ctx_error is the name used in
intelInitContext(), which is more specific.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-13 00:10:43 -07:00
Eric Anholt
95bd8a332d dri: Move i965-specific context flag logic to dri common.
Nobody else yet can do a forward context anyway, but others should be able
to do debug contexts, and those would have just had no effect currently.
2013-10-13 00:10:43 -07:00
Stephane Marchesin
5ceeeb360e i915g: Fix assert
Now that we support start, assert on start + num < max samplers

Reported by xexaxo
2013-10-12 11:40:54 -07:00
Paul Berry
975c6ce605 mesa: Bump version to 10.0.0.
Mesa now supports OpenGL 3.2 and GLSL 1.50, so bump the Mesa major
version from 9 to 10 to reflect this.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-12 08:58:18 -07:00
Paul Berry
200f9a0576 mesa: Remove warning that geometry shader support is experimental.
Geometry shader support is now working well, and adequately piglit
tested.  There are just a few piglit failures left to fix.  So there's
no need for an "experimental" warning anymore.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-12 08:58:02 -07:00
Paul Berry
b6d6ea396c i965: Turn on GLSL 1.50 and GL 3.2 support for i965 gen7.
Geometry shaders were the last thing we needed to finish before
turning on GLSL 1.50 and GL 3.2 support.  They are now working well,
with just a few piglit failures left to fix.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-12 08:57:45 -07:00
Jay Cornwall
d7d539a1cb radeon/llvm: show LLVM disassembly when available
With code dump enabled LLVM may generate disassembly during compilation.
Show this disassembly when available and prefer it to SI bytecode dump.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jay Cornwall <jay@jcornwall.me>
2013-10-12 00:03:58 -04:00
Roland Scheidegger
7681beedd1 softpipe: fix seamless cube filtering
Fix coord wrapping (and face selection too) in case of edges.
Unfortunately, the coord wrapping is way more complicated than what
the code did, as it depends on the face and the direction where the
texel falls off the face (the logic needed to get this right in fact
seems utterly ridiculous).
Also fix a bug in (y direction under/overflow) face selection.
And get rid of complicated cube corner handling. Just like edge case,
the coord wrapping was wrong and it seems very difficult to fix.
I'm near certain it can't always work anyway (though ordinary seamless
filtering on edge has actually a similar problem but not as severe)
because we don't have per-pixel face, hence could have multiple corner
texels which would make it very difficult to average the remaining texels
correctly. Hence simply pick a texel which would only have fallen off one
edge but not both instead, which is not quite accurate but actually I think
should be enough to meet OpenGL (but not d3d10) requirements.

v2: small fixes suggested by Brian, add some comments.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-12 04:05:57 +02:00
Roland Scheidegger
75f1fea14f llvmpipe: increase fs shader variant instruction cache limit by factor 4
The previous limit of 128*1024 was reported on IRC to cause frequent
recompiles in some apps due to shader variant thrashing, leading
to noticeable lags.
Note that the LP_MAX_SHADER_VARIANTS limit (1024) was more or less impossible
to reach, since even simple fragment shaders without texturing (glxgears) used
more than twice 128 instructions, hence the instruction limit would have
always been reached first (excluding things like trivial shaders not writing
color). Even with the new limit it is VERY likely the instruction limit is hit
first.
Should help with such lags due to recompiles (though other shader types have
their own limits, LP_MAX_SETUP_VARIANTS and DRAW_MAX_SHADER_VARIANTS, in
particular the latter seems a bit small (128)).

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-12 04:05:57 +02:00
Vinson Lee
a9a78640d9 mesa: Do not use newlocale on NetBSD.
Fixes this build error.

  CC       imports.lo
../../src/mesa/main/imports.c: In function '_mesa_strtof':
../../src/mesa/main/imports.c:570:20: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'loc'
../../src/mesa/main/imports.c:570:20: error: 'loc' undeclared (first use in this function)
../../src/mesa/main/imports.c:570:20: note: each undeclared identifier is reported only once for each function it appears in
../../src/mesa/main/imports.c:572:7: error: implicit declaration of function 'newlocale'
../../src/mesa/main/imports.c:572:23: error: 'LC_CTYPE_MASK' undeclared (first use in this function)
../../src/mesa/main/imports.c:574:4: error: implicit declaration of function 'strtof_l'
../../src/mesa/main/imports.c:580:1: warning: control reaches end of non-void function

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-10-11 17:04:54 -07:00
Brian Paul
1737189f0a svga: s/0/FALSE/ 2013-10-11 17:07:44 -06:00
Brian Paul
6f1b5052ec mesa: add comment to clarify ctx->Driver.MapBufferRange() return value 2013-10-11 17:07:44 -06:00
Brian Paul
3710b65823 st/mesa: whitespace fixes in st_cb_bufferobjects.c 2013-10-11 17:07:44 -06:00
Brian Paul
ffe529352b vbo: assorted minor clean-ups
Use GL_TRUE/FALSE instead of 1/0.  Remove extraneous parentheses.
Remove trailing whitespace.
2013-10-11 17:07:44 -06:00
Brian Paul
2a429f9d9c glsl: fix signed/unsigned comparison warning 2013-10-11 17:07:44 -06:00
Kristian Høgsberg
1d34927061 wayland: Only pass wl_drm instance to gbm when using gbm platform 2013-10-11 15:30:09 -07:00
Kristian Høgsberg
360a141f24 wayland: Don't rely on static variable for identifying wl_drm buffers
Now that libEGL has been fixed to not leak all kinds of symbols, gbm
links to its own copy of the libwayland-drm.a helper library.  That means
we can't rely on comparing the addresses of a static vtable symbol in that
library to determine if a wl_buffer is a wl_drm_buffer.  Instead, we
move the vtable into the wl_drm struct and use that for comparing.

https://bugs.freedesktop.org/show_bug.cgi?id=69437

Cc: 9.2 <mesa-stable@lists.freedesktop.org>
2013-10-11 15:14:35 -07:00
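The identification scheme described in the commit above, comparing against a vtable stored in the owning struct rather than against a static symbol that may exist in several copies of a library, might be sketched like this. The types and functions are hypothetical simplifications, not the wl_drm API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer vtable. */
struct buffer_interface {
   void (*destroy)(void *buffer);
};

/* The vtable lives inside the instance, so there is exactly one
 * address to compare against no matter how many copies of the helper
 * library are linked into the process. */
struct drm_instance {
   struct buffer_interface buffer_interface;
};

struct buffer {
   const struct buffer_interface *implementation;
};

void
drm_init_buffer(const struct drm_instance *drm, struct buffer *buf)
{
   assert(drm != NULL && buf != NULL);
   buf->implementation = &drm->buffer_interface;
}

bool
buffer_is_from_drm(const struct drm_instance *drm, const struct buffer *buf)
{
   assert(drm != NULL && buf != NULL);
   return buf->implementation == &drm->buffer_interface;
}
```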
Vinson Lee
fe6974382b glapi: Do not use backtrace on NetBSD.
execinfo.h is not available on NetBSD.

Fixes this build error.

  CC       glapi_gentable.lo
glapi_gentable.c:44:22: fatal error: execinfo.h: No such file or directory

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-10-11 14:48:45 -07:00
Ian Romanick
59f18340c3 glsl: Remove extraneous .dir-locals.el
This was overriding the top-level .dir-locals.el causing some settings
(like forcing spaces instead of tabs!) to be lost.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-11 10:43:37 -07:00
Grigori Goronzy
3de7e11f58 r600g: fix crash in set_framebuffer_state
We should be able to safely set the framebuffer state without a
fragment shader bound. bind_ps_state will take care of updating the
necessary state bits later.

v2: check in update_db_shader_control
2013-10-11 17:33:18 +02:00
Topi Pohjolainen
396c69bf5d mesa: Allow external textures to use fallback (0, 0, 0, 1)
Fixes GL2ExtensionTests/egl_image_external/TestSimpleUnassociated.test
which is part of gles2/3 conformance suite. Here image external
textures are switched to be treated the same as 2D textures. These
can be associated with the fallback texture providing fixed sample
values of (0, 0, 0, 1).

The OES_EGL_image_external spec says:

  "Sampling an external texture which is not associated with any EGLImage
   sibling will return a sample value of (0,0,0,1)."

  "External textures cannot be used with TexImage2D, TexSubImage2D,
   CompressedTexImage2D, CompressedTexSubImage2D, CopyTexImage2D, or
   CopyTexSubImage2D, and an INVALID_ENUM error will be generated if
   this is attempted."

And quoting Chad:

  "That's enforced in _mesa_TexImage*() by calling
   legal_teximage_target(), and enforced in _mesa_TexSubImage*() by
   calling legal_texsubimage_target(). Each of the
   legal_tex*image_target() functions reject external textures.
   Therefore, allowing GL_TEXTURE_EXTERNAL_OES in store_texsubimage()
   won't violate the above spec quote.

   I think it's safe to allow GL_TEXTURE_EXTERNAL_OES in
   store_texsubimage(), as long as the texture has only a single
   plane. Luckily, that's the only type of external textures that
   Mesa currently supports."

CC: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2013-10-11 09:59:01 +03:00
Chad Versace
9cb8f7a126 doxygen: Add i965 to list of modules in html header
Signed-off-by: Chad Versace <chad@chad-versace.us>
2013-10-10 22:20:39 -07:00
Frank Henigman
49ed5991ee i965: extend fast texture upload
Extend the fast texture upload from BGRA X-tiled to include RGBA,
Alpha/Luminance, and Y-tiled.  Speed improvements, measured with
mesa demos teximage program, on 256 x 256 texture, in MB/s, on a
Sandy Bridge (Ivy is comparable):

              before  after   increase
BGRA/X-tiled   3266    4524    1.39x
BGRA/Y-tiled   1739    3971    2.28x
RGBA/X-tiled    474    4694    9.90x
RGBA/Y-tiled    477    3368    7.06x
   L/X-tiled   1268    1516    1.20x
   L/Y-tiled   1439    1581    1.10x
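
As a quick cross-check of the table above, the "increase" column is the
"after" throughput divided by the "before" throughput. A minimal Python
sketch of that arithmetic follows; the dictionary layout and the names
results, speedup, and speedups are illustrative and not part of the patch.

```python
# Throughput in MB/s, copied from the table in this commit message.
# The data layout and all names here are illustrative only.
results = {
    "BGRA/X-tiled": (3266, 4524),
    "BGRA/Y-tiled": (1739, 3971),
    "RGBA/X-tiled": (474, 4694),
    "RGBA/Y-tiled": (477, 3368),
    "L/X-tiled": (1268, 1516),
    "L/Y-tiled": (1439, 1581),
}


def speedup(before, after):
    """Return the after/before throughput ratio."""
    return after / before


speedups = {name: speedup(b, a) for name, (b, a) in results.items()}

# Print each ratio rounded to two decimals, as in the table.
for name, ratio in speedups.items():
    print(f"{name:<13} {ratio:.2f}x")
```

Rounded to two decimals, the ratios should agree with the increase column.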

v2: Cosmetic changes only: reformat and reword comments, make doxygen-friendly,
    rename variables, use existing macros, add an assert.

Signed-off-by: Frank Henigman <fjhenigman@google.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
2013-10-10 18:16:41 -07:00
Alexander von Gluck IV
0fda1cb498 haiku: Fix llvmpipe and clean up softpipe tracing
* Fix LLVM library and defines
* Only enable tracing when scons build=debug

Acked-by: Brian Paul <brianp@vmware.com>
2013-10-10 19:28:23 -05:00
Alexander von Gluck IV
69508950da haiku: Remove common directory search path
* /boot/common no longer exists in Haiku as of
  a few days ago (and this is undefined)

Acked-by: Brian Paul <brianp@vmware.com>
2013-10-10 19:28:23 -05:00
Eric Anholt
8821e9d108 dri: Reference the global driver vtable once at screen init.
This is part of the prep for megadrivers, which won't allow using a single
global symbol due to the fact that there will be multiple drivers built
into the same dri.so file.  For that, we'll need screen init to take a
reference to the driver to set up this vtable.

v2: Fix two missed references to driDriverAPI.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-10-10 16:34:30 -07:00
Eric Anholt
ee8983becc i965: Clean up error handling for context creation.
The intel_screen.c used to be a dispatch to one of 3 driver functions, but
was down to 1, so it was kind of a waste.  In addition, it was trying to
free all of the data that might have been partially freed in the kernel
3.6 check (which comes after intelInitContext, and thus might have had
driverPrivate set and result in intelDestroyContext() doing work on the
freed data).  By moving the driverPrivate setup earlier, we can use
intelDestroyContext() consistently and avoid such problems in the future.

v2: Adjust the prototype of brwCreateContext to use the proper enum
    (fixing a compiler warning in some builds)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-10-10 16:34:30 -07:00
Eric Anholt
18a8f31070 intel: Remove silly check for !bufmgr.
If bufmgr didn't get created, then screen creation failed, and we never
should have got here in the first place.  This was added by Chris Wilson
in 2010 with no explanation for why it would be needed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-10 16:34:30 -07:00
Eric Anholt
083f66fdd6 dri: Move API version validation into dri/common.
i965, i915, radeon, r200, swrast, and nouveau were mostly trying to do the
same logic, except where they failed to.  Notably, swrast had code that
appeared to try to enable GLES1/2 but forgot to set api_mask (thus
preventing any gles context from being created), and the non-intel drivers
didn't support MESA_GL_VERSION_OVERRIDE.

nouveau still relies on _mesa_compute_version(), because I don't know what
its limits actually are, and gallium drivers don't declare limits up front
at all.  I think I've heard talk about doing so, though.

v2: Compat max version should be 30 (noted by Ken)
    Drop r100's custom max version check, too (noted by Emil Velikov)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-10 16:34:30 -07:00
Eric Anholt
d81632fb1e dri: Merge drisw_util.c into dri_util.c
The only important difference was not calling drmGetVersion, and making
the swrast extension vtable.  That doesn't justify duplicating the other
330 lines of code.

v2: fix the scons build (code by Emil Velikov)
v3: fix scons build with swrast-only (code by Emil Velikov)
v4: Drop the new define I added, when we already have __NOT_HAVE_DRM_H.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-10 16:34:30 -07:00
Eric Anholt
683f6daa97 dri: Add an explanatory comment for an important driver entrypoint.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-10 16:34:30 -07:00
Eric Anholt
7f3a131b6e dri: Remove dead comment.
The code it was referencing was removed in 2010.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-10 16:34:30 -07:00
Eric Anholt
36fbe66d3a i965/fs: Convert gen7 to using GRFs for texture messages.
Looking at Lightsmark's shaders, the way we used MRFs (or in gen7's
case, GRFs) was bad in a couple of ways.  One was that it prevented
compute-to-MRF for the common case of a texcoord that gets used
exactly once, but where the texcoord setup all gets emitted before the
texture calls (such as when it's a bare fragment shader input, which
gets interpolated before processing main()).  Another was that it
introduced a bunch of dependencies that constrained scheduling, and
forced waits for texture operations to be done before they are
required.  For example, we can now move the compute-to-MRF
interpolation for the second texture send down after the first send.

The downside is that this generally prevents
remove_duplicate_mrf_writes() from doing anything, whereas previously
it avoided work for the case of sampling from the same texcoord twice.
However, I suspect that most of the win that originally justified that
code was in avoiding the WAR stall on the first send, which this patch
also avoids, rather than the small cost of the extra instruction.  We
see instruction count regressions in shaders in unigine, yofrankie,
savage2, hon, and gstreamer.

Improves GLB2.7 performance by 0.633628% +/- 0.491809% (n=121/125, avg of
~66fps, outliers below 61 dropped).

Improves openarena performance by 1.01092% +/- 0.66897% (n=425).

No significant difference on Lightsmark (n=44).

v2: Squash in the fix for register unspilling for send-from-GRF, fixing a
    segfault in lightsmark.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
2013-10-10 15:54:16 -07:00
Eric Anholt
ee21c8b1e6 i965/fs: Allocate more register classes on gen7.
For texturing from GRFs, we now have payloads of arbitrary sizes up to the
message length limit.

v2 (Kenneth Graunke): Rebase on intel_context -> brw_context change.
v3: Add some comment text.
v4: Change some magic 16s to BRW_MAX_MRF (noted by Ken).  Leave the 11,
    which is the magic "max sampler message length".  BRW_MAX_MRF sizing
    on the little int arrays is retained because I could see us needing to
    extend in the future if we move to GRFs for FB writes (those go to at
    least 12 long in a quick scan of the specs)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2)
Acked-by: Matt Turner <mattst88@gmail.com>
2013-10-10 15:54:16 -07:00
Eric Anholt
b6af650a09 i965/fs: Use per-channel interference for register_coalesce_2().
This will let us coalesce into texture-from-GRF arguments, which would
otherwise be prevented due to the live interval for the whole vgrf
extending across all the MOVs setting up the channels of the message

v2 (Kenneth Graunke): Rebase for renames.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-10 15:54:16 -07:00
Eric Anholt
3093085847 i965/fs: Use the new per-channel live ranges for dead code elimination.
v2 (Kenneth Graunke): Rebase on s/live_variables/live_intervals/g.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-10 15:54:16 -07:00
Eric Anholt
b4d676d710 i965/fs: Keep a copy of the live variables class around.
Now optimization passes will be able to look at the per-channel ranges.

v2: Rebase on various optimization pass changes.
v3 (Kenneth Graunke): Rename live_variables to live_intervals; split
   introduction of invalidate_live_intervals() into a separate patch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-10 15:54:15 -07:00
Kenneth Graunke
3ea84beb16 i965/fs: Invalidate live intervals when compacting; don't fix them.
When compacting the list of VGRFs, we patch up the live interval ranges
(which are indexed by VGRF number).  Unfortunately, once we make
per-component data available, this will become too complicated to
maintain.  Instead, simply invalidate them.

This was pulled out of a patch by Eric Anholt.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-10 15:54:15 -07:00
Kenneth Graunke
939b0f2c2f i965/fs: Remove start/end aliases in compute_live_intervals().
In compute_live_intervals(), start and end are shorter names for
the virtual_grf_start and virtual_grf_end class members.

Now that the fs_live_intervals class has arrays named start and end
which are indexed by var, rather than VGRF, reusing the name is
confusing.  Plus, most of the code has been factored out, so using the
long names isn't as inconvenient.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-10 15:54:15 -07:00
Eric Anholt
398656d97e i965/fs: Track live variable ranges on a per-channel level.
This is the information we'll actually use to replace the
virtual_grf_start[]/end[] arrays.

No change in shader-db.

v2 (Kenneth Graunke): Rebase; minor comment updates.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-10 15:54:15 -07:00
Eric Anholt
097bf101c3 i965/fs: Factor def[]/use[] setup out to a separate function.
These blocks are about to grow some more code, and the indentation was
getting out of hand.

v2 (Kenneth Graunke): Rebase, minor typo fixes and style changes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-10 15:54:15 -07:00
Kenneth Graunke
4b821a97b5 i965/fs: Create a helper function for invalidating live intervals.
For now, this simply sets live_intervals_valid = false, but in the
future it will do something more sophisticated.

Based on a patch by Eric Anholt.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-10 15:54:15 -07:00
Eric Anholt
45ffaeccaf i965/fs: Do live variables dataflow analysis on a per-channel level.
This significantly improves our handling of VGRFs of size > 1.

Previously, we only marked VGRFs as def'd if the whole register was
written by a single instruction.  Large VGRFs which were written
piecemeal would not be considered def'd at all, even if they were
ultimately completely written.

Without being def'd, these were then marked "live in" to the basic
block, often extending the range to preceding blocks and sometimes
even the start of the program.

The new per-component tracking gives more accurate live intervals,
which makes register coalescing more effective.

In the future, this should help with texturing from GRFs on Gen7+.
A sampler message might be represented by a 2-register VGRF which
holds the texture coordinates.  If those are incoming varyings,
they'll be produced by two PLN instructions, which are piecemeal writes.

No reduction in shader-db instruction counts.  However, code which
prints the live interval ranges does show that some VGRFs now have
smaller (and more correct) live intervals.
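
The difference between the two schemes can be illustrated with a toy
model. This is a sketch under simplifying assumptions, not the i965 code:
every instruction writes exactly one register, so a VGRF is fully written
by one instruction only when its size is 1, and the Inst class and both
function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Inst:
    dst: tuple                                  # (vgrf, register offset) written
    srcs: list = field(default_factory=list)    # (vgrf, offset) pairs read


def block_def_use_whole(insts, vgrf_sizes):
    """Old scheme: a VGRF is def'd only if one instruction writes all of it."""
    defs, uses = set(), set()
    for inst in insts:
        for vgrf, _ in inst.srcs:
            if vgrf not in defs:
                uses.add(vgrf)      # read without a prior full def: live-in
        vgrf, _ = inst.dst
        # Toy assumption: one instruction writes one register, so only
        # size-1 VGRFs can be completely written by a single instruction.
        if vgrf_sizes[vgrf] == 1 and vgrf not in uses:
            defs.add(vgrf)
    return defs, uses


def block_def_use_per_channel(insts):
    """New scheme: track each (vgrf, offset) pair as its own variable."""
    defs, uses = set(), set()
    for inst in insts:
        for var in inst.srcs:
            if var not in defs:
                uses.add(var)
        if inst.dst not in uses:
            defs.add(inst.dst)
    return defs, uses


# VGRF 0 has size 2 and is written piecemeal (think of two PLNs),
# then read as a whole by a send that writes the size-1 VGRF 1.
sizes = {0: 2, 1: 1}
insts = [
    Inst(dst=(0, 0)),
    Inst(dst=(0, 1)),
    Inst(dst=(1, 0), srcs=[(0, 0), (0, 1)]),
]

whole_defs, whole_uses = block_def_use_whole(insts, sizes)
chan_defs, chan_uses = block_def_use_per_channel(insts)
```

In this model the whole-register scheme leaves VGRF 0 in the use set (so
it looks live-in to the block), while the per-channel scheme records both
halves as def'd and nothing as live-in.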

v2: Rebase on current send-from-GRF code requiring adding extra use[]s.
v3: Rebase on live intervals fix to include defs in the end of the
    interval.
v4 (Kenneth Graunke): Rebase; split off a few preparatory patches;
    add lots of comments; minor style changes; rewrite commit message.
v5 (Eric Anholt): whitespace nit.

Written-by: Eric Anholt <eric@anholt.net> [v1-3]
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [v4]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> (v4)
2013-10-10 15:54:14 -07:00
Kenneth Graunke
5af8388110 i965/fs: Rename num_vars to num_vgrfs in live interval analysis.
num_vars was shorthand for the number of virtual GRFs.  num_vgrfs is a
bit clearer.  Plus, the next patch will introduce "vars" which are
distinct from vgrfs.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-10 15:54:14 -07:00
Kenneth Graunke
701e9af15f i965/fs: Short-circuit a loop in live variable analysis.
This has no functional effect, but should make subsequent changes a
little simpler.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-10 15:54:14 -07:00
Paul Berry
8cb9cce040 glsl: Don't allow gl_PerVertex to be redeclared after it's been used.
Fixes piglit tests:
- spec/glsl-1.50/compiler/gs-redeclares-pervertex-in-after-other-usage.geom
- spec/glsl-1.50/compiler/gs-redeclares-pervertex-out-after-other-usage.geom
- spec/glsl-1.50/compiler/gs-redeclares-pervertex-out-after-usage.geom
- spec/glsl-1.50/compiler/vs-redeclares-pervertex-out-after-other-usage.vert
- spec/glsl-1.50/compiler/vs-redeclares-pervertex-out-after-usage.vert

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:27:40 -07:00
Paul Berry
84b9fa83a0 glsl: Support redeclaration of GS gl_PerVertex input.
Fixes piglit test
spec/glsl-1.50/execution/redeclare-pervertex-subset-vs-to-gs.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:27:38 -07:00
Paul Berry
fc2330b0be glsl: Catch redeclaration of interface block instance names at compile time.
From section 4.1.9 (Arrays) of the GLSL 4.40 spec (as of revision 7):

    However, unless noted otherwise, blocks cannot be redeclared;
    an unsized array in a user-declared block cannot be sized
    through redeclaration.

The only place where the spec notes that interface blocks can be
redeclared is to allow for redeclaration of built-in interface blocks
such as gl_PerVertex.  Therefore, user-defined interface blocks can
never be redeclared.  This is a clarification of previous intent (see
Khronos bug 10659).

We were already preventing interface block redeclaration using the
same block name at compile time, but we weren't preventing interface
block redeclaration using the same instance name (and different block
names) at compile time.  And we weren't preventing an instance name
from conflicting with a previously-declared ordinary variable.

In practice the problem would be caught at link time, but only because
of a coincidence: since ast_interface_block::hir() wasn't doing any
checking to see if the instance name already existed in the shader, it
was creating a second ir_variable in the shader having the same name
but a different type.  Coincidentally, when the linker checked for
intrastage consistency of global variable declarations, it treated the
two declarations from the same shader as a conflict, so it reported a
link error.

But it seems dangerous to rely on that linker behaviour to catch
illegal redeclarations that really ought to be detected at compile
time.
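
A minimal sketch of the kind of compile-time check described above,
written in Python for brevity. The Shader and CompileError classes and
their methods are hypothetical stand-ins, not the ast_interface_block
code, and ordinary variable redeclaration rules are deliberately left out.

```python
class CompileError(Exception):
    """Raised when a declaration is rejected at compile time."""


class Shader:
    def __init__(self):
        self.block_names = set()
        self.variables = {}     # variable or instance name -> type name

    def declare_variable(self, name, type_name):
        # Rules for ordinary variable redeclaration are out of scope here.
        self.variables[name] = type_name

    def declare_interface_block(self, block_name, instance_name):
        # Check that already existed: block name must be new.
        if block_name in self.block_names:
            raise CompileError(f"interface block `{block_name}` redeclared")
        # Check modelled on this commit: the instance name must not clash
        # with an earlier instance name or an ordinary variable.
        if instance_name in self.variables:
            raise CompileError(f"`{instance_name}` redeclared")
        self.block_names.add(block_name)
        self.variables[instance_name] = block_name


def is_rejected(shader, block_name, instance_name):
    """Return True if declaring the block raises CompileError."""
    try:
        shader.declare_interface_block(block_name, instance_name)
    except CompileError:
        return True
    return False
```

With this model, declaring blocks A and B with the same instance name, or
a block whose instance name matches an existing global, fails immediately
instead of waiting for a link-time consistency check.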

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:27:35 -07:00
Paul Berry
1b4a7378e9 glsl: Support redeclaration of VS and GS gl_PerVertex output.
Fixes piglit tests:
- spec/glsl-1.50/execution/redeclare-pervertex-out-subset-gs
- spec/glsl-1.50/execution/redeclare-pervertex-subset-vs

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:27:33 -07:00
Paul Berry
79f515251a glsl: Error check redeclarations of gl_PerVertex.
This patch verifies that:

- The gl_PerVertex input interface block may only be redeclared in a
  geometry shader, and that it may only be redeclared as gl_in[].

- The gl_PerVertex output interface block may only be redeclared in a
  vertex or geometry shader, and that it may only be redeclared as a
  non-array without an interface name.

- gl_PerVertex may not be redeclared as any other type of interface
  block (i.e. as a uniform interface block).

As a side-effect, the code now keeps track of what the previous
declaration of gl_PerVertex was--this will be needed in future
patches.

Fixes piglit tests:
- spec/glsl-1.50/compiler/gs-redeclares-pervertex-in-with-incorrect-name.geom
- spec/glsl-1.50/compiler/gs-redeclares-pervertex-out-as-array.geom
- spec/glsl-1.50/compiler/gs-redeclares-pervertex-out-with-instance-name.geom

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:27:31 -07:00
Paul Berry
3c83c96dcd glsl: Make it possible to disable a variable in the symbol table.
In later patches, we'll use this in order to implement the required
behaviour that after the gl_PerVertex interface block has been
redeclared, only members of the redeclared interface block may be
used.

v2: Update the function name and comment to clarify that we aren't
actually removing the variable from the symbol table, just disabling
it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:27:27 -07:00
Paul Berry
24b9bba19b glsl: Add an ir_variable::reinit_interface_type() function.
This will be used by future patches to change an ir_variable's
interface type when the gl_PerVertex built-in interface block is
redeclared.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:27:22 -07:00
Paul Berry
3699ff4dd1 glsl: Generalize processing of variable redeclarations.
This patch modifies the get_variable_being_redeclared() function so
that it no longer relies on the ast_declaration for the variable being
redeclared.  In future patches, this will allow
get_variable_being_redeclared() to be used for processing
redeclarations of the built-in gl_PerVertex interface block.

v2: Also make get_variable_being_redeclared() static.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:27:20 -07:00
Paul Berry
78b072b2bc glsl: Don't allow invalid identifiers as struct names.
Fixes piglit test
spec/glsl-1.10/compiler/struct/struct-name-uses-gl-prefix.vert.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:27:17 -07:00
Paul Berry
9fb6f59552 glsl: Don't allow invalid identifiers as interface block instance names.
Note: we need to make an exception for the gl_PerVertex interface
block, since in geometry shaders it is allowed to be redeclared with
the instance name gl_in.  Future patches will make redeclaration of
gl_PerVertex work properly.

Fixes piglit test
spec/glsl-1.50/compiler/interface-block-instance-name-uses-gl-prefix.vert.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:27:15 -07:00
Paul Berry
9b5b0320b6 glsl: Don't allow invalid identifier names in struct/interface fields.
Note: we need to make an exception for the gl_PerVertex interface
block, since built-in variables are allowed to be redeclared inside
it.  Future patches will make redeclaration of gl_PerVertex work
properly.

Fixes piglit tests:
- spec/glsl-1.50/compiler/interface-block-array-elem-uses-gl-prefix.vert
- spec/glsl-1.50/compiler/named-interface-block-elem-uses-gl-prefix.vert
- spec/glsl-1.50/compiler/unnamed-interface-block-elem-uses-gl-prefix.vert

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:27:12 -07:00
Paul Berry
f2dd3a04ce glsl: Don't allow invalid identifiers as interface block names.
Note: we need to make an exception for the gl_PerVertex interface
block, since this is allowed to be redeclared.  Future patches will
make redeclaration of gl_PerVertex work properly.

Fixes piglit test
spec/glsl-1.50/compiler/interface-block-name-uses-gl-prefix.vert.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:27:10 -07:00
Paul Berry
9bb60a155f glsl: Don't allow unnamed interface blocks to redeclare variables.
Note: some limited amount of redeclaration is actually allowed,
provided the shader is redeclaring the built-in gl_PerVertex interface
block.  Support for this will be added in future patches.

Fixes piglit tests
spec/glsl-1.50/compiler/unnamed-interface-block-elem-conflicts-with-prev-{block-elem,global}.vert.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:27:08 -07:00
Paul Berry
1838df97a2 glsl: Refactor code to check that identifier names are valid.
GLSL reserves identifiers beginning with "gl_" or containing "__", but
we haven't been consistent about enforcing this rule.  This patch
makes a new function to check whether identifier names are valid.  In
the process it closes a loophole where we would previously allow
function argument names to contain "__".
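
The rule itself is small enough to sketch. The following Python function
only borrows the name validate_identifier() from this patch; its
signature, return convention, and messages are assumptions made for
illustration and do not mirror the C++ implementation.

```python
def validate_identifier(name):
    """Return an error message if `name` is reserved in GLSL, else None."""
    if name.startswith("gl_"):
        return f"identifier `{name}` uses reserved `gl_` prefix"
    if "__" in name:
        return f"identifier `{name}` uses reserved `__` string"
    return None


# Example: only the first two names are rejected.
for candidate in ("gl_MyVar", "my__var", "color", "_gl_like"):
    print(candidate, "->", validate_identifier(candidate))
```

Routing every declaration site through one such helper is what closes
gaps like the function-argument case mentioned above.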

v2: Rename check_valid_identifier() -> validate_identifier().  Add
curly braces in validate_identifier().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:27:05 -07:00
Paul Berry
6a157f2e33 glsl: Account for location field when comparing interface blocks.
In commit e2660770731b018411fbe1620cacddaf8dff5287 (glsl: Keep track
of location for interface block fields), I neglected to update
glsl_type::record_key_compare to account for the fact that interface
types now contain location information.  As a result, interface types
that differ only by their location information would not be properly
distinguished.

At the moment this is not a problem, because the only interface block
in which location information != -1 is gl_PerVertex, and gl_PerVertex
is always created in the same way.  However, in the patches that
follow, we'll be adding new ways to create gl_PerVertex (by
redeclaring it), so we'll need location information to be handled
properly.
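
To illustrate the problem, here is a toy Python model of a type-cache key
built with and without location information. The Field tuple, the
record_key() helper, and the location values 0 and 5 are all hypothetical;
the real comparison lives in glsl_type::record_key_compare.

```python
from typing import NamedTuple


class Field(NamedTuple):
    name: str
    type_name: str
    location: int = -1


def record_key(fields, compare_location):
    """Build a hashable key for an interface type from its fields."""
    if compare_location:
        return tuple(fields)
    # Behaviour before the fix in this model: location is ignored.
    return tuple((f.name, f.type_name) for f in fields)


# Two interface types that differ only in a field's location
# (the values are made up for the example).
a = [Field("gl_Position", "vec4", 0)]
b = [Field("gl_Position", "vec4", 5)]

same_without_location = record_key(a, False) == record_key(b, False)
same_with_location = record_key(a, True) == record_key(b, True)
```

Without the location in the key the two types collapse into one entry;
with it they stay distinct.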

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:27:03 -07:00
Paul Berry
5a234d92af glsl: Construct gl_PerVertex interfaces for GS and VS outputs.
Although these interfaces can't be accessed directly by GLSL (since
they don't have an instance name), they will be necessary in order to
allow redeclarations of gl_PerVertex.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:27:00 -07:00
Paul Berry
fb41f2c531 glsl: Refactor code for creating gl_PerVertex interface block.
Currently, we create just a single gl_PerVertex interface block for
geometry shader inputs.  In later patches, we'll also need to create
an interface block for geometry and vertex shader outputs.  Moving the
code into its own class will make reuse easier.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:26:58 -07:00
Paul Berry
d2e66b953e glsl: Fix block name of built-in gl_PerVertex interface block.
Previously, we erroneously used the name "gl_in" for both the block
name and the instance name.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:26:56 -07:00
Paul Berry
192d05f277 glsl: Construct gl_in with a location of -1.
We use a location of -1 for variables which don't have their own
assigned locations--this includes ir_variables which represent named
interface blocks.  Technically the location assigned to gl_in doesn't
matter, since gl_in is only accessed via its members (which have their
own locations).  But it's nice to be consistent.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-10 14:26:53 -07:00
Christian König
8bc7673ef8 radeon/winsys: fix handling in radeon_drm_cs_flush v2
Calling radeon_drm_cs_flush from multiple threads might cause deadlocks,
fix this by immediately signaling the semaphore after waiting for it.

This is a candidate for the stable branch(es).

Partially fixes: https://bugs.freedesktop.org/show_bug.cgi?id=70123

v2: some fixes on commit message

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-10-10 11:50:38 +02:00
José Fonseca
a922d3413f util: Fix MinGW build.
_GNU_SOURCE appears to not be used reliably.  Use _MSC_VER instead so
that MSVC alone is affected.
2013-10-09 21:17:53 -07:00
José Fonseca
1aef0ef277 llvmpipe: We don't use the draw pipeline for offset_point/line.
Unless the polygon fill mode is different from PIPE_POLYGON_MODE_FILL,
so checking the polygon mode is sufficient.

Testing done: no regression in polygon-mode-offset
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-10-09 21:09:07 -07:00
Roland Scheidegger
9b3dbaf396 gallivm: kill old per-quad face selection code
Not used in ages, and it wouldn't work at all with explicit derivatives now
(not that it did before as it ignored them but now the code would just use
the derivs pre-projected which would be quite random numbers).

v2: also get rid of 3 helper functions no longer used.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-10-10 04:32:57 +02:00
Roland Scheidegger
47d0613eb7 gallivm: handle explicit derivatives for cubemaps
They need some special handling. Quite complicated.
Additionally, use the same code for implicit derivatives too if no_rho_approx
and no_quad_lod is set, because it seems while generally it should be ok
to use per quad lod for implicit derivatives there's at least some test which
insists that in case of cubemaps the shared lod value MUST come from a pixel
inside the primitive (due to the derivatives becoming different if a different
larger major axis is chosen).

v2: based on Brian's feedback, clean up code a bit.
And use sign bit of major axis instead of pre-select s/t/r sign for coord
mirroring (which should be the same in the end, saves 2 ands).
Also fix two bugs with select/mirror of derivatives, the minor axes need to
use major axis sign as well (instead of major derivative axis sign), and
don't mistakenly use absolute values of major derivative and inverse major
values.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-10-10 04:32:57 +02:00
Roland Scheidegger
ce1d8634aa gallivm: ignore rho approximation for cube maps
There are two reasons for this:
1) even when ignoring rho approximation for cube maps, the result is still
not correct, but it's better as the max error at edges is now sqrt(2) instead
of 2 (which was a full mip level), same as it is for ordinary 2d maps when
doing rho approximations (so the error actually goes from factor 2 at edges and
sqrt(2) completely inside a face to sqrt(2) at edges and 0 inside a face).
2) I want to repurpose rho_no_approx for cubemaps for fully correct cubemap
derivatives (so don't need yet another debug var).
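
For readers wondering why a factor of 2 is "a full mip level": the level
of detail is the base-2 logarithm of rho, so a multiplicative error in rho
becomes an additive error in LOD. A small Python sketch of that
arithmetic, with a hypothetical helper name:

```python
import math


def lod_error_in_levels(rho_error_factor):
    """Mip-level error caused by rho being off by a multiplicative factor."""
    return math.log2(rho_error_factor)


full_level = lod_error_in_levels(2.0)              # error factor of 2
half_level = lod_error_in_levels(math.sqrt(2.0))   # error factor of sqrt(2)

print(full_level, half_level)
```

So going from a maximum factor of 2 to sqrt(2) at the edges halves the
worst-case LOD error, from one mip level to about half a level.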

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-10 04:32:57 +02:00
Paul Berry
15e05b999b glsl: Modify array_sizing_visitor to handle unnamed interface blocks.
We were already setting the array size of unsized arrays that appeared
inside unnamed interface blocks, but we weren't updating
ir_variable::interface_type to reflect the new array size, causing
bogus link errors.

This patch causes array_sizing_visitor to keep track of all the
unnamed interface types it sees, and the ir_variables corresponding to
each one.  After the visitor runs, a new function,
fixup_unnamed_interface_types(), adjusts each unnamed interface type
to correctly correspond with the array sizes in the ir_variables.

Fixes piglit tests:
- spec/glsl-1.50/execution/unsized-in-unnamed-interface-block-gs
- spec/glsl-1.50/execution/unsized-in-unnamed-interface-block-multiple

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-09 16:49:48 -07:00
Paul Berry
45e46b2e37 glsl: Update call_link_visitor to update max_ifc_array_access.
When multiple shaders of the same type access an interface block
containing an unsized array, we need to set the array size based on
the maximum array element accessed across all the shaders.  This is
similar to what we already do with unsized arrays occurring outside of
interface blocks.
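
As a rough illustration of the merge described above, the following
Python sketch combines per-shader maxima for each array in an interface
block and derives the implied sizes. The function names and the
dictionary representation are assumptions for the example, not the
call_link_visitor code.

```python
def merge_max_ifc_array_access(shaders):
    """Merge per-shader {field name: highest index accessed} maps."""
    merged = {}
    for accesses in shaders:
        for field_name, max_index in accesses.items():
            previous = merged.get(field_name, max_index)
            merged[field_name] = max(previous, max_index)
    return merged


def implied_sizes(merged):
    """An array whose highest accessed index is N needs N + 1 elements."""
    return {name: index + 1 for name, index in merged.items()}


# Two shaders of the same type touching the same interface block.
shaders = [
    {"a": 2},
    {"a": 5, "b": 1},
]
merged = merge_max_ifc_array_access(shaders)
sizes = implied_sizes(merged)
```

Here array a is sized from the second shader's access to element 5, even
though the first shader only reaches element 2.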

Note: one corner case is not yet addressed by these patches: the case
where one compilation unit defines an interface block containing
unsized arrays and another compilation unit defines the same interface
block containing sized arrays.

Fixes piglit test:
- spec/glsl-1.50/execution/unsized-in-named-interface-block-multiple

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-09 16:49:46 -07:00
Paul Berry
e226669eea glsl/linker: Modify array_sizing_visitor to handle named interface blocks.
Unsized arrays appearing inside named interface blocks now get a
proper size assigned by the array_sizing_visitor.

Fixes piglit tests:
- spec/glsl-1.50/execution/unsized-in-named-interface-block
- spec/glsl-1.50/execution/unsized-in-named-interface-block-gs
- spec/glsl-1.50/linker/unsized-in-named-interface-block
- spec/glsl-1.50/linker/unsized-in-named-interface-block-gs
- spec/glsl-1.50/linker/unsized-in-unnamed-interface-block-gs (*)

(*) is fixed by dumb luck--support for unsized arrays in unnamed
interface blocks will come in a later patch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-09 16:49:41 -07:00
Paul Berry
f878d2060c glsl: Update ir_variable::max_ifc_array_access properly.
This patch modifies update_max_array_access() so that it updates
ir_variable::max_ifc_array_access to reflect the shader's use of
arrays appearing within interface blocks.

v2: Use an ordinary function in ast_array_index.cpp rather than a
virtual function in ir_rvalue.  Avoid dereferencing NULL when handling
accesses to ordinary structs.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-09 16:49:38 -07:00
Paul Berry
ca8a5ce919 glsl: Sanity check max_ifc_array_access in ir_validate::visit(ir_variable *).
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-09 16:49:36 -07:00
Paul Berry
3f4292a6e3 glsl: Add an ir_variable::max_ifc_array_access field.
For interface blocks that contain arrays, this field will contain the
maximum element of each contained array that is accessed by the
shader.  This is a first step toward supporting unsized arrays in
interface blocks.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-09 16:49:31 -07:00
Paul Berry
22d3ef2df1 glsl: Make accessor functions for ir_variable::interface_type.
In a future patch, this will allow us to enforce invariants when the
interface type is updated.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-09 16:49:26 -07:00
Paul Berry
6f19e552af glsl: Move update of max_array_access into a separate function.
Currently, when converting an access to an array element from ast to
IR, we need to see if the array is an ir_dereference_variable, and if
so update the variable's max_array_access.

When we add support for unsized arrays in interface blocks, we'll also
need to account for cases where the array is an ir_dereference_record
and the record is an interface block.

To make this easier, move the update into its own function.

v2: Use an ordinary function in ast_array_index.cpp rather than a
virtual function in ir_rvalue.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-09 16:49:23 -07:00
Paul Berry
2f2f39c389 glsl: Add parser support for unsized arrays in interface blocks.
Although it's not explicitly stated in the GLSL 1.50 spec, unsized
arrays are allowed in interface blocks.

Section 1.2.3 (Changes from revision 5 of version 1.5) of the GLSL
1.50 spec says:

    * Completed full update to grammar section.  Tested spec examples
      against it:

      ...

      * add unsized arrays for block members

And section 7.1 (Vertex and Geometry Shader Special Variables)
includes an unsized array in the built-in gl_PerVertex interface
block:

    out gl_PerVertex {
        vec4 gl_Position;
        float gl_PointSize;
        float gl_ClipDistance[];
    };

Furthermore, GLSL 4.30 contains an example of an unsized array
occurring inside an interface block.  From section 4.3.9 (Interface
Blocks):

    uniform Transform {  // API uses "Transform[2]" to refer to instance 2
        mat4           ModelViewMatrix;
        mat4           ModelViewProjectionMatrix;
        vec4           a[];  // array will get implicitly sized
        float          Deformation;
    } transforms[4];

This patch adds the parser rule to support unsized arrays inside
interface blocks.  Later patches in the series will add the
appropriate semantics to handle them.

Fixes piglit tests:
- spec/glsl-1.50/execution/unsized-in-unnamed-interface-block
- spec/glsl-1.50/linker/unsized-in-unnamed-interface-block

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-09 16:49:21 -07:00
Paul Berry
8cf35c3d2f glsl: Rename the fourth argument to get_interface_instance.
Interface declarations have two names associated with them: the block
name and the instance name.  It's the block name that needs to be
passed to get_interface_instance().  This patch renames the argument
so that there's no confusion.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-09 16:49:16 -07:00
Kenneth Graunke
b330125790 i965/blorp: Allow format conversions for CopyTexSubImage.
BLORP performs blits by drawing a rectangle with a shader that samples
from the source texture, and writes color data to the destination.

The sampler always returns 32-bit RGBA float data, regardless of the
source format's component ordering or data type.  Likewise, the render
target write message takes 32-bit RGBA float data, and converts it
appropriately.  So the bulk of the work is already taken care of for us.

This greatly accelerates a lot of CopyTexSubImage calls, and makes
Legends of Aethereus playable on Ivybridge.  At the default settings,
LOA continually blits between SRGBA8888 (the window format) and
RGBA16_FLOAT.  Since neither BLORP nor our BLT paths supported this,
it fell back to meta, spending 33% of the CPU in floorf() converting
between floats and half-floats.

v2: Use != instead of ^ (suggested by Ian).  Note that only
    CopyTexSubImage is affected by this patch (caught by Eric).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-09 16:36:50 -07:00
Kenneth Graunke
72aade48fe i965/blorp: Rework sRGB override behavior.
The previous code for sRGB overrides assumes that the source and
destination formats are equal, other than the color space.  This won't
be feasible when we add support for format conversions.

Here are a few cases, and how the old code handled them:

1.  RGB8 -> SRGB8, MSAA     ==>   SRGB8 -> SRGB8
2.  RGB8 -> SRGB8, single   ==>    RGB8 -> RGB8
3. SRGB8 ->  RGB8, MSAA     ==>    RGB8 -> RGB8
4. SRGB8 ->  RGB8, single   ==>   SRGB8 -> SRGB8

Apparently, preserving the behavior of #1 is important.  When doing a
multisample to single-sample resolve, blending the samples together in
an sRGB correct fashion results in a noticeably higher quality image.
It also is necessary to pass Piglit's EXT_framebuffer_multisample
accuracy color tests.

Paul, Eric, Anuj, and I talked about this, and aren't sure that it
matters in the other cases.

This patch preserves the behavior of #1, but otherwise reverts to
doing everything in linear space, changing the behavior of case #4.
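
One reading of the four cases above, written as a small Python sketch. The function names and the plain strings used for formats are placeholders; the "old" rule is simply one rule that reproduces the four rows of the table, and the "new" rule covers only those same four cases, so neither should be taken as the actual BLORP code.

```python
def old_override(src, dst, msaa):
    # Read off the table: MSAA blits used the destination's format for
    # both sides, single-sample blits used the source's format.
    chosen = dst if msaa else src
    return (chosen, chosen)


def new_override(src, dst, msaa):
    # Keep the sRGB-correct multisample resolve of case 1 ...
    if msaa and src == "RGB8" and dst == "SRGB8":
        return ("SRGB8", "SRGB8")
    # ... and do everything else in linear space.
    return ("RGB8", "RGB8")


CASES = [
    ("RGB8", "SRGB8", True),    # case 1
    ("RGB8", "SRGB8", False),   # case 2
    ("SRGB8", "RGB8", True),    # case 3
    ("SRGB8", "RGB8", False),   # case 4
]
changed = [i + 1 for i, c in enumerate(CASES)
           if old_override(*c) != new_override(*c)]
```

Under these assumptions the only case whose result differs between the two rules is case 4, which matches the statement above.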

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-09 16:36:50 -07:00
Kenneth Graunke
0589eaecde i965/blorp: Explain why Z24 can't use a sensible format.
We could conceivably use BRW_SURFACEFORMAT_R24_UNORM_X8_TYPELESS for
Z24 source images, allowing conversions from Z24 to either Z16 or Z32F.

Unfortunately, we can't use it for destination images since it isn't
supported as a render target.

Using different formats for sources or destinations would be painful,
so for now, punt.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-09 16:36:50 -07:00
Kenneth Graunke
590d71791a i965/blorp: Use R32_FLOAT for Z32F surfaces.
Currently, all that matters is that we copy the correct number of bits,
so any format that has 32 bits of data will work fine.

Once BLORP begins handling format conversions, the sampler will need to
correctly interpret the data.  We don't need a depth format, but we do
need the right number of components and data type (FLOAT).

For Z32F, this means using R32_FLOAT.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-09 16:36:49 -07:00
Kenneth Graunke
4dc25b7615 i965/blorp: Use R16_UNORM for Z16 surfaces.
Currently, all that matters is that we copy the correct number of bits,
so any format that has 16 bits of data will work fine.

Once BLORP begins handling format conversions, the sampler will need to
correctly interpret the data.  We don't need a depth format, but we do
need the right number of components and data type (UNORM).

For Z16, this means using R16_UNORM.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-09 16:36:49 -07:00
Kenneth Graunke
6f7c41dd1d i965/blorp: Add support for non-render-target formats.
Once blorp gains the ability to do format conversions, it's conceivable
that the source format may be texturable but not supported as a render
target.  This would break Paul's code, which assumes that it can use the
render_target_format array even for the source format.

There are three ways to convert MESA_FORMAT enums to BRW_SURFACEFORMAT
enums:

1. brw_format_for_mesa_format()

   This translates the Mesa format to the most equivalent BRW format.

2. brw->render_target_format[]

   This is used for renderbuffers, and handles the subset of formats
   that are renderable.  However, it's not always equivalent, since
   it overrides a few non-renderable formats.  For example, it
   converts B8G8R8X8_UNORM to B8G8R8A8_UNORM so it can be rendered to.

3. translate_tex_format()

   This is used for textures.  It wraps brw_format_for_mesa_format(),
   but overrides depth textures, and one sRGB case on Gen4.

BLORP has a fourth function, which uses brw->render_target_format[]
and overrides depth formats (differently than translate_tex_format).

This patch makes the BLORP function use brw_format_for_mesa_format()
for textures/source data, since not everything will be a render target.
It continues using brw->render_target_format[] for render targets, since
it needs the format overrides that table provides.

We don't use translate_tex_format() since the additional overrides are
not useful or simply redundant.
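
The selection described above might be sketched like this in Python. The tables are tiny hypothetical stand-ins keyed by surface-format name; the only real entry is the X8-to-A8 override mentioned earlier, and the depth-format overrides are left out entirely.

```python
# Hypothetical stand-in for the overrides baked into
# brw->render_target_format[]; only this one entry is from the text above.
RENDER_TARGET_OVERRIDES = {"B8G8R8X8_UNORM": "B8G8R8A8_UNORM"}


def format_for_mesa_format(fmt):
    # Stand-in for brw_format_for_mesa_format(): the most equivalent
    # format, with no overrides applied.
    return fmt


def render_target_format(fmt):
    # Stand-in for the brw->render_target_format[] lookup.
    return RENDER_TARGET_OVERRIDES.get(fmt, fmt)


def blorp_surface_format(fmt, is_render_target):
    # Destinations need the renderable overrides; sources are sampled as
    # textures and keep the plain translation.
    if is_render_target:
        return render_target_format(fmt)
    return format_for_mesa_format(fmt)
```

With these stand-ins, a B8G8R8X8_UNORM source stays B8G8R8X8_UNORM, while the same format used as a destination becomes B8G8R8A8_UNORM.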

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-09 16:36:49 -07:00
Kenneth Graunke
4b2e819e10 i965/blorp: Add an is_render_target parameter to surface_info::set.
This allows us to determine whether we're setting up a format for
the source (as a texture) or destination (as a render target).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-09 16:36:49 -07:00
José Fonseca
dbc1f3677c util/u_math: Fix C++ include of u_math.h on MSVC.
GNU C++ compiler declares the C99 lrint, etc. when _GNU_SOURCE is
defined, but MSVC does not.

Trivial.
2013-10-10 00:31:53 +01:00
Zack Rusin
edde6c77bd llvmpipe: abstract the code to set number of subpixel bits
As we're moving towards expanding the number of subpixel
bits and the width of the variables used in the computations
we need to make this code a bit more centralized.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-09 18:30:31 -04:00
Zack Rusin
87fe4a33d3 llvmpipe: implement 64 bit mul opcodes in llvmpipe
Both the imul_hi and umul_hi are working with this patch.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-09 18:30:27 -04:00
Zack Rusin
6905698fc2 gallium: Add support for 32x32 muls with 64 bit results
The code introduces two new 32bit integer multiplication opcodes which
can be used to produce correct 64 bit results. GLSL, OpenCL and D3D10+
require them. We use two separate opcodes, because they match the
behavior of GLSL and OpenCL, are a lot easier to add than a single
opcode with multiple destinations and because there's not much (any)
difference wrt code-generation.
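
As an illustration of what "high half" opcodes compute, here is a reference model in Python. It is a sketch, not the gallium implementation: the function names merely echo the umul_hi/imul_hi naming, and values are passed around as 32-bit patterns.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Reinterpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mul_lo(a, b):
    """Low 32 bits of the product (identical for signed and unsigned)."""
    return (a * b) & MASK32


def umul_hi(a, b):
    """High 32 bits of the unsigned 64-bit product."""
    return ((a & MASK32) * (b & MASK32)) >> 32


def imul_hi(a, b):
    """High 32 bits of the signed 64-bit product, as a 32-bit pattern."""
    return ((to_signed32(a) * to_signed32(b)) >> 32) & MASK32
```

Pairing an ordinary 32-bit multiply with one of these gives the full 64-bit result: for unsigned inputs, (umul_hi(a, b) << 32) | mul_lo(a, b) equals a * b.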

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-09 18:30:20 -04:00
Zack Rusin
c01c6a95b4 gallivm: support printing of 64 bit integers
Only 8 and 32 bit integers were supported before.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-10-09 18:29:05 -04:00
Eric Anholt
58bab95c95 i965/blorp: Fix the register types on blorp's push constants.
The UD values were getting set up as floats.  This happened to work out
because they were used as the second argument where the first was a dword,
and gen6+ doesn't do source conversions.  But it did trigger fulsim
warnings, and it meant if you used the push constant as the first operand
you would have been disappointed.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-09 11:43:46 -07:00
Eric Anholt
8da15d7544 i965: Fix 3D texture layout by more literally copying from the spec.
Fixes 3 texelFetch tests in piglit all.tests on ivb, and cubemap npot on gm45.

v2: Don't forget the gen4 DL=6 cubemap behavior.

Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v1)
2013-10-09 11:28:19 -07:00
Eric Anholt
bfe6e5dda5 mesa: Fix compiler warnings when ALIGN's alignment is "1 << value".
We hadn't run into order of operation warnings before, apparently, since
addition is so low on the order.
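
The precedence hazard can be shown with a small sketch. The message does not quote the macro, so the unparenthesized expansion below is an assumption about what an ALIGN(value, 1 << shift) call could expand to; Python is used because it shares C's rule that + and - bind tighter than <<.

```python
def align_unparenthesized(value, shift):
    # Textual expansion of a hypothetical macro whose 'alignment'
    # argument is not wrapped in parentheses:
    #   (value + alignment - 1) & ~(alignment - 1)
    # with alignment spelled as "1 << shift".
    return (value + 1 << shift - 1) & ~(1 << shift - 1)


def align_parenthesized(value, shift):
    # The same formula with the argument evaluated as a unit.
    alignment = 1 << shift
    return (value + alignment - 1) & ~(alignment - 1)
```

For value 5 and shift 4 the parenthesized form rounds up to 16, while the unparenthesized text parses as (5 + 1) << (4 - 1) and yields 48.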

Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-09 11:28:19 -07:00
Eric Anholt
791550aa8e i965: Don't forget the cube map padding on gen5+.
We had a fixup for gen4's 3d-layout cubemaps (which, iirc, we'd
experimentally found to be necessary!), but while the spec still requires
it on gen5, we'd been missing it in the array-layout cubemaps.

Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-09 11:28:19 -07:00
Gaetan Nadon
e6fb744141 egl/main: remove undefined X11_LIBS automake variable
The EGL library has some references to x11 but it gets the link flags
from the XCB_DRI2_LIBS if and only if HAVE_EGL_PLATFORM_X11 is true.

The X11_LIBS variable was probably coming from a PKG_CHECK_MODULES (x11)
earlier in history.

If it is possible to have HAVE_EGL_DRIVER_GLX without HAVE_EGL_PLATFORM_X11
then the link flags for libX11 should be passed. However, it won't come
from X11_LIBS which is undefined.

Reported-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Gaetan Nadon <memsize@videotron.ca>
2013-10-09 10:36:01 -04:00
Gaetan Nadon
bc93c3798a gallium/state_trackers/glx: X11/Xlib.h: No such file or directory
The compiler cannot find the Xlib.h in the installed system headers.
All supplied include directives point to inside the mesa module.
The X11_CFLAGS variable is undefined (not defined in config.status).

It appears the intent was to use X11_INCLUDES defined in configure.ac.

The Xlib.h file is not installed on my workstation. It is supplied in
the libx11-dev package. This allows an X developer control over which
version of this file is used for X development.

Use to test: --enable-gallium-egl --enable-xlib-glx --disable-dri

Acked-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gaetan Nadon <memsize@videotron.ca>
2013-10-09 10:28:12 -04:00
Gaetan Nadon
54b028ba89 gallium/targets/libgl-xlib: X11/Xlib.h: No such file or directory
The compiler cannot find the Xlib.h in the installed system headers.
All supplied include directives point to inside the mesa module.
The X11_CFLAGS variable is undefined (not defined in config.status).

It appears the intent was to use X11_INCLUDES defined in configure.ac.

The Xlib.h file is not installed on my workstation. It is supplied in
the libx11-dev package. This allows an X developer control over which
version of this file is used for X development.

Acked-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gaetan Nadon <memsize@videotron.ca>
2013-10-09 10:24:35 -04:00
Gaetan Nadon
d901d7e08e gallium/state_trackers/egl: use X11_INCLUDES rather than X11_CFLAGS
The X11_CFLAGS variable is undefined (not defined in config.status).
It appears the intent was to use X11_INCLUDES defined in configure.ac.
It is used for building the code in the x11 subdir.

The build does not fail on this one as LIBDRM_CFLAGS happens to have
the same includedir value as the one for X11. It will not always be the case.
The option --enable-gallium-egl is required during configuration.

Acked-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Gaetan Nadon <memsize@videotron.ca>
2013-10-09 10:23:00 -04:00
Grigori Goronzy
bd19e25703 st/vdpau: really block until surface is idle
pipe_screen::fence_finish with zero timeout returns quickly and
doesn't wait at all. Fix that, and also delete the fence afterwards,
so that QuerySurfaceStatus returns the right state later.

Addresses:
https://trac.videolan.org/vlc/ticket/9281
https://bugs.freedesktop.org/show_bug.cgi?id=68792

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-10-09 13:02:40 +02:00
Grigori Goronzy
48563bd45c st/vdpau: add new formats to OutputSurface rendering
OutputSurfaces have simple YCbCr rendering functionality built in,
but so far only 4:2:0 subsampling worked correctly. This fixes 4:2:2
and 4:4:4 formats.

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-10-09 13:02:40 +02:00
Grigori Goronzy
1a5bac2149 st/vdpau: fix GenerateCSCMatrix with NULL procamp
As per API specification, it is legal to supply a NULL procamp. In this
case, a CSC matrix according to the colorspace should be generated,
but no further adjustments are made.

Addresses:
https://trac.videolan.org/vlc/ticket/9281
https://bugs.freedesktop.org/show_bug.cgi?id=68792

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-10-09 13:02:40 +02:00
Grigori Goronzy
5b4e2db12d radeon/uvd: disable VC-1 simple/main profile
It doesn't work (decodes to garbage) with most videos on UVD 3.0. Worse
yet, it often results in random memory corruption or GPU hangs. Rumor
has it only the newest UVD hardware could do it anyway.

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-10-09 13:02:40 +02:00
Grigori Goronzy
5403dd4b68 radeon/uvd: try to fix VC-1 decoding
The DPB size calculations seem to be off; there is various random
corruption happening, even with advanced profile. Always assuming
a minimum number of references appears to fix it, similarly to
H.264. This might overallocate the DPB.  Also clean up the SPS/PPS
field setup so that it matches VC-1 specifications better.

With these changes, all advanced profile VC-1 files I could get my
hands on work fine.

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-10-09 13:02:40 +02:00
Grigori Goronzy
0bb05484bf radeon/uvd: fix video format reporting
UVD can only support NV12 in the case of hardware decoding, but we
can still use all other formats for software decoding. Use the UNKNOWN
profile to signal that we're not interested in hardware decoding.

v2: use profile instead of entrypoint

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-10-09 13:02:40 +02:00
Marek Olšák
c207fa6c18 gallium/dri targets: use DRI_DRIVER_LDFLAGS
which contains -Wl,-Bsymbolic. If I understand it correctly, it prevents
symbols from clashing if multiple drivers are loaded at the same time.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-09 12:04:38 +02:00
Marek Olšák
6b7c039dc2 radeonsi: fix occlusion queries for CIK
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-10-09 11:44:48 +02:00
Marek Olšák
ec922ef987 radeonsi: draw register fixes for CIK
This doesn't fix any known issue. I'm just following the docs.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-10-09 11:44:48 +02:00
Chia-I Wu
a26e17a365 i965: keep SecHalf flag after register coalescing
Copy sechalf to the new register, otherwise we would read wrong HW registers.

Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-09 14:49:11 +08:00
Chia-I Wu
3db52b6e36 i965: allow SIMD8 sampler messages in SIMD16 mode
When the instruction to send the sampler message is forced uncompressed or
sechalf, send a SIMD8 message even in SIMD16 mode.

Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-09 14:49:11 +08:00
Chia-I Wu
44f0777f17 i965: make BRW_COMPRESSION_2NDHALF valid for brw_SAMPLE
SIMD8 sampler messages are allowed in SIMD16 mode, and they could not work
without BRW_COMPRESSION_2NDHALF.  Later PRMs (gen5 and later) do not
explicitly state whether BRW_COMPRESSION_2NDHALF is allowed, but they do have
examples using send with SecHalf.  It should be safe to assume SecHalf is
valid.

Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-09 14:49:11 +08:00
Vinson Lee
1176a3aac6 i965: Initialize brw_blorp_const_color_program::prog_data.
Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-08 22:10:46 -07:00
Eric Anholt
8c197d4aae i965: Fix a compiler warning about conservative depth enums.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-10-08 14:34:35 -07:00
Paul Berry
d14fcd7db7 i965/gs: Fixup gl_PointSize on entry to geometry shaders.
gl_PointSize is stored in the w component of VARYING_SLOT_PSIZ, but
the geometry shader infrastructure assumes that it should look for all
geometry shader inputs of type float in the x component.  So when
compiling a geometry shader that uses a gl_PointSize input, fix it up
during the shader prolog by moving the w component to the x component.

This is similar to how we emit fixups and workarounds for vertex
shader attributes.

Fixes piglit test spec/glsl-1.50/execution/geometry/core-inputs.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-08 12:44:24 -07:00
Bryan Cain
8f758b0b92 glsl/gs: handle gl_ClipDistance geometry input in lower_clip_distance.
This corresponds to the lowering of gl_ClipDistance to
gl_ClipDistanceMESA for vertex and geometry shader outputs.  Since
this lowering pass occurs after lower_named_interface_blocks, it deals
with 2D arrays (gl_ClipDistance[vertex][clip_plane]) rather than 1D
arrays in an interface block
(gl_in[vertex].gl_ClipDistance[clip_plane]).
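
The index arithmetic for the 2D case can be sketched as below. This assumes the pass packs four clip distances into each vec4 of gl_ClipDistanceMESA; the function name and the tuple it returns are illustrative only.

```python
def lower_clip_distance_index(vertex, clip_plane):
    """Map gl_ClipDistance[vertex][clip_plane] to
    (vertex, vec4 index, component) in the lowered array."""
    return (vertex, clip_plane // 4, clip_plane % 4)


# Clip plane 5 of vertex 2 lands in the second vec4, component 1 (.y).
lowered = lower_clip_distance_index(2, 5)
```

The vertex index passes through untouched; only the inner clip_plane index is split into a vec4 index and a component.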

v2 (Paul Berry <stereotype441@gmail.com>): Fix indexing order for
gl_ClipDistance input lowering.  Properly lower bulk assignment of
gl_ClipDistance inputs.  Rework for GLSL 1.50 style geometry shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

v3 (Paul Berry <stereotype441@gmail.com>): Add comments and assertions
to clarify that the 2D version of clip distance is only used for
geometry shader inputs.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-08 12:44:21 -07:00
Paul Berry
c09adcb21b glsl/gs: add gl_in support to builtin_variables.cpp.
Previously, builtin_variables.cpp was written assuming that we
supported ARB_geometry_shader4 style geometry shader inputs, meaning
that each built-in varying input to a geometry shader was supplied via an
array variable whose name ended in "In", e.g. gl_PositionIn or
gl_PointSizeIn.

However, in GLSL 1.50 style geometry shaders, things work
differently--built-in inputs are supplied to geometry shaders via a
built-in interface block called gl_in, which contains all the built-in
inputs using their usual names (e.g. the gl_Position input is supplied
to the geometry shader as gl_in[i].gl_Position).

This patch adds the necessary logic to builtin_variables.cpp to create
the gl_in interface block and populate it accordingly for geometry
shader inputs.  The old ARB_geometry_shader4 style varyings are
removed, though they can easily be added back in the future if we
decide to support ARB_geometry_shader4.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-08 12:44:19 -07:00
Paul Berry
378ff1dbac glsl: Keep track of location for interface block fields.
This patch adds a "location" element to struct glsl_struct_field, so
that we can keep track of the gl_varying_slot associated with each
built-in geometry shader input.

In lower_named_interface_blocks, we use this value to populate the
"location" field in the ir_variable that stores each geometry shader
input.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-10-08 12:44:01 -07:00
Adam Jackson
e166a58c43 glx: Generate fewer errors in MakeContextCurrent
For a few reasons.

1: In the (current) common case, these conditionals are never true. All
we're doing by checking them is slowing down MakeCurrent.  The server
does these checks already anyway.

2: GLX >= 3.0 contexts may legally be made current without a bound
framebuffer.

This does not fix piglit/glx-create-context-current-no-framebuffer, but
is a prerequisite for fixing it.

Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-10-08 13:24:20 -04:00
Adam Jackson
d101204c23 glx: Propagate failures from SendMakeCurrentRequest where possible
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-10-08 13:24:20 -04:00
Adam Jackson
68412d5006 glx: Hide xGLXMakeCurrentReply inside SendMakeCurrentRequest
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-10-08 13:24:20 -04:00
Marek Olšák
15a201c610 st/dri: don't export any private symbols
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-08 16:23:52 +02:00
Marek Olšák
085e5adede gallium/swrast: don't export any private symbols
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-08 16:23:52 +02:00
Marek Olšák
c787a9767c gallium/radeon: don't export any private symbols
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-08 16:23:52 +02:00
Marek Olšák
790c8a2405 configure.ac: report an error if LLVM shared libs are disabled and CL is enabled
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-08 16:23:52 +02:00
Marek Olšák
e9c9d28203 st/mesa: improve format selection for GLES
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2013-10-08 16:23:04 +02:00
Stéphane Marchesin
20bf508a42 i915g: Rename sampler to fragment_sampler
Otherwise it is fairly confusing.
2013-10-07 20:53:55 -07:00
Stéphane Marchesin
8c6594074e i915g: Fix the sampler bind function
The new sampler bind sends us NULL samplers, so we need to count
the number of valid samplers ourselves. This fixes ~500 piglit
regressions from the sampler rework.

While we're at it, let's also support start.
2013-10-07 20:51:53 -07:00
Chad Versace
6cd1da8377 gen7: Use logical, not physical, dims in 3DSTATE_DEPTH_BUFFER (v2)
In 3DSTATE_DEPTH_BUFFER, we set Width and Height to the miptree slice's
physical dimensions. (Logical and physical dimensions may differ for
multisample surfaces).

However, in SURFACE_STATE, we always set Width and Height to the slice's
logical dimensions. We should do the same for 3DSTATE_DEPTH_BUFFER,
because the hw docs say so.

No Piglit regressions (-x glx -x glean) on Ivybridge with Wayland.

v2: No Piglit regressions, for real this time.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-10-07 11:55:24 -07:00
Chad Versace
ccad802ed5 doxygen: Generate Doxygen for i965
Now, one can do the following to generate and read the i965 Doxygen:

  cd $MESA_TOP/doxygen
  make
  firefox i965/index.html

Reviewed-by: Frank Henigman <fjhenigman@google.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-10-07 11:55:16 -07:00
Matt Turner
b645913ff6 i965: Remove the "ARF" register file.
The registers in the architecture register file don't share much in
common, so there's no point in grouping them together. Use the HW_REG
class instead. The vec4 backend already does this.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-07 11:38:52 -07:00
Matt Turner
e7dc88026a i965: Fixup for don't dead-code eliminate instructions that write to the accumulator.
Accidentally pushed an old version of the patch.

v2: Set destination register using brw_null_reg().
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-07 11:38:15 -07:00
Matt Turner
c4e6569fc8 i965: Generate code for ir_binop_imul_high.
v2: Make accumulator's type match the type of the operation. Noticed by
    Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-07 10:43:19 -07:00
Matt Turner
85154241d6 i965: Use the multiplication result's type for the accumulator.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-07 10:43:19 -07:00
Matt Turner
6ff8f06308 i965/fs: Disable CSE on instructions writing to HW_REG.
CSE would otherwise combine the two mul(8) emitted by [iu]mulExtended:

	mul(8)  acc0 x y
	mach(8) null x y
	mov(8)  lsb  acc0
	...
	mul(8)  acc0 x y
	mach(8) msb  x y
Into:
	mul(8)  temp x y
	mov(8)  acc0 temp
	mach(8) null x y
	mov(8)  lsb  acc0
	...
	mov(8)  acc0 temp
	mach(8) msb  x y

But mul(8) into the accumulator produces more than 32-bits of precision,
which is required and lost if multiplying into a general register and
moving to the accumulator.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-07 10:43:19 -07:00
Matt Turner
06e41a02a3 glsl: Implement [iu]mulExtended() built-ins for ARB_gpu_shader5.
These built-ins have two "out" parameters, which makes implementing them
efficiently with our current compiler infrastructure difficult. Instead,
implement them in terms of the existing ir_binop_mul IR (to return the
low 32-bits) and a new ir_binop_mul64 which returns the high 32-bits.

v2: Rename mul64 -> imul_high as suggested by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-07 10:43:19 -07:00
Matt Turner
69909c866b i965: Add Gen assertion checks for newer instructions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-07 10:43:19 -07:00
Matt Turner
92dc16c3e2 i965: Don't dead-code eliminate instructions that write to the accumulator.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-07 10:41:17 -07:00
Matt Turner
014cce3dc4 i965: Generate code for ir_binop_carry and ir_binop_borrow.
Using the ADDC and SUBB instructions on Gen7.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-07 10:41:17 -07:00
Matt Turner
4ec37317c5 i965: Add UD null register helpers.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-07 10:41:16 -07:00
Matt Turner
6f9428eb68 glsl: Implement usubBorrow() built-in for ARB_gpu_shader5.
i965 implements this with a single (multiple destination) instruction,
SUBB. Emitting SUBB directly from usubBorrow() would be ideal, but our
optimization passes don't know how to cope with expressions with
side-effects.

Radeon has an SUBB_UINT instruction that only generates the borrow
bit. I've chosen to go this route and implement usubBorrow() by doing the
subtraction and the borrow operations separately.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-07 10:41:16 -07:00
Matt Turner
6c125973f3 glsl: Implement uaddCarry() built-in for ARB_gpu_shader5.
i965 implements this with a single (multiple destination) instruction,
ADDC. Emitting ADDC directly from uaddCarry() would be ideal, but our
optimization passes don't know how to cope with expressions with
side-effects.

Radeon has an ADDC_UINT instruction that only generates the carry
bit. I've chosen to go this route and implement uaddCarry() by doing the
addition and the carry operations separately.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-07 10:41:16 -07:00
Matt Turner
499d7a7f6e glsl: Add ir_binop_carry and ir_binop_borrow.
Calculates the carry out of the addition of two values and the
borrow from subtraction respectively. Will be used in uaddCarry() and
usubBorrow() built-in implementations.
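
A reference model of the two operations on 32-bit unsigned values, and of how the built-ins can be composed from them, might look like this. It is a Python sketch with made-up function names, not the IR or the built-in implementation.

```python
MASK32 = 0xFFFFFFFF


def carry(x, y):
    """1 if x + y overflows 32 bits, else 0."""
    return 1 if (x & MASK32) + (y & MASK32) > MASK32 else 0


def borrow(x, y):
    """1 if x - y underflows (x < y as unsigned), else 0."""
    return 1 if (x & MASK32) < (y & MASK32) else 0


def uadd_carry(x, y):
    # The sum and the carry are computed as two independent expressions.
    return ((x + y) & MASK32, carry(x, y))


def usub_borrow(x, y):
    # Likewise for the difference and the borrow.
    return ((x - y) & MASK32, borrow(x, y))
```

Because each half is a side-effect-free expression of the same two operands, neither needs a second destination.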

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-07 10:41:16 -07:00
Ian Romanick
ae514416b2 glsl_compiler: Enable any extension that any Mesa driver enables
The only GLSL extension that is not enabled is AMD_vertex_shader_layer.
I think the standalone-compiler could enable this (as shading language
support is complete), but no driver enables it.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-07 09:59:23 -07:00
Ian Romanick
136568ea18 glsl_compiler: Sort extensions by name
Makes it a little easier to see which ones are missing.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-07 09:59:23 -07:00
Ian Romanick
587cd971c8 glsl_compiler: Always log the compiler diagnostics
Not just when there's an error.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-07 09:59:23 -07:00
Ian Romanick
3646d65f6a glsl_compiler: Set max GLSL version on the command line
Infer whether or not to use ES based on the GLSL version (100 or 300 are
for ES).  This replaces the --glsl-es command line option.  Set various
compiler limits based on the minimums required for the specified GLSL
version.
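
The ES inference described above amounts to a small predicate. The function name below is hypothetical; the sketch only encodes the rule stated in the message, which works because 100 and 300 are not desktop GLSL version numbers.

```cpp
// Hypothetical helper: treat 100 and 300 as GLSL ES versions, everything
// else as desktop GLSL.
constexpr bool glsl_version_is_es_sketch(unsigned version)
{
    return version == 100 || version == 300;
}
```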

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-07 09:59:23 -07:00
Ian Romanick
257db619c6 glsl_compiler: Use no_argument instead of 0 in getopt_long options
The choices aren't just 0 and 1, so using the enum names is much more
clear.
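
For readers unfamiliar with the enum names, the sketch below shows a getopt_long() table written with no_argument and required_argument. The option names and the surrounding struct are assumptions made for illustration; this is not the actual glsl_compiler option table.

```cpp
#include <cassert>
#include <cstdlib>
#include <getopt.h>

// Hypothetical parsed-options struct for this sketch.
struct cli_options_sketch {
    int dump_ast = 0;
    int glsl_version = 0;
};

inline cli_options_sketch parse_cli_sketch(int argc, char **argv)
{
    static int dump_ast_flag;
    dump_ast_flag = 0;

    // has_arg uses the symbolic names rather than bare 0/1/2.
    const struct option long_opts[] = {
        { "dump-ast", no_argument,       &dump_ast_flag, 1   },
        { "version",  required_argument, nullptr,        'v' },
        { nullptr,    0,                 nullptr,        0   }, // terminator
    };

    cli_options_sketch out;
    optind = 1; // start scanning at the first real argument
    int c;
    while ((c = getopt_long(argc, argv, "", long_opts, nullptr)) != -1) {
        if (c == 'v')
            out.glsl_version = std::atoi(optarg);
        // Flag-style options return 0 and store into dump_ast_flag.
    }
    out.dump_ast = dump_ast_flag;
    return out;
}
```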

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-07 09:59:23 -07:00
Ian Romanick
75e9bd13c4 glsl_compiler: Re-enable building glsl_compiler
This allows application developers to use Mesa's compiler as a
standalone validator for their shaders.

This is mostly a revert of commit 569f0e4.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-07 09:59:23 -07:00
Ian Romanick
5d6b0e7f1b glsl: Remove glsl_parser_state MaxVaryingFloats field
Pull the data directly from the context like the other varying related
limits.  The parser state shadow copies were added back when the parser
state didn't have a pointer to the context.  There's no reason to do it
nowadays.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-07 09:59:23 -07:00
Ian Romanick
7db50171be glsl: Set gl_MaxVertexOutputs from VertexProgram.MaxOutputComponents etc
gl_MaxVertexOutputVectors => ctx->Const.VertexProgram.MaxOutputComponents
gl_MaxFragmentInputVectors => ctx->Const.FragmentProgram.MaxInputComponents

v2: Add types so that the code compiles.  Pointed out by Brian.

v3: Leave gl_MaxVaryingFloats et al. as-is.  Suggested by Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com> [v2]
Reviewed-by: Marek Olšák <marek.olsak@amd.com> [v2]
Reviewed-by: Paul Berry <stereotype441@gmail.com> [v2]
2013-10-07 09:59:23 -07:00
Ian Romanick
42305fb502 glsl: Count shader inputs and outputs separately
Starting with OpenGL 3.2, input limits and output limits for stages may
not match.  This means they need to be accounted for separately.

No piglit regressions.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-07 09:59:23 -07:00
Emilio Pozuelo Monfort
d4b5bc62af glapi: add output info to GetProgramiv's params
Signed-off-by: Emilio Pozuelo Monfort <emilio.pozuelo@collabora.co.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-07 09:06:33 -07:00
Laurent Carlier
72465fcf57 clover: fix building with llvm-3.4 since rev191922
http://llvm.org/viewvc/llvm-project?view=revision&revision=191922
2013-10-07 08:41:02 -07:00
Brian Paul
e58dd465f0 st/mesa: silence warning about unhandled ir_query_levels in switch 2013-10-07 09:08:16 -06:00
Christian König
289d928c8e radeon/vdpau: only export necessary symbols
Export only the absolutely necessary symbols in radeon vdpau targets.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-10-07 11:16:53 +02:00
Christian König
731f5471fb radeon/uvd: optimize message handling a bit
No need to keep a copy of the message in system memory anymore,
since it should now be in GART memory on newer chips.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-10-07 11:16:53 +02:00
Kenneth Graunke
cfbfb50cb8 docs: Mark a few more things as "in progress" in GL3.txt. 2013-10-06 13:58:53 -07:00
Ilia Mirkin
7178d6ac59 dri/nouveau: add AllocTextureImageBuffer implementation
This fixes issues where get_rt_format would see a 0 format because the
nouveau_surface had not been properly initialized. Fixes crash on
supertuxkart startup (which still fails due to out-of-vram issues).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
2013-10-06 12:59:18 -07:00
Francisco Jerez
b3c04362b4 glsl: Fix usage of the wrong union member in program_resource_visitor::recursion.
In the array-of-struct case, recursion() takes the row_major flag for
each iteration from 't->fields.structure[i]', but 't' is not a record
type.  Inherit the array declaration row_major flag instead.

This mistake was found by running piglit on valgrind.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69449
Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-06 12:55:14 -07:00
Marek Olšák
373f8670d1 Revert "r600g: only flush the caches that need to be flushed during CP DMA operations"
This reverts commit 7948ed1250.

It caused graphical corruption. I've got no idea why.

Bugzilla:
https://bugs.freedesktop.org/show_bug.cgi?id=70042
https://bugs.freedesktop.org/show_bug.cgi?id=68451

Conflicts:
	src/gallium/drivers/r600/evergreen_hw_context.c
	src/gallium/drivers/r600/r600_hw_context.c
	src/gallium/drivers/r600/r600_pipe.h
2013-10-06 03:13:48 +02:00
Chris Forbes
2656c6118b i965/ivb: Flag RG32F quirk for texture gather regardless of swizzles
As of ARB_gpu_shader5, textureGather doesn't always read the
post-swizzle RED channel -- so we can't just look at the red swizzle
state.

Theoretically we could only flag the quirk if *some* green swizzle is in
use, but that's probably more trouble than it's worth.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-06 11:25:14 +13:00
Chris Forbes
e8ec2e0344 i965/vs: Add support for textureGather(.., comp)
- For HSW: Select the channel based on the component selected (swizzle
  is done in HW)
- For IVB: Select the channel based on the swizzle state for the
  component selected. Only apply the RG32F w/a if we actually want
  green -- we're about to flag it regardless of swizzle state.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-06 11:25:11 +13:00
Chris Forbes
09c6fd450d i965/fs: Add support for textureGather(.., comp)
- For HSW: Select the channel based on the component selected (swizzle
  is done in HW)
- For IVB: Select the channel based on the swizzle state for the
  component selected. Only apply the RG32F w/a if we actually want
  green -- we're about to flag it regardless of swizzle state.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-06 11:25:03 +13:00
Chris Forbes
7335bc7526 glsl: add ARB_gpu_shader5's additional textureGather signatures
- gsampler2DRect support
- optional `comp` parameter

Future patches will add shadow sampler support and
textureGatherOffsets().

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-06 11:13:17 +13:00
Chris Forbes
88ee9bc9d1 glsl: Add support for specifying the component in textureGather
ARB_gpu_shader5 introduces new variants of textureGather* which have an
explicit component selector, rather than relying purely on the sampler's
swizzle state.

This patch adds the GLSL plumbing for the extra parameter.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-06 11:12:29 +13:00
Chris Forbes
f93a63bfcc docs: mark ARB_conservative_depth done on i965
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2013-10-06 11:05:37 +13:00
Chris Forbes
7ec4668696 i965: Enable ARB_conservative_depth for Gen7+.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-06 11:05:35 +13:00
Chris Forbes
4697955c5b i965/wm: Program correct conservative depth modes
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-06 11:05:10 +13:00
Brian Paul
64b1a1d459 docs: rephrase 9.2.1, 9.1.7 news item
Both are bug-fix releases, not new development releases.
2013-10-05 14:25:25 -06:00
Brian Paul
21315bfb71 docs: add the MD5 sums for the 9.2.1 and 9.1.7 releases 2013-10-05 14:20:37 -06:00
Timothy Arceri
c70e2471dc docs: Mark off KHR_debug, update relnotes
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-05 11:41:05 -07:00
Chris Forbes
84e1a396ec i965/vs: add missing break between ir_query_levels and ir_tg4 cases
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2013-10-05 23:18:45 +13:00
Chris Forbes
2beb60c4e7 docs: Mark off ARB_texture_query_levels, update relnotes
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2013-10-05 19:16:33 +13:00
Chris Forbes
317e172677 i965: enable ARB_texture_query_levels on Gen6+
Theoretically would work on Gen5 as well but requires GLSL 1.30, which
is not (yet) enabled by default there.

V2: Enable for Gen5 conditionally on GLSL version.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-05 19:16:33 +13:00
Chris Forbes
4be21a07ea i965/vs: implement ir_query_levels
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-05 19:16:33 +13:00
Chris Forbes
fa6440acdb i965/fs: implement ir_query_levels
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-05 19:16:33 +13:00
Chris Forbes
7480ae3cb8 i965: ignore all texturing opcodes without a coordinate, for cubemap normalize
Previously we special-cased textureSize() but this is the more correct
condition.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-05 19:16:33 +13:00
Chris Forbes
7a4754d7d9 glsl: add plumbing for GL_ARB_texture_query_levels
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-05 19:16:32 +13:00
Chris Forbes
6ce4e7672e mesa: add plumbing for GL_ARB_texture_query_levels
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-05 19:16:32 +13:00
Carl Worth
30e6501820 docs: Add release notes for 9.1.7 release
Including a news item.
2013-10-04 21:58:51 -07:00
Carl Worth
058fa59d6b docs: Add release notes and NEWS item for 9.2.1 release
Better late than never, right?
2013-10-04 21:58:51 -07:00
Alexander von Gluck IV
765baec8f7 haiku: Ensure correct libraries are referenced. 2013-10-04 18:20:09 -05:00
Alexander von Gluck IV
a4144af400 haiku: Clean up code, use target-helpers
* Thanks for the help xexaxo!
2013-10-04 18:20:09 -05:00
Alexander von Gluck IV
4d15ef5121 haiku: Drop haiku-softpipe.c; fix extern C
* It isn't needed any longer as we're
  moving in the code that called it.
* The winsys code is C, so make sure
  we include the header in the extern C
2013-10-04 18:20:09 -05:00
Alexander von Gluck IV
bc2fb19773 haiku: Correct Haiku softpipe library
* Use LoadableModule vs SharedLibrary
2013-10-04 18:20:09 -05:00
Alexander von Gluck IV
8730236d1a haiku: Add first Haiku renderer (softpipe)
* This shared library gets parsed by the
  system as a system "add-on"
2013-10-04 18:20:09 -05:00
Alexander von Gluck IV
c9f1217e1f haiku: Build Haiku's libGL from within Mesa
* This in essence means that Mesa would be
  taking control of Haiku's OpenGL kit.
* This works by dispatching renderers from the
  OpenGL add-ons directory
2013-10-04 18:20:09 -05:00
Vinson Lee
1349766612 glsl: Define isnormal for Oracle Solaris Studio.
This patch fixes this Oracle Solaris Studio build error.

"../../src/glsl/ir_constant_expression.cpp", line 1398: Error: The function "isnormal" must have a prototype.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-10-04 15:37:33 -07:00
Grigori Goronzy
8419c5c3ce r600g: texture offsets for non-TXF instructions
All texture instructions can use offsets, not just TXF. Offsets into
the literals array were wrong, too.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2013-10-04 22:44:47 +02:00
Marek Olšák
c04b8d1dab r600g: remove an assertion causing a crash at context cleanup
Compute samplers are advertised, but not implemented.
I think that's intentional.
2013-10-04 20:01:51 +02:00
Marek Olšák
eda1f2aa12 r300g: remove unused function r300_lacks_vertex_textures 2013-10-04 20:01:48 +02:00
Ian Romanick
0667e2c969 mesa: Don't return any data for GL_SHADER_BINARY_FORMATS
We return 0 for GL_NUM_SHADER_BINARY_FORMATS, so
GL_SHADER_BINARY_FORMATS should not write any data to the application
buffer.

Fixes piglit test 'arb_get_program_binary-overrun shader'.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-04 10:08:45 -07:00
Brian Paul
a50c5f8d24 svga: fix incorrect memcpy src in svga_buffer_upload_piecewise()
As we march over the source buffer we're uploading in pieces, we
need to memcpy from the current offset, not the start of the buffer.
Fixes graphical corruption when drawing very large vertex buffers.
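
The class of bug described above can be shown with a generic chunked copy. This is a hedged sketch, not the svga code: the function name, the staging buffer, and the chunk size parameter are all hypothetical. The point is only that each piece must be read from the current offset into the source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Copy `size` bytes from src to dst through a staging buffer of `chunk` bytes.
inline void upload_piecewise_sketch(const unsigned char *src, unsigned char *dst,
                                    std::size_t size, std::size_t chunk)
{
    assert(chunk > 0);
    std::vector<unsigned char> staging(chunk);
    std::size_t offset = 0;
    while (offset < size) {
        std::size_t n = std::min(chunk, size - offset);
        // Read from src + offset; reading from src would repeat the first piece.
        std::memcpy(staging.data(), src + offset, n);
        std::memcpy(dst + offset, staging.data(), n);
        offset += n;
    }
}
```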

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matthew McClure <mcclurem@vmware.com>
2013-10-04 10:25:37 -06:00
Matthew McClure
d164d50a85 util: when packing depth values, round to nearest.
This patch adds the lrint, lrintf, llrint, and llrintf rounding utility
functions. When packing unorm depth values, we will round to nearest.
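
The difference between truncating and rounding to nearest can be sketched for a 16-bit unorm depth value. The function names are hypothetical, and the sketch uses the standard std::lrint rather than the utility functions this patch adds.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Round-to-nearest packing (under the default floating-point rounding mode).
inline uint16_t pack_z16_unorm_sketch(float z)
{
    assert(z >= 0.0f && z <= 1.0f);
    return static_cast<uint16_t>(std::lrint(z * 65535.0f));
}

// Truncating packing, shown only for contrast.
inline uint16_t pack_z16_unorm_truncate_sketch(float z)
{
    assert(z >= 0.0f && z <= 1.0f);
    return static_cast<uint16_t>(z * 65535.0f);
}
```

With truncation, any scaled value just below an integer is pushed down a full step, so the packed depth is biased toward zero; rounding keeps the error within half a step in either direction.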

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-10-04 10:55:51 +01:00
Tom Stellard
b280516e11 radeonsi/compute: Fix segfault caused by recent refactoring
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-10-03 17:29:54 -07:00
Brian Paul
b181be6266 radeonsi: Fix build
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

https://bugs.freedesktop.org/show_bug.cgi?id=70106
2013-10-03 17:29:42 -07:00
Emil Velikov
757ec72b23 configure: set HAVE_COMMON_DRI when building only swrast
With commit cb1febb07, I have incorrectly removed HAVE_COMMON_DRI
assuming that swrast does not need to build the translations for
driconf options, as effectively swrast/drisw does not use them.

With the incoming unification work of dri and drisw, it makes
sense just to revert the offending hunk.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70057
Reported-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-03 16:52:38 -07:00
Brian Paul
99a471c67b radeonsi/compute: fix bind_compute_sampler_states() breakage
Remove the assignment and the no-op function.
2013-10-03 17:32:40 -06:00
Paul Berry
800610f9eb i965/fs: Improve accuracy of dFdy() to match dFdx().
Previously, we computed dFdy() using the following instruction:

  add(8) dst<1>F src<4,4,0>F -src.2<4,4,0>F { align1 1Q }

That had the disadvantage that it computed the same value for all 4
pixels of a 2x2 subspan, which meant that it was less accurate than
dFdx().  This patch changes it to the following instruction when
c->key.high_quality_derivatives is set:

  add(8) dst<1>F src<4,4,1>.xyxyF -src<4,4,1>.zwzwF { align16 1Q }

This gives it comparable accuracy to dFdx().
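
The two instructions can be modelled on a single 4-element subspan. This is a sketch under assumptions: the type and function names are hypothetical, and elements 0..3 simply stand for the x, y, z, w channels named in the swizzles above, without claiming which pixel each one is.

```cpp
#include <array>
#include <cassert>

using subspan_sketch = std::array<float, 4>;

// Old form: element 0 minus element 2, replicated to all four pixels.
inline subspan_sketch dfdy_coarse_sketch(const subspan_sketch &s)
{
    float d = s[0] - s[2];
    return { d, d, d, d };
}

// New form: src.xyxy - src.zwzw, so each column gets its own difference.
inline subspan_sketch dfdy_fine_sketch(const subspan_sketch &s)
{
    return { s[0] - s[2], s[1] - s[3], s[0] - s[2], s[1] - s[3] };
}
```

For an input such as {1, 2, 4, 8} the coarse form yields -3 everywhere, while the fine form yields -3 for one column and -6 for the other.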

Unfortunately, align16 instructions can't be compressed, so in SIMD16
shaders, instead of emitting this instruction:

  add(16) dst<1>F src<4,4,1>.xyxyF -src<4,4,1>.zwzwF { align16 1H }

We need to unroll to two instructions:

  add(8) dst<1>F src<4,4,1>.xyxyF -src<4,4,1>.zwzwF { align16 1Q }
  add(8) (dst+1)<1>F (src+1)<4,4,1>.xyxyF -(src+1)<4,4,1>.zwzwF { align16 2Q }

Fixes piglit test spec/glsl-1.10/execution/fs-dfdy-accuracy.

Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-10-03 13:49:15 -07:00
Brian Paul
9267565ee4 gallium/tests: fix SHADER typo 2013-10-03 14:24:55 -06:00
Emil Velikov
13895abd86 gallium-egl: use standard variable types over EGLBoolean/EGLint
The interface/prototype in native_wayland_bufmgr.h uses boolean/int, as
does the rest of the file. Convert to improve consistency and to
prevent gcc compiler warnings due to type mismatch.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-03 14:05:29 -06:00
Brian Paul
379deaf5c6 gallium: remove old bind_*_sampler_states() functions
The new bind_sampler_states() function takes a shader argument to
specify the shader stage.
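
A toy model of the idea, one entry point that takes the shader stage as an argument instead of one function per stage, might look like the following. Every name here is a hypothetical stand-in; this is not the gallium pipe_context definition.

```cpp
#include <array>
#include <cassert>

// Hypothetical stage enum and slot limit for this sketch.
enum shader_stage_sketch {
    STAGE_VERTEX_SKETCH,
    STAGE_GEOMETRY_SKETCH,
    STAGE_FRAGMENT_SKETCH,
    STAGE_COMPUTE_SKETCH,
    STAGE_COUNT_SKETCH
};
constexpr unsigned MAX_SAMPLERS_SKETCH = 16;

struct context_sketch {
    // One sampler table per stage, all slots initially null.
    std::array<std::array<void *, MAX_SAMPLERS_SKETCH>, STAGE_COUNT_SKETCH> samplers{};

    // Single entry point; the stage selects which table is updated.
    void bind_sampler_states(shader_stage_sketch stage, unsigned start,
                             unsigned count, void **states)
    {
        assert(start + count <= MAX_SAMPLERS_SKETCH);
        for (unsigned i = 0; i < count; i++)
            samplers[stage][start + i] = states[i];
    }
};
```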
2013-10-03 14:05:29 -06:00
Brian Paul
55e81b06e7 gallium/docs: update bind_sampler_states() documentation 2013-10-03 14:05:28 -06:00
Brian Paul
1e2fbf2657 cso: make sure all sampler states are set/cleared 2013-10-03 14:05:28 -06:00
Brian Paul
7d7a9714d2 freedreno: use new bind_sampler_states() function 2013-10-03 14:05:28 -06:00
Brian Paul
88b17a15f3 svga: don't hook in old bind_fragment_sampler_states() functions 2013-10-03 14:05:28 -06:00
Brian Paul
27c054edf0 radeon: don't use old bind_vertex/fragment_sampler_states() hooks 2013-10-03 14:05:28 -06:00
Brian Paul
1e8d3eb08d i915g: remove old bind_vertex/fragment_sampler_states() hooks 2013-10-03 14:05:28 -06:00
Brian Paul
edd9af675c noop: remove old bind_*_sampler_states() functions 2013-10-03 14:05:28 -06:00
Brian Paul
f233ee0cd6 galahad: remove old bind_*_sampler_states() functions 2013-10-03 14:05:28 -06:00
Brian Paul
d0520d5bf6 vl: remove old bind_fragment_sampler_states() calls 2013-10-03 14:05:28 -06:00
Brian Paul
3925e521d6 util: remove old bind_fragment_sampler_states() calls from blitter code 2013-10-03 14:05:28 -06:00
Brian Paul
9fa6722a68 draw: remove use of old bind_fragment_sampler_states() 2013-10-03 14:05:28 -06:00
Brian Paul
7478236da9 nouveau: remove old bind_*_sampler_states() functions 2013-10-03 14:05:28 -06:00
Brian Paul
1446600d1a cso: remove use of old bind_*_sampler_states() functions 2013-10-03 14:05:28 -06:00
Brian Paul
bcf7508a7d rbug: remove old bind_*_sampler_states() functions 2013-10-03 14:05:28 -06:00
Brian Paul
22480c5b5b identity: remove old bind_*_sampler_states() functions 2013-10-03 14:05:28 -06:00
Brian Paul
dd4816e3fd trace: remove old bind_*_sampler_states() functions 2013-10-03 14:05:28 -06:00
Brian Paul
5807105ad7 ilo: don't hook up old bind_*_sampler_states() functions 2013-10-03 14:05:28 -06:00
Brian Paul
2d0effaa10 llvmpipe: remove old bind_*_sampler_states() functions 2013-10-03 14:05:27 -06:00
Brian Paul
6e640545ac softpipe: remove old bind_*_sampler_states() functions 2013-10-03 14:05:27 -06:00
Brian Paul
93e6694f2c clover: remove bind_compute_sampler_states() calls 2013-10-03 14:05:27 -06:00
Brian Paul
a5350a9f3e gallium/tests: use pipe_context::bind_sampler_states() 2013-10-03 14:05:27 -06:00
Brian Paul
bc367ab54d gallium/tools: update dump_state.py to use bind_sampler_states() 2013-10-03 14:05:27 -06:00
Brian Paul
3f0627c2ad nouveau: implement pipe_context::bind_sampler_states() 2013-10-03 14:05:27 -06:00
Brian Paul
550f9ee64c softpipe: implement pipe_context::bind_sampler_states() 2013-10-03 14:05:26 -06:00
Brian Paul
8280b29d7c radeon: implement pipe_context::bind_sampler_states() 2013-10-03 14:05:26 -06:00
Brian Paul
0de99d52b7 svga: implement pipe_context::bind_sampler_states() 2013-10-03 14:05:26 -06:00
Brian Paul
6ef9fc791e trace: implement pipe_context::bind_sampler_states() 2013-10-03 14:05:26 -06:00
Brian Paul
e64112b1f9 rbug: implement pipe_context::bind_sampler_states() 2013-10-03 14:05:26 -06:00
Brian Paul
bd1514849b noop: implement pipe_context::bind_sampler_states() 2013-10-03 14:05:26 -06:00
Brian Paul
c772338488 llvmpipe: implement pipe_context::bind_sampler_states() 2013-10-03 14:05:26 -06:00
Brian Paul
41a9be70e4 ilo: implement pipe_context::bind_sampler_states() 2013-10-03 14:05:26 -06:00
Brian Paul
9564ec8317 identity: implement pipe_context::bind_sampler_states() 2013-10-03 14:05:26 -06:00
Brian Paul
aec11d48cf i915g: implement pipe_context::bind_sampler_states() 2013-10-03 14:05:26 -06:00
Brian Paul
e5d000c3f1 galahad: implement pipe_context::bind_sampler_states() 2013-10-03 14:05:26 -06:00
Brian Paul
4bdf7d3842 clover: use pipe_context::bind_sampler_states() if non-null 2013-10-03 14:05:26 -06:00
Brian Paul
96b9c09495 vl: use pipe_context::bind_sampler_states() if non-null 2013-10-03 14:05:26 -06:00
Brian Paul
bbc1fd8c80 util: use pipe_context::bind_sampler_states() if non-null 2013-10-03 14:05:26 -06:00
Brian Paul
27d500a844 draw: use pipe_context::bind_sampler_states() if non-null 2013-10-03 14:05:26 -06:00
Brian Paul
5cba8725a4 cso: use pipe_context::bind_sampler_states() if non-null 2013-10-03 14:05:26 -06:00
Brian Paul
755d788fe2 gallium: add pipe_context::bind_sampler_states()
The bind_vertex/geometry/fragment/compute_sampler_states() functions
will be replaced by a single function.
2013-10-03 14:05:26 -06:00
Brian Paul
9b99451da2 r300g: rename r300_bind_sampler_states to r300_bind_fragment_sampler_states 2013-10-03 14:05:26 -06:00
Brian Paul
c368479e38 draw: rename bind_sampler_states variables
Put 'fragment' in the names.  In preparation for upcoming function
renaming.
2013-10-03 14:05:25 -06:00
Marek Olšák
c7d91a6f13 r600g: fix initialization of non_disp_tiling flag
This fixes a regression caused by e64633e8c3
2013-10-03 18:30:49 +02:00
Marek Olšák
b893bbf438 r600g,radeonsi: create aux_context last
This fixes a regression caused by 68f6dec32e.
2013-10-03 18:30:49 +02:00
Marek Olšák
52bfe8e0f6 r300g/swtcl: don't call draw_prepare_shader_outputs 2013-10-03 18:30:49 +02:00
Brian Paul
bde5b626c2 st/mesa: silence warning about unhandled enum in switch statement 2013-10-03 09:14:03 -06:00
Chris Forbes
d133592619 mesa: fix make check for ARB_texture_gather
Clean up inconsistency in enum decoration:
- Use the undecorated enums where possible.
- MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB remains decorated, since it
  has no undecorated equivalent in GL4.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70054
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-03 21:38:48 +13:00
Chris Forbes
61519f15ac docs: Mark off ARB_texture_gather 2013-10-03 07:58:12 +13:00
Chris Forbes
88f196ab6e i965/hsw: Apply gather4 RG32F w/a using SCS instead of shader.
The new surface channel select bits allow us to avoid having to
recompile the shader for this workaround.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-03 07:56:40 +13:00
Chris Forbes
7df985ad47 i965: Enable ARB_texture_gather on Gen7
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-03 07:56:37 +13:00
Chris Forbes
dd4c2a516c i965: use gather slots in the binding table for gather4.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-03 07:56:34 +13:00
Chris Forbes
c08f2083ee i965: Emit a second set of SURFACE_STATE for gather4 from textures.
This allows us to use a different surface format for gather4, which is
required for R32G32_FLOAT to work on Gen7.

V4: - Only emit alternate surface state for shaders which will actually
      use it.
    - Pass a simple 'for_gather' flag rather than a function pointer.
      The callee can decide what w/a to apply.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-03 07:56:29 +13:00
Chris Forbes
5901d48b41 i965: make room in the binding table for a full alternate set of surface_states
Worst-case is that *every* texunit uses a format that needs overriding.

V4: Place the gather slots last, so shaders which don't use gather don't
    get penalized by having a huge binding table.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-03 07:56:26 +13:00
Chris Forbes
855b2a8f4a i965: Add BRW_SURFACEFORMAT_R32G32_FLOAT_LD, required for IVB gather4 w/a
gather4 GREEN channel against a surface with format R32G32_FLOAT doesn't work
correctly on IVB. w/a from bspec:

   - use R32G32_FLOAT_LD = 0x97 instead, for gather4 only.
   - select BLUE channel to read GREEN

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-03 07:56:23 +13:00
Chris Forbes
cfa3c8a0d3 i965: w/a for gather4 green RG32F
V4: Only flag quirks if there are any uses of gather in the shader,
    to avoid spurious recompiles just because someone happened to use
    RG32F.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-03 07:56:20 +13:00
Chris Forbes
36e25ccd29 glsl: flag shaders which use gather4 at all
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-03 07:56:02 +13:00
Chris Forbes
4ed3930f97 i965/vs: Add support for ir_tg4
Pretty much the same as the FS case. Channel select goes in the header.

V2: Less mangling.
V3: Avoid sampling at all, for degenerate swizzles.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-03 07:55:59 +13:00
Chris Forbes
942a4ec18f i965/fs: Add support for ir_tg4
Lowers ir_tg4 (from textureGather and textureGatherOffset builtins) to
SHADER_OPCODE_TG4.

The usual post-sampling swizzle workaround can't work for ir_tg4,
so avoid doing that:

* For R/G/B/A swizzles use the hardware channel select (lives in the
   same dword in the header as the texel offset), and then don't do
   anything afterward in the shader.
* For 0/1 swizzles blast the appropriate constant over all the output
   channels instead of sampling.

V2: Avoid duplicating header enabling block
V3: Avoid sampling at all, for degenerate swizzles.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-03 07:55:56 +13:00
Chris Forbes
fb455500bf i965: add SHADER_OPCODE_TG4
Adds the Gen7 message IDs, a new SHADER_OPCODE_TG4 pseudo-op, and
low-level support for emitting it via generate_tex().

V3: Updated for changes in master.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-03 07:55:55 +13:00
Maxence Le Dore
18002d9eda glsl: add texture gather changes
V2 [Chris Forbes]:
   - Add new pattern, fixup parameter reading.

V3: Rebase onto new builtins machinery

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-03 07:55:54 +13:00
Maxence Le Dore
d3575622b7 mesa: add texture gather changes
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-03 07:55:51 +13:00
Chris Forbes
0d7fc10bcd i965: fix bogus swizzle in brw_cubemap_normalize
When used with a cube array in VS, failed assertion in ir_validate:

   Assignment count of LHS write mask channels enabled not
   matching RHS vector size (3 LHS, 4 RHS).

To fix this, swizzle the RHS correctly for the writemask.

This showed up in the ARB_texture_gather tests, which exercise cube
arrays in the VS.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-03 07:54:53 +13:00
Vincent Lejeune
4e4c32ba11 r600/llvm: Adds support for MSAA 2013-10-02 17:30:21 +02:00
Vincent Lejeune
8edbd7609b r600g/llvm: Undef z and w component of 2D TXP inst 2013-10-02 17:30:14 +02:00
Vincent Lejeune
9f183eb7de r600g/llvm: fix txq for texture buffer 2013-10-02 17:30:07 +02:00
Chia-I Wu
848c0e72f3 i965: compute DDX in a subspan based only on top row
Consider only the top-left and top-right pixels to approximate DDX in a 2x2
subspan, unless the application requests a more accurate approximation via
GL_FRAGMENT_SHADER_DERIVATIVE_HINT or this optimization is disabled from the
new driconf option disable_derivative_optimization.

This results in a less accurate approximation.  However, it improves the
performance of Xonotic with Ultra settings by 24.3879% +/- 0.832202% (at 95.0%
confidence) on Haswell.  No noticeable image quality difference observed.

The improvement comes from faster sample_d.  It seems, on Haswell, some
optimizations are introduced to allow faster sample_d when all pixels in a
subspan have the same derivative.  I considered SAMPLE_STATE too, which allows
one to control the quality of sample_d on Haswell.  But it gave much worse
image quality without giving better performance compared to this change.

No piglit quick.tests regression on Haswell (tested with v1).

v2: better guess for precompile program key

Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-10-02 15:26:40 +08:00
Chris Forbes
72edba1659 i965/blorp: Use passed in framebuffer rather than ctx->DrawBuffer
We have the destination framebuffer object passed in; there's no need to
go digging around in the context.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-10-02 18:31:24 +13:00
Francisco Jerez
ef8cc3e51f ralloc: Remove the rzalloc-based new/delete operator definition macro.
Using it encourages the (IMHO worrying) practice of leaving member
variables uninitialized in constructor definitions.  This macro
shouldn't be necessary anymore after the last patch series fixing all
its users to initialize all member variables from the class
constructor.  Remove it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-01 17:39:45 -07:00
Francisco Jerez
fcbbecb9bc st/mesa: Switch glsl_to_tgsi_instruction to the non-zeroing allocator.
All member variables of glsl_to_tgsi_instruction are already being
initialized from its implicitly defined constructor, it's not
necessary to use rzalloc to allocate its memory.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-01 17:30:52 -07:00
Francisco Jerez
03d46344df mesa/program: Switch ir_to_mesa_instruction to the non-zeroing allocator.
All member variables of ir_to_mesa_instruction are already being
initialized from its implicitly defined constructor, it's not
necessary to use rzalloc to allocate its memory.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-01 17:30:52 -07:00
Francisco Jerez
23e8673afb i965: Switch vec4_live_variables to the non-zeroing allocator.
All member variables of vec4_live_variables are already being
initialized from its constructor, it's not necessary to use rzalloc to
allocate its memory, and doing so makes it more likely that we will
start relying on the allocator to zero out all memory if the class is
ever extended with new member variables.

That's bad because it ties objects to some specific allocation scheme,
and gives unpredictable results when an object is created with a
different allocator -- Stack allocation, array allocation, or
aggregation inside a different object are some of the useful
possibilities that come to my mind.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-01 17:30:52 -07:00
Francisco Jerez
c307d27c5e i965: Switch fs_live_variables to the non-zeroing allocator.
All member variables of fs_live_variables are already being
initialized from its constructor, it's not necessary to use rzalloc to
allocate its memory, and doing so makes it more likely that we will
start relying on the allocator to zero out all memory if the class is
ever extended with new member variables.

That's bad because it ties objects to some specific allocation scheme,
and gives unpredictable results when an object is created with a
different allocator -- Stack allocation, array allocation, or
aggregation inside a different object are some of the useful
possibilities that come to my mind.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-01 17:30:52 -07:00
Francisco Jerez
ced327ec64 i965: Switch fs_inst to the non-zeroing allocator.
All member variables of fs_inst are already being initialized from its
constructor, it's not necessary to use rzalloc to allocate its memory,
and doing so makes it more likely that we will start relying on the
allocator to zero out all memory if the class is ever extended with
new member variables.

That's bad because it ties objects to some specific allocation scheme,
and gives unpredictable results when an object is created with a
different allocator -- Stack allocation, array allocation, or
aggregation inside a different object are some of the useful
possibilities that come to my mind.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-01 17:30:51 -07:00
Francisco Jerez
a5d843ebdf i965: Switch ip_record to the non-zeroing allocator.
All member variables of ip_record are already being initialized from
its constructor, so it's not necessary to use rzalloc to allocate its
memory.  Doing so makes it more likely that we will start relying on
the allocator to zero out all memory if the class is ever extended
with new member variables.

That's bad because it ties objects to some specific allocation scheme,
and gives unpredictable results when an object is created with a
different allocator -- Stack allocation, array allocation, or
aggregation inside a different object are some of the useful
possibilities that come to my mind.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-01 17:30:51 -07:00
Francisco Jerez
ddd694293a i965: Initialize all member variables of cfg_t on construction.
The cfg_t object relies on the memory allocator zeroing out its
contents before it's initialized, which is quite an unusual practice
in the C++ world because it ties objects to some specific allocation
scheme, and gives unpredictable results when an object is created with
a different allocator -- Stack allocation, array allocation, or
aggregation inside a different object are some of the useful
possibilities that come to my mind.  Initialize all fields from the
constructor and stop using the zeroing allocator.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-01 17:30:51 -07:00
Francisco Jerez
fde23b61a9 i965: Initialize all member variables of bblock_t on construction.
The bblock_t object relies on the memory allocator zeroing out its
contents before it's initialized, which is quite an unusual practice
in the C++ world because it ties objects to some specific allocation
scheme, and gives unpredictable results when an object is created with
a different allocator -- Stack allocation, array allocation, or
aggregation inside a different object are some of the useful
possibilities that come to my mind.  Initialize all fields from the
constructor and stop using the zeroing allocator.

v2: Use zero initialization for numeric types instead of default construction.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-01 17:30:51 -07:00
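The pattern behind the cfg_t and bblock_t commits above can be sketched as follows. This is only an illustration: `block_info` and `block_pair` are hypothetical stand-ins, and their fields are not taken from the Mesa sources. Every member is set in the constructor, with zero for numeric types and NULL for pointers, so the object is well defined on the stack, in an array, or inside another object.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a class like bblock_t; the field names are
// illustrative only and are not taken from the Mesa sources.
struct block_info {
   // Numeric members are zero-initialized and the pointer is set to NULL
   // explicitly, so nothing depends on the allocator clearing memory.
   block_info() : start_ip(0), end_ip(0), parent(NULL) {}

   int start_ip;
   int end_ip;
   block_info *parent;
};

// Aggregation inside another object works without a zeroing allocator.
struct block_pair {
   block_info first;
   block_info second;
};

// Returns true when every field of the block still has its initial value.
static bool is_pristine(const block_info &b)
{
   return b.start_ip == 0 && b.end_ip == 0 && b.parent == NULL;
}
```

With this layout a plain `block_info b;` on the stack starts out in the same state as one placed in zeroed memory.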
Francisco Jerez
58d772cb41 glsl: Switch ast_type_qualifier to the non-zeroing allocator.
All member variables of ast_type_qualifier are already being
initialized from its implicitly defined constructor, so it's not
necessary to use rzalloc to allocate its memory.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-01 17:30:51 -07:00
Francisco Jerez
8bd1c69f3b glsl: Switch ast_node to the non-zeroing allocator.
All member variables of ast_node are already being initialized from
its constructor, but some of its derived classes were leaving members
uninitialized -- Fix them.

Using rzalloc makes it more likely that we will start relying on the
allocator to zero out all memory if the class is ever extended with
new member variables.  That's bad because it ties objects to some
specific allocation scheme, and gives unpredictable results when an
object is created with a different allocator -- Stack allocation,
array allocation, or aggregation inside a different object are some of
the useful possibilities that come to my mind.

v2: Use NULL initialization instead of default construction for pointers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-01 17:30:51 -07:00
Francisco Jerez
70953b5fea i965: Initialize all member variables of vec4_instruction on construction.
The vec4_instruction object relies on the memory allocator zeroing out
its contents before it's initialized, which is quite an unusual
practice in the C++ world because it ties objects to some specific
allocation scheme, and gives unpredictable results when an object is
created with a different allocator -- Stack allocation, array
allocation, or aggregation inside a different object are some of the
useful possibilities that come to my mind.  Initialize all fields from
the constructor and stop using the zeroing allocator.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-01 17:30:51 -07:00
Francisco Jerez
43bf36b080 glsl: Initialize all member variables of _mesa_glsl_parse_state on construction.
The _mesa_glsl_parse_state object relies on the memory allocator
zeroing out its contents before it's initialized, which is quite an
unusual practice in the C++ world because it ties objects to some
specific allocation scheme, and gives unpredictable results when an
object is created with a different allocator -- Stack allocation,
array allocation, or aggregation inside a different object are some of
the useful possibilities that come to my mind.  Initialize all fields
from the constructor and stop using the zeroing allocator.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-01 17:30:51 -07:00
Francisco Jerez
0e72db9f97 mesa: Fix misplaced includes of "main/uniforms.h".
Several C++ source files include "main/uniforms.h" from an extern "C"
block, which is both unnecessary, because "uniforms.h" already checks
for a C++ compiler and sets the right linkage, and incorrect, because
the header file includes other C++ headers ("glsl_types.h" and
"ir_uniform.h") that are supposed to get C++ linkage.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-10-01 17:30:51 -07:00
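A rough single-file sketch of the linkage point made in the "main/uniforms.h" commit above. The names `c_api_add` and `twice` are hypothetical, and the two "headers" are simulated inline; none of this is Mesa's actual header content. A header that guards its own declarations with `extern "C"` can be included at file scope, while C++-only declarations such as templates must stay outside any `extern "C"` block.

```cpp
#include <cassert>

// --- contents of a hypothetical header "c_api.h" ---
// The header selects C linkage itself, so C++ callers can include it at
// file scope without wrapping the #include in their own extern "C".
#ifdef __cplusplus
extern "C" {
#endif

int c_api_add(int a, int b);

#ifdef __cplusplus
}
#endif
// --- end of "c_api.h" ---

// --- contents of a hypothetical C++-only header ---
// Templates cannot have C linkage, which is why a header like this must
// not end up inside an extern "C" block.
template <typename T>
static T twice(T value)
{
   return value + value;
}
// --- end of the C++-only header ---

// Definition of the C function; it keeps the C linkage declared above.
int c_api_add(int a, int b)
{
   return a + b;
}
```

Moving the `twice` template inside the `extern "C"` braces would be rejected by the compiler, which is the kind of failure the misplaced includes risked.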
Grigori Goronzy
6349b3235c st/egl: flush resources before presentation
Fixes regression on r600g due to fast clear introduced by commit
edbbfac6.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2013-10-01 21:42:02 +02:00
Paul Berry
d99b5b2d82 i965/gs: Fix incorrect numbering of DWORDs in 3DSTATE_GS
In commit 247f90c77e (i965/gs: Set
control data header size/format appropriately for EndPrimitive()), I
incorrectly numbered the DWORDs in the 3DSTATE_GS command starting
from 1 instead of starting from 0.  This caused the control data
format to be programmed into the wrong DWORD, resulting in corruption
in some geometry shaders that used an output type of points.

This patch numbers the DWORDs starting from 0, as we do for all other
commands, which causes the control data format to be programmed into
the correct DWORD.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-10-01 11:06:17 -07:00
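To illustrate the DWORD numbering issue from the 3DSTATE_GS commit above, here is a small sketch using an entirely hypothetical command layout (the length, DWORD index, and bit shift below are made up, not the real 3DSTATE_GS encoding). With DWORDs counted from 0, a field documented as living in DWORD n is written to array index n, and DWORD 0 is the command header.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical layout: a 4-DWORD command whose format field lives in
// DWORD 1, counting from 0 (DWORD 0 is the command header).
static const unsigned CMD_DWORDS = 4;
static const unsigned FORMAT_DWORD = 1;
static const unsigned FORMAT_SHIFT = 4;

// Fills dw[0..CMD_DWORDS-1].  Because DWORDs are numbered from 0, the
// documented DWORD number is used directly as the array index.
static void pack_command(uint32_t *dw, uint32_t header, uint32_t format)
{
   for (unsigned i = 0; i < CMD_DWORDS; i++)
      dw[i] = 0;

   dw[0] = header;
   assert(FORMAT_DWORD < CMD_DWORDS);
   dw[FORMAT_DWORD] |= format << FORMAT_SHIFT;
}
```

Numbering the same field from 1 while still indexing the array from 0 would place the bits one DWORD too late, which matches the kind of misprogramming the commit describes.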
Brian Paul
6659131be3 mesa: check for bufSize > 0 in _mesa_GetSynciv()
The spec doesn't say GL_INVALID_VALUE should be raised for bufSize <= 0.
In any case, memcpy(len < 0) will lead to a crash, so don't allow it.

CC: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-01 10:10:01 -06:00
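A minimal sketch of the bufSize guard described in the _mesa_GetSynciv() commit above. The function `copy_values` and the `query_status` codes are hypothetical; real GL code would record an error on the context rather than return a status. The point is that a non-positive size is rejected before it can reach memcpy(), where a negative count would be converted to a huge unsigned length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical status codes; real GL code would record an error on the
// context instead of returning a status.
enum query_status { QUERY_OK, QUERY_INVALID_VALUE };

// Copies at most bufSize ints from src to dst and reports how many were
// written.  A non-positive bufSize is rejected before memcpy() so that a
// negative count is never converted to a huge size_t.
static query_status copy_values(const int *src, int available,
                                int bufSize, int *dst, int *written)
{
   assert(available >= 0);

   if (bufSize <= 0)
      return QUERY_INVALID_VALUE;

   const int count = available < bufSize ? available : bufSize;
   std::memcpy(dst, src, static_cast<std::size_t>(count) * sizeof(int));

   if (written)
      *written = count;
   return QUERY_OK;
}
```

On the rejected path neither `dst` nor `written` is touched, so callers can rely on their previous contents.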
Brian Paul
755602df12 mesa: minor fix-ups for _mesa_validate_sync()
Return bool instead of int.  Const-qualify the syncObj.  Add some comments.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-01 10:10:01 -06:00
Brian Paul
79a03068cd mesa: add missing error checks in _mesa_GetObject[Ptr]Label()
Error checking bufSize isn't mentioned in the spec, but it is in the
man pages.  However, I believe the man page is incorrect.  Typically,
GL functions that take GLsizei parameters check that they're positive
or non-negative.  Negative values don't make sense here.

A spec bug has been filed with Khronos/ARB.

v2: check for negative values, not <= 0.
2013-10-01 10:10:01 -06:00
Brian Paul
69daf335a0 mesa: use caller string in error message in get_label_pointer()
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2013-10-01 10:10:00 -06:00
Brian Paul
ecd155a428 mesa: asst. clean-ups in copy_label()
This incorporates Vinson's change to check for a null src pointer as
detected by coverity.

Also, rename the function params to be src/dst, const-qualify src,
and use GL types to match the calling functions.  And add some more
comments.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2013-10-01 10:10:00 -06:00
Alex Deucher
d2eb281fb2 st/xorg: Include u_surface.h for u_copy_rect
Fixes build errors.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-01 11:49:08 -04:00
Emil Velikov
9c446afb18 winsys/freedreno/drm: drop obsolete .gitignore
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:52 -07:00
Emil Velikov
16661a9d84 winsys/freedreno/drm: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:52 -07:00
Emil Velikov
5d7690991a winsys/nouveau/drm: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:52 -07:00
Emil Velikov
0d36f5c3be winsys/i915/sw: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:52 -07:00
Emil Velikov
56dfbbd24a st/xvmc: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-10-01 07:29:52 -07:00
Emil Velikov
10bd3a3f71 st/xorg: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:52 -07:00
Emil Velikov
556207e579 st/xa: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:52 -07:00
Emil Velikov
f7df719b39 st/wgl: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:51 -07:00
Emil Velikov
9f03c763e9 st/vega: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:51 -07:00
Emil Velikov
bfbbc7c8c8 st/vdpau: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-10-01 07:29:51 -07:00
Emil Velikov
c0024c4548 st/osmesa: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:51 -07:00
Emil Velikov
921fdf1429 st/glx: consolidate C sources list into Makefile.sources
Move glx/{,xlib/}Makefile.am to preserve file list

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:51 -07:00
Emil Velikov
760c1a6e66 st/gbm: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:51 -07:00
Emil Velikov
4e9028b638 st/egl: consolidate C sources lists into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:51 -07:00
Emil Velikov
edd11ece38 st/dri/sw: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:51 -07:00
Emil Velikov
f9ddeac213 st/dri: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:50 -07:00
Emil Velikov
d8afbc6177 st/clover: consolidate CPP sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:50 -07:00
Emil Velikov
1918c37008 galahad: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:50 -07:00
Emil Velikov
38d80c01d0 noop: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:50 -07:00
Emil Velikov
d7c66ff59e identity: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:50 -07:00
Emil Velikov
959ed5c163 freedreno: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:50 -07:00
Emil Velikov
b91a9cdeaa trace: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:50 -07:00
Emil Velikov
e369126709 llvmpipe: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:49 -07:00
Emil Velikov
2234e187c6 rbug: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:49 -07:00
Emil Velikov
9bc5ced1c7 softpipe: consolidate C sources list into Makefile.sources
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:49 -07:00
Emil Velikov
6ea73bb395 r600: use NEED_RADEON_LLVM over R600_NEED_RADEON_GALLIUM
libllvmradeon.la is available whenever NEED_RADEON_LLVM is set; using
R600_NEED_RADEON_GALLIUM is rather ambiguous and unnecessary. Drop it
in favour of NEED_RADEON_LLVM.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:49 -07:00
Emil Velikov
4334666b47 gallium/radeon: drop unused variable LIBGALLIUM_LIBS
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:49 -07:00
Emil Velikov
e11ff60e28 mesa/drivers: drop HAVE_*_DRI from individual makefiles
The mesa/drivers/dri/Makefile.am already guards the individual
targets/subdirs with HAVE_*_DRI before including them, thus making
the additional check within each Makefile.am unnecessary.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-01 07:29:49 -07:00
Johannes Obermayr
cb1febb074 gallium/targets: Make use of prebuilt libdricommon.la.
libdricommon.la is available whenever a non swrast driver is built.
All the classic dri drivers make use of the prebuilt library but all
of the gallium ones rebuild it explicitly.

While we're here, gallium/{llvm,soft}pipe does not require HAVE_COMMON_DRI,
so do not set it during configure.

v2: [Emil] Add commit message and drop HAVE_COMMON_DRI from configure.ac
v3: [Emil] Rebase and resolve targets/r*/dri conflicts

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-01 07:29:49 -07:00
Vinson Lee
eb0a57acaa i915: Fix memory leak in do_blit_readpixels.
Fixes "Resource leak" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-09-30 22:08:48 -07:00
Vinson Lee
76df7edacf llvmpipe: Remove unnecessary null check of shader.
shader has already been dereferenced earlier so cannot be null here.

Fixes "Dereference before null check" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-30 22:00:54 -07:00
Vinson Lee
ac82495d6d util/u_format: Assert that format block size is at least 1 byte.
The block size for all formats is currently at least 1 byte. Add an
assertion for this.

This should silence several Coverity "Division or modulo by zero"
defects.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-30 21:53:04 -07:00
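The assertion described in the util/u_format commit above could look roughly like this. It is a sketch with a hypothetical `format_block` type and helper names, not the actual u_format code. Asserting that the block size is at least 1 byte documents the invariant and makes it explicit that later divisions by the block size cannot divide by zero.

```cpp
#include <cassert>

// Hypothetical format description holding only the field needed here.
struct format_block {
   unsigned bits;   // size of one block in bits
};

// Returns the block size in bytes.  The assertion states the invariant
// that every format has a block of at least one byte.
static unsigned block_size_bytes(const format_block &block)
{
   const unsigned bytes = block.bits / 8;
   assert(bytes >= 1);
   return bytes;
}

// Example of a division that relies on the invariant above.
static unsigned blocks_in_row(unsigned row_bytes, const format_block &block)
{
   return row_bytes / block_size_bytes(block);
}
```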
Vinson Lee
505a6de7fc draw: Add a null check for draw.
There is an earlier null check for draw so draw could be null here as
well.

Fixes "Dereference after null check" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-30 21:46:42 -07:00
Vinson Lee
9b388c66fc st/vdpau: Include u_surface.h for u_copy_rect.
Fix build errors.

  CC     surface.lo
surface.c: In function 'vlVdpVideoSurfaceGetBitsYCbCr':
surface.c:247:10: error: implicit declaration of function 'util_copy_rect' [-Werror=implicit-function-declaration]

  CC     output.lo
output.c: In function 'vlVdpOutputSurfaceGetBitsNative':
output.c:216:4: error: implicit declaration of function 'util_copy_rect' [-Werror=implicit-function-declaration]

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-09-30 20:49:38 -07:00
Vinson Lee
05474ac9c4 st/vdpau: Include u_format.h for util_format_description.
Fix build error.

  CC     device.lo
device.c: In function 'vlVdpDefaultSamplerViewTemplate':
device.c:251:4: error: implicit declaration of function 'util_format_description' [-Werror=implicit-function-declaration]
device.c:251:9: warning: assignment makes pointer from integer without a cast [enabled by default]
device.c:252:12: error: dereferencing pointer to incomplete type
device.c:252:28: error: 'UTIL_FORMAT_SWIZZLE_0' undeclared (first use in this function)
device.c:252:28: note: each undeclared identifier is reported only once for each function it appears in
device.c:254:12: error: dereferencing pointer to incomplete type
device.c:256:12: error: dereferencing pointer to incomplete type
device.c:258:12: error: dereferencing pointer to incomplete type

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-09-30 20:38:06 -07:00
Vinson Lee
14442c46fb st/xvmc: Include u_surface.h for u_copy_rect.
This patch fixes the build error introduced with commit
81bb98e928.

  CC     subpicture.lo
subpicture.c: In function 'upload_sampler':
subpicture.c:181:4: error: implicit declaration of function 'util_copy_rect' [-Werror=implicit-function-declaration]
subpicture.c: In function 'XvMCClearSubpicture':
subpicture.c:304:21: error: storage size of 'uc' isn't known
subpicture.c:328:4: error: implicit declaration of function 'util_fill_rect' [-Werror=implicit-function-declaration]
subpicture.c:304:21: warning: unused variable 'uc' [-Wunused-variable]

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-09-30 20:15:53 -07:00
Brian Paul
9f6e76a91e st/egl: include u_format.h for util_format_get_blocksize() 2013-09-30 19:02:27 -06:00
Brian Paul
1d05caf9f2 svga: fix pixel center integer
The svga/d3d9 convention is that pixel centers are at integer coordinates.
Fixes piglit glsl-arb-fragment-coord-conventions test.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-09-30 18:50:37 -06:00
Brian Paul
360610c89e svga: return 0 for PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER
Using the map/unmap path for glTexImage is a little bit faster
than blitting.  Also, this fixes about 50 assorted piglit failures
that seem to be related to the blit version of glReadPixels.

Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-09-30 18:50:37 -06:00
Brian Paul
395fac25a6 svga: we don't support TGSI_OPCODE_CONT
So return PIPE_SHADER_CAP_TGSI_CONT_SUPPORTED = 0.
2013-09-30 18:50:37 -06:00
Brian Paul
81bb98e928 gallium: include u_surface.h instead of u_rect.h
u_rect.h was including u_surface.h just to avoid touching a bunch
of other source files after some functions were moved from u_rect.h
to u_surface.h.  This patch cleans up that hack.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-30 18:50:37 -06:00
Eric Anholt
48b9720272 i965: Reenable glBitmap() after the sRGB winsys enabling.
The format of the window system framebuffer changed from ARGB8888 to
SARGB8, but we're still supposed to render to it the same as ARGB8888
unless the user flipped the GL_FRAMEBUFFER_SRGB switch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for stable branches.
2013-09-30 16:49:43 -07:00
Ian Romanick
3e1fdf3899 mesa: Remove all traces of GL_OES_matrix_get
I believe this extension was enabled by accident.  As far as I can tell,
there has never been any code in Mesa to actually support it.  Not only
that, this extension is only useful in the common-lite profile, and Mesa
does the common profile.

This "fixes" the piglit test oes_matrix_get-api.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-30 16:40:00 -07:00
Carl Worth
9baf35de5c Use -Bsymbolic when linking libEGL.so
For some reason that I don't yet fully understand, Glaze does not work with
libEGL unless libEGL is linked with -Bsymbolic.[*]

Beyond that specific reason, all of the reasons for which libGL.so is linked
with -Bsymbolic, (see the commit history), should also apply here.

[*] The specific behavior I am seeing is that when Glaze calls dlopen for
libEGL.so, ifunc resolvers within Glaze for EGL functions are called before
the dlopen returns. These resolvers cannot succeed, as they need the return
value from dlopen in order to find the functions to resolve to. I don't know
what's causing these resolvers to be called, but I have verified that linking
libEGL with -Bsymbolic causes this problematic behavior to stop.

CC: "9.1 and 9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-09-30 15:49:16 -07:00
Paul Berry
4c4934636c i965/blorp: retype destination register for texture SEND instruction to UW.
From the bspec documentation of the SEND instruction:

    "destination region cannot cross the 256-bit register boundary."

To avoid violating this restriction when executing SIMD16 texturing
operations (such as those used by blorp), we need to ensure that the
destination of the SEND instruction doesn't exceed 256 bits in size.
An easy way to do this is to set the type of the destination register
to UW (unsigned word), since 16 unsigned words can fit inside a
256-bit register.  Fortunately, this has no effect on the sampling
operation, since the sampler always infers the destination data type
from the sampler message rather than from the type of the instruction
operand.

Previously, we did this for texturing operations issued by the vec4
and fs back-ends, but not for blorp.  This patch makes blorp use the
same trick.

I haven't observed any behavioural difference on actual hardware due
to this patch, but it avoids a warning from the simulator so it seems
like the right thing to do.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Chad Versace <chad.versace@linux.intel.com>
2013-09-30 15:16:44 -07:00
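The size argument in the blorp SEND commit above comes down to simple arithmetic, sketched here with hypothetical helper names. The 4-byte element size used for comparison is an assumption standing in for a 32-bit destination type; the commit itself only states that 16 unsigned words fit inside a 256-bit register.

```cpp
#include <cassert>

// Width of one register in bits, as quoted in the commit message.
static const unsigned REGISTER_BITS = 256;

// Size in bits of a destination region: one element per channel.
static unsigned region_bits(unsigned channels, unsigned bytes_per_element)
{
   return channels * bytes_per_element * 8;
}

// True when the destination region stays within a single register.
static bool fits_in_one_register(unsigned channels,
                                 unsigned bytes_per_element)
{
   return region_bits(channels, bytes_per_element) <= REGISTER_BITS;
}
```

For SIMD16, `fits_in_one_register(16, 2)` holds for 2-byte UW elements (16 * 16 = 256 bits), while an assumed 4-byte type would need 512 bits and cross the boundary.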
Eric Anholt
1c7f75e45e i965: Add a real native TexStorage path.
We originally had a path that just did the loop and called
ctx->Driver.AllocTextureImageBuffer(), which I moved into Mesa core.  But
we can do better, avoiding incorrect miptree size guesses and later
texture validations by just directly allocating the miptree and setting it
to all the images.

v2: drop debug printf.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-09-30 14:35:42 -07:00
Eric Anholt
aff7f335c1 i965: Add missing license to intel_tex_validate.c.
I've rewritten a lot of this file.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-09-30 14:35:42 -07:00
Eric Anholt
8037c0b69c i965: Always allocate validated miptrees from level 0.
No change in copies during a piglit run, but it's one less first_level !=
0 in our codebase.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-09-30 14:35:42 -07:00
Eric Anholt
16060c5adc i965: Don't relayout a texture just for baselevel changes.
As long as the baselevel, maxlevel still sit inside the range we had
previously validated, there's no need to reallocate the texture.

I also hope this makes our texture validation logic much more obvious.
It's taken me enough tries to write this change, that's for sure.  Reduces
miptree copy count on a piglit run by 1.3%, though the change in amount of
data moved is much smaller.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-09-30 14:35:42 -07:00
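The check described in the baselevel commit above might be sketched as a simple range test. `level_range` and `needs_relayout` are hypothetical names, and the real validation logic in the driver considers more than this; the sketch only shows the idea that a texture needs no reallocation while the requested levels stay inside the previously validated range.

```cpp
#include <cassert>

// Hypothetical record of the mip levels covered by the last validation.
struct level_range {
   unsigned first;
   unsigned last;
};

// Returns true only when the requested [base_level, max_level] range
// falls outside what was validated before.
static bool needs_relayout(level_range validated,
                           unsigned base_level, unsigned max_level)
{
   assert(base_level <= max_level);
   return base_level < validated.first || max_level > validated.last;
}
```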
Eric Anholt
97bdb4c039 i965: Don't allocate a 1-level texture when GL_GENERATE_MIPMAP is set.
Given that a teximage that calls us with this flag set will immediately
proceed to allocate the other levels, we can probably just go ahead and
allocate those levels now.

Reduces miptree copies in piglit by about .05%.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-09-30 14:35:42 -07:00
Eric Anholt
6ca9b532d8 i965: Stop allocating miptrees with first_level != 0.
If the caller shows up with GL_BASE_LEVEL != 0, it doesn't mean that the
texture will over the course of its lifetime have that nonzero baselevel,
it means that the caller is filling the texture from the bottom up for
some reason (one could imagine demand-loading detailed texture layers at
runtime, for example).  If we allocate from just the current baselevel, it
means when they come along with the next level up, we'll have to allocate
a new miptree and copy all of our bits out of the first miptree.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-09-30 14:35:42 -07:00
Eric Anholt
3b9a2dc938 i965: Drop a special case for guessing small miptree levels.
Let's say you started allocating your 2D texture with level 2 of a tree as
a 1x1 image.  The driver doesn't know if this means that level 0 is 4x4 or
4x1 or 1x4, so we would just allocate a single 1x1 and let it get copied
in to the real location at texture validate time later.

Since this is just a temporary allocation that *will* get copied, just
taking the normal path, which will happen to produce a 4x1 level 0,
2x1 level 1, and 1x1 level 2, is the right way to go despite the extra
space it allocates, since it reduces complexity in the normal case.

No change in miptree copies over the course of a piglit run.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-09-30 14:35:42 -07:00
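The 4x1, 2x1, 1x1 sizes mentioned in the miptree guessing commit above can be reproduced with a small sketch. The guessing rule here is an assumption chosen to match that result (the width is always scaled up by the level, while a height of 1 is kept at 1), and `extent`, `guess_base_size`, and `level_size` are hypothetical names rather than the driver's own.

```cpp
#include <cassert>

struct extent {
   unsigned width;
   unsigned height;
};

// Guess the level-0 size from an image supplied at `level`.  Assumed
// rule: always scale the width, but keep a height of 1 as it is.
static extent guess_base_size(extent image, unsigned level)
{
   assert(image.width >= 1 && image.height >= 1);

   extent base = image;
   base.width <<= level;
   if (base.height != 1)
      base.height <<= level;
   return base;
}

// Size of `level` in a tree whose level 0 is `base`; each dimension is
// halved per level and never drops below 1.
static extent level_size(extent base, unsigned level)
{
   extent e;
   e.width = (base.width >> level) > 1 ? (base.width >> level) : 1;
   e.height = (base.height >> level) > 1 ? (base.height >> level) : 1;
   return e;
}
```

Starting from a 1x1 image at level 2, this guess yields a 4x1 base, and walking back down gives 2x1 at level 1 and 1x1 at level 2.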
Eric Anholt
7de88ac380 i965: Totally switch around how we handle nonzero baselevel-first_level.
This has no effect currently, because intel_finalize_mipmap_tree() always
makes mt->first_level == tObj->BaseLevel.

The change I made before to handle it
(b1080cfbdb) got very close to working, but
after fixing some unrelated bugs in the series, it still left
tex-miplevel-selection producing errors when testing textureLod().  The
problem is that for explicit LODs, the sampler's LOD clamping is ignored,
and only the surface's MIP clamping is respected.  So we need to use
surface mip clamping, which applies on top of the sampler's mip clamping,
so the sampler change gets backed out.

Now actually tested with a non-regressing series producing a non-zero
computed baselevel.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-09-30 14:35:42 -07:00
Eric Anholt
9c116d5eac i965: Always look up from the object's mt when setting up texturing state.
We know that the object's mt is equal to the firstimage's mt because it's
gone through intel_finalize_mipmap_tree().  Saves a lookup of firstimage
on pre-gen7.

v2: Merge in the warning fix that appeared later in the series (noted by
    Chad)

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-09-30 14:35:42 -07:00
Vinson Lee
114ae47475 r600g/sb: Move variable dereference after null check.
Fixes "Dereference before null check" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-09-30 10:27:52 -07:00
Brian Paul
0d441aac3d st/mesa: fix comment typo 2013-09-30 09:06:52 -06:00
Marek Olšák
7b25f52a95 r600g,radeonsi: workaround for late shared screen initialization
Accidentally broken by the consolidation.
2013-09-30 13:01:13 +02:00
Laurent Carlier
868791f0ba r600g: Fix build failure introduced with r600_texture.c consolidation
It seems that the case with OpenCL enabled was forgotten.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2013-09-29 22:01:04 +02:00
Marek Olšák
4e9aa6711f radeon: make texture logging more useful
This has been very useful for tracking down bugs in libdrm.

The *_PRINT_TEXDEPTH environment variables were probably never used,
so I removed them.
2013-09-29 15:18:10 +02:00
Marek Olšák
e64633e8c3 r600g,radeonsi: share r600_texture.c
The function r600_choose_tiling is new and needs a review.

The only change in functionality is that it enables 2D tiling for compressed
textures on SI. It was probably accidentally turned off.

v2: don't make scanout buffers linear
2013-09-29 15:18:10 +02:00
Marek Olšák
4069d39465 r600g: remove compute_global_transfer_* calls from texture_transfer_map/unmap
Textures can never have target==PIPE_BUFFER.
2013-09-29 15:18:10 +02:00
Marek Olšák
ef6680d3ee r600g: move the low-level buffer functions for multiple rings to drivers/radeon
Also slightly optimize r600_buffer_map_sync_with_rings.
2013-09-29 15:18:09 +02:00
Marek Olšák
1bb77f81db r600g,radeonsi: consolidate tiling_info initialization
and the util_format_s3tc_init calls too.
2013-09-29 15:18:09 +02:00
Marek Olšák
09fc5d6e26 radeonsi: implement clear_buffer using CP DMA, initialize CMASK with it
More work needs to be done for this to be entirely shared with r600g.
I'm just trying to share r600_texture.c now.

The reason I put the implementation to si_descriptors.c is that the emit
function had already been there.
2013-09-29 15:18:09 +02:00
Marek Olšák
68f6dec32e r600g: move aux_context and r600_screen_clear_buffer to drivers/radeon
This will be used in the next commit.
2013-09-29 15:18:09 +02:00
Marek Olšák
0cb9de1dd0 radeonsi: move debug options to R600_DEBUG 2013-09-29 15:18:09 +02:00
Marek Olšák
ba650ccf91 r600g: move some debug options to drivers/radeon 2013-09-29 15:18:09 +02:00
Marek Olšák
2814202ef4 r600g,radeonsi: share the async dma interface
r600_texture.c is one step closer to r600g.
2013-09-29 15:18:09 +02:00
Marek Olšák
e916267285 radeonsi: move radeonsi-specific functions out of r600_texture.c 2013-09-29 15:18:08 +02:00
Marek Olšák
31169400a0 r600g,radeonsi: remove unused code 2013-09-29 15:18:08 +02:00
Marek Olšák
6f21009cb3 r600g: move r600g-specific functions out of r600_texture.c 2013-09-29 15:18:08 +02:00
Marek Olšák
bfea9c498d r600g,radeonsi: consolidate r600_texture structures 2013-09-29 15:18:08 +02:00
Marek Olšák
4ea2e5a4e7 r600g: get rid of r600_texture::is_rat
It's always 0.
2013-09-29 15:18:08 +02:00
Marek Olšák
ba29324dba r600g: get rid of r600_texture::array_mode 2013-09-29 15:18:08 +02:00
Marek Olšák
39801d4ba7 r600g,radeonsi: consolidate transfer, cmask, and fmask structures 2013-09-29 15:18:08 +02:00
Marek Olšák
a62cd6949c radeon drivers: handle PIPE_CAP_MAX_VIEWPORTS 2013-09-29 15:18:07 +02:00
Marek Olšák
900b1863c8 radeon/llvm: fix TGSI_OPCODE_UCMP
This doesn't fix any known issue (I haven't run piglit with this yet),
but the code was obviously completely wrong. It looks like it was copy-pasted from CMP.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-09-29 14:49:23 +02:00
Marek Olšák
2bda5f3298 st/mesa: fix GLSL mix(.., .., bvecN)
v2: use CMP on drivers without native integer support
2013-09-29 14:42:42 +02:00
Tom Stellard
a64d3dd135 configure.ac: Add a more informative warning when libclc.pc is not found v2
v2:
  - Don't display an error message when the user doesn't ask for libclc.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-09-27 20:20:35 -07:00
Vinson Lee
b2d5757831 mesa: Include stdint.h in mtypes.h for uint32_t symbol.
This patch fixes the MSVC build error introduced with commit
b2e327e08f.

api_arrayelt.c
src\mesa\main/mtypes.h(1809) : error C2061: syntax error : identifier 'uint32_t'
src\mesa\main/mtypes.h(1810) : error C2059: syntax error : '}'
src\mesa\main/mtypes.h(1825) : error C2079: 'Minimum' uses undefined union 'gl_perf_monitor_counter_value'
src\mesa\main/mtypes.h(1828) : error C2079: 'Maximum' uses undefined union 'gl_perf_monitor_counter_value'

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-09-26 20:48:47 -07:00
Kenneth Graunke
aac75f877d i965/fs: Don't double-accept operands of logical and/or/xor operations.
If the argument to emit_bool_to_cond_code() is an ir_expression, we
loop over the operands, calling accept() on each of them, which
generates assembly code to compute that subexpression.  We then emit
one or two final instructions that perform the top-level operation on
those operands.

If it's not an expression (say, a boolean-valued variable), we simply
call accept() on the whole value.

In commit 80ecb8f1 (i965/fs: Avoid generating extra AND instructions on
bool logic ops), Eric made logic operations jump out of the expression
path to the non-expression path.

Unfortunately, this meant that we would first accept() the two operands,
skip generating any code that used them, then accept() the whole
expression, generating code for the operands a second time.

Dead code elimination would always remove the first set of redundant
operand assembly, since nothing actually used them.  But we shouldn't
generate it in the first place.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-09-26 16:55:18 -07:00
Kenneth Graunke
e5c49bc25b i965: Add #define for MI_REPORT_PERF_COUNT on Gen6+.
This appears in Volume 1 Part 1 of the Sandybridge PRM on page 48.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-26 16:55:18 -07:00
Kenneth Graunke
0f2da77307 i965: Add support for GL_AMD_performance_monitor on Ironlake.
Ironlake's counters are always enabled; userspace can simply send a
MI_REPORT_PERF_COUNT packet to take a snapshot of them.  This makes it
easy to implement.

The counters are documented in the source code for the intel-gpu-tools
intel_perf_counters utility.

v2: Adjust for core data structure changes.  Add a table mapping buffer
    object offsets to exposed counters (which changes each generation).
    Finally, add report ID assertions to sanity check the BO layout
    (thanks to Carl Worth).

v3: Update for core BeginPerfMonitor hook changes (requested by Brian).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-09-26 16:55:18 -07:00
Kenneth Graunke
b2e327e08f mesa: Add core support for the GL_AMD_performance_monitor extension.
This provides an interface for applications (and OpenGL-based tools) to
access GPU performance counters.  Since the exact performance counters
available vary between vendors and hardware generations, the extension
provides an API the application can use to get the names, types, and
minimum/maximum values of all available counters.  Counters are also
organized into groups.

Applications create "performance monitor" objects, select the counters
they want to track, and Begin/End monitoring, much like OpenGL's query
API.  Multiple monitors can be in flight simultaneously.

v2: Pass ctx to all driver hooks (suggested by Christoph), and attempt
    to fix overallocation of bitsets (caught by Christoph).  Incomplete.

v3: Significantly rework core data structures.  Store counters in groups
    rather than in a global list.  Use their array index in the group's
    counter list as the ID rather than trying to store a globally unique
    counter ID.  Use bitsets for active counters within a group, and
    also track which groups are active so that's easy to query.

v4: Remove _mesa_ prefix on static functions; detect out of memory
    conditions in new_performance_monitor(); make BeginPerfMonitor hook
    return a boolean rather than setting m->Active or raising an error.
    Switch to GLuint/unsigned for NumGroups, NumCounters, and
    MaxActiveCounters (which also means switching a bunch of temporary
    variable types).  All suggested by Brian Paul.  Also, remove
    commented out code at the bottom of the block.  Finally, fix the
    dispatch sanity test (noticed by Ian Romanick).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com> [v3]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-09-26 16:55:18 -07:00
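The v3 note in b2e327e08f (counters identified by their index within a group, with a bitset of active counters per group) can be illustrated roughly as follows. This is only a sketch: `monitor_group_state`, `select_counter`, and the 64-counter limit are hypothetical names and choices for illustration, not the structures Mesa actually uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-group state of one performance monitor. */
struct monitor_group_state {
   uint64_t active;      /* bit i set => counter i of this group is selected */
   unsigned num_active;  /* number of bits set in 'active' */
};

/*
 * Select or deselect a counter by its index within the group.
 * Returns false when the index is out of range for the group
 * (or beyond the 64 counters this sketch can represent).
 */
bool
select_counter(struct monitor_group_state *g, unsigned num_counters,
               unsigned counter_id, bool enable)
{
   if (counter_id >= num_counters || counter_id >= 64)
      return false;

   uint64_t bit = UINT64_C(1) << counter_id;
   bool was_active = (g->active & bit) != 0;

   if (enable && !was_active) {
      g->active |= bit;
      g->num_active++;
   } else if (!enable && was_active) {
      g->active &= ~bit;
      g->num_active--;
   }
   return true;
}
```

Keeping a count next to the bitset makes "is this group active at all?" a single comparison, which matches the commit's stated goal of making active groups easy to query.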
Kenneth Graunke
f91475d4ab glsl: Create and use a has_uniform_buffer_objects() helper.
This is better than overriding the extension enable based on the
language version; it's robust against shaders that do:

   #version 140
   #extension GL_ARB_uniform_buffer_object : disable

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-09-26 16:55:18 -07:00
Kenneth Graunke
e4af55c78f glsl: Create and use a has_explicit_attrib_location() helper.
Explicit attribute locations are supported with GLSL 3.30, GLSL ES 3.00,
or "#extension GL_ARB_explicit_attrib_location: enable".  Using a helper
function makes it easy to check for this.

This enables support in GLSL 3.30, which was previously missing.

Previously, we overrode the extension enable flag for ES 3.00.  This is
not robust against a shader such as:

   #version 330
   #extension GL_ARB_explicit_attrib_location : disable

Disabling extensions should not remove core language functionality.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-09-26 16:55:18 -07:00
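The rule described in e4af55c78f (the feature is available with GLSL 3.30, GLSL ES 3.00, or an enabled extension, and disabling the extension must not remove core functionality) might look like the sketch below. The `parse_state` struct and the `is_version` helper are simplified, hypothetical stand-ins for the real parser state, not Mesa's definitions.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the GLSL parser state. */
struct parse_state {
   bool es_shader;                            /* true for GLSL ES shaders */
   unsigned language_version;                 /* e.g. 330 for "#version 330" */
   bool ARB_explicit_attrib_location_enable;  /* set by "#extension ... : enable" */
};

/* True when the shader's version is at least the one required for its API. */
bool
is_version(const struct parse_state *state,
           unsigned required_glsl, unsigned required_glsl_es)
{
   unsigned required = state->es_shader ? required_glsl_es : required_glsl;
   return state->language_version >= required;
}

/* Core in GLSL 3.30 and GLSL ES 3.00; otherwise needs the extension. */
bool
has_explicit_attrib_location(const struct parse_state *state)
{
   return state->ARB_explicit_attrib_location_enable ||
          is_version(state, 330, 300);
}
```

Because the version check is ORed with the extension flag, a "#version 330" shader that disables the extension still gets the core feature.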
Kenneth Graunke
e9b410b54d mesa: Remove 'invalidate_state' parameter to _mesa_dirty_texobj().
Every caller passed true.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-09-26 16:55:18 -07:00
Eric Anholt
1c904466aa mesa: Remove some remaining FEATURE_* detritus.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-26 16:29:39 -07:00
Chris Forbes
fe2528c0b6 i965: Fix cube array coordinate normalization
Hardware requires the magnitude of the largest component to not exceed
1; brw_cubemap_normalize ensures that this is the case.

Unfortunately, we would previously multiply the array index for cube
arrays by the normalization factor. The incorrect array index would then
cause the sampler to attempt to access either the wrong cube, or memory
outside the cube surface entirely, resulting in garbage rendering or in
the worst case, hangs.

Alter the normalization pass to only multiply the .xyz components.

Fixes broken rendering in the arb_texture_cube_map_array-cubemap piglit,
which was recently adjusted to provoke this behavior.

V2: Fix indent.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-09-26 18:24:22 +12:00
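The fix in fe2528c0b6 (scale only the .xyz direction, leave the array index alone) can be sketched on the CPU as below. The real pass operates on shader IR; `cube_array_normalize` and `abs_f` are hypothetical names, and skipping a zero-length direction is a choice made for this sketch only.

```c
/* Absolute value helper, written out to keep the sketch free of libm. */
static float
abs_f(float v)
{
   return v < 0.0f ? -v : v;
}

/*
 * coord[0..2] is the cube direction (x, y, z); coord[3] is the array index.
 * Scale the direction so its largest component has magnitude 1.
 */
void
cube_array_normalize(float coord[4])
{
   float max_abs = abs_f(coord[0]);
   for (int i = 1; i < 3; i++) {
      if (abs_f(coord[i]) > max_abs)
         max_abs = abs_f(coord[i]);
   }

   if (max_abs == 0.0f)
      return;  /* degenerate direction: left untouched in this sketch */

   float scale = 1.0f / max_abs;
   for (int i = 0; i < 3; i++)
      coord[i] *= scale;

   /* coord[3], the array index, is deliberately not scaled. */
}
```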
Zack Rusin
d83ef680e2 draw/clip: don't emit so many empty triangles
Compress empty triangles (don't emit more than one in a row) and
never emit empty triangles if we already generated a triangle
covering a non-null area. We can't skip all null-triangles
because c_primitives expects the ones generated from vertices
exactly at the clipping plane to be emitted.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-25 19:42:22 -04:00
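One way to read the two rules in d83ef680e2 (no more than one empty triangle in a row, and no empty triangles once a non-empty one has been generated) is sketched below. The function name and the boolean-array interface are hypothetical; the draw module's actual code is not shown in this log.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * is_empty[i] says whether clipped triangle i covers no area;
 * emit[i] receives the decision. Returns the number of triangles emitted.
 */
size_t
filter_empty_triangles(const bool *is_empty, size_t count, bool *emit)
{
   bool emitted_nonempty = false;  /* a triangle with area was emitted */
   bool prev_was_empty = false;    /* the last emitted triangle was empty */
   size_t emitted = 0;

   for (size_t i = 0; i < count; i++) {
      if (!is_empty[i]) {
         emit[i] = true;
         emitted_nonempty = true;
         prev_was_empty = false;
      } else if (emitted_nonempty || prev_was_empty) {
         emit[i] = false;          /* redundant empty triangle: drop it */
      } else {
         emit[i] = true;           /* first empty triangle: keep it */
         prev_was_empty = true;
      }

      if (emit[i])
         emitted++;
   }
   return emitted;
}
```

Under this reading, an all-empty input still produces one triangle, so a counter like c_primitives continues to see the primitive.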
Zack Rusin
60c448faea llvmpipe: count c_primitives before discarding null prims
We need to count the clipper primitives before the rasterizer
discards the ones it considers to be null.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-25 19:41:02 -04:00
Zack Rusin
1291e833e7 llvmpipe: we need to subdivide if fb is bigger in either direction
We need to subdivide triangles if either of the dimensions is
larger than the max edge length, not when both of them are larger.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-25 19:38:21 -04:00
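The condition change in 1291e833e7 amounts to using a logical OR where a logical AND had been used. A minimal sketch, with a hypothetical function name and parameters:

```c
#include <stdbool.h>

/*
 * Subdivision is needed as soon as either framebuffer dimension exceeds
 * the maximum edge length the rasterizer can handle; requiring both to
 * exceed it (&&) would miss wide-but-short or tall-but-narrow targets.
 */
bool
needs_subdivision(unsigned fb_width, unsigned fb_height, unsigned max_edge)
{
   return fb_width > max_edge || fb_height > max_edge;
}
```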
Marek Olšák
028b26e2ef radeon/llvm: fix shadow cube texturing for GL3.0
The fix is at the end (TGSI_TEXTURE_SHADOWCUBE handling), but I also
restructured the code for it to be more readable.

Fixes spec/!OpenGL 3.0/sampler-cube-shadow.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-25 20:45:23 +02:00
Marek Olšák
57f38e9f92 radeonsi: fix blitting the last 2 mipmap levels of compressed textures
This fixes compressedteximage piglit tests.

+10 piglits

Evergreen and Cayman have the same issue. R600 and R700 don't.

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-25 20:45:22 +02:00
Marek Olšák
296adb6de9 radeonsi: add missing colorbuffer formats (rework format translation)
This fixes some piglits, e.g:
  spec/!OpenGL 3.0/required-renderbuffer-attachment-formats.

This can be ported to r600g.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-25 20:45:22 +02:00
Marek Olšák
f9ea435ebc radeonsi: bypass alpha-test for integer colorbuffers
Fixes spec/EXT_texture_integer/fbo-blending.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-25 20:45:22 +02:00
Marek Olšák
f7d004b9ad r600g: fix texture buffer object cache flushing
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-09-25 20:45:22 +02:00
Marek Olšák
6317a3fb31 r600g: fix constant buffer cache flushing
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-09-25 20:45:22 +02:00
Christian König
4871128e58 radeon/winsys: keep screen pointer in winsys v2
Only create one screen for each winsys instance.
This helps with buffer sharing and interop handling.

v2: rebased and some minor cleanup

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-09-25 19:41:31 +02:00
Christian König
f6e2aa0e12 build/radeonsi: group all targets in common subdir
Allows us to share more code between different targets.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2013-09-25 19:41:27 +02:00
Christian König
015853b568 build/r600: group all targets in common subdir
Allows us to share more code between different targets.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2013-09-25 19:41:23 +02:00
Christian König
533e9a04b4 build/r300: group build target in common subdir
Allows us to share more code between different targets.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2013-09-25 19:41:03 +02:00
Christian König
1c57d9a6c6 radeon/uvd: try to place msg/fb buffer into GART
This is only supported on NI+, but the kernel takes care of those limitations.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-09-25 10:59:03 +02:00
Christian König
f9f14201c1 radeon/uvd: move alignment to winsys
Similar to GFX and DMA.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-09-25 10:58:58 +02:00
Christian König
5f6ae61e69 st/vdpau: use a separate lock per decoder
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-09-25 10:58:58 +02:00
Christian König
34b5a4e0d8 st/vdpau: use new vlc function to search for VC-1 start codes
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-09-25 10:58:58 +02:00
Christian König
eb1cb253b7 vl/mpeg12: use new vlc function to search for start codes
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-09-25 10:58:58 +02:00
Christian König
e3ecea9ddf vl/vlc: add fast forward search for byte value
Commonly used to find start codes and has far less overhead
than searching manually.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-09-25 10:58:58 +02:00
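The idea behind e3ecea9ddf (skip quickly to the next occurrence of a byte value, then check for a start code there) can be approximated with the standard `memchr`. This is a stand-in, not the vl_vlc function the commit adds: `find_byte` and `find_start_code` are hypothetical names, and the sketch works on a plain buffer and looks for the 00 00 01 start-code prefix.

```c
#include <stddef.h>
#include <string.h>

/* Offset of the first 'value' in data[0..size), or size when not found. */
size_t
find_byte(const unsigned char *data, size_t size, unsigned char value)
{
   const unsigned char *hit = memchr(data, value, size);
   return hit ? (size_t)(hit - data) : size;
}

/*
 * Offset of the first 00 00 01 prefix in data[0..size), or size when
 * not found. Jumps from one 0x01 byte to the next instead of testing
 * every position by hand.
 */
size_t
find_start_code(const unsigned char *data, size_t size)
{
   size_t pos = 2;  /* a prefix cannot end before index 2 */

   while (pos < size) {
      pos += find_byte(data + pos, size - pos, 0x01);
      if (pos >= size)
         break;
      if (data[pos - 1] == 0x00 && data[pos - 2] == 0x00)
         return pos - 2;
      pos++;
   }
   return size;
}
```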
Vinson Lee
59157d1c96 glsl: Initialize ir_lower_jumps_visitor member variables.
Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-24 22:54:25 -07:00
Vinson Lee
94e3ecae2d glsl: Initialize lower_vector_visitor::dont_lower_swz.
Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-24 22:51:23 -07:00
Vinson Lee
74b02b8e3f glsl: Initialize assignment_generator member variables.
Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-24 22:16:39 -07:00
Vinson Lee
6128c226b4 glsl: Remove unused pointer value.
Silences "Unused pointer value" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-09-24 22:10:36 -07:00
Zack Rusin
71ecc2cf71 Revert "llvmpipe: increase number of subpixel bits to eight"
This reverts commit 755c11dc5e.
We agreed that this is a band-aid that's not very useful and
the proper solution is to rewrite the rasterization algo
so that it operates on 64 bit values.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-09-24 15:10:02 -04:00
Dylan Noblesmith
49f8fc64de mesa: remove handcounted magic number
Also make it a compile-time error with STATIC_ASSERT.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-24 11:29:17 -07:00
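The pattern in 49f8fc64de (derive the count instead of hand-counting it, and fail the build when it drifts) can be sketched with C11's `_Static_assert` standing in for Mesa's STATIC_ASSERT macro. The enum, the table, and `stage_name` are hypothetical examples, not the code the commit touched.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

enum shader_stage {
   STAGE_VERTEX,
   STAGE_GEOMETRY,
   STAGE_FRAGMENT,
   STAGE_COUNT  /* number of stages, derived rather than hand-counted */
};

static const char *const stage_names[] = {
   "vertex",
   "geometry",
   "fragment",
};

/* Adding a stage without a name (or the reverse) now breaks the build. */
_Static_assert(ARRAY_SIZE(stage_names) == STAGE_COUNT,
               "stage_names must have one entry per shader stage");

const char *
stage_name(enum shader_stage stage)
{
   return stage_names[stage];
}
```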
Dylan Noblesmith
ea3847b12e mesa: remove outdated comment
No such argument exists since this commit:

commit 92f3fca0ea
Author:     Ian Romanick <ian.d.romanick@intel.com>
AuthorDate: Sun Aug 21 17:23:58 2011 -0700
Commit:     Ian Romanick <ian.d.romanick@intel.com>
CommitDate: Tue Aug 23 14:52:09 2011 -0700

    mesa: Remove target parameter from dd_function_table::BufferSubData

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-24 11:27:12 -07:00
Dylan Noblesmith
2f5d41ce79 mesa: remove stale comment
This line stopped making sense in the great sed
replace of commit f9995b3075

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-24 11:27:03 -07:00
Zack Rusin
e5ec5aef2b llvmpipe: align the array used for subdivided vertices
When subdividing a triangle we're using a temporary array to store
the new coordinates for the subdivided triangles. Unfortunately
the array used for that was not aligned properly causing
random crashes in the llvm jit code which was trying to load
vectors from it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-23 18:10:51 -04:00
Vinson Lee
f036d55515 glapi: Move declaration before code.
This patch fixes the MSVC build error introduced by commit
673129e0b9.

enums.c
mesa\main\enums.c(3776) : error C2143: syntax error : missing ';' before 'type'
mesa\main\enums.c(3781) : error C2065: 'elt' : undeclared identifier
mesa\main\enums.c(3781) : warning C4047: '!=' : 'int' differs in levels of indirection from 'void *'
mesa\main\enums.c(3782) : error C2065: 'elt' : undeclared identifier
mesa\main\enums.c(3782) : error C2223: left of '->offset' must point to struct/union
mesa\main\enums.c(3782) : warning C4033: '_mesa_lookup_enum_by_nr' must return a value

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-09-23 14:14:32 -07:00
Eric Anholt
11e494a572 mesa: Use -Bsymbolic in the linker to locally resolve Mesa-internal symbols.
Normally, LD_PRELOAD will take precedence over your own symbols, which you
want for things like malloc() in libc.  But we don't have any local
symbols we would want overridden (like hash_table_insert(), for example!),
so tell the linker to resolve them internally.  This also avoids calls
through the PLT.

Saves almost 100k on libdricore's size, and gets us a bunch of the
performance back that we had with non-dricore.

Reviewed-by: Ian Romanick <ian.d.romanick@.intel.com>
2013-09-23 12:45:22 -07:00
Eric Anholt
10ef949424 glsl: Hide many classes local to individual .cpp files in anon namespaces.
This gives the compiler the chance to inline and not export class symbols
even in the absence of LTO.  Saves about 60kb on disk.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@.intel.com>
2013-09-23 12:45:22 -07:00
Eric Anholt
07572621bc mesa: Drop an extra copy-and-pasted copy in the program clone function.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@.intel.com>
2013-09-23 12:45:22 -07:00
Eric Anholt
669b88eb12 mesa: Convert some runtime asserts to static asserts.
Noticed while grepping through the code for something else.

v2: Don't convert really-runtime asserts to static asserts.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@.intel.com>
2013-09-23 12:45:22 -07:00
Eric Anholt
673129e0b9 mesa: Shrink the size of the enum string lookup struct.
Since it's only used for debug information, we can misalign the struct and
save the disk space.  Another 19k on a 64-bit build.

v2: Make a compiler.h macro to only use the attribute if we know we can.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@.intel.com>
2013-09-23 12:45:22 -07:00
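The space saving in 673129e0b9 comes from dropping padding in a lookup-table element. A sketch of the idea, including a guarded macro in the spirit of the v2 note, is below. The `PACKED` definition and both struct layouts are assumptions for illustration; the actual fields of Mesa's struct are not shown in this log.

```c
#include <stdint.h>

/* Only use the attribute where we know the compiler understands it. */
#if defined(__GNUC__)
#define PACKED __attribute__((__packed__))
#else
#define PACKED
#endif

/* Naturally aligned: padding is inserted after 'offset'. */
struct plain_elt {
   uint16_t offset;  /* offset into a string table */
   int n;            /* enum value */
};

/* Same fields, but 'n' may now sit at a misaligned address. */
struct PACKED packed_elt {
   uint16_t offset;
   int n;
};
```

With GCC or Clang on common targets the packed variant is smaller per entry, at the cost of potentially slower misaligned loads, which the commit accepts because the table is only used for debug output.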
Eric Anholt
c0378b6400 mesa: Remove the extra enum strings and extra lookup table.
Now that there's no name -> enum direction, we can drop the extra strings,
and merge the offsets table and the reduced_enums table.

Between the previous commit and this one, Mesa core drops by 30k.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@.intel.com>
2013-09-23 12:45:22 -07:00
Eric Anholt
3b29a6ec91 mesa: Remove _mesa_lookup_enum_by_name().
It's been unused for a long time.  I stopped digging through git history
as of 2009.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@.intel.com>
2013-09-23 12:45:22 -07:00
Zack Rusin
755c11dc5e llvmpipe: increase number of subpixel bits to eight
Unfortunately d3d10 requires a lot higher precision (e.g.
wgf11clipping tests for it). The smallest number of precision
bits with which it passes is 8. That means that we need to
decrease the maximum length of an edge that we can handle without
subdivision by 4 bits. Abstracted the code a bit to make it easier
to change once we switch to 64-bit rasterization.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-23 14:53:07 -04:00
Vinson Lee
6d29db715b glsl: Define isnormal and copysign for MSVC to fix build.
This patch fixes these MSVC build errors.

ir_constant_expression.cpp
src\glsl\ir_constant_expression.cpp(564) : warning C4244: '=' : conversion from 'int' to 'float', possible loss of data
src\glsl\ir_constant_expression.cpp(1384) : error C3861: 'isnormal': identifier not found
src\glsl\ir_constant_expression.cpp(1385) : error C3861: 'copysign': identifier not found

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69541
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Matt Turner <mattst88@gmail.com>
2013-09-22 16:11:36 -07:00
Johannes Obermayr
6016dabfa2 Suppress clang's warnings about unused CFLAGS and CXXFLAGS.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-09-22 13:10:43 -07:00
Christian König
8bbcc43ad9 radeon/uvd: async flush the UVD cs
No need to block for the CS thread here.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-09-22 10:33:20 +02:00
Christian König
01a0dbcb96 winsys/radeon: share winsys between different fd's
Share the winsys between different fd's if they point to the same device.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-09-22 10:33:20 +02:00
Christian König
0653c66ef4 winsys/radeon: remove cs_queue_empty
Waiting for an empty queue is nonsense and can lead to deadlocks if we have
multiple waiters or another thread that continuously sends down new commands.

Just post the cs to the queue and immediately wait for it to finish.

This is a candidate for the stable branch.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-09-22 10:33:20 +02:00
Christian König
f7ccb84aa1 winsys/radeon: fix killing the CS thread
Kill the thread only after we checked that it's not used any more, not before.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-09-22 10:33:20 +02:00
Eric Anholt
938956ad52 i965/gen4: Fix fragment program rectangle texture shadow compares.
The rescale_texcoord(), if it does something, will return just the
GLSL-sized coordinate, leaving out the 3rd and 4th components where we
were storing our projected shadow compare and the texture projector.
Deref the shadow compare before using the shared rescale-the-coordinate
code to fix the problem.

Fixes piglit tex-shadow2drect.shader_test and txp-shadow2drect.shader_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69525
NOTE: This is a candidate for stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-21 16:48:58 -07:00
Abdiel Janulgue
1266f01dc7 i965/gen7.5: Fix missing Shader Channel Select entries on Haswell
Probably unintentional, but the SURFACE_STATE setup refactoring
for buffer surfaces had missed the scs bits when creating constant
surface states.

Fixes broken GLB 2.5 on Haswell where the knight's textures are missing

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-21 12:53:13 -07:00
Kenneth Graunke
4f1ebb8ddd i965, mesa: Use the new DECLARE_R[Z]ALLOC_CXX_OPERATORS macros.
These classes declared a placement new operator, but didn't declare a
delete operator.  Switching to the macro gives them a delete operator,
which probably is a good idea anyway.

This also eliminates a lot of boilerplate.

v2: Properly use RZALLOC in Mesa IR/TGSI translators.  Caught by Eric
    and Chad.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-09-21 09:17:21 -07:00
Kenneth Graunke
81a3759bb5 glsl: Use the new DECLARE_R[Z]ALLOC_CXX_OPERATORS in a bunch of places.
This eliminates a lot of boilerplate and should be 100% equivalent.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-09-21 09:17:06 -07:00
Kenneth Graunke
bfbad9d1a8 ralloc: Introduce new macros for defining C++ new/delete operators.
Most of our C++ classes define placement new and delete operators so we
can do convenient allocation via:

   thing *foo = new(mem_ctx) thing(...)

Currently, this is done via a lot of boilerplate.  By adding simple
macros to ralloc, we can condense this to a single line, making it
trivial to add this feature to a new class.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-09-21 09:16:02 -07:00
Grigori Goronzy
edbbfac6cf r600g: fast color clears for single-sample buffers
Allocate a CMASK on demand and use it to fast clear single-sample
colorbuffers. Both FBOs and window system colorbuffers are fast
cleared. Expand as needed when colorbuffers are mapped or displayed
on screen.

v2: cosmetics, move transfer expansion into dma_blit

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2013-09-20 20:35:55 +02:00
Grigori Goronzy
56d9a397aa r600g: add support for separately allocated CMASKs
v2: check for NULL cbufs

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2013-09-20 20:35:55 +02:00
Marek Olšák
419cd5f2a2 gallium: add flush_resource context function
r600g needs explicit flushing before DRI2 buffers are presented on the screen.

v2: add (stub) implementations for all drivers, fix frontbuffer flushing
v3: fix galahad

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2013-09-20 20:35:55 +02:00
Marek Olšák
d2bd63433a radeonsi: simplify and fix MSAA texture sampling for array textures
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-20 20:35:55 +02:00
Marek Olšák
defedc0f61 radeonsi: fix textureOffset and texelFetchOffset GLSL functions
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-20 20:35:55 +02:00
José Fonseca
1569b3e536 llvmpipe: Fix rendering to PIPE_FORMAT_R10G10B10A2_UNORM.
We must take rounding into consideration when re-scaling to narrow
normalized channels, such as 2-bit normalized alpha.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-20 17:34:57 +01:00
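The rounding concern in 1569b3e536 can be illustrated with integer arithmetic. llvmpipe generates this conversion as LLVM IR, so the function below is only a scalar sketch of round-to-nearest rescaling between normalized widths; `unorm_rescale` is a hypothetical name.

```c
/*
 * Rescale an unsigned normalized value from src_bits to dst_bits,
 * rounding to nearest. Assumes both widths are small (for example
 * 8 -> 2) so that value * dst_max does not overflow 'unsigned'.
 */
unsigned
unorm_rescale(unsigned value, unsigned src_bits, unsigned dst_bits)
{
   unsigned src_max = (1u << src_bits) - 1;
   unsigned dst_max = (1u << dst_bits) - 1;

   /* Adding src_max / 2 before dividing turns truncation into rounding. */
   return (value * dst_max + src_max / 2) / src_max;
}
```

For a 2-bit alpha channel the difference is visible: 200/255 is about 2.35 on a 0..3 scale, so rounding yields 2, whereas simply dropping the low six bits (200 >> 6) would yield 3.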
José Fonseca
2ab4e1d1e6 draw: Ensure draw_pt_middle_end::bind_parameters is never NULL.
Prevents calling NULL pointer with softpipe in certain cases.

Trivial.
2013-09-20 17:34:57 +01:00
José Fonseca
75c394f567 tools/trace: Simple script to compare two traces.
Based on the earlier apitrace tracediff.sh script.
2013-09-20 17:34:57 +01:00
Ian Romanick
1cc3b90d47 mesa: Silence GCC warning 'comparison between signed and unsigned integer expressions'
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-19 17:15:09 -05:00
Ian Romanick
7db6b5aa91 mesa: Fix broken call to print_table_stats
The function takes a parameter, but none was given.  Also, in the
non-GET_DEBUG case, silence the unused parameter warning.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-19 17:15:09 -05:00
Ian Romanick
b4cf56cdf8 glsl: Set VertexProgram.MaxOutputComponents and FragmentProgram.MaxInputComponents in standalone scaffolding
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-19 17:14:49 -05:00
Ian Romanick
be8963a18f mesa: Allow several ARB_geometry_shader4 queries in OpenGL 3.2
GL_MAX_GEOMETRY_TEXTURE_IMAGE_UNITS, GL_MAX_GEOMETRY_OUTPUT_VERTICES,
GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS, and
GL_MAX_GEOMETRY_UNIFORM_COMPONENTS all have the same enum value and
meaning as their _ARB counterparts.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-19 16:29:44 -05:00
Ian Romanick
df371e2b1b mesa: Expose MAX_GEOMETRY_{INPUT,OUTPUT}_COMPONENTS on OpenGL 3.2
The comment '# GL 3.0 / GLES3' was incorrect.  The
MAX_VERTEX_OUTPUT_COMPONENTS and MAX_FRAGMENT_INPUT_COMPONENTS queries
were added in OpenGL 3.2 (with geometry shaders) and OpenGL ES 3.0.
This just fixes that comment.

v2: Add the GEOMETRY queries in the existing '# GL 3.2' section since
they have nothing to do with GLES3.  Suggested by Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-19 16:29:44 -05:00
Ian Romanick
965d9e649d mesa: Get GL_MAX_FRAGMENT_INPUT_COMPONENTS from FragmentProgram.MaxInputComponents
In OpenGL ES 3.0 the minimum-maximum for GL_MAX_VERTEX_OUTPUT_VECTORS is 16,
but the minimum-maximum for GL_MAX_FRAGMENT_INPUT_VECTORS is 15.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-19 16:29:44 -05:00
Ian Romanick
d1ade4eaf1 mesa: Get GL_MAX_VERTEX_OUTPUT_COMPONENTS from VertexProgram.MaxOutputComponents
In OpenGL ES 3.0 the minimum-maximum for GL_MAX_VERTEX_OUTPUT_VECTORS is 16,
but the minimum-maximum for GL_MAX_FRAGMENT_INPUT_VECTORS is 15.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-19 16:29:44 -05:00
Ian Romanick
67a2d31735 i915: Set VertexProgram.MaxOutputComponents and FragmentProgram.MaxInputComponents
This was the only remaining place in Mesa that sets MaxVaryings without
also setting these values.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-19 16:29:44 -05:00
Ian Romanick
e1f8c58590 i965: Set *Program.Max{Input,Output}Components
Now that MaxVaryings is > 16, VertexProgram.MaxOutputComponents,
GeometryProgram.MaxInputComponents, GeometryProgram.MaxOutputComponents,
and FragmentProgram.MaxInputComponents also need to be set.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Paul Berry <stereotype441@gmail.com>
2013-09-19 16:29:44 -05:00
Ian Romanick
d358c6b700 mesa: Set default values for Max{Input,Output}Components in init_program_limits
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-19 16:29:44 -05:00
Ian Romanick
052c9ae1f3 mesa: Remove gl_constants::MaxVaryingComponents
There are no longer any users.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Cc: Zack Rusin <zackr@vmware.com>
2013-09-19 16:29:44 -05:00
Ian Romanick
d91249df1a mesa: Use correct data for MAX_{VERTEX,GEOMETRY}_VARYING_COMPONENTS_ARB queries
Previously gl_constants::MaxVaryingComponents was used.  Now
gl_constants::VertexProgram::MaxOutputs and
gl_constants::GeometryProgram::MaxOutputs are used.

This means that st_extensions.c had to be updated to set these fields
instead of MaxVaryingComponents.  It was previously the only place that
set MaxVaryingComponents.

I believe that the structure is allocated by calloc, so the value should
be initialized to zero in non-Gallium drivers before and after my
change.  Right now nobody enables GL_ARB_geometry_shader4, so it's
pretty much dead code anyway.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Cc: Zack Rusin <zackr@vmware.com>
2013-09-19 16:29:44 -05:00
Ian Romanick
a384238c3d mesa: Track per-stage shader input and output limits independently
In OpenGL 3.2 these are independently queryable.  In addition, the spec
has different minimum-maximums for various values.
GL_MAX_VERTEX_OUTPUT_COMPONENTS is 64, but
GL_MAX_GEOMETRY_OUTPUT_COMPONENTS (and GL_MAX_FRAGMENT_INPUT_COMPONENTS)
is 128.

In OpenGL ES 3.0 these are also independently queryable.  The spec has
different minimum-maximums for various values.
GL_MAX_VERTEX_OUTPUT_VECTORS is 16, but GL_MAX_FRAGMENT_INPUT_VECTORS
is 15.

None of these values are used yet.  I have just added space to the
structures.  Future patches will add users and eventually remove some
old fields.
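
As a rough illustration of the layout described above, the sketch below
keeps separate input and output limits per stage.  The struct and
variable names are hypothetical, not the actual Mesa structures, and the
numbers are only the spec minimum-maximums quoted in this message.

```c
#include <assert.h>

/* Hypothetical per-stage limits; not the actual Mesa structure. */
struct example_stage_limits {
    unsigned MaxInputComponents;
    unsigned MaxOutputComponents;
};

/* Desktop GL 3.2 minimum-maximums quoted above.  A 0 means the value is
 * not quoted in the message, not that the real limit is zero. */
static const struct example_stage_limits example_vertex   = { 0, 64 };
static const struct example_stage_limits example_geometry = { 0, 128 };
static const struct example_stage_limits example_fragment = { 128, 0 };

/* ES 3.0 counts vec4 vectors rather than scalar components. */
static unsigned components_to_vectors(unsigned components)
{
    return components / 4;
}
```

With one shared limit, the vertex stage could not report 64 components
while the geometry stage reports 128.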

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Cc: Zack Rusin <zackr@vmware.com>
2013-09-19 16:29:43 -05:00
Ian Romanick
d38765f3c8 mesa: Support GL_MAX_VERTEX_OUTPUT_COMPONENTS query with ES3
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
2013-09-19 16:29:43 -05:00
Kenneth Graunke
b6b549ccfc i965: Refactor Gen4-6 SURFACE_STATE setup for buffer surfaces.
This was an embarrassingly large amount of copy and pasted code,
and it wasn't particularly simple code either.  By factoring it out
into a helper function, we consolidate the complexity.

v2: Properly NULL-check bo.  Caught by Eric Anholt.
v3: Do the subtraction by 1 in gen7_emit_buffer_surface_state, rather
    than making callers do it.  This makes the buffer_size parameter
    the actual size of the buffer.  Suggested by Paul Berry.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-09-19 10:52:58 -07:00
Kenneth Graunke
e114cbff96 i965: Refactor Gen7+ SURFACE_STATE setup for buffer surfaces.
This was an embarrassingly large amount of copy and pasted code,
and it wasn't particularly simple code either.  By factoring it out
into a helper function, we consolidate the complexity.

v2: Properly NULL-check bo.  Caught by Eric Anholt.
v3: Do the subtraction by 1 in gen7_emit_buffer_surface_state, rather
    than making callers do it.  This makes the buffer_size parameter
    the actual size of the buffer.  Suggested by Paul Berry.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-09-19 10:52:58 -07:00
Kenneth Graunke
35a54ad02f i965: Fix off by one errors in texture buffer size calculations.
The value that's split into width/height/depth needs to be the size of
the buffer minus one.  This makes it consistent with the constant buffer
and shader time SURFACE_STATE setup code.
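
A minimal sketch of the "size minus one" split, assuming illustrative
field widths of 7, 13 and 7 bits.  The widths, struct and function names
are assumptions for the example and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three size fields. */
struct example_buffer_dims {
    uint32_t width;
    uint32_t height;
    uint32_t depth;
};

/* Split (buffer_size - 1) into three bit fields. */
static struct example_buffer_dims split_buffer_size(uint32_t buffer_size)
{
    struct example_buffer_dims d;
    uint32_t n;

    assert(buffer_size > 0);
    n = buffer_size - 1;            /* the value encoded is size - 1 */
    d.width  = n & 0x7f;            /* low 7 bits (assumed width) */
    d.height = (n >> 7) & 0x1fff;   /* next 13 bits (assumed width) */
    d.depth  = (n >> 20) & 0x7f;    /* next 7 bits (assumed width) */
    return d;
}

/* Inverse of the split: recover the real buffer size. */
static uint32_t join_buffer_size(struct example_buffer_dims d)
{
    return ((d.depth << 20) | (d.height << 7) | d.width) + 1;
}
```

Splitting the size itself instead of size minus one would make the
fields describe a buffer one element larger than the real one.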

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-09-19 10:52:58 -07:00
Kenneth Graunke
34b11334d4 i965: Fix writemask != 0 assertions on Sandybridge.
This fixes myriads of regressions since commit 169f9c030c
("i965: Add an assertion that writemask != NULL for non-ARFs.").

On Sandybridge, our control flow handling (such as brw_IF) does:

   brw_set_dest(p, insn, brw_imm_w(0));
   insn->bits1.branch_gen6.jump_count = 0;

This results in an IMM destination with zero for the writemask.  IMM
destinations are rather bizarre, but the code has been working for ages,
so I'm loath to change it.

Fixes glxgears on Sandybridge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-09-19 10:52:58 -07:00
Kenneth Graunke
d2d90d66d8 glsl: Delete builtin_builder::shader when destroying built-ins.
I would use _mesa_delete_shader, but it's declared static, and we don't
really need any of the stuff in it anyway.

This fixes a memory leak caught by Valgrind.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-09-19 10:52:58 -07:00
Kenneth Graunke
9f64bb2312 i965: Fix brw_gs_prog_data_compare to actually check field members.
&a and &b are the addresses of the local stack variables, not the actual
structures.  Instead of comparing the fields of a and b, we compared
...some stack memory.

Not a candidate for stable since GS code doesn't exist in 9.2.
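
The class of bug can be sketched as follows.  This is a simplified
stand-in with a hypothetical struct and function names, not the driver
code: the buggy version hands memcmp() the addresses of the local
pointer variables, so it never looks at the structures they point to.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a prog_data structure (no padding). */
struct example_prog_data {
    int param_count;
    int urb_read_length;
};

/* Buggy: &a and &b are the addresses of the local pointer variables,
 * so this compares the two pointer values, not the structures. */
static bool compare_buggy(const void *in_a, const void *in_b)
{
    const struct example_prog_data *a = in_a;
    const struct example_prog_data *b = in_b;

    return memcmp(&a, &b, sizeof(a)) == 0;
}

/* Fixed: compare the bytes of the pointed-to structures. */
static bool compare_fixed(const void *in_a, const void *in_b)
{
    const struct example_prog_data *a = in_a;
    const struct example_prog_data *b = in_b;

    return memcmp(a, b, sizeof(*a)) == 0;
}
```

For two distinct objects with identical contents, the buggy version
reports a mismatch and the fixed one reports a match.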

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-09-19 10:52:57 -07:00
Kenneth Graunke
4e4b079916 i965: Fix brw_vs_prog_data_compare to actually check field members.
&a and &b are the addresses of the local stack variables, not the actual
structures.  Instead of comparing the fields of a and b, we compared
...some stack memory.

Caught by Valgrind on Piglit's glsl-lod-bias test (among many others).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68233
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: mesa-stable@lists.freedesktop.org
2013-09-19 10:52:57 -07:00
Kenneth Graunke
feaad189b4 i965: Move binding table code to a new file, brw_binding_tables.c.
The code to upload the binding tables for each stage was scattered
across brw_{vs,gs,wm}_surface_state.c and brw_misc_state.c, which also
contain a lot of code to populate individual SURFACE_STATE structures.

This patch brings all the binding table upload code together, and splits
it out from the code which fills in SURFACE_STATE entries.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-09-19 10:52:57 -07:00
Kenneth Graunke
113a75ff2d i965: Use brw_upload_binding_table() for the pixel shader as well.
This is not quite the same: brw_upload_binding_table() also has code to
early-return if there are no entries, while the existing code did not.

The PS binding table is unlikely to be empty since it will have at least
one color buffer.  If it ever is empty, early returning seems wise.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-09-19 10:52:57 -07:00
Kenneth Graunke
72340839ca i965: Generalize brw_vec4_upload_binding_table() beyond vec4 stages.
Instead of passing in a brw_vec4_prog_data structure, we can simply
pass the one field it needs: the number of entries in the binding table.

We also need to pass in the shader time surface index rather than
hardcoding SURF_INDEX_VEC4_SHADER_TIME.

Since the resulting function is stage-agnostic, this patch removes
"vec4_" from the name.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-09-19 10:52:57 -07:00
Kenneth Graunke
254891b3fc i965: Convert loop to memcpy in brw_vec4_upload_binding_table().
This is probably more efficient.  At any rate, it's less code.
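
For illustration, the two forms look roughly like this.  The function
names and the uint32_t entry type are assumptions made for the sketch,
not the driver's actual signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Element-by-element copy, as a loop. */
static void copy_entries_loop(uint32_t *dst, const uint32_t *src,
                              size_t count)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < count; i++)
        dst[i] = src[i];
}

/* The same copy as a single memcpy() over count entries. */
static void copy_entries_memcpy(uint32_t *dst, const uint32_t *src,
                                size_t count)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, count * sizeof(*src));
}
```

Both produce the same bytes in dst as long as the two ranges do not
overlap.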

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-09-19 10:52:57 -07:00
Kenneth Graunke
0532b200f3 i965: Update comments in brw_vec4_upload_binding_table().
The first comment was a bit stale; there are more kinds of surfaces than
textures and pull constants.

The second was a leftover "to do" comment for something I already did.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-09-19 10:52:57 -07:00
Gaetan Nadon
79930c6027 winsys/sw/xlib: fix compile error in xlib_sw_winsys.c.
xlib_sw_winsys.h:5:22: fatal error: X11/Xlib.h: No such file or directory

The compiler cannot find the Xlib.h in the installed system headers.
All supplied include directives point to inside the mesa module.
The X11_CFLAGS variable is undefined (not defined in config.status).

It appears the intent was to use X11_INCLUDES defined in configure.ac.

The Xlib.h file is not installed on my workstation. It is supplied in
the libx11-dev package. This allows an X developer control over which
version of this file is used for X development.

Signed-off-by: Gaetan Nadon <memsize@videotron.ca>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-19 10:49:57 -07:00
Gaetan Nadon
092f2e8336 glx: fix compile error in egl_glx.c.
egl_glx.c:40:22: fatal error: X11/Xlib.h: No such file or directory

The compiler cannot find the Xlib.h in the installed system headers.
All supplied include directives point to inside the mesa module.
The X11_CFLAGS variable is undefined (not defined in config.status).

It appears the intent was to use X11_INCLUDES defined in configure.ac.

The Xlib.h file is not installed on my workstation. It is supplied in
the libx11-dev package. This allows an X developer control over which
version of this file is used for X development.

Signed-off-by: Gaetan Nadon <memsize@videotron.ca>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-19 10:49:47 -07:00
Rob Clark
7dab097a51 freedreno/a3xx: fix typo mixup w/ mipfilter
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-19 11:47:40 -04:00
Rob Clark
575a6e7ec5 freedreno: fix glReadPixels
duh, we still need to flush if there are pending draws and it isn't an
unsynchronized case.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-19 11:45:01 -04:00
Roland Scheidegger
532dc8939f gallivm: adjust wrap mode to CLAMP_TO_EDGE always for cube maps.
Technically without seamless filtering enabled GL allows any wrap mode, which
made sense when supporting true borders (can get seamless effect with border
and CLAMP_TO_BORDER), but gallium doesn't support borders and d3d9 requires
wrap modes to be ignored and it's a pain to fix up the sampler state (as it
makes it texture dependent). It is difficult to imagine a situation where an
app really wants another behavior so just cheat here. (It looks like some
graphics hw (intel) actually requires this too hence it should be safe.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-09-19 17:14:36 +02:00
Adrian Negreanu
602d368446 android: Remove builtin_compiler
The first part was done in:

   commit c845140a20
   Author: Kenneth Graunke <kenneth@whitecape.org>
   Date:   Tue Sep 3 21:22:17 2013 -0700

Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-09-18 09:35:55 -07:00
José Fonseca
e150c0da71 util/u_blit: Implement util_blit_pixels via pipe_context::blit.
This removes a lot of code, but not everything, as util_blit_pixels_tex
is still useful when one needs to override pipe_sampler_view::swizzle_?.

Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-18 11:25:02 +01:00
José Fonseca
d8c7e13886 util/u_blit: Support blits from cubemaps.
By calling util_map_texcoords2d_onto_cubemap.

A new parameter for util_blit_pixels_tex is necessary, as
pipe_sampler_view::first_layer is always supposed to point to the first
face when sampling from cubemaps.

Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-18 11:24:59 +01:00
José Fonseca
fb1d992da4 vega: Use pipe_context::blit instead of util_blit_pixels_tex.
Only compile-tested but it seems straightforward.

Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-18 11:23:28 +01:00
Kenneth Graunke
ec44d56a5b i965: Rename brw_{fs,vec4}_emit.cpp to brw_{fs,vec4}_generator.cpp.
The previous names were really confusing to talk about:
- brw_fs_visitor() contained methods named emit_whatever().
- brw_fs_generator() contained methods named generate_whatever(), but
  lived in brw_fs_emit.cpp.

So when someone said "the emit layer", or "emit code", we weren't sure
whether they meant the visitor's emit() functions or the generator in
brw_fs_emit.cpp.

By renaming these files, the method names, class names, and file names
all match, which is much less confusing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
2013-09-18 00:08:31 -07:00
Matt Turner
a3b51a22f7 glsl: Correctly validate fma()'s types.
lrp() can take a scalar as a third argument, and fma() cannot.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-17 17:02:06 -07:00
Matt Turner
d56bbd0441 glsl: Add frexp signatures and implementation.
I initially implemented frexp() as an IR opcode with a lowering pass,
but since it returns a value and has an out-parameter, it would break
assumptions our optimization passes make about ir_expressions being pure
(i.e., having no side effects).

For example, if opt_tree_grafting encounters this code:

uniform float u;
void main()
{
  int exp;
  float f = frexp(u, out exp);
  float g = float(exp)/256.0;
  float h = float(exp) + 1.0;
  gl_FragColor = vec4(f, g, h, g + h);
}

it may try to optimize it to this:

uniform float u;
void main()
{
  int exp;
  float g = float(exp)/256.0;
  float h = float(exp) + 1.0;
  gl_FragColor = vec4(frexp(u, out exp), g, h, g + h);
}

Some hardware has an instruction which performs frexp(), but we would
need some other compiler infrastructure to be able to generate it, such
as an intrinsics system that would allow backends to emit specific code
for particular bits of IR.
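
The C library's frexp() has the same shape as the GLSL built-in: it
returns one result and writes the other through a pointer.  The sketch
below uses the C function only to show why the call is not pure; the
struct and wrapper names are hypothetical.

```c
#include <assert.h>
#include <math.h>

/* Both results of one frexp() call, gathered in one place. */
struct example_frexp_result {
    double mantissa;
    int exponent;
};

static struct example_frexp_result split_float(double value)
{
    struct example_frexp_result r;

    /* The return value is the mantissa; the exponent is written
     * through the pointer as a side effect of the call. */
    r.mantissa = frexp(value, &r.exponent);
    return r;
}

/* ldexp() is the inverse: mantissa * 2^exponent. */
static double join_float(struct example_frexp_result r)
{
    return ldexp(r.mantissa, r.exponent);
}
```

Any read of the exponent is only meaningful after the call has run, so
the call cannot be moved past such a read.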

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-17 17:01:58 -07:00
Matt Turner
c43d6060b1 i965: Lower ldexp.
v2: Drop frexp lowering.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-17 16:59:26 -07:00
Matt Turner
d0b8ea60b7 glsl: Add ldexp_to_arith lowering pass.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-17 16:59:23 -07:00
Matt Turner
5561251b58 glsl: Allow vectors to be created from ir_constant().
Note the parameter name change in the int version of ir_constant, to
avoid the conflict with the loop iterator.

v2: Make analogous change to builtin_builder::imm().
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-17 16:59:14 -07:00
Matt Turner
b2ab840130 glsl: Add support for ldexp.
v2: Drop frexp. Rebase on builtins rewrite.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-17 16:59:05 -07:00
Paul Berry
4b0488ef4e i965: Add some missing bits to {mesa,brw,cache}_bits[].
These data structures are used for debug output, so it wasn't hurting
anything that there were missing bits.  But it's good to keep things
up to date.

This patch also adds static asserts so that the {brw,cache}_bits[]
arrays are the proper size, so that we don't forget to add to them in
the future.  Unfortunately there's no convenient way to assert that
mesa_bits[] is the proper size.
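
A minimal sketch of the idea, using a hypothetical enum and table rather
than the driver's real ones.  It assumes the table has one entry per bit
plus a terminating entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state bits; the final enumerator counts them. */
enum example_state_bit {
    EXAMPLE_STATE_VIEWPORT,
    EXAMPLE_STATE_SCISSOR,
    EXAMPLE_STATE_BLEND,
    EXAMPLE_NUM_STATE_BITS
};

struct example_bit_name {
    unsigned bit;
    const char *name;
};

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Debug names, one per bit, followed by a terminating entry. */
static const struct example_bit_name example_bits[] = {
    { 1u << EXAMPLE_STATE_VIEWPORT, "viewport" },
    { 1u << EXAMPLE_STATE_SCISSOR,  "scissor" },
    { 1u << EXAMPLE_STATE_BLEND,    "blend" },
    { 0, NULL }
};

/* Fails to compile if a bit is added without a matching table entry. */
static_assert(ARRAY_SIZE(example_bits) == EXAMPLE_NUM_STATE_BITS + 1,
              "example_bits[] is out of sync with enum example_state_bit");
```

The check needs an enumerator that counts the bits, which is presumably
what a table built from independent #defines lacks.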

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-17 15:18:18 -07:00
Paul Berry
3374dabce7 i965/gs: Implement basic gl_PrimitiveIDIn functionality.
If the geometry shader refers to the built-in variable
gl_PrimitiveIDIn, we need to set a bit in 3DSTATE_GS to tell the
hardware to dispatch primitive ID to r1, and we need to leave room for
it when allocating registers.

Note: this feature doesn't yet work properly when software primitive
restart is in use (the primitive ID counter will incorrectly reset
with each primitive restart, since software primitive restart works by
performing multiple draw calls).  I plan to address that in a future
patch series.

Fixes piglit test "spec/glsl-1.50/execution/geometry/primitive-id-in".

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-17 15:18:14 -07:00
Paul Berry
f67fa8f3c8 i965/gs: New gs primitive types are supported by HW primitive restart.
When we previously implemented primitive restart, we didn't add cases
to brw_primitive_restart.c's can_cut_index_handle_prims() for the
primitive types that are introduced with geometry shaders.  It turns
out that all of the new primitive types are supported by hardware
primitive restart.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-17 15:18:11 -07:00
Paul Berry
9791af90e3 i965/gs: Add new primitive types.
As part of its support for geometry shaders, GL 3.2 introduces four
new primitive types: GL_LINES_ADJACENCY, GL_LINE_STRIP_ADJACENCY,
GL_TRIANGLES_ADJACENCY, and GL_TRIANGLE_STRIP_ADJACENCY.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-17 15:18:07 -07:00
Roland Scheidegger
93b5f71179 gallivm: some bits of seamless cube filtering implementation
Simply adjust wrap mode to clamp_to_edge. This is all that's needed for a
correct implementation for nearest filtering, and it's way better than
using repeat wrap for instance for linear filtering (though obviously this
doesn't actually do seamless filtering).

v2: fix s/t wrap not r/s...

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-09-18 00:00:37 +02:00
Kenneth Graunke
b8244b0056 i965: Remove MIPLAYOUT_BELOW from Gen4-6 constant buffer surface state.
Specifying a miptree layout makes no sense for constant buffers.

This has no functional change since BRW_SURFACE_MIPMAPLAYOUT_BELOW is
just a #define for 0.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-17 13:17:07 -07:00
Kristian Høgsberg
a1b6e69e45 egl: Also add EGL_TEXTURE_FORMAT as a valid eglQueryWaylandBufferWL attribute
Now that we have a table of accepted eglQueryWaylandBufferWL() attributes,
we should also list EGL_TEXTURE_FORMAT.
2013-09-16 22:22:49 -07:00
Stanislav Vorobiov
1281a90532 egl: add EGL_WAYLAND_Y_INVERTED_WL attribute
This enables querying of wl_buffer's orientation
2013-09-16 22:20:27 -07:00
Kenneth Graunke
9ad6dda21e i965: Use gen7_upload_constant_state for 3DSTATE_CONSTANT_PS as well.
Now we use gen7_upload_constant_state() for all three shader stages.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-16 18:25:14 -07:00
Kenneth Graunke
e776c18afb i965: Set brw_stage_state::push_const_size for PS constants.
This paves the way for using gen7_upload_constant_state for PS data.

The formula is copied from gen7_wm_state.c.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-16 18:25:11 -07:00
Kenneth Graunke
d385edf4c3 i965: Introduce a prog_data temporary in gen6_upload_wm_push_constants.
This saves a bit of typing and shortens a few lines.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-16 18:25:07 -07:00
Paul Berry
24765c58bd i965/gen6+: Support 128 varying components.
GL 3.2 requires us to support 128 varying components for geometry
shader outputs and fragment shader inputs, and 64 varying components
otherwise.  But there's no hardware limitation that restricts us to 64
varying components, and core Mesa doesn't currently allow different
stages to have different maximum values, so just go ahead and enable
128 varying components for all stages.  This gets us better test
coverage anyway.

Even though we are only working on GL 3.2 support for gen7 right now,
gen6 also supports 128 varying components, so go ahead and switch it
on there too.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-16 12:53:58 -07:00
Paul Berry
f5d38c58ee i965/ff_gs: Generate URB writes using a loop.
Previously we only ever did 1 URB write, since the maximum number of
varyings we support is small enough to fit in 1 URB write (when using
BRW_URB_SWIZZLE_NONE, which is what the pre-Gen7 GS always uses).  But
we're about to increase the number of varying components we support
from 64 to 128.

With 128 varyings, the most URB writes we'll have to do is 2, but it's
just as easy to write a general-purpose loop.
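
The general shape of such a loop is sketched below.  The function name
and the max_slots_per_write parameter are hypothetical; the real
per-message capacity depends on the hardware and is not stated here.

```c
#include <assert.h>

/* Walk num_slots in chunks of at most max_slots_per_write and return
 * how many write messages that takes. */
static unsigned count_urb_writes(unsigned num_slots,
                                 unsigned max_slots_per_write)
{
    unsigned offset = 0;
    unsigned writes = 0;

    assert(max_slots_per_write > 0);
    while (offset < num_slots) {
        unsigned count = num_slots - offset;

        if (count > max_slots_per_write)
            count = max_slots_per_write;

        /* A real implementation would emit one URB write here for
         * slots [offset, offset + count). */
        offset += count;
        writes++;
    }
    return writes;
}
```

The loop covers the one-write and two-write cases without special
casing either.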

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-16 12:53:55 -07:00
Paul Berry
57b8cff33c i965/gen6: Fix assertions on VS/GS URB size.
The "{VS,GS} URB Entry Allocation Size" fields of 3DSTATE_URB allow
values in the range 0-4, but they are U8-1 fields, so the range of
possible allocation sizes is 1-5.  We were erroneously prohibiting a
size of 5.
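
In other words, the field stores the size minus one.  A small sketch,
with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Encode an allocation size of 1-5 into a field value of 0-4. */
static uint32_t encode_urb_entry_size(uint32_t size)
{
    assert(size >= 1 && size <= 5);   /* 5 is a legal size */
    return size - 1;
}

/* Decode a field value of 0-4 back into a size of 1-5. */
static uint32_t decode_urb_entry_size(uint32_t field)
{
    assert(field <= 4);
    return field + 1;
}
```

An assertion written against the field range (at most 4) but applied to
the size would reject the largest legal size.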

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-16 12:53:52 -07:00
Paul Berry
784044c206 i965/vec4: Generate URB writes using a loop.
Previously we only ever did 1 or 2 URB writes, since the maximum
number of varyings we support is small enough to fit in 2 URB writes.
But GL 3.2 requires the geometry shader to support 128 output varying
components, and this could require up to 3 URB writes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-16 12:53:49 -07:00
Paul Berry
875972029e i965/fs: When >64 input components, order them to match prev pipeline stage.
Since the SF/SBE stage is only capable of performing arbitrary
reorderings of 16 varying slots, we can't arrange the fragment shader
inputs in an arbitrary order if there are more than 16 input varying
slots in use.  We need to make sure that slots 16-31 match the
corresponding outputs of the previous pipeline stage.

The easiest way to accomplish this is to just make all varying slots
match up with the previous pipeline stage.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-16 12:53:46 -07:00
Paul Berry
a4546ec114 i965/fs: Simplify computation of key.input_slots_valid during precompile.
The for loop was rather silly.  In addition to checking brw->gen < 6
on each loop iteration, it took pains to exclude bits from
fp->Base.InputsRead that don't correspond to fragment shader inputs.
But those bits would never have been set in the first place, since the
only bits that are ever set in fp->Base.InputsRead are fragment shader
inputs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-16 12:53:43 -07:00
Paul Berry
8a36f4382b i965/gs: Stop storing an input VUE map in the GS program key.
Now that the vertex shader output VUE map is determined solely by a
64-bit bitfield, we don't have to store it in its entirety in the
geometry shader program key; instead, we can just store the bitfield,
and let the geometry shader infer the VUE map at compile time.

This dramatically reduces the size of the geometry shader program key,
which we want to keep small since it gets recomputed whenever the
active program changes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-16 12:53:40 -07:00
Paul Berry
d1ad447f01 i965/gen6+: Remove VUE map dependency on userclip_active.
Previously, on Gen6+, we laid out the vertex (or geometry) shader VUE
map differently depending whether user clipping was active.  If it was
active, we put the clip distances in slots 2 and 3 (where the clipper
expects them); if it was inactive, we assigned them in the order of
the gl_varying_slot enum.

This made for unnecessary recompiles, since turning clipping on/off
for a shader that used gl_ClipDistance might rearrange the varyings.
It also required extra bookkeeping, since it required the user
clipping flag to be provided to brw_compute_vue_map() as a parameter.

With this patch, we always put clip distances in slots 2 and 3 if
they are written to.  do_vs_prog() and do_gs_prog() are responsible
for ensuring that clip distances are written to when user clipping is
enabled (as do_vs_prog() previously did for gen4-5).

This makes the only input to brw_compute_vue_map() a bitfield of which
varyings the shader writes to, a fact that we'll take advantage of in
forthcoming patches.
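
A simplified sketch of a slot map that depends only on the bitfield of
written varyings.  The enum, struct and function names are hypothetical
and much smaller than the real gl_varying_slot and VUE map; slots 0 and
1 are assumed to be fixed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical varying list; the real enum is gl_varying_slot. */
enum example_varying {
    EX_HEADER,
    EX_POS,
    EX_CLIP_DIST0,
    EX_CLIP_DIST1,
    EX_COLOR,
    EX_TEX0,
    EX_TEX1,
    EX_NUM_VARYINGS
};

#define EX_BIT(v) (UINT64_C(1) << (v))

struct example_vue_map {
    int varying_to_slot[EX_NUM_VARYINGS];   /* -1 if not present */
    int num_slots;
};

static void assign_slot(struct example_vue_map *map, int varying)
{
    map->varying_to_slot[varying] = map->num_slots++;
}

/* The only input is the bitfield of varyings the shader writes. */
static void compute_example_vue_map(struct example_vue_map *map,
                                    uint64_t slots_written)
{
    for (int i = 0; i < EX_NUM_VARYINGS; i++)
        map->varying_to_slot[i] = -1;
    map->num_slots = 0;

    /* Slots 0 and 1 are assumed fixed in this sketch. */
    assign_slot(map, EX_HEADER);
    assign_slot(map, EX_POS);

    /* Clip distances go in slots 2 and 3 whenever they are written. */
    if (slots_written & (EX_BIT(EX_CLIP_DIST0) | EX_BIT(EX_CLIP_DIST1))) {
        assign_slot(map, EX_CLIP_DIST0);
        assign_slot(map, EX_CLIP_DIST1);
    }

    /* Everything else follows in enum order. */
    for (int v = EX_COLOR; v < EX_NUM_VARYINGS; v++) {
        if (slots_written & EX_BIT(v))
            assign_slot(map, v);
    }
}
```

Because no clipping flag is involved, two shaders with the same bitfield
always get the same layout.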

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-16 12:53:36 -07:00
Paul Berry
3a83b20dcc i965/fs: Stop wasting input attribute space on gl_FragCoord and gl_FrontFacing.
Previously, if a fragment shader accessed gl_FragCoord or
gl_FrontFacing, we would assign them their own slots in the fragment
shader input attribute array, using up space that could be made
available to real varyings.  This was not strictly necessary (since
these values are not true varyings, and are instead computed from
other data available in the FS payload).  But we had to do it anyway
because the SF/SBE setup code assumed that every 1 bit in the
gl_program::InputsRead bitfield corresponded to a genuine varying
variable.

Now that the SF/SBE code consults brw_wm_prog_data and only sets up
the attributes that the fragment shader actually needs, we don't have
to do this anymore.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-16 12:53:32 -07:00
Paul Berry
0af1252ae4 i965/sf: Consult brw_wm_prog_data when setting up SF/SBE state.
Previously, the SF/SBE setup code delivered varying inputs to the FS
in the order in which they appear in the gl_program::InputsRead
bitfield, since that's what the FS expects.

When we add support for more than 64 varying components, this will no
longer always be the case, because the Gen6+ SF/SBE stage is only
capable of performing arbitrary reorderings of 16 varying slots.  So,
when there are more than 16 vec4's worth of varying inputs, the FS
will have to adjust the order of its input varyings in order to partially
match the order of outputs from the geometry or vertex shader.

To allow extra flexibility in the ordering of FS varyings, this patch
causes the SF/SBE to deliver varying inputs to the FS in exactly the
order that the FS requests, by consulting brw_wm_prog_data::urb_setup
and brw_wm_prog_data::num_varying_inputs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-16 12:53:29 -07:00
Paul Berry
af84bbd2ca i965/sf: Consolidate common code for setting up gen6-7 attribute overrides.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-16 12:53:25 -07:00
Paul Berry
d5b4095356 i965/sf: Use BRW_SF_URB_ENTRY_READ_OFFSET rather than hardcoded values.
We always program the SF unit to start reading the vertex URB entry at
offset 1.  In upcoming patches, we'll be adding FS code that relies on
this.  So consistently use the constant BRW_SF_URB_ENTRY_READ_OFFSET
rather than hardcoding a 1.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-16 12:53:21 -07:00
Paul Berry
8c2b9bd1df i965/fs: Consult brw_wm_prog_data::num_varying_inputs when setting up WM state.
Previously, we assumed that the number of varying inputs consumed by
the fragment shader was equal to the number of bits set in
gl_program::InputsRead.  However, we'll soon be making two changes
that will cause that not to be true:

- We'll stop wasting varying input space for gl_FragCoord and
  gl_FrontFacing, which aren't varyings.

- For fragment shaders that have more than 16 varying inputs, we'll
  adjust the layout of the inputs to account for the fact that the
  SF/SBE pipeline stage can't reorder inputs beyond the first 16; if
  there are GS outputs that the FS doesn't use (or vice versa) this
  may cause the number of FS varying inputs to change.

So, instead of trying to guess the number of FS inputs from
gl_program::InputsRead, simply read it from
brw_wm_prog_data::num_varying_inputs, which is guaranteed to be correct
since it's populated by fs_visitor::calculate_urb_setup().
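
To see why counting bits over-counts, consider the sketch below.  The
bit positions and function names are hypothetical, and it only models
the first of the two cases above; the driver reads the real count from
brw_wm_prog_data rather than computing it this way.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions in an InputsRead-style bitfield. */
enum {
    EX_IN_FRAG_COORD   = 0,
    EX_IN_FRONT_FACING = 1,
    EX_IN_VAR0         = 2,
    EX_IN_VAR1         = 3
};

/* Number of set bits in v. */
static unsigned count_bits(uint64_t v)
{
    unsigned n = 0;

    while (v) {
        v &= v - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Count only inputs that are genuine varyings. */
static unsigned count_true_varyings(uint64_t inputs_read)
{
    const uint64_t not_varyings = (UINT64_C(1) << EX_IN_FRAG_COORD) |
                                  (UINT64_C(1) << EX_IN_FRONT_FACING);

    return count_bits(inputs_read & ~not_varyings);
}
```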

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-16 12:53:18 -07:00
Paul Berry
8c69eaba1a i965/fs: Change brw_wm_prog_data::urb_read_length to num_varying_inputs.
On gen4-5, the FS stage reads varying inputs from URB entries that
were output by the SF thread, where each register stores the
interpolation setup for two components of a vec4, therefore the FS
urb_read_length is twice the number of FS input varyings.  On gen6+,
varying inputs are directly deposited in the FS payload by the SF/SBE
fixed function logic, so urb_read_length is irrelevant.

However, in future patches, it will be nice to be able to consult
brw_wm_prog_data to determine how many varying inputs the FS expects
(rather than inferring it from gl_program::InputsRead).  So instead of
storing urb_read_length, we simply store num_varying_inputs in
brw_wm_prog_data.  On gen4-5, we multiply this by 2 to recover the URB
read length.
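
The relationship can be sketched as a small helper.  The function name
and the gen parameter are hypothetical, and returning 0 on gen6+ is just
a placeholder for "not used".

```c
#include <assert.h>

/* Derive the FS URB read length from the number of varying inputs. */
static unsigned example_fs_urb_read_length(unsigned gen,
                                           unsigned num_varying_inputs)
{
    /* gen4-5: the read length is twice the number of input varyings. */
    if (gen < 6)
        return num_varying_inputs * 2;

    /* gen6+: the value is irrelevant; 0 is a placeholder. */
    return 0;
}
```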

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-16 12:53:14 -07:00
Paul Berry
58f01bd17d i965/fs: Expose "urb_setup" as part of brw_wm_prog_data.
At the moment, for Gen6+, the FS assumes that all varying inputs are
delivered to it in the order in which they appear in the
gl_program::InputsRead bitfield, and the SF/SBE setup code ensures
that they are delivered in this order.

When we add support for more than 64 varying components, this will no
longer always be possible, because the Gen6+ SF/SBE stage is only
capable of performing arbitrary reorderings of 16 varying slots.

To allow extra flexibility in the ordering of FS varyings, this patch
causes the FS to advertise exactly what ordering it expects.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-16 12:53:05 -07:00
Chia-I Wu
4a6939edae ilo: make ilo_bind_sampler_states return void
So that it can be hooked up to pipe_context::bind_sampler_states that is
currently living on another branch.
2013-09-17 00:20:50 +08:00
Kenneth Graunke
120d100627 glsl/tests: Update .gitignore for new unit test.
I rarely run 'git status', so I failed to notice this was missing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-16 08:26:09 -07:00
Kenneth Graunke
1da3ff1b1c glsl/tests: Add a test for properties of sampler types.
For each sampler type, this tests that:
- The base type is GLSL_TYPE_SAMPLER.
- The dimensionality is set correctly.
- The returned data type is correct.
- The sampler_array and sampler_shadow flags are set correctly.
- sampler_coordinate_components() returns the correct value.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2013-09-15 21:48:20 -07:00
Dave Airlie
2f508f244e st/mesa: don't dereference stObj->pt if NULL
It seems a user app can get us into this state, I trigger the fail
running fbo-maxsize inside virgl, it fails to create the backing
storage for the texture object, but then segfaults here when it
should fail the completeness test.

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-09-16 08:33:02 +10:00
Dave Airlie
bbe3d6dc29 nouveau: fix regression since float comparison instructions (v2)
Fix the return type and allow src and dst types for comparison
to be separate, this at least fixes the two test cases I've written.

v2: drop the u32->s32 change

Acked-by: Christoph Bumiller <christoph.bumiller@speed.at>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-09-16 08:32:42 +10:00
Rico Schüller
6f52295129 vdpau/decode: Check max width and max height.
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-09-15 16:18:08 +02:00
Rob Clark
ffa3244534 freedreno: PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE
When the old contents do not need to be preserved, it is faster to
create a new backing bo rather than stall.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-14 13:31:58 -04:00
Rob Clark
d7be322410 freedreno/a3xx: fix VFD_INDEX_MAX overflow
max_index may be 0xffffffff.  The hardware does not need 1 + max_index
(although it does not hurt unless max_index wraps around to zero).
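
The wraparound is plain 32-bit unsigned arithmetic.  A sketch with
hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour: program 1 + max_index, which wraps for 0xffffffff. */
static uint32_t index_max_plus_one(uint32_t max_index)
{
    return max_index + 1;   /* unsigned arithmetic wraps modulo 2^32 */
}

/* New behaviour: pass max_index through unchanged. */
static uint32_t index_max_unchanged(uint32_t max_index)
{
    return max_index;
}
```

With max_index of 0xffffffff the first form yields 0, the second keeps
0xffffffff.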

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-14 13:31:58 -04:00
Rob Clark
c756a3ef70 freedreno: add debug option to disable GMEM bypass
Useful for debugging.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-14 13:31:58 -04:00
Rob Clark
cdec879e38 freedreno/a3xx: handle front_ccw
Used by supertuxkart.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-14 13:31:58 -04:00
Rob Clark
cda75253f7 freedreno/a3xx: stencil fixes
For mem->gmem we don't sample depth/stencil as its native type.  So we
need to setup the swizzle state for the sampler based on the format used
for sampling.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-14 13:31:58 -04:00
Rob Clark
65ae4392ce freedreno/a3xx: alpha-test
Needed by some games, like etuxracer and supertuxkart which use alpha
test rather than blending, to handle texture transparency.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-14 13:31:58 -04:00
Rob Clark
dbf041e61f freedreno/a3xx/compiler: implement SUB
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-14 13:31:58 -04:00
Rob Clark
1a42d4ee34 freedreno/a3xx: use INDIRECT state load for shaders
With a debug option to force DIRECT (mainly to make it easier for
capturing cmdstream dumps).  Using INDIRECT for large shaders at least
makes a noticeable reduction in CPU load, which helps for CPU limited
games.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-14 13:31:58 -04:00
Rob Clark
6e9c386d16 freedreno: avoid stalling at ringbuffer wraparound
Because of how the tiling works, we can't really flush at arbitrary
points very easily.  So wraparound is handled by resetting to top of
ringbuffer.  Previously this would stall until current rendering is
complete.  Instead cycle through multiple ringbuffers to avoid a stall.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-14 13:31:58 -04:00
Rob Clark
ca505303a7 freedreno: emit markers to scratch registers
Emit markers by writing to scratch registers in order to "triangulate"
gpu lockup position from post-mortem register dump.  By comparing
register values in post-mortem dump to command-stream, it is possible to
narrow down which DRAW_INDX caused the lockup.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-14 13:31:58 -04:00
Rob Clark
1e6d290f21 freedreno: split out WFI helper
Mostly just to give an easy debug/instrumentation point.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-14 13:31:58 -04:00
Rob Clark
74052347f3 freedreno: fd_draw helper
Have a single helper that all draws come through, mainly for a
convenient debug and instrumentation point.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-14 13:31:58 -04:00
Rob Clark
4712904ddc freedreno/a3xx: fix gpu lockup in some piglit tests
The varying-out config comes from the inputs of the frag shader (so that
we aren't exporting unneeded varyings).  The varyings-count should come
from the frag shader as well, to avoid a discrepancy in configuration
and resulting gpu lockup.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-14 13:31:58 -04:00
Rob Clark
64c134cedb freedreno/a3xx/compiler: add LIT
Needed by glxgears and etuxracer ;-)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-14 13:31:58 -04:00
Rob Clark
cb9e07aa84 freedreno: multi-slice resources (cubemap, mipmap, etc)
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-09-14 13:31:58 -04:00
Paul Berry
71ffac691b glsl/builtins: Fix {texture1D,texture2D,shadow1D}ArrayLod availability.
These functions are defined in EXT_texture_array, which makes no
mention of what shader types they should be allowed in.  At the time
EXT_texture_array was introduced, functions ending in "Lod" were
available only in vertex shaders, however this restriction was lifted
in later spec versions and extensions.

We already have the function lod_exists_in_stage() for figuring out
whether functions ending in "Lod" should be available, so just re-use
that.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-13 14:59:06 -07:00
Kenneth Graunke
4b3c0a797f i965: Use brw_stage_state for WM data as well.
This gets the VS, GS, and PS all using the same data structure.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-13 14:26:52 -07:00
Kenneth Graunke
e6e5f88848 i965: Increase the size of brw_stage_state::surf_offset.
Since BRW_MAX_WM_SURFACES is greater than BRW_MAX_VEC4_SURFACES, the
existing array isn't large enough to be used by the WM.  Increasing it
will make it possible to share them.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-13 14:26:50 -07:00
Kenneth Graunke
3a835b699a i965: Add comments to the new brw_stage_state structure's fields.
These are largely based on the similar fields in brw->wm.

v2: Add a better comment than "Scratch buffer".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-13 14:26:31 -07:00
Ian Romanick
ea373f03e8 mesa: Rename MESA_shader_integer_mix to EXT_shader_integer_mix
Everyone at the Khronos meeting was as surprised that GLSL didn't
already support this as we were.  Several vendors said they'd ship it,
but there didn't seem to be enough interest to put in the effort to make
it ARB or KHR.

v2: Fix a couple typos and rename the spec file to
EXT_shader_integer_mix.spec.  Suggested by Roland.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-13 09:56:36 -05:00
Marek Olšák
f4e35f897e radeonsi: fix and enable transform feedback for CIK
The CP_STRMOUT_CNTL register was moved again.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-09-13 01:08:04 +02:00
Marek Olšák
f317ce5c5d radeonsi: fix gl_InstanceID with non-zero start_instance
start_instance doesn't affect gl_InstanceID.

There's no piglit test, but it's kinda obvious the code was wrong.

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-09-13 01:08:03 +02:00
Marek Olšák
9c75d2f65b gallium: comment that INSTANCEID doesn't include start_instance
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-09-13 01:08:03 +02:00
Marek Olšák
122a880b78 radeonsi: enable streamout AKA transform feedback for SI
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-13 01:07:56 +02:00
Marek Olšák
8d03d923b6 radeonsi: implement streamout shader support
The shader is responsible for writing to streamout buffers using
the TBUFFER_STORE_FORMAT_* instructions.

The locations of some input SGPRs and VGPRs are assigned dynamically, because
the input SGPRs controlling streamout are not declared if they are not needed,
decreasing the indices of all following inputs.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-13 01:04:44 +02:00
Marek Olšák
9d16e70b3f radeonsi: implement glDrawTransformFeedback functionality
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-13 01:04:44 +02:00
Marek Olšák
6cf29c7dab radeonsi: fix streamout queries
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-13 01:04:44 +02:00
Marek Olšák
91ede46222 radeonsi: implement streamout flush properly
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-13 01:04:44 +02:00
Marek Olšák
2993ccab38 radeonsi: bind streamout buffers to VGT and the vertex shader
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-13 01:04:44 +02:00
Marek Olšák
e4c5d3ee27 radeonsi: handle rasterizer_discard and set GS_OUT_PRIM_TYPE
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-13 01:04:44 +02:00
Marek Olšák
9eb3b9dc2b radeonsi: initialize the first CS like any other
So that the "init" state is always emitted first and not later in draw_vbo.

This fixes streamout where the "init" state, which disables streamout,
was emitted in draw_vbo after streamout was enabled.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-13 01:04:44 +02:00
Marek Olšák
2b0a54d6ec radeonsi: integrate shared streamout state
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-13 01:04:44 +02:00
Marek Olšák
4ea35023c5 radeon: don't emit streamout state if there are no streamout buffers
This could happen if set_stream_output_targets is called twice
in a row without a draw call in between.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-13 01:04:44 +02:00
Marek Olšák
60416cb173 radeon: don't emit VGT_STRMOUT_BUFFER_BASE on SI
The register doesn't exist on SI.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-09-13 01:04:44 +02:00
Kenneth Graunke
2b71b3d466 mesa: Disallow relinking if a program is used by an active XFB object.
Paused transform feedback objects may refer to a program other than the
current program.  If any active objects refer to a program, LinkProgram
must reject the request to relink.

The code to detect this is ugly since _mesa_HashWalk is awkward to use,
but unfortunately we can't use hash_table_foreach since there's no way
to get at the underlying struct hash_table (and even then, we'd need to
handle locking somehow).

Fixes the last subcase of Piglit's new ARB_transform_feedback2
api-errors test.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-09-12 10:19:10 -07:00
Kenneth Graunke
9cc74c93f8 mesa: Reject ResumeTransformFeedback if the wrong program is bound.
This is actually a pretty important error condition: otherwise, you
could set up transform feedback with one program, and resume it with
a program that generates a completely different set of outputs.

Fixes a subcase of Piglit's new ARB_transform_feedback2 api-errors test.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-09-12 10:19:09 -07:00
Kenneth Graunke
c732f68cf4 mesa: Track the vertex program active at BeginTransformFeedback() time.
The next few patches will use this for API error checking.

All of the drivers appear to CALLOC_STRUCT transform feedback objects,
so this should be properly NULL initialized on creation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-09-12 10:19:07 -07:00
Kenneth Graunke
a7d616da69 mesa: Disallow TransformFeedbackVaryings when active.
Fixes a subcase of Piglit's new ARB_transform_feedback2 api-errors test.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-09-12 10:18:59 -07:00
Christian König
2487324591 radeon/uvd: move more logic into the common files
Move the code back into the common UVD files since we now
have base structures for R600 and radeonsi.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-09-12 15:16:30 +02:00
Christian König
56be937d42 radeon/uvd: use more sane defaults for bitstream buffer size
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-09-12 15:16:06 +02:00
Andreas Boll
32637f56a5 os: First check for __GLIBC__ and then for PIPE_OS_BSD
Fixes FTBFS on kfreebsd-*

Debian GNU/kFreeBSD doesn't provide getprogname() since it uses stdlib.h
from glibc. Instead it provides program_invocation_short_name from glibc.

You can find the same order in src/mesa/drivers/dri/common/xmlconfig.c

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Tested-by: Julien Cristau <jcristau@debian.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-12 12:35:34 +02:00
José Fonseca
315f8f17d0 llvmpipe: Remove the special path for TGSI_OPCODE_EXP.
It was wrong for EXP.y, as we clamped the source before computing the
fractional part, and this opcode should be rarely used, so it's not
worth the hassle.
2013-09-12 11:24:24 +01:00
José Fonseca
e75211df0f trace: Several enhancements to dump_state.py
- Handle more calls
- Handle more state
- Try to normalize the output a bit, to eliminate spurious differences
2013-09-12 11:24:24 +01:00
José Fonseca
9641f1037c trace: Support bigger TGSI shaders.
Trivial.
2013-09-12 11:24:24 +01:00
Kenneth Graunke
c59659ca08 glsl: Use sampler_coordinate_components instead of passing it by hand.
We used to pass the number of components actually used for the
coordinate (rather than padding, shadow comparators, and projectors) by
hand, specifying it on every _texture() call.

The new helper function can just compute this, eliminating a lot of
potential mistakes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-11 22:48:32 -07:00
Kenneth Graunke
694be9115d glsl: Add a new glsl_type::sampler_coordinate_components() function.
This computes the number of components necessary to address a sampler
based on its dimensionality.  It will be useful for texturing built-ins.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-11 22:48:32 -07:00
Johannes Obermayr
5eb7ff1175 Move nv30, nv50 and nvc0 to nouveau.
It is planned to ship openSUSE 13.1 with -shared libs.
nouveau.la, nv30.la, nv50.la and nvc0.la are currently LIBADDs in all nouveau
related targets.
This change makes it possible to easily build one shared libnouveau.so which is
then LIBADDed.
Also dlopen will be faster for one library instead of three and build time on
-jX will be reduced.

Whitespace fixes were requested by 'git am'.

Signed-off-by: Johannes Obermayr <johannesobermayr@gmx.de>
Acked-by: Christoph Bumiller <christoph.bumiller@speed.at>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2013-09-11 21:47:07 +02:00
Paul Berry
ebcdaa7bbc i965/gs: implement EndPrimitive() functionality in the visitor.
According to GLSL, the shader may call EndPrimitive() at any point
during its execution, causing the line or triangle strip currently
being output to be terminated and a new strip to be begun.

This is implemented in gen7 hardware by using one control data bit per
vertex, to indicate whether EndPrimitive() was called after that
vertex was emitted.

In order to make this work without sacrificing too much efficiency, we
accumulate 32 control data bits at a time in a GRF.  When we have
accumulated 32 bits (or when the shader terminates), we output them to
the appropriate DWORD in the control data header and reset the
accumulator to 0.

We have to take special care to make sure that EndPrimitive() calls
that occur prior to the first vertex have no effect.

Since geometry shaders that output a large number of vertices are
likely to be rare, an optimization kicks in if max_vertices <= 32.  In
this case, we know that we can wait until the end of shader execution
before any control data bits need to be output.

I've tried to write the code in such a way that in the future, we can
easily adapt it to output stream ID bits (which are two bits/vertex
instead of one).
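
The accumulate-and-flush scheme described above can be sketched in plain C. This is only an illustration under stated assumptions: the struct, the function names, and the MAX_VERTS limit are hypothetical, and an ordinary array stands in for the control data header in the URB.

```c
#include <stdint.h>

#define MAX_VERTS 128 /* hypothetical limit for this sketch */

struct control_data {
   uint32_t header[MAX_VERTS / 32]; /* stands in for the control data header */
   uint32_t accum;                  /* stands in for the GRF accumulator */
   unsigned vertex_count;           /* vertices emitted so far */
};

/* Write the accumulated bits to the DWORD covering the most recent batch
 * of up to 32 vertices, then reset the accumulator to 0. */
static void flush_bits(struct control_data *cd)
{
   if (cd->vertex_count == 0)
      return;
   cd->header[(cd->vertex_count - 1) / 32] = cd->accum;
   cd->accum = 0;
}

static void emit_vertex(struct control_data *cd)
{
   if (cd->vertex_count >= MAX_VERTS)
      return;
   /* A full batch of 32 bits has been accumulated: output it first. */
   if (cd->vertex_count != 0 && cd->vertex_count % 32 == 0)
      flush_bits(cd);
   cd->vertex_count++;
}

static void end_primitive(struct control_data *cd)
{
   /* Calls made before the first vertex must have no effect. */
   if (cd->vertex_count == 0)
      return;
   cd->accum |= 1u << ((cd->vertex_count - 1) % 32);
}

/* At shader termination, output whatever bits are still pending. */
static void end_shader(struct control_data *cd)
{
   flush_bits(cd);
}
```

With max_vertices <= 32 only the final flush in end_shader() would ever run, which corresponds to the optimization mentioned above.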

Fixes piglit tests "spec/glsl-1.50/glsl-1.50-geometry-end-primitive *".

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-11 11:17:54 -07:00
Paul Berry
564a900a45 i965/vec4: Add the ability to emit opcodes with just a dst register.
This is needed for GS_OPCODE_PREPARE_CHANNEL_MASKS.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-11 11:17:50 -07:00
Paul Berry
6ced0fa57f i965/gs: Add opcodes needed for EndPrimitive().
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-11 11:17:41 -07:00
Paul Berry
a74af8148d i965/gen7: Add the ability to send URB_WRITE_OWORD messages.
Previously, brw_urb_WRITE() would always generate a URB_WRITE_HWORD
message, since we always wanted to write data to the URB in pairs of
varying slots or larger (an HWORD is 32 bytes, which is 2 varying slots).

In order to support geometry shader EndPrimitive functionality, we'll
need the ability to write to just a single OWORD (16 byte) slot, since
we'll only be outputting 32 of the control data bits at a time.  So
this patch adds a flag that will cause brw_urb_WRITE to generate a
URB_WRITE_OWORD message.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-11 11:17:31 -07:00
Paul Berry
bf5419e389 i965/gen7: Allow URB_WRITE channel masks to be used.
Previously, brw_urb_WRITE() would unconditionally override the channel
masks in the URB_WRITE message to 0xff (indicating that all channels
should be written to the URB).

In order to support geometry shader EndPrimitive functionality, we'll
need the ability to set the channel masks programmatically, so that we
can output just 32 of the control data bits at a time.  So this patch
adds a flag that will prevent brw_urb_WRITE() from overriding them.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-11 11:17:24 -07:00
Paul Berry
247f90c77e i965/gs: Set control data header size/format appropriately for EndPrimitive().
The gen7 geometry shader uses a "control data header" at the beginning
of the output URB entry to store either

(a) flag bits (1 bit/vertex) indicating whether EndPrimitive() was
    called after each vertex, or

(b) stream ID bits (2 bits/vertex) indicating which stream each vertex
    should be sent to (when multiple transform feedback streams are in
    use).

Fortunately, OpenGL only requires separate streams to be supported
when the output type is points, and EndPrimitive() only has an effect
when the output type is line_strip or triangle_strip, so it's not a
problem that these two uses of the control data header are mutually
exclusive.

This patch modifies do_vec4_gs_prog() to determine the correct
hardware settings for configuring the control data header, and
modifies upload_gs_state() to propagate these settings to the
hardware.

In addition, it modifies do_vec4_gs_prog() to ensure that the output
URB entry is large enough to contain both the output vertices *and*
the control data header.

Finally, it modifies vec4_gs_visitor so that it accounts for the size
of the control data header when computing the offset within the URB
where output vertex data should be stored.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

v2: Fixed incorrect handling of IVB/HSW differences.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-11 11:17:14 -07:00
Paul Berry
1a33e0233a glsl: During linking, record whether a GS uses EndPrimitive().
This information will be useful in the i965 back end, since we can
save some compilation effort if we know from the outset that the
shader never calls EndPrimitive().

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-11 11:16:35 -07:00
Paul Berry
79d9c6b7ff i965/gs: Add a state atom to set up geometry shader state.
v2: Do not attempt to share the code that uploads
3DSTATE_BINDING_TABLE_POINTERS_GS, 3DSTATE_SAMPLER_STATE_POINTERS_GS,
or 3DSTATE_GS with VS.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

v3: Add _NEW_TRANSFORM to gen7_gs_state.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-11 11:16:25 -07:00
Paul Berry
ec5c924290 i965/gen7: Extract a function for setting up a shader stage's constants.
This will allow us to reuse some code when setting up the geometry
shader stage.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-11 11:16:19 -07:00
Torsten Duwe
3bc642cbf6 wayland-egl.pc requires wayland-client.pc.
Mesa provides the wayland-egl libs and the pkgconfig file, but the headers
originate from the wayland package. Ensure everything matches, by requiring
application builds to look at the wayland headers as well.

Signed-off-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Johannes Obermayr <johannesobermayr@gmx.de>
2013-09-11 10:51:02 -07:00
Johannes Obermayr
87ebbe1270 st/gbm: Add $(WAYLAND_CFLAGS) for HAVE_EGL_PLATFORM_WAYLAND. 2013-09-11 10:50:34 -07:00
Maarten Lankhorst
b217d48364 st/dri: do not create a new context for msaa copy
Commit b77316ad75
    st/dri: always copy new DRI front and back buffers to corresponding MSAA buffers

introduced creating a pipe_context for every call to validate, which is not required
because the callers have a context anyway.

The only exception is egl_g3d_create_pbuffer_from_client_buffer; can someone
test whether it still works with NULL passed as context for validate? From
examining the code I believe it does, but I didn't thoroughly test it.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: 9.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-09-11 09:03:44 +02:00
Kenneth Graunke
169f9c030c i965: Add an assertion that writemask != NULL for non-ARFs.
We've observed GPU hangs on Ivybridge from the following instruction:

mov(8) g115<1>.F 0D { align16 WE_normal NoDDChk 1Q };

There should be no reason to ever set the writemask on a destination
register to zero, except for perhaps the ARF NULL register.

This patch adds an assertion to enforce this for non-ARF registers.
Excluding ARFs is conservative yet should still catch the majority
of mistakes.
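
A minimal sketch of such a check, assuming a simplified destination register with only a register file and a writemask. The enum, struct, and function names are hypothetical and do not come from the i965 source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified register description. */
enum reg_file { FILE_ARF, FILE_GRF };

struct dst_reg {
   enum reg_file file;
   unsigned writemask; /* one bit per channel, 0 means no channel enabled */
};

/* ARF destinations are exempt; every other destination needs at least
 * one enabled channel. */
static bool dst_writemask_ok(const struct dst_reg *dst)
{
   return dst->file == FILE_ARF || dst->writemask != 0;
}

static void check_dst(const struct dst_reg *dst)
{
   assert(dst_writemask_ok(dst));
}
```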

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-09-10 17:52:59 -07:00
Kenneth Graunke
4e5eb8ba25 i965/vec4: Only zero out unused message components when there are any.
Otherwise, coordinates with four components would result in a MOV
with a destination writemask that has no channels enabled:

mov(8) g115<1>.F 0D { align16 WE_normal NoDDChk 1Q };

At best, this is stupid: we emit code that shouldn't do anything.
Worse, it apparently causes GPU hangs (observable with Chris's
textureGather test on CubeArrays.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Cc: mesa-stable@lists.freedesktop.org
2013-09-10 17:52:56 -07:00
Kenneth Graunke
17eb1df7b8 i965/vec4: Simplify the computation of coord_mask and zero_mask.
We can easily compute these without loops, resulting in simpler and
shorter code.
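
One loop-free way to compute such masks is sketched below. This is an assumption about the general technique, not the code from this patch: it assumes a 4-channel writemask and a component count between 0 and 4, and the function names are hypothetical.

```c
/* Mask of the channels that carry coordinate components (n in 0..4). */
static unsigned coord_mask(unsigned n)
{
   return (1u << n) - 1u;
}

/* Mask of the remaining channels that would need to be zeroed. */
static unsigned zero_mask(unsigned n)
{
   return 0xfu & ~coord_mask(n);
}

/* A zeroing MOV is only worth emitting when some channel is unused;
 * with four components the mask is empty. */
static int needs_zero_mov(unsigned n)
{
   return zero_mask(n) != 0;
}
```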

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-09-10 17:52:36 -07:00
Matt Turner
66be7b4c27 docs: Clean up autoconf.html.
Remove long dead options and clarify some things.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69148
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-10 16:59:35 -07:00
Henri Verbeet
bd77f51758 mesa: Properly set the fog scale (gl_Fog.scale) to +INF when fog start and end are equal.
This was originally introduced by commit
ba47aabc98, but unfortunately the commit message
doesn't go into much detail about why +INF would be a problem here.

A similar issue exists for STATE_FOG_PARAMS_OPTIMIZED, but allowing infinity
there would potentially introduce NaNs where they shouldn't exist, depending
on the values of fog end and the fog coord. Since STATE_FOG_PARAMS_OPTIMIZED
is only used for fixed function (including ARB_fragment_program with fog
option), and the calculation there probably isn't very stable to begin with
when fog start and end are close together, it seems best to just leave it
alone.
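
For reference, GLSL defines gl_Fog.scale as 1 / (end - start). A minimal sketch of the behaviour this commit describes follows; the function name is hypothetical and this is not the Mesa implementation.

```c
#include <math.h>

/* gl_Fog.scale is 1 / (end - start); when start and end are equal the
 * value is set to +INF explicitly. */
static float fog_scale(float start, float end)
{
   return (start == end) ? INFINITY : 1.0f / (end - start);
}
```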

This fixes piglit glsl-fs-fogscale, and a couple of Wine D3D tests. No piglit
regressions on Cayman.

Signed-off-by: Henri Verbeet <hverbeet@gmail.com>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-10 22:25:16 +02:00
Vinson Lee
09e385ee3b mesa: Use correct enum conversion function.
Fixes "Mixing enum types" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-10 10:56:38 -07:00
Vinson Lee
fd66a85f6b mesa: Ensure gl_sync_object is fully initialized.
278372b47e added the uninitialized pointer
field gl_sync_object:Label. A free of this pointer, added in commit
6d8dd59cf5, resulted in a crash.

This patch fixes piglit ARB_sync regressions with swrast introduced by
6d8dd59cf5.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-10 10:54:26 -07:00
Vinson Lee
49f2ba2cb0 radeonsi: Add parentheses around '|' operands.
Fixes GCC parentheses warning.

r600_texture.c: In function 'si_texture_create':
r600_texture.c:518:20: warning: suggest parentheses around arithmetic in operand of '|' [-Wparentheses]
      !(templ->bind & PIPE_BIND_CURSOR | PIPE_BIND_LINEAR)) {
                    ^

Fixes "Wrong operator used" defect reported by Coverity.
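
The precedence problem behind this warning can be reproduced in isolation. In C, '&' binds tighter than '|', so the flagged expression parses as (bind & CURSOR) | LINEAR. The flag values and function names below are hypothetical, and the second function shows what was presumably intended rather than the exact fix in the patch.

```c
/* Hypothetical flag values standing in for the PIPE_BIND_* flags. */
#define BIND_CURSOR 0x1u
#define BIND_LINEAR 0x2u

/* How the compiler parses the flagged expression: the result of the OR
 * always includes BIND_LINEAR, so the negation is always false. */
static int flawed_check(unsigned bind)
{
   return !((bind & BIND_CURSOR) | BIND_LINEAR);
}

/* Presumably intended: true only when neither flag is set. */
static int intended_check(unsigned bind)
{
   return !(bind & (BIND_CURSOR | BIND_LINEAR));
}
```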

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-09-10 10:44:09 -07:00
Vinson Lee
d93e23ba25 util: Fix unmatched parenthesis.
Fixes MSVC build error introduced with commit
923d346714.

src\gallium\auxiliary\util\u_cpu_detect.c(286) : fatal error C1012: unmatched parenthesis : missing '('

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-09-10 10:33:47 -07:00
Brian Paul
923d346714 util: don't use _fxsave() with MSVC 2010 or older
And update _MSC_VER comments in p_config.h

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-10 11:01:37 -06:00
Vinson Lee
787ac4207e glsl: Add missing va_end in builtin_builder::add_function.
Fixes "Missing varargs init or cleanup" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-10 09:52:03 -07:00
Vinson Lee
118cdd1d3f glsl: Initialize builtin_builder member variables.
Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-10 09:49:02 -07:00
Brian Paul
395b941086 glsl: fix variadic macro for MSVC
MSVC doesn't accept the rest... syntax.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-09 17:52:44 -06:00
Brian Paul
1ddb56d160 glsl: remove struct keyword from ir_variable declarations
To silence MSVC warnings.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-09 17:52:44 -06:00
Kenneth Graunke
0bb3cd8090 Revert "i965/vec4: Only zero out unused message components when there are any."
This reverts commit 6c3db2167c, which I
accidentally pushed along with other code.  A better version of the fix
will be committed later.
2013-09-09 15:33:16 -07:00
Matt Turner
89f5f675ad i965: Allow immediates to be folded into logical and shift instructions.
These instructions will be used with immediate arguments in the upcoming
ldexp lowering pass and frexp implementation.

v2: Add vec4 support as well.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 15:01:08 -07:00
Matt Turner
d83221c2d3 i965: Enable MESA_shader_integer_mix.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-09-09 15:01:08 -07:00
Matt Turner
56fff7063d glsl: Implement MESA_shader_integer_mix extension.
Because why doesn't GLSL allow you to do this already?

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-09-09 15:01:08 -07:00
Matt Turner
fd183fa02c glsl: Use conditional-select in mix().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-09-09 15:01:08 -07:00
Matt Turner
8477262958 i965: Add support for ir_triop_csel.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-09-09 15:01:08 -07:00
Matt Turner
7aaa38728f glsl: Add conditional-select IR.
It's a ?: that operates per-component on vectors. Will be used in
upcoming lowering pass for ldexp and the implementation of frexp.

 csel(selector, a, b):
   per-component result = selector ? a : b
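
The per-component semantics above can be written out as a small C sketch for 4-component vectors. The function name and the fixed vector width are assumptions made for illustration.

```c
/* Per-component select: out[i] = selector[i] ? a[i] : b[i]. */
static void csel4(const int selector[4], const float a[4],
                  const float b[4], float out[4])
{
   for (int i = 0; i < 4; i++)
      out[i] = selector[i] ? a[i] : b[i];
}
```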

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-09-09 15:01:08 -07:00
Kenneth Graunke
60850b7b9f glsl: Rename ir_function_signature::builtin_info to builtin_avail.
builtin_info was originally going to be a structure containing a bunch
of information, but after various rewrites, it turned into a boolean
availability predicate.

builtin_avail is a better name than builtin_info, since it doesn't
store any information other than availability.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 14:54:46 -07:00
Kenneth Graunke
260965b7a7 build: Delete cross-compiling macros.
Now that builtin_compiler is gone, nothing uses these.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 14:42:33 -07:00
Kenneth Graunke
b973b44a4d glsl: Add missing type inference for ir_binop_bfm.
Matt noticed that this was missing.  Nothing uses this currently.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 14:42:33 -07:00
Kenneth Graunke
722eff674b glsl: Delete old built-in function generation code.
None of this is used anymore.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 14:42:33 -07:00
Kenneth Graunke
c845140a20 glsl: Remove builtin_compiler from the build system.
We don't actually use anything from builtin_function.cpp, so we don't
need to generate it anymore.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 14:42:33 -07:00
Kenneth Graunke
76d2f73643 glsl: Switch to the new built-in function module.
All built-ins are now handled by the new code; the old system is dead.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 14:42:33 -07:00
Kenneth Graunke
7ddc312c1b glsl: Write a new built-in function module.
This creates a new replacement for the existing built-in function code.
The new module lives in builtin_functions.cpp (not builtin_function.cpp)
and exists in parallel with the existing system.  It isn't used yet.

The new built-in function code takes a significantly different approach:

Instead of implementing built-ins via printed IR, build time scripts,
and run time parsing, we now implement them directly in C++, using
ir_builder.  This translates to faster load times, and a much less
complex build system.

It also takes a different approach to built-in availability: each
signature now stores a boolean predicate, which makes it easy to
construct arbitrary expressions based on _mesa_glsl_parse_state's
fields.  This is much more flexible than the old system, and also
easier to use.

Built-ins are also now stored in a single gl_shader object, rather
than being spread out across a number of shaders that need to be linked.
When searching for a matching prototype, we simply consult the
availability predicate.  This also simplifies the code.
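
The availability-predicate idea can be sketched in C as a function pointer stored with each signature. Everything here is a simplified, hypothetical stand-in: the real module is C++ and its predicates inspect _mesa_glsl_parse_state, while this sketch uses a one-field struct and a two-entry table.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the parser state the predicates inspect. */
struct parse_state {
   unsigned language_version;
};

typedef bool (*avail_predicate)(const struct parse_state *);

struct signature {
   const char *name;
   avail_predicate is_available;
};

static bool always_available(const struct parse_state *state)
{
   (void) state;
   return true;
}

static bool v130(const struct parse_state *state)
{
   return state->language_version >= 130;
}

/* All built-ins live in one table; availability is decided per lookup. */
static const struct signature builtins[] = {
   { "sin", always_available },
   { "textureSize", v130 },
};

static const struct signature *
find_builtin(const struct parse_state *state, const char *name)
{
   for (size_t i = 0; i < sizeof(builtins) / sizeof(builtins[0]); i++) {
      if (strcmp(builtins[i].name, name) == 0 &&
          builtins[i].is_available(state))
         return &builtins[i];
   }
   return NULL;
}
```

Because the predicate runs at lookup time, a single table can serve every language version without linking separate per-version shaders.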

v2: Incorporate Matt Turner's feedback: use the new fma() function rather
    than expr().  Don't expose textureQueryLOD() in GLSL 4.00 (since it
    was renamed to textureQueryLod()).  Also correct some #undefs.
v3: Incorporate Paul Berry's feedback: rename legacy to compatibility;
    add comments to explain a few things; fix uvec availability; include
    shaderobj.h instead of repeating the _mesa_new_shader prototype.
v4: Fix lack of TEX_PROJECT on textureProjGrad[Offset] (caught by oglc).
    Add an out_var convenience function (more feedback by Matt Turner).
v5: Rework availability predicates for Lod functions.  They were broken.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Enthusiastically-acked-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 14:42:18 -07:00
Kenneth Graunke
8d90328eb3 glsl: Add optional parameters to the ir_factory constructor.
Each ir_factory needs an instruction list and memory context in order to
be useful.  Rather than creating an object and manually assigning these,
we can just use optional parameters in the constructor.

This makes it possible to create a ready-to-use factory in one line:

   ir_factory body(&sig->body, mem_ctx);

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:52:22 -07:00
Kenneth Graunke
666df56551 glsl: Add IR builder shortcuts for a bunch of random opcodes.
Adding new convenience emitters makes it easier to generate IR involving
these opcodes.

bitfield_insert is particularly useful, since there is no expr() for
quadops.

v2: Add fma() and rename lrp() operands to x/y/a to match the GLSL
    specification (suggested by Matt Turner).  Fix whitespace issues.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:52:22 -07:00
Kenneth Graunke
1a6c0efa11 glsl: Expose IR builder support for arbitrary swizzling.
IR builder already offers a lot of swizzling functions, such as
swizzle_xxxx, swizzle_z, or swizzle_for_size.

The swizzle_xxxx style is convenient if you statically know which
components you want.  swizzle_for_size is great if you want to select
the first few components.  However, if you want to select components
based on, say, a loop counter, none of those are sufficient.

IR builder actually already had support for arbitrary swizzling, but
didn't expose it.  This patch exposes that API.
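
As a rough illustration of why run-time component selection matters, here is a minimal standalone sketch. It models a vector as a plain std::array and does not use Mesa's ir_builder; the names Vec4, swizzle_components and select_channel are hypothetical and exist only for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an IR vector value; not a Mesa type.
using Vec4 = std::array<float, 4>;

// Select `count` components of `src`, where components[i] names the source
// channel (0 = x, 1 = y, 2 = z, 3 = w) for output channel i.
// Output channels beyond `count` are left at zero.
Vec4 swizzle_components(const Vec4 &src,
                        const std::array<unsigned, 4> &components,
                        std::size_t count)
{
   assert(count <= 4);
   Vec4 result = {0.0f, 0.0f, 0.0f, 0.0f};
   for (std::size_t i = 0; i < count; i++) {
      assert(components[i] < 4);
      result[i] = src[components[i]];
   }
   return result;
}

// Pick a single channel chosen at run time, e.g. by a loop counter.
float select_channel(const Vec4 &src, unsigned channel)
{
   return swizzle_components(src, {channel, 0, 0, 0}, 1)[0];
}
```

A statically named helper in the swizzle_xxxx style cannot express select_channel(v, i) inside a loop over i, which is the case the commit message describes.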

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:52:22 -07:00
Kenneth Graunke
202238824b glsl: Add a new ir_builder::dotlike() function.
dotlike() uses ir_binop_mul for scalars, and ir_binop_dot for vectors.

When generating built-in functions, we often want to use regular
multiply for scalar signatures, and dot() for vector signatures.
ir_binop_dot only works on vectors, so we have to switch opcodes,
even if the code is otherwise identical.  dotlike() makes this easy.
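
The opcode switch can be sketched outside of Mesa as follows. This is an illustrative model only: the Opcode enum and both functions are assumptions made for the example, and values are plain floats rather than IR nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical opcode tags for this sketch; they mirror the names in the
// commit message but are not Mesa's enum.
enum class Opcode { Mul, Dot };

// Choose the opcode the way the commit describes: multiply for scalars,
// dot product for vectors.
Opcode dotlike_opcode(std::size_t num_components)
{
   return num_components == 1 ? Opcode::Mul : Opcode::Dot;
}

// Evaluate the chosen operation on concrete values.
float dotlike(const std::vector<float> &a, const std::vector<float> &b)
{
   assert(a.size() == b.size() && !a.empty());
   if (dotlike_opcode(a.size()) == Opcode::Mul)
      return a[0] * b[0];

   float sum = 0.0f;
   for (std::size_t i = 0; i < a.size(); i++)
      sum += a[i] * b[i];
   return sum;
}
```

The caller writes one code path for every signature width and lets the helper pick the opcode.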

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:52:22 -07:00
Kenneth Graunke
d716b3376c glsl: Add IR builder support for generating return statements.
We use "ret" as the function name since "return" is a C++ keyword, and
"ir_return" is already a class name.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:52:22 -07:00
Kenneth Graunke
f72a8498e7 glsl: Add IR builder support for conditional assignments.
This adds two new signatures:

   assign(lhs, rhs, condition, writemask);
   assign(lhs, rhs, condition);

All the other existing APIs still exist.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:52:22 -07:00
Kenneth Graunke
eff2ca1ac3 glsl: Add IR builder support for triops.
Now that we have the ir_expression constructor that does type inference,
this is trivial to do.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:52:22 -07:00
Kenneth Graunke
7f0f60cd84 glsl: Add an ir_expression triop constructor with type inference.
We already have ir_expression constructors for unary and binary
operations, which automatically infer the type based on the opcode and
operand types.

These are convenient and also required for ir_builder support.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:52:22 -07:00
Kenneth Graunke
183f7a3e6f glsl: Add missing type inference support for ARB_gpu_shader5 unops.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:52:21 -07:00
Kenneth Graunke
33faaf0b4a glsl: Initialize lod_info in the ir_texture constructor.
This isn't strictly necessary, since creators of ir_texture objects
should set LOD when relevant.  However, it's nice to have a NULL pointer
in case they forget.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:52:21 -07:00
Kenneth Graunke
1b3a482a96 glsl: Skip unavailable built-ins when printing out similar candidates.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:52:21 -07:00
Kenneth Graunke
1ffcef04ce glsl: Skip unavailable built-ins when matching signatures.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:52:21 -07:00
Kenneth Graunke
3e820e3aef glsl: Pass _mesa_glsl_parse_state into matching_signature and such.
During compilation, we'll use this to determine built-in availability.
The plan is to have a single shader containing every built-in in every
version of the language, but filter out the ones that aren't actually
available to the shader being compiled.

At link time, we don't actually need this filtering capability: we've
already imported prototypes for every built-in that the shader actually
calls, and they're flagged as is_builtin().  The linker doesn't import
any additional prototypes, so it won't pull in any unavailable
built-ins.  When resolving prototypes to function definitions, the
linker ensures the values of is_builtin() match, which means that a
shader can't trick the linker into importing the body of an unavailable
built-in by defining a suspiciously similar prototype.

In other words, during linking, we can just pass in NULL.  It will work
out fine.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:52:21 -07:00
Kenneth Graunke
0823a87a75 glsl: Add a method to tell whether a built-in is available.
We can simply call the stored predicate function.  If state is NULL,
just report that the function is available.

v2: Add a comment (requested by Paul Berry).
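
A minimal sketch of the NULL-state rule, assuming heavily reduced stand-ins for the real types. ParseState, Signature and the v130 predicate are hypothetical names for this example; only the behaviour (call the stored predicate, treat a NULL state as "available") comes from the text above.

```cpp
#include <cassert>

// Hypothetical, heavily reduced stand-in for the parse state.
struct ParseState {
   unsigned language_version;
};

using AvailablePredicate = bool (*)(const ParseState *);

struct Signature {
   // Non-null only for built-in signatures (assumed layout for this sketch).
   AvailablePredicate builtin_info = nullptr;

   bool is_builtin() const { return builtin_info != nullptr; }

   // With no parse state (e.g. at link time) report the built-in as
   // available; otherwise defer to the stored predicate.
   bool is_builtin_available(const ParseState *state) const
   {
      assert(is_builtin());
      return state == nullptr || builtin_info(state);
   }
};

// Example predicate: available from GLSL 1.30 onwards.
bool v130(const ParseState *state)
{
   return state->language_version >= 130;
}
```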

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:52:16 -07:00
Kenneth Graunke
d403a10573 glsl: Mark _mesa_glsl_parse_state::is_version() as const.
This promises the method won't modify the contents of the object.
This allows us to call it even with a const pointer to the state.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:46:51 -07:00
Kenneth Graunke
4b0bac0dce glsl: Convert ir_function_signature::is_builtin to a method.
A signature is a built-in if and only if builtin_info != NULL, so we
don't actually need a separate flag bit.  Making a boolean-valued
method allows existing code to ask the same question while not worrying
about the internal representation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:46:51 -07:00
Kenneth Graunke
ca321d07fd glsl: Store a predicate for whether a built-in signature is available.
For the upcoming built-in function rewrite, we'll need to be able to
answer "Is this built-in function signature available?".

This is actually a somewhat complex question, since it depends on the
language version, GLSL vs. GLSL ES, enabled extensions, and the current
shader stage.

Storing such a set of constraints in a structure would be painful, so
instead we store a function pointer.  When creating a signature, we
simply point to a predicate that inspects _mesa_glsl_parse_state and
answers whether the signature is available in the current shader.

Unfortunately, IR reader doesn't actually know when built-in functions
are available, so this patch makes it lie and say that they're always
present.  This allows us to hook up the new functionality; it just won't
be useful until real data is populated.  In the meantime, the existing
profile mechanism ensures built-ins are available in the right places.
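
To show why a function pointer is convenient here, the sketch below reduces the parse state to the four inputs named above (version, ES vs. desktop, extensions, stage). All type, field and function names are assumptions made for illustration; none are taken from the Mesa sources.

```cpp
#include <cassert>

// Hypothetical reduction of the parse state; field names are illustrative.
enum class Stage { Vertex, Geometry, Fragment };

struct ShaderState {
   unsigned version;       // e.g. 130 for GLSL 1.30
   bool es_shader;         // GLSL ES rather than desktop GLSL
   bool lod_extension;     // stands in for one enabled extension
   Stage stage;
};

// Function-pointer type stored alongside each built-in signature.
using AvailabilityPredicate = bool (*)(const ShaderState *);

// Available in desktop GLSL 1.30 and later.
bool desktop_130(const ShaderState *s)
{
   return !s->es_shader && s->version >= 130;
}

// Available only in fragment shaders.
bool fragment_only(const ShaderState *s)
{
   return s->stage == Stage::Fragment;
}

// Available when either the core version or the extension provides it.
bool v130_or_lod_extension(const ShaderState *s)
{
   return desktop_130(s) || s->lod_extension;
}

struct BuiltinSignature {
   const char *name;
   AvailabilityPredicate available;
};

// Ask the stored predicate about the shader being compiled.
bool is_available(const BuiltinSignature &sig, const ShaderState &state)
{
   assert(sig.available != nullptr);
   return sig.available(&state);
}
```

Each constraint combination is an ordinary function, so new rules compose from existing ones without widening a constraint structure.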

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-09 11:46:50 -07:00
Kenneth Graunke
6c3db2167c i965/vec4: Only zero out unused message components when there are any.
Otherwise, coordinates with four components would result in a MOV
with a destination writemask that has no channels enabled:

mov(8) g115<1>.F 0D { align16 WE_normal NoDDChk 1Q };

At best, this is stupid: we emit code that shouldn't do anything.
Worse, it apparently causes GPU hangs (observable with Chris's
textureGather test on CubeArrays.)
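
The guard can be sketched as a mask computation. The bit layout (x = bit 0 through w = bit 3) and both function names are assumptions of this example rather than the driver's actual code.

```cpp
#include <cassert>

// Writemask covering x, y, z and w; the bit layout is an assumption.
constexpr unsigned WRITEMASK_XYZW = 0xF;

// Mask of the channels NOT covered by a coordinate with `num_components`
// components, e.g. 2 components -> z and w (0xC), 4 components -> none (0).
unsigned unused_channel_mask(unsigned num_components)
{
   assert(num_components >= 1 && num_components <= 4);
   const unsigned used = (1u << num_components) - 1;
   return WRITEMASK_XYZW & ~used;
}

// Only a non-empty mask justifies emitting the zeroing MOV.
bool needs_zeroing_mov(unsigned num_components)
{
   return unused_channel_mask(num_components) != 0;
}
```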

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
2013-09-09 11:26:53 -07:00
Paul Berry
2924b5f73b vbo: Implement new gs prim types in vbo_count_tessellated_primitives.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-09-09 09:34:46 -07:00
Ian Romanick
2937d704dc i965: Enable AMD_seamless_cubemap_per_texture
The change is very small.  Do seamless filtering if either the context
enable is set or the sampler enable is set.

The AMD_seamless_cubemap_per_texture says:

    "If TEXTURE_CUBE_MAP_SEAMLESS_ARB is emabled (sic) globally or the
    value of the texture's TEXTURE_CUBE_MAP_SEAMLESS_ARB parameter is
    TRUE, seamless cube map sampling is enabled..."

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-08 07:54:12 -07:00
Ian Romanick
4a19503516 mesa: Always use seamless cubemap filtering in GLES3
Appendix F.2 of the OpenGL ES 3.0.0 spec says:

    "OpenGL ES 3.0 requires that all cube map filtering be
    seamless. OpenGL ES 2.0 specified that a single cube map face be
    selected and used for filtering."

Setting the field only in the context will work fine with sampler
objects (and drivers that support AMD_seamless_cubemap_per_texture)
because seamless filtering is used if *either* the context or the
sampler enable it:

    "If TEXTURE_CUBE_MAP_SEAMLESS_ARB is emabled (sic) globally or the
    value of the texture's TEXTURE_CUBE_MAP_SEAMLESS_ARB parameter is
    TRUE, seamless cube map sampling is enabled..."

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reported-by: Maxence Le Dore <maxence.ledore@gmail.com>
Thanked-by: Maxence Le Dore <maxence.ledore@gmail.com>
2013-09-08 07:54:12 -07:00
Ian Romanick
e334ff43c4 mesa: Don't allow glSamplerParameteriv(GL_TEXTURE_CUBE_MAP_SEAMLESS) in ES
There is no GL_TEXTURE_CUBE_MAP_SEAMLESS in any version of OpenGL ES or
in any extension that applies to OpenGL ES.  The same error check
already occurs for glTexParameteri.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: Maxence Le Dore <maxence.ledore@gmail.com>
2013-09-08 07:54:12 -07:00
Ian Romanick
7efe55cb2d docs: initial 9.3 release notes file
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-09-08 07:54:11 -07:00
Chia-I Wu
e67f99bd29 ilo: preliminary GEN 7.5 support
This is based on grepping for brw->is_haswell in i965 to see how GEN 7.5
differs from GEN 7.  Slightly tested with Xonotic and some Mesa demos.
2013-09-08 01:22:52 +08:00
Alex Deucher
18805b16c8 radeonsi: add berlin pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-09-06 19:27:23 -04:00
Alex Deucher
9bc47dbe50 r600g: remove DMA padding
This is now handled in the winsys.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-09-06 19:10:27 -04:00
Alex Deucher
a81beee37e radeon/winsys: pad IBs to a multiple of 8 DWs
This aligns the gfx, compute, and dma IBs to 8 DW boundaries.
This aligns the IB to the fetch size of the CP for optimal
performance. Additionally, r6xx hardware requires at least 4
DW alignment to avoid a hw bug.  This also aligns the DMA
IBs to 8 DW which is required for the DMA engine.  This
alignment is already handled in the gallium driver, but that
patch can be removed now that it's done in the winsys.
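
A minimal sketch of the padding step, assuming the IB is held in a std::vector of dwords. The NOP value is a placeholder: the real packet encoding depends on the ring and hardware generation and is not given in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Placeholder NOP dword; NOT the real hardware encoding.
constexpr uint32_t HYPOTHETICAL_NOP = 0x80000000u;

// Append NOP dwords until the indirect buffer length is a multiple of
// `alignment_dw` dwords.
void pad_ib(std::vector<uint32_t> &ib, std::size_t alignment_dw = 8)
{
   assert(alignment_dw != 0);
   while (ib.size() % alignment_dw != 0)
      ib.push_back(HYPOTHETICAL_NOP);
}
```

An IB of 13 dwords grows to 16, while one that is already 8 dwords long is left untouched.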

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: "9.2" <mesa-stable@lists.freedesktop.org>
CC: "9.1" <mesa-stable@lists.freedesktop.org>
2013-09-06 19:08:35 -04:00
Axel Davy
e8f9195e5f gallium, intel: Implements new __DRI_IMAGE_USE_LINEAR and PIPE_BIND_LINEAR flags to enforce no tiling.
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2013-09-06 15:02:34 -07:00
Vinson Lee
0a0f543082 mesa: Ensure gl_query_object is fully initialized.
278372b47e added the uninitialized pointer
field gl_query_object::Label. A free of this pointer resulted in a crash.

This patch fixes piglit regressions with swrast introduced by
6d8dd59cf5.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69047
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-06 14:51:51 -07:00
Zack Rusin
e9f1f6ab42 gallivm: support indirect registers on both dimensions
We support indirect addressing only on the vertex index, but some
shaders also use indirect addressing on attributes. This patch
adds support for indirect addressing on both dimensions inside
gs arrays.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-09-06 15:05:27 -04:00
Stéphane Marchesin
f9b37f7183 i915g: Document fall-through switch
Fixes warning reported by Coverity.
2013-09-06 11:05:25 -07:00
Stéphane Marchesin
519a2cf950 i915g: Handle i915->batch == NULL correctly in flush
Fixes warning reported by Coverity.
2013-09-06 11:05:24 -07:00
Stéphane Marchesin
9e14895884 i915g: Remove useless comparison
Fixes "Macro compares unsigned to 0" defect reported by Coverity.
2013-09-06 11:05:24 -07:00
Stéphane Marchesin
7125af2957 i915g: Fix initial array index
Fixes "Out-of-bounds read" defect reported by Coverity.
2013-09-06 11:05:24 -07:00
Brian Paul
ac8448dd97 mesa: add GL_KHR_debug functions to dispatch_sanity.cpp
Fixes 'make check' failures.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-09-06 07:53:41 -06:00
Timothy Arceri
238201158f docs: Add some notes on submitting patches
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-06 07:52:18 -06:00
Tom Stellard
505fad04f1 r600g/compute: Fix bug in compute memory pool
When adding a new buffer to the beginning of the memory pool, we were
accidentally deleting the buffer that was first in the buffer list.
This was caused by a bug in the memory pool's linked list
implementation.
2013-09-05 17:18:00 -07:00
Tom Stellard
f0435ebb07 r600g/compute: Don't flush the cs in pipe_context::launch_grid()
This is the state tracker's responsibility.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-09-05 17:17:43 -07:00
Matt Turner
16cedf3a25 i965: Remove never used DPA2 opcode.
DPA2 is listed in the "Defeatured Instructions" section of the
965 PRM, Volume 4:

"The following instructions are removed from Gen4 implementation mainly
 due to implementation cost/schedule reasons.  They are candidates for
 future generations."

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-05 14:55:27 -07:00
Matt Turner
4a6100054c i965: Remove never used RSR and RSL opcodes.
RSR and RSL are listed in the "Defeatured Instructions" section of the
965 PRM, Volume 4:

"The following instructions are removed from Gen4 implementation mainly
 due to implementation cost/schedule reasons.  They are candidates for
 future generations."

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-05 14:55:19 -07:00
Dominik Behr
0f6fce1585 glsl: propagate max_array_access through function calls
Fixes a bug where if a uniform array is passed to a function the accesses
to the array are not propagated so later all but the first vector of the
uniform array are removed in parcel_out_uniform_storage resulting in
broken shaders and out of bounds access to arrays in
brw::vec4_visitor::pack_uniform_registers.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Dominik Behr <dbehr@chromium.org>
2013-09-05 14:36:11 -07:00
Ilia Mirkin
85f7df81a9 nv30: fix inconsistent setting of push->user_priv
It's set to &nv30->bufctx everywhere else.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-09-05 20:46:56 +02:00
Paul Berry
588ec545ac i965/gen7.5: Fix lower bound on number of VS URB entries.
Haswell GT2 and GT3 require the number of vertex shader URB entries to
be at least 64, not 32.

At the moment, we always meet this requirement automatically, because
in the absence of a geometry shader, we assign all available URB space
to the vertex shader.  But when we turn on support for geometry
shaders, this lower limit will become important.
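
The effect of the lower bound can be sketched as a clamp. DeviceInfo and both functions are hypothetical; the value 64 for Haswell GT2/GT3 comes from the text above, while using 32 for every other device is an assumption of this example.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical device description; only the fields this sketch needs.
struct DeviceInfo {
   bool is_haswell;
   int gt;   // 1, 2 or 3
};

// Lower bound on VS URB entries: 64 on Haswell GT2/GT3.  The fallback of 32
// for everything else is assumed for illustration.
int min_vs_urb_entries(const DeviceInfo &dev)
{
   return (dev.is_haswell && dev.gt >= 2) ? 64 : 32;
}

// Clamp a requested entry count to the device's lower bound.
int clamp_vs_urb_entries(const DeviceInfo &dev, int requested)
{
   assert(requested >= 0);
   return std::max(requested, min_vs_urb_entries(dev));
}
```

Once a geometry shader takes part of the URB, a request such as 40 entries would be raised to 64 on the affected parts.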

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-09-05 09:52:47 -07:00
Paul Berry
ae79e3332e i965/vs: Move vs-specific code out of brw_vec4_visitor.cpp.
This patch creates a new file brw_vec4_vs_visitor.cpp, to contain code
that is specific to the vertex shader.  Now the organization of vertex
shader and geometry shader visitor code is symmetric: vs-specific code
is in brw_vec4_vs_visitor.cpp, gs-specific code is in
brw_vec4_gs_visitor.cpp, and code shared between vs and gs is in
brw_vec4_visitor.cpp.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-05 09:52:42 -07:00
Paul Berry
e241e7c979 i965/vec4: Make with_writemask() non-static.
This will allow it to be shared between brw_vec4_visitor.cpp and
brw_vec4_vs_visitor.cpp (which will be created in the next patch).

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-05 09:52:38 -07:00
Paul Berry
8f9a339c10 i965/vs: Move vs-specific code out of brw_vec4.h.
Now brw_vec4.h contains only code that is shared between the vertex
and geometry shaders.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-05 09:52:33 -07:00
Paul Berry
9dfa8ae662 i965/gs: Don't assign gl_Layer its own slot in the VUE map.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-09-05 09:52:20 -07:00
Stéphane Marchesin
8709e2b6c5 i915g: Implement writemask fixup
The fixup code emulates non-BGRA render targets by adding an
extra instruction at the end of fragment shaders to swizzle the
output. To do this, we also swizzle the blend function. However
an oversight until now was that the writemask wasn't getting
swizzled. This patch fixes that which fixes a bunch of piglit
tests.
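
For the common case where the emulated format differs from BGRA by exchanging red and blue, the writemask fixup can be sketched as below. The bit assignment and the function name are assumptions of this example; the driver's real fixup handles whatever swizzle the render target needs.

```cpp
#include <cassert>

// Color writemask bits; this bit assignment is an assumption of the sketch.
constexpr unsigned MASK_R = 1u << 0;
constexpr unsigned MASK_G = 1u << 1;
constexpr unsigned MASK_B = 1u << 2;
constexpr unsigned MASK_A = 1u << 3;

// Swap the red and blue bits so a mask expressed for one channel order
// applies to a surface whose red and blue channels are exchanged.
unsigned swizzle_writemask_rb(unsigned mask)
{
   assert((mask & ~0xFu) == 0);
   unsigned out = mask & (MASK_G | MASK_A);
   if (mask & MASK_R)
      out |= MASK_B;
   if (mask & MASK_B)
      out |= MASK_R;
   return out;
}
```

Without this step, a mask that disables red would disable blue in the swizzled output, which is the kind of oversight the message describes.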
2013-09-04 19:48:18 -07:00
Stéphane Marchesin
b1461acf15 i915g: Stop calling draw_prepare_shader_outputs
It's not useful on i915g since we don't support primid. Fixes
piglit point tests on i915g.
2013-09-04 19:48:18 -07:00
Rico Schüller
8b302e1635 glx: Initialize OpenGL version to 1.0
The old code in dri2_glx suffered from a typographical error that caused
the default version to be 2.1 instead of 1.2 (minimum required by the
Linux OpenGL ABI).  drisw_glx had a similar error resulting in a default
version of 0.1.

Some driver/card combinations (r200/RV280, i915/915G) don't support
OpenGL 2.1.  These create in some corner cases an indirect context
instead of a direct context when calling glXCreateContextAttribsARB().
This happens because of a bad default value.  To avoid this, just use
the default value specified by the GLX_ARB_create_context specification:

    "The default values for GLX_CONTEXT_MAJOR_VERSION_ARB and
    GLX_CONTEXT_MINOR_VERSION_ARB are 1 and 0 respectively. In this
    case, implementations will typically return the most recent version
    of OpenGL they support which is backwards compatible with OpenGL 1.0
    (e.g. 3.0, 3.1 + GL_ARB_compatibility, or 3.2 compatibility
    profile)"

Refactor all the default value setting to dri2_convert_glx_attribs, and
make sure the correct defaults are set in that one place.
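
Setting the defaults in one place can be sketched like this. The two token values are those defined by GLX_ARB_create_context; RequestedVersion and requested_version are hypothetical and far simpler than dri2_convert_glx_attribs, which handles more attributes than the version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Attribute tokens from GLX_ARB_create_context.
constexpr uint32_t GLX_CONTEXT_MAJOR_VERSION_ARB = 0x2091;
constexpr uint32_t GLX_CONTEXT_MINOR_VERSION_ARB = 0x2092;

struct RequestedVersion {
   unsigned major_version;
   unsigned minor_version;
};

// Hypothetical helper: walk `num_attribs` (name, value) pairs and return the
// requested version, starting from the specification's defaults of 1 and 0.
RequestedVersion requested_version(std::size_t num_attribs,
                                   const uint32_t *attribs)
{
   assert(attribs != nullptr || num_attribs == 0);
   RequestedVersion v = {1, 0};   // defaults are set in exactly one place
   for (std::size_t i = 0; i < num_attribs; i++) {
      const uint32_t name = attribs[i * 2];
      const uint32_t value = attribs[i * 2 + 1];
      if (name == GLX_CONTEXT_MAJOR_VERSION_ARB)
         v.major_version = value;
      else if (name == GLX_CONTEXT_MINOR_VERSION_ARB)
         v.minor_version = value;
   }
   return v;
}
```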

Signed-off-by: Rico Schüller <kgbricola@web.de>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: http://bugs.winehq.org/show_bug.cgi?id=34238
Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
2013-09-04 16:07:21 -07:00
Stéphane Marchesin
4e861ac4a1 i915g: Add more optimizations
This patch adds liveness analysis to i915g and a couple
optimizations which benefit from it. One interesting
optimization turns (fake) indirect texture accesses into direct
texture accesses (the i915 supports a maximum of 4 indirect
texture accesses). Among other things this fixes a bunch of
piglit tests.
2013-09-04 12:11:02 -07:00
Ian Romanick
a974b915b6 glsl: Remove unused prog parameter from tfeedback_decl::init
It looks like commit 53febac removed the last user of that parameter.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-04 08:13:11 -07:00
Ian Romanick
0851aa7365 glsl: Validate qualifiers on VS color outputs with FS color inputs
The vertex shader color outputs (gl_FrontColor, gl_BackColor,
gl_FrontSecondaryColor, and gl_BackSecondaryColor) don't have the same
names as the matching fragment shader color inputs (gl_Color and
gl_SecondaryColor).  As a result, the qualifiers on them were not being
properly cross validated.

Full spec compliance required ir_variable::used and
ir_variable::assigned be set properly.  Without the preceding patch,
which fixes the ::clone method to copy them, this will not be the case.

Fixes all of the previously failing piglit
spec/glsl-1.30/linker/interpolation-qualifiers tests.

v2: Update callers of cross_validate_types_and_qualifiers and
cross_validate_front_and_back_color.  The function signature changed in
v2 of a previous patch.  Suggested by Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47755
2013-09-04 08:11:45 -07:00
Ian Romanick
ceceaf53ce glsl: Copy ir_variable::assigned and ir_variable::used fields in ::clone method
Nothing currently relies on this, but one of the next patches will.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-04 08:10:01 -07:00
Ian Romanick
c0e4a4adb7 glsl: Refactor a bunch of the code out of cross_validate_outputs_to_inputs
The new function, cross_validate_types_and_qualifiers, will have
multiple callers from this file in future commits.

v2: Don't pass the names of the producer / consumer stages to
cross_validate_types_and_qualifiers.  Instead, pass the types and get
the names only in the error paths.  Suggested by Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-04 08:08:15 -07:00
Ian Romanick
87252bf97b glsl: Reallow precision qualifiers on structure members
Changes to the grammar for GL_ARB_shading_language_420pack (commit
6eec502) moved precision qualifiers out of the type_specifier production
chain.  This caused declarations such as:

    struct S {
        lowp float f;
    };

to generate parse errors.  Section 4.1.8 (Structures) of both the GLSL
ES 1.00 and GLSL 1.30 specs says:

        "Member declarators may contain precision qualifiers, but may not
        contain any other qualifiers."

So, it sure seems like we shouldn't generate a parse error. :)

Instead of type_specifier, use fully_specified_type in struct members.
However, fully_specified_type allows a lot of other qualifiers that are
not allowed on structure members, so explicitly disallow them.

Note, this makes struct_declaration look an awful lot like
member_declaration (used for interface blocks).  We may want to
(somehow) unify these rules to reduce code duplication at some point.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68753
Reported-by: Aras Pranckevicius <aras@unity3d.com>
Cc: Aras Pranckevicius <aras@unity3d.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-09-04 08:02:23 -07:00
Timothy Arceri
51a279254f mesa: Setup remaining infrastructure and enable KHR_debug
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-04 07:47:49 -06:00
Timothy Arceri
9405be4add glapi: Setup autogeneration infrastructure for KHR_debug
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-04 07:47:49 -06:00
Timothy Arceri
6964fa7ea3 mesa: Remap debug type and severity
Remap any type or severity exclusive to KHR_debug to
something suitable for ARB_debug_output

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-04 07:47:49 -06:00
Timothy Arceri
b5c4795f38 mesa: Implement GL_DEBUG_OUTPUT
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-04 07:47:49 -06:00
Timothy Arceri
a7f5eb8ebb mesa: Update build scripts to build object labels
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-04 07:47:49 -06:00
Timothy Arceri
262b5ff667 mesa: Implement KHR_debug ObjectLabel functions
V3: make sure to add null terminator when setting label,
generate error when the client specifies an explicit
length that exceeds MAX_LABEL_LENGTH, set label pointer
to NULL when freed, and output correct length in
MAX_LABEL_LENGTH error message.

V2: fixed indentation of comment
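
The V3 points (terminator, length check, NULL after free) can be sketched as a small helper. This is an illustration under stated assumptions, not the Mesa function: set_label and its return-value convention are hypothetical, and 256 merely stands in for the implementation's GL_MAX_LABEL_LENGTH.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Illustrative limit; the real value is implementation-defined.
constexpr int MAX_LABEL_LENGTH = 256;

// Replace *label_ptr with a copy of `label`.  A negative `length` means
// `label` is null-terminated.  Returns false when the label does not fit or
// allocation fails; in that case the old label is gone and *label_ptr is
// NULL (a choice made for this sketch).
bool set_label(char **label_ptr, const char *label, int length)
{
   assert(label_ptr != nullptr);
   std::free(*label_ptr);
   *label_ptr = nullptr;           // never leave a dangling pointer behind

   if (label == nullptr)
      return true;                 // clearing the label

   const int len = length < 0 ? static_cast<int>(std::strlen(label)) : length;
   if (len >= MAX_LABEL_LENGTH)
      return false;                // a GL caller would raise an error here

   const std::size_t n = static_cast<std::size_t>(len);
   char *copy = static_cast<char *>(std::malloc(n + 1));
   if (copy == nullptr)
      return false;
   std::memcpy(copy, label, n);
   copy[n] = '\0';                 // always add the terminator
   *label_ptr = copy;
   return true;
}
```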

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-04 07:47:49 -06:00
Timothy Arceri
21b5bf712b mesa: make _mesa_validate_sync() non-static
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-04 07:47:49 -06:00
Timothy Arceri
6d8dd59cf5 mesa: free object labels when deleting
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-04 07:47:48 -06:00
Timothy Arceri
278372b47e mesa: add debug Label field to several data structures
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-04 07:47:48 -06:00
Timothy Arceri
6faf7052a2 mesa: make _mesa_lookup_list() non-static
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-04 07:47:48 -06:00
Timothy Arceri
97f9f11ec4 mesa: make _mesa_lookup_arrayobj() non-static
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-04 07:47:48 -06:00
Timothy Arceri
797b9dc3ff mesa: Implement glPushDebugGroup and glPopDebugGroup
V4: fixes _mesa_error() compiler warnings (BrianP).

V3: removed C++ style comment

V2: fixed spelling typo in comment

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-04 07:47:48 -06:00
Timothy Arceri
60f435319c mesa: Add a clone function to mesa hash
V2: const qualify table parameter

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-04 07:47:48 -06:00
Timothy Arceri
f5badf4671 mesa: Share common code between ARB_debug_output and KHR_debug functions
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-04 07:47:48 -06:00
Timothy Arceri
77d38fd3fb mesa: Add some constants and state variables for KHR_debug functions
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-09-04 07:47:48 -06:00
Kenneth Graunke
644fbbd3eb mesa: Rename gl_context::swtnl_im to vbo_context; use proper type.
The main GL context's swtnl_im field is the VBO module's vbo_context
structure.  Using "swtnl" in the name is confusing since
some drivers use hardware transform and lighting, but still rely on the
VBO module for drawing.

v2: Forward declare the type and use that instead of void *
    (suggested by Eric Anholt).
v3: Remove unnecessary cast (pointed out by Topi Pohjolainen).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-03 11:30:15 -07:00
Kenneth Graunke
6e143af66d i965: Rename "prim" parameter to "prims" where it's an array.
Some drawing functions take a single _mesa_prim object, while others
take an array of primitives.  Both kinds of functions used a parameter
called "prim" (the singular form), which was confusing.

Using the plural form, "prims," clearly communicates that the parameter
is an array of primitives.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-03 11:29:33 -07:00
Kenneth Graunke
9f7d5870a3 i965: Actually check every primitive for cut index support.
can_cut_index_handle_prims() was passed an array of _mesa_prim objects
and a count, and ran a loop for that many iterations.  However, it
treated the array like a pointer, repeatedly checking the first element.

This patch makes it actually check every primitive.
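
The shape of the bug and its fix can be shown with a reduced example. Prim, mode_supports_cut_index and both loop functions are hypothetical stand-ins; only the pattern (dereferencing the array as a pointer inside the loop versus indexing it) reflects the description above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reduced primitive record; `mode` stands in for the GL
// primitive type.
struct Prim {
   int mode;
};

// Stand-in for the per-primitive capability check (arbitrary rule).
bool mode_supports_cut_index(int mode)
{
   return mode >= 0 && mode < 4;
}

// Buggy shape: loops nr_prims times but only ever inspects prims[0].
bool can_handle_prims_buggy(const Prim *prims, std::size_t nr_prims)
{
   assert(prims != nullptr || nr_prims == 0);
   for (std::size_t i = 0; i < nr_prims; i++) {
      if (!mode_supports_cut_index(prims->mode))
         return false;
   }
   return true;
}

// Fixed shape: index the array so every primitive is checked.
bool can_handle_prims(const Prim *prims, std::size_t nr_prims)
{
   assert(prims != nullptr || nr_prims == 0);
   for (std::size_t i = 0; i < nr_prims; i++) {
      if (!mode_supports_cut_index(prims[i].mode))
         return false;
   }
   return true;
}
```

With a supported first primitive followed by an unsupported one, the buggy version accepts the draw and the fixed version rejects it.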

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-03 11:29:09 -07:00
Michel Dänzer
6b5c802c30 radeonsi: Don't save/restore FMASK sampler view states for u_blitter
Fixes assertion failures in 24 piglit tests with
MESA_GL_VERSION_OVERRIDE=3.0, 12 of which are now passing.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-09-02 17:25:27 +02:00
Michel Dänzer
9933b85e12 radeonsi: Expose pure integer vertex formats
Fixes 20 piglit tests with MESA_GL_VERSION_OVERRIDE=3.0.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-09-02 17:25:27 +02:00
Maarten Lankhorst
ad4dc77231 nvc0: restore viewport after blit
Based on calim's original fix in the nine branch.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>
2013-09-02 17:09:21 +02:00
Christian König
3e81b8eedd radeon/uvd: save the aligned width & height
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=68845

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-09-02 15:42:13 +02:00
Chia-I Wu
da33347131 glx: make the interval of LIBGL_SHOW_FPS adjustable
LIBGL_SHOW_FPS=1 makes GLX print FPS every second while other values do
nothing.  Extend it so that LIBGL_SHOW_FPS=N will print the FPS every N
seconds.
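
One way the extended setting could be interpreted is sketched below. Both functions are hypothetical and are not the GLX code: the sketch assumes the variable's value has already been read (for example with getenv) and that non-positive or non-numeric values disable the output.

```cpp
#include <cassert>
#include <climits>
#include <cstdlib>

// Interpret the value of LIBGL_SHOW_FPS as an interval in seconds.
// Returns 0 (disabled) for a missing value, a value that does not start
// with a positive number, or one too large for an int.
int show_fps_interval(const char *value)
{
   if (value == nullptr)
      return 0;
   const long n = std::strtol(value, nullptr, 10);
   return (n > 0 && n <= INT_MAX) ? static_cast<int>(n) : 0;
}

// Report once per `interval` seconds, given the time of the last report.
bool should_report(double now, double last_report, int interval)
{
   assert(interval >= 0);
   return interval > 0 && now - last_report >= interval;
}
```

With this interpretation LIBGL_SHOW_FPS=1 keeps its existing meaning and LIBGL_SHOW_FPS=5 reports every five seconds.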

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-09-02 11:42:58 +08:00
Kenneth Graunke
b8211ab3ed i965: Use the proper element of the prim array in brw_try_draw_prims.
The VBO module actually calls us with an array of _mesa_prim objects.
For example, it may break up a DrawArrays() call into multiple
primitives when primitive restart is enabled.

Previously, we treated prim like a pointer, always accessing element 0.
This worked because all of the primitive objects in a single draw call
have the same value for num_instances and basevertex.

However, accessing an array as a pointer and using the wrong object's
fields is misleading.  For stylistic reasons alone, we should use the
right object.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-01 18:54:39 -07:00
Kenneth Graunke
976d1d6665 i965: Combine brw_emit_prim and gen7_emit_prim.
These functions have almost identical code; the only difference is that
a few of the bits moved around.  Adding a few trivial conditionals
allows the same function to work on all generations, and the resulting
code is still quite readable.

v2: Comment that the workaround flush is only necessary on SNB
    (requested by Paul Berry).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-01 18:54:37 -07:00
Kenneth Graunke
a3335417e3 i965: Remove unused ATTRIB_BIT_DWORDS define.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-09-01 18:53:55 -07:00
Christoph Bumiller
7fe159ba74 nvc0: delete compute object on screen destruction
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-09-01 20:57:15 +02:00
Joakim Sindholt
2a7762bdb6 nvc0: fix blitctx memory leak
Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>
2013-09-01 20:56:23 +02:00
Christoph Bumiller
1048d89907 nvc0: don't use bufctx in nvc0_cb_push
Too many calls into libdrm when a single one is enough.
2013-09-01 20:53:11 +02:00
Christoph Bumiller
528a48ee8d nvc0: clear the flushed flag 2013-09-01 20:52:27 +02:00
Christoph Bumiller
5399206056 nvc0/ir: add f32 long immediate cannot saturate
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-09-01 20:51:56 +02:00
Tiziano Bacocco
7086636358 nvc0/ir: fix use after free in texture barrier insertion pass
Fixes crash with Amnesia: The Dark Descent.

Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>
2013-09-01 20:51:39 +02:00
Ilia Mirkin
3282697621 nv30: find first unused texcoord rather than bailing if first is used
This fixes shaders produced by supertuxkart.

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-09-01 20:38:21 +02:00
Emil Velikov
dc10251d08 nouveau: initialise the nouveau_transfer maps
Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-09-01 20:38:07 +02:00
Chris Forbes
f35dea05b1 i965/fs: Gen4: Zero out extra coordinates when using shadow compare
Fixes broken rendering if these MRFs contained anything other than zero.

NOTE: This is a candidate for stable branches.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-09-01 19:50:59 +12:00
Paul Berry
4cc692e355 i965/gs: Implement support for geometry shader samplers.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-31 17:13:10 -07:00
Paul Berry
89563489ff i965/gs: add geometry shader support to brw_texture_surfaces.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-31 17:13:07 -07:00
Paul Berry
08d8ff0965 i965/gs: generalize brw_texture_surfaces in preparation for gs.
There is a slight functionality change.  Previously we would compute a
common value for num_samplers for all stages, and populate that many
entries in each stage's surf_offset table regardless of how many
samplers each stage used.  Now we only populate the number of entries
in the surf_offset table corresponding to the number of samplers
actually used by the stage.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-31 17:13:04 -07:00
Paul Berry
5a8033f142 i965: Modify signature to update_texture_surface functions.
Previously these functions would accept a pointer to the binding table
and an index indicating which entry in the binding table should be
updated.  Now they merely take a pointer to the binding table entry to
be updated.

This will make it easier to generalize brw_texture_surfaces to support
geometry shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-31 17:12:53 -07:00
Paul Berry
f560ce4a38 i965/vs: generalize gen6_vs_push_constants in preparation for GS.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-31 17:12:43 -07:00
Paul Berry
4ec2604422 i965/gs: make the state atom for compiling Gen7 geometry shaders.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: Use "unsigned" rather than "GLuint".
2013-08-31 17:12:33 -07:00
Paul Berry
130f0f78be i965/gs: Implement support for geometry shader surfaces.
This patch implements pull constant upload, binding table upload, and
surface setup for geometry shaders, by re-using vertex shader code
that was generalized in previous patches.

Based on work by Eric Anholt <eric@anholt.net>.

v2: Update dirty bits for brw_gs_ubo_surfaces to account for commit
77d8fbc (mesa: add & use a new driver flag for UBO updates instead of
_NEW_BUFFER_OBJECT).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-31 17:12:21 -07:00
Paul Berry
f986222754 i965/vs: generalize brw_vs_binding_table in preparation for GS.
v2: Use GLbitfield instead of GLbitfield64 in
brw_vec4_upload_binding_table.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-31 17:12:15 -07:00
Paul Berry
1b19f2c576 i965: generalize brw_vs_pull_constants in preparation for GS.
v2: Use GLbitfield instead of GLbitfield64 in
brw_upload_vec4_pull_constants.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-31 17:12:09 -07:00
Paul Berry
555f9cf46d i965: Make sure constants re-sent after constant buffer reallocation.
The hardware requires that after constant buffers for a stage are
allocated using a 3DSTATE_PUSH_CONSTANT_ALLOC_{VS,HS,DS,GS,PS}
command, and prior to execution of a 3DPRIMITIVE, the corresponding
stage's constant buffers must be reprogrammed using a
3DSTATE_CONSTANT_{VS,HS,DS,GS,PS} command.

Previously we didn't need to worry about this, because we only
programmed 3DSTATE_PUSH_CONSTANT_ALLOC_{VS,HS,DS,GS,PS} once on
startup (or, previous to that, whenever BRW_NEW_CONTEXT was flagged).
But now that we reallocate the constant buffers whenever geometry
shaders are switched on and off, we need to make sure the constant
buffers are reprogrammed.

We do this by adding a new bit, BRW_NEW_PUSH_CONSTANT_ALLOCATION, to
brw->state.dirty.brw.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-31 17:11:59 -07:00
Paul Berry
27eecefc67 i965/gs: Allocate push constant space for use by GS.
Previously, we would always use the same push constant allocation
regardless of what shader programs were being run: the available push
constant space was split into 2 equal size partitions, one for the
vertex shader, and one for the fragment shader.

Now that we are adding geometry shader support, we need to do
something smarter.  This patch adjusts things so that when a geometry
shader is in use, we split the available push constant space into 3
nearly-equal size partitions instead of 2.

Since the push constant allocation is now affected by GL state, it can
no longer be set up by brw_upload_initial_gpu_state(); instead it must
be set up by a state atom.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-31 17:11:49 -07:00
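The two-way versus three-way split described in 27eecefc67 might look like the sketch below. It is a Python model only: the function name, the kilobyte unit, the 16 KB total in the usage note, and the choice to hand the rounding remainder to the fragment shader are all assumptions for illustration, not taken from the driver.

```python
def split_push_constants(total_kb, gs_active):
    """Partition push constant space between VS, GS and FS.

    Returns {stage: (offset_kb, size_kb)}. Without a geometry shader the
    space is halved; with one it is split into three nearly equal parts.
    """
    stages = 3 if gs_active else 2
    per_stage = total_kb // stages
    vs_size = per_stage
    gs_size = per_stage if gs_active else 0
    # Assumption: whatever is left after integer division goes to the FS.
    fs_size = total_kb - vs_size - gs_size
    return {
        "vs": (0, vs_size),
        "gs": (vs_size, gs_size),
        "fs": (vs_size + gs_size, fs_size),
    }
```

For a hypothetical 16 KB total, the split is 8/8 without a geometry shader and 5/5/6 with one, which is what "nearly-equal" amounts to when the total does not divide by three.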
Paul Berry
df62421382 i965/gen7: Emit CS stall after 3DSTATE_PUSH_CONSTANT_ALLOC_PS.
This is required by the internal hardware docs and the PRM.  Probably
the reason we were getting away with not doing it was because we only
emitted 3DSTATE_PUSH_CONSTANT_ALLOC_PS during startup.  However that's
going to change with the introduction of geometry shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-31 17:11:46 -07:00
Paul Berry
fffba41c68 i965/gs: Allocate URB space for use by GS.
Previously, we gave all of the URB space (other than the small amount
that is used for push constants) to the vertex shader.  However, when
a geometry shader is active, we need to divide it up between the
vertex and geometry shaders.

The size of the URB entries for the vertex and geometry shaders can
vary dramatically from one shader to the next.  So it doesn't make
sense to simply split the available space in two.  In particular:

- On Ivy Bridge GT1, this would not leave enough space for the worst
  case geometry shader, which requires 64k of URB space.

- Due to hardware-imposed limits on the maximum number of URB entries,
  sometimes a given shader stage will only be capable of using a small
  amount of URB space.  When this happens, it may make sense to
  allocate substantially less than half of the available space to that
  stage.

Our algorithm for dividing space between the two stages is to first
compute (a) the minimum amount of URB space that each stage needs in
order to function properly, and (b) the amount of additional URB space
that each stage "wants" (i.e. that it would be capable of making use
of).  If the total amount of space available is not enough to satisfy
needs + wants, then each stage's "wants" amount is scaled back by the
same factor in order to fit.

When only a vertex shader is active, this algorithm produces
equivalent results to the old algorithm (if the vertex shader stage
can make use of all the available URB space, we assign all the space
to it; if it can't, we let it use as much as it can).

In the future, when we need to support tessellation control and
tessellation evaluation pipeline stages, it should be straightforward
to expand this algorithm to cover them.

v2: Use "unsigned" rather than "GLuint".

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-31 17:11:35 -07:00
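The needs/wants scheme from fffba41c68 can be illustrated with a small Python model. The function name, the dictionary interface, and the abstract "units" are assumptions; the real code is C and works in hardware-specific allocation granularities that this sketch ignores.

```python
def allocate_urb(total, needs, wants):
    """Divide `total` URB units between stages.

    needs: {stage: minimum units required to function}
    wants: {stage: additional units the stage could make use of}
    If needs + wants do not fit, every stage's "wants" is scaled back by
    the same factor.
    """
    total_needs = sum(needs.values())
    if total_needs > total:
        raise ValueError("not enough URB space for the minimum requirements")
    remaining = total - total_needs
    total_wants = sum(wants.values())

    alloc = {}
    for stage in needs:
        want = wants.get(stage, 0)
        if total_wants > remaining:
            # Same factor (remaining / total_wants) for every stage,
            # computed in integers and rounded down.
            want = want * remaining // total_wants
        alloc[stage] = needs[stage] + want
    return alloc
```

With only a vertex shader active the model reproduces the behaviour the commit describes: the stage gets everything if it can use it, and only what it can use otherwise.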
Paul Berry
53f6e79633 i965: Make CACHE_NEW_GS_PROG.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-31 17:11:25 -07:00
Paul Berry
a702f6325c i965/gs: Create brw_context::gs structure to track GS program state.
v2: Change name from "vec4_gs" to simply "gs".

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-31 17:11:15 -07:00
Paul Berry
ec94e3c3d0 i965: Move data from brw->vs into a base class if gs will also need it.
This paves the way for sharing the code that will set up the vertex
and geometry shader pipeline state.

v2: Rename the base class to brw_stage_state.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-31 17:11:05 -07:00
Paul Berry
cdf03b6928 i965/gs: Update defines related to GS surface organization.
Defines that previously referred to VS now refer to VEC4, since they
will be shared by the user-programmable vertex shader and geometry
shader stages.

Defines that previously referred to the Gen6 geometry shader stage
(which is only used for transform feedback) are now renamed to
explicitly refer to Gen6, to avoid confusion with the Gen7
user-programmable geometry shader stage.

Based on work by Eric Anholt <eric@anholt.net>.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-31 17:10:54 -07:00
Paul Berry
b3a4d5c785 i965: Move vec4 register allocation data structures to brw->vec4.
This will avoid confusion when we add geometry shaders, since these
data structures will be shared by vertex and geometry shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-31 17:10:44 -07:00
Paul Berry
56a2e57bdb i965: Rename user-defined gs structs from vec4_gs to gs.
Now that the name "gs" is no longer used to refer to the legacy fixed
function geometry shaders, we can use it to refer to user-defined
geometry shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-31 17:10:34 -07:00
Paul Berry
32e16e2337 i965: rename legacy gs structs and functions to ff_gs.
"ff" is for "fixed function".  This frees up the name "gs" to refer to
user-defined geometry shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-31 17:10:15 -07:00
Marek Olšák
a77ee8b548 radeonsi: simplify and improve flushing
This mimics r600g. The R600_CONTEXT_xxx flags are added to rctx->b.flags
and si_emit_cache_flush emits the packets. That's it. The shared radeon code
tells us when the streamout cache should be flushed, so we have to check
the flags anyway.

There is a new atom "cache_flush", because caches must be flushed *after*
resource descriptors are changed in memory.

Functional changes:

* Write caches are flushed at the end of CS and read caches are flushed
  at its beginning.

* Sampler view states are removed from si_state, they only held the flush
  flags.

* Every time a shader is changed, the I cache is flushed. Is this needed?
  Due to a hw bug, this also flushes the K cache.

* The WRITE_DATA packet is changed to use TC, which fixes a rendering issue
  in openarena. I'm not sure how TC interacts with CP DMA, but for now it
  seems to work better than any other solution I tried. (BTW CIK allows us
  to use TC for CP DMA.)

* Flush the K cache instead of the texture cache when updating resource
  descriptors (due to a hw bug, this also flushes the I cache).
  I think the K cache flush is correct here, but I'm not sure if the texture
  cache should be flushed too (probably not considering we use TC
  for WRITE_DATA, but we don't use TC for CP DMA).

* The number of resource contexts is decreased to 16. With all of these cache
  changes, 4 doesn't work, but 8 works, which suggests I'm actually doing
  the right thing here and the pipeline isn't drained during flushes.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-08-31 01:34:30 +02:00
Marek Olšák
aa5c40f97c radeonsi: convert constant buffers to si_descriptors
There is a new "class" si_buffer_resources, which should be good enough for
implementing any kind of buffer bindings (constant buffers, vertex buffers,
streamout buffers, shader storage buffers, etc.)

I don't even keep a copy of pipe_constant_buffer - we don't need it.

The main motivation behind this is to have a well-tested infrastructure
for setting up streamout buffers.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-08-31 01:34:30 +02:00
Marek Olšák
a81c3e00fe radeonsi: use r600_common_context, r600_common_screen, r600_resource
Also r600_hw_context_priv.h and si_state_streamout.c are removed, because
they are no longer needed.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-08-31 01:34:30 +02:00
Marek Olšák
d5b23dfc1c r600g: move streamout state to drivers/radeon
This streamout state code will be used by radeonsi.

There are new structures r600_common_context and r600_common_screen.
What is inherited by what is shown here:

pipe_context -> r600_common_context -> r600_context
pipe_screen -> r600_common_screen -> r600_screen

The common structures reside in drivers/radeon. Currently they only contain
enough functionality to be able to handle streamout. Eventually I'd like
the whole pipe_screen implementation to be shared and some of the context
stuff too.

This is quite big, but most changes are because of the new structures and
the fact r600_write_value is replaced by radeon_emit.

Thanks to Tom Stellard for fixing the build for r600g/compute.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-08-31 01:34:30 +02:00
Marek Olšák
13a1a8b877 radeonsi: cleanup initialization of SGPR shader parameters
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-08-31 01:34:29 +02:00
Marek Olšák
d698f19cba r600g,radeonsi: remove unused variables
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-08-31 01:34:29 +02:00
Marek Olšák
89a665eb5f draw: fix segfaults with aaline and aapoint stages disabled
There are drivers not using these optional stages.

Broken by a3ae5dc7dd.

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-31 01:34:29 +02:00
Kenneth Graunke
a35b320250 i965/fs: Detect GRF sources in split_virtual_grfs send-from-GRF code.
It is incorrect to assume that src[0] of a SEND-from-GRF opcode is the
GRF.  For example, FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD uses src[1] for
the GRF.

To be safe, loop over all the source registers and mark any GRFs.  We
probably won't ever have more than one, but it's simpler to just check
all three rather than attempting to bail early.

Not observed to fix anything yet, but likely to.  Parallels the bug fix
in the previous commit, which actually does fix known failures.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: mesa-stable@lists.freedesktop.org
2013-08-30 15:49:31 -07:00
Kenneth Graunke
4e3d1712a2 i965/vs: Detect GRF sources in split_virtual_grfs send-from-GRF code.
It is incorrect to assume that src[0] of a SEND-from-GRF opcode is the GRF.
VS_OPCODE_PULL_CONSTANT_LOAD_GEN7 uses an IMM as src[0], and stores the
GRF as src[1].

To be safe, loop over all the source registers and mark any GRFs.  We
probably won't ever have more than one, but it's simpler to just check
all three rather than attempting to bail early.

Fixes assertion failures in Unigine Sanctuary since we started making
register allocation rely on split_virtual_grfs working.  (The register
classes were actually sufficient, we were just interpreting an IMM as
a virtual GRF number.)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68637
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: mesa-stable@lists.freedesktop.org
2013-08-30 15:49:31 -07:00
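The fix shared by a35b320250 and 4e3d1712a2 is to check every source operand's register file instead of assuming src[0] is the GRF. A Python model of that loop is sketched below. `Reg`, `Inst`, and `unsplittable_vgrfs` are hypothetical stand-ins for the driver's C++ IR, and the assumption is that the pass records such registers so they are not split.

```python
from dataclasses import dataclass

GRF, IMM = "GRF", "IMM"  # register files used in this sketch


@dataclass
class Reg:
    file: str
    nr: int = 0


@dataclass
class Inst:
    opcode: str
    src: list
    send_from_grf: bool = False


def unsplittable_vgrfs(instructions):
    """Collect virtual GRF numbers used as sources of send-from-GRF opcodes."""
    no_split = set()
    for inst in instructions:
        if not inst.send_from_grf:
            continue
        # Check every source: the GRF is not necessarily src[0].
        for src in inst.src:
            if src.file == GRF:
                no_split.add(src.nr)
    return no_split
```

In the model, an instruction whose src[0] is an immediate and whose src[1] is a GRF contributes only the GRF number, so the immediate's value is never mistaken for a virtual GRF index.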
Niels Ole Salscheider
217d2f7359 radeonsi: Do not suspend timer queries
Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2013-08-30 23:30:00 +02:00
Roland Scheidegger
431e60625b draw: fix PIPE_MAX_SAMPLER/PIPE_MAX_SHADER_SAMPLER_VIEWS issues
pstipple/aaline stages used PIPE_MAX_SAMPLER instead of
PIPE_MAX_SHADER_SAMPLER_VIEWS when dealing with sampler views.
Now these stages can't actually handle sampler_unit != texture_unit anyway
(they cannot work with d3d10 shaders at all due to using tex not sample
opcodes as "mixed mode" shaders are impossible) but this leads to crashes if
a driver just installs these stages and then more than PIPE_MAX_SAMPLER views
are set even if the stages aren't even used.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-30 23:20:04 +02:00
Roland Scheidegger
f37edb5e20 gallivm: handle unbound textures in texture sampling / texture queries
Turns out we don't need to do much extra work for detecting this case,
since we are guaranteed to get an empty static texture state in this case,
hence just rely on format being 0 and return all zero then.
Previously needed dummy textures (would just have crashed on format being 0
otherwise) which cannot return the correct result for size queries and when
sampling textures with wrap modes using border.
As a bonus should hugely increase performance when sampling unbound textures -
too bad it isn't a useful feature :-).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-30 23:20:03 +02:00
Roland Scheidegger
bb7dc1b2f6 softpipe: handle NULL sampler views for texture sampling / queries
Instead of crashing just return all zero.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-30 23:20:03 +02:00
Roland Scheidegger
81ab3e57bc softpipe: check if so_target is NULL before accessing it
No idea if this is working right but copied straight from llvmpipe.
(Not only does this check the so_target but also use buffer->data instead
of buffer for the mapping.)
Just trying to get rid of a segfault testing something else...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-30 23:20:03 +02:00
Roland Scheidegger
289faa7e23 gallivm: (trivial) don't pass sampler_unit variable down to filtering funcs
The only reason this was needed was because the fetch texel function had to
get the (dynamic) border color, but this is now done much earlier.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-30 23:20:03 +02:00
Roland Scheidegger
61add3cc3c gallivm: don't use AoS path if min/mag filter are different with multiple lods
Instead of enhancing the AoS path so it can deal with it, just use SoA. Fixing
AoS path wouldn't be all that difficult (use all the same logic as SoA) but
considered not worth it for now.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-30 23:20:03 +02:00
Eric Anholt
bdf3f50e9a mesa: Don't choose S3TC for generic compression if we can't compress.
If the app is asking us to do GL_COMPRESSED_RGBA, then the app obviously
doesn't have pre-compressed data to hand us.  So don't choose a storage
format that we won't actually be able to compress and store.

Fixes black screen in warzone2100 when libtxc_dxtn is not present.  Also
66 piglit tests.

NOTE: This is a candidate for the 9.2 branch.
Reported-by: Paul Wise <pabs@debian.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-30 11:49:09 -07:00
Eric Anholt
b188467fdf mesa: Rip out more extension checking from texformat.c.
You should only be flagging the formats as supported if you support them
anyway.

NOTE: This is a candidate for the 9.2 branch. (required for next commit)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-30 11:49:07 -07:00
Eric Anholt
b1080cfbdb i965: Switch gen4-6 to using the sampler's base level for GL BASE_LEVEL.
Thanks to Ken for trawling through my neglected public branches and
finding the bug in this change (inside a megacommit) that made me abandon
this work.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-30 11:30:45 -07:00
Eric Anholt
f217791ee2 i965/gen7: Use the base_level field of the sampler to handle GL's BASE_LEVEL.
This avoids the need to get the inter- and intra-tile offset and adjust
our miptree info based on them.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-30 11:30:45 -07:00
Eric Anholt
2e2445fa7e i965: Add missing state reset at the end of blorp.
These are things that happen to be occurring because of the batch flush at
the start of the blorp op (which exists to prevent batch space or aperture
space overflow), but the intention was for this sequence of state resets at
the end of blorp to be everything necessary for the next draw call.

Found when debugging the next commit, by comparing brw_new_batch() and
intel_batchbuffer_reset() to brw_blorp_exec().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-30 11:30:44 -07:00
Eric Anholt
85aff83f3e i965: Drop extra flush when calling intel_miptree_map_raw().
The code that got replaced with map_raw didn't do the flush, but now
map_raw() is responsible for it and we don't have to worry about it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-30 11:30:44 -07:00
Eric Anholt
535fbf286c i965: Make a slight distinction in perf debug for BOs versus miptrees.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-30 11:30:44 -07:00
Eric Anholt
7801a8cc89 intel: Reuse intel_glFlush().
v2 (Kenneth Graunke): Rebase on latest master.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-30 11:30:44 -07:00
Eric Anholt
313f2bc32b intel: Add support for the new flush_with_flags extension.
This gives us more information about why we're flushing that we can
use for handling our throttling.

v2 (Kenneth Graunke): Rebase on latest master, add missing
   FLUSH_VERTICES and FLUSH_CURRENT, which fixes a regression in Glean's
   polygonOffset test.
v3 (anholt): Drop FLUSH_CURRENT -- FLUSH_VERTICES is what we need, which
   is "get any queued prims out of VBO and into the driver", not "update
   ctx->Current so we can read it with the CPU."  Also drop batch->used
   check, which intel_batchbuffer_flush() does anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-30 11:30:44 -07:00
Eric Anholt
bbdc83bca9 intel: Add a batch flush between front-buffer downsample and X protocol.
This was already happening because blorp happens to flush at the end of
every call, but we have been talking about removing that at some point,
and this would surely get overlooked.

v2 (Kenneth Graunke): Rebase on latest master.  Note that we did remove
   the other flush, and this change actually did get overlooked!

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-30 11:30:44 -07:00
Eric Anholt
6404fcb266 i965: Directly call intel_batchbuffer_flush() after i915 split.
intel_flush() now did nothing except call through (and
intel_batchbuffer_flush() does the no-op check, too!)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-30 11:30:44 -07:00
Eric Anholt
09e2df5961 i965/vs: Fix regression on pre-gen6 with no VS uniforms in use.
df06745c5a made it so that we didn't
allocate extra uniform space for unused clip planes, which also
incidentally made us not allocate any space at all, which we were relying
on for this no-uniforms case.  Instead of putting the knowledge of this
special HW exception into the thing that normally preallocates prog_data
for us, just allocate it here.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68766
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-30 11:29:50 -07:00
Vadim Girlin
f7217b99f2 r600g: enable SB backend by default
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-08-30 15:51:11 +04:00
Vadim Girlin
29ff2e907d r600g: fix color exports when we have no CBs
We need to export at least one color if the shader writes it,
even when nr_cbufs==0.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-08-30 15:51:11 +04:00
Vinson Lee
74be77a99e nvc0/ir: Initialize NVC0LegalizePostRA member variables.
Fixes "Uninitialized pointer field" defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-08-29 20:42:24 -07:00
Roland Scheidegger
a479f34025 gallivm: support per-pixel min/mag filter in SoA path
Since we can have per-pixel lod we should also honor the filter per-pixel
(in fact we didn't honor it per quad either in the multiple quad case).
Do this by running the linear path and simply beating the weights into shape
(the sample with the higher weight is the one which should have been chosen
with nearest filtering hence adjust filter weight to 1.0/0.0 based on that).
If all pixels use nearest filter (either min or mag) then still run just a
nearest filter as this is way cheaper (probably around 4 times faster for 2d,
more for 3d case) and it should be relatively rare that pixels really need
different filtering. OTOH if all pixels would require linear don't do anything
special since the linear path with filter adjustments shouldn't really be all
that much more expensive than ordinary linear, and we think it's rare that
min/mag filters are configured differently so there doesn't seem much value
in trying to optimize this further.
This does not yet fix the AoS path (though currently AoS is only used for
single quads hence it could be considered less broken, just never honoring
per-pixel filter decision but doing it per quad).

v2: simplify code a bit (unify min linear and min nearest cases)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-30 02:16:45 +02:00
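The weight adjustment described in a479f34025 can be shown in one dimension. This is a scalar Python sketch under stated assumptions: the real code is vectorized LLVM IR generation, the function names are hypothetical, and sending a tie at exactly 0.5 to the second texel is an arbitrary choice made here.

```python
def lerp(a, b, w):
    """Linear interpolation: weight w applies to b, (1 - w) to a."""
    return a + (b - a) * w


def sample_1d(texel0, texel1, weight, use_nearest):
    """Run the linear path, snapping the weight for pixels that want nearest."""
    if use_nearest:
        # The texel with the higher weight is the one nearest filtering
        # would have picked, so force its weight to 1.0.
        weight = 1.0 if weight >= 0.5 else 0.0
    return lerp(texel0, texel1, weight)


def choose_path(per_pixel_nearest):
    """Pick the cheap nearest path only if every pixel wants nearest."""
    if all(per_pixel_nearest):
        return "nearest"
    return "linear_with_adjusted_weights"
```

A pixel with weight 0.25 between texels 10 and 20 yields 12.5 when filtered linearly and 10.0 when it wants nearest, using the same linear code path.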
Roland Scheidegger
81cfcdbd87 gallivm: don't calculate square root of rho if we use accurate rho method
While a sqrt here and there shouldn't hurt much (depending on the cpu) it is
possible to completely omit it since rho is only used for calculating lod and
there log2(x) == 0.5*log2(x^2). Depending on the exact path taken for
calculating lod this means we get a simple mul instead of sqrt (in case of
nearest mip filter in fact we don't need to replace the sqrt with something
else at all), only in some not very useful path this doesn't work (combined
brilinear calculation of int level and fractional lod, accurate rho calc but
brilinear filtering seems odd).
Apart from being faster as an added bonus this should increase our crappy
fractional accuracy of lod, since fast_log2 is only good for ~3bits and this
should increase accuracy by one bit (though not used if dimension is just one
as we'd need an extra mul there as we never had the squared rho in the first
place).

v2: use separate ilog2_sqrt function if we have squared rho.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-30 02:16:45 +02:00
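The identity that 81cfcdbd87 relies on, log2(x) == 0.5*log2(x^2), can be checked numerically. The Python sketch below uses hypothetical function names and a simplified rho (the length of a single derivative vector) purely for illustration; it uses exact `math.log2`, not the approximate fast_log2 mentioned in the commit.

```python
import math


def lod_from_rho(rho):
    """LOD computed the old way, from rho itself (needs a sqrt upstream)."""
    return math.log2(rho)


def lod_from_rho_squared(rho_sq):
    """LOD computed from squared rho: the sqrt becomes a multiply by 0.5."""
    return 0.5 * math.log2(rho_sq)


# Simplified example: rho^2 = dx^2 + dy^2 for one derivative vector.
dx, dy = 3.0, 4.0
rho_sq = dx * dx + dy * dy
assert abs(lod_from_rho(math.sqrt(rho_sq)) - lod_from_rho_squared(rho_sq)) < 1e-12
```

Both functions return the same LOD, but the second never takes a square root.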
Roland Scheidegger
10e40ad11d gallivm: refactor num_lods handling
This is just preparation for per-pixel (or per-quad in case of multiple quads)
min/mag filter since some assumptions about number of miplevels being equal
to number of lods no longer hold true.
This change does not change behavior yet (though theoretically when forcing
per-element path it might be slower with different min/mag filter since the
code will respect this setting even when there are no mip maps now in this case,
so some lod calcs will be done per-element just ultimately still the same
filter used for all pixels).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-30 02:16:45 +02:00
Vinson Lee
4a6d2f3dd7 radeonsi: Early return if no depth or stencil on release builds.
Fixes "Missing break in switch" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-08-29 15:49:12 -07:00
Rob Clark
de10d383d0 freedreno: pipe loader for either kgsl or msm
The downstream android kernel driver is "kgsl", the upstream drm/kms
driver is called "msm".  Since libdrm_freedreno handles the differences
between the two, we need to load the same thing for either device.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-29 17:35:05 -04:00
Rob Clark
e95b7d89b9 freedreno: updates for msm drm/kms driver
There were some small API tweaks in libdrm_freedreno to enable support
for msm drm/kms driver.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-29 17:35:05 -04:00
Rob Clark
0267f264cc freedreno/a3xx/compiler: handle sync flags better
We need to set the flag on all the .xyzw components that are written by
the instruction, not just on .x.  Otherwise a later use of rN.y (for
example) will not trigger the appropriate sync bit to be set.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-29 17:35:04 -04:00
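The bookkeeping change in 0267f264cc can be pictured with a toy model. This is a Python sketch: the set-of-(register, component) representation and both function names are hypothetical and do not reflect the compiler's actual data structures.

```python
def mark_needs_sync(pending, dst_reg, writemask):
    """Flag every component written by the instruction, not just .x."""
    for comp in "xyzw":
        if comp in writemask:
            pending.add((dst_reg, comp))


def needs_sync(pending, src_reg, comp):
    """True if reading src_reg.comp must set the sync bit first."""
    return (src_reg, comp) in pending
```

After an instruction writes r2.xy, a later read of r2.y is recognised as needing the sync bit; flagging only r2.x would have missed it.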
Rob Clark
4a2b5b2384 freedreno/a3xx/compiler: better const handling
Seems like most/all instructions have some restrictions about const src
registers.  It seems like the 2 src (cat2) instructions can take at most
one const, and the 3 src (cat3) instructions can take at most one const
in the first 2 arguments.  And so on.  Handle this properly now.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-29 17:35:04 -04:00
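The const-source limits that 4a2b5b2384 describes for cat2 and cat3 can be written down as a small check. The commit message does not say how violations are handled, so copying surplus consts into a temporary register first is an assumption of this Python sketch, and `consts_to_move` is a hypothetical helper.

```python
def consts_to_move(category, src_is_const):
    """Return indices of const sources that exceed the instruction's limit.

    category 2: at most one const among the two sources.
    category 3: at most one const among the first two of three sources.
    Assumption: the returned sources would be moved to a temporary first.
    """
    if category == 2:
        limited = range(len(src_is_const))
    elif category == 3:
        limited = range(min(2, len(src_is_const)))
    else:
        raise ValueError("only cat2 and cat3 are modelled in this sketch")

    to_move = []
    seen_const = False
    for i in limited:
        if src_is_const[i]:
            if seen_const:
                to_move.append(i)
            else:
                seen_const = True
    return to_move
```

For a cat3 instruction with consts in all three slots, only the second source is reported, since the restriction as described covers just the first two arguments.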
Anuj Phogat
9c0b7be964 glsl: Allow precision qualifiers for sampler types
GLSL 1.30 doesn't allow precision qualifiers on sampler types,
but GLSL ES does allow them on sampler types. This seems like
an oversight (since the intention of including these in GLSL 1.30
is to allow compatibility with ES shaders).

Currently, Mesa allows "default" precision qualifiers to be set for
sampler types in GLSL (commit d5948f2). This patch makes it follow
GLSL ES rules and also allow declaring sampler variables with a
precision qualifier in GLSL 1.30 (and later). e.g.
uniform lowp sampler2D sampler;

This fixes a shader compilation error in Khronos OpenGL conformance
test "depth_texture_mipmap".

V2: Update comments.
Signed-off-by: Ian Romanick <idr@lists.freedesktop.org>

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <idr@lists.freedesktop.org>
Cc: <mesa-stable@lists.freedesktop.org>
2013-08-29 12:10:57 -07:00
Matt Turner
1ecfdba98a glsl: Add heuristics to print floating-point numbers better.
v2: Fix *.expected files to match.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-29 12:07:28 -07:00
Jonathan Gray
57cf5946ce radeonsi: Make sure libdrm_radeon headers are picked up from the right place
And remove libdrm/ from a winsys include statement.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
2013-08-29 15:37:44 +02:00
Brian Paul
4e7f1346ae draw: fix point/line/triangle determination in draw_need_pipeline()
The previous point/line/triangle() functions didn't handle GS primitives.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-29 07:29:31 -06:00
Christian König
aebd065a64 radeon/uvd: fix MPEG2/4 ref frame index limit
Otherwise the first few frames have an incorrect reference index.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-08-29 08:51:12 +02:00
Vinson Lee
57684d52e9 nouveau: Copy m4x4 and m8x8 separately.
Silences Coverity "Out-of-bounds access" defect.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-08-28 23:23:49 -07:00
Kenneth Graunke
df06745c5a i965: Allocate just enough space for user clip planes in uniform arrays.
Previously, we allocated space in brw_vs_prog_data's params and
pull_params arrays for MAX_CLIP_PLANES vec4s---even when it wasn't
necessary.

On a 64-bit architecture, this used 0.5 kB of space (8 clip planes *
4 floats per plane * 8 bytes per float pointer * 2 arrays of pointers =
512 bytes).  Since this cost was per-vertex shader, it added up.

Conveniently, we already store the number of clip plane constants in the
program key.  By using that, we can allocate the exact amount of space
needed.  For the common case where user clipping is disabled, this means
0 bytes.
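
The arithmetic above can be written out as a small sketch. The function
and constant names are hypothetical, and the pointer size is passed in
explicitly so the numbers do not depend on the build machine.

```cpp
#include <cstddef>

constexpr std::size_t kFloatsPerPlane = 4;  // each clip plane is a vec4
constexpr std::size_t kNumArrays = 2;       // params and pull_params

// Bytes taken by the clip-plane slots across both pointer arrays.
constexpr std::size_t clip_plane_param_bytes(std::size_t num_planes,
                                             std::size_t pointer_size)
{
    return num_planes * kFloatsPerPlane * pointer_size * kNumArrays;
}

// Worst case from the message: 8 planes with 8-byte pointers.
static_assert(clip_plane_param_bytes(8, 8) == 512, "0.5 kB on 64-bit");
// Common case: user clipping disabled.
static_assert(clip_plane_param_bytes(0, 8) == 0, "nothing allocated");
```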

While we're here, mention exactly what code requires this extra space,
since it wasn't obvious.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-28 14:12:48 -07:00
Chad Versace
72b3c6c96f i965: Silence unused variable warning in release build
Use `(void) success;` to silence this warning:

  i965/brw_vs.c:481:12:
  warning: unused variable 'success' [-Wunused-variable]
         bool success = do_vs_prog(brw, ctx->Shader.CurrentVertexProgram,
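
A minimal sketch of the pattern, assuming the variable is only consumed by
an assert(). The helper names here are hypothetical stand-ins, not the
i965 functions.

```cpp
#include <cassert>

// Hypothetical stand-in for a compile step that reports success.
inline bool do_compile(int input) { return input >= 0; }

inline int compile_checked(int input)
{
    bool success = do_compile(input);
    // In a release build (NDEBUG defined) assert() expands to nothing, so
    // `success` would otherwise be unused. The cast to void counts as a use.
    (void) success;
    assert(success);
    return input;
}
```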

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-28 10:42:51 -07:00
Brian Paul
031c3393a1 docs: minor fixes for 9.2 release notes
Fix incorrect </li> tag, fix language.
(cherry picked from commit 2377205bcb)
2013-08-27 18:59:05 -06:00
Ian Romanick
e496583975 docs: Add news item for 9.2 release
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-27 16:38:57 -07:00
Ian Romanick
9f2608bc46 docs: Import 9.2 release notes
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-27 16:38:57 -07:00
Fabian Bieler
cd18269705 mesa/main: Check for 0 size draws after validation.
When validating draw parameters move check for 0 draw count last
(drawing with count 0 is not an error), so that other parameters (e.g.: the
primitive type) are validated and the correct errors (if applicable) are
generated.

From the OpenGL 3.3 spec page 33 (page 48 of the PDF):
"[Regarding DrawArraysOneInstance, in terms of which other draw operations
are defined:]
If count is negative, an INVALID_VALUE error is generated."

This patch also changes the behavior of MultiDrawElements to perform the draw
operation if some primitive's index counts are zero.
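
The ordering described above might look roughly like the sketch below.
All names are hypothetical, the valid-mode range is a placeholder, and the
relative order of the negative-count and mode checks is an assumption. The
point illustrated is only that the zero-count test comes last.

```cpp
// Hypothetical result codes loosely mirroring GL error enums.
enum class DrawCheck { Draw, SkipNoError, InvalidEnum, InvalidValue };

// Placeholder primitive-type check: modes 0..6 stand in for valid ones.
inline bool valid_mode(int mode) { return mode >= 0 && mode <= 6; }

inline DrawCheck validate_draw_arrays(int mode, int count)
{
    if (count < 0)
        return DrawCheck::InvalidValue;  // negative count is an error
    if (!valid_mode(mode))
        return DrawCheck::InvalidEnum;   // reported even when count == 0
    if (count == 0)
        return DrawCheck::SkipNoError;   // checked last: nothing to draw
    return DrawCheck::Draw;
}
```

With the zero-count test first, a call with an invalid mode and a count of
0 would return early and never report the mode error.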

Signed-off-by: Fabian Bieler <fabianbieler@fastmail.fm>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-27 15:11:52 -07:00
Matt Turner
ac74de3710 glsl: Add built-ins from ARB_shader_bit_encoding to ARB_gpu_shader5.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-27 15:06:16 -07:00
Matt Turner
4929be0b5f i965/vs: Add support for translating ir_triop_fma into MAD.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-27 15:03:30 -07:00
Matt Turner
530842127e i965/fs: Add support for translating ir_triop_fma into MAD.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-27 15:03:30 -07:00
Matt Turner
e817b94a2c i965/fs: Assert that ir_expressions are usable by 3-src instructions.
MAD will be generated directly from ir_triop_fma, so this assertion
checks that all ir_expressions are usable.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-27 15:03:30 -07:00
Matt Turner
d55c543c36 glsl: Add support for new fma built-in in ARB_gpu_shader5.
v2: Add constant folding support.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-27 15:03:30 -07:00
Matt Turner
6829c18609 glsl: Add new fma built-in IR and prototype from ARB_gpu_shader5.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-27 15:03:30 -07:00
Marek Olšák
adb93e3bda r300g: enable MSAA on r300-r400, be careful about using color compression
MSAA was tested by one user on RS690 and it works for him with color
compression (CMASK) disabled. Our theory is that his chipset lacks CMASK RAM.

Since we don't have hardware documentation about which chipsets actually have
CMASK RAM, I had to take a guess based on the presence of HiZ.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-08-27 23:18:54 +02:00
Fabio Pedretti
aa3905423e configure.ac: Bump Wayland requirement to 1.2.0
Since 8d29b52 wayland 1.2.0 is required.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-27 08:40:40 -07:00
Roland Scheidegger
bd3909f265 draw: clean up setting stream out information a bit
In particular no one is interested in the vertex count, so drop that,
and also drop the duplicated num_primitives_generated /
so.primitives_storage_needed variables in drivers. I am unable for now to figure
out if primitives_storage_needed in SO stats (used for d3d10) should
increase if SO is disabled, though the equivalent num_primitives_generated
used for OpenGL definitely should increase. In any case we were only counting
when SO is active both in softpipe and llvmpipe anyway so don't pretend there's
an independent num_primitives_generated counter which would count always.
(This means the PIPE_QUERY_PRIMITIVES_GENERATED count will still be wrong just
as before, should eventually fix this by doing either separate counting for this
query or adjust the code so it always counts this even if SO is inactive depending
on what's correct for d3d10.)

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-08-27 16:59:39 +02:00
Roland Scheidegger
aff2ecf09a llvmpipe: support nested/overlapping queries for all query types
Resetting the counters cannot work with nested/overlapping
queries.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-08-27 16:59:01 +02:00
Roland Scheidegger
4900e625bd softpipe: support nested/overlapping queries for all query types
Resetting the counters cannot work with nested/overlapping
queries.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-08-27 16:58:20 +02:00
Matt Turner
d8ac987f6a glsl: Disallow uniform block layout qualifiers on non-uniform block vars.
Cc: 9.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68460
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-26 23:19:14 -07:00
Kristian Lehmann
cec7b5c5bc Fixed and/or order mistake, resulting in compiling llvmpipe without llvm installed
Cc: 9.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68544
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-26 22:13:45 -07:00
Ian Romanick
d127a0343d i915: Optimize SEQ and SNE when two operands are uniforms
SEQ and SNE are not native i915 instructions, so they each generate at
least 3 instructions.  If both operands are uniforms or constants, we
get 5 instructions like:

                U[1] = MOV CONST[1]
                U[0].xyz = SGE CONST[0].xxxx, U[1]
                U[1] = MOV CONST[1].-x-y-z-w
                R[0].xyz = SGE CONST[0].-x-x-x-x, U[1]
                R[0].xyz = MUL R[0], U[0]

This code is stupid.  Instead of having the individual calls to
i915_emit_arith generate the moves to utemps, do it in the caller.  This
results in code like:

                U[1] = MOV CONST[1]
                U[0].xyz = SGE CONST[0].xxxx, U[1]
                R[0].xyz = SGE CONST[0].-x-x-x-x, U[1].-x-y-z-w
                R[0].xyz = MUL R[0], U[0]

This allows fs-temp-array-mat2-index-col-wr and
fs-temp-array-mat2-index-row-wr to fit in hardware limits (instead of
falling back to software rasterization).
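
The arithmetic behind the SGE/SGE/MUL sequence above can be checked with a
scalar sketch. The function names are hypothetical and only one component
is modelled.

```cpp
// SGE for one component: 1.0 when a >= b, otherwise 0.0.
inline float sge(float a, float b) { return a >= b ? 1.0f : 0.0f; }

// SEQ built from two SGE results and a MUL:
//   (a >= b) * (-a >= -b), where -a >= -b is the same test as a <= b.
inline float seq(float a, float b)
{
    float ge = sge(a, b);    // first SGE
    float le = sge(-a, -b);  // second SGE on the negated operands
    return ge * le;          // MUL: 1.0 only when both hold, i.e. a == b
}
```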

NOTE: Without pending patches to the piglit tests, these tests will now
fail.  This is an unrelated, pre-existing issue.

v2: Copy most of the body of the commit message into comments in the
code.  Suggested by Eric.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-08-26 22:11:26 -07:00
Tom Stellard
f3e86d4a68 clover: Don't use PIPE_TRANSFER_UNSYNCHRONIZED for blocking copies
CC: "9.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2013-08-26 18:27:03 -07:00
Niels Ole Salscheider
ef6ed7220a st/clover: Add event to deps even if it has been triggered
The command is submitted once the event has been triggered, but it might not
have completed yet. Therefore, we have to add it to deps in order to wait on it.

Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2013-08-26 18:25:17 -07:00
Niels Ole Salscheider
4a3505d548 st/clover: Profiling support
Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Acked-by: Francisco Jerez <currojerez@riseup.net>
2013-08-26 18:25:17 -07:00
Dave Airlie
4763a032a0 tgsi_build: fix order of arguments for ind register build
This was broken when arrayid was added.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-08-27 10:41:27 +10:00
Dave Airlie
81204d0e9c tgsi: finish declaration parsing for arrays.
I previously fixed this partly in 9e8400f4c9,
however I didn't go far enough in testing it, now when I parse a TGSI shader
with arrays in it my iterator can see the ArrayID set to the proper value.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-08-27 10:41:09 +10:00
Brian Paul
92cbfded6a svga: replace 0 with PIPE_OK in a few places
2013-08-26 15:49:16 -06:00
Brian Paul
5e7ac28ebf swrast: init i0, i1 values to silence warnings
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-26 12:52:06 -06:00
Brian Paul
ef47ab520d mesa: init dst values in COPY_CLEAN_4V_TYPE_AS_FLOAT()
to silence gcc 4.8.1 warnings.  And improve the ASSERT(0) call.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-26 12:52:06 -06:00
Brian Paul
f91f6ef739 glsl: init limit=0 to silence uninitialized var warning
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-26 12:52:06 -06:00
Kenneth Graunke
d65e3c082a i965/vs: Allocate register set once at context creation.
Now that we use a fixed set of register classes, we can set up the
register set and conflict graphs once, at context creation, rather than
on every VS compile.  This is obviously less expensive, and also what
we already do in the FS backend.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-26 11:21:10 -07:00
Kenneth Graunke
a149f744d9 i965/vs: Move base_reg_count computation to brw_alloc_reg_set().
We're soon going to be calling brw_alloc_reg_set() from outside of the
visitor, where we don't have the precomputed "max_grf" variable handy.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-26 11:21:10 -07:00
Kenneth Graunke
7aaaa8bc8f i965/vs: Expose the payload registers to the register allocator.
For now, nothing else can get allocated over them.  That may change at
some point in the future.

This also means that base_reg_count can be computed without knowing the
number of registers used for the payload, which is required if we want
to allocate the register set once at context creation time.

See commit 551e1cd44f, which implemented
virtually identical code in the FS backend.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-26 11:21:10 -07:00
Kenneth Graunke
528d70d0b5 i965/vs: Use a fixed set of register classes.
Arrays, structures, and matrices use large VGRFs of arbitrary sizes.
However, split_virtual_grfs() breaks those down into VGRFs of size 1.

For reference, commit 5d90b98879 is the
analogous change to the FS backend.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-26 11:21:10 -07:00
Paul Berry
cfe39ea14e i965: Allow C++ type safety in the use of enum brw_urb_write_flags.
(From a suggestion by Francisco Jerez)

If an enum represents a bitfield of flags, e.g.:

enum E {
  A = 1,
  B = 2,
  C = 4,
  D = 8,
};

then C++ normally prohibits statements like this:

enum E x = A | B;

because A and B are implicitly converted to ints before OR-ing them,
and an int can't be stored in an enum without a type cast.  C, on the
other hand, allows an int to be implicitly converted to an enum
without casting.

In the past we've dealt with this situation by storing flag bitfields
as ints.  This avoids ugly casting at the expense of some type safety
that C++ would normally have offered (e.g. we get no warning if we
accidentally use the wrong enum type).

However, we can get the best of both worlds if we override the |
operator.  The ugly casting is confined to the operator overload, and
we still get the benefit of C++ making sure we don't use the wrong
enum type.
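
A minimal sketch of the overload, using the example enum E from above
rather than the real brw_urb_write_flags definition:

```cpp
// Flag enum from the example above.
enum E {
    A = 1,
    B = 2,
    C = 4,
    D = 8,
};

// The overloaded | keeps the result typed as E; the casts live only here.
inline E operator|(E x, E y)
{
    return static_cast<E>(static_cast<int>(x) | static_cast<int>(y));
}

// Compiles as C++ because the right-hand side is already an E, not an int.
inline E a_or_b()
{
    E x = A | B;
    return x;
}
```

Passing a value of some other enum type to a parameter of type E still
fails to compile, which is the type safety the message refers to.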

v2: Remove unnecessary comment and unnecessary use of "enum" keyword.
Use static_cast.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2013-08-26 10:15:51 -07:00
Paul Berry
612226c43b i965: Remove redundant (and uninitialized) field vec4_generator::ctx.
We never noticed that this field was uninitialized because it is only
used in an error path that reports internal Mesa errors.

But it's silly to have it around anyway because &brw->ctx is
equivalent.

Should fix Coverity defect CID 1063351: Uninitialized pointer field
(UNINIT_CTOR) /src/mesa/drivers/dri/i965/brw_vec4_emit.cpp: 148

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-26 08:55:39 -07:00
Paul Berry
4bf91ca791 i965: Don't try to fall back when creating unrecognized program targets.
If brwNewProgram is asked to create a program for an unrecognized
target, don't bother falling back on _mesa_new_program().  That just
hides bugs.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>

v2: Use assert() rather than _mesa_problem().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-26 08:55:39 -07:00
Michel Dänzer
46fd81e586 radeonsi: Also set the depth component mask bit for stencil-only exports
The stencil values come out wrong without this for some reason.

50 more little piglits.

Cc: mesa-stable@lists.freedesktop.org
2013-08-26 15:47:50 +02:00
Kenneth Graunke
7fa18774bd glsl: Add built-in function prototypes for GLSL 3.30
330.frag is a direct copy of 150.frag.
330.glsl is 150.glsl combined with ARB_shader_bit_encoding.glsl.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-25 20:32:39 -07:00
Kenneth Graunke
8f00409d23 glsl: Bump standalone compiler versions to 3.30.
These are necessary in order to compile the built-in functions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-25 20:32:39 -07:00
Kenneth Graunke
7950315583 mesa: Set query->EverBound in glQueryCounter().
glIsQuery is supposed to return false for names returned by glGenQueries
until their first use.  BeginQuery is a use, but QueryCounter is also a
use.

From the ARB_timer_query spec:
"A timer query object is created with the command

      void QueryCounter(uint id, enum target);

 [...] If <id> is an unused query object name, the
 name is marked as used [...]"
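
The rule quoted above can be modelled with a small sketch. This is a
hypothetical data structure for illustration, not Mesa's query-object
bookkeeping.

```cpp
#include <unordered_map>

// Hypothetical per-name state: has the name been used yet?
struct Query {
    bool ever_bound = false;
};

struct QueryTable {
    std::unordered_map<unsigned, Query> queries;
    unsigned next_name = 1;

    // Like glGenQueries for one name: reserves it but does not "use" it.
    unsigned gen_query()
    {
        unsigned id = next_name++;
        queries[id] = Query();
        return id;
    }

    // Both entry points count as the first use of a generated name.
    void begin_query(unsigned id) { queries[id].ever_bound = true; }
    void query_counter(unsigned id) { queries[id].ever_bound = true; }

    // A generated name only counts as a query object after its first use.
    bool is_query(unsigned id) const
    {
        auto it = queries.find(id);
        return it != queries.end() && it->second.ever_bound;
    }
};
```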

Fixes Piglit's spec/ARB_timer_query/query-lifetime.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: mesa-stable@lists.freedesktop.org
2013-08-25 20:29:59 -07:00
Henri Verbeet
b5ddaf9975 r600g: Implement the new float comparison instructions for Cayman as well.
I assume this should have been part of commit
7727fbb7c5. This (obviously) fixes a lot of tests.

Signed-off-by: Henri Verbeet <hverbeet@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-08-25 13:00:02 +02:00
Ilia Mirkin
bac6efe8e3 nv30: add forgotten PIPE_CAP_CUBE_MAP_ARRAY cap to list
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-25 10:47:28 +02:00
Ilia Mirkin
293fa4e559 nouveau/video: avoid overwriting base codec init with template
Commit 53e20b8b introduced the use of a template to initialize some
common fields. Move this copying of fields to before the common vp3
fields are initialized.

Reported-by: Martin Peres <martin.peres@labri.fr>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-08-25 10:14:30 +02:00
Rob Clark
56ea2c4816 freedreno/a3xx: don't leak so much
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-24 13:58:01 -04:00
Rob Clark
9b9038496c freedreno/a3xx/compiler: fix SGT/SLT/etc
The cmps.f.* instruction doesn't actually seem to give a float 1.0 or
0.0 output.  It either needs a cov.u16f16 or add.s + sel.f16.  This
makes SGT/SLT/etc more similar to CMP, so handle them in trans_cmp().

This fixes a bunch of piglit tests.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-24 13:23:32 -04:00
Rob Clark
572d4646f7 freedreno/a3xx/compiler: bit of re-arrange/cleanup
It seems there are a number of cases where instructions have limitations
on reading srcs from the const register file, so make
get_unconst() a bit easier to use.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-24 13:23:32 -04:00
Rob Clark
d63bbac3a5 freedreno/a3xx/compiler: make compiler errors more useful
We probably should get rid of assert() entirely, but at this stage it is
more useful for things to crash where we can catch it in a debugger.
With compile_error() we have a single place to set an error flag (to
bail out and return an error on the next instruction) so that will be a
small change later when enough of the compiler bugs are sorted.

But re-arrange/cleanup the error/assert stuff so we at least get a dump
of the TGSI that triggered it.  So we see some useful output in piglit
logs.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-24 13:23:32 -04:00
Rob Clark
4c91930a25 freedreno: fix segfault when no color buffer bound
Don't crash when no color buffer bound.  Something caught when starting
to run piglit, fixes a handful of piglit tests.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-24 13:23:32 -04:00
Rob Clark
7eeab24344 freedreno/a3xx/compiler: cat4 cannot use const reg as src
Category 4 instructions (rsq, rcp, sqrt, etc) seem to be unable to take
a const register as src.  In these cases we need to move the src to a
temporary gpr first.

This is the second case of such a restriction, where the instruction
encoding appears to support a const src, but in fact the hw appears to
ignore that bit.  So split things out into a helper that can be re-used
for any instructions which have this limitation.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-24 13:23:32 -04:00
Rob Clark
2effac5a67 freedreno/a3xx/compiler: use max_reg rather than file_count
Our current (rather naive) register assignment is based on mapping
different register files (INPUT, OUTPUT, TEMP, CONST, etc) based on the
max register index of the preceding file.  But in some cases, the lowest
used register in a file might not be zero.  In which case
file_count[file] != file_max[file] + 1.
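
A sketch of why the max index matters when register files are packed back
to back. The names and the number of files are hypothetical, and this is
not the freedreno allocator.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kNumFiles = 3;  // hypothetical: e.g. INPUT, OUTPUT, TEMP

// file_max[i] is the highest register index used in file i, or -1 if the
// file is unused. Returns the first hardware register given to each file.
inline std::array<int, kNumFiles>
file_base_offsets(const std::array<int, kNumFiles>& file_max)
{
    std::array<int, kNumFiles> base{};
    int next = 0;
    for (std::size_t i = 0; i < kNumFiles; ++i) {
        base[i] = next;
        // Reserve max + 1 slots so that index file_max[i] stays in range,
        // even when lower indices in this file are never used.
        next += file_max[i] + 1;
    }
    return base;
}
```

If the second file used only registers 2 and 3, a count-based layout would
reserve two slots and index 3 would spill into the next file's range.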

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-24 13:23:32 -04:00
Rob Clark
aee1ed708a freedreno/a3xx/compiler: handle saturate on dst
Sometimes things other than color dst need saturating, like if there is
a 'clamp(foo, 0.0, 1.0)'.  So for saturated dst add the extra
instructions to fix up dst.
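
For reference, the fix-up amounts to the following scalar operation. The
function name is hypothetical, and the actual instruction sequence emitted
by the compiler is not shown in the message.

```cpp
#include <algorithm>

// Saturate: clamp to [0.0, 1.0], the same result as clamp(foo, 0.0, 1.0).
inline float saturate(float x)
{
    return std::max(0.0f, std::min(x, 1.0f));
}
```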

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-24 13:23:32 -04:00
Rob Clark
8b250bb8aa freedreno/a3xx/compiler: fix CMP
The 1st src to add.s needs (r) flag (repeat), otherwise it will end up:

  add.s dst.xyzw, tmp.xxxx, -1

instead of:

  add.s dst.xyzw, tmp.xyzw, -1

Also, if we are using a temporary dst to avoid clobbering one of the src
registers, we actually need to use that as the dst for the sel
instruction.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-24 13:23:32 -04:00
Rob Clark
528bee59fe freedreno/a3xx: some texture fixes
Stop hard coding bits that indicate texture type (2d/3d/cube/etc).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-24 13:21:59 -04:00
Rob Clark
fd59f3ea98 freedreno: update register headers
resync w/ rnndb database

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-24 13:12:26 -04:00
Rob Clark
c2babfccb5 freedreno: add debug option to disable scissor optimization
Useful for testing and debugging.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-24 13:11:50 -04:00
Rob Clark
ae1a3f1736 freedreno/a3xx: fix viewport on gmem->mem resolve
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-24 13:04:29 -04:00
Rob Clark
fbef4e795f freedreno/a3xx: fix color inversion on mem->gmem restore
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-08-24 13:04:29 -04:00
Niels Ole Salscheider
288a252523 radeonsi: Handle additional PIPE_COMPUTE_CAP_*
This patch adds support for:
PIPE_COMPUTE_CAP_MAX_INPUT_SIZE
PIPE_COMPUTE_CAP_MAX_LOCAL_SIZE

Return the values reported by the closed source driver for now.

Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-08-23 17:00:01 -07:00
Niels Ole Salscheider
04349541cd radeonsi: copy r600_get_timestamp
Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-08-23 16:59:55 -07:00
Niels Ole Salscheider
db6f4165f4 radeonsi: Implement PIPE_QUERY_TIMESTAMP
Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-08-23 16:59:44 -07:00
Roland Scheidegger
ad9b5b9ae9 gallivm: fix min/mag switchover point for nearest/none mip filter
Previously, the min/mag switchover point when using nearest/none mip
filter was effectively -0.5 which can't be right. Looks like new OpenGL
thinks it's ok if it's always 0.0 (older versions required 0.5 in some
cases), let's hope everybody else thinks that's fine too.
Refactor this slightly and get the per-quad/per-pixel min/mag decision
values further down to sampling, though still only the first component
is used yet.
While here also fix code trying to skip lod bias application etc. when
mipfilter is none, as this is still needed for determining min/mag filter.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-23 23:46:28 +02:00
Jon Severinsson
b47bde0079 gallium/osmesa: Link, not copy, the shared library to the LIB_DIR.
Just like all other mesa libraries...

CC: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 12:58:48 -07:00
Jon Severinsson
aeb9c9e4b0 gallium/osmesa: Always link with the c++ linker.
Just like all other gallium targets...

CC: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 12:58:45 -07:00
Jon Severinsson
c811190430 gallium/osmesa: Make and install an osmesa.pc.
As of "2f142d59 build: Add --enable-gallium-osmesa flag." the pkgconfig
file from classic osmesa is no longer installed when building gallium
osmesa, so copy it to gallium osmesa and install the copy instead.

CC: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 12:58:30 -07:00
Paul Berry
60ddb96f7e i965/gs: Add a data structure for tracking VS output VUE map.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:03:47 -07:00
Paul Berry
06918f84c2 i965/vec4: Make a function for setting up vec4 program key clip info.
This functionality will need to be reused by geometry shaders.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:03:43 -07:00
Paul Berry
5b5d10bcd3 i965: Make prim_to_hw_prim accessible outside brw_draw.c.
We will need access to this array in order to configure the geometry
shader.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:03:38 -07:00
Paul Berry
16512ba70d i965/gs: add GS visitors.
This patch introduces the vec4_gs_visitor class, which translates
geometry shaders from GLSL IR to back-end opcodes.

This class is derived from vec4_visitor (which is also the base class
for vec4_vs_visitor), so as a result most of the back end code is
shared.  The only parts that differ are:

- Geometry shaders use a different input payload organization, since
  the inputs need to match up with the outputs of the previous
  pipeline stage (vec4_gs_visitor::setup_payload() and
  vec4_gs_visitor::setup_varying_inputs()).

- Geometry shader input array dereferences need a special stride
  computation, since all geometry shader inputs are interleaved into
  one giant array (vec4_gs_visitor::compute_array_stride()).

- There are no geometry shader system values
  (vec4_gs_visitor::make_reg_for_system_value()).

- At the beginning of a geometry shader, extra data in R0 needs to be
  zeroed out, and a vertex counter needs to be initialized
  (vec4_gs_visitor::emit_prolog()).

- When EmitVertex() appears in the shader, the current contents of
  output variables need to be emitted to the URB, and the vertex
  counter needs to be incremented
  (vec4_gs_visitor::visit(ir_emit_vertex *)).

- When generating a URB_WRITE message to output vertex data, the
  current state of the vertex counter needs to be used to store a
  write offset in the message header
  (vec4_gs_visitor::emit_urb_write_header()).

- The URB_WRITE message that outputs vertex data needs to be sent
  using GS_OPCODE_URB_WRITE, since VS_OPCODE_URB_WRITE would overwrite
  the offsets in the message header
  (vec4_gs_visitor::emit_urb_write_opcode()).

- At the end of a geometry shader, the final vertex count needs to be
  delivered using a URB WRITE message
  (vec4_gs_visitor::emit_thread_end()).

- EndPrimitive() functionality is not implemented yet
  (vec4_gs_visitor::visit(ir_end_primitive *)).

- There is no support for assembly shaders
  (vec4_gs_visitor::emit_program_code()).
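
The vertex-counter items in the list above can be modelled with a small
sketch. Everything here is hypothetical: the struct is not part of the
i965 driver, and the offset formula (count times slots per vertex) is an
assumption made for illustration.

```cpp
#include <vector>

struct GsEmitModel {
    unsigned vertex_count;                // initialized by the prolog
    unsigned slots_per_vertex;            // output size of one vertex
    std::vector<unsigned> write_offsets;  // offset used by each URB write

    explicit GsEmitModel(unsigned slots)
        : vertex_count(0), slots_per_vertex(slots) {}

    // EmitVertex(): write the current outputs at an offset derived from
    // the counter, then advance the counter.
    void emit_vertex()
    {
        write_offsets.push_back(vertex_count * slots_per_vertex);
        ++vertex_count;
    }

    // Thread end: the final vertex count is what gets delivered.
    unsigned thread_end() const { return vertex_count; }
};
```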

v2: Make num_input_vertices const.  Refer to registers as rN rather
than gN, for consistency with the PRM.  Fix misspelling.  Improve
comment in the ir_emit_vertex visitor explaining why we emit vertices
inside a conditional.  Enclose the conditional code in the
ir_emit_vertex visitor between curly braces.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:03:34 -07:00
Paul Berry
35bdd552d5 i965/gs: Add GS_OPCODE_SET_DWORD_2_IMMED.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:03:31 -07:00
Paul Berry
7417eddea9 i965/gs: Add GS_OPCODE_SET_VERTEX_COUNT.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:03:27 -07:00
Paul Berry
ce722fd65d i965/gs: Add GS_OPCODE_SET_WRITE_OFFSET.
v2: Added a comment to vec4_generator::generate_gs_set_write_offset().

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:03:23 -07:00
Paul Berry
4416cb7992 i965/gs: Add GS_OPCODE_THREAD_END.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:03:19 -07:00
Paul Berry
96eb2f3536 i965/gs: Add GS_OPCODE_URB_WRITE.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:03:15 -07:00
Paul Berry
eaa63cbbc2 i965/gs: Add a flag allowing URB write messages to use a per-slot offset.
This will be used by geometry shaders to implement the EmitVertex()
function, since it requires writing data to a dynamically-determined
offset within the geometry shader's URB entry.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:03:12 -07:00
Paul Berry
a9e8c10bd7 i965: Combine 4 boolean args of brw_urb_WRITE into a flags bitfield.
The arguments to brw_urb_WRITE() were getting pretty unwieldy, and we
have to add more flags to support geometry shaders anyhow.

Also plumb these flags through brw_clip_emit_vue(),
brw_set_urb_message(), and the vec4_instruction class.
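
A sketch of the general refactoring, with hypothetical flag names. The
message does not list the four booleans being replaced, so the names below
are placeholders rather than the driver's definitions.

```cpp
// Placeholder flags standing in for the four former boolean arguments.
enum WriteFlags : unsigned {
    WRITE_NO_FLAGS = 0,
    WRITE_ALLOCATE = 1u << 0,
    WRITE_UNUSED   = 1u << 1,
    WRITE_EOT      = 1u << 2,
    WRITE_COMPLETE = 1u << 3,
};

// One flags word replaces four positional bools, so a new flag can be
// added later without changing this signature.
inline bool is_final_write(unsigned flags)
{
    return (flags & WRITE_EOT) != 0 && (flags & WRITE_COMPLETE) != 0;
}
```

A call such as is_final_write(WRITE_EOT | WRITE_COMPLETE) names each option
at the call site, where four bare true/false arguments would not.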

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:03:08 -07:00
Paul Berry
591fc0861c i965/gs: Add a case to brwNewProgram() for geometry shaders.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:03:05 -07:00
Paul Berry
ebbb8c0c76 i965/gs: Create structs for use by GS program compilation.
v2: Make id "unsigned" rather than "GLuint".

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:03:01 -07:00
Paul Berry
3167dca3d4 i965/gs: Add a case to brwBindProgram() for geometry shaders.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:02:58 -07:00
Paul Berry
158dcdc0e2 i965/gs: Add brw->geometry_program.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:02:54 -07:00
Paul Berry
7f57101ad5 i965/vec4: Virtualize setup_payload instead of setup_attributes.
When I initially generalized the vec4_visitor class in preparation for
geometry shaders, I assumed that the setup_attributes() function would
need to be different between vertex and geometry shaders, but its
caller, setup_payload(), could be shared.  So I made
setup_attributes() a virtual function.

It turns out this isn't true; setup_payload() needs to be different
too, since the geometry shader payload sometimes includes an extra
register (primitive ID) that has to come before uniforms.

So setup_payload() needs to be the virtual function instead of
setup_attributes().
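
A minimal sketch of the resulting shape, assuming simplified stand-in classes (the `*_sketch` names, the register counts, and the assumption that r0 holds a fixed header are illustrative, not the driver's actual code):

```cpp
// Hypothetical, simplified stand-ins for the visitor classes.
class vec4_visitor_sketch {
public:
    virtual ~vec4_visitor_sketch() {}

    // Each stage lays out its own payload and returns the first free register.
    virtual int setup_payload() = 0;

    int num_uniform_regs = 0;
    int num_attribute_regs = 0;
    int uniform_start = -1;
    int attribute_start = -1;

protected:
    // Shared helpers: record where a block starts, return the register after it.
    int setup_uniforms(int reg)   { uniform_start = reg;   return reg + num_uniform_regs; }
    int setup_attributes(int reg) { attribute_start = reg; return reg + num_attribute_regs; }
};

class vs_visitor_sketch : public vec4_visitor_sketch {
public:
    int setup_payload() override {
        int reg = 1;  // assumption: r0 holds the fixed thread header
        reg = setup_uniforms(reg);
        return setup_attributes(reg);
    }
};

class gs_visitor_sketch : public vec4_visitor_sketch {
public:
    bool uses_primitive_id = false;

    int setup_payload() override {
        int reg = 1;
        if (uses_primitive_id)
            reg++;  // the primitive ID occupies r1, ahead of the uniforms
        reg = setup_uniforms(reg);
        return setup_attributes(reg);
    }
};
```

Because the extra register has to come before the uniforms, overriding only the attribute step could not express the geometry shader layout.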

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:02:51 -07:00
Paul Berry
626495d269 i965/vec4: Allow for dispatch_grf_start_reg to vary.
Both 3DSTATE_VS and 3DSTATE_GS have a dispatch_grf_start_reg control,
which determines the register where the hardware delivers data sourced
from the URB (push constants followed by per-vertex input data).

For vertex shaders, we always set dispatch_grf_start_reg to 1, since
R1 is always the first register available for push constants in vertex
shaders.

For geometry shaders, we'll need the flexibility to set
dispatch_grf_start_reg to different values depending on the behaviour
of the geometry shader; if it accesses gl_PrimitiveIDIn, we'll need to
set it to 2 to allow the primitive ID to be delivered to the thread in
R1.

This patch eliminates the assumption that dispatch_grf_start_reg is
always 1.  In vec4_visitor, we record the regnum that was passed to
vec4_visitor::setup_uniforms() in prog_data for later use.  In
vec4_generator, we consult this value when converting an abstract
UNIFORM register to a concrete hardware register.  And in the code
that emits 3DSTATE_VS, we set dispatch_grf_start_reg based on the
value recorded in prog_data.

This will allow us to set dispatch_grf_start_reg to the appropriate
value when compiling geometry shaders.  Vertex shaders will continue
to always use a dispatch_grf_start_reg of 1.
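
The idea can be sketched roughly as below. The struct and function names are hypothetical, and the one-to-one mapping from abstract uniform index to hardware register is a simplifying assumption.

```cpp
// Hypothetical per-program record; only the field discussed here is modeled.
struct prog_data_sketch {
    unsigned dispatch_grf_start_reg;
};

// Pick the start register: 1 normally, 2 when the primitive ID takes r1.
inline unsigned choose_dispatch_grf_start_reg(bool is_geometry_shader,
                                              bool reads_primitive_id)
{
    return (is_geometry_shader && reads_primitive_id) ? 2u : 1u;
}

// Convert an abstract uniform register index to a hardware register number.
// Assumes one abstract index per hardware register, for simplicity.
inline unsigned uniform_hw_reg(const prog_data_sketch &pd, unsigned uniform_index)
{
    return pd.dispatch_grf_start_reg + uniform_index;
}
```

The point is that the generator reads the recorded value instead of hard-coding 1.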

v2: Make dispatch_grf_start_reg "unsigned" rather than "GLuint".

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:02:47 -07:00
Paul Berry
72168f5f00 i965/vec4: Move vec4 data structures and functions to brw_vec4.{cpp,h}.
This patch moves the following things into brw_vec4.{cpp,h}:

- struct brw_vec4_compile
- struct brw_vec4_prog_key
- brw_vec4_prog_data_compare()
- brw_vec4_prog_data_free()

This will allow us to avoid having to include brw_vs.h in
geometry-shader-specific files.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:02:44 -07:00
Paul Berry
e556286802 i965: Make brw_{shader,vec4}.h safe to include from C.
The patch that follows will move the definition of struct
brw_vec4_prog_key from brw_vs.h to brw_vec4.h, making it necessary for
brw_vs.h to include brw_vec4.h (because brw_vs.h defines struct
brw_vs_prog_key, which contains brw_vec4_prog_key as a member).  Since
brw_vs.h is included from C source files, that means that brw_vec4.h
will need to be safe to include from C.  Same for brw_shader.h, since
it is included by brw_vec4.h.
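
A header that is safe to include from both languages typically looks something like the sketch below. The struct and class names are hypothetical; only the `__cplusplus` guard pattern is the point.

```cpp
/* Sketch of a header that both C and C++ translation units can include. */
#ifndef VEC4_HEADER_SKETCH_H
#define VEC4_HEADER_SKETCH_H

#ifdef __cplusplus
extern "C" {
#endif

/* Plain-C declarations: visible to every includer. */
struct vec4_prog_key_sketch {
    unsigned program_string_id;
};

#ifdef __cplusplus
} /* extern "C" */

/* C++-only declarations: hidden from C includers. */
class vec4_header_visitor_sketch {
public:
    explicit vec4_header_visitor_sketch(const vec4_prog_key_sketch &k) : key(k) {}
    vec4_prog_key_sketch key;
};
#endif /* __cplusplus */

#endif /* VEC4_HEADER_SKETCH_H */
```

A C file sees only the struct, while a C++ file sees both the struct and the class.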

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:02:40 -07:00
Paul Berry
5fb13d871e i965: Stop including brw_vs.h from brw_vec4.h.
This is backwards from what we are going to want in the long term, which is:

- brw_vec4.h declares general-purpose vec4 infrastructure needed by
  both VS and GS
- brw_vs.h includes brw_vec4.h and adds VS-specific parts.
- brw_gs.h includes brw_vec4.h and adds GS-specific parts.

Note that at the moment brw_vec4.h contains a fair amount of
VS-specific declarations--I plan to address that in a later patch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:02:37 -07:00
Paul Berry
52bac6e4ff i965: Initialize all elements of ctx->ShaderCompilerOptions.
Otherwise any GS that requires lowering (e.g. one that uses
gl_ClipDistance as an input or output) will fail to work.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:02:34 -07:00
Paul Berry
61a5bd8336 i965: Make brw_{program,vs}.h safe to include from C++.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:02:31 -07:00
Paul Berry
ad65825098 mesa/program: Make prog_instruction.h and program.h safe to include from C++.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:02:25 -07:00
Paul Berry
44e07de3ac glsl: Refactor handling of gl_ClipDistance/gl_ClipVertex linkage rules for GS.
This patch extracts the following logic from
validate_vertex_shader_executable():

(a) Generate an error if the shader writes to both gl_ClipDistance and
    gl_ClipVertex.

(b) Record whether the shader writes to gl_ClipDistance in
    gl_shader_program for use by the back-end.

(c) Record the size of gl_ClipDistance in gl_shader_program for use by
    transform feedback logic.

And moves it into a function that is shared between vertex and
geometry shaders.

Strictly speaking we only need to have shared logic for (b) and (c)
right now (since (a) only matters in compatibility contexts, and we're
only implementing geometry shaders in core contexts right now).  But
the three are closely related enough that it seems sensible to keep
them together.
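
As an illustration only, the three pieces of logic could be grouped into one shared helper along these lines. All names are hypothetical, and the real function operates on the linked shader IR rather than on booleans.

```cpp
#include <string>

// Hypothetical result record for the shared analysis.
struct clip_info_sketch {
    bool uses_clip_distance = false;   // (b) recorded for the back-end
    unsigned clip_distance_size = 0;   // (c) recorded for transform feedback
    std::string error;                 // non-empty if rule (a) is violated
};

// Shared between stages; stage_name is only used in the error text.
inline clip_info_sketch analyze_clip_writes(bool writes_clip_vertex,
                                            bool writes_clip_distance,
                                            unsigned clip_distance_array_size,
                                            const char *stage_name)
{
    clip_info_sketch info;
    if (writes_clip_vertex && writes_clip_distance) {
        // (a) writing both outputs is a link error
        info.error = std::string(stage_name) +
                     " shader writes to both gl_ClipVertex and gl_ClipDistance";
        return info;
    }
    info.uses_clip_distance = writes_clip_distance;
    if (writes_clip_distance)
        info.clip_distance_size = clip_distance_array_size;
    return info;
}
```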

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-23 11:02:15 -07:00
Timothy Arceri
f0072e3c6b mesa: Fix assertion error with glDebugMessageControl
Enums were being converted twice, resulting in incorrect values.
The extra conversion and the now-redundant assert have both been
removed.

Cc: 9.2 <mesa-stable@lists.freedesktop.org>

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-08-23 08:15:19 -06:00
Kenneth Graunke
a27180d0d8 mesa: Specify a better GL_MAX_SERVER_WAIT_TIMEOUT limit.
The previous value of (GLuint64) ~0 has some problems:

GL_MAX_SERVER_WAIT_TIMEOUT is supposed to be a GLuint64 value, but has
to be queried via GetInteger64v(), which returns a GLint64.  This means
that some applications are likely to treat it as a signed integer, where
~0 means -1.  Negative values are nonsensical and problematic.

When interpreted correctly, ~0 translates to about 0.58 million years,
which seems rather excessive.

This patch changes it to 0x1fff7fffffff, which is about 1.11 years.
This is still plenty long, and is the same as both an int64 and uint64.
Applications that accidentally store it in a 32-bit int/unsigned also
get a non-negative value, which is again the same as both int and
unsigned.  This value was suggested by Ian Romanick.
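
The representation properties claimed above can be checked with a few small helpers. This is a sketch with hypothetical helper names. The last two functions convert the raw value to a duration under two different unit assumptions, since the year figures quoted depend on which unit is assumed.

```cpp
#include <cstdint>

constexpr uint64_t kMaxServerWaitTimeoutSketch = 0x1fff7fffffffULL;

// True if the value reads the same as a signed or unsigned 64-bit integer.
constexpr bool fits_in_int64(uint64_t v) { return (v >> 63) == 0; }

// The low 32 bits, as seen by code that truncates to a 32-bit variable.
constexpr uint32_t low32(uint64_t v)
{
    return static_cast<uint32_t>(v & 0xffffffffULL);
}

// True if the truncated value is non-negative when read as a signed 32-bit int.
constexpr bool low32_nonnegative(uint64_t v)
{
    return (low32(v) & 0x80000000u) == 0;
}

// Duration in seconds if the value is counted in nanoseconds.
constexpr double seconds_if_nanoseconds(uint64_t v) { return v / 1e9; }

// Duration in years (365.25 days) if the value is counted in microseconds.
constexpr double years_if_microseconds(uint64_t v)
{
    return v / 1e6 / (365.25 * 24.0 * 3600.0);
}
```

For comparison, `fits_in_int64(~0ULL)` is false, which is the signedness problem described for the previous value.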

v2: Add the ULL suffix on the constant (suggested by Ian).

Fixes Piglit's spec/!OpenGL 3.2/get-integer-64v.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2013-08-22 23:08:20 -07:00
Kenneth Graunke
62411681da meta: Set correct viewport and projection in decompress_texture_image.
_mesa_meta_begin() sets up an orthographic projection and initializes the
viewport based on the current drawbuffer's width and height.  This is
likely the window size, since it occurs before the meta operation binds
any temporary buffers.

decompress_texture_image needs the viewport to be the size of the image
it's trying to draw.  Otherwise, it may only draw part of the image.

v2: Actually set the projection properly too.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68250
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Mak Nazecic-Andrlon <owlberteinstein@gmail.com>
2013-08-22 20:28:53 -07:00
Chad Versace
ce8639a766 i965: Fix misapplication of gles3 srgb workaround
Fixes inconsistent failure of gles2conform/GL2Tests/glUniform/glUniform.test
under gnome-shell. What follows is a description of the bug and its fix.

When intel_update_renderbuffers() allocates a miptree for a winsys
renderbuffer, it propagates the renderbuffer's format to become also the
miptree's format.

If the winsys color buffer format is SARGB, then, in the first call to
eglMakeCurrent, intel_gles3_srgb_workaround() changes the renderbuffer's
format to ARGB. That is, it changes the format from sRGB to non-sRGB.
However, it changes the renderbuffer's format *after*
intel_update_renderbuffers() has allocated the renderbuffer's miptree.
Therefore, when eglMakeCurrent returns, the miptree format (SARGB)
differs from the renderbuffer format (ARGB).

If the X server reallocates the color buffer,
intel_update_renderbuffers() will create a new miptree for the
renderbuffer. The new miptree's format (ARGB) will differ from the old
miptree's format (SARGB). This mismatch between old and new miptrees
causes bugs.

Fix the bug by moving intel_gles3_srgb_workaround() to occur *before*
intel_update_renderbuffers().
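
The ordering problem can be modeled with a toy simulation. Everything here is a hypothetical stand-in: the two functions only mimic the behaviour described above (format propagation and the sRGB-to-non-sRGB switch), not the driver's real code.

```cpp
enum format_sketch { FORMAT_SARGB, FORMAT_ARGB };

struct renderbuffer_sketch {
    format_sketch format;          // the renderbuffer's own format
    format_sketch miptree_format;  // format of the miptree allocated for it
};

// Stand-in for the workaround: drop sRGB from the renderbuffer format.
inline void srgb_workaround_sketch(renderbuffer_sketch &rb)
{
    if (rb.format == FORMAT_SARGB)
        rb.format = FORMAT_ARGB;
}

// Stand-in for miptree allocation: the miptree inherits the current format.
inline void update_renderbuffers_sketch(renderbuffer_sketch &rb)
{
    rb.miptree_format = rb.format;
}

// Returns true if the two formats agree after "make current" in this order.
inline bool formats_agree(bool workaround_first)
{
    renderbuffer_sketch rb = { FORMAT_SARGB, FORMAT_SARGB };
    if (workaround_first) {
        srgb_workaround_sketch(rb);
        update_renderbuffers_sketch(rb);
    } else {
        update_renderbuffers_sketch(rb);
        srgb_workaround_sketch(rb);
    }
    return rb.format == rb.miptree_format;
}
```

With the workaround applied first, the miptree is allocated with the already-adjusted format, so a later reallocation produces the same format again.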

CC: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67934
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-22 10:54:36 -07:00
Roland Scheidegger
bd0b6c5180 gallivm: do per-element lod for lod bias and explicit derivs too
Except for explicit derivs with cube maps which are very bogus anyway.
Just like explicit lod this is only used if no_quad_lod is set in
GALLIVM_DEBUG env var.
Minification is terrible on cpus which don't support true vector shifts
(but should work correctly). Cannot do the min/mag filter decision (if
they are different) per pixel though, only selecting different mip levels
works.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-22 19:05:52 +02:00
Roland Scheidegger
33694a1800 gallivm: (trivial) fix int/uint border color clamping
Just a copy & paste error.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=68409.
Note that the test passing before probably simply means it doesn't verify
clamping of the border color itself as required by the OpenGL spec.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-22 19:05:52 +02:00
Roland Scheidegger
6ff9008544 gallivm: (trivial) fix linear aos sampling of 3d compressed formats
block size depth is always 1 even for compressed formats (unless someone
invents true 3d compressed formats at least which we can't represent).
Nearest (and soa) path had it right.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-22 19:05:52 +02:00
Michel Dänzer
237cb074cb radeonsi: Fix y/z/w component values of TGSI_SEMANTIC_FOG pixel shader inputs
They are defined as constant 0.0/0.0/1.0.

Three more little piglits.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-08-22 16:12:17 +02:00
José Fonseca
fb62388d6a gallium: Support PIPE_FORMAT_R10G10B10A2_UINT.
Same as PIPE_FORMAT_B10G10R10A2_UINT but without the swizzling.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-22 12:14:15 +01:00
José Fonseca
c5f2cd6e41 trace: Handle null tokens.
Used for example on stream out without geometry shader.
2013-08-22 12:14:15 +01:00
Chia-I Wu
b6037e734e ilo: do not need last shader stage for 3DSTATE_SBE
We have set up 3DSTATE_SBE (or 3DSTATE_SF on GEN6) in
ilo_shader_select_kernel_routing().  There is no need to pass the last shader
stage to the GPE function.
2013-08-22 15:18:29 +08:00
Chia-I Wu
627d7ca763 ilo: fix a potential issue with STATE_SIP
Command length is ORed to the wrong place.  Since the ORed value is zero,
there is no real change.
2013-08-22 15:18:29 +08:00
Chia-I Wu
475d7ecce2 ilo: add GEN check to 3DSTATE_CLIP
Assert that gen6_emit_3DSTATE_CLIP is for GEN 6 and 7.
2013-08-22 15:18:29 +08:00
Matt Turner
2f142d596f build: Add --enable-gallium-osmesa flag.
The Gallium implementation is apparently not ready for regular
consumption, so as much as I hate adding more build-time options, here's
another.

Acked-by: Brian Paul <brianp@vmware.com>
2013-08-21 23:07:10 -07:00
Ian Romanick
dded321f92 glsl: Give a warning, not an error, for UBO qualifiers on non-matrices.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59648
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-21 23:06:59 -07:00
Matt Turner
921ef55a72 glsl: Remove ubo_qualifiers_allowed variable.
No longer used.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-08-21 22:47:02 -07:00
Matt Turner
77373e020e glsl: Drop duplicate error messages.
This same message is printed in the validate_matrix_layout_for_type
function.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-08-21 22:47:02 -07:00
Matt Turner
1a45db9705 glsl: Rename ubo_qualifiers_valid to ubo_qualifiers_allowed.
The variable means that UBO qualifiers are allowed in a particular
context (e.g., not allowed in a struct field declaration), rather than a
particular set of UBO qualifiers are valid.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-08-21 22:47:02 -07:00
Kenneth Graunke
9d08756ac7 i965/fs: Add code to print out global copy propagation sets.
This was invaluable when debugging the global copy propagation
algorithm.  We may as well commit it in case someone needs to print
out the sets in the future.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-21 21:05:50 -07:00
Armin K
63ac68bae3 osmesa: Symlink shared library to LIB_DIR
Cc: 9.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Brian Paul <brianp at vmware.com>
Reviewed-by: Brian Paul <brianp at vmware.com>
2013-08-21 17:55:32 -06:00
Brian Paul
e4217396b7 svga: minor clean-ups in emit_hw_vs_vdecl()
2013-08-21 17:55:06 -06:00
Roland Scheidegger
e6013e4bee gallivm: unify sin and cos implementation
The (complicated!) math is all identical, there's just minimal differences how
sign bit is calculated plus there's an additional subtraction for the argument
going into the polynomial for cos.
The logic stays 100% the same (with a small exception, sign bit calculation for
sin is minimally simplified, applying sign mask after xoring the arguments
instead of applying it to each argument).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-21 22:05:53 +02:00
Roland Scheidegger
275d2efeed gallivm: add comment for bogus min/mag filter selection with nearest mip filter
Detected this hunting some other bug, not sure if it really needs fixing but
it is definitely wrong.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-21 22:05:52 +02:00
Roland Scheidegger
21d8fa2759 gallivm: fix rho calculation for 1d case
Was using wrong (undefined) vector element (the elements are at 0/2 position,
not 0/1).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-21 22:05:52 +02:00
Ville Syrjälä
e6893b99ad i965/gen7: Set MOCS L3 cacheability for IVB/BYT (v2)
IVB/BYT also has the same L3 cacheability control in MOCS as HSW,
so let's make use of it.

pts/xonotic and pts/reaction @ 1920x1080 gain ~4% on my IVB GT2. Most
other things show less gains/no regressions, except furmark which
loses some 10 points.

I didn't have a BYT at hand for testing.

v2: Don't check (brw->gen == 7) in gen7 functions. (chadv)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-21 10:14:04 -07:00
Ville Syrjälä
22161983c3 i965/hsw: Populate MOCS for STATE_BASE_ADDRESS (v2)
Just spotted these unpopulated MOCS fields when comparing the code
against BSpec. Set the MOCS to the same as everywhere else in Haswell:
L3-cacheable.

v2: Annotate state packet fields (chadv).

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-21 10:14:04 -07:00
Maarten Lankhorst
10aa3677cc glapi/gen: build temporary files in the build directory
Writing to the source directory can cause multiple parallel builds
from the same source to fail. Create the temporary files in the
build directory.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-21 18:34:59 +02:00
Ian Romanick
f53b634807 mesa: Never advertise _S3TC compressed formats
The NVIDIA driver doesn't expose them, and piglit's
arb_texture_compression-invalid-formats expects them to not be there.

This, with the previous commit, fixes piglit
arb_texture_compression-invalid-formats.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-21 07:48:31 -07:00
Ian Romanick
40550c8ced mesa: Only advertise GL_ETC1_RGB8_OES in ES contexts
There is no extension for this format in desktop GL, so an application
can't give the format back to glCompressedTexImage2D.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-21 07:46:51 -07:00
Ian Romanick
cabd45773b glsl: Track existence of default float precision in GLSL ES fragment shaders
This is required by the spec, and it's a bit tricky because the default
precision is scoped.  As a result, I'm slightly abusing the symbol
table.

Fixes piglit no-default-float-precision.frag tests and the piglit
default-precision-nested-scope-0[1234].frag tests that are currently on
the piglit mailing list for review.

On IRC I got confirmation from cwabbot that ARM (Mali T6xx and T400)
enforces this requirement and from kusma that NVIDIA (Tegra2) enforces
this requirement.  We should be safe from regressing shipping
applications.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-21 07:44:26 -07:00
Ian Romanick
73e2d69792 glsl: Merge precision qualifiers too
We never noticed this before because we previously didn't enforce GLSL ES
fragment shader requirements that precision be defined.  There may also
have been some interaction here with the addition of
GL_ARB_shading_language_420pack, but it doesn't appear to me that it
added any new bugs (just perhaps uncovered some old ones).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-21 07:43:48 -07:00
Ian Romanick
b15b62c54c glsl: Pass type to is_valid_default_precision_type instead of name
This is used by the next patch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-21 07:43:48 -07:00
Rico Schüller
00fcdc81ff vdpau/decode: Fix comment.
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-08-21 11:25:36 +02:00
Rico Schüller
d8d90ecf30 vl/query: Only support VDP_CHROMA_TYPE_420 for 12 bit formats.
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-08-21 11:25:10 +02:00
Roland Scheidegger
4b45b61fef util: add avx2 and xop detection to cpu detection code
Going to need this soon (not going to bother with avx2 intrinsics at this time
but don't want to do workarounds for true vector shifts if llvm itself can use
them just fine and won't need the gazillion instruction emulation).
Not really tested other than my cpu returns 0 for these features...
(I have no idea if llvm actually would emit avx2/xop instructions either...)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-20 23:00:24 +02:00
Roland Scheidegger
9299128bf2 gallivm: fix bogus aos path detection
Need to check the wrap mode of the actually used coords not a fixed 2.
While checking more than necessary would only potentially disable aos and
not cause any harm I'm pretty sure for 3d textures it could have caused
assertion failures (if s,t coords have simple filter and r not).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-20 23:00:24 +02:00
Roland Scheidegger
fe92d7fab4 gallivm: do clamping of border color correctly for all formats
Turns out it is actually very complicated to figure out what a format really
is wrt range, as using channel information for determining unorm/snorm etc.
doesn't work for a bunch of cases - namely compressed, subsampled, other.
Also while here add clamping for uint/sint as well - d3d10 doesn't actually
need this (can only use ld with these formats hence no border) and we could
do this outside the shader for GL easily (due to the fixed texture/sampler
relation), but do it here too just so I can forget about it.

v2: move border color clamping out of fetch texel. Also change it to clamp
the whole border vector at once (and use vectorized load of border color),
which saves a couple of instructions - needs some different handling of
mixed signed/unsigned formats so skip the per channel stuff and just derive
this from first channel except for special formats.
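
As a rough illustration of clamping the whole border vector at once, assuming only three format classes (the enum and function names are hypothetical, and integer and special formats are left out):

```cpp
#include <algorithm>
#include <array>

// Hypothetical, reduced classification of texture formats by value range.
enum border_kind_sketch { KIND_UNORM, KIND_SNORM, KIND_FLOAT };

// Clamp all four border color channels to the range the format can hold.
inline std::array<float, 4> clamp_border(std::array<float, 4> color,
                                         border_kind_sketch kind)
{
    if (kind == KIND_FLOAT)
        return color;  // float formats are left unclamped in this sketch

    const float lo = (kind == KIND_SNORM) ? -1.0f : 0.0f;
    for (float &channel : color)
        channel = std::min(std::max(channel, lo), 1.0f);
    return color;
}
```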

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-20 23:00:24 +02:00
Roland Scheidegger
ac1a2714c7 gallivm: implement better control of per-quad/per-element/scalar lod
There's a new debug value used to disable per-quad lod optimizations
in fragment shader (ignored for vs/gs as the results are just too wrong
typically). Also trying to detect if a supplied lod value is really a
scalar (if it's coming from immediate or constant file) in which case
sampler code can use this to stay on per-quad-lod path (in fact for
explicit lod could simplify even further and use same lod for both
quads in the avx case but this is not implemented yet).
Still need to actually implement per-element lod bias (and derivatives),
and need to handle per-element lod in size queries.

v2: fix comments, prettify.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-20 23:00:24 +02:00
Brian Paul
d427278a2d mesa: use ARRAY_SIZE() macro instead of magic number
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-20 13:14:25 -06:00
Ross Burton
76feef0823 build: fix out-of-tree builds in gallium/auxiliary
The rules were writing files to e.g. util/u_indices_gen.py, but in an
out-of-tree build this directory doesn't exist in the build directory.  So,
create the directories just in case.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Ross Burton <ross.burton@intel.com>
2013-08-20 10:35:14 -07:00
Michel Dänzer
be301f707e radeonsi: Always pre-load separate VGPRs for centroid vs. center interpolation
The LLVM R600 backend currently always uses separate VGPRs for these.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68162
(Centroid interpolation is identical to center interpolation without
multisampling, so the shader hardware was only pre-loading one set of
interpolation coefficients, and the pixel shader code was using
uninitialized values as the centroid interpolation coefficients)

Cc: mesa-stable@lists.freedesktop.org
Tested-by: Laurent Carlier <lordheavym@gmail.com>
2013-08-20 18:50:28 +02:00
Michel Dänzer
5edcb682c9 radeonsi: Fix SPI_BARYC_CNTL register initialization
The centroid / center interpolation related bits have different meanings
as of SI.

Fixes 7 centroid interpolation related piglit tests.
2013-08-20 18:50:10 +02:00
Maarten Lankhorst
86751cbddf gallium/osmesa: add same checks to OSMesaMakeCurrent as the other osmesa
Fixes an OpenGL crash in Wine.

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-08-20 12:36:17 +02:00
Maarten Lankhorst
603160d4c0 gallium/osmesa: link against static libglapi library too to get the gl exports
This should fix missing symbols in an osmesa build against shared
glapi. All OpenGL exports defined in the static glapi were missing, so
link against both to fix this.

I could swear I've done this before, maybe there was a glitch in the matrix.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47824
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-08-20 10:44:53 +02:00
Kenneth Graunke
a4ff1fd388 i965: Shorten sampler loops in precompile key setup.
Now that we have the number of samplers available, we don't need to
iterate over all 16.  This should be particularly helpful for vertex
shaders.

v2: Use the correct shader program (caught by Paul Berry).

This needs to initialize the exact same set of sampler swizzles as
the actual key setup, or else we end up doing recompiles due to some
being XYZW and others being 0.
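
The recompile hazard can be shown with a small model in which a key stores one swizzle per sampler slot. The names and the placeholder swizzle value are hypothetical.

```cpp
#include <algorithm>
#include <array>

constexpr int kMaxSamplersSketch = 16;
constexpr unsigned kSwizzleXyzwSketch = 1;  // placeholder non-zero encoding

struct sampler_key_sketch {
    std::array<unsigned, kMaxSamplersSketch> swizzles;
};

// Initialize swizzles only for the samplers in use; the rest stay 0.
inline sampler_key_sketch make_key(int sampler_count)
{
    sampler_key_sketch key = {};  // zero every slot first
    const int n = std::min(sampler_count, kMaxSamplersSketch);
    for (int i = 0; i < n; i++)
        key.swizzles[i] = kSwizzleXyzwSketch;
    return key;
}

// Two keys hit the same cached program only if every slot matches.
inline bool keys_match(const sampler_key_sketch &a, const sampler_key_sketch &b)
{
    return a.swizzles == b.swizzles;
}
```

If the precompile path filled all 16 slots while the draw-time path filled only the used ones, the keys would differ and the precompiled program would never be reused.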

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-20 01:09:52 -07:00
Chia-I Wu
ce87c51e9a ilo: add ILO_DEBUG=flush
When specified, ilo will print a line similar to

  cp flushed for render with 949+888 DWords (22.4%) because of frame end

for every ilo_cp_flush() call.
2013-08-20 13:54:39 +08:00
Chia-I Wu
216a576e11 ilo: add ILO_DEBUG=draw
It can print out pipe_draw_info and the dirty bits set, useful for debugging.
2013-08-20 13:54:38 +08:00
Vinson Lee
ff3cb378ad r600g/sb: Move memsets of member structs to within constructor bodies.
Silences "Uninitialized pointer field" defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-08-19 17:37:08 -07:00
Ian Romanick
574e4843e9 glsl: Use alignment of container record for its first field
The first field of a record in a UBO has the alignment of the record
itself.
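
A small worked sketch of the rule, assuming std140 layout, where a structure's base alignment is rounded up to that of a vec4 (16 bytes). The helper names are hypothetical.

```cpp
// Round offset up to the next multiple of alignment (alignment must be > 0).
constexpr unsigned align_up(unsigned offset, unsigned alignment)
{
    return (offset + alignment - 1) / alignment * alignment;
}

// Offset at which a field is placed. record_alignment is non-zero only when
// the field is the first field of a record; pass 0 otherwise.
constexpr unsigned place_field(unsigned offset, unsigned field_alignment,
                               unsigned record_alignment)
{
    return align_up(offset, record_alignment > field_alignment
                                ? record_alignment
                                : field_alignment);
}
```

For a block holding a float followed by a struct whose first member is a float, the running offset after the first float is 4. Using only the member's own alignment would place the struct's float at offset 4, whereas `place_field(4, 4, 16)` gives 16.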

Fixes piglit vs-struct-pad, fs-struct-pad, and (with the patch posted to
the piglit list that extends the test) layout-std140.

NOTE: The bit of strangeness with the version of visit_field without the
record_type pointer is because that method is pure virtual in the base
class.  The original implementation of the class did this to ensure
derived classes remembered to implement that flavor.  Now they can
implement either flavor but not both.  I don't know a C++ way to enforce
that.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68195
Cc: "9.2 9.1" mesa-stable@lists.freedesktop.org
2013-08-19 16:39:04 -07:00
Ian Romanick
5ac884fd9f glsl: Add new overload of program_resource_visitor::visit_field method
The outer-most record is passed into the visit_field method for
the first field.  In other words, in the following structure:

    struct S1 {
        vec4 v;
        float f;
    };

    struct S {
        S1 s1;
        S1 s2;
    };

    uniform Ubo {
        S s;
    };

s.s1.v would get record_type = S (because s1.v is the first non-record
field in S), and s.s2.v would get record_type = S1.  s.s1.f and s.s2.f
would get record_type = NULL because they aren't the first field of
anything.
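
The rule can be reproduced with a simplified recursive walk over the example types. This is a reconstruction for illustration; `type_sketch`, `walk`, and `walk_example` are hypothetical names, and the real visitor carries more state.

```cpp
#include <string>
#include <utility>
#include <vector>

// Minimal type description: a record has fields, anything else has none.
struct type_sketch {
    std::string name;
    std::vector<std::pair<std::string, const type_sketch *>> fields;
    bool is_record() const { return !fields.empty(); }
};

// One entry per leaf field: (full field name, record_type name or "NULL").
using visit_log = std::vector<std::pair<std::string, std::string>>;

inline void walk(const type_sketch &t, const std::string &name,
                 const type_sketch *record_type, visit_log &out)
{
    if (!t.is_record()) {
        out.emplace_back(name, record_type ? record_type->name : "NULL");
        return;
    }
    // Keep the outer-most record if one is already being tracked.
    if (record_type == nullptr)
        record_type = &t;
    for (const auto &field : t.fields) {
        walk(*field.second, name + "." + field.first, record_type, out);
        record_type = nullptr;  // only the first field inherits the record
    }
}

// Builds S1 { vec4 v; float f; } and S { S1 s1; S1 s2; }, then walks "s".
inline visit_log walk_example()
{
    const type_sketch vec4_t = { "vec4", {} };
    const type_sketch float_t = { "float", {} };
    const type_sketch s1 = { "S1", { { "v", &vec4_t }, { "f", &float_t } } };
    const type_sketch s = { "S", { { "s1", &s1 }, { "s2", &s1 } } };
    visit_log out;
    walk(s, "s", nullptr, out);
    return out;
}
```

The resulting log pairs s.s1.v with S, s.s2.v with S1, and both f fields with NULL, matching the description above.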

This new overload isn't used yet, but the next patch will add several
uses.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Cc: "9.2 9.1" mesa-stable@lists.freedesktop.org
2013-08-19 16:39:04 -07:00
Ian Romanick
d9bb8b7b56 glsl: Disallow embedded structure definitions
Continue to allow them in GLSL 1.10 because the spec allows it.
Generate an error in all other versions because the specs specifically
disallow it.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-19 16:39:04 -07:00
Ian Romanick
5fb1dd51f3 meta: Add default precision qualifier to all fragment shaders
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-19 16:39:04 -07:00
Ian Romanick
5ac247a73e glsl: Add default precision qualifiers for ES builtins
Once the compiler properly checks for default precision qualifiers,
these shaders will cease to compile.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-19 16:39:04 -07:00
Ian Romanick
0b5fb6d417 glsl: Remove extra "types" from error message
Send it straight to the Department of Redundancy Department.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-19 16:39:04 -07:00
Kenneth Graunke
e197f53730 i965: Make the VS binding table as small as possible.
For some reason, we didn't use this information even though the VS
backend has computed it (albeit poorly) for ages.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 13:17:00 -07:00
Kenneth Graunke
7e9559c9ba i965/vs: Rework binding table size calculation.
Unlike the FS, the VS backend already computed the binding table size.
However, it did so poorly: after compilation, it looked to see if any
pull constants/textures/UBOs were in use, and set num_surfaces to the
maximum surface index for that category.  If the VS only used a single
texture or UBO, this overcounted by quite a bit.

The shader time surface was also noted at state upload time (during
drawing), not at compile time, which is inefficient.  I believe it also
had an off by one error.

This patch computes it accurately, while also simplifying the code.

It also renames num_surfaces to binding_table_size, since num_surfaces
wasn't actually the number of surfaces used.  For example, a VS that
used one UBO and no other surfaces would have set num_surfaces to
SURF_INDEX_VS_UBO(1) == 18, rather than 1.  A bit of a misnomer there.
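
One way to picture the bookkeeping is a tracker that grows the table size whenever the compiler emits a reference to a surface index. The struct and method names are hypothetical.

```cpp
// Hypothetical compile-time tracker for the size of the binding table.
struct binding_table_tracker_sketch {
    unsigned binding_table_size = 0;  // 0 means no surfaces are used at all

    // Called whenever generated code references a surface index.
    void mark_surface_used(unsigned surf_index)
    {
        if (surf_index + 1 > binding_table_size)
            binding_table_size = surf_index + 1;
    }
};
```

Storing a size rather than a maximum index lets the value 0 represent a shader that uses no surfaces.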

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 13:17:00 -07:00
Kenneth Graunke
c642bd3dcc i965/vs: Plumb brw_vec4_prog_data into vec4_generator().
This will be useful for the next commit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 13:17:00 -07:00
Kenneth Graunke
60689c05d1 i965/fs: Make the FS binding table as small as possible.
Computing the minimum size was easy, and done at compile-time for no
extra overhead here.  Making the binding table smaller wastes less batch
space.

Adding the CACHE_NEW_WM_PROG dirty bit isn't strictly necessary, since
other atoms depend on it and flag BRW_NEW_SURFACES.  However, it's best
to add it for clarity and safety.  It shouldn't add any new overhead.

v2: Use binding_table_size, rather than max_surface_index.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-19 13:17:00 -07:00
Kenneth Graunke
6d89bc803d i965/fs: Track the binding table size in brw_wm_prog_data.
By tracking the maximum surface index used by the shader, we know just
how small we can make the binding table.

Since it depends entirely on the shader program, we can just compute
it once at compile time, rather than at binding table emit time (which
happens during drawing).

v2: Store binding_table_size, rather than max_surface_index, for
    consistency with the VS (which needs to be able to represent 0
    surfaces).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-19 13:17:00 -07:00
Kenneth Graunke
7c717690b5 i965: Use SURF_INDEX_DRAW() for drawbuffer binding table indices.
SURF_INDEX_DRAW() has been the identity function since the dawn of time,
and both the shader code and binding table upload code relied on that,
simply using X rather than SURF_INDEX_DRAW(X).

Even if that continues to be true, using the macro clarifies the code.

The comment about draw buffers needing to be first in order for
headerless render target writes to work turned out to be wrong; with
this change, SURF_INDEX_DRAW can be changed to arbitrary indices and
everything continues working.

The confusion was over the "Render Target Index" field in the FB write
message header.  If it were a binding table index, then RT 0 would have
to be at index 0 for headerless FB writes to work.  However, it's
actually an index into the blend state table, so there's no problem.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Paul Berry <stereotype441@gmail.com>
2013-08-19 13:17:00 -07:00
Kenneth Graunke
c5fe7d063c i965: Shorten sampler loops in key setup.
Now that we have the number of samplers available, we don't need to
iterate over all 16.  This should be particularly helpful for vertex
shaders.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 13:17:00 -07:00
Kenneth Graunke
d0401d09ce i965: Make sampler counts available for the entire drawing operation.
Previously, we computed sampler counts when generating the SAMPLER_STATE
table.  By computing it earlier, we should be able to shorten a bunch of
loops.
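
A minimal sketch of how such a count could be derived, assuming the
program exposes a bitmask of the sampler units it uses (the function
name and the bitmask input are assumptions for illustration only):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed input: bit N set means sampler unit N is used by the program.
 * The count is the index of the highest used unit plus one, so loops can
 * run over [0, count) instead of over all 16 units. */
static unsigned
sampler_count_from_mask(uint32_t samplers_used)
{
   unsigned count = 0;

   assert((samplers_used >> 16) == 0);   /* at most 16 units in this sketch */

   while (samplers_used != 0) {
      count++;
      samplers_used >>= 1;
   }
   return count;
}
```

With units 0 and 2 in use the count is 3; a program that uses no
samplers gets 0, so its loops do no work at all.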

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 13:17:00 -07:00
Kenneth Graunke
c6e572275b i965: Split the brw_samplers atom into separate FS/VS stages.
This allows us to avoid uploading the VS sampler state table if only the
fragment program changes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 13:17:00 -07:00
Kenneth Graunke
7e01af662a i965: Upload separate VS and FS sampler state tables.
Now, each shader stage has a sampler state table that only refers to the
samplers actually used by that problem.  This should make the VS table
non-existant or very small.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 13:16:59 -07:00
Kenneth Graunke
2b7f876a6a i965: Make upload_sampler_state_table a virtual function.
This allows us to coalesce the brw_samplers and gen7_samplers atoms.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 13:16:59 -07:00
Kenneth Graunke
decc708c7c i965: Upload separate per-stage sampler state tables.
Also upload separate sampler default/texture border color entries.

At the moment, this is completely idiotic: both tables contain exactly
the same contents, so we're simply wasting batch space and CPU time.

However, soon we'll only upload data for textures actually /used/ in
a particular stage, which will usually make the VS table empty and
very likely eliminate all redundancy.  This is just a stepping stone.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 13:16:59 -07:00
Kenneth Graunke
9525bcf5f7 i965: Un-hardcode border color table from update_sampler_state().
Like the previous patch, this simply pushes direct access to brw->wm up
one level in the call chain.  Rather than passing the whole array, we
just pass a pointer to the correct spot in the array, similar to what we
do for the actual sampler state structure.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 13:16:59 -07:00
Kenneth Graunke
ed4459b10b i965: Un-hardcode border color table from upload_default_color.
When we begin uploading separate sampler state tables for VS and FS,
we won't be able to use &brw->wm.sdc_offset[ss_index].  By passing it in
as a parameter, we push the problem up to the caller.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 13:16:59 -07:00
Kenneth Graunke
f5a690cb68 i965: Split sampler count variable to be per-stage.
Currently, we only have a single sampler state table shared among all
stages, so we just copy wm.sampler_count into vs.sampler_count.

In the future, each shader stage will have its own SAMPLER_STATE table,
at which point we'll need these separate sampler counts.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 13:16:59 -07:00
Kenneth Graunke
44960ef918 i965/fs: Re-enable global copy propagation.
I believe the data flow analysis actually works now, and it should be
safe to re-enable global copy propagation.  It even does things now.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:24 -07:00
Kenneth Graunke
72f2249c11 i965/fs: Fix computation of livein.
Since the initial value for livein is an overestimation (0xffffffff),
it's extremely likely that it will shrink, which means we can't simply
OR in new bits - we need to fully recompute it based on the current
liveout values.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:24 -07:00
Kenneth Graunke
70b02a7fac i965/fs: Fully recompute liveout at each step.
Since we start with an overestimation of livein (0xffffffff), successive
steps can actually take away values.  This means we can't simply OR in
new liveout values; we need to recompute it from scratch at each
iteration of the fixed-point algorithm.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:24 -07:00
Kenneth Graunke
d20b472d0a i965/fs: Skip the initial block when updating livein/liveout.
The starting block always has livein = 0 and liveout = copy.  Since we
start with real data, not estimates, there's no need to refine it with
the fixed point algorithm.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:24 -07:00
Kenneth Graunke
731145c579 i965/fs: Drop unnecessary and incorrect liveout initialization.
The previous commit properly initialized liveout.  This previous
(and incorrect) initialization is no longer necessary.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:24 -07:00
Kenneth Graunke
1d40c784f2 i965/fs: Properly initialize the livein/liveout sets.
Previously, livein was initialized to 0 for all blocks.  According to
the textbook, it should be the universal set (~0) for all blocks except
the one representing the start of the program (which should be 0).

liveout also needs to be initialized to COPY for the initial block.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:24 -07:00
Kenneth Graunke
f06826cece i965/fs: Use the COPY set in the calculation for liveout.
According to page 360 of the textbook, the proper formula for liveout
is:

CPout(i) = COPY(i) union (CPin(i) - KILL(i))

Previously, we omitted COPY.
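
With each set stored as a bitfield (one bit per tracked copy, which is an
assumption of this sketch), the formula is a single expression.  The
function name is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* CPout = COPY union (CPin - KILL), with sets represented as bitfields.
 * Set difference A - B becomes A & ~B. */
static uint32_t
copy_prop_liveout(uint32_t copy, uint32_t cp_in, uint32_t kill)
{
   uint32_t out = copy | (cp_in & ~kill);

   /* Every copy generated in the block must be available at its end. */
   assert((out & copy) == copy);
   return out;
}
```

For COPY = 0x4, CPin = 0x3 and KILL = 0x1 this gives 0x6.  Leaving COPY
out of the expression would give 0x2 and lose the copy the block itself
created.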

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:24 -07:00
Kenneth Graunke
a291c59bba i965/fs: Simplify liveout calculation.
Excluding the existing liveout bits is a deviation from the textbook
algorithm.  The reason for doing so was to determine if the value
changed, which means the fixed-point algorithm needs to run for another
iteration.

The simpler way to do that is to save the value from step (N-1) and
compare it to the new value at step N.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:24 -07:00
Kenneth Graunke
597efd2b67 i965/fs: Create the COPY() set for use in copy propagation dataflow.
This is the "COPY" set from Muchnick's textbook, which is necessary
to do the dataflow algorithm correctly.

v2: Simplify initialization based on Paul Berry's observation that
    out_acp contains exactly what needs to be in the COPY set.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:24 -07:00
Kenneth Graunke
669d4d7f77 i965/fs: Rename setup_kills() to setup_initial_values().
Although this function currently only initializes the KILL set, it will
soon initialize other data flow sets as well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:24 -07:00
Kenneth Graunke
2ef81372dc i965/fs: Separate the updating of liveout/livein.
To compute the actual liveout/livein data flow values, we start with
some initial values and apply a fixed-point algorithm until they settle.

Previously, we iterated through all blocks, updating both liveout and
livein together in one pass.  This is awkward, since computing livein
for a block requires knowing liveout for all parent blocks.  Not all
of those parent blocks may have been processed yet.

This patch separates the two.  First, we update liveout for all blocks.
At iteration N of the fixed-point algorithm, this uses livein values
from iteration N-1.  Secondly, we update livein for all blocks.  At
step N, this uses the liveout information we just computed (in step N).

This ensures each computation has a consistent picture of the data,
rather than seeing a random mix of data from steps N-1 and N depending
on the order of the blocks in the CFG data structure.
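
The two-phase structure can be sketched as follows.  This is a
simplified stand-in, not the driver code: the struct layout, the
32-bit bitfields, the fixed parent array, and the convention that block 0
is the entry block are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PARENTS 4

/* Hypothetical per-block record; one bit per tracked copy. */
struct block_data {
   uint32_t copy;     /* copies made in this block that reach its end */
   uint32_t kill;     /* copies invalidated by this block */
   uint32_t livein;   /* copies available on entry */
   uint32_t liveout;  /* copies available on exit */
   int parents[MAX_PARENTS];
   int num_parents;
};

/* Runs the fixed-point iteration; returns the number of passes taken.
 * Block 0 is assumed to be the entry block and is never refined. */
static int
run_copy_dataflow(struct block_data *bd, int num_blocks)
{
   int iterations = 0;
   bool progress;

   assert(num_blocks > 0);

   /* The entry block starts with real data, the rest with an
    * overestimate of livein that later passes can only shrink. */
   bd[0].livein = 0;
   bd[0].liveout = bd[0].copy;
   for (int b = 1; b < num_blocks; b++) {
      assert(bd[b].num_parents <= MAX_PARENTS);
      bd[b].livein = ~0u;
      bd[b].liveout = 0;
   }

   do {
      progress = false;
      iterations++;

      /* Phase 1: liveout at step N from livein at step N-1. */
      for (int b = 1; b < num_blocks; b++) {
         uint32_t out = bd[b].copy | (bd[b].livein & ~bd[b].kill);
         if (out != bd[b].liveout) {
            bd[b].liveout = out;
            progress = true;
         }
      }

      /* Phase 2: livein at step N from the liveout values just computed.
       * It is recomputed from scratch, not ORed into the old value. */
      for (int b = 1; b < num_blocks; b++) {
         uint32_t in = ~0u;
         for (int p = 0; p < bd[b].num_parents; p++)
            in &= bd[bd[b].parents[p]].liveout;
         if (in != bd[b].livein) {
            bd[b].livein = in;
            progress = true;
         }
      }
   } while (progress);

   return iterations;
}
```

On a diamond-shaped CFG (0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3) where block 0
creates copies 0 and 1 and block 1 kills copy 0, only copy 1 is available
on entry to block 3, regardless of the order in which blocks are visited.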

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:24 -07:00
Kenneth Graunke
7d86042dee i965/fs: Rename "cont" to "progress" in dataflow algorithm.
This variable indicates that the fixed-point algorithm made changes to
the data at this step, so it needs to run for another iteration.

"progress" seems a nicer name for that.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:23 -07:00
Kenneth Graunke
0225dea6c4 i965/fs: Switch to a do-while loop in copy propagation dataflow.
The fixed-point algorithm needs to run at least once, so a do-while loop
is more natural.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:23 -07:00
Kenneth Graunke
3c68662bb1 i965/fs: Skip global copy propagation step.
The dataflow analysis used for global copy propagation is severely
broken, and I believe it doesn't actually do anything.  Fixing it will
require a lot of changes, each of which might break things.

Once all the fixes land, we can re-enable this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:23 -07:00
Emil Velikov
b9d1173f2c vl/buffers: consistent use on VL_MAX_SURFACES
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-08-19 18:32:08 +02:00
Emil Velikov
e7c17eb819 st/vdpau: drop unnecessary variable prof
Any decent compiler will do this for us, although doing this
will make grepping through the code a lot easier.

v2: In both mixer and query interface
v3: rebase

Reviewed-by: Christian König <christian.koenig@amd.com> [v1]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-08-19 18:32:08 +02:00
Emil Velikov
1d260360d8 vl/idct: cleanup all idct buffers
Code should loop through and cleanup the three (VL_NUM_COMPONENTS) idct
buffers, rather than doing the first one three times.
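
A minimal illustration of the loop, using stand-in types (the struct,
the constant, and both function names are hypothetical and only mirror
the shape of the fix):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_NUM_COMPONENTS 3   /* stands in for VL_NUM_COMPONENTS */

/* Stand-in for an idct buffer; only tracks whether it is still set up. */
struct example_idct_buffer {
   bool initialized;
};

static void
example_cleanup_buffer(struct example_idct_buffer *buf)
{
   assert(buf != NULL);
   buf->initialized = false;
}

static void
example_cleanup_all(struct example_idct_buffer bufs[EXAMPLE_NUM_COMPONENTS])
{
   /* Index with i; the bug described above is equivalent to passing
    * &bufs[0] on every iteration. */
   for (unsigned i = 0; i < EXAMPLE_NUM_COMPONENTS; i++)
      example_cleanup_buffer(&bufs[i]);
}
```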

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-08-19 18:32:08 +02:00
Emil Velikov
5354d2e76a vl/buffer: add sanity check after CALLOC_STRUCT
Check if we have successfully allocated memory.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-08-19 18:32:08 +02:00
Emil Velikov
eab9bad1ac st/xvmc: exit gracefully if we fail to create video buffer
Free any allocated memory and return BadAlloc if create_video_buffer()
has failed to create a buffer.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-08-19 18:32:07 +02:00
Emil Velikov
5e91c15290 st/vdpau: don't try to create video buffer when the format is FORMAT_NONE
Not seen in the wild yet, but seems like a reasonable thing to do.
[suggested by Christian]

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-08-19 18:32:03 +02:00
Andy Furniss
3448b66dac vdpau/vl 422 chroma width/height mix up
I was looking into some minor 422 issues/discrepancies I noticed long
ago using vdpau on my rv790.

I noticed that there is code that is halving height rather than width -
422 is full height AFAIK.

Making the changes below doesn't actually make any noticeable difference
to what I was looking into.

Maybe there are more, but here are three I've found so far.

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-08-19 18:31:26 +02:00
Vinson Lee
b1d05eeb1f radeonsi: Ensure fmask_format is initialized in release builds.
Fixes "Uninitialized scalar variable" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-08-19 09:19:19 -07:00
Paul Berry
c6b6c93643 i965: STATIC_ASSERT that there aren't too many BRW_NEW_* flags.
We are getting close to the maximum number of BRW_NEW_* bits that can
be stored in brw->state.dirty.brw without overflowing 32 bits, and
geometry shaders are going to add more.  Add a STATIC_ASSERT so that
we will be alerted when we need to switch to 64 bits.
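
The pattern looks roughly like this in plain C11.  The enum members are
made up for the example (the real BRW_NEW_* list is much longer), and
_Static_assert is used here in place of Mesa's STATIC_ASSERT macro:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag list; each entry is a bit position in a 32-bit
 * dirty word. */
enum example_state_bit {
   EXAMPLE_NEW_URB_FENCE,
   EXAMPLE_NEW_FRAGMENT_PROGRAM,
   EXAMPLE_NEW_VERTEX_PROGRAM,
   EXAMPLE_NUM_STATE_BITS   /* must stay last */
};

/* Fails the build, rather than silently overflowing, once a 33rd flag
 * is added. */
_Static_assert(EXAMPLE_NUM_STATE_BITS <= 32,
               "dirty flags no longer fit in 32 bits");

static uint32_t
example_state_flag(enum example_state_bit bit)
{
   assert(bit < EXAMPLE_NUM_STATE_BITS);
   return UINT32_C(1) << bit;
}
```

Keeping the count as the last enum member means the check needs no
manual updating when flags are added.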

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-19 08:28:17 -07:00
Christian König
5ddd840f5a vl: add entrypoint to is_video_format_supported
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-08-19 10:21:15 +02:00
Christian König
a15cbabb8b vl: add entrypoint to get_video_param
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-08-19 10:21:15 +02:00
Christian König
f2f7064e56 vl: rename pipe_video_decoder to pipe_video_codec
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-08-19 10:21:15 +02:00
Christian König
8e423ab984 vl: rename enum pipe_video_codec to pipe_video_format
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-08-19 10:21:15 +02:00
Christian König
53e20b8b41 vl: use a template for create_video_decoder
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-08-19 10:21:14 +02:00
Marek Olšák
d13003f544 glsl: don't eliminate texcoords that can be set by GL_COORD_REPLACE
Tested by examining generated TGSI shaders from piglit/glsl-routing.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Henri Verbeet <hverbeet@gmail.com>
Tested-by: Henri Verbeet <hverbeet@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-18 12:27:08 +02:00
Ilia Mirkin
a8346a2f52 nv50: allow non-nv12 buffers to be created, just pass them through to vl
Since we expose non-NV12 formats as supported when there is no decoder
profile selected, make sure that those formats are actually allowed to
be allocated.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-17 17:58:36 +02:00
Eric Anholt
bef423bee6 dri: Choose a decent global driNConfigOptions.
Previously, we were asserting that each driver specified an NConfigOptions
exactly equal to the number of options they supplied, leading to frequent
bugs when people would forget to adjust the value when adjusting driver
options.  Instead, just overallocate the table by a bit and leave sanity
checking to the assert in findOption().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-17 11:43:19 +02:00
Kenneth Graunke
703a2f4219 i965: Improve comments for driver hooks in intel_buffer_object.c.
Consistently using a "The ___ driver hook." line at the top of each
function's comment block makes it easy to see at a glance what function
is being implemented.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 19:00:49 -07:00
Kenneth Graunke
96a0fe7e4d i965: Split intel_upload code out into a separate file.
This code performs batched uploads via a BO.  By moving it out to
a separate file, intel_buffer_objects.c only provides the core buffer
object functionality.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 19:00:49 -07:00
Kenneth Graunke
76c2533470 i965: Move GL_APPLE_object_purgeable functionality into a new file.
GL_APPLE_object_purgeable creates a mechanism for marking OpenGL objects
as "purgeable" so they can be thrown away when system resources become
scarce.  It specifically applies to buffer objects, textures, and
renderbuffers.

The intel_buffer_objects.c file provides core functionality for GL
buffer objects, such as MapBufferRange and CopyBufferSubData.  Having
texture and renderbuffer functionality in that file is a bit strange.

The 2010 copyright on the new file is because Chris Wilson first added
this code in January 2010 (commit 755915fa).

v2: Actually remember to call the new dd table setup function.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 19:00:49 -07:00
Marek Olšák
aafb0f9e06 radeonsi: fix feature support reporting
broken by 21d9a1b5ef
2013-08-17 02:49:00 +02:00
Niels Ole Salscheider
5394ee8f30 clover: Fix linkage of libOpenCL
Clover needs the option component of llvm.

Reviewed-by: Tom Stellard <tom@stellard.net>
Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
2013-08-16 16:52:31 -07:00
Marek Olšák
21d9a1b5ef radeonsi: require LLVM 3.4 for MSAA
2013-08-17 01:48:25 +02:00
Marek Olšák
87b88f1dae radeonsi: don't make scanout resources linear except for cursors
The surface allocator understands the scanout flag just fine.

This seems to improve performance for Ubuntu Unity on top of st/xorg
and it fixes the cursor.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
89ca4a00f5 radeonsi: remove useless code from tex_fetch_args
The array slice has already been added to "address".

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
5550554f1e radeonsi: disable unbound colorbuffers
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
356c041167 radeonsi: port texture improvements from r600g
This started as an attempt to add support for MSAA texture transfers and
MSAA depth-stencil decompression for the DB->CB copy path.
It has gotten a bit out of control, but it's for the greater good.

Some changes do not make much sense; they are there just to make it look
like the other driver.

With a few cosmetic modifications, r600_texture.c can be shared with
a symlink.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
4855acd461 radeonsi: implement texture fetching for compressed MSAA textures (v2)
v2: use resource slots 16..31 for FMASK textures

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
f671dfa8aa radeonsi: add FMASK texture binding slots and resource setup (v2)
v2: bind FMASK textures to shader resource slots 16..31

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
3c3feb38f4 radeonsi: implement FMASK decompression for MSAA texturing
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
8c04f25360 radeonsi: scanout buffers cannot be a destination of MSAA resolve
Resolving to scanout buffers just doesn't work.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
2a4b2e2305 radeonsi: implement MSAA colorbuffer compression for rendering
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
2f1c449415 radeonsi: implement uncompressed MSAA texturing
This is glBlitFramebuffer support for MSAA surfaces as required by GL 3.0
and texturing as required by GL 3.2 and GL_ARB_texture_multisample.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
f083f79751 radeonsi: disable alpha-to-coverage for integer colorbuffers
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
6d4755a4d7 radeonsi: implement GL_SAMPLE_ALPHA_TO_ONE
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
07955d4f2b radeonsi: implement uncompressed MSAA rendering and color resolving
This is basic MSAA support which should work with most apps.
Some features are missing; those will be implemented by other commits.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
c8e70e64ac radeonsi: add flexible shader descriptor management and use it for sampler views
It moves all sampler view descriptors to a buffer.
It supports partial resource updates and it can also unbind resources
(required for FMASK texturing).

The buffer contains all sampler view descriptors for one shader stage,
represented as an array. On top of that, there are N arrays in the buffer,
which are used to emulate context registers as implemented by the previous
ASICs (each array is a context).

This uses the RCU synchronization approach to avoid read-after-write hazards
as discussed in the thread:
"radeonsi: add FMASK texture binding slots and resource setup"

CP DMA is used to clear the descriptors at context initialization and to copy
the descriptors from one context to the next.

v2: - use PKT3_DMA_DATA on CIK (I'll test CIK later)
    - turn the bool CP DMA parameters into self-explanatory flags
    - add a nice simple API for packet emission to radeon_winsys.h
    - use 256 contexts, 128 causes texture corruption in openarena
2013-08-17 01:48:25 +02:00
Tom Stellard
764502b481 radeonsi/compute: Let the state tracker do all the flushing
It shouldn't be necessary to call radeon_winsys::cs_flush() from
radeonsi_launch_grid(), because the state tracker is responsible for
flushing the pipeline at the appropriate time.  The current behavior is
also wrong, because radeonsi_launch_grid() submits packets to the
compute ring, but when the state tracker calls pipe->flush() everything
is submitted to the graphics ring.  This has the potential to create a
race condition.

The downside of removing this flush is that the compute dispatch packets
will be sent to the graphics ring rather than the compute ring.
In the future we will need to come up with a way to detect 'compute'
command streams and submit them to the appropriate ring.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2013-08-17 01:48:25 +02:00
Kenneth Graunke
e29931aa74 i965: Dump more information about batch buffer usage.
Previously, INTEL_DEBUG=bat would dump messages like:

intel_mipmap_tree.c:1643: Batchbuffer flush with 456b used

This only reported the space used for command packets, and didn't
report any information on the space used for indirect state.

Now it dumps:

intel_context.c:366: Batchbuffer flush with 6128b (pkt) + 4288b (state)
= 10416b (31.8%)

This conveniently shows the breakdown of space used for packets vs.
state, as well as the percentage of batchbuffer space.
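
The arithmetic behind that line can be reproduced with a small
formatter.  The function name is hypothetical, and the 32768-byte batch
size is an assumption; it is simply the size for which 10416 bytes comes
out at 31.8%:

```c
#include <assert.h>
#include <stdio.h>

/* Formats the packet/state breakdown.  batch_size is the total batch
 * buffer size in bytes (assumed, not taken from the driver). */
static int
format_batch_usage(char *buf, size_t len, unsigned pkt_bytes,
                   unsigned state_bytes, unsigned batch_size)
{
   unsigned total = pkt_bytes + state_bytes;

   assert(batch_size > 0);

   return snprintf(buf, len, "%ub (pkt) + %ub (state) = %ub (%.1f%%)",
                   pkt_bytes, state_bytes, total,
                   100.0 * total / batch_size);
}
```

Called with 6128, 4288 and 32768, this produces the same numbers as the
sample output above: 6128 + 4288 = 10416, and 10416 / 32768 is about
31.8%.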

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 15:54:24 -07:00
Kenneth Graunke
2a9492f321 i965: Add Gen7 depth stall flushes before disabling depth in BLORP.
We emit these before configuring depth in the normal path, or actually
using the depth buffer in BLORP - we just failed to emit them when
disabling depth altogether.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-16 15:03:55 -07:00
Kenneth Graunke
8fba8d4ee7 i965: Add Gen6 depth stall flushes before disabling depth in BLORP.
We emit these before configuring depth in the normal path, or actually
using the depth buffer in BLORP - we just failed to emit them when
disabling depth altogether.

On Sandybridge, this also requires the post_sync_nonzero flush.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-16 15:03:38 -07:00
Matt Turner
9c48ae751a i965: Don't copy propagate bitcasts with source modifiers.
Previously, copy propagation would cause bitcast_f2u(abs(float)) to
be performed in a single step, but the application of source modifiers
(abs, neg) happens after type conversion, leading to incorrect results.

That is, for bitcast_f2u(abs(float)) we would in fact generate code to
do abs(bitcast_f2u(float)).

For example, bitcast_f2u(abs(float)) might result in a register
argument such as
   (abs)g2.2<0,1,0>UD
in which the abs modifier is applied to the already-retyped UD value.
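
The difference between the two orderings can be seen on the CPU.  This
is only a model: the helper names are made up, and the "modifier after
conversion" case is modelled as a two's-complement integer abs, which is
an assumption about how the wrong ordering behaves rather than a
statement about the hardware:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as an unsigned integer. */
static uint32_t
bitcast_f2u(float f)
{
   uint32_t u;

   assert(sizeof u == sizeof f);
   memcpy(&u, &f, sizeof u);
   return u;
}

/* Intended order: take abs of the float, then reinterpret the bits.
 * (Sign handling for -0.0 and NaN is ignored for brevity.) */
static uint32_t
abs_then_bitcast(float f)
{
   return bitcast_f2u(f < 0.0f ? -f : f);
}

/* Wrong order: reinterpret first, then apply abs to the integer value. */
static uint32_t
bitcast_then_abs(float f)
{
   uint32_t u = bitcast_f2u(f);

   return (u & UINT32_C(0x80000000)) ? (UINT32_C(0) - u) : u;
}
```

For -1.0f (bits 0xbf800000) the intended order yields 0x3f800000, the
bits of 1.0f, while the wrong order yields 0x40800000, which is the bit
pattern of 4.0f.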

v2: Set interfered = true and break in register_coalesce instead of
    returning false.

Reviewed-by: Paul Berry <stereoytpe441@gmail.com>
2013-08-16 13:11:07 -07:00
Matt Turner
0ae9ca12a8 i965: Emit MOVs for neg/abs.
Necessary to avoid combining a bitcast and a modifier into a single
operation. Otherwise if safe, the MOV should be removed by
copy-propagation or register coalescing.

With this and the next patch, there are only four changes in shader-db:
all a single extra instruction. The code does something like
   mov a.w, -b.x
and copy propagation doesn't work because it only handles no-op
swizzles. Seems acceptable, given the known limitation of our copy
propagation.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereoytpe441@gmail.com>
2013-08-16 13:11:07 -07:00
Anuj Phogat
079bdba05f i965/blorp: Add support for single sample scaled blit with bilinear filter
Currently, single-sample scaled blits with the GL_LINEAR filter fall
back to the meta path. This patch removes that limitation in the BLORP
engine and implements single-sample scaled blits with a bilinear filter.
No piglit or gles3 regressions were observed with this patch on Ivybridge.

V2: Use "sample" message to utilize the linear filtering functionality
built in to hardware.
V3: Define a bool variable (bilinear_filter) to handle the conditions
for GL_LINEAR blits.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 09:46:15 -07:00
Anuj Phogat
aff371b634 i965/blorp: Define a function to clamp texture coordinates
New function clamp_tex_coords() clamps the texture coordinates
to texture boundaries.  This function will also be utilized later
for the BLORP implementation of single-sample scaled blit with
bilinear filter.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 09:46:15 -07:00
Anuj Phogat
6066fb1721 i965/blorp: Use more appropriate variable names
When we talk about both multi-sample and single-sample scaled blits,
rect_grid_{x1, y1} are more appropriate variable names as compared
to sample_grid_{x1, y1}. There are no functional changes in this patch.
It just prepares for the BLORP implementation of single-sample scaled
blit with bilinear filter.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 09:46:15 -07:00
Anuj Phogat
d944a6144f meta: Fix blitting a framebuffer with renderbuffer attachment
This patch fixes a case of framebuffer blitting with a renderbuffer
as the color attachment and the GL_LINEAR filter. The meta implementation
of glBlitFramebuffer() converts the source color buffer to a texture and
uses it to do the scaled blit into the destination buffer. Using
the exact source rectangle to create the texture results in incorrect
linear filtering along the edges. This patch extends the texture edges
by one pixel in the x and y directions, which ensures correct linear
filtering.
It fixes the failing piglit test fbo-attachments-blit-scaled-linear.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
CC: "9.2" <mesa-stable@lists.freedesktop.org>
CC: "9.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 09:46:15 -07:00
Ilia Mirkin
a2061eea0f nv50: add vp3/vp4 support for mpeg2/vc1
h264/mpeg4 remain disabled for pre-nvc0, there's some minor
bug/difference which causes the decoding to hang after some frames.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-16 09:48:47 +02:00
Ilia Mirkin
b3f6f127f2 nv50: separate video logic from noalloc
The upcoming vp3 logic will want the video layout, but allocated by the
miptree.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-16 09:48:26 +02:00
Ilia Mirkin
c1a6f59b20 nv30: remove no-longer-used formats from table
Commit 14ee790df7 removed the formats from the vtxfmt_table but forgot
to also update the info_table.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>
2013-08-16 09:48:09 +02:00
Fredrik Höglund
0e7a61a29f mesa: Update the BGRA vertex array error handling
The error code was changed from INVALID_VALUE to INVALID_OPERATION
in OpenGL 3.3. We should also generate an error when size is BGRA
and normalized is FALSE.
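
A sketch of the second check only.  The constants carry the standard GL
enum values but are redefined locally so the snippet stands alone; the
function name and its reduced parameter list are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

#define EX_GL_NO_ERROR           0
#define EX_GL_INVALID_OPERATION  0x0502
#define EX_GL_BGRA               0x80E1

/* Returns the GL error to raise for a (size, normalized) pair.  Only the
 * BGRA/normalized rule is modelled; all other validation is omitted. */
static unsigned
check_bgra_vertex_array(int size, bool normalized)
{
   /* This sketch expects an otherwise valid size. */
   assert(size == EX_GL_BGRA || (size >= 1 && size <= 4));

   if (size == EX_GL_BGRA && !normalized)
      return EX_GL_INVALID_OPERATION;

   return EX_GL_NO_ERROR;
}
```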

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-15 21:38:13 -07:00
Kenneth Graunke
90129da82c i965/fs: Fix Sandybridge regressions from SEL optimization.
Sandybridge is the only platform that supports an IF instruction
with an embedded comparison.  In this case, we need to emit a CMP
to go along with the SEL.

Fixes regressions in Piglit's glsl-fs-atan-3, fs-unpackHalf2x16,
fs-faceforward-float-float-float, isinf-and-isnan fs_basic, and
isinf-and-isnan fs_fbo.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68086
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: lu hua <huax.lu@intel.com>
2013-08-15 15:33:00 -07:00
Kenneth Graunke
c189840b21 i965: Force X-tiling for 128 bpp formats on Sandybridge.
128 bpp formats are not allowed to be Y-tiled on any architectures
except Gen7.

+11 Piglits on Sandybridge (mostly regression fixes since the
switch to Y-tiling).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63867
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64261
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-15 15:18:48 -07:00
Ian Romanick
41eef83cc0 mesa/vbo: Fix handling of attribute 0 in non-compatibility contexts
It is only in OpenGL compatibility-style contexts where generic
attribute 0 and GL_VERTEX_ARRAY have a bizarre, aliasing relationship.
Moreover, it is only in OpenGL compatibility-style contexts and OpenGL
ES 1.x where one of these attributes provokes the vertex.  In all other
APIs each implicit call to glArrayElement provokes a vertex regardless
of which attributes are enabled.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Robert Bragg <robert@sixbynine.org>
Cc: "9.0 9.1 9.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55503
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66292
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67548
2013-08-15 14:59:37 -07:00
Zack Rusin
7115bc3940 draw: handle nan clipdistance
If clipdistance for one of the vertices is nan (or inf) then the
entire primitive should be discarded.
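
In outline, the test is a scan over the primitive's per-vertex clip
distances.  The function name and the flat array layout are assumptions
made for this sketch, not the draw module's interface:

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if any vertex of the primitive has a clip distance that
 * is NaN or +/-infinity, in which case the whole primitive is dropped
 * instead of being clipped. */
static bool
primitive_should_be_discarded(const float *clipdist, int num_verts)
{
   assert(clipdist != NULL && num_verts > 0);

   for (int i = 0; i < num_verts; i++) {
      if (!isfinite(clipdist[i]))
         return true;
   }
   return false;
}
```

A triangle with distances {0.5, NaN, 1.0} is discarded outright, whereas
{0.5, 1.0, -2.0} goes on to normal clipping.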

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-15 16:26:32 -04:00
Vinson Lee
035bf21983 i915,i965: Fix memory leak in try_pbo_upload (v2)
Fixes "Resource leak" defect reported by Coverity.
Tested on Haswell, no Piglit regressions.

v2: Apply to i965, not just i915. (chadv)

CC: "9.2, 9.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-15 10:37:22 -07:00
Roland Scheidegger
6ca18e06ae gallivm: revert accidentally committed hunk
That magic wasn't meant to be committed, need to work on some proper fix.
2013-08-15 19:26:39 +02:00
Roland Scheidegger
5626a84a00 gallivm: do per-sample depth comparison instead of doing it post-filter
Doing the comparisons pre-filter is highly recommended by OpenGL (and d3d9)
and definitely required by d3d10.
This actually doesn't do it pre-filter but more "in-filter" as otherwise
need to push the comparisons even further down into fetch code and this
also trivially allows using a somewhat cheaper lerp.
Doing it pre-filter would actually have some performance advantage for UNORM
formats (because the comparisons should be done in texture format, we'd only
need to convert the shadow ref coord to texture format once, but in turn would
save converting the per-sample texture values to floats) but this gets a bit
messy as this has implications for border color handling as well (which needs
to be done prior to depth comparisons, hence would also need to convert border
color to texture format too or use some other tricks like doing separate border
color / shadow ref comparison and simply using that result directly when doing
border replacement).
Should make no difference for nearest filtering, and performance for linear
filtering should be mostly the same too (essentially have one more comparison
instruction per sample, and replace the sub/mul/add lerp with a sub/and/and/add
special "lerp" which all in all shouldn't be much of a difference).

v2: get rid of old code completely

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-15 18:42:20 +02:00
Michel Dänzer
3b2f3f90ac radeonsi: Pixel shaders pre-load one more SGPR
Acked-by: Marek Olšák <maraeo@gmail.com>
2013-08-15 17:55:00 +02:00
Michel Dänzer
f0753a3cd4 radeonsi: TGSI_SEMANTIC_CLIPVERTEX doesn't use any parameters 2013-08-15 17:54:40 +02:00
Michel Dänzer
2f98dc223f radeonsi: Don't export unused clip distance vectors from vertex shader
E.g. the Source engine seems to always write to gl_ClipVertex, but normally
doesn't enable any GL_CLIP_DISTANCEn states. This change removes some
irrelevant parts from the generated vertex shader code in such cases.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-08-15 17:53:50 +02:00
Michel Dänzer
b00269aa58 radeonsi: Don't leave gaps between position exports from vertex shader
If the vertex shader exports clip distances but not point size, use
position exports 1/2 instead of 2/3 for the clip distances. Fixes
geometry corruption in that case.
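
The slot choice described above can be sketched as follows. This is only an
illustration under the assumption that position export 0 always carries the
vertex position and that point size, when present, takes export 1; the
function name is hypothetical and is not taken from the driver:

```c
#include <stdbool.h>

/* Hypothetical sketch: returns the index of the first position export
 * used for clip distances, so that no gap is left when the shader does
 * not export point size. */
int
first_clipdist_pos_export(bool exports_psize)
{
   int next = 1;        /* export 0 is the vertex position */

   if (exports_psize)
      next++;           /* point size occupies export 1 */

   return next;         /* clip distances use next and next + 1 */
}
```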

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66974

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-08-15 17:42:26 +02:00
Roland Scheidegger
abdd32dcd5 llvmpipe: fix stencil bug if we have both stencil and depth tests
This is a very well hidden bug found by accident (only the fixed glean
tstencil2 test so far seems to hit it).
We must use new mask with combined s_pass values and orig_mask values
for zpass/zfail stencil ops, otherwise both the sfail op and one of
zpass/zfail op are applied (probably not hit in most tests because
some of the ops tend to be KEEP usually).
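
A minimal sketch of the intended mask logic, assuming one bit per pixel lane
(the struct, function and parameter names other than s_pass and orig_mask
are hypothetical, and this is not llvmpipe's generated code):

```c
#include <stdint.h>

/* Hypothetical per-lane masks, one bit per pixel. */
struct stencil_op_masks {
   uint32_t sfail;   /* lanes that get the stencil-fail op */
   uint32_t zfail;   /* lanes that get the depth-fail op */
   uint32_t zpass;   /* lanes that get the depth-pass op */
};

struct stencil_op_masks
select_stencil_ops(uint32_t orig_mask, uint32_t s_pass, uint32_t z_pass)
{
   struct stencil_op_masks m;

   /* Only live lanes that passed the stencil test may see a z op. */
   uint32_t s_pass_live = orig_mask & s_pass;

   m.sfail = orig_mask & ~s_pass;
   m.zfail = s_pass_live & ~z_pass;
   m.zpass = s_pass_live & z_pass;
   return m;
}
```

With the masks built this way the three sets are disjoint, so no lane gets
both the sfail op and one of the zpass/zfail ops.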

Note: this is a candidate for the 9.2 branch.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-15 17:30:07 +02:00
Roland Scheidegger
7ae9cc71f0 st/mesa: use new float comparison opcodes if native integers are supported
Should get rid of some float-to-int conversions (with negation).
No piglit regressions (with llvmpipe).

v2: fix bogus formatting spotted by Brian.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-08-15 17:30:07 +02:00
Ilia Mirkin
4ea191fb2d nvc0: move video param and format support functions to nouveau
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:19:48 +02:00
Ilia Mirkin
9255019a53 nvc0: move firmware loading functions to nouveau
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:19:48 +02:00
Ilia Mirkin
9d8c076803 nvc0: move some of the simpler decoder functions into nouveau
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:19:48 +02:00
Ilia Mirkin
73f4499a02 nvc0: move vp param filling logic into nouveau
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:19:48 +02:00
Ilia Mirkin
e1cd987bb6 nvc0: move bsp param-filling logic into nouveau
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:19:48 +02:00
Ilia Mirkin
d6a82a7747 nvc0: move nvc0_decoder into nouveau, rename to nouveau_vp3_decoder
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:19:47 +02:00
Ilia Mirkin
86e5c3c97b nvc0: standardize on using #if for NVC0_DEBUG_FENCE
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:19:47 +02:00
Ilia Mirkin
b57875bbb3 nvc0: refactor video buffer management logic into nouveau_vp3
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:19:47 +02:00
Ilia Mirkin
940f7cec77 nv50: allow forcing PMPEG use, for ease of testing
This also allows people who don't want to install the binary blobs
required for VP2 to still get MPEG decoding.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:15:23 +02:00
Ilia Mirkin
ee3ca3614e nv30: hook up PMPEG support via nouveau_video, enables XvMC to work
Force the format to be the reasonable format that doesn't require an
inverse z-scan.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:15:12 +02:00
Ilia Mirkin
6010c683d0 nouveau: set buffer format of video buffer
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:15:04 +02:00
Ilia Mirkin
8975f83402 nouveau: fix number of surfaces in video buffer, use defines
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:15:02 +02:00
Ilia Mirkin
14ee790df7 nv30: U8_USCALED only works for size 4
See https://bugs.freedesktop.org/show_bug.cgi?id=61635 for a sample
program. Changing it to use a vec4 makes it work. Remove the unsupported
formats.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>
2013-08-15 15:14:25 +02:00
Chris Forbes
4f739646b0 i965: allow 8 user clip planes on CTG+
There's no need to use a clip flag for NEGW on these gens, so
no reason we can't just enable 8 planes.

V2: - Bump (and document!) MAX_VERTS in the clip code.
    - Fix clip flag masks in the clip unit state and in the shader
      prolog
    - Move this to the end of the series for less breakage.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 07:24:56 +12:00
Chris Forbes
ee0b8e0f06 i965: get rid of clip plane compaction
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 07:24:56 +12:00
Chris Forbes
cf52f6435e i965/clip: Support clip distances for line clipping
This does the same thing as we do for triangle clipping -- select the
appropriate source (either dot(hpos,fixed plane) or a clipdistance
slot).

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 07:24:56 +12:00
Chris Forbes
2a8a85e1ad i965/clip: remove spurious clipvertex param
Nothing in the clipper uses gl_ClipVertex any more, so we don't care
where it is.

V2: Don't bother fishing out the clipvertex offset either.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 07:24:56 +12:00
Chris Forbes
45540921ec i965/clip: Use clip distances for all user clipping
V2: Adjust explanation of load_clip_distance()

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 07:24:55 +12:00
Chris Forbes
bf9ede92c2 i965/clip: push dp4 into load_clip_distance
Soon the dp4 is only going to be used for fixed clip planes.

V2: Remove old inaccurate comment about the behavior of this function;
add a better explanation above.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 07:24:55 +12:00
Chris Forbes
265336e75a i965/clip: Track offset into the vertex for clipdistance
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 07:24:55 +12:00
Chris Forbes
3b738f5f85 i965/Gen4-5: Set clip flags from clip distances
V2: - Use the new VS_OPCODE_UNPACK_FLAGS_SIMD4X2 to correctly split the
      flags for the two vertices being processed together.
    - Don't apply bogus masking of clip flags. The set of plane enables
      aren't included in the shader key, and we wouldn't want the
      recompiles anyway.

V3: - Tidy up spurious instructions, name temps properly.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[V2] Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 07:24:55 +12:00
Chris Forbes
a9be50f776 i965: add new VS_OPCODE_UNPACK_FLAGS_SIMD4X2
Splits the bottom 8 bits of f0.0 for further wrangling
in a SIMD4x2 program. The 4 bits corresponding to the channels in each
program flow are copied to the LSBs of dst.x visible to each flow.

This is useful for working with clipping flags in the VS.

V3: - Fixup immediate types
    - Teach scheduler about the hidden dep on flags

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
V2: Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 07:24:38 +12:00
Chris Forbes
9e2c1e28a1 i965/vs: add vec4_instruction::depends_on_flags
We're about to have an instruction that depends on the flags but isn't
predicated. This lays the groundwork.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2013-08-16 07:21:43 +12:00
Chris Forbes
c5e2d0454b i965/clip: Enable interpolation of clip distances
Previously we had disabled interpolation of the clip distances as a
special case, since they were unused.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 07:21:42 +12:00
Chris Forbes
972e2f11c0 i965/vs: Do legacy clip lowering earlier
We need to produce clip flags for the vertex header on Gen4/5, so
clip plane lowering has to be done before we try to emit the flags/psiz
attribute.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 07:21:37 +12:00
Chris Forbes
9e07a68cad i965/Gen4-5: ensure VUE slots for clipdistance are valid if user clipping is enabled.
V2: We don't particularly care where they fall in the VUE map, as long
as they are allocated somewhere, and occupy two contiguous slots. Don't
fiddle with the SF layout at all -- there's no need.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 07:20:47 +12:00
Chia-I Wu
a453eb6f86 ilo: fix fragment shaders that use PCB on GEN7+
Missed this commit when preparing PCB changes for upstreaming.
2013-08-15 11:35:46 +08:00
Vinson Lee
ae645b83fc nouveau: Fix variable name.
Fixes build error introduced with commit
d1ba1055d9.

  CC     nouveau_video.lo
nouveau_video.c: In function 'nouveau_screen_get_video_param':
nouveau_video.c:866:33: error: 'screen' undeclared (first use in this function)
nouveau_video.c:866:33: note: each undeclared identifier is reported only once for each function it appear

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-08-14 17:35:31 -07:00
Matt Turner
57a6bcd56b glsl: Add i2b() and b2i() to ir_builder.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-14 17:15:06 -07:00
Matt Turner
1cf76c72da glsl: Add nequal() to ir_builder.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-14 17:15:06 -07:00
Matt Turner
16be6298c0 glsl: Add abs() to ir_builder.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-14 17:15:06 -07:00
Matt Turner
6bfb1a8344 glsl: Add bitcast_i2f() to ir_builder.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-14 17:15:06 -07:00
Marek Olšák
3d1b01662b radeonsi: unduplicate code in create_context
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-15 02:03:03 +02:00
Marek Olšák
e801b78aa0 radeonsi: initialize the radeon_surface structure
this fixes valgrind warnings

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-15 02:03:03 +02:00
Marek Olšák
731c6aa52d radeonsi: correct sampler function names
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-15 02:03:03 +02:00
Marek Olšák
0469171159 radeonsi: rename r600_texture::dirty_db_mask to dirty_level_mask
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-15 02:03:03 +02:00
Marek Olšák
363b2805f7 radeonsi: rename r600_resource_texture to r600_texture
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-15 02:03:02 +02:00
Marek Olšák
128819d394 tgsi: add info about MSAA samplers to tgsi_shader_info
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-15 02:03:02 +02:00
Marek Olšák
0ee4bae70d tgsi: fix the location of sample index
The sample index is always in W.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-15 02:03:02 +02:00
Roland Scheidegger
7727fbb7c5 r600/radeonsi: implement new float comparison instructions
Also use ordered comparisons for old cmp instructions.

Tested-by: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Tom Stellard <tom@stellard.net>
2013-08-15 00:40:14 +02:00
Roland Scheidegger
72874d2352 nv50: implement new float comparison instructions
untested.

Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
2013-08-15 00:40:14 +02:00
Roland Scheidegger
e858921d52 ilo: implement new float comparison instructions
untested.

Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2013-08-15 00:40:14 +02:00
Roland Scheidegger
e58c2310b8 gallivm: already pass coords in the right place in the sampler interface
This makes things a bit nicer, and more importantly it fixes an issue
where a "downgraded" array texture (due to view reduced to 1 layer and
addressed with (non-array) samplec instruction) would use the wrong
coord as shadow reference value. (This could also be fixed by passing
target through the sampler interface much the same way as is done for
size queries, might do this eventually anyway.)
And if we'd ever want to support (shadow) cube map arrays, we'd need
5 coords in any case.

v2: fix bugs (texel fetch using wrong layer coord for 1d, shadow tex
using wrong shadow coord for 2d...). Plus need to project the shadow
coord, and just for fun keep projecting the layer coord too.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-15 00:40:14 +02:00
Roland Scheidegger
d4b43cedb6 gallivm: change coordinate handling throughout functions
Instead of passing s,t,r coordinates pass a coord array - the reason is that
I need to pass more coords (in particular for shadow "coord", future will also
need another one for cube map arrays) so just pass them as an array.
Also, to simplify things, use a fixed location for the shadow reference value;
I want to get rid of the silly "where is the right coord value" game.
Keep old-style however for aos sampling (which is not going to need shadow
coord, though for cube map arrays it still would need fixing).
(Next patch will pass those through using the new arrangement directly from
sampler interface.)

v2: fix up soa split path (unreachable currently but still...)

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-15 00:40:14 +02:00
Roland Scheidegger
c6c55ad3e9 gallivm: fix border color with normalized texture formats
We need to put border color into texture format color space which
essentially means clamping for non-float, normalized formats (not entirely
sure if we're also meant to quantize the float but it's probably ok not to
do it thankfully).
For OpenGL we could do this easily outside generated code due to the
1:1 sampler/texture correspondence but not for d3d10 which is terrible
(as we recalculate a constant over and over again per shader invocation).
Fortunately border color should be rare enough that we don't care THAT much.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-15 00:40:14 +02:00
Zack Rusin
27cedd8aec llvmpipe: fix pipeline statistics with a null ps
If the fragment shader is null then pixel shader invocations have
to be equal to zero. And if we're running a null ps then clipper
invocations and primitives should be equal to zero but only
if both stancil and depth testing are disabled.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-14 18:23:36 -04:00
Zack Rusin
a3ae5dc7dd draw: make sure that the stages setup outputs
Calling the prepare outputs cleans up the slot assignments
for outputs, unfortunately aapoint and aaline didn't have
code to reset their slots after the initial setup, this
was messing up our slot assignments. The unfilled stage
was just missing the initial assignment of the face slot.
This fixes all of the reported piglit failures.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-14 18:23:35 -04:00
Paul Berry
98d2498404 glsl: Fix incorrect pattern matching in ir_set_program_inouts
In commit 8fc41df (glsl: Modify ir_set_program_inouts to handle
geometry shaders), when attempting to pattern match the "foo" part of
expressions such as:

   foo[i][j]
   foo[i]

I incorrectly called as_dereference_variable() on the subexpression
foo[i] instead of foo.  As a result, the pattern never matched, so
ir_set_program_inouts would fall back on marking the entire variable
as used, rather than just the portion indexed by the array.

This didn't result in incorrect behaviour, but it could have resulted
in inefficiency by causing the back-end to allocate resources for
unused parts of an input or output array.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-14 10:53:47 -07:00
Rico Schüller
d1ba1055d9 vl: Add support for max level query v2
This patch adds the level query support to the video decoders
and uses some more reasonable defaults.

v2: (ck) add commit message

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-08-14 13:20:01 +02:00
Ian Romanick
830f4df993 glsl: Emit better warnings for things that look like default precision statements
Previously we would emit a warning for empty declarations like

float;

We would also emit the same warning for things like

highp float;

However, this second case is most likely the application trying to set
the default precision.  This makes the compiler generate a stronger
warning with some suggestion of a fix.

It really seems like this should be an error.  I'll bet that 100% of the
time someone writes 'highp float;' they actually meant 'precision highp
float;'.  Alas, both AMD and NVIDIA accept this syntax, and the spec
doesn't explicitly forbid it.

This makes piglit's precision-05.vert generate the following warnings:

0:12(11): warning: empty declaration with precision qualifier, to set the default precision, use `precision lowp float;'
0:13(12): warning: empty declaration with precision qualifier, to set the default precision, use `precision mediump int;'

v2: Add { } around a one-line if body and fix a comment.  Suggested by
Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-13 20:47:20 -07:00
Paul Berry
825f9ff5d3 glsl/ast: Don't perform GS input array checks on non-inputs.
Previously, we were accidentally calling
handle_geometry_shader_input_decl() on non-input interface block
declarations, resulting in bogus error checking.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-13 20:02:55 -07:00
Paul Berry
91c8fea924 glsl/ast: Fix assertion failure when GS input declared as non-array.
Previously, if a geometry shader input was declared as a non-array, we
would flag the proper compiler error, but then before we got a chance
to report it to the client, handle_geometry_shader_input_decl() would
assertion fail.

With this patch, handle_geometry_shader_input_decl() ignores
non-arrays.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-13 20:02:54 -07:00
Paul Berry
336351e971 glsl/ast: Check that geometry shader interface block inputs are arrays.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-13 20:02:54 -07:00
Paul Berry
3b837e637e i965/gen7+: Fix build error introduced by renaming upload_3dstate_so_decl_list.
Commit 9f9ccf707c renamed
upload_3dstate_so_decl_list to gen7_upload_3dstate_so_decl_list but
forgot to update the caller.
2013-08-13 19:36:27 -07:00
Jon Severinsson
9298f537a7 radeon/llvm: Add missing "%s" format string to fprintf.
This fixes a compilation warning with -Wformat-security.

CC: "9.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-08-13 19:18:14 -07:00
Chad Versace
11b8f8e7e4 i965: Move arrays brw_multisample_positions* to new header
Move the arrays to the new header brw_multisample_state.h, which will be
shared with Broadwell code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-13 18:04:20 -07:00
Chad Versace
7eecda29c8 i965: Refactor names of sample_positions_8/4x arrays
Place each array in the brw namespace by renaming it:
    sample_positions_4x -> brw_multisample_positions_4x
    sample_positions_8x -> brw_multisample_positions_8x

This prepares for moving the arrays to a header shared by gen6 and gen8.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-13 18:03:59 -07:00
Kenneth Graunke
9f9ccf707c i965/gen7+: Mark upload_3dstate_so_decl_list as non-static (v2)
We will reuse this for Broadwell.

v2: Prefix function name with 'gen7'. (chadv)

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-13 18:03:57 -07:00
Kenneth Graunke
f4e5c235de i965: Mark a few brw_draw_upload.c functions as non-static
We will reuse these for Broadwell.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-13 18:02:13 -07:00
Ian Romanick
1b35e33af4 glsl: Require function return type arrays be explicitly sized
Fixes piglit array-function-return-unsized.vert.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-13 17:53:33 -07:00
Ian Romanick
42624b1c81 glsl: Move and refine test for unsized arrays in GLSL ES
GLSL ES does not allow unsized arrays, and GLSL ES 1.00 does not allow
array initializers.  However, GLSL ES 3.00 allows array initializers,
and the initializer can explicitly size the array.  The specification
even includes some examples of this:

    float x[] = float[2] (1.0, 2.0);     // declares an array of size 2
    float y[] = float[] (1.0, 2.0, 3.0); // declares an array of size 3

    float a[5];
    float b[] = a;

Move the unsized array check to after the initializer has been
processed.  If the array is still unsized, generate the error.  This
should have no effect in GLSL ES 1.00 because, as previously mentioned,
array initializers are not allowed.

Fixes piglit "glsl-es-3.00 compiler array-sized-by-initializer.vert".

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
2013-08-13 17:53:33 -07:00
Ian Romanick
d5aee174b8 glx: Generate GLXBadDrawable when drawable is zero
Fixes piglit glx-query-drawable-GLXBadDrawable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-13 17:53:33 -07:00
Ian Romanick
ef83bd2b95 mesa: Use _mesa_detach_renderbuffer when deleting a texture
The functional change is that now invalidate_framebuffer is called if
the texture is actually detached from one of the currently bound FBOs.
Previously this was only done for renderbuffers.

The remaining changes make the texture delete path look more similar to
the renderbuffer delete path.  This includes adding relevant spec
quotations to justify the behavior.

Fixes piglit fbo-incomplete "delete texture of bound FBO" test.

v2: Move 'fb->Attachment[i].Texture == att' check from previous patch to
this patch... where it was intended to be in the first place.  Noticed
by Chad.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-13 17:53:33 -07:00
Ian Romanick
438cc6bc49 mesa: Make detach_renderbuffer available outside fbobject.c
Also add a return value indicating whether any work was done.

This will be used by the next patch.

v2: Move 'fb->Attachment[i].Texture == att' check to the next
patch... where it was intended to be in the first place.  Noticed by
Chad.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-13 17:53:33 -07:00
Ian Romanick
341fb93c16 meta: Don't call _mesa_Ortho with width or height of 0
Fixes failures in oglconform fbo mipmap.manual.color,
mipmap.manual.colorAndDepth, mipmap.automatic, and
mipmap.manualIterateTexTargets subtests.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-13 17:53:33 -07:00
Vadim Girlin
17bb96b03d r600g/sb: use MULADD workaround on R7xx for MULADD_IEEE
Looks like the same issue that was seen with MULADD in trans slot on
R7xx also affects MULADD_IEEE (maybe all OP3 instructions, and MULADD is
just the most frequently used one?). So the workaround is to not allow affected
instructions to be placed into the trans slot.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=67927

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-14 01:03:18 +04:00
Roland Scheidegger
6991f86945 gallivm: implement new float comparison instructions returning integer masks
FSEQ/FSGE/FSLT/FSNE work just the same as SEQ/SGE/SLT/SNE except skip the
select.
And just for consistency use the same appropriate ordered/unordered comparisons
for the old opcodes as well.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-13 19:09:17 +02:00
Roland Scheidegger
0930082ffd tgsi: implement new float comparison instructions returning integer masks
Also while here add a bunch of other forgotten (integer) instructions to
tgsi_util_get_inst_usage_mask() (which isn't used for much except optimizing
away unused input components), though it may still be incomplete.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-13 19:09:17 +02:00
Roland Scheidegger
e7a5bf7a34 gallium: add new float comparison instructions returning integer masks
Newer graphic languages don't want messy float mask results but instead true
"boolean" mask results for float comparisons. Otherwise just need to convert
the floats back to integers. Need to keep the old opcodes however due to both
legacy (gl and d3d9) needing them and because older hw can't really deal with
integers. These new FSEQ/FSGE/FSLT/FSNE opcodes are part of integer API and
hence must be supported if a driver claims to support glsl 1.30 (or
PIPE_SHADER_CAP_INTEGERS).
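
The difference between the two result conventions can be sketched in scalar
C. This is an illustration only, with hypothetical function names; it assumes
the old opcodes produce 1.0/0.0 floats and the new ones produce all-ones/zero
integer masks, and it does not model ordered versus unordered NaN handling:

```c
#include <stdint.h>

/* Old-style SLT: the result is a float, 1.0 for true and 0.0 for false. */
float
toy_slt(float a, float b)
{
   return a < b ? 1.0f : 0.0f;
}

/* New-style FSLT: the result is an integer mask, all ones for true. */
uint32_t
toy_fslt(float a, float b)
{
   return a < b ? UINT32_MAX : 0u;
}

/* An integer mask can drive a select directly with bitwise ops. */
uint32_t
toy_select(uint32_t mask, uint32_t if_true, uint32_t if_false)
{
   return (mask & if_true) | (~mask & if_false);
}
```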

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-13 19:09:17 +02:00
Chia-I Wu
3b6cee1634 ilo: enable dumping of WM PCB
It was disabled because it wasn't supported.
2013-08-13 16:28:24 +08:00
Chia-I Wu
0f8a86682f ilo: no binding table change when constants are pushed
When constants can be pushed, and nothing else requires new SURFACE_STATEs,
there is no need to emit BINDING_TABLE_STATE.
2013-08-13 16:26:03 +08:00
Chia-I Wu
c6e1e0157b ilo: support push constant model in shaders
Source constants from URB constant data when the constant data can fit in the
PCB.
2013-08-13 16:04:35 +08:00
Chia-I Wu
5e30ffbda6 ilo: support copying constant buffer 0 to PCB
Add ILO_KERNEL_PCB_CBUF0_SIZE so that a kernel can specify how many bytes of
constant buffer 0 need to be copied to PCB.
2013-08-13 15:52:41 +08:00
Chia-I Wu
5df62dce34 ilo: make constant buffer 0 upload optional
Add ILO_KERNEL_SKIP_CBUF0_UPLOAD so that we can skip constant buffer 0 upload
when the kernel does not need it.
2013-08-13 15:52:37 +08:00
Chia-I Wu
8b5b5fe394 Revert "ilo: initialize constant buffer SURFACE_STATE early"
This reverts commit a9b800aa81.  With push
constant support, the constructed SURFACE_STATE is unused and wasted.  The
change only slows things down.
2013-08-13 15:24:58 +08:00
Armin K
f423eba46e gbm: Link to libwayland-drm if Wayland EGL platform is enabled
We were relying on libEGL to pull in libwayland-client symbols, but
commit 2c2e64edab cleaned up the symbol leak.

https://bugs.freedesktop.org/show_bug.cgi?id=67962
2013-08-12 15:16:22 -07:00
Roland Scheidegger
cd2f26090a gallivm: fix exec_mask interaction with geometry shader after end of main
Because we must maintain an exec_mask even if there's currently nothing
on the mask stack, we can still have an exec_mask at the end of the program.
Effectively, this mask should be set back to default when returning from main.
Without relying on END/RET opcode (I think it's valid to have neither) it is
actually difficult to do this, as there doesn't seem to be any reasonable place to
do it, so instead let's just say the exec_mask is invalid outside main (which
it really is effectively).
The problem is that geometry shader called end_primitive outside the shader
(in the epilogue), and as a result used a bogus mask, leading to bugs if we
had to set the (somewhat misnamed) ret_in_main bit anywhere. So just avoid
the mask combining function when called from outside the shader.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-12 23:33:00 +02:00
Roland Scheidegger
dfa7b72563 draw: simplify prim mask construction
The code was quite weird, the second comparison was in fact a complete no-op
and we can also do the comparison with the vector directly instead of scalar,
which should not only be faster but also makes it much more obvious what that
mask is actually going to look like.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-12 23:33:00 +02:00
Roland Scheidegger
7147094ff2 gallivm: simplify geometry shader mask handling a bit
Instead of reducing masks to 0/1 simply use the mask directly as -1.
Also use some signed comparison instead of unsigned (as far as I understand
these values have to be (very) small and signed means llvm doesn't have to
apply additional logic to do the unsigned comparisons the cpu can't do).
Saves a couple of instructions in some test geometry shader here.

v2: that was a bit too much optimization, don't skip combining the masks...

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-12 23:33:00 +02:00
Roland Scheidegger
84fce45321 draw: (trivial) dump tgsi for geometry shaders with GALLIVM_DEBUG_TGSI
And dump the variant key too (same as vs does).
Just so I can stop wondering why I see the tgsi dump for fs and vs but not
gs...
2013-08-12 23:33:00 +02:00
Roland Scheidegger
8c5283dc17 gallivm: (trivial) fix typo in argument declaration of lp_build_size_query_soa
Was meant to match the name used elsewhere, spotted by Anthony.
2013-08-12 23:33:00 +02:00
Kenneth Graunke
4d95efd146 i965/fs: Add dump_instruction() support for ARF destinations.
CMP instructions use BRW_ARF_NULL as a destination.  Prior to this
patch, dump_instruction() decoded the destination as "???".

Now it decodes BRW_ARF_NULL as "(null)" and other ARFs numerically.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-12 13:13:06 -07:00
Kenneth Graunke
ee7bfab068 i965/fs: Remove extraneous newline in dump_instruction() for CMP.
This resulted in printouts like:

   246: cmp.cmod.f0.0
    ???, vgrf152, 0.000000f, (null),

With this patch, CMP is properly printed on one line.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-12 13:13:04 -07:00
Kenneth Graunke
80e1c2f35f i965/fs: Optimize IF/MOV/ELSE/MOV/ENDIF to SEL when possible.
Many GLSL shaders contain code of the form:

   x = condition ? foo : bar

The compiler emits an ir_if tree for this, since each subexpression
might be a complex tree that could have side-effects and short-circuit
logic operations.

However, the common case is to simply pick one of two constants or
variable's values---which is exactly what SEL is for.  Replacing IF/ELSE
with SEL also simplifies the control flow graph, making optimization
passes which work on basic blocks more effective.

The shader-db statistics:

   total instructions in shared programs: 1655247 -> 1503234 (-9.18%)
   instructions in affected programs:     949188 -> 797175 (-16.02%)

   2,970 shaders were helped, none hurt.  Gained 181 SIMD16 programs.

This helps Valve's Source Engine games (max -41.33%), The Cave
(max -33.33%), Serious Sam 3 (max -18.64%), Yo Frankie! (max -30.19%),
Zen Bound (max -22.22%), GStreamer (max -6.12%), and GLBenchmark 2.7
(max -1.94%).
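
The transformation described above can be sketched on a toy instruction list.
This is only an illustration: the tuple encoding and the if_to_sel name are
assumptions made for the example, and the real pass checks more conditions
than this one does.

```python
def if_to_sel(insts):
    """Replace IF / MOV d,a / ELSE / MOV d,b / ENDIF with one SEL d,a,b.

    Instructions are hypothetical tuples: ("MOV", dst, src), ("IF",), etc.
    """
    out = []
    i = 0
    while i < len(insts):
        w = insts[i:i + 5]
        if (len(w) == 5
                and w[0] == ("IF",)
                and w[1][0] == "MOV"
                and w[2] == ("ELSE",)
                and w[3][0] == "MOV"
                and w[4] == ("ENDIF",)
                and w[1][1] == w[3][1]):  # both branches write the same dst
            out.append(("SEL", w[1][1], w[1][2], w[3][2]))
            i += 5  # the whole five-instruction window collapses into the SEL
        else:
            out.append(insts[i])
            i += 1
    return out
```

The control flow disappears entirely: a single basic block remains where
there used to be three.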

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-12 13:13:01 -07:00
Kenneth Graunke
2c32c3985c i965/fs: Consider predicated SEL instructions as whole variable writes.
The instruction

   (+f0.0) SEL dst, src0, src1

will write either src0 or src1 to dst, depending on the predicate.
Unlike most predicated instructions, it always writes to dst.

fs_inst::is_partial_write() is supposed to return false if the whole
register is guaranteed to be written.  The !inst->predicated check makes
sense for most instructions, which might not write the whole register,
but SEL is a special case.

This caused live interval analysis to ignore the destination of
predicated SEL instructions when computing "def" information.
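
As a rough sketch of the special case, the toy model below only looks at the
predicate. The Inst class is hypothetical, and the real function considers
other properties of the instruction as well.

```python
from dataclasses import dataclass


@dataclass
class Inst:
    """Hypothetical stand-in for an fs_inst: just an opcode and a predicate flag."""
    opcode: str
    predicated: bool = False


def is_partial_write(inst):
    # A predicated instruction may leave some channels untouched, so it only
    # partially writes its destination. SEL is the exception: the predicate
    # picks the source, but the destination is always written.
    return inst.predicated and inst.opcode != "SEL"
```

With this, a predicated SEL counts as a full definition for liveness
purposes, while a predicated MOV still does not.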

Requires the previous commit to avoid regressions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-12 13:12:59 -07:00
Kenneth Graunke
d21f542aa1 i965/fs: Explicitly disallow CSE on predicated instructions.
The existing inst->is_partial_write() already disallows predicated
instructions, so this has no functional change.  However, it's worth
doing explicitly since the CSE pass does not consider the flag register.
This means it could blindly factor out operations that use the same
sources, but which have different condition codes set.

This prevents a regression in the next commit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-12 13:12:57 -07:00
Kenneth Graunke
53d8cff63b i965/fs: Log a performance warning if skipping 16-wide due to pulls.
Usually, the driver creates both 8-wide and 16-wide variants of every
fragment shader.  When 16-wide compilation fails, it logs a performance
warning explaining why only an 8-wide program exists.

However, when there are pull parameters, the driver won't even bother
trying the 16-wide compile (since it would fail).  In this case, it
failed to emit a performance warning, leaving no explanation for the
missing 16-wide program.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-12 13:12:47 -07:00
Chia-I Wu
a9b800aa81 ilo: initialize constant buffer SURFACE_STATE early
Fix ilo_gpe_init_view_surface_for_buffer to allow buffer to be NULL, and add
ilo_gpe_set_view_surface_bo to set it later.  This allows us to set up
SURFACE_STATE early for constant buffers backed by user buffers.
2013-08-12 11:49:51 +08:00
Chia-I Wu
b2f79a3823 ilo: 3DSTATE_INDEX_BUFFER may be wrongly skipped
In finalize_index_buffer(), when the current index buffer was destroyed due to
u_upload_data(), it may happen that the new index buffer is at the same
address as the old one.  Comparing the pointers to the two buffers could fail
to work, and 3DSTATE_INDEX_BUFFER would be incorrectly skipped.

Holding a reference to the current index buffer before calling u_upload_data()
should fix the problem.
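
The address reuse problem can be shown with a toy allocator. Everything here
(ToyAllocator and the two upload helpers) is made up for the illustration and
is not the driver's code.

```python
class ToyAllocator:
    """Hands out integer 'addresses' and reuses freed ones, as a heap may."""

    def __init__(self):
        self._free = []
        self._next = 0x1000

    def alloc(self):
        if self._free:
            return self._free.pop()  # most recently freed address comes back
        addr = self._next
        self._next += 0x100
        return addr

    def free(self, addr):
        self._free.append(addr)


def upload_without_reference(heap, current):
    heap.free(current)   # the upload drops the last reference to the old buffer
    return heap.alloc()  # ...so the new buffer may land at the same address


def upload_with_reference(heap, current):
    new = heap.alloc()   # old buffer is still alive, its address cannot be reused
    heap.free(current)   # release the extra reference afterwards
    return new
```

In the first helper a pointer comparison sees "no change" even though the
buffer is a different one, which is how the state emission ended up skipped.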
2013-08-10 13:01:41 +08:00
Chris Forbes
637e6a0aa8 i965: add missing BRW_NEW_INTERPOLATION_MAP to state dump
Makes this flag appear in the output for INTEL_DEBUG=state

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-10 20:29:12 +12:00
Chris Forbes
e114b13dae i965: Add a new debug mode for the VUE map
INTEL_DEBUG=vue now emits a listing of each slot in the VUE map,
and the corresponding interpolation mode.

V2: Fix whitespace issues.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-10 20:28:45 +12:00
Ian Romanick
5894898148 glsl: Don't allow const on out or inout function parameters
Fixes piglit tests const-inout-parameter.frag and
const-out-parameter.frag.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-09 13:51:18 -07:00
Roland Scheidegger
894d4903e7 gallivm: set non-existing values really to zero in size queries for d3d10
My previous attempt at doing so double-failed miserably (minification of
zero still gives one, and even if it would not the value was never written
anyway).
While here also rename the confusingly named int_vec bld as we have int vecs
of different sizes, and rename need_nr_mips (as this also changes out-of-bounds
behavior) to is_sviewinfo too.
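
Why minification alone cannot produce the zero is easy to see in a small
sketch. The size_query helper and its "0 means the dimension does not exist"
convention are assumptions for the example only.

```python
def minify(size, level):
    """Usual mip size rule: halve per level, but never go below 1."""
    return max(1, size >> level)


def size_query(dims, level):
    """dims is (width, height, depth); 0 marks a dimension that does not exist.

    Non-existing dimensions are forced to 0 explicitly, because minify(0, n)
    would report 1.
    """
    return tuple(minify(d, level) if d else 0 for d in dims)
```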

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-09 20:49:19 +02:00
Roland Scheidegger
b0f74250e1 gallivm: use texture target from shader instead of static state for size query
d3d10 has no notion of distinct array resources, either at the resource or
sampler view level. However, the shader dcl of resources certainly has, and
d3d10 expects resinfo to return the values according to that. In particular,
a resource might have been a 1d texture with some array layers, and the
sampler view might then have used only 1 layer, so it can be accessed both as
a 1d or a 1d array texture (I think - the former definitely works). resinfo of
a resource declared as array needs to return the number of array layers, but
a non-array resource needs to return 0 (and not 1). Hence fix this by passing
the target from the shader decl to emit_size_query and using that (in case of
OpenGL the target will come from the instruction itself).
Could probably do the same for actual sampling, though it may not matter there
(as the bogus components will essentially get clamped away), possibly could
wreak havoc though if it REALLY doesn't match (which is of course an error
but still).

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-09 20:49:18 +02:00
Roland Scheidegger
38ad404f76 gallivm: honor d3d10's wishes of out-of-bounds behavior for texture size query
Specifically, must return 0 for non-existent mip levels (and non-existent
textures which is an unsolved problem) for everything but total mip count.
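
A minimal sketch of that rule for a 2d texture, assuming a hypothetical
resinfo helper that returns (width, height, total mip count):

```python
def resinfo(width, height, num_levels, level):
    """Size query with d3d10-style out-of-bounds behavior (illustrative only)."""
    if 0 <= level < num_levels:
        w = max(1, width >> level)
        h = max(1, height >> level)
    else:
        # Non-existent mip level: the sizes read back as 0...
        w = h = 0
    # ...but the total mip count is reported regardless of the level asked for.
    return (w, h, num_levels)
```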

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-09 20:49:18 +02:00
Paul Berry
417dc8081b glsl: Enable ARB_fragment_coord_conventions functionality in GLSL 1.50.
GLSL 1.50 incorporates the functionality of the
ARB_fragment_coord_conventions extension, so we need to make this
functionality available even if the extension isn't enabled.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-09 10:35:06 -07:00
Paul Berry
13fedf2883 main: Fix deprecation of glLineWidth()
From section E.1 (Profiles and Deprecated Features of OpenGL 3.0)
of the OpenGL 3.0 spec:

    "LineWidth is not deprecated, but values greater than 1.0
    will generate an INVALID VALUE error"

From context it is clear that values greater than 1.0 should only
generate an INVALID VALUE error in a forward-compatible context.

The code was correctly quoting this spec text, but it was disallowing
all line widths in forward-compatible contexts, instead of just widths
greater than 1.0.

This patch introduces the correct check, so that setting a line width
of 1.0 or less is permitted.
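
The intended check can be sketched as follows. This is a simplification: the
function name is made up, and the real code also looks at the API/profile of
the context, which is reduced to a single flag here.

```python
GL_NO_ERROR = 0
GL_INVALID_VALUE = 0x0501


def line_width_error(width, forward_compatible):
    """Return the GL error a glLineWidth(width) call should raise, if any."""
    if width <= 0.0:
        return GL_INVALID_VALUE  # non-positive widths are always invalid
    if forward_compatible and width > 1.0:
        return GL_INVALID_VALUE  # wide lines are gone in forward-compatible contexts
    return GL_NO_ERROR
```

The bug amounted to dropping the "width > 1.0" part of the second test.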

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-09 10:34:05 -07:00
Roland Scheidegger
836098f6b2 util: (trivial) fix asm input/output list for fxsave
Otherwise gcc might do very unsafe optimizations, spotted by Uros Bizjak.
Hopefully this time it's finally right?
2013-08-09 17:30:13 +02:00
Alex Deucher
c88783047e r600g: disable GPUVM by default
Cayman and trinity systems still seem to suffer from
stability problems with GPUVM.  This also fixes compute
on these asics.  It can still be enabled for testing
by setting env var RADEON_VA=true.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=65958

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: "9.2" <mesa-stable@lists.freedesktop.org>
CC: "9.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-08-09 10:51:25 -04:00
Zack Rusin
e8d8974f80 softpipe: fix the regressions
softpipe has a really weird handling of the draw attrs, let's
just not inject outputs in its data.
Trivial.
2013-08-08 20:54:50 -04:00
Zack Rusin
662a4d4a12 draw: rewrite primitive assembler
We can't inject the primitive ids in the pipeline because
by that time the primitives have already been decomposed. To
properly number the primitives we need to handle the adjacency
primitives by hand. This patch moves the prim id injection into
the original primitive assembler and completely removes the
useless pipeline stage.
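
To illustrate numbering primitives before decomposition, here is a sketch for
a triangles-with-adjacency list, where each primitive has six vertices and
vertices 0, 2 and 4 of the group form the actual triangle. The function name
and the output layout are assumptions for the example.

```python
def assemble_triangles_adj(indices):
    """Turn a TRIANGLES_ADJACENCY index list into (prim_id, triangle) pairs."""
    prims = []
    for prim_id in range(len(indices) // 6):
        base = prim_id * 6
        # The odd-numbered vertices are adjacency data and are dropped here.
        tri = (indices[base], indices[base + 2], indices[base + 4])
        prims.append((prim_id, tri))
    return prims
```

The id is assigned per original primitive, which is why it has to happen
where the six-vertex groups are still visible.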

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-08 20:54:25 -04:00
Zack Rusin
1d425c4c6d draw: reset the vertex id when injecting new primitive id
Without resetting the vertex id, with primitives where the same
vertex is used with different primitives (e.g. tri/line strips)
our vbuf module won't re-emit those vertices with the changed
primitive id. So let's reset the vertex id whenever injecting
a new primitive id to make sure that the vertex data is correctly
emitted.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-08 20:54:03 -04:00
Zack Rusin
57cd326778 draw: cleanup the extra attribs
Before inserting new front face and prim id outputs, clean up
the old extra outputs; otherwise our cache will use previous
output slots, which will break as soon as the outputs of the current
shader don't match the last.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-08 20:53:40 -04:00
Dieter Nützel
8f40fa0e7f util: (trivial) fix more compile errors in u_cpu_detect (gcc/x86 this time).
Oops. Should fix https://bugs.freedesktop.org/show_bug.cgi?id=67921
2013-08-09 01:25:54 +02:00
Chad Versace
2c2e64edab egl: Do not export private symbols
libEGL was incorrectly exporting *all* symbols, public and private.
This patch adds -fvisibility=hidden to libEGL's linker flags to ensure
that only symbols annotated with __attribute__((visibility("default")))
get exported.

Sanity-checked with libEGL's builtin DRI2 driver and the i965 DRI driver
by running Piglit on X/EGL and by running weston-gears on Weston as an
X client.

Sanity-checked with libEGL's Gallium driver (which is not built-in) and
the swrast Gallium driver by running es2gears_x11.

Kristian reviewed the symbol diff in `nm libEGL.so`.

CC: "9.2" <mesa-stable@lists.freedesktop.org>
CC: Ian Romanick <idr@freedesktop.org>
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-08 15:17:51 -07:00
Kenneth Graunke
fb3d62fe3d i965: Remember to call intel_prepare_render() before blitting.
Otherwise, blits to the window system buffer may cause crashes,
since dst_irb->mt may be NULL.

This code is lifted straight out of brw_blorp_framebuffer()'s
try_blorp_blit() helper.

Fixes crashes in Piglit's fbo-sys-blit on systems without BLORP.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65919
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-08 12:12:47 -07:00
Roland Scheidegger
43076a55c2 util: (trivial) fix compile error with MSVC on x86 2013-08-08 19:08:57 +02:00
Roland Scheidegger
6ce54a81b2 gallivm: honor d3d10 floating point rules for shadow comparisons
d3d10 specifies ordered comparisons for everything but not_equal which is
unordered (http://msdn.microsoft.com/en-us/library/windows/desktop/cc308050.aspx).
OpenGL probably doesn't care.
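
What ordered versus unordered means once NaNs are involved can be sketched
like this. The shadow_compare helper and its string function names are
hypothetical.

```python
import math
import operator

_OPS = {
    "less": operator.lt,
    "lequal": operator.le,
    "greater": operator.gt,
    "gequal": operator.ge,
    "equal": operator.eq,
}


def shadow_compare(func, ref, texel):
    """Compare ref against texel with d3d10-style NaN handling."""
    unordered = math.isnan(ref) or math.isnan(texel)
    if func == "not_equal":
        # Unordered comparison: passes if either operand is NaN.
        return unordered or ref != texel
    # Ordered comparison: fails if either operand is NaN.
    return (not unordered) and _OPS[func](ref, texel)
```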

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-08 18:55:58 +02:00
Roland Scheidegger
aa84f1ad55 softpipe: don't clamp reference value for shadow comparison for float formats
Clamping is only done for fixed-point formats as part of conversion to
texture format.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-08 18:55:57 +02:00
Roland Scheidegger
e1590b9690 gallivm: don't clamp reference value for shadow comparison for float formats
This is wrong both for OpenGL and d3d. (In fact clamping is a side effect
of converting to depth format, so this should really do quantization too
at least in d3d10 for the comparisons to be truly correct.)

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-08 18:55:57 +02:00
Roland Scheidegger
eac57bc223 gallivm: propagate scalar_lod to emit_size_query too
Clearly the returned values need to be per-element if the lod is per element.
Does not actually change behavior yet.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-08 18:55:57 +02:00
Roland Scheidegger
c8572a9457 gallium: clarify SVIEWINFO opcode
This opcode is quite problematic in tgsi, while it tries to mirror
d3d10 resinfo it can't really do what's stated there due to missing
the crazy return type modifiers. Hence specify this is ignored along
with the swizzle.
(Other options would be to have multiple opcodes or specify the ret
type modifier maybe in dst_reg as there's padding bits left there but
it is the only instruction allowing this.)

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-08 18:55:57 +02:00
Roland Scheidegger
ce0e66af0a gallivm: fix out-of-bounds behavior for fetch/ld
For d3d10 and ARB_robust_buffer_access_behavior, we are required to return
0 for out-of-bounds coordinates (for which we can just enable the code that
was already there but disabled). Additionally, we also need to return 0 for
an out-of-bounds mip level and an out-of-bounds layer. This changes the logic
so that instead of clamping the level/layer, an out-of-bounds mask is computed
in this case (the actual clamping can then be omitted just like with
coordinates, since we set the fetch offset to zero if that happens anyway).
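
A scalar sketch of the mask-instead-of-clamp idea for a 1d texture follows.
The data layout (a list of mip levels, each a list of texels) and the
function name are assumptions, and level 0 is assumed to be non-empty.

```python
def texel_fetch_1d(levels, x, level):
    """Fetch a texel, returning 0 for an out-of-bounds level or coordinate."""
    out_of_bounds = level < 0 or level >= len(levels)
    # Fall back to level 0 so the coordinate test below has a valid size.
    texels = levels[0 if out_of_bounds else level]
    out_of_bounds = out_of_bounds or x < 0 or x >= len(texels)
    # No clamping needed: the offset is simply zeroed, so the load is safe,
    # and the mask then discards whatever was loaded.
    offset = 0 if out_of_bounds else x
    value = texels[offset]
    return 0 if out_of_bounds else value
```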

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-08 18:55:57 +02:00
Roland Scheidegger
883987503f util: try much harder to set DAZ flag
While so far this only causes some harmless test failures, there's lots more
cpus with DAZ. All 64bit capable ones can do it (particularly relevant for
AMD cpus as they supported sse3 very very late) but if really necessary we
can check support for that for real with some more magic.
(In fact just about ANY cpu with sse2 can support DAZ; I believe the only
exceptions are first gen P4s (Willamette), and of those only the early
steppings can't do it. It's almost like intel forgot to add it... - a real
pity, though, as the docs say you can't just try to set it since they will
throw a GPF.)
While this was meant to address https://bugs.freedesktop.org/show_bug.cgi?id=67672
it does not fix it. Most likely the tests need fixing as I don't think
there's any guarantee about denorm handling in the reference math library
functions if the flags aren't set to standard values. Nevertheless enabling
DAZ on all cpus which can do it should be the right thing to do.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-08 18:55:57 +02:00
Roland Scheidegger
e3b5e2db1b util: implement table-based + linear interpolation linear-to-srgb conversion
Should be much faster, seems to work in softpipe.
While here (although it's now disabled) fix up the pow factor - the former
value is what is in GL core, but it is not actually accurate to fp32 standard
(as it is 1.0/2.4), and if someone would do all the accurate math there's no
reason to waste 8 mantissa bits or so...

v2: use real table generating function instead of just printing the values
(might take a bit longer as it does calculations on some 3+ million floats
but much more descriptive obviously).
Also fix up another inaccurate pow factor (this time in the python code) -
wondering where the couple one bit errors came from :-(.
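
The general technique (a precomputed table plus linear interpolation between
neighbouring entries) can be sketched as below. The uniform 4096-entry table
is purely an assumption for illustration and is not the table layout used by
the actual code; the exact function uses the 1.0/2.4 exponent.

```python
def linear_to_srgb_exact(x):
    """Reference linear-to-sRGB transfer function on [0, 1]."""
    if x <= 0.0031308:
        return 12.92 * x
    return 1.055 * x ** (1.0 / 2.4) - 0.055


TABLE_SIZE = 4096  # hypothetical resolution, chosen only for the sketch
_TABLE = [linear_to_srgb_exact(i / TABLE_SIZE) for i in range(TABLE_SIZE + 1)]


def linear_to_srgb_table(x):
    """Approximate the conversion by interpolating between table entries."""
    x = min(max(x, 0.0), 1.0)
    pos = x * TABLE_SIZE
    i = min(int(pos), TABLE_SIZE - 1)  # keep i + 1 inside the table
    frac = pos - i
    return _TABLE[i] + (_TABLE[i + 1] - _TABLE[i]) * frac
```

The per-pixel cost is one table lookup pair and a multiply-add instead of a
pow() call.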

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-08 18:55:57 +02:00
Roland Scheidegger
2d9fea95e8 gallivm: fix comment wrt srgb accuracy.
I think it's actually not good enough now...
2013-08-08 18:55:57 +02:00
Chia-I Wu
f9a4288bd2 ilo: get rid of GPE tables completely
Move the estimate functions out of the tables and kill the tables.
2013-08-08 13:46:01 +08:00
Chia-I Wu
19204081ce ilo: clean up GPE header inclusions
This reduces the number of source files that need to be recompiled when GPE
functions are changed, besides being a regular cleanup.
2013-08-08 13:41:10 +08:00
Chia-I Wu
e292b9362a ilo: initialize alpha test state in ilo_gpe_init_dsa
This could speed up BLEND_STATE and COLOR_CALC_STATE emission a bit.
2013-08-08 13:30:34 +08:00
Chia-I Wu
02496cd2b6 ilo: fold gen6_translate_index_size into the caller
There is only one caller so fold it.
2013-08-08 13:10:36 +08:00
Chia-I Wu
1c19d0bb81 ilo: fold gen6_translate_depth_format into the caller
There is only one caller so fold it.
2013-08-08 13:02:17 +08:00
Courtney Goeltzenleuchter
c2c5366ff2 ilo: Call GPE emit functions directly.
Eliminate pipeline and GPE function vectors and have the pipeline functions
call the GPE emit functions directly.
2013-08-08 11:39:21 +08:00
Courtney Goeltzenleuchter
4bc9daf923 ilo: move emit functions so that they can be inlined. 2013-08-08 11:39:21 +08:00
Tom Stellard
d0c13fba17 r300g/compiler/tests: Pass the required LDFLAGS when building the test program
CC: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-07 17:28:19 -07:00
Tom Stellard
d691ba4d94 r300g/compiler/tests: Fix segfault
CC: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-07 17:27:23 -07:00
Kristian Høgsberg
5575fdaccf gallium-egl: Commit the rest of the native_wayland_drm_bufmgr_helper v2 patch
I missed Ander's v2 on the list which fixed non-wayland compilation:

http://lists.freedesktop.org/archives/mesa-dev/2013-July/042062.html

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2013-08-07 11:23:47 -07:00
Ander Conselvan de Oliveira
8d29b5271a egl: Update to Wayland 1.2 server API
Since Wayland 1.2, struct wl_buffer and a few functions are deprecated.

References to wl_buffer are replaced with wl_resource and some getter
functions and calls to deprecated functions are replaced with the proper
new API. The latter changes are related to resource versioning.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
2013-08-07 10:37:58 -07:00
Ander Conselvan de Oliveira
602351dd58 gallium-egl: Don't add a listener for wl_drm twice in wayland platform
A listener is added just after the interface is bound, in
registry_handle_global().

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
2013-08-07 10:37:58 -07:00
Ander Conselvan de Oliveira
331a8fa41d gallium-egl: Simplify native_wayland_drm_bufmgr_helper interface
The helper provides a series of functions to ease the implementation
of the WL_bind_wayland_display extension on different platforms. But
even with the helpers there was still a bit of duplicated code between
platforms, with the drm authentication being the only part that
differs.

This patch changes the bufmgr interface to provide a self contained
object with a create function that takes a drm authentication callback
as an argument. That way all the helper functions are made static and
the "_helper" suffix is removed from the source file name.

This change also removes the mix of Wayland client and server code in
the wayland drm platform source file. All the uses of libwayland-server
are now contained in native_wayland_drm_bufmgr.c.

Changes to the drm platform are only compile tested.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
2013-08-07 10:37:58 -07:00
Chia-I Wu
79b868fea1 ilo: speed up 3DSTATE_VERTEX_BUFFERS emission a bit
Ignore vbuffer_mask which does not gain us anything.
2013-08-07 23:13:50 +08:00
Chia-I Wu
7ce3cbaacf ilo: skip state emission when reducing sampler count
When the number of sampler states bound is reduced, we are good to keep
referencing the old SAMPLER_STATE array and skip emitting a new one.
2013-08-07 23:13:44 +08:00
Chia-I Wu
2811dba1d0 ilo: simplify setting of shader samplers and views
Remove the special path that unbinds all samplers/views not in the range.
Just make another call to unbind them.
2013-08-07 18:10:32 +08:00
Chia-I Wu
186dab5b8f ilo: correctly check for stencil ref change
I intended to do a memcmp(), not a memcpy()...
2013-08-07 18:00:46 +08:00
Zack Rusin
12522041d6 draw: fix slot detection
Nowadays -1 for slots means that the semantic is not present, so
we need to store it in a signed variable, otherwise <0 comparisons
are pointless. Fixes
http://bugzilla.eng.vmware.com/show_bug.cgi?id=67811 (at least
with softpipe, edgeflags don't work with llvmpipe)
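
Why the comparison is pointless with an unsigned variable can be modelled by
masking to 32 bits, which is what storing a C int into an unsigned does. The
helper names are made up for the example.

```python
UINT_MASK = 0xFFFFFFFF


def as_unsigned(value):
    """Model storing a C int in an unsigned 32-bit variable."""
    return value & UINT_MASK


def has_semantic_signed(slot):
    return slot >= 0  # -1 correctly reads as "not present"


def has_semantic_unsigned(slot):
    return as_unsigned(slot) >= 0  # always True: -1 became 4294967295
```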

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-06 20:23:57 -04:00
Laurent Carlier
2572e3b4a1 gallivm: Fix build - Remove TargetOptions.RealignStack for llvm>=3.4
Since llvm-3.4svn r187618, TargetOptions doesn't provide
RealignStack, so only enable it with llvm<3.4

This option must now be specified using function attributes, see LLVM
commit r187618

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-08-06 15:31:48 -07:00
Kenneth Graunke
0f7a15a247 i965: Add #defines for the MI_LOAD_REGISTER_MEM command.
This command reads a value from memory and writes it to a register (the
opposite of MI_STORE_REGISTER_MEM).  It's only available on Gen7+.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-06 14:41:37 -07:00
Kenneth Graunke
c047ad000b i965: Initialize the intel_context::bufmgr pointer earlier.
This prevents a crash in a future patch.

_mesa_initialize_context() creates a default transform feedback object
by calling the NewTransformFeedbackObject() driver hook.  Eventually,
we'll want to subclass that and allocate a buffer object.  This means
passing brw->bufmgr to drm_intel_alloc_bo(), and crashing if it isn't
initialized yet.

The buffer manager is actually already initialized; we just hadn't
copied the pointer from intel_screen to intel_context quite early
enough.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-06 14:41:37 -07:00
Kenneth Graunke
263ebe1a71 i965: Tidy preprocessor macros for SO_PRIM_STORAGE_NEEDED registers.
Gen7+ supports four transform feedback streams.  Using a function-like
macro makes it easy to access them by stream number or loop over them.
"GEN7_" prefixes are more common than "_IVB" suffixes, so use that.

Gen6 only supports a single stream, so the single #define should be
fine.  However, SO_NUM_PRIM_STORAGE_NEEDED was a poor name.  For one,
the word "NUM" doesn't appear in the actual name of the register.
It's also confusingly generic, as it doesn't exist on Gen7+.  Add a
"GEN6_" prefix for clarity.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-06 14:41:37 -07:00
Kenneth Graunke
8c27f13cd9 i965: Tidy preprocessor macros for SO_NUM_PRIMS_WRITTEN registers.
Gen7+ supports four transform feedback streams.  Using a function-like
macro makes it easy to access them by stream number or loop over them.
"GEN7_" prefixes are more common than "_IVB" suffixes, so we use that.

Gen6 only supports a single stream, so the single #define should be
fine.  However, SO_NUM_PRIMS_WRITTEN was confusingly generic, as it
doesn't exist on Gen7+.  Add a "GEN6_" prefix for clarity.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-06 14:41:37 -07:00
Christoph Bumiller
2daf974cfe nvc0: don't access array out of bounds on unexpected sample count 2013-08-06 22:29:33 +02:00
Emil Velikov
07c8f7a6f8 nv50: handle pure integer vertex attributes
And as a side effect fix a crash in the following piglit test:
general/attribs GL3

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "9.2 and 9.1" mesa-stable@lists.freedesktop.org
2013-08-06 22:25:26 +02:00
Samuel Pitoiset
31caddb8d9 nvc0: implement MP performance counters for nvc0:nvd9 2013-08-06 22:24:30 +02:00
Samuel Pitoiset
9dcd7888e6 nvc0: implement compute support for nvc0
Tested on nvc0, nvc1, nvcf and nvd9.
2013-08-06 22:22:49 +02:00
Samuel Pitoiset
981b589101 nvc0: add more MP counters for nve4 2013-08-06 22:22:34 +02:00
Ian Romanick
2f9fe2d80a mesa: Generate a renderbuffer wrapper even if the texture has no image
This prevents a segfault in check_begin_texture_render when an FBO is
rebound while in this state.  This fixes the piglit test
fbo-incomplete-invalid-texture.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org
2013-08-06 12:18:50 -07:00
Ian Romanick
25281fef0f mesa: Validate the layer selection of an array texture too
Previously only the slice of a 3D texture was validated in the FBO
completeness check.  This fixes the failure in the 'invalid layer of an
array texture' subtest of piglit's fbo-incomplete test.

v2: 1D_ARRAY textures have Depth == 1.  Instead, compare against Height.

v3: Handle CUBE_MAP_ARRAY textures too.  Noticed by Marek.
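
The per-target comparison can be sketched as below, using string target names
for readability. The function is hypothetical, cube map arrays are left out,
and the fallback for other targets is an assumption of the sketch.

```python
def layer_is_valid(target, zoffset, height, depth):
    """Check an attachment's slice/layer selection against the texture size."""
    if target == "3D":
        return 0 <= zoffset < depth
    if target == "1D_ARRAY":
        # 1D array textures have Depth == 1; the layer count lives in Height.
        return 0 <= zoffset < height
    if target == "2D_ARRAY":
        return 0 <= zoffset < depth
    # Assumption: targets without layers only accept zoffset 0.
    return zoffset == 0
```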

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org
2013-08-06 12:18:46 -07:00
Ian Romanick
41485fea7c mesa: Don't call driver RenderTexture for invalid zoffset
This fixes the segfault in the 'invalid slice of 3D texture' and
'invalid layer of an array texture' subtests of piglit's fbo-incomplete
test.

The 'invalid layer of an array texture' subtest still fails.

v2: Fix off-by-one comparison error noticed by Chris Forbes.  Also,
1D_ARRAY textures have Depth == 1.  Instead, compare against Height.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org
2013-08-06 12:18:42 -07:00
Ian Romanick
fb49713f8e mesa: Don't call driver RenderTexture for really broken textures
This fixes the segfault in the '0x0 texture' subtest of piglit's
fbo-incomplete test.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org
2013-08-06 12:18:39 -07:00
Ian Romanick
0c3dbd689b mesa: Remove stray debug printfs in attachment completeness code
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org
2013-08-06 12:18:29 -07:00
Ian Romanick
4a9522a5a0 mesa: Treat glBindFramebuffer and glBindFramebufferEXT more correctly
Allow user-generated names for glBindFramebufferEXT on desktop GL.
Disallow its use altogether for core profiles.

Names bound with glBindFramebuffer in desktop OpenGL are still
(incorrectly) shared across the share group instead of being
per-context.  This gets us a bit closer to being strictly conformant.

v2: Disallow glBindFramebufferEXT in 3.1 by not installing it in the
dispatch table.  Suggested by Jordan.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> [v1]
Cc: mesa-stable@lists.freedesktop.org
2013-08-06 10:46:05 -07:00
Ian Romanick
97965e87fc mesa: Treat glBindRenderbuffer and glBindRenderbufferEXT correctly
Allow user-generated names for glBindRenderbufferEXT on desktop GL.
Disallow its use altogether for core profiles.

v2: Disallow glBindRenderbufferEXT in 3.1 by not installing it in the
dispatch table.  Suggested by Jordan.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> [v1]
Cc: mesa-stable@lists.freedesktop.org
2013-08-06 10:46:05 -07:00
Michel Dänzer
46b6f79fea radeonsi: Number of SGPRs retrieved from LLVM already includes VCC
Fixes spurious 'Assertion `num_sgprs <= 104' failed.' with shaders using
all 104 SGPRs.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-08-06 12:50:01 +02:00
Kenneth Graunke
59f22148b3 i965: Don't allocate curbe buffers on Gen6+.
These are only used on Gen4-5.  Why waste the 8kB of space?

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-08-06 00:21:10 -07:00
Vinson Lee
b57c1e4b86 llvmpipe: Do not need to free anything if there is no geometry shader.
If gs is null, then freeing state->shader.tokens would result in a null
dereference.

Fixes "Dereference after null check" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-08-05 21:54:20 -07:00
Vinson Lee
60b567ee59 nvc0: Initialize ptr for unexpected sample_count on release builds.
Fixes "Uninitialized pointer read" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-08-05 21:53:39 -07:00
Vinson Lee
8e850f2feb draw: Change slot from unsigned to int.
unfilled_stage::face_slot is of type int.

Fixes "Unsigned compared against 0" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-08-05 17:40:19 -07:00
Vinson Lee
8294d969e1 postprocess: Check ppq is null before calling pp_free_bos.
pp_free_bos dereferences ppq without a null check.

Fixes "Dereference before null check" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-08-05 17:27:38 -07:00
Zack Rusin
a9cb914f49 draw: add back separate input assembler
The issue is that stream output is run before the pipeline, which
means that unless we decompose the primitives before the SO,
things crash. We could convert the entire stream output
code into a pipeline stage but it will take a bit, so for now
fix the crashes by simply re-adding the old input assembler
which is run before the SO.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-08-03 02:57:40 -04:00
Zack Rusin
c9c211fae1 draw: implement proper primitive assembler as a pipeline stage
We used to have a face primitive assembler that we ran after if
the gs was missing but we had adjacency primitives in the pipeline.
Let's convert it to a pipeline stage, which allows us to use it
to inject outputs (primitive id) into the vertices. It's also
a lot cleaner because the decomposition is already handled for us.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-08-03 00:38:58 -04:00
Zack Rusin
8a94d15fba draw: fix front face injection
Inject front face only if the fragment shader uses it and
propagate through all channels because otherwise we'll
need to figure out the exact swizzle that the fs expects and
it's just simpler to make sure all the components within
the front face register are correctly set.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-08-03 00:36:39 -04:00
Brian Paul
4c9f12d69c tgsi: remove unneeded File == TGSI_FILE_INPUT test
We're already in an "if (File == TGSI_FILE_INPUT)" block at that point.
2013-08-05 10:25:08 -06:00
Brian Paul
3e4b5c6c9c tgsi: clean up tgsi_scan_shader() function
Replace "fulldecl->Semantic.Name/Index" with semName/semIndex.
Simplify if/else logic for TGSI_FILE_OUTPUT code.
Remove old comment.
Fix indentation.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-05 10:11:33 -06:00
Zack Rusin
95829e2029 llvmpipe: fix frontface behavior again
Let's make sure the frontface is 1 for front and -1 for back.
Discussed with Roland and Jose.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-08-02 22:21:29 -04:00
Vinson Lee
0794f638ee r600g/sb: Dump correct value for CND.
Fixes "Copy-paste error" reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-08-04 13:49:17 -07:00
Jordan Justen
83486d3148 intel_fbo: remove unused intel_renderbuffer hiz functions
We are now using functions that operate on the renderbuffer
attachment to handle layered rendering.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-04 11:52:38 -07:00
Jordan Justen
7b36137642 i965 clear/draw: set renderbuffer attachment as needing depth resolve
Previously we would mark a renderbuffer as needing a depth resolve.
But, to support layered rendering, we need to look at the attachment
instead, since the attachment knows if layered rendering is being
used.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-04 11:52:38 -07:00
Jordan Justen
d44be9ed2f i965: add intel_renderbuffer_att_set_needs_depth_resolve
This function is needed to support layered rendering. With
layered rendering, the attachment stores the state of whether
layered rendering is being used.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-04 11:52:38 -07:00
Jordan Justen
814a040504 i965: add intel_miptree_set_all_slices_need_depth_resolve
This function marks all slices of a renderbuffer at a particular
level as needing a depth resolve.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-04 11:52:38 -07:00
Jordan Justen
b05b81743c i965 gen7: don't set FORCE_ZERO_RTAINDEX for layered rendering
When layered rendering is being used, we should not set
FORCE_ZERO_RTAINDEX in the clip state to allow render target
array values other than zero to be used.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04 11:52:38 -07:00
Jordan Justen
20799c11eb hsw hiz: Remove x/y offset restriction for hiz
This restriction was related to programming the offset fields
of the depth buffer packet. We now set these offsets to 0,
so this restriction should no longer be required.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04 11:52:37 -07:00
Jordan Justen
bf25ee2840 gen7 depth surface: program 3DSTATE_DEPTH_BUFFER to top of surface
Previously we would always find the 2D sub-surface of interest,
and then program the surface to this location. Now we always
program the 3DSTATE_DEPTH_BUFFER at the start of the surface.
To select the lod/slice, we utilize the lod & minimum array
element fields.

As part of this change, we must revert 1f112ccf:
Revert "i965/gen7: Align all depth miplevels to 8 in the X direction."

We also must disable brw_workaround_depthstencil_alignment for
gen >= 7. Now the hardware will handle alignment when rendering
to additional slices/LODs.

v2:
 * Merge with recent MOCS changes

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04 11:52:37 -07:00
Jordan Justen
f3c886be1f gen7 fbo: make unmatched depth/stencil configs return unsupported
For gen >= 7, we will use the lod/minimum-array-element fields to
support layered rendering. This means that we must restrict
the depth & stencil attachments to match in various more restrictive
ways. (Now the width, height, depth, LOD and layer must match.)

The reason width, height, and depth must match is that the hardware
has a single set of width, height, and depth settings (in
3DSTATE_DEPTH_BUFFER) that affect both the depth and stencil buffers.
Since these controls determine the miptree layout, they need to be
set correctly in order for lod and minimum-array-element to work
properly.  So the only way rendering can work is if the width,
height, and depth match.

In the future, if this restriction proves to be a problem (say
because some crucial client application relies on rendering to
different levels/layers of stencil and depth buffers), then we can
always work around the restriction by copying depth and/or stencil
data to a temporary buffer prior to rendering (much in the same way
that brw_workaround_depthstencil_alignment() does today for
gen < 7), but hopefully that won't be necessary.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04 11:52:37 -07:00
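The matching rule described in f3c886be1f can be sketched as a plain comparison. The struct and function names here are hypothetical stand-ins, not the driver's actual types; the fields are just the five properties the commit message lists.

```c
#include <stdbool.h>

/* Hypothetical description of one attachment (depth or stencil). */
struct att_desc {
   unsigned width;
   unsigned height;
   unsigned depth;
   unsigned lod;
   unsigned layer;
};

/* Returns true only if every property that 3DSTATE_DEPTH_BUFFER shares
 * between depth and stencil is identical; otherwise the framebuffer
 * would be reported as unsupported. */
static bool depth_stencil_match(const struct att_desc *d,
                                const struct att_desc *s)
{
   return d->width == s->width &&
          d->height == s->height &&
          d->depth == s->depth &&
          d->lod == s->lod &&
          d->layer == s->layer;
}
```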
Jordan Justen
65290a20f9 hsw hiz: Add new size restrictions for miplevels > 0
When performing hiz ops, we must ensure that the region sizes
have an 8 aligned width and 4 aligned height. We can tweak the
size for blorp hiz operations at LOD 0, but for the others we
can't. Therefore, we disable hiz for these miplevels if they
don't meet the size alignment requirements.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04 11:52:37 -07:00
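A minimal sketch of the size check described in 65290a20f9, assuming the rule is exactly as the message states it (width aligned to 8, height aligned to 4, LOD 0 exempt because its size can be adjusted). The function names are hypothetical.

```c
#include <stdbool.h>

/* True if a region of this size meets the 8-aligned width and
 * 4-aligned height requirement for hiz ops. */
static bool hiz_size_is_aligned(unsigned width, unsigned height)
{
   return (width % 8 == 0) && (height % 4 == 0);
}

/* LOD 0 can have its size tweaked, so it is always allowed in this
 * sketch; other miplevels keep hiz only if already aligned. */
static bool hiz_allowed_for_level(unsigned level,
                                  unsigned width, unsigned height)
{
   if (level == 0)
      return true;
   return hiz_size_is_aligned(width, height);
}
```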
Jordan Justen
e3a49e1ad3 gen7 blorp depth: calculate base surface width/height
This will be used in 3DSTATE_DEPTH_BUFFER in a later patch.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04 11:52:37 -07:00
Jordan Justen
a23cfb8648 gen7 depth surface: calculate minimum array element being rendered
In layered rendering this will be 0. Otherwise it will be the
selected slice.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04 11:52:37 -07:00
Jordan Justen
08ef1dde1b gen7 depth surface: calculate LOD being rendered to
This will be used in 3DSTATE_DEPTH_BUFFER in a later patch.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04 11:52:37 -07:00
Jordan Justen
bc1acaa426 gen7 depth surface: calculate depth (array size) for depth surface
This will be used in 3DSTATE_DEPTH_BUFFER in a later patch.

Note: Cube maps are treated as 2D arrays with 6 times as
many array elements as the cube map array would have.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04 11:52:37 -07:00
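The note about cube maps in bc1acaa426 amounts to a multiplication by six. The helper below is a hypothetical illustration of that arithmetic only, not the code from the patch.

```c
#include <stdbool.h>

/* Hypothetical helper: number of 2D array elements to program for a
 * depth surface. A cube map (array) is treated as a 2D array with six
 * faces per cube. */
static unsigned depth_surface_array_size(bool is_cube, unsigned array_len)
{
   return is_cube ? 6 * array_len : array_len;
}
```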
Jordan Justen
171e633294 gen7 depth surface: calculate more specific surface type
This will be used in 3DSTATE_DEPTH_BUFFER in a later patch.

Note: Cube maps are treated as 2D arrays with 6 times as
many array elements as the cube map array would have.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04 11:52:37 -07:00
Jordan Justen
0e6be2e67b i965: init global state first in brw_workaround_depthstencil_alignment
In a future pass this will allow us to exit early from this
routine to disable it for gen >= 7.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-04 11:52:37 -07:00
Ilia Mirkin
8edb79f1ef nv50: fix some h264 interlaced decoding on vp2
Some videos specify mb_adaptive_frame_field_flag instead of
field_pic_flag. This implies that the pic height needs to be halved, and
this field needs to be passed to the VP engine.

Cc: "9.2" mesa-stable@lists.freedesktop.org

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-03 12:52:04 +02:00
Zack Rusin
bff0d87668 llvmpipe: don't interpolate front face or prim id
The loop was iterating over all the fs inputs and setting them
to perspective interpolation, then after the loop we were
creating extra output slots with the correct interpolation. Instead
of injecting bogus extra outputs, just set the interpolation
on front face and prim id correctly when doing the initial scan
of fs inputs.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-08-02 20:12:53 -04:00
Zack Rusin
8e77e5e543 draw: make sure clipping works with injected outputs
Clipping would drop the extra outputs because it always
used the number of standard vertex shader outputs, without
geometry shader or extra outputs. This commit makes sure
that clipping with geometry shaders which have more outputs
than the current vertex shader, and with extra outputs, correctly
propagates the entire vertex.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-02 20:11:18 -04:00
Zack Rusin
d6b3a193d4 draw: inject frontface info into wireframe outputs
The draw module can decompose primitives into wireframe models, which
is a fancy word for 'lines'. Unfortunately, that decomposition means
that we weren't able to preserve the original front-face info which
could be derived from the original primitives (lines don't have a
'face'). To fix it, allow the draw module to inject a fake face
semantic into outputs, from which the backends can figure out the
original frontfacing info of the primitives.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-02 20:11:18 -04:00
Zack Rusin
05487ef88d draw: stop crashing with extra shader outputs
Draw sometimes injects extra shader outputs (aa points, lines or
front face). Unfortunately, most of the pipeline and llvm code
didn't handle them at all. It only worked if the number of inputs
happened to be greater than or equal to the number of shader outputs
plus the extra injected outputs. In particular, when running
the pipeline which depends on the vertex_id in the vertex_header,
things were completely broken. The patch adjusts the code to
correctly use the total number of shader outputs (the standard
ones plus the injected ones) to make it all stop crashing and
work.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-08-02 20:11:18 -04:00
Zack Rusin
2e46a1dcb3 draw: use the vertex size
Instead of using the magic number 4, use the vertex size computed
above. Doesn't change the behavior, just makes the code
a bit cleaner.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-02 20:11:18 -04:00
Zack Rusin
da1a74f673 draw/llvm: add some extra debugging output
When dumping shader outputs it's nice to have the integer
values of the outputs, in particular because some values
are integers.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-02 20:11:18 -04:00
Zack Rusin
36096af026 tgsi: detect prim id and front face usage in fs
Adding code to detect the usage of prim id and front face
semantics in fragment shaders.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-02 20:11:18 -04:00
Zack Rusin
2da1daaa4e tgsi: add ucmp to the list of opcodes
We forgot to add ucmp to the list of opcodes, so it was never
generated for ureg.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-02 19:08:39 -04:00
Zack Rusin
2d15f4746b llvmpipe: make the front-face behavior match the gallium spec
The spec says that front-face is true if the value is >0 and false
if it's <0. To make sure that we follow the spec, let's just
subtract 0.5 from our value (llvmpipe did 1 for frontface and 0
otherwise), which will get us a positive number for frontface and a
negative one for backface.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-02 15:50:16 -04:00
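The arithmetic in 2d15f4746b is small enough to sketch directly. This is an illustration under the assumptions stated in the message (raw value 1 for front, 0 otherwise; the spec treats >0 as front and <0 as back); the function names are hypothetical.

```c
#include <stdbool.h>

/* Map the old 1/0 facing value to a signed one by subtracting 0.5:
 * front becomes +0.5, back becomes -0.5. */
static float facing_to_spec_value(bool front)
{
   float raw = front ? 1.0f : 0.0f;
   return raw - 0.5f;
}

/* Interpretation on the consumer side: positive means front-facing. */
static bool spec_value_is_front(float value)
{
   return value > 0.0f;
}
```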
Matt Turner
4f83956347 Makefile.am: Remove api_exec_es* from EXTRA_FILES.
These files were removed in commits a0102154 and a8ab7e33.

Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-08-02 09:51:57 -07:00
Matt Turner
5854883312 mesa: Use MIN3 instead of two MIN2s. 2013-08-02 09:51:57 -07:00
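As a sketch of the substitution in 5854883312, a three-way minimum can replace a nested pair of two-way minimums. The macro definitions below are written for this example and are assumed, not copied from Mesa's headers.

```c
/* Two-way minimum. Arguments are evaluated more than once, so they
 * must be free of side effects. */
#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Three-way minimum: same result as MIN2(MIN2(a, b), c), written as a
 * single macro. */
#define MIN3(a, b, c) ((a) < (b) ? MIN2(a, c) : MIN2(b, c))
```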
Matt Turner
01bdad3173 mesa: Update comments to match newer specs.
Old GL 1.x specs used 'b' but newer specs use 'p'. The line immediately
above the second hunk also uses 'p'.
2013-08-02 09:51:57 -07:00
Kenneth Graunke
9375c16e72 i965: Initialize the maximum number of GS threads on Haswell.
We'll need proper values for max_gs_threads when we eventually support
geometry shaders.  Also, we initialize it for every other platform.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-02 08:24:23 -07:00
Kenneth Graunke
a1ddbd1d7c glsl: Disallow interpolation qualifiers on non-input/output variables.
Commit 2548092ad8 switched the sense of interpolation qualifier
checks in order to permit them on geometry shader in/out variables.

In doing so, it accidentally allowed interpolation qualifiers to be
applied to ordinary variables and function parameters.

Fixes a regression in Piglit's local-smooth-01.frag.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-02 08:24:23 -07:00
Kenneth Graunke
7d2423a09e glsl: Fix NULL pointer dereferences when linking fails.
Commit 7cfefe6965 introduced a check for whether linked->Type equals
GL_GEOMETRY_SHADER.  However, linked may be NULL due to an earlier error
condition.

Since the entire function after the error path is (or should be) guarded
by linked != NULL checks, we may as well just return early and remove
the checks.

Fixes crashes in 9 Piglit tests.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-02 08:24:23 -07:00
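The early-return pattern described in 7d2423a09e can be sketched as follows. The struct, the constant and the function are hypothetical stand-ins for the linker's real types; only the control flow is the point.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the linked shader and its type tag. */
#define SHADER_TYPE_GEOMETRY 1

struct linked_shader {
   int type;
};

/* Return early when linking failed (linked == NULL) instead of
 * guarding every later use with its own NULL check. */
static bool linked_is_geometry(const struct linked_shader *linked)
{
   if (linked == NULL)
      return false;

   return linked->type == SHADER_TYPE_GEOMETRY;
}
```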
Andreas Boll
9d569fed8d docs: Document UVD (2.2 and 3.0) video decoding support in mesa 9.2
Cc: "9.2" mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-08-02 17:14:08 +02:00
Andreas Boll
ec4a6a94b1 docs: Document that i965 Gen6+ requires Kernel 3.6 or later
Cc: "9.2" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-02 17:13:40 +02:00
Timothy Arceri
37f9e0e84f docs: Update some out of date sourcetree information
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-08-02 16:22:03 +02:00
Christoph Bumiller
957a2014f9 r600g: honour semantic index in fragment color exports
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2013-08-02 13:32:49 +02:00
Andreas Boll
38903db439 docs: Add md5sums to 9.1.5 release notes 2013-08-02 09:58:34 +02:00
Andreas Boll
7eaaf62434 docs: Fix a typo in the 9.1.6 release notes 2013-08-02 09:47:43 +02:00
Topi Pohjolainen
f5947c2bc7 i965: enable image external sampling for imported dma-buffers
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-02 08:56:03 +03:00
Topi Pohjolainen
20de7f9f22 egl/dri2: support for creating images out of dma buffers
v2:
   - upon success close the given file descriptors

v3:
   - use specific entry for dma buffers instead of the basic for
     primes, and enable the extension based on the availability
     of the hook

v4 (Chad):
   - use ARRAY_SIZE
   - improve the comment about the number of file descriptors
   - in case of invalid format report EGL_BAD_ATTRIBUTE instead
     of EGL_BAD_MATCH
   - take into account specific error set by the driver.

v5:
   - fix error handling

v6 (Chad):
   - fix invalid plane count checking

v7 (Chad):
   - fix indentation and reset loop counter before checking
     for excess attributes

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-02 08:56:03 +03:00
Topi Pohjolainen
3a52cd351a intel: restrict dma-buf-import images to external sampling only
Memory originating outside the Mesa stack is meant to be for reading
only. In addition, the restrictions imposed by the image external
extension should apply. For example, users shouldn't be allowed
to generate mip-trees based on these images.

v2 (Chad): document using full extension names, fix the comment
           style itself and emit description of error

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-02 08:56:03 +03:00
Topi Pohjolainen
0de013b619 egl: definitions for EXT_image_dma_buf_import
As specified in:

http://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_image_dma_buf_import.txt

Checking for valid fourcc values is left to the drivers, avoiding
a dependency on drm header files here.

v2: enforce EGL_NO_CONTEXT

v3: declare the extension as EGL (not GLES)

v4: do not update eglext.h manually but rely on update from
    Khronos instead

v5: (Eric) report invalid context as EGL_BAD_PARAMETER instead of as
    EGL_BAD_CONTEXT

v6: (Chad) fix the checking for valid hints. Before all values were
    rejected.

v7: (Chad) comment style change from

    /**
     * Multi-
     * line

    into

    /* Multi-
     * line

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-02 08:56:03 +03:00
Topi Pohjolainen
674dedc87a dri: propagate extra dma_buf import attributes to the drivers
v2: do not break ABI, but instead introduce new entry point for
    dma buffers and bump up the dri-interface version to eight

v3 (Chad): allow the hook to specify an error originating from the
           driver. For now only unsupported format is considered.
           I thought about rejecting the hints also, as they are
           addressing only YUV sampling, which is not supported at
           the moment, but then decided against it as the spec does
           not say one way or the other.

v4 (Eric, Chad): restrict to rgb formatted only

v5: rebased on top of i915/i965 split

v6 (Chad): document using full extension name

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-02 08:56:03 +03:00
Topi Pohjolainen
ee844b6660 intel: set dri image dimensions even when creating out of primes
Otherwise 'intel_set_texture_image_region()' won't have enough
details to work with.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-02 08:56:03 +03:00
Topi Pohjolainen
904587ac3a intel: refactor planar format lookup
v2 (Eric): refactor both occurrences, not just one

v3 (Chad): replace 0 by NULL

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-02 08:56:03 +03:00
Topi Pohjolainen
55162e2164 intel: do not create renderbuffers out of planar images
v2 (Chad): emit 'GL_INVALID_OPERATION' and description of error

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-02 08:56:03 +03:00
Topi Pohjolainen
e8568a0803 intel: allow packed prime buffers to be treated normally
v2:
   - fix earlier rebase error breaking bisect
     (loaderPriv -> loaderPrivate)

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-02 08:56:02 +03:00
Paul Berry
34c55b5925 main: Warn that geometry shader support is experimental.
Geometry shader support in the Mesa front end is still fairly
preliminary.  Many features are untested, and the following things are
known not to work:

- The gl_in interface block
- The gl_ClipDistance input
- Transform feedback of geometry shader outputs
- Constants that are new in GLSL 1.50 (e.g. gl_MaxGeometryInputComponents)

This isn't a problem, since no back-end drivers currently enable
geometry shaders.  However, to make sure no one gets the wrong
impression, emit a nasty warning to let the user know that geometry
shader support isn't complete.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-01 20:24:49 -07:00
Paul Berry
7cfefe6965 glsl: Implement rules for geometry shader input sizes.
Section 4.3.8.1 (Input Layout Qualifiers) of the GLSL 1.50 spec
contains some tricky rules for how the sizes of geometry shader input
arrays are related to the input layout specification.  In essence,
those rules boil down to the following:

- If an input array declaration does not specify a size, and it
  follows an input layout declaration, it is sized according to the
  input layout.

- If an input layout declaration follows an input array declaration
  that didn't specify a size, the input array declaration is given a
  size at the time the input layout declaration appears.

- All input layout declarations and input array sizes must ultimately
  match.  Inconsistencies are reported as soon as they are detected,
  at compile time if the inconsistency is within one compilation unit,
  otherwise at link time.

- At least one compilation unit must contain an input layout
  declaration.

(Note: the geom_array_resize_visitor class was contributed by Bryan
Cain <bryancain3@gmail.com>.)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:24:39 -07:00
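The sizing rules listed in 7cfefe6965 can be sketched as a lookup plus a consistency check. The per-primitive vertex counts follow the GLSL 1.50 input layout qualifiers; the enum and function names are hypothetical and do not reflect the compiler's actual data structures.

```c
/* Hypothetical enum for the GLSL 1.50 geometry shader input layouts. */
enum gs_input_prim {
   GS_POINTS,
   GS_LINES,
   GS_LINES_ADJACENCY,
   GS_TRIANGLES,
   GS_TRIANGLES_ADJACENCY
};

/* Number of vertices each input primitive type supplies. */
static unsigned vertices_per_input_prim(enum gs_input_prim prim)
{
   switch (prim) {
   case GS_POINTS:              return 1;
   case GS_LINES:               return 2;
   case GS_LINES_ADJACENCY:     return 4;
   case GS_TRIANGLES:           return 3;
   case GS_TRIANGLES_ADJACENCY: return 6;
   }
   return 0;
}

/* declared_size == 0 stands for an unsized input array, which takes
 * its size from the layout. A sized array must match the layout;
 * 0 is returned to signal a mismatch (an error in the real compiler). */
static unsigned resolve_input_array_size(unsigned declared_size,
                                         enum gs_input_prim prim)
{
   unsigned layout_size = vertices_per_input_prim(prim);

   if (declared_size == 0)
      return layout_size;
   return declared_size == layout_size ? layout_size : 0;
}
```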
Paul Berry
20ae8e0c91 glsl: Allow geometry shader input instance arrays to be unsized.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-01 20:24:32 -07:00
Paul Berry
c1f1d8522c glsl: Permit non-ubo input interface arrays to use non-const indexing.
From the GLSL ES 3.00 spec:

    "All indexes used to index a uniform block array must be constant
    integral expressions."

Similar text exists in GLSL specs since 1.50.

When we implemented this, the only type of interface block supported
by Mesa was uniform blocks, so we required all indexes used to index
any interface block to be constant integral expressions.

Now that we are adding interface block support for GLSL 1.50, we need
a more specific check.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:24:27 -07:00
Eric Anholt
6065a87bce glsl: Cross-validate GS layout qualifiers while intrastage linking.
This gets piglit's geometry-basic test running.

TODO: Still need to validate that the GS layout qualifiers don't get used
in places they shouldn't (like an interface block, or a particular shader
input or output)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:24:23 -07:00
Eric Anholt
010a6a8fd3 glsl: Export the compiler's GS layout qualifiers to the gl_shader.
Next step is to validate them at link time.

v2 (Paul Berry <stereotype441@gmail.com>): Don't attempt to export the
layout qualifiers in the event of a compile error, since some of them
are set up by ast_to_hir(), and ast_to_hir() isn't guaranteed to have
run in the event of a compile error.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v3 (Paul Berry <stereotype441@gmail.com>): Use PRIM_UNKNOWN to
represent "not set in this shader".

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-01 20:23:43 -07:00
Eric Anholt
624b7bac76 glsl: Parse the GLSL 1.50 GS layout qualifiers.
Limited semantic checking (compatibility between declarations, checking
that they're in the right shader target, etc.) is done.

v2: Remove stray debug printfs.

v3 (Paul Berry <stereotype441@gmail.com>): Process input layout
qualifiers at ast_to_hir time rather than at parse time, since certain
error conditions depend on the relative ordering between input layout
qualifiers, declarations, and calls to .length().

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:23:33 -07:00
Eric Anholt
f2e14238a7 glsl: Make sure that we don't put too many bitfields in ast_type_qualifier.
We do some tests of qualifiers using a union containing an int and the
struct full of bitfields, so make sure the bitfields don't spill
outside the int.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:23:28 -07:00
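The guard described in f2e14238a7 can be sketched with a compile-time assertion. The flag names and the union layout below are simplified assumptions for illustration, not the real ast_type_qualifier definition.

```c
#include <assert.h>

/* Hypothetical, much-reduced set of qualifier bitfields. */
struct qualifier_flags {
   unsigned invariant:1;
   unsigned constant:1;
   unsigned in:1;
   unsigned out:1;
};

/* The flags are tested in bulk through the integer member. */
union qualifier {
   struct qualifier_flags q;
   unsigned i;
};

/* Fails the build if new bitfields ever spill past the integer. */
static_assert(sizeof(struct qualifier_flags) <= sizeof(unsigned),
              "qualifier bitfields must fit in the integer member");

/* Bulk test: any flag set makes the integer view non-zero. */
static int has_any_qualifier(const union qualifier *u)
{
   return u->i != 0;
}
```

Without the assertion, a bitfield added beyond the width of the integer would silently be ignored by the bulk test.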
Paul Berry
e62ca57199 main: Fix delete_shader_cb() for geometry shaders
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:23:25 -07:00
Fabian Bieler
bd85ba08bc glsl/linker: Fail to link geometry shader without vertex shader.
From section 2.15 (Geometry Shaders) of the OpenGL 3.2 spec:

    A program object that includes a geometry shader must also include
    a vertex shader; otherwise a link error will occur.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:23:21 -07:00
Fabian Bieler
8cdbe8394e mesa: Validate the drawing primitive against the geometry shader input primitive type.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:23:19 -07:00
Fabian Bieler
39ca58192b mesa/shaderapi: Allow 0 GEOMETRY_VERTICES_OUT.
ARB_geometry_shader4 spec Errors:
"The error INVALID_VALUE is generated by ProgramParameteriARB if <pname>
is GEOMETRY_VERTICES_OUT_ARB and <value> is negative."

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:23:16 -07:00
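A minimal sketch of the check implied by the spec text quoted in 39ca58192b: only negative values are rejected, so zero is accepted. The function name is hypothetical, and any upper limit on the value is deliberately not modeled here.

```c
#include <stdbool.h>

/* Hypothetical validator for GEOMETRY_VERTICES_OUT: the quoted error
 * condition covers negative values only, so 0 passes. */
static bool geometry_vertices_out_is_valid(int value)
{
   return value >= 0;
}
```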
Paul Berry
72219acf6b glsl: Properly pack GS output varyings
In geometry shaders, outputs are consumed at the time of a call to
EmitVertex() (as opposed to all other shader types, where outputs are
consumed when the shader exits).  Therefore, when packing geometry
shader output varyings using lower_packed_varyings, we need to do the
packing at the time of the EmitVertex() call.

This patch accomplishes that by adding a new visitor class,
lower_packed_varyings_gs_splicer, which is responsible for splicing
the varying packing code into place wherever EmitVertex() is found.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-01 20:23:12 -07:00
Paul Berry
f2ecc84826 glsl: Modify varying packing to use a temporary exec_list.
This patch modifies lower_packed_varyings to store the packing code it
generates in a temporary exec_list, and then splice that list into the
shader's main() function when it's done.  This paves the way for
supporting geometry shader outputs, where we'll have to splice a clone
of the packing code before every call to EmitVertex().

As a side benefit, varying packing code is now emitted in the same
order for inputs and outputs; this should make debug output a little
easier to read.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-01 20:23:08 -07:00
Paul Berry
3b0cf7027d glsl/linker: Properly pack GS input varyings.
Since geometry shader inputs are arrays (where the array index
indicates which vertex is being examined), varying packing needs to
treat them differently.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-01 20:22:59 -07:00
Paul Berry
40d469f9ac glsl/linker: Properly error check VS-GS linkage.
From section 4.3.4 (Inputs) of the GLSL 1.50 spec:

    Geometry shader input variables get the per-vertex values written
    out by vertex shader output variables of the same names. Since a
    geometry shader operates on a set of vertices, each input varying
    variable (or input block, see interface blocks below) needs to be
    declared as an array.

Therefore, the element type of each geometry shader input array should
match the type of the corresponding vertex shader output.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:22:55 -07:00
Paul Berry
05234e707b glsl: Require geometry shader inputs to be arrays.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:22:48 -07:00
Paul Berry
fc5fa56c86 mesa: Copy linked program data for GS.
The documentation for gl_shader_program.Geom and gl_geometry_program
says that the former is copied to the latter at link time, but this
wasn't happening.  This patch causes _mesa_ir_link_shader() to perform
the copy, and updates the comment accordingly.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:22:07 -07:00
Paul Berry
13022c9c5f mesa: Refactor copying of linked program data.
This patch creates a single function to copy the UsesClipDistance
flag from gl_shader_program.Vert to gl_vertex_program.  Previously
this logic was duplicated in the i965-specific function
brw_link_shader() and the core mesa function _mesa_ir_link_shader().

This logic will have to be expanded to support geometry shaders, and I
don't want to have to update it in two separate places.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:21:26 -07:00
Bryan Cain
2548092ad8 glsl: support compilation of geometry shaders
This commit adds all of the parsing and semantics for GLSL 150 style
geometry shaders.

v2 (Paul Berry <stereotype441@gmail.com>): Add a few missing calls to
get_pipeline_stage().  Fix some signed/unsigned comparison warnings.
Fix handling of NULL consumer in assign_varying_locations().

v3 (Bryan Cain <bryancain3@gmail.com>): fix indexing order of 2D
arrays.  Also, allow interpolation qualifiers in geometry shaders.

v4 (Paul Berry <stereotype441@gmail.com>): Eliminate
get_pipeline_stage()--it is no longer needed thanks to 030ca23 (mesa:
renumber shader indices according to their placement in pipeline).
Remove 2D stuff.  Move vertices_per_prim() to ir.h, so that it will be
accessible from outside the linker.  Remove
inject_num_vertices_visitor.  Rework for GLSL 1.50.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

v5 (Paul Berry <stereotype441@gmail.com>): Split out
do_set_program_inouts() argument refactoring to a separate patch.
Move geom_array_resizing_visitor to later in the series.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:20:45 -07:00
Paul Berry
844bd71736 glsl/linker: Make separate allocations to track vertex and fragment shaders.
There's no reason to be clever about this.  By making separate
allocations for vertex and fragment shaders, we'll allow geometry
shaders to be added without introducing any complication.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:20:41 -07:00
Bryan Cain
ff52377183 glsl: add builtins for geometry shaders.
v2 (Paul Berry <stereotype441@gmail.com>): Account for rework of
builtin_variables.cpp.  Use INTERP_QUALIFIER_FLAT for gl_PrimitiveID
so that it will obey provoking vertex conventions.  Convert to GLSL
1.50 style geometry shaders.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

v3 (Paul Berry <stereotype441@gmail.com>): Be less obscure about
setting interpolation field of gl_Primitive variables.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:20:36 -07:00
Bryan Cain
ae6eba3e32 glsl: add ir_emit_vertex and ir_end_primitive instruction types
These correspond to the EmitVertex and EndPrimitive functions in GLSL.

v2 (Paul Berry <stereotype441@gmail.com>): Add stub implementations of
new pure visitor functions to i965's vec4_visitor and fs_visitor
classes.

v3 (Paul Berry <stereotype441@gmail.com>): Rename classes to be more
consistent with the names used in the GL spec.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:20:16 -07:00
Bryan Cain
c6be77ee6f mesa: account for geometry shader texture fetches in update_texture_state
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:20:14 -07:00
Paul Berry
b272a01879 main: Allow for the possibility of GL 3.2 without ARB_geometry_shader4.
Previously, we assumed that the only way Mesa would expose geometry
shader support was via the ARB_geometry_shader4 extension.  But this
extension has some extra complications over GL 3.2 (interactions with
compatibility-only features, and link-time initialization of the
constant gl_VerticesIn).  So we want to allow for the possibility of
supporting GL 3.2 (with GLSL 1.50 style geometry shaders) even if
ctx->Extensions.ARB_geometry_shader4 is false.

This patch adds a new function, _mesa_has_geometry_shaders(), which
returns true if either ARB_geometry_shader4 is supported or the GL
version is at least 3.2 desktop.  Since compute_version() only enables
GL 3.2 functionality when GLSL 1.50 support is present, a sufficient
way for a back-end to advertise geometry shader support is to set
ctx->Const.GLSLVersion >= 150.
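
The check described above can be sketched roughly as follows.  This is a
standalone model, not the Mesa source: struct ctx_model, enum api_kind and
has_geometry_shaders() are hypothetical names, and encoding the version as
major * 10 + minor is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the context state involved. */
enum api_kind { API_DESKTOP_GL, API_GLES };

struct ctx_model {
   enum api_kind api;
   unsigned version;           /* assumed encoding: 32 means GL 3.2 */
   bool arb_geometry_shader4;  /* extension flag */
};

/* True if the extension is supported, or if this is desktop GL 3.2+. */
static bool
has_geometry_shaders(const struct ctx_model *ctx)
{
   assert(ctx != NULL);
   return ctx->arb_geometry_shader4 ||
          (ctx->api == API_DESKTOP_GL && ctx->version >= 32);
}
```

With this model, a desktop 3.2 context without the extension reports
geometry shader support, while a desktop 3.1 context without it does not.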

v2: Remove unnecessary ctx->Const.GeometryShaders150 constant.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:19:57 -07:00
Paul Berry
56dcc46f0e main: Fix geometry shader error messages (missing right paren)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:19:55 -07:00
Paul Berry
37270715ff glsl: Add EXT_texture_array support for geometry shaders.
We can't just use a ".glsl" file since the Lod variants are only
available in vertex and geometry shaders, while the bias variants are
only available in the fragment shader.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-01 20:19:51 -07:00
Paul Berry
6a2baf3a06 glsl/linker: Make update_array_sizes apply to just uniforms.
Commit 586b4b5 (glsl: Also update implicit sizes of varyings at link
time) extended update_array_sizes() to apply to both uniforms and
shader ins/outs.  However, doing so creates problems for geometry
shaders, because update_array_sizes() assumes that variables with
matching names in different parts of the pipeline should have the same
sizes.  With the addition of geometry shaders, this is no longer true
(e.g. both vertex and geometry shaders have a gl_ClipDistance output
variable, but there's no reason these variables should have the same
sizes).

The original reason for commit 586b4b5 (avoid problems with
gl_TexCoord being 0 length) has since been addressed by commit 6f53921
(linker: Ensure that unsized arrays have a size after linking).  So go
ahead and switch update_array_sizes() back to only acting on uniforms.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:19:47 -07:00
Paul Berry
8fc41df549 glsl: Modify ir_set_program_inouts to handle geometry shaders.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-01 20:19:43 -07:00
Paul Berry
cea946e39d glsl: In ir_set_program_inouts, handle indexing outside array/matrix bounds.
According to GLSL, indexing into an array or matrix with an
out-of-range constant results in a compile error.  However, indexing
with an out-of-range value that isn't constant merely results in
undefined results.

Since optimization passes (e.g. loop unrolling) can convert
non-constant array indices into constant array indices, it's possible
that ir_set_program_inouts will encounter a constant array index that
is out of range; if this happens, just mark the whole array as used.
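
A minimal sketch of that fallback, assuming a 64-bit "slots used" bitmask
and one slot per array element.  The helper names slot_range_mask() and
mark_indexed() are hypothetical and are not taken from
ir_set_program_inouts.

```c
#include <assert.h>
#include <stdint.h>

/* Bitmask with `count` consecutive bits set, starting at bit `first`. */
static uint64_t
slot_range_mask(unsigned first, unsigned count)
{
   assert(count >= 1 && first + count <= 64);
   uint64_t bits = (count == 64) ? ~UINT64_C(0)
                                 : ((UINT64_C(1) << count) - 1);
   return bits << first;
}

/* Mark one element if the constant index is in range; otherwise give up
 * on precision and mark the whole array as used. */
static uint64_t
mark_indexed(uint64_t used, unsigned base_slot, unsigned array_len, int index)
{
   if (index >= 0 && (unsigned) index < array_len)
      return used | slot_range_mask(base_slot + (unsigned) index, 1);

   return used | slot_range_mask(base_slot, array_len);
}
```

For a 3-element array starting at slot 4, index 1 marks only slot 5, while
index 5 (out of range) marks slots 4 through 6.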

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:19:39 -07:00
Paul Berry
1c789d8087 glsl: Fallback gracefully if ir_set_program_inouts sees unexpected indexing.
The code in ir_set_program_inouts that marks just a portion of a
variable as used (rather than the whole variable) only works on a few
kinds of indexing operations:

- Indexing into matrices
- Indexing into arrays of matrices, vectors, or scalars.

Fortunately these are the only kinds of indexing operations that we
expect to see; everything else is either handled by a
previously-executed lowering pass or prohibited by GLSL.

However, that could conceivably change in the future (the GLSL rules
might change, or we might modify the lowering passes).  To avoid
mysterious bugs in the future, let's have ir_set_program_inouts report
an assertion failure if it ever encounters an unexpected kind of
indexing operation (and in release builds, fall back to just marking
the whole variable as used).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:19:35 -07:00
Paul Berry
d5a333a06f glsl: Extract marking functions from ir_set_program_inouts.
This patch extracts the functions mark_whole_variable() and
try_mark_partial_variable() from the ir_set_program_inouts visitor
functions.  This will make the code easier to follow when we add
geometry shader support.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:19:31 -07:00
Paul Berry
0b0dc03a31 glsl: Use count_attribute_slots() in ir_set_program_inouts.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:19:26 -07:00
Paul Berry
7d95d2b4c9 glsl: Expand count_attribute_slots() to cover structs.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:19:22 -07:00
Paul Berry
0026ad4994 Move count_attribute_slots() out of the linker and into glsl_type.
Our previous justification for leaving this function out of glsl_type
was that it implemented counting rules that were specific to GLSL
1.50.  However, these counting rules also describe the number of
varying slots that Mesa will assign to a varying in the absence of
varying packing.  That's useful to be able to compute from outside of
the linker code (a future patch will use it from
ir_set_program_inouts.cpp).  So go ahead and move it to glsl_type.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:19:02 -07:00
Paul Berry
906eff09e3 glsl: Change do_set_program_inouts' is_fragment_shader arg to shader_type.
This will allow us to add geometry shader support without having to
add another boolean argument.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:18:42 -07:00
Roland Scheidegger
e7ed70a52e gallivm: obey clarified shift behavior
LLVM shifts are undefined for shift counts exceeding (or matching) bit width,
so we need to apply a mask for the TGSI shift instructions.

v2: only use mask for the tgsi shift instructions, not for the build shift
helpers. None of the internal callers need this behavior, and while llvm can
optimize away the masking for constants there are legitimate cases where it
might not be able to do so even if we know that shift count must be smaller
than type width (currently all such callers do not use the build shift
helpers).

Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-02 03:49:57 +02:00
Roland Scheidegger
7a72bef47e tgsi: obey clarified shift behavior
C shifts are undefined for shift counts exceeding (or matching) bit width,
so we need to apply a mask.  (On x86 it would usually work anyway, since
integer-domain shifts mask the count, unless some auto-vectorizer comes
along, because SIMD-domain shifts do not mask the shift count.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-02 03:49:57 +02:00
Roland Scheidegger
606132b4de gallium: clarify shift behavior with shift count >= 32
Previously, nothing was said about what happens with shift counts exceeding
bit width of the values to shift. In theory 3 behaviors are possible:
1) undefined (classic C definition)
2) just shift out all bits (so result is zero, or -1 potentially for ashr)
3) mask the shift count to bit width - 1
APIs either require 3) or are ok with 1). In particular, GLSL (as well as a
couple uninteresting legacy GL extensions) is happy with undefined, whereas
both OpenCL and d3d10 require 3). Consequently, most hw also implements 3).
So, for simplicity we just specify that 3) is required rather than saying
undefined and then needing state trackers to work around it.
Also while here specify shift count as a vector, not scalar. As far as I
can tell this was a doc bug, neither state trackers nor drivers used scalar
shift count.
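
Behavior 3) can be illustrated in plain C as below.  This is only a sketch
of the masking rule for 32-bit values; mask_shift_count(), shl_masked() and
ushr_masked() are hypothetical helpers, not gallium functions.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce a shift count to the range [0, bit_width - 1]. */
static uint32_t
mask_shift_count(uint32_t count, uint32_t bit_width)
{
   /* The mask trick only works for power-of-two widths. */
   assert(bit_width != 0 && (bit_width & (bit_width - 1)) == 0);
   return count & (bit_width - 1);
}

/* Left shift with the count masked, so any count is well defined. */
static uint32_t
shl_masked(uint32_t value, uint32_t count)
{
   return value << mask_shift_count(count, 32);
}

/* Logical right shift with the count masked. */
static uint32_t
ushr_masked(uint32_t value, uint32_t count)
{
   return value >> mask_shift_count(count, 32);
}
```

Under this rule a shift by 32 behaves like a shift by 0, and a shift by 33
like a shift by 1, rather than being undefined or producing zero.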

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-02 03:49:57 +02:00
Carl Worth
7f2f63409a docs: Add md5sums to 9.1.6 release notes 2013-08-01 15:45:04 -07:00
Carl Worth
964b89e42a docs: Import 9.1.6 release notes, add news item. 2013-08-01 15:12:25 -07:00
Kenneth Graunke
fcb4ab6db1 i965: Delete the BATCH_LOCALS macro.
This hasn't done anything in a long time, and it's only used in a couple
places...which means we couldn't use it without doing a bunch of work
anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-01 10:38:20 -07:00
Corey Richardson
abdbd02e59 Correct clamping of TEXTURE_{MAX, BASE}_LEVEL
Previously, if TEXTURE_IMMUTABLE_FORMAT was TRUE, the levels were allowed to
be set like usual, but ARB_texture_storage states:

> if TEXTURE_IMMUTABLE_FORMAT is TRUE, then level_base is clamped to the range
> [0, <levels> - 1] and level_max is then clamped to the range [level_base,
> <levels> - 1], where <levels> is the parameter passed the call to
> TexStorage* for the texture object
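
The two-step clamp in the quoted text can be sketched as follows.  The
function names are hypothetical and this is not the Mesa implementation; it
only mirrors the order of operations the extension text describes.

```c
#include <assert.h>

static int
clamp_int(int value, int lo, int hi)
{
   assert(lo <= hi);
   return value < lo ? lo : (value > hi ? hi : value);
}

/* Apply the immutable-format rule: clamp level_base first, then clamp
 * level_max against the already-clamped level_base. */
static void
clamp_immutable_levels(int levels, int *level_base, int *level_max)
{
   assert(levels >= 1 && level_base != 0 && level_max != 0);
   *level_base = clamp_int(*level_base, 0, levels - 1);
   *level_max = clamp_int(*level_max, *level_base, levels - 1);
}
```

For a texture created with 4 levels, a requested base level of 7 and max
level of 2 both end up as 3.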

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Corey Richardson <corey@octayn.net>
2013-08-01 10:23:39 -07:00
Corey Richardson
986ae4306c De-tab and align comments in gl_texture_object
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Corey Richardson <corey@octayn.net>
2013-08-01 10:23:39 -07:00
Chris Forbes
3eef7fec67 i965 Gen4/5: clip: Don't mangle flat varyings
This patch ensures that integers will pass through unscathed.  Doing
(useless) computations on them is risky, especially when their bit
patterns correspond to values like inf or nan.

[V1-2]: Signed-off-by: Olivier Galibert <galibert at pobox.com>
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:59:03 +12:00
Chris Forbes
3f6fb5e1dd i965 Gen4/5: clip: Add support for noperspective varyings
Adds support for interpolating noperspective varyings linearly in screen
space when clipping.

Based on Olivier Galibert's patch from last year:
http://lists.freedesktop.org/archives/mesa-dev/2012-July/024341.html

At this point all -fixed and -vertex interpolation tests work.

V5: Add brw_clip_compile.has_noperspective_shading rather than another
key flag.
V6: Real bools.

[V1-2]: Signed-off-by: Olivier Galibert <galibert at pobox.com>
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:58:59 +12:00
Chris Forbes
f0feb32eaf i965 Gen4/5: clip: correctly handle flat varyings
Previously we only gave special treatment to the builtin color varyings.
This patch adds support for arbitrary flat-shaded varyings, which is
required for GLSL 1.30.

Based on Olivier Galibert's patch from last year:
http://lists.freedesktop.org/archives/mesa-dev/2012-July/024340.html

V5: Move key.do_flat_shading to brw_clip_compile.has_flat_shading
V6: Real bools.

[V1-2]: Signed-off-by: Olivier Galibert <galibert at pobox.com>
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:58:56 +12:00
Chris Forbes
21922cb70d i965 Gen4/5: Generalize SF interpolation setup for GLSL1.3
Previously the SF only handled the builtin color varying specially.
This patch generalizes that support to cover user-defined varyings,
driven by the interpolation mode array set up alongside the VUE map.

Based on the following patches from Olivier Galibert:
- http://lists.freedesktop.org/archives/mesa-dev/2012-July/024335.html
- http://lists.freedesktop.org/archives/mesa-dev/2012-July/024339.html

With this patch, all the GLSL 1.3 interpolation tests that do not clip
(spec/glsl-1.30/execution/interpolation/*-none.shader_test) pass.

V5: Move key.do_flat_shading to brw_sf_compile.has_flat_shading; drop
vestigial hunks.
V6: Real bools.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:58:52 +12:00
Chris Forbes
3b5fe704e1 i965: Add helper functions for interpolation map
V6: real bools

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:58:49 +12:00
Chris Forbes
9f51499d28 i965 Gen4/5: Introduce 'interpolation map' alongside the VUE map
The interpolation map (in brw->interpolation_mode) is a new auxiliary
structure alongside the post-GS VUE map, which describes the
interpolation modes for each VUE slot, for use by the clip and SF
stages.

This patch introduces a new state atom to compute the interpolation map,
and adjusts the program keys for the clip and SF stages, but it is not
actually used yet.

[V1-2]: Signed-off-by: Olivier Galibert <galibert at pobox.com>

V3: Updated for vue_map changes, intel -> brw merge, etc. (Chris Forbes)
V4: Compute interpolation map as a new state atom rather than tacking it
on the front of the clip setup
V5: Rework commit message, make interpolation_mode_map a struct.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-01 20:58:19 +12:00
Carl Worth
c6f3036179 get-pick-list: Allow for non-whitespace between "CC:" and "mesa-stable"
We recently proposed a new syntax for stable-patch nominations such as:

	CC: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>

and this has already appeared in the wild.

So we extend the regular expression to pick this up as well.
2013-07-31 15:49:48 -07:00
Samuel Pitoiset
ef6d5ee9f3 nvc0: properly align NVE4_COMPUTE_MP_TEMP_SIZE
MP_TEMP_SIZE must be aligned to 0x8000, while TEMP_SIZE on NVE4_3D
must be aligned to 0x20000, so perform both alignments to be sure
we allocate enough space (actually the bo will most likely use 128
KiB pages and not aligning to that would be a waste anyway).
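
A small sketch of the double alignment, using only the two constants named
above.  align_pow2() and temp_size_aligned() are hypothetical helpers, not
nvc0 driver functions.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of a power-of-two alignment. */
static uint64_t
align_pow2(uint64_t value, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Apply both alignments from the description, one after the other. */
static uint64_t
temp_size_aligned(uint64_t size)
{
   return align_pow2(align_pow2(size, 0x8000), 0x20000);
}
```

Since 0x20000 is a multiple of 0x8000, the second step alone already
satisfies the first; the sketch keeps both to mirror the description.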

Cc: "9.2" mesa-stable@lists.freedesktop.org
2013-07-31 21:40:38 +02:00
Laurent Carlier
5ffa28df4e mesa/program: remove useless YYID
This fixes the build with Bison 3.0. Also works with Bison 2.7.1.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-31 11:57:32 -07:00
Kenneth Graunke
6d2a9220b8 mesa/program: Switch from the deprecated YYLEX_PARAM to %lex-param.
YYLEX_PARAM is no longer supported as of Bison 3.0.  Instead, the Bison
developers recommend using %lex-param.

%lex-param takes a type and variable name, similar to %parse-param,
so you can't pass an arbitrary expression like state->scanner.  But Flex
insists on passing the actual scanner object, not an arbitrary object
like state.

To solve this, the parser defines a wrapper lex() function which accepts
"state," and calls Flex's lex() function with state->scanner.

Fixes the build with Bison 3.0.  Also works with Bison 2.7.1.
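
The wrapper idea can be modelled in plain C as below.  Everything here is a
toy stand-in: toy_scanner, toy_flex_lex() and parser_lex() are hypothetical
names, and the real Flex/Bison generated signatures differ.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the scanner object that Flex wants to receive. */
struct toy_scanner {
   const int *tokens;   /* zero-terminated token stream */
   int pos;
};

/* The parser only knows about "state", which owns the scanner. */
struct parse_state {
   struct toy_scanner *scanner;
};

/* Stand-in for the Flex-generated lexer: takes the scanner itself. */
static int
toy_flex_lex(int *lval, struct toy_scanner *scanner)
{
   int tok = scanner->tokens[scanner->pos];
   if (tok != 0)
      scanner->pos++;
   *lval = tok;
   return tok;
}

/* Wrapper with the signature the parser can call: it accepts "state"
 * and forwards state->scanner to the real lexer. */
static int
parser_lex(int *lval, struct parse_state *state)
{
   assert(state != NULL && state->scanner != NULL);
   return toy_flex_lex(lval, state->scanner);
}
```

The parser is then told to pass the state object, and only the wrapper
needs to know how to reach the scanner inside it.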

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67354
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>
Cc: "9.2" mesa-stable@lists.freedesktop.org
2013-07-31 11:52:13 -07:00
Kenneth Graunke
de917b4c4c mesa/program: Change the program parser's namespace.
Bison 3.0 removes the YYLEX_PARAM macro.  In preparation for handling
this using %lex-param, the parser needs a wrapper function for the
actual Flex lex() function.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67354
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>
Cc: "9.2" mesa-stable@lists.freedesktop.org
2013-07-31 11:52:06 -07:00
Kenneth Graunke
f043381334 glsl: Switch from the deprecated YYLEX_PARAM to %lex-param.
YYLEX_PARAM is no longer supported as of Bison 3.0.  Instead, the Bison
developers recommend using %lex-param.

%lex-param takes a type and variable name, similar to %parse-param,
so you can't pass an arbitrary expression like state->scanner.  But Flex
insists on passing the actual scanner object, not an arbitrary object
like state.

To solve this, the parser defines a wrapper lex() function which accepts
"state," and calls Flex's lex() function with state->scanner.

Fixes the build with Bison 3.0.  Also works with Bison 2.7.1.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67354
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>
Cc: "9.2" mesa-stable@lists.freedesktop.org
2013-07-31 11:51:57 -07:00
Kenneth Graunke
eb7c8c7fb6 glsl: Change the lexer's namespace.
Bison 3.0 removes the YYLEX_PARAM macro.  In preparation for handling
this using %lex-param, the parser needs a wrapper function for the
actual Flex lex() function.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67354
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>
Cc: "9.2" mesa-stable@lists.freedesktop.org
2013-07-31 11:49:30 -07:00
Eric Anholt
eed0a80137 egl: Restore "bogus" DRI2 invalidate event code.
I had removed it in commit 1e7776ca2b
because it was obviously wrong -- why do we care whether the server is a
version that emits events, if we're not watching for the server's events,
anyway?  And why would you only invalidate on a server that emits
invalidate events, when the comment said to emit invalidates if the server
*doesn't*?  Only, I missed that we otherwise don't flag that our buffers
might have changed at swap time at all, so the driver was only checking
for new buffers when triggered by the Viewport hack.  Of course you don't
expect Viewport to be called after a swap.

So, this is effectively a revert of the previous commit, except that I
dropped the check for only emitting invalidates on a new server -- we
*always* need to invalidate if we're doing a SwapBuffers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63435
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 and 9.2" <mesa-stable@lists.freedesktop.org>
2013-07-31 10:43:35 -07:00
Roland Scheidegger
b1ed7202df gallivm: use nearest rounding for float->unorm24 conversion
Previously we were using truncation, which gives the correct result
only for numbers in [0.5-1.0] range (because there are no mantissa bits
to do any rounding there).
This is frequently hit (and probably only used there) when converting
fragment depth to depth format (d24s8 etc.) or otherwise dealing with
depth format.
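
The difference between the two rounding modes can be seen in a scalar
sketch like the one below.  It is not the gallivm code path (which builds
vector IR); the function names are hypothetical, and the conversion goes
through double purely to keep the arithmetic easy to follow.

```c
#include <assert.h>
#include <stdint.h>

#define UNORM24_MAX 16777215.0   /* 2^24 - 1 */

/* Clamp to [0, 1]; NaN is treated as 0. */
static double
clamp01(float f)
{
   if (!(f > 0.0f))
      return 0.0;
   if (f > 1.0f)
      return 1.0;
   return (double) f;
}

/* Truncating conversion: always rounds toward zero. */
static uint32_t
float_to_unorm24_trunc(float f)
{
   return (uint32_t) (clamp01(f) * UNORM24_MAX);
}

/* Round-to-nearest conversion: add one half, then truncate. */
static uint32_t
float_to_unorm24_nearest(float f)
{
   double scaled = clamp01(f) * UNORM24_MAX + 0.5;
   assert(scaled >= 0.0 && scaled < UNORM24_MAX + 1.0);
   return (uint32_t) scaled;
}
```

For an input of 0.25 the scaled value is 4194303.75, so truncation yields
4194303 while rounding to nearest yields 4194304.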

v2: as spotted by Jose, get rid of extra type (src_type is already unsigned).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-31 17:09:02 +02:00
Mikko Juola
8624a514c2 mesa: fix multisampling proxy textures not being queryable
The code that checks if some texture target is valid for
glGetTexLevelParameter*() was not programmed to check for multisampling
proxy textures.  This made it impossible(?) to use the proxy textures
for their intended purpose as glGetTexLevelParameter*() would just fail
on you.

Reviewed-by: Brian Paul <brianp@vmware.com>

Cc: mesa-stable@lists.freedesktop.org
2013-07-31 07:27:01 -06:00
Mikko Juola
e404105e7d mesa: fix proxy textures becoming immutable and unusable
glTexStorage*() functions make textures immutable.  This carries on to
proxy textures.  Error checking in texture storage functions prevents
proxy textures from working after the first time because, internally, they
became immutable.

This commit makes the error checking ignore the immutability flag when
working with proxy textures.

Reviewed-by: Brian Paul <brianp@vmware.com>

Cc: mesa-stable@lists.freedesktop.org
2013-07-31 07:26:55 -06:00
Mikko Juola
3f3f66fd94 mesa: fix proxy textures not working with default texture binding
When working with the glTexStorage*() functions, the error checking
checks that a non-default (i.e., non-zero) texture is currently bound.
However, this check made glTexStorage*() functions fail with proxy
textures when the default texture is bound. Proxy textures do not care
about the current texture bindings so for them this check should not
be done.

Reviewed-by: Brian Paul <brianp@vmware.com>

Cc: mesa-stable@lists.freedesktop.org
2013-07-31 07:26:50 -06:00
Mikko Juola
de7e3741eb mesa: fix number of mipmaps calculation for proxy textures
The function _mesa_get_tex_max_num_levels() is supposed to calculate
the number of mipmap levels but it was not written to handle proxy
textures, at best returning a maximum of 1 mipmap level. Because of
this, at least glTexStorage*() calls would incorrectly fail when used
with proxy textures with more than one mipmap level.

Reviewed-by: Brian Paul <brianp@vmware.com>

Cc: mesa-stable@lists.freedesktop.org
2013-07-31 07:26:43 -06:00
Brian Paul
e5f32a0b3a mesa: improve free() cleanup in generate_mipmap_compressed()
Free all our temporary buffers in one place at the end of the
function.  Fixes memory leak detected by Coverity.
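
The single-exit cleanup pattern looks roughly like this.  The function
below is a hypothetical example (averaging pairs of bytes in a row), not
generate_mipmap_compressed() itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Average adjacent pairs of src into dst (width / 2 bytes).
 * Returns 1 on success, 0 if a temporary allocation fails. */
static int
halve_row(const unsigned char *src, size_t width, unsigned char *dst)
{
   int ok = 0;
   unsigned *wide = NULL;
   unsigned char *narrow = NULL;
   size_t i;

   assert(src != NULL && dst != NULL && width >= 2 && width % 2 == 0);

   wide = malloc(width * sizeof(*wide));
   if (wide == NULL)
      goto end;

   narrow = malloc(width / 2);
   if (narrow == NULL)
      goto end;

   for (i = 0; i < width; i++)
      wide[i] = src[i];
   for (i = 0; i < width / 2; i++)
      narrow[i] = (unsigned char) ((wide[2 * i] + wide[2 * i + 1] + 1) / 2);

   memcpy(dst, narrow, width / 2);
   ok = 1;

end:
   /* Every exit path lands here; free(NULL) is a no-op. */
   free(narrow);
   free(wide);
   return ok;
}
```

Because all temporaries start out as NULL and are released in one place, an
early failure cannot skip a free().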

Note: This is a candidate for the 9.x branches
Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-07-31 06:53:48 -06:00
Brian Paul
fdbd6a5033 gallium/util: reformat, comment util_get_offset()
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-31 06:53:48 -06:00
Brian Paul
30f1770cb1 gallium/util: comments, var renaming in u_inlines.h
The variable 'usage' was being used for two different things.
Sometimes for PIPE_USAGE_x and other times for PIPE_TRANSFER_x.
This renames usage to access when we're talking about PIPE_TRANSFER_x
flags.  Plus, add a bunch of comments to remind us what's going on.

Also, use unsigned for PIPE_TRANSFER_x bitmask to be consistent with
other places.  And add a missing const qualifier.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-31 06:53:48 -06:00
Brian Paul
365f38f3df softpipe: use new softpipe_resource_data() accessor
We should probably be using map()/unmap() when accessing resource
data, but this is a little better.

v2: assert that the resource is not a display target, per Jose.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-07-31 06:53:48 -06:00
Brian Paul
99c42d11a2 softpipe: don't ignore pipe_constant_buffer::buffer_offset
This was never a problem since the Mesa state tracker always gives
us a user-space constant buffer with buffer_offset=0.  But if another
state tracker ever gave us a "HW" constant buffer with non-zero
buffer_offset we'd mis-render.

Also, use the correct buffer size.  And move an assertion to the
top of the function.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-07-31 06:53:48 -06:00
Brian Paul
089ef37eab gallium/docs: clarify definition of PIPE_CAP_USER_CONSTANT_BUFFERS, etc
The cap means _can_ accept user-space constant buffers; it doesn't
mean _only_ accepts user-space constant buffers.

v2: also update the PIPE_CAP_USER_VERTEX_BUFFERS and
PIPE_CAP_USER_INDEX_BUFFERS descriptions as well.  Per Jose.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-07-31 06:53:48 -06:00
Chris Forbes
cace82b0cd i965/vs: Put lod parameter in the correct place for Gen4
This was never visible before due to the bogus sampler state pointer.
Fixes remaining vertex texturing breakage on Gen4.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2013-07-31 21:33:18 +12:00
Chris Forbes
97676032c2 i965/vs: set up sampler state pointer for Gen4/5.
Fixes broken filter and lod selection for vertex texturing.
(txs/txf only worked properly because they ignore the sampler state
completely)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2013-07-31 21:33:18 +12:00
Marek Olšák
7568a89500 st/dri: add a new driconf option disable_shader_bit_encoding for Unigine
Now Unigine Heaven 3.0 finally works with r600g.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-30 23:31:30 +02:00
Marek Olšák
369c829152 st/mesa: fix opcode translation for ARB_shader_bit_encoding functions
We treat the opcodes as MOVs, but we should at least change the type
of the expression, which later affects which TGSI opcode is chosen.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-30 23:31:30 +02:00
Marek Olšák
0f6a7cb00c mesa,glsl,st/dri: add a new driconf option force_glsl_version for Unigine
See documentation in mtypes.h.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-30 23:31:28 +02:00
Marek Olšák
ab78939344 mesa: add MESA_GLSL debug flag to dump shaders on compile error
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-30 23:31:26 +02:00
Marek Olšák
7f2f804c75 driconf: enable app-specific workarounds for all drivers
They were only enabled for i965.

Note that drirc must be installed in /etc.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-30 23:31:24 +02:00
Marek Olšák
bc4f0b6bac st/dri: remove driOptionCache from dri_context in favor of dri_screen
There is no reason to have this duplicated.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-30 23:31:24 +02:00
Marek Olšák
dda936e057 st/dri: move enabling postprocessing to dri_screen
The driconf options are global.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-30 23:31:24 +02:00
Marek Olšák
772070527f st/dri: remove more unused driconf options
vblank_mode is read by dri_util.c and falls under the "dri2" driver name,
which is not connected to the actual Mesa/Gallium driver in any way.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-30 23:31:24 +02:00
Marek Olšák
83dbe61ea4 st/dri: implement the driconf option force_s3tc_enable properly
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-30 23:31:24 +02:00
Marek Olšák
f27f3a4b15 driconf: remove the unused option allow_large_textures
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-30 23:31:23 +02:00
Marek Olšák
2acc27cc6d st/dri: support the driconf option disable_blend_func_extended
This is needed for Unigine.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-30 23:31:23 +02:00
Marek Olšák
71e0b5d688 st/osmesa: initialize disable_glsl_line_continuations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-30 23:31:22 +02:00
Marek Olšák
4c89ec1f69 gallium/postprocessing: convert blits to pipe->blit
PP saves current states to cso_context and then util_blit_pixels does
the same. cso_context doesn't like that and the original state is not
correctly restored.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-30 23:31:22 +02:00
Marek Olšák
c84e8d039e gallium/postprocessing: fix shader parsing
tokens was converted to a pointer, which made the Elements macro return 1.

Broken by e87fc11cac.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-30 23:31:22 +02:00
Marek Olšák
c40f8d087a docs/GL3: clarify core vs compatibility extension support
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-30 23:31:21 +02:00
Marek Olšák
7db83d8d4b mesa: default texture buffer format should be R8 in the core profile
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

v2: Since we don't expose the extension in the compatibility profile,
    the "if (API == CORE) .. else .." statement is removed.
2013-07-30 22:36:21 +02:00
Marek Olšák
a6b1a7c0d2 mesa: default DEPTH_TEXTURE_MODE should be RED in the core profile
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-30 22:36:21 +02:00
Marek Olšák
63569dbeb0 st/mesa: expose EXT_framebuffer_multisample_blit_scaled if MSAA is supported
Surprisingly all drivers supporting MSAA can already do this (r300g and r600g
for sure) and I think Christoph wanted to have this feature for his Nouveau
drivers anyway.
2013-07-30 22:36:21 +02:00
Marek Olšák
1302c66896 st/mesa: fix sRGB renderbuffers without EXT_framebuffer_sRGB support
https://bugs.freedesktop.org/show_bug.cgi?id=59322

Cc: mesa-stable@lists.freedesktop.org
2013-07-30 22:36:20 +02:00
Marek Olšák
4dfe1a0df5 Revert "r300g: Give CLIP_DISABLE another try"
This reverts commit e866bd1ade.

https://bugs.freedesktop.org/show_bug.cgi?id=57875

Cc: mesa-stable@lists.freedesktop.org
2013-07-30 22:36:20 +02:00
Carl Worth
122d8d2f5a get-pick-list.sh: Include commits mentioning "CC: mesa-stable..." in pick list
We recently adopted a new convention that patches can be nominated for the
stable branch by including a line in the commit message as follows:

	CC: mesa-stable@lists.freedesktop.org

This is a convenient syntax as "git send-email" will notice this line and
automatically copy the resulting patch email to the mesa-stable mailing list.

Here we extend the regular expression in the get-pick-list.sh script to also
notice this pattern (as well as the traditional "NOTE: This patch is a
candidate..." form).
2013-07-30 12:36:37 -07:00
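
As a rough illustration of the matching described in 122d8d2f5a, the following C sketch uses POSIX regex.h to recognise either nomination form in a commit message. The pattern is an assumption made for this example and is not the expression used by get-pick-list.sh; the function name is likewise hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed pattern: either nomination form, anywhere in the message. */
static const char *nomination_pattern =
   "(CC:.*mesa-stable|NOTE: .*candidate)";

/* Returns true if the message carries one of the two nomination forms.
 * REG_ICASE lets "Cc:" and "CC:" both match. */
bool
is_stable_candidate(const char *commit_message)
{
   regex_t re;
   bool matched;

   assert(commit_message != NULL);
   if (regcomp(&re, nomination_pattern,
               REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
      return false;
   matched = regexec(&re, commit_message, 0, NULL, 0) == 0;
   regfree(&re);
   return matched;
}
```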
Paul Berry
1299694ed5 glsl: Remove redundant writes to prog->LinkStatus
The linker_error() function sets prog->LinkStatus to false.  There's
no reason for the caller of linker_error() to also do so.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-30 10:10:27 -07:00
Paul Berry
5fe6b90c87 glsl: Improve error message for interstage interface block mismatch.
We're now emitting this error from a point where we have easy access
to the name of the block that failed to match, so go ahead and include
that in the error message, as we do for intrastage interface block
mismatches.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-30 10:10:27 -07:00
Paul Berry
b95d237fe6 glsl: Use a consistent technique for tracking link success/failure.
This patch changes link_shaders() so that it sets prog->LinkStatus to
true when it starts, and then relies on linker_error() to set it to
false if a link failure occurs.

Previously, link_shaders() would set prog->LinkStatus to true halfway
through its execution; as a result, linker functions that executed
during the first half of link_shaders() would have to do their own
success/failure tracking; if they didn't, then calling linker_error()
would add an error message to the log, but not cause the link to fail.
Since it wasn't always obvious from looking at a linker function
whether it was called before or after link_shaders() set
prog->LinkStatus to true, this carried a high risk of bugs.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-30 10:10:26 -07:00
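
The convention adopted in b95d237fe6 can be sketched as follows. The struct and the link_program() function are pared-down, hypothetical stand-ins for the real linker code; only the ordering matters here: the status is set to true once at the start, and recording an error is the only thing that clears it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, pared-down program object. */
struct program {
   bool link_status;
   char info_log[128];
};

/* Recording an error is the only thing that clears the status. */
void
linker_error(struct program *prog, const char *msg)
{
   assert(prog != NULL);
   snprintf(prog->info_log, sizeof(prog->info_log), "error: %s", msg);
   prog->link_status = false;
}

/* Assume success up front; any step may call linker_error(). */
void
link_program(struct program *prog, bool early_step_fails)
{
   prog->link_status = true;
   prog->info_log[0] = '\0';

   if (early_step_fails)
      linker_error(prog, "interface block mismatch");

   /* Later steps would run here and may also call linker_error(). */
}
```

Because the flag is set before any step runs, an early step no longer needs its own success/failure bookkeeping.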
Paul Berry
659ec1c958 glsl: Add error message for intrastage interface block mismatch.
Previously we failed to link (which is correct), but we did not output
an error message, which could have been confusing for users.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-30 10:10:26 -07:00
Paul Berry
4682b9b7bf glsl: Remove bogus check on return value of link_uniform_blocks().
A comment in link_intrastage_shaders(), and an if-test that followed
it, seemed to indicate that link_uniform_blocks() would return a
negative value in the event of an error.  But this is not the
case--all error checking has already been performed by
validate_intrastage_interface_blocks(), and link_uniform_blocks() can
only return unsigned values.

So get rid of the if-test and change the return type of
link_intrastage_shaders() to clarify that it can only return unsigned
values.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-30 10:10:25 -07:00
Jonathan Charest
4f8048bb5a r600g/compute: Added missing address space checking of kernel parameters
To have non-static buffers in local memory, it is necessary to pass them
as arguments to the kernel.

For r600, the correct lds size must be set to the SQ_LDS_ALLOC register.
The correct size is the clover size plus the size reported by the
compiler.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-30 07:09:16 -07:00
Jonathan Charest
d9576598c7 clover: Added missing address space checking of kernel parameters v2
Here is an updated patch with no line wrapping and respecting 80-column limit (for my changes).

v2: Tom Stellard
  - Create global arguments for constant buffers so we don't break
    r600g.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-30 07:09:15 -07:00
Kenneth Graunke
07cdf426c1 mesa: Remove broken assertion about enabled texture targets.
For GLSL programs, enabledTargets can have more than one bit set.  For
example, a shader that uses sampler2D and samplerCube uniforms will have
both TEXTURE_2D_BIT and TEXTURE_CUBE_BIT set.

The code that sets _ReallyEnabled already handles this, selecting the
"highest priority" texture target.  We should simply use that.

Fixes new Piglit test incomplete-textures-of-multiple-types.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62698
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-29 22:35:37 -07:00
Emil Velikov
488b3ed6f4 build: unify mesa version by using a VERSION file
Rather than having to keep track of all the build systems and their respective
definitions of the mesa version, use a single top-level VERSION file. Every build
system is responsible for reading/parsing the file and using it.

v2:
* remove useless bulletpoint from the documentation, suggested by Matt
* "Android is Linux. Use '/' instead of '\'", spotted by Chad V
* use cleaner code to get the version in scons, suggested by Chad V

v3:
* ensure leading and trailing whitespace characters are stripped while parsing
* android: handle GNU shell commands appropriately

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-07-29 13:39:29 -07:00
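
The v3 note in 488b3ed6f4 about stripping leading and trailing whitespace can be illustrated with a small C helper. This is only a sketch of the parsing step; strip_version() is a hypothetical name, and the real build systems do this in their own languages (make, scons, Android makefiles).

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Strips leading and trailing whitespace in place and returns the start
 * of the trimmed text (a pointer into the caller's buffer). */
char *
strip_version(char *text)
{
   char *end;

   assert(text != NULL);
   while (isspace((unsigned char)*text))
      text++;

   end = text + strlen(text);
   while (end > text && isspace((unsigned char)end[-1]))
      end--;
   *end = '\0';
   return text;
}
```

A file containing "9.3.0-devel" followed by a newline would then yield the bare version string regardless of stray spaces around it.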
Kenneth Graunke
efb566dff2 i965: Don't create a swrast context on ES2+.
We already skip this for API_OPENGL_CORE; ES2+ is very similar.
The primary user of the swrast context is GL_SELECT and GL_FEEDBACK,
which have never existed in ES.

This saves approximately 18MB of memory in GLBenchmark 2.7 Egypt (ES2).
No regressions in es3conform on Ivybridge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2013-07-29 13:26:27 -07:00
Kenneth Graunke
6aba035f6b glsl: Remove shader stage checking for extension handling.
Certain extensions only add functionality to particular shader stages.
(For example, ARB_draw_instanced only adds variables to the vertex
shader stage.)

Previously, we only allowed such extensions to be enabled in the shader
stages where they're useful.  However, I've never found any text which
mandates that behavior; in my opinion, you should be able to turn on
extensions in any shader stage, even if they have no effect.

Fixes Piglit tests glslparsertest/glsl2/draw_buffers-05.vert and
ARB_draw_instanced/preprocessor/feature-macro-enabled.frag.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29185
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-29 10:51:25 -07:00
Matt Turner
0ed02d435e mesa: Expose OES_surfaceless_context.
EGL_KHR_surfaceless_context extension allows contexts to be made current
without a default winsys fbo. This extension specifies what ES 1.1 and
2.0 should do (the ES 3.0 spec already does).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-29 10:35:16 -07:00
Matt Turner
8dd15e6021 mesa: Return GL_FRAMEBUFFER_UNDEFINED if the winsys fbo is incomplete.
Specified by ARB_framebuffer_object, GL 3.0, and ES 3.0.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-29 10:35:01 -07:00
Matt Turner
b2d3f25aa2 gles3: Update gl3.h to 2013-02-12.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-07-29 10:35:00 -07:00
Matt Turner
00a945f61e gles2: Update gl2ext.h to revision 22161.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-07-29 10:34:58 -07:00
Matt Turner
efa8a6e72f gles2: Update gl2.h to revision 20555.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-07-29 10:34:47 -07:00
Matt Turner
32a2ab47fe gles: Update glext.h to revision 20798.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-07-29 10:34:42 -07:00
Roland Scheidegger
e08114fed7 gallivm: (trivial) get rid of assertion in float->uint conversion code
Commit 8c3d3622d9 introduced a new assertion,
but since it causes lp_test_conv failures remove it again and let's hope
we don't really hit bugs caused by the potentially bogus code (it is possible
the assert() caught some cases which work correctly too).
2013-07-29 13:23:56 +02:00
Maarten Lankhorst
e847b5ae06 nvc0: force use of correct firmware file
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-07-28 12:06:57 +02:00
Ian Romanick
803f755ede glsl: Less const for glsl_type convenience accessors
The second 'const' says that the pointer itself is constant.  This is
unenforceable in C++, so GCC emits a warning (see below) for each of
these functions in every file that includes glsl_types.h.  It's a lot of
warning spam.

../../../src/glsl/glsl_types.h:176:58: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2013-07-27 12:13:03 -07:00
Kenneth Graunke
17856726c9 glsl: Disallow auxiliary storage qualifiers on FS outputs.
This has always been an error; we just forgot to check for it.

Fixes Piglit's no-aux-qual-on-fs-output.frag.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67333
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2013-07-27 10:31:40 -07:00
Kenneth Graunke
c178ec0d7e glsl: Classify "layout" like other identifiers.
When "layout" isn't being lexed as LAYOUT_TOK, we should treat it like
an ordinary identifier.  This means we need to classify it to determine
whether we should return IDENTIFIER, TYPE_IDENTIFIER, or NEW_IDENTIFIER.

Fixes the WebGL conformance test "shader-with-non-reserved-words."

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64087
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2013-07-27 10:31:38 -07:00
Paul Berry
4d7899fe81 glsl: Be consistent about '\n', '.', and capitalization in errors/warnings.
The majority of calls to _mesa_glsl_error(), _mesa_glsl_warning(), and
_mesa_glsl_parse_state::check_version() use a message that begins with
a lower case letter and ends without a period.  This patch makes all
messages follow that convention.

Also, error/warning messages shouldn't end in '\n', since
_mesa_glsl_msg() automatically adds '\n' at the end of the message.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-27 09:41:30 -07:00
Roland Scheidegger
8c3d3622d9 gallivm: fix float->SNORM conversion
Just like the UNORM case we need to use round to nearest, not trunc.
(There's also another problem, we're using the formula for SNORM->float
which will produce a value below -1.0 for the most negative value which
according to both OpenGL and d3d10 would need clamping. However, no actual
failures have been observed due to that hence keep cheating on that.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-27 16:41:29 +02:00
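
The difference between truncation and round-to-nearest mentioned in 8c3d3622d9 is easy to see in scalar C. This sketch assumes an 8-bit SNORM target with a scale of 127 and rounds halves away from zero; gallivm generates vectorised LLVM IR instead, so the function names and the scalar form here are purely illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp to the representable range; NaN is out of scope for this sketch. */
static float
clamp_unit(float x)
{
   assert(x == x);
   if (x < -1.0f)
      return -1.0f;
   if (x > 1.0f)
      return 1.0f;
   return x;
}

/* Truncating conversion: biased toward zero. */
int8_t
float_to_snorm8_trunc(float x)
{
   return (int8_t)(clamp_unit(x) * 127.0f);
}

/* Round-to-nearest conversion (halves away from zero in this sketch). */
int8_t
float_to_snorm8_round(float x)
{
   float scaled = clamp_unit(x) * 127.0f;

   return (int8_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}
```

For an input of 0.999 the truncating version yields 126 while the rounding version yields 127, which is the kind of off-by-one the commit addresses.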
Roland Scheidegger
d86fddc876 util: don't flush overflowing values to infinity in half-float conversion
I am not able to find _any_ rounding behavior specified for OpenGL for
float to half-float conversions. However, it is specified for fp11/fp10
which suggests round to next finite value but round-to-zero would also
be allowed, but finite values must not be flushed to infinity in either
case.
Hence I believe it makes sense to do the same for half-floats too.
We could probably also use round-to-zero consistently, which is in fact
required by d3d10 (but it doesn't seem to matter much).
Does not match the mesa core function doing the same though (which is
saying it was built to match intel gpus which I don't believe for a
second as it would cause failures in d3d10, moreover the PRM (for
ivy bridge, not listed in older manuals) while not specifying rounding
behavior clearly states finite numbers are never flushed to infinity).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-27 16:41:29 +02:00
Roland Scheidegger
47e528b740 tgsi: handle texel swizzles correctly for d3d10-style sample opcodes
Same as for gallivm (though these don't quite work correctly in softpipe,
so untested).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-27 16:41:29 +02:00
Roland Scheidegger
abcc40e7f0 gallivm: handle texel swizzles correctly for d3d10-style sample opcodes
Unlike OpenGL, the texel swizzle is embedded in the instruction, so honor
that.
(Technically we now execute both the sampler_view swizzle and the
per-instruction swizzle but this should be quite ok.)

v2: add documentation note as it's not obvious.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-27 16:41:29 +02:00
Kenneth Graunke
f2be639972 docs: Mark ARB_vertex_attrib_binding as started.
Fredrik Höglund has a partial implementation in his git tree.
2013-07-26 23:47:27 -07:00
Ian Romanick
b55c1638ad mesa: Disable GL_EXT_framebuffer_object in core profiles and OpenGL 3.1
GL_EXT_framebuffer_object differs from GL_ARB_framebuffer_object in ways
that we can't and don't implement in core profiles.  Exposing it is a
lie, so we shouldn't do that.

It's possible that some other GL_EXT_framebuffer_* extensions should be
disabled, but it's not quite so clear cut.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-26 22:56:26 -07:00
Matt Turner
86ae3027a1 docs: Mark GL_ARB_shading_language_420pack as done. 2013-07-26 22:33:39 -07:00
Chris Forbes
6c0dad6128 docs: Mark off 420pack
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2013-07-27 21:29:01 +12:00
Tapani Pälli
8c211dd742 glsl: disable ARB_texture_cube_map_array_enable keywords for glsl es
Patch fixes a crash with Webgl 'shader-with-non-reserved-words'
conformance test by ignoring desktop extension keywords on GLSL ES.

v2: fix reserved and allowed desktop glsl versions (Chris)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64087
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-26 10:05:20 -07:00
Chris Forbes
124f567f1d i965/vs: Fix flaky texture swizzling
If any component used the ZERO or ONE swizzle, its corresponding member
in the `swizzle` array would never be initialized. We *mostly* got away
with this, except when that memory happened to contain a value that
clobbered another channel when combined using BRW_SWIZZLE4().

NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-27 06:34:29 +12:00
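
The bug fixed in 124f567f1d can be sketched as follows. The macro below assumes four 2-bit channel selectors packed into one value, which is how the commit message's "clobbered another channel" remark reads; the selector enum and function name are hypothetical. The point is that every slot of the array gets a valid selector before packing, even for ZERO and ONE channels whose slot is otherwise unused.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: four 2-bit channel selectors packed into one value. */
#define PACK_SWIZZLE4(a, b, c, d) \
   (((a) << 0) | ((b) << 2) | ((c) << 4) | ((d) << 6))

/* Hypothetical selector names. */
enum { SEL_X, SEL_Y, SEL_Z, SEL_W, SEL_ZERO, SEL_ONE };

/* Builds the packed source swizzle. ZERO and ONE channels are filled in by
 * other means, so their slot just needs some valid selector (X here). */
unsigned
pack_texture_swizzle(const int requested[4])
{
   /* Initialized up front, so no slot is ever left holding garbage. */
   int swizzle[4] = { SEL_X, SEL_X, SEL_X, SEL_X };

   assert(requested != NULL);
   for (int i = 0; i < 4; i++) {
      if (requested[i] <= SEL_W)
         swizzle[i] = requested[i];
   }
   return (unsigned)PACK_SWIZZLE4(swizzle[0], swizzle[1],
                                  swizzle[2], swizzle[3]);
}
```

Had a slot held a stray value wider than 2 bits, the shift-and-OR would have spilled into the neighbouring channel's bits.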
Niels Ole Salscheider
81a156d099 st/clover: Allow double precision operations
Pass "cl_khr_fp64" preprocessor definition to clang

Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-25 18:55:56 -07:00
Dave Airlie
19338157c9 gallium/vl: add prime support
This fixes the dri2 opening to check if DRI_PRIME is set and
pick the correct drm device path to open. This, along
with a change to libvdpau, at least allows vdpauinfo to work.
Martin Peres tested with nouveau, and there seems to be a
further issue with final displaying, it only works sometimes,
but this patch is at least necessary to help debug further.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67283
Tested-by: Armin K. <krejzi@email.com>
2013-07-26 08:42:00 +10:00
Kenneth Graunke
0e9549e2bd Revert "i965: Delete pre-DRI2.3 viewport hacks."
This reverts commit c9db037dc9.

Eric believes that the viewport hacks are still necessary for EGL;
invalidate events aren't hooked up properly.

This commit caused a regression where EFL applications wouldn't show
anything other than window decorations; GLBenchmark also showed issues.

The revert had conflicts due to the intel_context/brw_context merge.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66606
Cc: mesa-stable@lists.freedesktop.org
2013-07-25 15:25:43 -07:00
Kenneth Graunke
a8c8c5f8d2 mesa: Bump version to 9.3.0-devel.
This should have been done when making the 9.2 branch, but was missed.
2013-07-25 13:34:53 -07:00
Kenneth Graunke
7d24d1b873 docs: Remove <em> obfuscation on public mailing list addresses.
Wrapping every character of an email address in <em> looks bizarre, and
makes it impossible to read the text.  Apparently Brian did this in 2003
to try and obfuscate email addresses and avoid spam.

Of course, mesa-*@lists.freedesktop.org are public mailing lists and
trivial to find on the internet.  So obfuscation buys us nothing
(assuming the <em> technique even works at all, which I doubt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
LOLed-at-by: Matt Turner :)
2013-07-25 13:34:43 -07:00
Rob Clark
890e27ef25 xa: bump major version
Bump major version, as the change to require explicit
xa_context_flush(), the addition of the handle-type parameter to
xa_surface_handle(), and change of surface to ref/unref will require a
minor change in DDX.
2013-07-25 13:59:55 -04:00
Jerome Glisse
8b21a3825b xa: move surface to ref/unref api
This makes the DDX's life easier.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-07-25 13:59:55 -04:00
Jerome Glisse
d156c032c9 xa: let ddx handle flush
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-07-25 13:59:55 -04:00
Jerome Glisse
6e8c9589db xa: export a common context flush function
First step before moving flushing inside the ddx.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-07-25 13:59:55 -04:00
Jerome Glisse
d1444225d3 xa: add handle type parameter to get handle
Allow retrieving a non-shared handle.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-07-25 13:59:55 -04:00
Rob Clark
984da46219 xa: add xa_surface_from_handle()
For freedreno DDX, we have to create the scanout GEM bo in a special way
(until we have our own KMS/DRM kernel driver.. and even then for
phones/tablets you probably need to use the android drivers if you don't
want to port the lcd panel driver support).  The easiest way to handle
this is let the DDX create the scanout bo, and then create the xa
surface from that.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-07-25 13:59:54 -04:00
Vinson Lee
60c248c3af gallivm: Remove NoFramePointerElimNonLeaf for LLVM >= 3.4.
TargetOptions::NoFramePointerElimNonLeaf was removed in LLVM 3.4
r187093.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-07-25 09:50:07 -07:00
Paul Berry
a5eecb246d glsl: Handle empty if statement encountered during loop analysis.
The is_loop_terminator() function was asserting that the following
kind of if statement could never occur:

    if (...) { } else { }

(presumably based on the assumption that such an if statement would be
eliminated by previous optimization stages).  But that isn't the
case--it's possible that previous optimization stages might simplify
more complex code down to this empty if statement, in which case it
won't be eliminated until the next time through the optimization loop.

So is_loop_terminator() needs to handle it.  Fortunately it's easy to
handle--it's not a loop terminator because it does nothing.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64330
CC: mesa-stable@lists.freedesktop.org

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-25 09:37:02 -07:00
Paul Berry
b8f13fbb85 i965: Initialize inout_offset parameter to brw_search_cache().
Two callers of brw_search_cache() weren't initializing that function's
inout_offset parameter: brw_blorp_const_color_params::get_wm_prog()
and brw_blorp_blit_params::get_wm_prog().

That's a benign problem, since the only effect of not initializing
inout_offset prior to calling brw_search_cache() is that the bit
corresponding to cache_id in brw->state.dirty.cache may not be set
reliably.  This is ok, since the cache_id's used by
brw_blorp_const_color_params::get_wm_prog() and
brw_blorp_blit_params::get_wm_prog() (BRW_BLORP_CONST_COLOR_PROG and
BRW_BLORP_BLIT_PROG, respectively) correspond to dirty bits that are
not used.

However, failing to initialize this parameter causes valgrind to
complain.  So let's go ahead and fix it to reduce valgrind noise.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66779

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-25 09:36:15 -07:00
Paul Berry
42a921fa92 glsl: don't rename variables in interface block arrays.
The linker matches up variables in interface blocks according to their
block name and variable name.  When support for interface block arrays
was added in commit d6863acb, we renamed variables appearing in
interface blocks so that their name included the array size.  For
example, in a block like this:

out foo {
   float bar;
} baz[3];

The variable "bar" would get renamed to "bar[3]".

This is unnecessary, and leads to problems in supporting geometry
shaders, since geometry shaders require vertex shader outputs which
are non-arrays to be linked up to geometry shader inputs which are
arrays.

This patch makes the behaviour of interface block arrays the same as
simple non-array interface blocks; in both cases, the variables
contained within them are not renamed.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-25 09:34:24 -07:00
Zack Rusin
f19cb0e5f3 draw: fix vertex id computation
vertex id has to be unaffected by the start index (i.e. when calling
draw arrays with start_index = 5, the first vertex_id has to still
be 0, not 5) and it has to be equal to the index when performing
indexed rendering (in which case it has to be unaffected by the
index bias). This fixes our behavior.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-25 02:02:59 -04:00
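
The rule stated in f19cb0e5f3 separates the element that gets fetched from the id the shader sees. A minimal sketch, with hypothetical function names, might look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Non-indexed draw: the i-th vertex fetches element start + i ... */
unsigned
arrays_fetch_index(unsigned start, unsigned i)
{
   return start + i;
}

/* ... but its vertex id is just i, unaffected by the start index. */
unsigned
arrays_vertex_id(unsigned start, unsigned i)
{
   (void)start;
   return i;
}

/* Indexed draw: the fetch applies the index bias ... */
unsigned
indexed_fetch_index(const unsigned *indices, size_t i, int index_bias)
{
   assert(indices != NULL);
   return (unsigned)((int)indices[i] + index_bias);
}

/* ... while the vertex id equals the raw index, without the bias. */
unsigned
indexed_vertex_id(const unsigned *indices, size_t i, int index_bias)
{
   assert(indices != NULL);
   (void)index_bias;
   return indices[i];
}
```

With start = 5, the first vertex fetches element 5 but still reports vertex id 0, matching the example in the commit message.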
Zack Rusin
0e9ec86973 draw: cleanup and fix instance id computation
The instance id system value always starts at 0, even if the
specified start instance is larger than 0. Instead of implicitly
setting instance id to instance id plus start instance and then
having to subtract instance id when computing the buffer offsets
let's just set instance id to the proper instance id. This fixes
instance id computation and cleans up buffer offset computation.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-25 02:02:36 -04:00
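
One way to picture the arrangement described in 0e9ec86973 is to keep the shader-visible instance id zero-based and apply the start instance only when addressing per-instance buffer data. The addressing formula below (start instance plus instance id divided by the divisor) is the usual GL-style rule and is an assumption about what the draw module computes; all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The instance id system value counts from 0, whatever the start instance. */
unsigned
shader_instance_id(unsigned start_instance, unsigned i)
{
   (void)start_instance;
   return i;
}

/* The start instance only enters when locating per-instance attribute data.
 * divisor is the instance step rate and must be nonzero here. */
size_t
instanced_attrib_offset(unsigned start_instance, unsigned instance_id,
                        unsigned divisor, size_t stride, size_t base_offset)
{
   assert(divisor > 0);
   return base_offset +
          (size_t)(start_instance + instance_id / divisor) * stride;
}
```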
Vinson Lee
0ac3164708 gallivm: Remove dead code in lp_build_compare_ext.
There are earlier returns for PIPE_FUNC_NEVER and PIPE_FUNC_ALWAYS. The
switch value of 'func' cannot be either of those values.

Fixes "Logically dead code" defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-07-24 23:47:34 -07:00
Brian Paul
8a9df7a370 mesa: implement mipmap generation for compressed 2D array textures
We weren't looping over all the slices in the array.  The updated
code should also correctly handle 3D compressed textures too, whenever
we have that feature.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66850

NOTE: This is a candidate for the 9.x branches
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-07-24 15:29:30 -06:00
Brian Paul
484fa87984 meta: handle 2D texture arrays in decompress_texture_image()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66850

NOTE: This is a candidate for the 9.x branches.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-07-24 15:29:30 -06:00
Brian Paul
2931bcb0d2 mesa: handle 2D texture arrays in get_tex_rgba_compressed()
If we call glGetTexImage() for a compressed 2D texture array we need
to loop over all the slices.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66850

NOTE: This is a candidate for the 9.x branches.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-07-24 15:29:29 -06:00
Christoph Bumiller
5c37039797 nv50,nvc0: s/uint16/uint32 for constant buffer offset
Looks like a thinko, "Hey, constant buffers can be at most 64 KiB
in size, offset can't be larger." But it can, of course.

I think piglit lacks a test for UBO and BindBufferRange that
tests if it actually works.
2013-07-24 20:46:38 +02:00
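
The "thinko" in 5c37039797 comes down to integer truncation: a buffer range may be at most 64 KiB long, but it can start anywhere in a larger buffer. The two helpers below are hypothetical and only show what happens to an offset beyond 65535 when it is stored in a 16-bit field versus a 32-bit one.

```c
#include <assert.h>
#include <stdint.h>

/* Storing the offset in 16 bits silently drops the upper bits. */
uint32_t
stored_offset_u16(uint32_t offset)
{
   uint16_t field = (uint16_t)offset;

   return field;
}

/* A 32-bit field keeps the full offset. */
uint32_t
stored_offset_u32(uint32_t offset)
{
   uint32_t field = offset;

   return field;
}
```

An offset of 65792 (64 KiB plus 256) comes back as 256 from the 16-bit variant, so the binding would point at the wrong data.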
Roland Scheidegger
1e003b44e8 draw: always call util_cpu_detect() in draw context creation.
Since disabling denorms in draw_vbo() we require the util_cpu_caps to be
initialized there. Hence add another util_cpu_detect() call in
draw_create_context() which should ensure this.
(There is another call in draw_get_option_use_llvm() which only gets called
with x86 (not x86_64) but calling it always there wouldn't help since it most
likely wouldn't get called when compiling without llvm, so leave it alone
there.)
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=66806.
(Because util_cpu_caps wasn't initialized when first calling util_fpstate_get()
hence it returning zero, but it would later get initialized by rtasm translate
code hence when draw call returned it unmasked all exceptions by calling
util_fpstate_set(). This was happening only with DRAW_USE_LLVM=0 or not
compiling with llvm, otherwise the llvm init code was calling it on time too.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2013-07-24 15:58:07 +02:00
Roland Scheidegger
bceb5f36ec mesa: fix rgtc snorm decoding
The codeword must be unsigned (otherwise will shift in 1's from above when
merging low/high parts so some texels decode wrong).
This also affects gallium's util/u_format_rgtc.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-07-24 15:58:00 +02:00
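
The sign-extension problem behind bceb5f36ec can be reproduced with a simplified merge of two bytes. Real RGTC decoding extracts 3-bit codes from a 48-bit field, so this is a reduced sketch with hypothetical function names; it only shows why the byte being merged has to be unsigned.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy variant: a negative low byte is sign-extended on promotion to int,
 * so its 1 bits spill into the positions reserved for the high byte. */
unsigned
merge_codeword_signed(int8_t low, uint8_t high)
{
   return (unsigned)(low | (high << 8)) & 0xffffu;
}

/* Fixed variant: unsigned bytes promote without sign extension. */
unsigned
merge_codeword_unsigned(uint8_t low, uint8_t high)
{
   return (unsigned)low | ((unsigned)high << 8);
}
```

With a low byte of 0x80 and a high byte of 0x01, the unsigned merge gives 0x0180, while the signed one gives 0xff80 because the high byte's bits are overwritten by the extended sign.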
Andre Heider
0acf3a8407 gallium/util: Fix detection of AVX cpu caps
For AVX it's not sufficient to only rely on the cpuid flags. If the CPU
supports these extensions, but the OS doesn't, issuing these insns will
trigger an undefined opcode exception.

In addition to the AVX cpuid bit we also need to:
* test cpuid for OSXSAVE support
* XGETBV to check if the OS saves/restores AVX regs on context switches

See "Detecting Availability and Support" at
http://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions

Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-07-23 23:12:58 +01:00
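
The checks listed in 0acf3a8407 can be written as a pure function over the raw register values, which keeps the sketch portable. It assumes the caller has already executed CPUID leaf 1 and, only if OSXSAVE is set, XGETBV for XCR0 (passing 0 otherwise). The bit positions are the documented ones: OSXSAVE is ECX bit 27, AVX is ECX bit 28, and XCR0 bits 1 and 2 cover SSE and AVX register state. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_ECX_OSXSAVE (1u << 27)
#define CPUID1_ECX_AVX     (1u << 28)
#define XCR0_SSE_STATE     (1u << 1)
#define XCR0_AVX_STATE     (1u << 2)

/* AVX is usable only if the CPU advertises it, the OS has enabled XSAVE,
 * and the OS saves/restores both SSE and AVX state on context switches. */
bool
avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
   const uint32_t need_cpuid = CPUID1_ECX_OSXSAVE | CPUID1_ECX_AVX;
   const uint64_t need_xcr0 = XCR0_SSE_STATE | XCR0_AVX_STATE;

   if ((cpuid1_ecx & need_cpuid) != need_cpuid)
      return false;
   return (xcr0 & need_xcr0) == need_xcr0;
}
```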
Chris Forbes
5a7bdd4b41 docs: Add items for GL4.4
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-23 19:04:43 +12:00
Francisco Jerez
df530829f7 clover: Respect kernel argument alignment restrictions.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-22 23:09:34 +02:00
Francisco Jerez
f64c0ca692 clover: Extend kernel arguments for differing host and device data types.
Loosely based on a similar patch by Tom Stellard.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-22 23:09:34 +02:00
Francisco Jerez
829caf410e clover: Byte-swap kernel arguments when host and device endianness differ.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-22 23:09:22 +02:00
Francisco Jerez
2265b40e37 clover: Add kernel argument fields to allow differing host/target data types.
Loosely based on a similar patch by Tom Stellard.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-22 22:47:27 +02:00
Francisco Jerez
a3dcab43c6 clover: Pass corresponding module::argument to kernel::argument::bind().
And remove size information from most kernel::argument derived
classes, it's no longer going to be necessary.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-22 22:45:41 +02:00
Tom Stellard
8c9d3c62f6 clover: Return correct value for CL_DEVICE_ENDIAN_LITTLE
Query the driver using PIPE_CAP_ENDIANNESS rather than always returning
true.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2013-07-22 22:45:20 +02:00
Tom Stellard
4e90bc9a12 gallium: Add PIPE_CAP_ENDIANNESS
Cc: mesa-stable@lists.freedesktop.org
[ Francisco Jerez: Fix "PIPE_ENDIAN_SMALL" in the documentation,
  define PIPE_ENDIAN_NATIVE. ]
2013-07-22 22:43:17 +02:00
Matt Turner
c09a4cbbaf configure.ac: Use correct options names in AC_ARG_ENABLE. 2013-07-22 10:48:45 -07:00
Matt Turner
242a59d535 egl/build: Remove unused GLAPI_LIB. 2013-07-22 10:48:45 -07:00
Matt Turner
3647efa5c1 build: Remove unused EGL_PLATFORMS. 2013-07-22 10:48:45 -07:00
Matt Turner
5e4e145025 build: Add tests directories to SUBDIRS
Fixes a problem with distcheck.
2013-07-22 10:48:45 -07:00
Zack Rusin
7bae56c5c2 llvmpipe: Ensure FTZ/DAZ flags are set on deferred draw flushes.
Tested-by: José Fonseca <jfonseca@vmware.com>
2013-07-22 18:11:39 +01:00
José Fonseca
2a650611be llvmpipe: Remove lp_rast_get_num_threads().
Never called.

Trivial.
2013-07-22 18:08:39 +01:00
José Fonseca
190312949e scons: Don't use -z defs ld option on Mac.
Should fix fdo bug 67098.
2013-07-21 09:55:04 +01:00
Vinson Lee
cd90ebefd4 glsl: Initialize ast_function member variables.
Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-21 00:23:17 -07:00
Jeremy Huddleston Sequoia
fa5ed99d8e Apple: glFlush() is not needed with CGLFlushDrawable()
<rdar://problem/14496373>

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2013-07-20 10:25:28 -07:00
José Fonseca
b844c8e039 util/u_math: Define NAN/INFINITY macros for MSVC.
Untested. But should hopefully fix the build.
2013-07-20 00:31:18 +01:00
Zack Rusin
f59cb67376 llvmpipe/tests: update arith test to check for edge cases
Test infs, zeros and nans with our arith functions to assure
correct/defined behavior with those values.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-19 16:29:18 -04:00
Zack Rusin
f7c06785d0 gallivm: add a log function that handles edge cases
Same as log2_safe, which means that it can handle infs, 0s and
nans.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-19 16:29:18 -04:00
Zack Rusin
018c69ac56 gallivm: export unordered/ordered cmp to a common function
Only the floating point operators change; everything else
is the same, so it makes sense to share the code.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-19 16:29:18 -04:00
Zack Rusin
192c68b85a gallivm: handle -inf, inf and nan's in sin/cos instructions
sin/cos for anything not finite is nan and everything else has
to be between [-1, 1].
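
A minimal C sketch of that rule, for illustration only. The helper name
sanitize_sin_cos and the idea of post-processing an already computed
approximation are assumptions, not the gallivm code itself:

```c
#include <math.h>

/*
 * Hypothetical post-processing step for an approximated sin/cos value.
 * 'input' is the original argument, 'approx' is whatever the fast
 * approximation produced for it.
 */
static float sanitize_sin_cos(float input, float approx)
{
    /* -inf, +inf and NaN have no defined sine or cosine. */
    if (!isfinite(input))
        return NAN;

    /* Approximation error must not push the result outside [-1, 1]. */
    if (approx > 1.0f)
        return 1.0f;
    if (approx < -1.0f)
        return -1.0f;
    return approx;
}
```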

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-19 16:29:17 -04:00
Zack Rusin
13e2cd2f2c gallivm: add a version of log2 which handles edge cases
That means that if input is:
 * - less than zero (to and including -inf) then NaN will be returned
 * - equal to zero (-denorm, -0, +0 or +denorm), then -inf will be returned
 * - +infinity, then +infinity will be returned
 * - NaN, then NaN will be returned
It's a separate function because the checks are a little bit costly
and in most cases are likely unnecessary.
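
The table above could be sketched in plain C roughly as follows. Both
function names are hypothetical, and fast_log2_approx is only a crude
stand-in for a real fast approximation (it is exact for powers of two
and linear in between); the point is the order of the edge-case checks:

```c
#include <float.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/*
 * Crude stand-in for a fast log2 approximation (assumption, not the
 * gallivm polynomial). Only valid for positive, normal, finite inputs.
 */
static float fast_log2_approx(float x)
{
    uint32_t bits;
    float mantissa;
    int exponent;

    memcpy(&bits, &x, sizeof bits);
    exponent = (int)((bits >> 23) & 0xffu) - 127;

    /* Force the exponent field to 0 so the value lands in [1, 2). */
    bits = (bits & 0x007fffffu) | 0x3f800000u;
    memcpy(&mantissa, &bits, sizeof mantissa);

    /* Linear fit on [1, 2): exact at 1.0, rough elsewhere. */
    return (float)exponent + (mantissa - 1.0f);
}

/* Edge cases are filtered first, in the order the list above needs. */
static float log2_safe_sketch(float x)
{
    if (isnan(x))
        return NAN;                 /* NaN in, NaN out */
    if (x > -FLT_MIN && x < FLT_MIN)
        return -INFINITY;           /* -denorm, -0, +0, +denorm */
    if (x < 0.0f)
        return NAN;                 /* negative, including -inf */
    if (isinf(x))
        return INFINITY;            /* only +inf can reach this point */
    return fast_log2_approx(x);
}
```

The zero/denormal check has to come before the "less than zero" check,
otherwise -denorm and -0 would be classified as negative.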

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-19 16:29:17 -04:00
Zack Rusin
7b672c1503 gallivm: fix edge cases in exp2
exp(0) has to be exactly 1, exp(-inf) has to be 0, exp(inf) has
to be inf and exp(nan) has to be nan. This fixes all of those
cases.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-19 16:29:17 -04:00
Zack Rusin
ab47bbecd6 gallivm: handle nan's in min/max
Both D3D10 and OpenCL say that if one of the inputs is nan then
the other should be returned. To preserve that behavior
the patch fixes both the sse and the non-sse paths in both
functions and adds helper code for handling nans.
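
As a scalar illustration (function names are made up for this sketch,
and this is not the vectorized gallivm code): a bare compare-and-select
returns its second operand whenever either input is NaN, which is also
how the SSE min/max instructions behave, so an explicit NaN check is
needed to get the "return the other input" rule.

```c
#include <math.h>

/* Bare compare-and-select: yields b whenever a or b is NaN. */
static float min_compare_select(float a, float b)
{
    return a < b ? a : b;
}

/* NaN-aware variants: if one input is NaN, return the other one. */
static float min_nan_aware(float a, float b)
{
    if (isnan(a))
        return b;
    if (isnan(b))
        return a;
    return a < b ? a : b;
}

static float max_nan_aware(float a, float b)
{
    if (isnan(a))
        return b;
    if (isnan(b))
        return a;
    return a > b ? a : b;
}
```

If both inputs are NaN, the result is still NaN.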

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-19 16:29:17 -04:00
José Fonseca
719000bd7d scons: Disallow undefined symbols in Xlib libGL.so.
It's not the first time that, due to missing build dependencies or
incomplete commits, we end up with a broken libGL.so that's missing
symbols, causing all tests to fail catastrophically.

Instead, try to catch this sort of issue earlier.
2013-07-19 13:08:07 +01:00
Tomasz Lis
9f07ca11c1 mesa: Dispatch ARB_framebuffer_object and EXT_framebuffer_object differently
Almost all of the functions between the ARB and the EXT share the same
GLX protocol because the functionality is, essentially, identical.
However, there are some differences between the extensions:

- In the ARB extension, names must come from glGenFramebuffers or
  glGenRenderbuffers.

- In the ARB extension, framebuffer objects are not shared (but they are
  in the EXT).

For these reasons, glBindFramebuffer and glBindRenderbuffer have
different GLX protocol opcodes than their EXT counterparts.  Currently
these functions alias each other in the dispatch table.  This makes it
impossible to be truly spec conformant.

This patch enables fixing the conformance issue by splitting
glBindFramebuffer / glBindFramebufferEXT and glBindRenderbuffer /
glBindRenderbufferEXT into separate dispatch table entries.

Patches will be available shortly to:

- Fix the conformance issue.

- Stop advertising the EXT in OpenGL 3.1 (or core profiles).

HOWEVER, this does represent a compatibility break between the loader
(libGL or the Xserver GLX module) and the driver.  Mesa drivers compiled
without this change will request a single dispatch table entry for
glBindFramebuffer and glBindFramebufferEXT.  Since the updated loader
has different entries for each, the request will fail, and the driver
will die in a fire.

Drivers built with the change should continue to load fine on loaders
without the change.  In this case, the driver will separately ask for
entries for glBindFramebuffer and glBindFramebufferEXT, and the loader
will tell it the same location.  Since the loader in the server's GLX
module is not (yet) updated, this should not be a problem.  We also do
not advertise the ARB extension from the server, so, again, this should
not be a problem for the server.

HOWEVER, this means that DRI1 drivers (remember mga_dri.so?) will no
longer load with libGL built hereafter.  That means this patch will need
to be back ported to the 8.0 branch.
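
The loader/driver interaction described above can be modelled with a
small C sketch. Everything here is hypothetical (the table layout, the
offsets 0 and 1, and the names request_entry and lookup_offset); it only
mimics the rule that a request for a group of alias names fails when
the loader already maps those names to different slots:

```c
#include <stddef.h>
#include <string.h>

struct slot_entry {
    const char *name;
    int offset;
};

/* Loader without the change: both names alias one slot. */
static const struct slot_entry old_loader[2] = {
    { "glBindFramebuffer",    0 },
    { "glBindFramebufferEXT", 0 },
};

/* Loader with the change: each name has its own slot. */
static const struct slot_entry new_loader[2] = {
    { "glBindFramebuffer",    0 },
    { "glBindFramebufferEXT", 1 },
};

static int lookup_offset(const struct slot_entry *table, size_t count,
                         const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].offset;
    }
    return -1; /* name unknown to this loader */
}

/*
 * A driver asks for ONE slot shared by all names in 'aliases'.
 * Returns the slot, or -1 if the loader maps them to different slots.
 */
static int request_entry(const struct slot_entry *table, size_t count,
                         const char *const *aliases, size_t num_aliases)
{
    int slot = -1;

    for (size_t i = 0; i < num_aliases; i++) {
        int offset = lookup_offset(table, count, aliases[i]);
        if (offset < 0)
            continue;
        if (slot >= 0 && offset != slot)
            return -1; /* aliases disagree: request fails */
        slot = offset;
    }
    return slot;
}
```

In this model, an old driver passing both names in one request fails
against new_loader, while a new driver making two single-name requests
succeeds against either table and simply gets the same offset twice
from old_loader.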

v2 (idr): Added missing GLX protocol opcodes for the EXT functions and
corrected the opcodes for the ARB functions.  Updated GLX indirect_api
unit test and dispatch sanity unit test.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Bartosz Zawistowski <bartosz.l.zawistowski@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
2013-07-18 17:42:46 -07:00
Kenneth Graunke
adfd0123c8 st/mesa: Enable the ARB_shading_language_420pack extension for 1.30+.
Any driver that supports GLSL 1.30 should be able to handle this
extension, as it's entirely implemented in the GLSL compiler.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
46d9baf3e3 i965: Enable the GL_ARB_shading_language_420pack extension on Gen6+.
While all the work is in the shared GLSL compiler, this extension
requires GLSL 1.30, which is currently only supported on Gen6+.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
bfcec4618a glsl: Handle the binding qualifier for UBO variables.
layout(binding = N) is equivalent to calling glUniformBlockBinding(_,N).

This currently only handles the GLSL 1.40 case - no interface names, no
arrays of uniform blocks.  This is okay since we don't yet support GLSL
1.50, and don't expose ARB_shading_language_420pack in ES 3.0.

v2: Move into the other function; use binding, not constant_value.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
f25d94084c glsl: Propagate UBO binding qualifier into UBO member variables.
Without an instance name, there is no ir_variable representing the
actual uniform block declaration.  When the linker goes to set uniform
initializers, it only sees the members as ir_variables; never the block.

So, unfortunately, the members need to know about the binding.

There has to be a better way to do this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
34e2ccc9f0 glsl: Handle the binding qualifier for arrays of samplers.
Normally, uniform array variables are initialized by array literals.
That is, val->type->array_elements >= storage->array_elements.

However, samplers are different.  Consider a declaration such as:

   layout(binding = 5) uniform sampler2D[3];

The initializer value is a single integer (5), while the storage has 3
array elements.  The proper behavior here is to increment one for each
element; they should be initialized to 5, 6, and 7.

This patch introduces new code for sampler types which handles both
arrays of samplers and single samplers correctly.
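
For illustration, the increment-per-element rule looks like this in
plain C. The function name and the flat int array are assumptions made
for the sketch; the linker's real uniform storage is more involved:

```c
/*
 * Hypothetical helper: give each element of a sampler array its own
 * texture unit, starting at the declared binding. A single sampler is
 * just the array_elements == 1 case.
 */
static void assign_sampler_bindings(int binding, unsigned array_elements,
                                    int *units)
{
    for (unsigned i = 0; i < array_elements; i++)
        units[i] = binding + (int)i;
}
```

With binding = 5 and three elements this fills in 5, 6 and 7, matching
the example declaration.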

v2: Move into the other function; use binding, not constant_value.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
67038c6ba2 glsl: Add plumbing for handling uniform binding qualifiers.
Sampler uniforms and uniform blocks do not have a var->constant_value.
Instead, they have an integer var->binding value.

This makes extending set_uniform_initializer() somewhat problematic: it
assumes that there is an ir_constant * which represents the initializer,
and that it's safe to dereference that without any NULL checks.

Instead, this patch creates an analogous function for binding
qualifiers, and calls one or the other as appropriate.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
0a23ec2b6e glsl: Delete unused code for handling samplers in array-initializers.
There is existing code to handle sampler uniform initializers.  Prior to
GLSL 4.20's "binding" keyword, sampler uniforms don't have initializers
at all, so this is somewhat surprising.

The existing code is broken into two cases: one where both the variable and
initializer are arrays, and a second where the variable and initializer are
scalars.

The first case should never occur, since array-typed initializers do not
exist for sampler uniforms.  Even with the binding keyword, the
initializer is a single integer which represents the texture unit to use
for the first array element.

The second is apparently used for some fixed-function code.

v2: Rewrite the commit message - suggested by Paul.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
9a9a830b44 glsl: Cross-validate explicit binding points.
All compilation units need to agree on the binding point, if they
specify one at all.

v2: Use binding, not constant_value.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
d4375fc016 glsl: Propagate explicit binding information from AST to IR.
Rather than creating a new "binding" field in ir_variable, we reuse
constant_value since the linker code for handling uniform initializers
uses that.

Since UBOs and samplers can't otherwise have initializers/constant
values, there shouldn't be a conflict.

v2: Propagate the new binding variable around too.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
4da1504c0f glsl: Add ir_variable fields for explicit bindings.
These are not used yet, but they exist and are copied appropriately.

v2: Add an explicit "int binding" variable rather than reusing
    constant_value, as suggested by Paul Berry.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
5e5e12040b glsl: Add validation for the "binding" qualifier.
The "binding" qualifier only applies to UBO blocks and samplers, along
with arrays of those types.  (It would also apply to images and atomic
counters, but we don't support those yet.)

This also validates sampler bindings against the maximum number of
texture units, and UBO bindings against the number of uniform buffer
binding points.
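
A rough sketch of such a range check for samplers, in C. The function
name is hypothetical, and the treatment of arrays (every element from
binding through binding + N - 1 must be a valid unit) follows my reading
of the extension rather than anything stated in this message:

```c
#include <stdbool.h>

/*
 * Hypothetical check: does a sampler (or sampler array) declared with
 * layout(binding = N) fit within the available texture units?
 */
static bool sampler_binding_in_range(unsigned binding,
                                     unsigned array_elements,
                                     unsigned max_texture_units)
{
    /* A non-array sampler counts as one element. */
    if (array_elements == 0)
        array_elements = 1;

    /* Arranged to avoid unsigned overflow in binding + array_elements. */
    return array_elements <= max_texture_units &&
           binding <= max_texture_units - array_elements;
}
```

The same shape of check would apply to UBO bindings, with the number of
uniform buffer binding points as the limit.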

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
0418846a07 glsl: Parse the "binding" keyword and store it in ast_type_qualifier.
Nothing actually uses this yet.

v2: Remove >= 0 checks.  They'll be handled in later validation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
7f6a2d6937 glsl: Have the lexer return LAYOUT_TOK if 420pack is enabled.
GL_ARB_shading_language_420pack also provides layout qualifiers.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
56bcde34b2 glsl: Use has_layout() rather than a partial open coded version.
The idea of this code is to disallow layout(...) sections with the
deprecated "varying" or "attribute" keywords, unless a few select
extensions are enabled which allow a more relaxed check.

In order to detect a layout(...) section, the code checks for a number
of layout qualifiers.  However, it failed to check for all of them,
which could lead to layout(...) not being detected when it should.

By replacing this with has_layout(), we properly check for all layout
qualifiers, and also guarantee that new qualifiers added in the future
will not be forgotten.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
c397ec94e9 glsl: Relax auxiliary storage ordering requirements with 420pack.
These were already semi-relaxed, since the storage qualifier rule
already skipped when 420pack was enabled.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
b5d6c51e2b glsl: Handle centroid qualifier ordering in C code, not the parser.
The GL_ARB_shading_language_420pack extension/GLSL 4.20 split centroid
off into a new category, "auxiliary storage qualifiers," and allows these
to be placed anywhere in the series.  So we have to stop recognizing
"centroid in"/"centroid out"/"centroid varying" in the grammar and get
more creative.

The same approach used before works here, too.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
844307a584 glsl: Allow precision qualifiers to be flexibly ordered with 420pack.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
6eec502e84 glsl: Move precision handling to be part of qualifier handling.
This is necessary for the parser to be able to accept precision
qualifiers not immediately adjacent to the type, such as "const highp
inout float foo".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
308d4c7146 glsl: Change is_precision_statement to default_precision != none.
Currently, we store precision in ast_type_specifier, rather than
ast_type_qualifier.  This works because precision is the last qualifier,
and immediately adjacent to the type.

Default precision statements (such as "precision highp float") are
represented as ast_type_specifier objects, with a boolean to indicate
that it's a default precision statement rather than an ordinary type.

ast_type_specifier::precision will be moving to ast_type_qualifier soon,
in order to support arbitrary qualifier ordering.  However, we still
need to store a "this is a precision statement" flag /and/ the default
precision in ast_type_specifier.

This patch changes the boolean into a new field, default_precision.
If default_precision != ast_precision_none, it's a precision statement
with the specified precision.  Otherwise, it's an ordinary type.
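
In miniature, the idea is the following (C sketch; the type and enum
names are stand-ins, not the real ast_type_specifier definition):

```c
#include <stdbool.h>

/* Stand-in for the precision enum; NONE means "not set". */
enum precision_sketch {
    PRECISION_NONE = 0,
    PRECISION_HIGH,
    PRECISION_MEDIUM,
    PRECISION_LOW
};

/* Stand-in for the type specifier node. */
struct type_specifier_sketch {
    const char *type_name;
    enum precision_sketch default_precision;
};

/* The old boolean becomes a derived property of the new field. */
static bool is_precision_statement(const struct type_specifier_sketch *spec)
{
    return spec->default_precision != PRECISION_NONE;
}
```

One field now answers both questions: whether the node is a default
precision statement, and which precision it sets.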

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
7855482138 glsl: Disable ordering checks for const parameters with 420pack.
This makes the compiler accept both "const in" and "in const".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
293dfe5738 glsl: Handle "const" as a parameter qualifier.
This will make it easy to support both "const in" and "in const", as
required by GLSL 4.20/ARB_shading_language_420pack.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
a4d15a3cd9 glsl: Refactor parameter qualifier handling.
"Parameter direction qualifier" is a new term I invented just now; it's
not part of any GLSL specification.

This paves the way for handling multiple parameter qualifiers, in any order,
as required by GLSL 4.20/ARB_shading_language_420pack.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
83fe4f7019 glsl: Use merge_qualifier() when processing qualifier lists.
Most of ast_type_qualifier is simply a bitfield (represented as a
structure of unsigned:1 bits in a union with an unsigned).  However, it
also contains ARB_explicit_attrib_location's location/index fields.

In the past, this has worked by simply returning the layout qualifier's
ast_type_qualifier and merging the other bits into it.  However, that's
not obvious until you break it by switching $1 and $2.

Using merge_qualifier() copies them appropriately, and also properly
overrides layout qualifiers.  It also checks for duplicate qualifiers,
which renders some of the checks in the previous patch unnecessary.
However, those checks provide better error messages, such as "Duplicate
interpolation qualifier", rather than just "duplicate qualifier".
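
A simplified C sketch of the layout described above: a union of 1-bit
flags with an unsigned, plus a separate location field that a plain
bitwise OR would not carry over. All names are hypothetical, and unlike
the real merge_qualifier() this version treats every repeated flag as
an error instead of letting layout qualifiers override:

```c
#include <stdbool.h>

struct qualifier_sketch {
    union {
        struct {
            unsigned invariant:1;
            unsigned constant:1;
            unsigned uniform:1;
            unsigned explicit_location:1;
        } q;
        unsigned i; /* all flag bits viewed at once */
    } flags;
    int location;   /* only meaningful if explicit_location is set */
};

/* Merge 'src' into 'dst'; returns false on a duplicate qualifier. */
static bool merge_qualifier_sketch(struct qualifier_sketch *dst,
                                   const struct qualifier_sketch *src)
{
    /* Any bit set on both sides means the qualifier was repeated. */
    if (dst->flags.i & src->flags.i)
        return false;

    dst->flags.i |= src->flags.i;

    /* The non-bitfield payload has to be copied explicitly. */
    if (src->flags.q.explicit_location)
        dst->location = src->location;

    return true;
}
```

Because the copy is explicit, the result no longer depends on which
operand happened to carry the layout information.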

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
0cb90fcfbd glsl: Allow duplicate layout qualifiers with 420pack.
The new 4.20 rules explicitly allow multiple layout(...) sections.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
89f75e7e7b glsl: Disable ordering checks on most qualifiers for 420pack.
This makes the compiler accept invariant, storage, layout, and
interpolation qualifiers in any order when ARB_shading_language_420pack
is enabled.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
48e3bd33dc glsl: Handle most qualifier ordering in C code rather than the grammar.
The GL_ARB_shading_language_420pack extension/GLSL 4.20 allow qualifiers
to be specified in (basically) any order.  In order to support this, we
can't hardcode the ordering restrictions in the grammar.

This patch alters the grammar to accept invariant, storage, layout, and
interpolation qualifiers in any order, but adds C code to enforce the
ordering requirements.  In the 420pack case, we should be able to simply
skip the error checks.

As a bonus, this also lets us generate decent error messages, rather
than Bison's awful "unexpected TOKEN" errors.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
1b719df14d glsl: Add a new ast_type_qualifier::has_auxiliary_storage() method.
"Auxiliary storage qualifiers" is the new term given to "centroid",
"patch", and "sample" by GLSL 4.20/GL_ARB_shading_language_420pack.

Even though we only support "centroid", it's useful to add this now
so that all auxiliary storage qualifiers get handled in the right places
once they're eventually supported.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
eb30af51d6 glsl: Add a new ast_type_qualifier::has_storage() method.
This makes it easy to check if any storage qualifiers are set.

"centroid" is not considered a storage qualifier.  In the old language
rules, you can't specify "centroid" by itself; it's always "centroid
in", "centroid out", or "centroid varying."  So one of the other storage
qualifiers will always be set; there's no need to specifically check for
centroid.

In the new 4.20 rules, centroid is an auxiliary storage qualifier, not a
storage qualifier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
7cef2b22b8 glsl: Add a new ast_type_qualifier::has_layout() method.
This makes it easy to check if any layout qualifiers are set.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:21 -07:00
Kenneth Graunke
7ce5c6b214 i965: Combine URB code emission into a single group.
All four URB packets need to be programmed together in order for the GPU
state to be valid.  Putting them in separate BEGIN..ADVANCE blocks is
risky: if we're nearing the end of a batch, the batch could be flushed
in between two of the commands, causing the URB programming to be split
into two batchbuffers.

This -might- be okay with hardware contexts, but it offers no advantages
over keeping them together, and has a potential for hangs.

Putting them into a single BEGIN..ADVANCE block ensures they'll be kept
in the same batch, which seems wise.
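
A toy model of the problem, in C. The structure, the helper names and
the packet size of 2 dwords are all assumptions for illustration; the
only property borrowed from the driver is that reserving space may
flush the batch when the request does not fit:

```c
#define URB_PACKETS   4
#define PACKET_DWORDS 2 /* assumed size, for illustration only */

struct batch_sketch {
    int capacity; /* dwords per batch */
    int used;     /* dwords already emitted */
    int flushes;  /* number of flushes so far */
};

/* Reserve space; flush first if the request would not fit. */
static void batch_begin(struct batch_sketch *b, int dwords)
{
    if (b->used + dwords > b->capacity) {
        b->flushes++;
        b->used = 0;
    }
}

static void batch_emit(struct batch_sketch *b, int dwords)
{
    b->used += dwords;
}

/* One reservation per packet: a flush can land between packets. */
static int emit_urb_separately(struct batch_sketch *b)
{
    int flushes_at_first_packet = 0;

    for (int i = 0; i < URB_PACKETS; i++) {
        batch_begin(b, PACKET_DWORDS);
        if (i == 0)
            flushes_at_first_packet = b->flushes;
        batch_emit(b, PACKET_DWORDS);
    }
    /* Flushes that happened after the first packet was placed. */
    return b->flushes - flushes_at_first_packet;
}

/* One reservation for all packets: any flush happens up front. */
static int emit_urb_together(struct batch_sketch *b)
{
    int flushes_at_first_packet;

    batch_begin(b, URB_PACKETS * PACKET_DWORDS);
    flushes_at_first_packet = b->flushes;
    for (int i = 0; i < URB_PACKETS; i++)
        batch_emit(b, PACKET_DWORDS);
    return b->flushes - flushes_at_first_packet;
}
```

Starting from a 16-dword batch with 11 dwords used, the separate
version flushes after the second packet, while the combined version
flushes once before the first packet and keeps all four together.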

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-18 16:57:21 -07:00
Chad Versace
30f33deccb i965/hsw: Change L3 MOCS for depth, hiz, and stencil
Change from "not cacheable" to "cacheable" in L3.
Do so for the draw upload path and blorp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 16:18:22 -07:00
Chad Versace
2273b652bb i965/hsw: Change L3 MOCS of 3DSTATE_CONSTANT_VS/PS
Change from "not cacheable" to "cacheable" in L3.
Do so for the draw upload path and blorp.

In blorp, change only the PS packet, because the VS packet is disabled.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 16:18:22 -07:00
Chad Versace
2f346395f5 i965/hsw: Change L3 MOCS of SURFACE_STATE
Change from "not cacheable" to "cacheable" in L3.
Do so for the draw upload path and blorp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 16:18:21 -07:00
Chad Versace
a16d47465e i965/hsw: Change L3 MOCS of 3DSTATE_VERTEX_BUFFERS
Change from "not cacheable" to "cacheable" in L3.
Do so for the draw upload path and blorp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 16:18:21 -07:00
Tomasz Lis
eb83079b35 glx: Enable floating-point fbconfig extensions
Signed-off-by: Tomasz Lis <listom@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Ian Romanick
74cbe6e497 egl: Drop configs with unknown or invalid __DRI_ATTRIB_RENDER_TYPE
Some render types, such as floating-point, aren't valid with EGL.
Return NULL in those cases to drop them.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
c37c367d38 dri: Introduce new flags in __DRI_ATTRIB_RENDER_TYPE
Mark __DRI_ATTRIB_FLOAT_MODE as deprecated, and introduce new flags to
__DRI_ATTRIB_RENDER_TYPE for float modes.  Both signed float
(fbconfig_float) and unsigned (packed_float) are introduced. The old
attribute should be set for both float modes.

v2 (idr): Require that the render mode from the DRI attributes matches the
render mode of the config exactly.  This is the behavior of the old code.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
4473af7aca glx: Require proper drawableType in init_fbconfig_for_chooser
Make sure that init_fbconfig_for_chooser sets correct value of
drawableType for visual configs and fbconfigs.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
2eed9ff2fb glx: Validate the GLX_RENDER_TYPE value
Correctly handle the value of renderType in GLX context.  In case of the
value being incorrect, context creation fails.

v2 (idr): indirect_create_context is just a memory allocator, so don't
validate the GLX_RENDER_TYPE there.  Fixes regressions in several
GLX_ARB_create_context piglit tests.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
27c8aa5cfb glx: Store the RENDER_TYPE in indirect rendering
v2 (idr): Open-code the check for GLX_RENDER_TYPE.
dri2_convert_glx_attribs can't be called from here because that function
only exists in direct-rendering builds.  Also add a stub version of
indirect_create_context_attribs to tests/fake_glx_screen.cpp to prevent
'make check' regressions.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
1c748dff6b glx: Handling RENDER_TYPE in glXCreateContext and init_fbconfig_for_chooser
Set the correct values of renderType in glXCreateContext and
init_fbconfig_for_chooser.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
b8126c7c8a glx: Changes to visual configs initialization.
Correctly handle the value of renderType and drawableType in
fbconfig. Modify glXInitializeVisualConfigFromTags to read the parameter
value, or detect it if it's not there.

v2 (idr): If there was no GLX_RENDER_TYPE property, set the type based
purely on the rgbMode as the previous code did.  It is impossible for
floatMode to be set at this point, so we can't have a float config.  The
previous code regressed a large number of piglit GLX tests because those
tests don't set GLX_RENDER_TYPE in the glXChooseConfig call.  Restoring
the old behavior for that case fixes those regressions.

Also fix handling of GLX_DONT_CARE for GLX_RENDER_TYPE.  Fixes a
regression in glx-dont-care-mask.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
a92cd5b245 glx: Retrieve the value of RENDER_TYPE from GLX attribs array
Make sure that context creation routines are provided with the value of
RENDER_TYPE retrieved from GLX attribs.

v2 (idr): Minor formatting changes.  Change type of
dri2_convert_glx_attribs render_type parameter to uint32_t to silence
some GCC warnings.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
36259a16fe glx: Store the value of renderType while creating context
Make sure that renderType property value is stored in GLX context while
it's being created.  Further patches will be provided to make the value
correspond to fbconfig's renderType.

v2 (idr): Move a hunk from the next patch to this patch to prevent a
build break.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Kenneth Graunke
7791c9869b i965: Add #defines for Memory Object Control State fields on Gen7-7.5.
The L3 controls are identical on all platforms, but LLC differs:
- Ivybridge has a "cache in LLC" flag
- Baytrail has no LLC, but instead has a snoop bit:
  "data accesses in this page must be snooped in the CPU caches."
- Haswell has writeback/uncached flags for LLC and eLLC (eDRAM).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 16:03:19 -07:00
Fabian Bieler
6368478712 glsl/linker: Use correct array length when linking inter-stage uniforms and varyings.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Fabian Bieler <fabianbieler@fastmail.fm>
2013-07-18 14:12:44 -07:00
Mike Frysinger
73c9b4b0e0 gen_matypes: fix cross-compiling with gcc
The current gen_matypes logic assumes that the host compiler will produce
information that is useful for the target compiler.  Unfortunately, this
is not the case whenever cross-compiling.

When we detect that we're cross-compiling and using GCC, use the target
compiler to produce assembly from the gen_matypes.c source, then process
it with a shell script to create a usable header.  This is similar to how
the linux kernel creates its asm-offsets.c file.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2013-07-18 13:55:48 -07:00
Andreas Oberritter
a48be954ce ax_prog_flex.m4: change grep syntax to accept e.g. flex.real
This is required in case a wrapper or symlink is used. This patch
has also been sent upstream, awaiting moderation.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andreas Oberritter <obi@saftware.de>
2013-07-18 13:54:59 -07:00
Jonathan Liu
2da0bd0526 builtin_compiler/build: Avoid using libtool if cross compiling
Adds the dependencies of builtin_compiler as sources when cross
compiling instead of using libtool to share compilation with src/glsl.
The builtin_compiler executable is built for the host when cross
compiling so it doesn't make sense to share compilation with src/glsl
built for the target in this case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44618
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jonathan Liu <net147@gmail.com>
2013-07-18 13:54:20 -07:00
Kenneth Graunke
2b5b436615 i965: Add MOCS shift and mask for SURFACE_STATE entries.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 10:45:49 -07:00
Roland Scheidegger
4ef19f7fec llvmpipe: clamp inputs for srgb render buffers
Usually with fixed point renderbuffers clamping is done as part of conversion.
However, since we blend in float format, we essentially skip all conversion
steps pre-blend but since this is still a fixed point renderbuffer we must
still clamp the inputs in this case. Makes no difference for piglit though.
Obviously we could skip this if fragment color clamping is enabled, but a)
this is deprecated in OpenGL (d3d never had it) and b) we don't support it
natively so it gets baked into the shader.
Also add some comment about logic ops being broken for srgb, luckily no test
tries to do that as there's no easy fix...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-07-18 19:04:20 +02:00
Roland Scheidegger
e57b98bad3 llvmpipe: fix blending with SRC_ALPHA_SATURATE with some formats without alpha
We were fixing up the blend factor to ZERO, however this only works correctly
with fixed point render buffers where the input values are clamped to 0/1
(because src_alpha_saturate is min(As, 1-Ad) so can be negative with unclamped
inputs). Haven't seen any failure anywhere due to that with fixed point SNORM
buffers (which clamp inputs to -1/1) but it should apply there as well (snorm
blending is rare, even opengl 4.3 doesn't require snorm rendertargets at all,
d3d10 requires them but they are not blendable).
Doesn't look like piglit hits this though (some internal testing hits the
float case at least). (With legacy OpenGL we could theoretically still use the
fixup to zero if the fragment color clamp is enabled, but we can't detect that
easily since we don't support native clamping hence it gets baked into the
shader.)
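
A small numeric illustration in C (hypothetical function name; it
assumes that a missing destination alpha channel reads back as 1, which
is why the factor was being replaced by ZERO in the first place):

```c
/* SRC_ALPHA_SATURATE factor as given above: min(As, 1 - Ad). */
static float src_alpha_saturate_factor(float src_alpha, float dst_alpha)
{
    float one_minus_dst = 1.0f - dst_alpha;

    return src_alpha < one_minus_dst ? src_alpha : one_minus_dst;
}
```

With dst_alpha = 1, a clamped source alpha of 0.5 gives min(0.5, 0) = 0,
so substituting ZERO is harmless. An unclamped source alpha of -0.25
gives min(-0.25, 0) = -0.25, which ZERO would get wrong.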

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-07-18 19:03:35 +02:00
Marek Olšák
0d7f087483 r600g: use WAIT_3D_IDLE before using CP DMA
I broke this with 7948ed1250 for r700 at least.
2013-07-18 14:27:34 +02:00
Jonathan Gray
0b405f364f r300g: make use of gallium's os_get_process_name()
Lets the code compile on non-Linux systems.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2013-07-18 14:04:48 +02:00
Jean-Sébastien Pédron
148f0deb06 configure.ac: On some systems, "x86-64" is called "amd64"
For instance, this is the case on FreeBSD.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 23:10:23 -07:00
Ilia Mirkin
fbdae1ca41 nv50: H.264/MPEG2 decoding support via VP2, available on NV84-NV96, NVA0
Adds H.264 and MPEG2 codec support via VP2, using firmware from the
blob. Acceleration is supported at the bitstream level for H.264 and
IDCT level for MPEG2.

Known issues:
 - H.264 interlaced doesn't render properly
 - H.264 shows very occasional artifacts on a small fraction of videos
 - MPEG2 + VDPAU shows frequent but small artifacts, which aren't there
   when using XvMC on the same videos

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-07-18 07:52:32 +02:00
Jonathan Gray
f96c07abf6 configure.ac: make grep tests more portable
Use grep -w instead of the empty string escape sequences
which are less portable.  Makes the grep tests
function as intended on OpenBSD.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 22:50:19 -07:00
Jonathan Gray
78fbb41fe3 configure.ac: add OpenBSD
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 21:06:46 -07:00
Vinson Lee
21f97446f4 glsl: Remove comma at end of enumerator list.
Fixes this build error on OpenBSD 5.3.

In file included from ../../src/mesa/main/ff_fragment_shader.cpp:53:
./../glsl/ir_optimization.h:64: error: comma at end of enumerator list

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 20:57:54 -07:00
Vinson Lee
77311dab3a mesa: Remove commas at end of enumerator lists.
Fixes these build errors on OpenBSD 5.3.

In file included from ../../src/mesa/main/errors.h:47,
                 from ../../src/mesa/main/imports.h:41,
                 from ../../src/mesa/main/ff_fragment_shader.cpp:32:
../../src/mesa/main/mtypes.h:3286: error: comma at end of enumerator list
../../src/mesa/main/mtypes.h:3296: error: comma at end of enumerator list
../../src/mesa/main/mtypes.h:3303: error: comma at end of enumerator list
../../src/mesa/main/mtypes.h:3356: error: comma at end of enumerator list

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 20:57:53 -07:00
Carl Worth
ceaf1a74cb docs: Import 9.1.5 release notes
And add news item for the release.
2013-07-17 20:11:02 -07:00
Roland Scheidegger
7fd30a8621 gallivm: (trivial) simplify lp_build_cos/lp_build_sin a tiny bit
Use "or" instead of "add" (this is a classic select sequence, which at
least newer llvm versions can actually recognize (3.2+?), and the "add"
might prevent that - and we really don't want an add instead of an or with
avx if it isn't recognized (even without avx logic ops might be cheaper)).
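
A small sketch of why the two forms are interchangeable, written as plain C on 32-bit integers rather than as gallivm IR (function names are assumptions for this example). In a select sequence the two masked terms never have a set bit in common, so adding them produces no carry and gives the same bits as or-ing them. The "or" form is the one an optimizer can match as a select.

```c
#include <stdint.h>

/* Select written with "or": the two masked terms never share a set bit. */
static uint32_t
select_or(uint32_t mask, uint32_t a, uint32_t b)
{
   return (a & mask) | (b & ~mask);
}

/* Same result with "add": no bit overlaps, so no carry is ever produced. */
static uint32_t
select_add(uint32_t mask, uint32_t a, uint32_t b)
{
   return (a & mask) + (b & ~mask);
}
```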

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-17 18:16:34 +02:00
Roland Scheidegger
f0f9fb59c3 util/u_format_s3tc: handle srgb formats correctly.
Instead of just ignoring the srgb/linear conversions, simply call the
corresponding conversion functions, for all of pack/unpack/fetch,
both for float and unorm8 versions (though some don't make a whole
lot of sense, i.e. unorm8/unorm8 srgb/linear combinations).
Refactored some functions a bit so we don't have to duplicate all the code
(there's a slight change for packing dxt1_rgb, as there will now be
always 4 components initialized and sent to the external compression
function so the same code can be used for all, the quite horrid and
ad-hoc interface (by now) should always have worked with that).

Fixes llvmpipe/softpipe piglit texwrap GL_EXT_texture_sRGB-s3tc.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-17 18:16:27 +02:00
Vadim Girlin
07baf9cfd1 r600g/sb: improve alu packing on cayman
Scheduler/register allocator in r600-sb was developed and optimized
on evergreen (VLIW-5) hardware, so currently it's not optimal for
VLIW-4 chips.
This patch should improve performance on cayman gpus due to better alu
packing, but also it tends to increase register usage, so overall positive
effect on performance has to be proven by real benchmarks yet.

Some results with bfgminer kernel on cayman:
source bytecode:       60 gprs, 3905 alu groups,
sbcl before the patch: 45 gprs, 4088 alu groups,
sbcl with this patch:  55 gprs, 3474 alu groups.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:29:56 +04:00
Vadim Girlin
ba7fa4c4c9 r600g/sb: fix handling of new multislot instructions on cayman
Ex-scalar instructions that became multislot on cayman do replicate result
to all channels - handle them similar to DOT4.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:27:31 +04:00
Vadim Girlin
033eec4145 r600g/sb: fix debug dump code in scheduler
Update the stale debug code for other changes related to debug output.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:27:31 +04:00
Vadim Girlin
44ebe7291c r600g/sb: fix initial register allocation
Mark values that are members of the 'same register' constraint as
preallocated in ra_init pass, this will prevent incorrect
reallocation in scheduler in some cases.

Should fix https://bugs.freedesktop.org/show_bug.cgi?id=66713

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:27:30 +04:00
Vadim Girlin
f0d881106a r600g/sb: move chip & class name functions to sb_context
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:27:30 +04:00
Vadim Girlin
96efa4cdf4 r600g/sb: fix handling of PS in source bytecode on cayman
Actually PS doesn't make sense for cayman and isn't even mentioned in
cayman docs, but llvm backend currently uses it in bytecode and, assuming
that hw seems to be mostly ok with it, this will allow sb to parse such
source bytecode correctly.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:27:30 +04:00
Vinson Lee
81d3881367 r600g/sb: Initialize ra_checker member variables.
Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 18:27:30 +04:00
Emil Velikov
b20e0fb520 gallium/util: use explicitly sized types for {un, }pack_rgba_{s, u}int
Every function but the above four uses explicitly sized types for their
src and dst arguments. Even fetch_rgba_{s,u}int follows the convention.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2013-07-17 13:01:46 +02:00
Kyle McMartin
87c3440567 llvmpipe: use MCJIT on ARM and AArch64
MCJIT is the only supported LLVM JIT on AArch64 and ARM (the regular
JIT has bit-rotted badly on ARM and doesn't exist on AArch64.)

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
2013-07-17 17:29:01 +10:00
Kenneth Graunke
00d32cd5b4 glsl: Fix absurd whitespace conventions in the parser.
Historically, we indented grammar production rules with a single 8-space
tab, but code inside of blocks used Mesa's 3-space indents.

This meant when editing code, you had to use an 8-space tab for the
first level of indentation, and 3-spaces after that.  Unless you
specifically configure your editor to understand this, it will get the
indentation wrong on every single line you touch, which quickly devolves
into a colossal waste of time.

It's also inconsistent with every other file in the entire project.

This patch removes all tabs and moves to a consistent 3-space indent.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-07-16 11:31:58 -07:00
Kenneth Graunke
4ab7fc9ec3 glsl: Fail the build if the grammar contains shift/reduce errors.
When working on a parser, it's very easy to accidentally introduce
new shift/reduce conflicts.  Failing the build guarantees they'll
be noticed and fixed.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-07-16 11:31:58 -07:00
Kenneth Graunke
73620709c9 glsl: Silence the last shift/reduce conflict warning in the grammar.
The single remaining shift/reduce conflict was the classic ELSE problem:

  292 selection_rest_statement: statement . ELSE statement
  293                         | statement .

    ELSE  shift, and go to state 479

    ELSE      [reduce using rule 293 (selection_rest_statement)]
    $default  reduce using rule 293 (selection_rest_statement)

The correct behavior here is to shift, which is what happens by default.
However, resolving it explicitly will make it possible to fail the build
on new errors, making them much easier to detect.

The classic way to solve this is to use right associativity:
http://www.gnu.org/software/bison/manual/html_node/Non-Operators.html

Since there is no THEN token in GLSL, we need to fake one.  %right THEN
creates a new terminal symbol; the %prec directive says to use the
precedence of that terminal.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-07-16 11:31:58 -07:00
Vinson Lee
fa7829c36b glsl: Initialize ast_jump_statement::opt_return_value.
opt_return_value was not initialized if mode != ast_return.

Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-16 09:03:02 -07:00
Vinson Lee
f74acb9835 glapi: Do not use backtrace on OpenBSD.
execinfo.h is not available on OpenBSD.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-16 09:00:38 -07:00
Maarten Lankhorst
b20b2b6dc8 osmesa: link against static libglapi library too to get the gl exports
This should fix missing symbols in an osmesa build against a shared
glapi. All OpenGL exports that are defined in the static glapi were
missing, so link against both to fix this.

This is a candidate for the stable series.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47824
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-07-16 10:18:40 +02:00
Chris Forbes
121ea0b38b i965/Gen4: Zero extra coordinates for ir_tex
We always emit U,V,R coordinates for this message, but the sampler gets
very angry if we pass garbage in the R coordinate for at least some
texture formats.

Fill the remaining coordinates with zero instead.

Fixes broken rendering on GM45 in Source games, and in VDrift.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65236

NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-16 19:08:41 +12:00
Kenneth Graunke
e4fdf1b008 i965: Cite the Ivybridge PRM for 3DSTATE_CLEAR_PARAMS notes.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:53 -07:00
Kenneth Graunke
b72a298751 i965: Refer people to brw_tex_layout.c rather than the BSpec.
brw_tex_layout.c sets up the align_w/h fields, and has all the
appropriate spec references already.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:53 -07:00
Kenneth Graunke
4b704424e0 i965: Remove old BSpec reference from BLORP's 3DSTATE_WM/PS packets.
The Sandybridge code had a citation for the range of the "Maximum Number
of Threads" field, and the Ivybridge code just mentioned the "BSpec" in
general.  That's documented in the obvious place, so people can find it
without a spec reference.

The real value of the comment is to say "we tried zero, and it exploded,
so program it to a valid number even if pixel shading is off."

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
ada110716a i965: Cite the Ivybridge PRM for 3DSTATE_URB_* programming.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
90b5a03581 i965: Update workaround flush comments for Gen6 3DSTATE_VS.
Unfortunately, the workaround text never made it into the Sandybridge
PRM, so we still have to refer to the BSpec.

It also wasn't obvious why we needed this workaround at all, since we
don't currently do VS passthrough - but BLORP can turn off the VS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
3b3a440d2b i965: Cite the Ivybridge PRM for VS PIPE_CONTROL workarounds.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
9a86875c6b i965: Cite the Sandybridge PRM for Gen7 stencil pitch requirements.
Sadly, the Ivybridge PRM can't be cited, as it is missing the relevant
text for some reason.  However, the Sandybridge PRM has the text Chad
originally quoted, and the modern BSpec has the same text.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
2e928e2a3f i965: Cite the Ivybridge PRM for multisample surface format notes.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
43ea434225 i965: Delete "the data cache is the sampler cache" comments on Gen7+.
I cut and pasted these comments from the Gen4 code during Ivybridge
enabling, and didn't understand what they meant at the time.

The data cache is NOT the same as the sampler cache on Ivybridge.
The sampler cache has L1 and L2 caches in addition to the L3 cache,
while data port messages to the "data cache" hit L3 directly.

This means that the sampler domain is technically wrong, but we stopped
caring about read/write domains quite a while ago.  The kernel just
flushes all the caches at the end of each batchbuffer, and our render to
texture code flushes the sampler caches when necessary.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
3f64cfabfc i965: Cite the 965 PRM for "the data cache is the sampler cache".
Presumably, this comment exists to justify the usage of
I915_GEM_DOMAIN_SAMPLER for this relocation.  At one point, this was
necessary to ensure that the right flushing was done to keep caches
coherent.  These days, the kernel just flushes everything, so I don't
think it matters.

Still, the comment is interesting, so leave it in place.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:51 -07:00
Kenneth Graunke
f254c94204 i965: Cite the Ivybridge PRM for DP message descriptor fields.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:51 -07:00
Kenneth Graunke
a0c8e76202 i965: Cite the Ivybridge PRM for why the fake MRF range is what it is.
The exact text is in the public docs, so we should cite those.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:51 -07:00
Kenneth Graunke
3090d39dde i965: Cite the Ivybridge PRM for SFID enum values.
The Ivybridge PRM adds new SFIDs and lists them in a different volume
than Sandybridge, so it's worth adding a reference.

I also removed the BSpec reference, as the section it referred to
was moved somewhere, and I couldn't find it.  This leaves one Haswell
SFID without a citation, but we can add one once the PRMs are out.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:51 -07:00
Roland Scheidegger
dc1cc928ed llvmpipe: support sRGB framebuffers
Just use the new conversion functions to do the work. The way it's plugged
into the blend code is quite hacktastic but follows all the same hacks
as used by packed float format already.
Only support 4x8bit srgb formats (rgba/rgbx plus swizzle), 24bit formats never
worked anyway in the blend code and are thus disabled, and I don't think anyone
is interested in L8/L8A8. Would need even more hacks otherwise.
Unless I'm missing something, this is the last feature except MSAA needed for
OpenGL 3.0, and for OpenGL 3.1 as well I believe.

v2: prettify a bit, use separate function for packing.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-16 01:54:51 +02:00
Marek Olšák
a882067d74 Revert "r300g: allow HiZ with a 16-bit zbuffer"
This reverts commit 631c631cbf.

https://bugs.freedesktop.org/show_bug.cgi?id=66921

Cc: mesa-stable@lists.freedesktop.org
2013-07-15 23:46:01 +02:00
Marek Olšák
7969b567bd r300g/swtcl: fix a lockup in MSAA resolve
Cc: mesa-stable@lists.freedesktop.org
2013-07-15 23:45:22 +02:00
Marek Olšák
22427640b2 r300g/swtcl: fix geometry corruption by uploading indices to a buffer
The splitting of a draw call into several draw commands was broken, because
the split sometimes took place in the middle of a primitive. The splitting
was supposed to be dealing with the case when there are more indices than
the maximum size of a CS.

This commit throws that code away and uses a real index buffer instead.

https://bugs.freedesktop.org/show_bug.cgi?id=66558

Cc: mesa-stable@lists.freedesktop.org
2013-07-15 23:45:16 +02:00
Matt Turner
c889df3fbe glsl: Reject C-style initializers with unknown types.
_mesa_ast_set_aggregate_type walks through declarations initialized with
C-style aggregate initializers and stops when it runs out of LHS
declarations or RHS expressions.

In the example

   vec4 v = {{{1, 2, 3, 4}}};

_mesa_ast_set_aggregate_type would not recurse into the subexpressions
(since vec4s do not contain types that can be initialized with an
aggregate initializer) to set their <constructor_type>s. Later in ::hir
we would dereference the NULL pointer and segfault.

If <constructor_type> is NULL in ::hir we know that the LHS and RHS
were unbalanced and the code is illegal.

Arrays, structs, and matrices were unaffected.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-07-15 13:02:36 -07:00
Paul Berry
7706e52b25 glsl: Rework builtin_variables.cpp to reduce code duplication.
Previously, we had a separate function for setting up the built-in
variables for each combination of shader stage and GLSL version
(e.g. generate_110_vs_variables to generate the built-in variables for
GLSL 1.10 vertex shaders).  The functions called each other in ad-hoc
ways, leading to unexpected inconsistencies (for example,
generate_120_fs_variables was called for GLSL versions 1.20 and above,
but generate_130_fs_variables was called only for GLSL version 1.30).
In addition, it led to a lot of code duplication, since many varyings
had to be duplicated in both the FS and VS code paths.  With the
advent of geometry shaders (and later, tessellation control and
tessellation evaluation shaders), this code duplication was going to
get a lot worse.

So this patch reworks things so that instead of having a separate
function for each shader type and GLSL version, we have a function for
constants, one for uniforms, one for varyings, and one for the special
variables that are specific to each shader type.

In addition, we use a class, builtin_variable_generator, to keep track
of the instruction exec_list, the GLSL parse state, commonly-used
types, and a few other variables, so that we don't have to pass them
around as function arguments.  This makes the code a lot more compact.

Where it was feasible to do so without introducing compilation errors,
I've also gone ahead and introduced the variables needed for
{ARB,EXT}_geometry_shader4 style geometry shaders.  This patch takes
care of everything except the GS variable gl_VerticesIn, the FS
variable gl_PrimitiveID, and GLSL 1.50 style geometry shader inputs
(using the gl_in interface block).  Those remaining features will be
added later.

I've also made a slight nomenclature change: previously we used the
word "deprecated" to refer to variables which are marked in GLSL 1.40
as requiring the ARB_compatibility extension, and are marked in GLSL
1.50 onward as requiring the compatibility profile.  This was
misleading, since not all deprecated variables require the
compatibility profile (for example gl_FragData and gl_FragColor, which
have been deprecated since GLSL 1.30, but do not require the
compatibility profile until GLSL 4.20).  We now consistently use the
word "compatibility" to refer to these variables.

This patch doesn't introduce any functional changes (since geometry
shaders haven't been enabled yet).

Reviewed-by: Matt Turner <mattst88@gmail.com>

v2: Rename "typ" -> "type".  Add blank line between inline functions
and declarations in builtin_variable_generator class.  Use the
standard comment "/* FALLTHROUGH */" for compatibility with static
code analysis tools.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 09:35:28 -07:00
Paul Berry
428e030210 glsl: Fix lower_named_interface_blocks to account for dereferences of consts.
In certain rare cases (such as those involving dereference of a
literal constant array of structs),
flatten_named_interface_blocks_declarations's rvalue visitor may be
invoked on an ir_dereference_record whose variable_referenced() method
returns NULL.

Check for this case to avoid a segfault.

Prevents crashes in piglit tests
{vs,fs}-deref-literal-array-of-structs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-07-15 07:59:52 -07:00
Paul Berry
b2265db8e7 glsl: Don't allow vertex shader input arrays until GLSL 1.50.
Vertex shader inputs are not allowed to be arrays until GLSL 1.50.  We
were accidentally enabling them for GLSL 1.40 (although we haven't
written any tests for them, so it's not clear whether they actually
work).

NOTE: although this is a simple bug fix, it probably isn't sensible to
cherry-pick it to stable release branches, since its only effect is to
cause incorrectly-written shaders to fail to compile.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 07:50:47 -07:00
Chris Forbes
b616d01661 i965: Gen4/5: use IEEE floating point mode for GLSL shaders.
Fixes isinf(), isnan() from GLSL 1.30

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-14 19:58:25 +12:00
Chris Forbes
1ec66f2fb2 i965/vs: Gen4/5: enable front colors if back colors are written
Fixes undefined results if a back color is written, but the
corresponding front color is not, and only backfacing primitives are
drawn. Results are still undefined if a frontfacing primitive is drawn,
but that's OK.

The other reasonable way to fix this would have been to just pick
the one color slot that was populated, but that dilutes the value of
the tests.

On Gen6+, the fixed function clipper and triangle setup already take
care of this.

Fixes 11 piglits:
spec/glsl-1.10/execution/interpolation/interpolation-none-gl_Back*Color-*

NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-14 19:58:11 +12:00
Roland Scheidegger
796b73d1fe gallivm: (trivial) use constant instead of exp2f() function
Some lame compilers can't do exp2f() and as far as I can tell they can't do
exp2() (with doubles) either, so instead of providing some workaround for
that (wouldn't actually be too bad, just replace with pow), and since it is
used with a constant only, just use the precalculated constant.
2013-07-14 02:39:33 +02:00
Chia-I Wu
62c546bbf8 ilo: skip 3DSTATE_INDEX_BUFFER when possible
When only the offset to the index buffer is changed, we can skip the
3DSTATE_INDEX_BUFFER if we always use 0 for the offset, and add
(offset / index_size) to Start Vertex Location in 3DPRIMITIVE.
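
The arithmetic can be sketched as follows. This is an illustration only: struct draw_info and start_vertex_location() are hypothetical names, not the ilo driver's own structures.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical draw parameters; not the ilo driver's structures. */
struct draw_info {
   uint32_t ib_offset;   /* byte offset of the first index in the buffer */
   uint32_t index_size;  /* bytes per index: 1, 2 or 4 */
   uint32_t start;       /* first index to draw, relative to ib_offset */
};

/* Start location to use when the index buffer is always bound at offset 0. */
static uint32_t
start_vertex_location(const struct draw_info *info)
{
   /* Folding only works if the byte offset is a whole number of indices. */
   assert(info->ib_offset % info->index_size == 0);
   return info->start + info->ib_offset / info->index_size;
}
```

For a 64-byte offset into a buffer of 16-bit indices and a start of 10, the start location becomes 10 + 64 / 2 = 42, and the buffer binding itself does not change.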
2013-07-14 05:59:52 +08:00
Roland Scheidegger
6bcbb0dc82 gallivm: handle srgb-to-linear and linear-to-srgb conversions
srgb-to-linear is using 3rd degree polynomial for now which should be _just_
good enough. Reverse is using some rational polynomials and is quite accurate,
though not hooked into llvmpipe's blend code yet and hence unused (untested).
Using a table might also be an option (for srgb-to-linear especially).
This does not enable any new features yet because EXT_texture_srgb was already
supported via util_format fallbacks, but performance was lacking probably due
to the external function call (the table used by the util_format_srgb code may
not be all that much slower on its own).
Some performance figures (taken from modified gloss, replaced both base and
sphere texture to use GL_SRGB instead of GL_RGB, measured on 1GHz Sandy Bridge,
the numbers aren't terribly accurate):

normal gloss, aos, 8-wide: 47 fps
normal gloss, aos, 4-wide: 48 fps

normal gloss, forced to soa, 8-wide: 48 fps
normal gloss, forced to soa, 4-wide: 47 fps

patched gloss, old code, soa, 8-wide: 21 fps
patched gloss, old code, soa, 4-wide: 24 fps

patched gloss, new code, soa, 8-wide: 41 fps
patched gloss, new code, soa, 4-wide: 38 fps

So there's a performance hit but it seems acceptable, certainly better
than using the fallback.
Note the new code only works for 4x8bit srgb formats, others (L8/L8A8) will
continue to use the old util_format fallback, because I can't be bothered
to write code for formats no one uses anyway (as decoding is done as part of
lp_build_unpack_rgba_soa which can only handle block type width of 32).
Compressed srgb formats should get their own path though eventually (it is
going to be expensive in any case, first decompress, then convert).
No piglit regressions.

v2: use lp_build_polynomial instead of ad-hoc polynomial construction, also
since keeping both linear to srgb functions for now make sure both are
compiled (since they share quite some code just integrate into the same
function).

v3: formatting fixes and bugfix in the complicated (disabled) linear-to-srgb
path.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-13 18:42:17 +02:00
Roland Scheidegger
9b8d97e5bf gallivm: better support for fast rsqrt
We had to disable fast rsqrt before because it wasn't precise enough etc.
However in situations when we know we're not going to need more precision
we can still use a fast rsqrt (which can be several times faster than
the quite expensive sqrt). Hence introduce a new helper which does exactly
that - it is probably not useful calling it in some situations if there's
no fast rsqrt available so make it queryable if it's available too.

v2: use fast_rsqrt consistently instead of rsqrt_fast, fix indentation,
let rsqrt use fast_rsqrt.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-13 18:42:17 +02:00
Klemens Baum
45574ab2e9 configure.ac: better detection of LLVM version
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-12 21:20:59 -07:00
Vinson Lee
b0c3c955ae r600g/sb: Initialize ra_constraint::cost.
Fixes "Uninitialized scalar field" reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-13 06:57:26 +04:00
Vinson Lee
be8d787873 glsl: Initialize ast_aggregate_initializer::constructor_type.
Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-12 18:42:46 -07:00
Paul Berry
c6bfe62e21 glsl: Make gl_TexCoord compatibility-only
gl_TexCoord was deprecated in GLSL 1.30.  In GLSL 1.40 it was marked
as ARB_compatibility-only, and in GLSL 1.50 and above it was marked as
only appearing in the compatibility profile.  It has never appeared in
GLSL ES.

However, Mesa erroneously included it in all desktop versions of GLSL,
even versions 1.40 and 1.50 (which do not currently support the
compatibility profile).  This patch makes gl_TexCoord available in the
compatibility profile (and GLSL versions 1.30 and prior) only.

NOTE: although this is a simple bug fix, it probably isn't sensible to
cherry-pick it to stable release branches, since its only effect is to
cause incorrectly-written shaders to fail to compile.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-12 18:18:49 -07:00
Paul Berry
8f51d68f8c glsl ES: Fix magnitude of gl_MaxVertexUniformVectors.
Previously, we set it equal to MaxVertexUniformComponents.  It should
be MaxVertexUniformComponents / 4.

NOTE: This is a candidate for the stable branches.

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-12 18:18:48 -07:00
Marek Olšák
06b38dbab2 winsys/radeon: allow a NULL cs pointer in radeon_bo_map to fix a segfault
The original idea was that cs=NULL should be allowed here, but we never used
NULL until 862f69fbe1. This fixes a segfault in CoreBreach.
2013-07-13 02:38:23 +02:00
Chia-I Wu
8d4ac98549 ilo: move a sanity check into its assert()
The compiler does not know that ilo_3d_pipeline_estimate_size() is pure and
can be eliminated in a release build in gen6_pipeline_end().  Move the call
into the assert().
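
A generic sketch of the pattern, assuming a made-up estimate_size() in place of the real function. When NDEBUG is defined, assert() expands to nothing, so a call written inside it disappears together with the check, whereas a call stored in a local variable first is still executed.

```c
#include <assert.h>

static int calls;   /* counts how often the estimate is computed */

/* Hypothetical stand-in for an expensive size estimate. */
static int
estimate_size(void)
{
   calls++;
   return 128;
}

/* Before: the call is always executed, even when assert() compiles to nothing. */
static void
end_before(int used)
{
   int max = estimate_size();
   assert(used <= max);
   (void) max;    /* avoids an unused-variable warning under NDEBUG */
   (void) used;
}

/* After: with NDEBUG defined, the whole expression, call included, disappears. */
static void
end_after(int used)
{
   assert(used <= estimate_size());
   (void) used;
}
```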
2013-07-13 07:27:28 +08:00
Chia-I Wu
bf9670270f ilo: mark some states dirty when they are really changed
The checks may seem redundant because cso_context handles them, but
util_blitter does not have access to cso_context.
2013-07-13 06:43:53 +08:00
Chia-I Wu
9047598a8d ilo: clean up ilo_blitter_pipe_begin()
Document why certain states need to be saved, and fix a bug when blitting with
scissor enabled.
2013-07-13 06:43:53 +08:00
Alex Deucher
e0a7565832 r600g: don't use the CB/DB CP COHER logic on r6xx
There are hw bugs.  Flush and inv event is sufficient.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=66837

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-12 18:07:56 -04:00
Jonathan Liu
af16f73051 configure: Avoid use of AC_CHECK_FILE for cross compiling
The AC_CHECK_FILE macro can't be used for cross compiling as it will
result in "error: cannot check for file existence when cross compiling".
Replace it with the AS_IF macro.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jonathan Liu <net147@gmail.com>
2013-07-12 13:21:28 -07:00
Brian Paul
bf86e0e050 nv30: fix KILL_IF breakage
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66858
2013-07-12 10:00:18 -06:00
Zack Rusin
00cd455bd5 gallium: fixup definitions of the rsq and sqrt
GLSL spec says that rsq is undefined for src<=0, but the D3D10
spec says it needs to be a NaN, so let's stop taking an absolute
value of the source which completely breaks that behavior. For
the gl program we can simply insert an extra abs instruction
which produces the desired behavior there.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-11 20:19:04 -04:00
José Fonseca
a171812d27 util/u_format: Comment out half float denormal test case.
So that lp_test_format doesn't fail until we decide what should be done.
2013-07-12 15:48:38 +01:00
José Fonseca
1b0d29b5da gallivm: Eliminate redundant lp_build_select calls.
lp_build_cmp already returns 0 / ~0, so the lp_build_select call is
unnecessary.
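
In scalar C terms (a sketch with assumed names, not the gallivm helpers themselves): if the comparison already yields an all-ones or all-zeros mask, then selecting between ~0 and 0 with that mask returns the mask unchanged.

```c
#include <stdint.h>

/* Comparison producing an all-ones or all-zeros mask, like a SIMD compare. */
static uint32_t
cmp_less(float a, float b)
{
   return a < b ? ~UINT32_C(0) : UINT32_C(0);
}

/* Bitwise select: bits of a where mask is set, bits of b elsewhere. */
static uint32_t
select_bits(uint32_t mask, uint32_t a, uint32_t b)
{
   return (mask & a) | (~mask & b);
}
```

Here select_bits(m, ~0, 0) equals m for either value cmp_less() can return, so the extra select adds nothing.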

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-12 15:40:16 +01:00
Brian Paul
46205ab8cc tgsi: rename the TGSI fragment kill opcodes
TGSI_OPCODE_KIL and KILP had confusing names.  The former was conditional
kill (if any src component < 0).  The latter was unconditional kill.
At one time KILP was supposed to work with NV-style condition
codes/predicates but we never had that in TGSI.

This patch renames both opcodes:
  TGSI_OPCODE_KIL -> KILL_IF   (kill if src.xyzw < 0)
  TGSI_OPCODE_KILP -> KILL     (unconditional kill)
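
The semantics of the two renamed opcodes, as described above, can be sketched in plain C. The function names op_kill_if() and op_kill() are invented for this illustration and are not part of the TGSI interpreter.

```c
#include <stdbool.h>

/* KILL_IF (formerly KIL): discard when any of the four components is negative. */
static bool
op_kill_if(const float src[4])
{
   for (int i = 0; i < 4; i++) {
      if (src[i] < 0.0f)
         return true;
   }
   return false;
}

/* KILL (formerly KILP): discard unconditionally. */
static bool
op_kill(void)
{
   return true;
}
```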

Note: I didn't just transpose the opcode names to help ensure that I
didn't miss updating any code anywhere.

I believe I've updated all the relevant code and comments but I'm
not 100% sure that some drivers had this right in the first place.
For example, the radeon driver might have llvm.AMDGPU.kill and
llvm.AMDGPU.kilp mixed up.  Driver authors should review their code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-12 08:32:51 -06:00
Brian Paul
f501baabdb tgsi: fix-up KILP comments
KILP is really unconditional fragment kill.

We've had KIL and KILP transposed forever.  I'll fix that next.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-12 08:32:51 -06:00
Brian Paul
e7c3898725 tgsi: exec TGSI_OPCODE_SQRT as a scalar instruction, not vector
To align with the docs and the state tracker.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-12 08:32:51 -06:00
Brian Paul
f3fad24b62 tgsi: use X component of the second operand in exec_scalar_binary()
The code happened to work in the past since the (scalar) src args
effectively always have a swizzle of .xxxx, .yyyy, .zzzz, or .wwww so
whether you grab the X or Y component doesn't really matter.  Just
fixing the code to make it look right.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-12 08:32:51 -06:00
Brian Paul
cb2de08f27 mesa: update glext.h to version 20130708
This update fixes the problem with duplicated typedefs for
GLclampf and GLclampd in the previous version.

It also changes some parameter types for glDebugMessageCallbackARB()
and glTransformFeedbackVaryingsEXT().

Note we should someday update the glapi-gen code so that it
understands void pointer parameters.  Currently, the Python code
only understands "GLvoid *" but not "void *".  Luckily, the
compilers don't seem to complain about mixing GLvoid and void.
2013-07-12 08:32:51 -06:00
Brian Paul
5749aea255 mesa: fix Address Sanitizer (ASan) issue in _mesa_add_parameter()
If the size argument isn't a multiple of four, we would have read/
copied uninitialized memory.
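
A minimal sketch of the general idea, assuming values are stored in 4-component slots. copy_padded is a hypothetical helper and not the actual change made to _mesa_add_parameter():

```c
#include <string.h>

/* Copy `size` floats into 4-component slots. The tail of the last slot is
 * zero-filled, so nothing past the end of `values` is ever read. */
static void copy_padded(float (*slots)[4], const float *values, unsigned size)
{
    unsigned n_slots = (size + 3) / 4;

    memset(slots, 0, n_slots * sizeof(slots[0]));
    memcpy(slots, values, size * sizeof(float));
}
```

With size = 5, two slots are written: the second holds the fifth value followed by three zeros.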

Fixes an issue reported by Myles C. Maxfield <myles.maxfield@gmail.com>
2013-07-12 08:32:51 -06:00
Brian Paul
9ca026e220 mesa: simplify some _mesa_IsEnabled() queries
No need to test array->Enabled != 0 since the Enabled field can
only be 0 or 1.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-12 08:32:50 -06:00
Brian Paul
9fc532a263 os: add os_get_process_name() function
v2: explicitly test for BSD/APPLE, #warning for unexpected
environments.
2013-07-12 08:32:50 -06:00
Brian Paul
3fb3e1e38c mesa: whitespace, formatting, 80-column wrapping 2013-07-12 08:32:22 -06:00
Brian Paul
919236f3a2 softpipe: silence some MSVC warnings 2013-07-12 08:19:52 -06:00
Brian Paul
76666b9394 hud: silence some MSVC warnings 2013-07-12 08:19:52 -06:00
Brian Paul
d7a852b3a1 util: add casts to silence MSVC warnings in u_blit.c 2013-07-12 08:19:51 -06:00
Brian Paul
c45d8f2e98 tgsi: s/unsigned/int/ to silence MSVC warning 2013-07-12 08:19:50 -06:00
Brian Paul
2cfd768473 mesa: s/unsigned/int/ to fix MSVC warning in uniforms.c 2013-07-12 08:19:50 -06:00
Brian Paul
5b0fbf1b0b mesa: s/GLuint/GLint/ to silence MSVC warning in texstore.c 2013-07-12 08:19:50 -06:00
Brian Paul
721f47227e mesa: add casts to fix MSVC warnings in multisample.c 2013-07-12 08:19:49 -06:00
Brian Paul
528e5b9476 mesa: s/GLint/GLuint/ to fix MSVC warnings in mipmap.c 2013-07-12 08:19:49 -06:00
Brian Paul
738337356b mesa: fix inconsistent function declaration, definitions
To silence MSVC warnings that the declaration and definitions
were different.
2013-07-12 08:19:49 -06:00
Brian Paul
8ba5c79d2c mesa: add cast to silence MSVC warning 2013-07-12 08:19:49 -06:00
Christian König
1681bd7f2b radeon/uvd: fall back to shader based decoding for MPEG2 on UVD 2.x v2
UVD 2.x doesn't support hardware decoding of MPEG2, just use shader
based decoding for those chipsets.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=66450

v2: fix interlacing as well

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-07-12 10:52:27 +02:00
José Fonseca
649ef4da30 glsl: Avoid variable length arrays.
In C++ they are a non-standard GCC extension, and they are not widely
supported by other C/C++ compilers.

Use a dynamic array instead.
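
For illustration only, the same substitution in plain C (the glsl code itself is C++, and sum_squares is a made-up example, not code from this patch):

```c
#include <stdlib.h>

/* Sum of i*i for i in [0, n), using a heap buffer where a variable length
 * array (`unsigned long tmp[n];`) might otherwise have been used. */
static int sum_squares(unsigned n, unsigned long *out)
{
    unsigned long *tmp = malloc((n ? n : 1) * sizeof(*tmp));
    unsigned long sum = 0;

    if (tmp == NULL)
        return -1;

    for (unsigned i = 0; i < n; i++) {
        tmp[i] = (unsigned long)i * i;
        sum += tmp[i];
    }

    free(tmp);
    *out = sum;
    return 0;
}
```

The cost is an allocation and an explicit free, in exchange for code that builds on compilers without VLA support.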

Trivial. Should fix the MSVC build.
2013-07-12 09:28:22 +01:00
Matt Turner
1b0d6aef03 glsl: Add support for C-style initializers.
Required by GL_ARB_shading_language_420pack.

Parts based on work done by Todd Previte and Ken Graunke, implementing
basic support for C-style initializers of arrays.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
ae79e86d4c glsl: Add infrastructure for aggregate initializers.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
8d45caaeba glsl: Add an is_declaration field to ast_struct_specifier.
Will be used in a later commit to differentiate between a structure type
declaration and a variable declaration of a struct type. I.e., the
difference between

   struct S { float x; }; (is_declaration = true)

and

   S s;                   (is_declaration = false)

Also note that is_declaration = true for

   struct S { float x; } s;

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
5df807b06f glsl: Track structs' ast_type_specifiers in symbol table.
Will be used in a future commit. An ast_type_specifier is stored (rather
than an ast_struct_specifier) with the idea that we may have more
general uses for this in the future. struct names are prefixed with
'#ast.' to avoid collisions with the glsl_types in the symbol table.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
e641b5fbee glsl: Add process_vec_mat_constructor() function.
Based largely on process_array_constructor().

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
af2987d5b6 glsl: Separate code into process_record_constructor().
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
a760c73853 glsl: Add copy-constructor for ast_struct_specifier.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
43757135b2 glsl: Add a constructor for ast_type_specifier.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
b85f0c5121 glsl: Clean up and clarify comment explaining initializer rules.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
ce2464a8a7 glsl: Change type of is_array to bool.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
361206771c glsl: Add a comment to note what an exec_list is a list of.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
46b74ca7bc glsl: Fix inverted conditional in error message.
The code float a[2] = float[2]( 3.4, 4.2, 5.0 ); previously generated
this:

   error: array constructor must have at least 2 parameters

when in fact it requires exactly two.
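
A sketch of how such a message might be chosen, assuming an unsized array is represented by length 0. format_ctor_error is a hypothetical helper, not the function touched by this patch:

```c
#include <stdio.h>

/* Format the parameter-count error. An unsized array (length 0) accepts any
 * count >= min_param; a sized array needs exactly min_param parameters.
 * Returns the snprintf() result. */
static int format_ctor_error(char *buf, size_t len,
                             unsigned array_length, unsigned min_param)
{
    return snprintf(buf, len,
                    "array constructor must have %s %u parameter%s",
                    (array_length == 0) ? "at least" : "exactly",
                    min_param,
                    (min_param <= 1) ? "" : "s");
}
```

For float[2](...), array_length and min_param are both 2, so the message reads "exactly 2 parameters".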

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
9749d96817 glsl: Add missing return error_value(ctx) in error path.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
e117eda251 glsl: Remove unnecessary #include from ast_type.cpp.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Chia-I Wu
93742d9757 glsl/build: build builtin_compiler with VISIBILITY_CFLAGS
libglslcore.la and libglcpp.la that are built with builtin_compiler are also
linked to by drivers not using libdricore.  Since there is no public symbol in
them, it is better to mark all symbols hidden.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-12 09:42:25 +08:00
Matt Turner
08c90f651b glsl: Add comment explaining "row_major" parsing.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-11 16:22:07 -07:00
Matt Turner
14ed9018de glsl: Mark "row_major" as not a reserved word in GLSL ES 3.0.
We mark ARB_uniform_buffer_object as enabled under ES 3 since it
contains that functionality, which tricked the compiler into tokenizing
"row_major".

Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 16:22:07 -07:00
Matt Turner
c30948517e glsl: Remove outdated FINISHME comment.
Explicit index support was added by commit 1256a5dc.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 16:22:07 -07:00
Alex Deucher
77300bacaf radeon: bump libdrm_radeon requirement for CIK support
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-11 19:11:44 -04:00
Christoph Bumiller
9974593dfb r600g: x/y coordinates must be divided by block dim in dma blit
Note: this is a candidate for the 9.1 branch.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-11 19:11:44 -04:00
Chih-Wei Huang
1d9271a95c r600g/sb: Fix Android build v2
Add the sb CXX files to the Android Makefile and also stop using some
c++11 features.

v2 (Vadim Girlin): use &bc[0] instead of bc.begin()
2013-07-12 01:11:04 +04:00
Vadim Girlin
758ac6f918 r600g/sb: improve math optimizations v2
This patch adds support for some math optimizations that are generally
considered unsafe, which is why they are currently disabled for compute
shaders.

GL requirements are less strict, so they are enabled for GL shaders by
default. In case of any issues with applications that rely on higher
precision than guaranteed by GL, the 'sbsafemath' option in R600_DEBUG
disables them.

v2 - always set proper src vector size for transformed instructions
   - check for clamp modifier in the expr_handler::fold_assoc

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-11 23:01:01 +04:00
Jonathan Gray
c451619dde st/xvmc/tests: avoid non portable error.h functions
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-07-11 09:52:00 +02:00
Anuj Phogat
9a1a67b081 i965/blorp: Fix clear rectangle alignment in fast color clear
From BSpec: 3D-Media-GPGPU Engine > 3D Pipeline > Pixel >
Pixel Backend > MCS Buffer for Render Target(s) [DevIVB+]:
[DevHSW:GT3]: Clear rectangle must be aligned to two times
the number of pixels in the table shown below...
Observed no piglit, gles3conform regressions with this patch.
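
A minimal sketch of aligning a clear rectangle outward to a given granularity. The alignment values are passed in by the caller and are placeholders here, since the BSpec table is not reproduced in this message. The struct and function names are hypothetical:

```c
/* Round v down or up to a multiple of align (align must be > 0). */
static unsigned align_down(unsigned v, unsigned align)
{
    return v - v % align;
}

static unsigned align_up(unsigned v, unsigned align)
{
    return align_down(v + align - 1, align);
}

struct clear_rect {
    unsigned x0, y0, x1, y1;
};

/* Grow the rectangle so every edge lands on an (ax, ay) boundary. */
static struct clear_rect align_clear_rect(struct clear_rect r,
                                          unsigned ax, unsigned ay)
{
    r.x0 = align_down(r.x0, ax);
    r.y0 = align_down(r.y0, ay);
    r.x1 = align_up(r.x1, ax);
    r.y1 = align_up(r.y1, ay);
    return r;
}
```

On the platform named above, the quoted text implies calling this with twice the per-format table value.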

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65744
2013-07-10 18:41:16 -07:00
Chia-I Wu
ad244884fc winsys/intel: build with VISIBILITY_CFLAGS
There is no public symbol in this winsys.
2013-07-11 09:03:59 +08:00
Chia-I Wu
79bc245c01 ilo: reduce PIPE_CAP_MAX_TEXTURE_CUBE_LEVELS to 12
So that there are at most (2^22 * 6) texels, lower than the 2^26 limit.
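
The arithmetic can be checked with a small sketch, assuming that N mip levels imply a largest face of 2^(N-1) texels on a side (cube_base_texels is an illustrative helper):

```c
#include <stdint.h>

/* Texels in the base level of a cube map with `levels` mip levels,
 * assuming the largest face is 2^(levels-1) texels on a side. */
static uint64_t cube_base_texels(unsigned levels)
{
    uint64_t side = UINT64_C(1) << (levels - 1);

    return side * side * 6;
}
```

Under that assumption, 12 levels give 2048 x 2048 x 6 = 25165824 texels, below 2^26 = 67108864, while 13 levels would exceed it.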
2013-07-11 08:03:27 +08:00
Chia-I Wu
29af29b8dc ilo: correctly initialize undefined registers in fs
Initialize all 4 channels of undefined registers (that is, TEMPs that are used
before being assigned) in FS.
2013-07-11 07:01:51 +08:00
Michel Dänzer
a06ee5a09e radeonsi: Handle TGSI_OPCODE_DDX/Y using local memory
16 more little piglits.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-10 18:40:32 +02:00
Michel Dänzer
a6b83c0f23 radeonsi: Handle TGSI_OPCODE_TXD
One more little piglit.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-10 12:16:38 +02:00
José Fonseca
b042aae70d util/u_math: Use xmmintrin.h whenever possible.
It seems __builtin_ia32_ldmxcsr is only available on gcc and only when
-msse is used. xmmintrin.h/pmmintrin.h provide portable intrinsics, but
these too are only available with gcc when -msse/-msse3 are set.

scons build always sets -msse on x86 builds, but autotools doesn't seem
to.

We could try to get this working on gcc x86 without -msse by emitting
assembly, but I believe that in this day and age we really should be
building Mesa with -msse and -msse2.
2013-07-10 07:56:17 +01:00
Chia-I Wu
045bf0db52 ilo: honor surface padding requirements
The PRM specifies several padding requirements that we failed to honor.
2013-07-10 12:40:22 +08:00
Zack Rusin
63386b2f66 util: treat denorm'ed floats like zero
The D3D10 spec is very explicit about treatment of denorm floats and
the behavior is exactly the same for them as it would be for -0 or
+0. This makes our shading code match that behavior, since OpenGL
doesn't care and on a few CPUs it's faster (worst case the same).
Float16 conversions will likely break but we'll fix them in a follow
up commit.
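
To illustrate the intended behavior only, here is a per-value software sketch. This message does not say how the patch achieves it, and flush_denorm is a hypothetical helper:

```c
#include <math.h>

/* Treat a denormal (subnormal) float as a zero of the same sign;
 * every other value passes through unchanged. */
static float flush_denorm(float x)
{
    if (fpclassify(x) == FP_SUBNORMAL)
        return (x < 0.0f) ? -0.0f : 0.0f;
    return x;
}
```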

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-09 23:30:55 -04:00
Matt Turner
80bc14370a mesa: Set ProfileMask properly for core profile.
Fixes MESA_GL_VERSION_OVERRIDE=3.2 egl-create-context-verify-gl-flavor.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-09 14:19:22 -07:00
Kenneth Graunke
8c9a54e7bc i965: Delete intel_context entirely.
This makes brw_context inherit directly from gl_context; that was the
only thing left in intel_context.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:35 -07:00
Kenneth Graunke
53631be4eb i965: Move intel_context::gen and gt fields to brw_context.
Most functions no longer use intel_context, so this patch additionally
removes the local "intel" variables to avoid compiler warnings.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:34 -07:00
Kenneth Graunke
2e26afb37b i965: Move intel_context::has_llc to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:33 -07:00
Kenneth Graunke
794de2f387 i965: Move intel_context::is_<platform> flags to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:31 -07:00
Kenneth Graunke
44fd490067 i965: Move must_use/has_separate_stencil fields to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:30 -07:00
Kenneth Graunke
3b80b147f6 i965: Move intel_context::has_hiz to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:29 -07:00
Kenneth Graunke
351d2add62 i965: Free brw, not intel.
Things worked out in the past because both brw and intel share the same
memory address (by virtue of intel being the first member of brw).

However, brw is what actually gets rzalloc'd (brw_context.c:285), so
freeing that seems safer and more obvious.
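
A small sketch of the layout property being relied on, with made-up struct names standing in for intel_context and brw_context:

```c
#include <stdlib.h>

struct base_ctx {
    int id;
};

struct derived_ctx {
    struct base_ctx base;   /* first member: same address as the struct */
    int extra;
};

/* Returns 1 if the first member aliases the containing struct,
 * -1 on allocation failure. */
static int base_aliases_derived(void)
{
    struct derived_ctx *d = calloc(1, sizeof(*d));
    int same;

    if (d == NULL)
        return -1;

    same = ((void *)&d->base == (void *)d);
    free(d);   /* free the pointer that was actually allocated */
    return same;
}
```

C guarantees that a pointer to a struct, suitably converted, points to its first member, so freeing either pointer happened to work. Freeing the allocated pointer keeps working if the layout changes.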

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:28 -07:00
Kenneth Graunke
e3c2bb1eb4 i965: Shorten context base class dereference chains.
ctx->DrawBuffer is much more sensible than brw->intel.ctx.DrawBuffer.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:26 -07:00
Kenneth Graunke
d5b4a3f5a3 i965: Move intel_context::has_swizzling to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:25 -07:00
Kenneth Graunke
02128c448d i965: Move intel_context::intelScreen to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:24 -07:00
Kenneth Graunke
44a11eab9c i965: Delete unused intel_context::driFd field.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:23 -07:00
Kenneth Graunke
e0858763bc i965: Store brw_context as the DRI driver private, not intel_context.
Right now, they're interchangeable.  In the future, intel_context will
either go away or change purpose.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:21 -07:00
Kenneth Graunke
a1d94cdb00 i965: Move intel_context::driContext to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:20 -07:00
Kenneth Graunke
a9d33dbbdd i965: Move intel_context::NewGLState to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:19 -07:00
Kenneth Graunke
dd54558d31 i965: Move intel_context::upload to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:17 -07:00
Kenneth Graunke
0273e6e23e i965: Move intel_context::max_gtt_map_object_size to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:16 -07:00
Kenneth Graunke
b15f1fc3c6 i965: Move intel_context::perf_debug to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:14 -07:00
Kenneth Graunke
7c3180a4ad i965: Move intel_context::no_batch_wrap to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:13 -07:00
Kenneth Graunke
5314afa27a i965: Move intel_context's framerate throttling fields to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:12 -07:00
Kenneth Graunke
ec995de6fb i965: Move intel_context::stats_wm to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:10 -07:00
Kenneth Graunke
329779a0b4 i965: Move intel_context::batch to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:08 -07:00
Kenneth Graunke
5d8186ac1a i965: Move intel_context::hw_ctx to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:07 -07:00
Kenneth Graunke
eeb75b41f1 i965: Move intel_context::bufmgr to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:05 -07:00
Kenneth Graunke
e33439045d i965: Move intel_context's driconf flags to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:04 -07:00
Kenneth Graunke
fe0a8cb30d i965: Move intel_context::reduced_primitive to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:03 -07:00
Kenneth Graunke
9147b40496 i965: Move front buffer rendering fields from intel_context to brw.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:01 -07:00
Kenneth Graunke
e43043c316 i965: Move intel_context::vtbl to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:58 -07:00
Kenneth Graunke
fbdd3891e1 i965: Move intel_context::optionCache to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:55 -07:00
Kenneth Graunke
ca437579b3 i965: Pass brw_context to functions rather than intel_context.
This makes brw_context available in every function that used
intel_context.  This makes it possible to start migrating fields from
intel_context to brw_context.

Surprisingly, this actually removes some code, as functions that use
OUT_BATCH don't need to declare "intel"; they just use "brw."

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:53 -07:00
Kenneth Graunke
86f2711722 i965: Remove pointless intel_context parameter from try_copy_propagate.
It's already part of the visitor class.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:51 -07:00
Kenneth Graunke
18a223d323 i965: Add forward declarations of brw_context to a few places.
These files have forward declarations for intel_context.  This makes
brw_context available in the same places without further #include
monkeying.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:50 -07:00
Kenneth Graunke
a69274454b i965: Replace #include "intel_context.h" with brw_context.h.
brw_context.h includes intel_context.h, but additionally makes the
brw_context structure available.  Switching this allows us to start
using brw_context in more places.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:48 -07:00
Kenneth Graunke
99ebf9d07a i965: Move ctx->Const setup from intelInitContext to the new helper.
This also requires moving _mesa_init_point() to after the ctx->Const
initialization.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:47 -07:00
Kenneth Graunke
963d9f78a4 i965: Split code to set ctx->Const values into a helper function.
brwCreateContext() has a lot of random things to do.  Factoring out the
part that initializes ctx->Const values and shader compiler options
makes the main function a bit easier to read.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:45 -07:00
Kenneth Graunke
d13c120573 i915: Remove i965+ chip names.
i965+ chipsets shouldn't ever hit this driver.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:44 -07:00
Kenneth Graunke
e4f3d5cdcf i965: Remove i915 chip names.
i915 chipsets shouldn't ever hit this driver.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:42 -07:00
Kenneth Graunke
2921390666 i965: Replace intel_context:needs_ff_sync with intel->gen == 5.
Technically, needs_ff_sync was set on Gen5+, but it was only consulted
in the clipper threads and quad/lineloop decomposition code, which are
both Gen4-5 only.  So in reality it only identified Ironlake.
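
The equivalence can be spelled out with a tiny sketch (function names are illustrative, not driver code):

```c
#include <stdbool.h>

/* Old flag: set on Gen5 and later. */
static bool old_needs_ff_sync(int gen)
{
    return gen >= 5;
}

/* New check: Ironlake only. */
static bool new_gen5_check(int gen)
{
    return gen == 5;
}

/* The two agree on every generation that reaches the Gen4-5 only code. */
static bool equivalent_on_gen4_5(void)
{
    for (int gen = 4; gen <= 5; gen++) {
        if (old_needs_ff_sync(gen) != new_gen5_check(gen))
            return false;
    }
    return true;
}
```

The two only differ for Gen6+, which never runs the affected paths.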

The named flag doesn't really clarify things, and seems like overkill.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:07:13 -07:00
Kenneth Graunke
968c57782d i965: Add missing newline to blorp color clear perf_debug message.
perf_debug() doesn't add a newline for you; without this, all the
INTEL_DEBUG=perf output was jumbled together.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-09 10:10:46 -07:00
Emil Velikov
f0260f4e3d glsl: Silence unused variable warning in the release build
Resolves the following gcc warning

 opt_flip_matrices.cpp:84:32: warning: unused variable 'deref'

v2: keep the variable, but wrap it in an #ifndef NDEBUG block
    (suggested by Ian)
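
The pattern looks roughly like this in plain C (a made-up function, not the code in opt_flip_matrices.cpp):

```c
#include <assert.h>
#include <stddef.h>

/* Return the first element. `checked` exists only for the assert. */
static int deref_first(const int *values)
{
#ifndef NDEBUG
    /* Hidden from release builds, where assert() expands to nothing and
     * the variable would otherwise be reported as unused. */
    const int *checked = values;
#endif

    assert(checked != NULL);
    return values[0];
}
```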

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-08 19:08:42 -07:00
Emil Velikov
4df6823f21 glsl/ast: Silence uninitialized variable warnings in the release build
Resolves the following gcc warnings

 warning: 'iface_type_name' may be used uninitialized in this function
 warning: 'var_mode' may be used uninitialized in this function

Note: The variables are initialised to UNKNOWN and ir_var_auto

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-08 19:08:30 -07:00
Paul Berry
292368570a i965: Add an assertion to brwProgramStringNotify.
driver->ProgramStringNotify is only called for ARB programs, fixed
function vertex programs, and ir_to_mesa (which isn't used by the i965
back-end).  Therefore, even after geometry shaders are added,
brwProgramStringNotify should only ever be called with a target of
GL_VERTEX_PROGRAM_ARB or GL_FRAGMENT_PROGRAM_ARB.

This patch adds an assertion to clarify that.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-08 14:18:02 -07:00
Matt Turner
ba7b60d3e4 glsl: Allow non-constant expression initializers of const-qualified vars.
Required by ARB_shading_language_420pack.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-08 12:46:56 -07:00
Marek Olšák
1faa375573 r600g: improve the mechanism for recognizing an empty CS
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
287b2fa115 r600g: explicitly flush caches for streamout-based buffer copying & clearing
It's done automatically for vertex buffers, but not for constant buffers,
textures, and colorbuffers.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
7948ed1250 r600g: only flush the caches that need to be flushed during CP DMA operations
This should increase performance if constant uploads are done with the CP DMA,
because only the cache that needs to be flushed is flushed.
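
A sketch of the idea with hypothetical flag and enum names (the real r600g flag names are not given in this message):

```c
/* One bit per read cache, so each can be flushed on its own. */
enum {
    FLUSH_VERTEX_CACHE = 1u << 0,
    FLUSH_TEX_CACHE    = 1u << 1,
    FLUSH_CONST_CACHE  = 1u << 2
};

enum buffer_use {
    USE_VERTEX,
    USE_TEXTURE,
    USE_CONSTANT
};

/* Pick only the cache that can hold stale data for this kind of buffer. */
static unsigned flush_flags_for(enum buffer_use use)
{
    switch (use) {
    case USE_VERTEX:   return FLUSH_VERTEX_CACHE;
    case USE_TEXTURE:  return FLUSH_TEX_CACHE;
    case USE_CONSTANT: return FLUSH_CONST_CACHE;
    }
    return 0;
}
```

A constant upload then requests only the constant cache flush instead of invalidating every read cache.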

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
1b40398d02 r600g: split INVAL_READ_CACHES into vertex, tex, and const cache flags
also flushing any cache in evergreen_emit_cs_shader seems to be superfluous
(we don't flush caches when changing the other shaders either)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Alex Deucher
098316211c r600g: adjust flush flags (v3)
1. flush SH with read caches
2. add flag for DB flushes
3. add flag for CB flushes

v2: flush all CBs, remove redundant emit_state variable.
v3: Marek: also set the new flags in r600_context_flush, the CP dma functions,
    and texture_barrier, and rename them

Signed-off-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
862f69fbe1 r600g: don't call buffer_wait in buffer_mmap_sync_with_rings
The winsys should do this, because it measures how much time we spend
in buffer_map doing synchronization, which can be viewed with the gallium
HUD.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
94d294137e r600g: don't read back the MSAA depth buffer if the read flag is not set
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
141b892620 r600g: don't flush the context in texture_transfer_map
the winsys does this automatically

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
ae87aae0c4 r600g: fix texture offset computation for mapped MSAA depth buffers
It was wrong, because the offset shouldn't be applied to MSAA depth buffers.
This small cleanup should prevent such issues in the future.

This fixes a lockup in "piglit/fbo-depthstencil default_fb -samples=n".

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
a3263cca59 r600g: fix color resolve for RGBX8 and RGBX16 integer formats
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
b1a061b81e r600g: enable fast MSAA color clear for array/3D/cube textures
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
87669c3654 r600g: implement fast MSAA color clear for integer textures
this also fixes the fast clear with multiple colorbuffers and each having
a different format

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Christian König
085c695488 r600/uvd: fix check for UVD 2.x
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-07-08 19:51:20 +02:00
Chris Forbes
1415a1884c i965: fix alpha test for MRT
Include src0 alpha in the RT write message when using MRT, so it is used
for the alpha test instead of the normal per-RT alpha value.

Fixes broken rendering in Dota2 under Wine [FDO #62647].

No Piglit regressions on Ivybridge.

V2: reuse (and simplify) existing sample_alpha_to_coverage flag in
the FS key, rather than adding another redundant one.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62647
NOTE: This is a candidate for the stable branches.
2013-07-06 12:41:54 +12:00
Roland Scheidegger
9ef49cfd84 gallivm: (trivial) fix using one lod instead of per-quad lod for texel fetch
The logic for choosing number of lods was bogus.
(The code should ultimately handle the case of only one lod even with multiple
quads but currently can't.)
2013-07-05 18:07:51 +02:00
José Fonseca
45f174ce40 gallivm: Remove bogus assert.
It is perfectly valid for the swizzle to be bigger than 2. For example the
texel offsets could be

  SAMPLE ..., IMM[0].zzz

What is not correct is for chan_index to be bigger than 2.

Trivial.
2013-07-05 14:35:54 +01:00
Ben Skeggs
c29c6b2b2e nvc0: enable very initial support for nvf0 (GK110)
Shaders need a lot of work still.  Basic stuff generally works, so this
is basically just fine for gnome-shell, OA etc at this point.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2013-07-05 14:15:04 +10:00
Roland Scheidegger
4dbca8672b gallivm: (trivial) fix bogus assertion for per-element lod with 1d resources
The assertion was always broken but the code unused until enabling the
per-element lod code. Fixes piglit texelFetch vs isampler1D and similar
tests (only run with GL 3.0 version override).
2013-07-05 01:19:23 +02:00
Roland Scheidegger
f3bbf65929 gallivm: do per-pixel lod calculations for explicit lod
d3d10 requires per-pixel lod calculations for explicit lod, lod bias and
explicit derivatives, and we should probably do it for OpenGL too - at least
if they are used from vertex or geometry shaders (so doesn't apply to lod
bias) this doesn't just affect neighboring pixels.
Some code was already there to handle this so fix it up and enable it.
There will no doubt be a performance hit, unfortunately. We could do better
if we knew we had a real vector shift instruction (with variable shift
count), but this requires AVX2 on x86 (or an AMD Bulldozer family CPU).
Don't do anything for lod bias and explicit derivatives yet, though
no special magic should be needed for them either.
Likewise, the size query is still broken just the same.

v2: Use information if lod is a (broadcast) scalar or not. The idea would be
to base this on the actual value, for now just pretend it's a scalar in fs
and not a scalar otherwise (so, per-pixel lod is only used in gs/vs but same
code is generated for fs as before).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-04 19:42:04 +02:00
Zack Rusin
bbd1e60198 draw: fix overflows in the indexed rendering paths
The semantics for overflow detection are a bit tricky with
indexed rendering. If the base index in the elements array
overflows, then the index of the first element should be used,
if the index with bias overflows then it should be treated
like a normal overflow. Also, overflows need to be checked for
in all paths that use either the bias or the starting index location.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-03 09:06:30 -04:00
Zack Rusin
09820902d7 draw/llvm: index overflows if it's greater than elt max
The comparison, incorrectly, was greater-than-or-equal to
elt max.
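
The corrected boundary can be sketched in plain C. This is only an illustration, assuming that elt max is the largest valid index; index_overflows is a hypothetical name, and the real check is emitted as LLVM IR rather than written like this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: elt_max is assumed to be the largest valid index. */
bool
index_overflows(uint32_t index, uint32_t elt_max)
{
   /* Strictly greater-than: index == elt_max is still in range. */
   return index > elt_max;
}
```

With a greater-than-or-equal comparison, the last valid index would wrongly be reported as an overflow.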

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-03 09:06:24 -04:00
Kenneth Graunke
764afc48cf i965: Move the rest of intel_tex_layout.c into brw_tex_layout.c.
The texture alignment unit functions are called from brw_tex_layout.c,
so it makes sense to put them there.  Since the only caller of
intel_get_texture_alignment_unit() is in brw_tex_layout.c, it could be
made into a static function.  However, this patch instead simply folds
it into the caller, as it's only two lines anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
466aa712b6 i965: Push intel_get_texture_alignment_unit call into brw_miptree_layout
intel_miptree_create_layout() calls intel_get_texture_alignment_unit()
and then immediately calls brw_miptree_layout().  There are no other
callers.

intel_get_texture_alignment_unit() populates the miptree's alignment
unit fields, which are used by brw_miptree_layout() to determine where
to place each miplevel.  Since brw_miptree_layout() needs those to be
present, it makes sense to have it initialize them as the first step.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
c4c3c0dc94 i965: Declare for-loop counters in the loop in brw_tex_layout.c.
The driver is compiled in C99 mode, so this is not a problem.  It's
slightly tidier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
ccf312fd12 i965: Remove use of GLuint/GLint in brw_tex_layout.c.
Using GL types is silly; this isn't even remotely API-facing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
ed95e396f3 i965: Tidy the brw_tex_layout.c copyright and file header comments.
This uses Doxygen style for the file comments, and generally makes it
more consistent with the rest of the driver.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
2ea87fde31 i965: Move i945_texture_layout_2d to brw_tex_layout.c
This consolidates the miptree layout logic in a single file.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
1920209970 i965: Remove fallthrough for Gen4 cube map layout.
Now that both 2DArray and Cube layouts are taken care of by helper
functions, it's easy to just call the right function for each
generation.  This is a little cleaner than falling through.

This also reworks the comments.  Referencing "Volume 1" of the BSpec
isn't very helpful, since that's only available inside Intel, and it
doesn't even use volume numbers.  Also, "Ironlake...finally" sounds a
bit strange considering that almost all hardware uses the 2D array
approach.  At this point, Gen4 is the only special case.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
7e4007a1b3 i965: Combine GL_TEXTURE_CUBE_MAP_ARRAY case with the other array cases.
These do the exact same thing; combining them is tidier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
bc51f15b32 i965: Pull 3D texture layout code out into a helper function.
A bit cleaner than having it in one giant function.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
abc2bdffd6 i965: Replace maxBatchSize variable with BATCH_SZ define.
maxBatchSize was only ever initialized to BATCH_SZ, and a few places
used BATCH_SZ directly anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
2c602d2adf i965: Move annotate_aub out of the vtable.
brw_annotate_aub() is the only implementation of this function, so it
makes sense to just call it directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
f05f8793c8 i965: Move debug_batch hook out of the vtable.
brw_debug_batch() is the only implementation of this function, so it
makes sense to just call it directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
749160aab3 i965: Remove render_target_supported from the vtable.
brw_render_target_supported() is the only implementation of this
function, so it makes sense to just call it directly.

Rather than adding an #include of brw_wm.h, this patch moves the
prototype to brw_context.h.  Prototypes seem to be in rather arbitrary
places at the moment, and either place seems as good as the other.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
7c5279e554 i965: Move is_hiz_depth_format out of the vtable.
brw_is_hiz_depth_format() is the only implementation of this function,
so it makes sense to just call it directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
607338f1cb i965: Remove the invalidate_state() vtable hook.
The hook was a noop.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
251cdcf059 i965: Replace fprintfs with assertions in GLenum comparison translators.
These functions translate GLenum comparison operations into the hardware
enumerations.  They should never be passed something other than a GL
comparison operator, or something is very broken.

Assertions seem more appropriate than fprintf.
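
A minimal sketch of the pattern, for illustration only: the GL_* values are the standard OpenGL comparison enums, while enum hw_compare and translate_compare_func are hypothetical stand-ins for the hardware enumeration and the driver's translator.

```c
#include <assert.h>

/* Standard OpenGL comparison enums. */
#define GL_NEVER    0x0200
#define GL_LESS     0x0201
#define GL_EQUAL    0x0202
#define GL_LEQUAL   0x0203
#define GL_GREATER  0x0204
#define GL_NOTEQUAL 0x0205
#define GL_GEQUAL   0x0206
#define GL_ALWAYS   0x0207

/* Hypothetical hardware enumeration; the names and values are assumptions. */
enum hw_compare {
   HW_COMPARE_ALWAYS,
   HW_COMPARE_NEVER,
   HW_COMPARE_LESS,
   HW_COMPARE_EQUAL,
   HW_COMPARE_LEQUAL,
   HW_COMPARE_GREATER,
   HW_COMPARE_NOTEQUAL,
   HW_COMPARE_GEQUAL
};

enum hw_compare
translate_compare_func(unsigned func)
{
   switch (func) {
   case GL_NEVER:    return HW_COMPARE_NEVER;
   case GL_LESS:     return HW_COMPARE_LESS;
   case GL_EQUAL:    return HW_COMPARE_EQUAL;
   case GL_LEQUAL:   return HW_COMPARE_LEQUAL;
   case GL_GREATER:  return HW_COMPARE_GREATER;
   case GL_NOTEQUAL: return HW_COMPARE_NOTEQUAL;
   case GL_GEQUAL:   return HW_COMPARE_GEQUAL;
   case GL_ALWAYS:   return HW_COMPARE_ALWAYS;
   }

   /* Anything else means the caller is broken, so trap in debug builds
    * instead of printing a message and carrying on. */
   assert(!"Invalid comparison function.");
   return HW_COMPARE_NEVER;
}
```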

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
7ee616f1bf i965: Replace intel_state.c enums with those from brw_defines.h.
Both intel_context.h and brw_defines.h have #defines for comparison
functions, stencil ops, blending logic ops, and blending factors.
They're exactly the same values, so it makes sense to pick one.

brw_defines.h is the logical place for this kind of stuff, so this patch
converts intel_state.c to use the set defined there.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
c9db037dc9 i965: Delete pre-DRI2.3 viewport hacks.
The __DRI_USE_INVALIDATE extension was added on May 11th, 2010 by commit
4258e3a2e1.  At this point, it's unlikely that anyone's using the
right mix of new and old components to hit this path.  Deleting it
removes an untested code path and cleans up the driver a bit.

Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Keith Packard <keithp@keithp.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
cbb37b7586 i965: Remove "There are probably better ways" comment.
There are always better ways to do things.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
7115bee993 i965: Delete brw_print_reg() function.
This wasn't called from anywhere; presumably it was used to examine
brw_regs when debugging shader assembly.  However, it prints registers
in a different notation than brw_disasm.c which everyone is used
to...which means I doubt anyone will want to use it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
bc8b62e3a0 i965: Move contents of intel_clear.h to intel_context.h.
Having a header file for a single prototype seems rather excessive.
Plus, the actual function is in brw_clear.c, not intel_clear.c, so
there isn't even the .c/.h filename symmetry one might expect.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
7d8e70f301 i965: Move contents of intel_extensions.h to intel_context.h.
Having an entire header file for a single prototype seems a bit
excessive.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
7d119880e8 i965: Remove some dead code.
A random smattering of things that just aren't used anymore.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
d245e795cf i965: Delete dead intel_buffer_object::range_map_size field.
Nothing uses this, apparently.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
1f6ebdd43f i965: Remove intel_buffer_object::source.
This was only used for BOs backed by system memory on i915.  With that
gone, there's nothing that even sets source to non-zero, so this is
purely dead code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
6e5b80ee5a i965: Fix buffer object segfault since removal of system memory BOs.
Commit cf31a19300 removed support for BOs
backed by system memory, as it was only useful for i915.  However, it
removed a little too much code: intel_bufferobj_buffer() used to call
intel_bufferobj_alloc_buffer(), and after that commit, it didn't.

This led to NULL pointer dereferences in several test cases, such as
es3conform's transform_feedback_state_variables test.

This commit restores the allocation, preserving the original behavior.
It may not be the cleanest approach, but tidying should come later.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66432
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:12 -07:00
Matthew McClure
012ba47076 postprocess: move second temporary assertion into isolated configuration
With this patch we will only assert that the second temporary is allocated,
when there are more than two active filters.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66423

Signed-off-by: Brian Paul <brianp@vmware.com>
2013-07-03 09:19:04 -06:00
José Fonseca
9b6788eb15 glsl: Ensure snprintf is defined on MSVC builds.
Should fix:

  src\glsl\opt_dead_builtin_varyings.cpp(244) : error C3861: 'snprintf': identifier not found
  ...
2013-07-03 08:26:08 +01:00
Ilia Mirkin
4bc8e3c3e4 targets/xvmc-nouveau: add in missing nv30 lib
Currently libXvMCnouveau.so is missing nv30_screen_create. Add it in so
that it may be dlopen'd.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-07-03 09:02:40 +02:00
Marek Olšák
30c3e8718d mesa,glsl,gallium: remove GLSLSkipStrictMaxVaryingLimitCheck and dependencies
Not needed with do_dead_builtin_varyings.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
74edd56927 st/mesa: disable EXT_separate_shader_objects
The extension disallows elimination of set-but-unused varyings.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
b3d8b4c0b4 glsl/linker: eliminate unused and set-but-unused built-in varyings
This eliminates built-in varyings such as gl_Color, gl_SecondaryColor,
gl_TexCoord, and gl_FogFragCoord if they are unused by the next stage or
not written at all (e.g. gl_TexCoord elements). The gl_TexCoord array is
broken down into separate vec4s if needed.

v2: - use a switch statement in varying_info_visitor::visit(ir_variable*)
    - use snprintf
    - disable the optimization for GLES2

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
3c555827c3 glsl/linker: check against varying limit after unused varyings are eliminated
We counted even the varyings which were later eliminated, which was
suboptimal.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
284d954912 glsl/linker: link shaders in the opposite order (from fragment to vertex)
This ensures that inter-shader outputs and inputs are properly eliminated
across 3 or more shader stages. The behavior is unchanged with 2 or fewer
shader stages.

For example, elimination of unused FS inputs causes elimination of matching
GS outputs, which causes elimination of the GS inputs that were needed for
evaluation of the eliminated GS outputs, which causes elimination of
matching VS outputs. An unused FS input is all that's needed to trigger
this chain reaction.
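
The chain reaction can be modelled with a toy example. This is a sketch under stated assumptions, not the linker's code: struct stage, live_inputs and eliminate_back_to_front are hypothetical, varyings are bits in a 32-bit mask, and input bit N of a stage is assumed to match output bit N of the previous stage.

```c
#include <stdint.h>

#define MAX_VARYINGS 32

/* Toy model of one pipeline stage (hypothetical, not Mesa's data structures).
 * deps[o] is the bitmask of this stage's inputs needed to compute output o. */
struct stage {
   uint32_t deps[MAX_VARYINGS];
};

/* Inputs a stage still needs once only live_outputs must be produced. */
uint32_t
live_inputs(const struct stage *s, uint32_t live_outputs)
{
   uint32_t inputs = 0;
   for (unsigned o = 0; o < MAX_VARYINGS; o++) {
      if (live_outputs & (UINT32_C(1) << o))
         inputs |= s->deps[o];
   }
   return inputs;
}

/* Walk from the last producer back to the first. consumer_reads is the set
 * of varyings the final consumer (e.g. the FS) actually reads; live_out[i]
 * receives the outputs that stage i still has to write. */
void
eliminate_back_to_front(const struct stage *stages, unsigned count,
                        uint32_t consumer_reads, uint32_t *live_out)
{
   uint32_t live = consumer_reads;
   for (unsigned i = count; i-- > 0;) {
      live_out[i] = live;
      live = live_inputs(&stages[i], live);
   }
}
```

Processing the stages in the opposite order would compute each producer's live outputs before knowing what its consumer needs, so the elimination could not propagate past one stage boundary.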

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
030ca230e2 mesa: renumber shader indices according to their placement in pipeline
See my explanation in mtypes.h.

v2: don't do this in gallium
v3: also updated the comment at the gl_shader_type definition

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
José Fonseca
84f367e69a gallivm: Simplify intrinsic name construction.
Just noticed this could be slightly shortened when fixing MSVC build.

Trivial.
2013-07-02 13:12:31 +01:00
Kenneth Graunke
15ca0ca1b6 glsl/builtins: Fix ARB_texture_cube_map_array built-in availability.
This patch adds texture() for isamplerCubeArray and usamplerCubeArray,
which were entirely missing.

It also makes texture() with a LOD bias fragment shader specific.  The
main GLSL specification explicitly says that texturing with LOD bias
should not be allowed for vertex shaders.

Affects Piglit's ARB_texture_cube_map_array/compiler/tex_bias-01.vert,
which tries to use bias in a vertex shader.  Currently, it expects this
to pass (so this patch regresses the test), but I've sent a patch to
reverse the expected behavior (so this patch would fix the updated test):
http://lists.freedesktop.org/archives/piglit/2013-June/006123.html

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2013-07-02 01:01:30 -07:00
José Fonseca
4c859901ce gallivm: Fix MSVC build. 2013-07-02 06:41:32 +01:00
José Fonseca
e621ec816d gallivm: Fix indirect immediate registers.
If reg->Register.Indirect is true then the immediate is not truly a
constant LLVM expression.

There is no performance regression in using LLVMBuildBitCast, as it will
fall back to LLVMConstBitCast internally when the argument is a constant.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-07-02 06:30:06 +01:00
Zack Rusin
70bc43acdb gallium/tests: fix the translate test 2013-06-28 09:43:17 -04:00
Anuj Phogat
722721d718 i965: Enable ext_framebuffer_multisample_blit_scaled on intel h/w
This patch enables ext_framebuffer_multisample_blit_scaled extension
on intel h/w >= gen6.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-01 15:21:25 -07:00
Anuj Phogat
6fc3da2da0 i965/blorp: Add bilinear filtering of samples for multisample scaled blits
Current implementation of ext_framebuffer_multisample_blit_scaled in
i965/blorp uses nearest filtering for multisample scaled blits. Using
nearest filtering produces blocky artifacts and negates the benefits
of MSAA. That is the reason why extension was not enabled on i965.

This patch implements the bilinear filtering of samples in blorp engine.
Images generated with this patch are free from blocky artifacts and show
big improvement in visual quality.

Observed no piglit and gles3 regressions.

V3:
- Algorithm used for filtering assumes a rectangular grid of samples
  roughly corresponding to sample locations.
- Test the boundary conditions on the edges of texture.

V4:
- Clip texcoords and use conditional MOVs.
- Send texture dimensions as push constants.
- Remove the optimization in case of scaled multisample blits.

V5:
- Move mcs_fetch() inside the 'for' loop after computing pixel coordinates.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-01 15:21:25 -07:00
Ian Romanick
27f2df2507 docs: Import 9.1.4 release notes, add news item.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-01 14:48:58 -07:00
Zack Rusin
1c2e5c223d draw/translate: fix instancing
We were incorrectly computing the buffer offset when using the
instances. The buffer offset is always equal to:
start_instance * stride + (instance_num / instance_divisor) *
stride
We were completely ignoring the start instance, quite
often producing instances that were completely wrong, e.g. if
start instance = 5, instance divisor = 2, then on the first
iteration it should be:
5 * stride, not (5/2) * stride as we'd have currently, and if
start instance = 1, instance divisor = 3, then on the first
iteration it should be:
1 * stride, not 0 as we'd have.
This fixes it and adjusts all the code to the changes.
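
The formula and the two examples above can be checked with a small C sketch. instance_buffer_offset is a hypothetical helper, not the function in the draw or translate modules, and it widens to 64 bits only to keep the illustration free of wraparound.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper implementing
 *   start_instance * stride + (instance_num / instance_divisor) * stride
 * The divisor applies only to the running instance number, never to the
 * start instance. */
uint64_t
instance_buffer_offset(uint32_t start_instance, uint32_t instance_num,
                       uint32_t instance_divisor, uint32_t stride)
{
   assert(instance_divisor != 0);
   uint64_t index = (uint64_t)start_instance + instance_num / instance_divisor;
   return index * stride;
}
```

With a stride of 16 bytes, start instance 5 and divisor 2 give 80 bytes on the first iteration (5 * stride), and start instance 1 with divisor 3 gives 16 bytes (1 * stride).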

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 05:21:20 -04:00
Zack Rusin
df4ab7974a draw: fix incorrect clipper invocation statistics
clipper invocations are computed earlier (of course
before the emission) so this code was adding bogus
numbers to already computed clipper invocations.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:29 -04:00
Zack Rusin
34546d61c1 draw/gallivm: export overflow arithmetic to its own file
We'll be reusing this code so let's put it in a common file
and use it in the draw module.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:24 -04:00
Zack Rusin
88de009cc1 draw: check for integer overflows in instance computation
Integers could easily overflow if the starting instance
was large enough. Instead of letting bogus counts through,
set the instance to max if it overflowed and let our
regular buffer overflow computation handle it.
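
A minimal sketch of the clamping idea in plain C, assuming the instance index is formed by an unsigned 32-bit addition; add_or_max is a hypothetical name, and the actual code builds the equivalent check in LLVM IR.

```c
#include <stdint.h>

/* Hypothetical helper: add two unsigned values, returning UINT32_MAX
 * instead of a wrapped-around result when the sum does not fit. */
uint32_t
add_or_max(uint32_t a, uint32_t b)
{
   return (b > UINT32_MAX - a) ? UINT32_MAX : a + b;
}
```

Because the clamped value is the largest representable index, a later bounds check against the buffer size rejects it, rather than accepting a small wrapped-around number that happens to look valid.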

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:20 -04:00
Zack Rusin
2f13f28120 draw: check for an integer overflow when computing stride
Our buffer overflow arithmetic was susceptible to integer
overflows, which caused the buffer overflow logic to break.
Let's use the llvm overflow intrinsics to check for integer
overflows while computing the stride/needed buffer size.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:16 -04:00
Zack Rusin
e742f7788e draw: account for elem size when computing overflow
We weren't taking into account the size of the element
that is to be fetched, which meant that it was possible
to overflow the buffer reads if the stride was very
close to the end of the buffer, e.g. stride = 3, buffer
size = 4, and the element to be read = 4. This should
be properly detected as an overflow.
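
The example above can be sketched as a bounds check in C. fetch_overflows is a hypothetical helper, and the sketch assumes the stride of 3 has already been turned into a byte offset of 3 into a 4-byte buffer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if reading elem_size bytes starting at offset
 * would run past the end of a buffer of buffer_size bytes. The subtraction
 * form avoids wraparound in offset + elem_size. */
bool
fetch_overflows(uint32_t buffer_size, uint32_t offset, uint32_t elem_size)
{
   if (offset > buffer_size)
      return true;
   return elem_size > buffer_size - offset;
}
```

A check on the offset alone accepts offset 3 in a 4-byte buffer; including the 4-byte element size shows that the read would end at byte 7 and must be rejected.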

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:12 -04:00
Vinson Lee
7214fe3cc4 i965: Initialize brw_blorp_const_color_program member variables.
Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-01 10:16:16 -07:00
Ross Burton
2c6186390c eglplatform: use unsigned long instead of 32-bit ints in generic platform
In the generic Unix case use the "unsigned long" type instead of 32-bit
integers so that the type sizes are consistent on 64-bit machines between X11
and not-X11.

Signed-off-by: Ross Burton <ross.burton@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-01 10:06:24 -07:00
Ross Burton
1a7275de9a build: fix EGL build when no X11 headers are present
eglplatform.h defaults to X11 on Unix unless told otherwise, so if we're doing a
build without any X11 support tell it so that we don't try including headers
that don't exist.

Also set GL_PC_FLAGS so that the definition is in egl.pc, so that applications
using EGL don't try to pull in X11 headers on systems where EGL was configured
without X11 support.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64959
Signed-off-by: Ross Burton <ross.burton@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-01 10:06:11 -07:00
José Fonseca
acc6a141b8 tools/trace: Return dummy fence object to silence warnings. 2013-07-01 12:06:58 +01:00
José Fonseca
0fd71ac9eb tools/trace: Don't crash if a trace has no timing information. 2013-07-01 12:05:57 +01:00
José Fonseca
fa3040c117 scons: Fix dependencies of enums.c and api_exec.c. 2013-07-01 12:04:59 +01:00
Maarten Lankhorst
bf95ca7de0 nvc0: allow frame dropping in h264
The only reason the checks existed was paranoia; when I first
wrote the code I wasn't sure it was correct. Now that I am, and
since the asserts triggered when XBMC was dropping frames, remove them.

NOTE: This is a candidate for the 9.1 branch.
2013-07-01 08:47:49 +02:00
Tom Stellard
24fa43675f r300g/compiler: Prevent regalloc from swizzling texture operands v2
https://bugs.freedesktop.org/show_bug.cgi?id=63520

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-30 21:38:57 -07:00
Tom Stellard
e2c3640540 r300g/compiler/tests: Add an assembly parser
The assembly parser can be used to load r300 assembly dumps
and run them through any of the r300 compiler passes.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-30 21:38:57 -07:00
Tom Stellard
ab40d8d56f r300g: Fix make check
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-30 21:24:55 -07:00
Grigori Goronzy
30004b20c2 r600g: implement fast color clears for MSAA on evergreen+
Allows MSAA colorbuffers, which have a CMASK automatically and don't
need any further special handling, to be fast cleared. Instead
of clearing the buffer, set the clear color and the CMASK to the
cleared state.

Fast clear is used only when all bound colorbuffers fulfill certain
conditions: a CMASK is required, we have to be able to create a clear
color value for the format and the texture mustn't contain multiple
images. Technically, it should be possible to support array textures
and cubemaps if all images are attached to the framebuffer,
but this does not appear to be common.

v2: fix fast clear check
v3: Marek: - disable fast clear with 128-bit formats, which are unsupported
           - set tex->dirty_level_mask in r600_clear, so that the driver knows
             the resource must be decompressed/expanded
           - return early from r600_clear if there's nothing else to do
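
The eligibility conditions listed above can be sketched as a predicate. struct colorbuffer, its fields and can_fast_clear are hypothetical names chosen for illustration; they do not mirror the r600g structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one bound colorbuffer. */
struct colorbuffer {
   bool has_cmask;              /* a CMASK is required */
   bool clear_color_encodable;  /* a clear color value exists for the format */
   unsigned num_images;         /* more than one image (array, cube) disqualifies */
   unsigned bits_per_pixel;     /* 128-bit formats are unsupported */
};

/* Fast clear is allowed only if every bound colorbuffer qualifies. */
bool
can_fast_clear(const struct colorbuffer *cbufs, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      const struct colorbuffer *cb = &cbufs[i];
      if (!cb->has_cmask || !cb->clear_color_encodable ||
          cb->num_images != 1 || cb->bits_per_pixel >= 128)
         return false;
   }
   return true;
}
```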

Signed-off-by: Marek Olšák <maraeo@gmail.com>
2013-07-01 03:02:43 +02:00
Marek Olšák
b1693194ee r600g/compute: disable unused colorbuffer slots
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-01 03:02:43 +02:00
Marek Olšák
f83e220d36 st/mesa: handle SNORM formats in generic CopyPixels path
v2: check desc->is_mixed in util_format_is_snorm
2013-06-30 22:14:37 +02:00
Matt Turner
adf8afa168 i965: NULL check depth_mt to quiet static analysis.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-29 15:19:08 -07:00
Roland Scheidegger
7d430bfab9 llvmpipe: fix timer query if there's no bins
b04a295a4a removed seemingly unnecessary
code in get_query. Turns out this code could in fact be reached - while
timestamps are always binned, if there are no bins (which happens if fb
size is 0) then the rasterization query code filling this in is still
never executed.
So fix this up by filling in some timestamp, but do it at EndQuery time
not GetQuery time which should be more appropriate.
Makes piglit arb_timer_query-timestamp-get happy again.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-29 16:58:02 +02:00
Tom Stellard
5a925cc550 clover: Don't segfault when compiling a program with no kernel 2013-06-28 15:19:06 -07:00
Eric Anholt
d7361f2943 mesa: Remove unused allow_large_textures driconf from classic drivers.
This option hasn't been used since the introduction of DRI2.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:27 -07:00
Kenneth Graunke
03600660a1 i915: Remove GLES 3.0 sRGB workaround.
Gen3 doesn't support GLES 3.0, so there's no need for it.

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
dc8796506e i965: Remove is_945.
Only relevant on Gen3.

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
a4e31956ac i965: Delete hw_stencil flag.
This was only used by i915.

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
4299e35888 i965: Remove hw_stipple flag.
This was only used by i915.

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
1a5dca38e9 i965: Remove use_early_z option.
This was only used by i965+.

v2: Also remove the option from the driconf list. (change by anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
2cc5724db2 i965: Remove unused SUBPIXEL_* macros.
Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
2e9fe0ca12 i965: Remove redundant Gen3 PCI IDs.
Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
1811f5c43d intel: Remove unused INTEL_MAX_FIXUP macro.
v2: Remove it from i915, too (change by anholt)

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Eric Anholt
0ac0a1b02e i965: Drop i915 register/instruction definitions.
v2: Remove unused DV_PF_* macros, too. (change by Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:26 -07:00
Eric Anholt
1b67cd29a1 i965: Drop code for calling the empty brw_update_draw_buffers() hook.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
7c232189c5 i965: Drop dead i915 blend state code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
d58d0a3754 i965: Drop i915-specific blit clear code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
cf31a19300 i965: Drop the system-memory VBO support for i915.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
814440aadd i965: Drop i915 swtnl code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
bb2e312d4d i965: Drop i915-specific vtbl entries.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
a61d8f6110 i965: Drop swtnl fallback code for i915.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
28e80d7136 i965: Drop i915 code from intel_screen.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
4a08a86f22 i965: Drop #ifdef I915 code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
6fddd375d7 i965: Drop code checking for gen <= 3.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
3c231b8631 i915: Remove a duplicated set of PCI IDs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
8ac1ed92aa i915: Remove various remaining dead code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
934974fba6 i915: Remove dead debug flags.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
39c5fd7f13 i915: Remove state batch emit support.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
a40f9871a0 i915: Drop unused register #defines from the shared reg file.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
173666e2ed i915: Drop 965+ GL version setup.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
f6426509dc i915: Remove gen6+ batchbuffer support.
While i915 does have hardware contexts in hardware, we don't expect there
to ever be SW support for it (given that support hasn't even made it back
to gen5 or gen4).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
c25e3c34d6 i915: Drop chipset detection code for 965+ chipsets.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
014251ef42 i915: Drop context fields specific to 965+ chipsets.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
d71b7301ec i915: Drop all has_llc code.
i915 never has llc.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
be63c1c993 i915: Remove the remainder of the batchbuffer caching.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
7f210bf535 i915: Remove miscellaneous uncalled gen4 code from formerly shared files.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
6bdc5ecbba i915: Remove most of the code under gen >= 4 checks.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
18100d415e i915: Remove fake ETC support that only existed on gen4+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
27eedca3e0 i915: Remove separate stencil code.
This was formerly-shared code for supporting gen5+.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
279f0bce47 i915: Remove the I915 macro from the formerly shared code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
f26104eb5b i915: Remove all the MSAA support code.
This hardware doesn't have MSAA support, so this code is all a waste for it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
0f31e06a2e i915: Remove all the HiZ code from i915.
v2: Remove extra struct forward declaration (change by Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Ian Romanick
927f572c27 mesa: GL_EXT_shadow_funcs is not optional with GL_ARB_shadow
Every driver left in Mesa that enables one also enables the other.
There's no reason to let it be optional.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
41853b598c mesa: GL_ARB_texture_storage_multisample is not optional with GL_ARB_texture_multisample
In Mesa, this extension is implemented purely in software.  Drivers may
*optionally* provide optimized paths.  If a driver enables
GL_ARB_texture_multisample, it gets GL_ARB_texture_storage_multisample
for free.

NOTE: This has the side effect of enabling the extension in Gallium
drivers that enable GL_ARB_texture_multisample.

v2 (Ken): Still prevent multisample texture targets in TexParameter for
implementations that don't support multisampling.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
d5b6b7a39b mesa: GL_ARB_texture_storage is not optional
In Mesa, this extension is implemented purely in software.  Drivers may
*optionally* provide optimized paths.

NOTE: This has the side effect of enabling the extension in the radeon,
r200, and nouveau drivers.

v2: Minor whitespace tidying (suggested by Brian).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
70966570f3 mesa: GL_ARB_shading_language_100 is not optional
This extension just provides some of the most basic software framework
for GLSL.  Without GL_ARB_vertex_shader or GL_ARB_fragment_shader,
applications still cannot use GLSL.  There's no value in
conditionalizing support for this extension.

NOTE: This has the side effect of enabling the extension in the radeon,
r200, and nouveau drivers.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
e6ec425d6e mesa: GL_ARB_shader_objects is not optional
This extension just provides some of the most basic software framework
for GLSL.  Without GL_ARB_vertex_shader or GL_ARB_fragment_shader,
applications still cannot use GLSL.  There's no value in
conditionalizing support for this extension.

NOTE: This has the side effect of enabling the extension in the radeon,
r200, and nouveau drivers.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
9bc24b4fc4 mesa: GL_NV_blend_square is not optional
Every driver left in Mesa enables this extension all the time.  There's
no reason to let it be optional.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
338ea2e4d1 mesa: GL_EXT_fog_coord is not optional
Every driver left in Mesa enables this extension all the time.  There's
no reason to let it be optional.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
c139708087 mesa: GL_EXT_secondary_color is not optional
Every driver left in Mesa enables this extension all the time.  There's
no reason to let it be optional.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
b5305a303b mesa: GL_EXT_framebuffer_object is not optional
Every driver left in Mesa enables this extension all the time.  There's
no reason to let it be optional.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
f4571640b8 mesa: Remove GL_MESA_resize_buffers
Commit bab755a made the implementation a no-op, and it was only ever
enabled by software rasterizers.

v2: Move the spec into docs/specs/OLD since it's now obsolete
    (squashed patch from Andreas Boll)

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
34e8905077 mesa: Remove _mesa_{enable, disable}_extension and _mesa_extension_is_enabled
They're not used anywhere.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
e14b486113 mesa: Just set extension flags instead of calling _mesa_enable_extension
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
b0d755f00b mesa: Remove _mesa_enable_._._extensions functions
After the preceding commits, they are not used.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
45099ec175 swrast: Don't call _mesa_enable_._._extensions and _mesa_enable_sw_extensions
_mesa_enable_sw_extensions enables all the extensions (and more) that
the others enable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
a964397fd9 osmesa: Don't call _mesa_enable_._._extensions and _mesa_enable_sw_extensions
_mesa_enable_sw_extensions enables all the extensions (and more) that
the others enable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
c9edd661c4 wmesa: Don't call _mesa_enable_._._extensions and _mesa_enable_sw_extensions
_mesa_enable_sw_extensions enables all the extensions (and more) that
the others enable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
89cf6e6273 x11: Don't call _mesa_enable_._._extensions and _mesa_enable_sw_extensions
_mesa_enable_sw_extensions enables all the extensions (and more) that
the others enable.  Also, don't duplicate the DXTn checks.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
0b9398c74f i965: Merge the two GEN >= 6 extension enable blocks
There's no reason for these blocks to be separate.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:21 -07:00
Ian Romanick
ae66a656fd i965: Move GEN >= 4 extensions into the "always on" list
This copy of the source file is only used for GEN >= 4, so extensions
that are enabled for GEN >= 4 are always enabled.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:21 -07:00
Ian Romanick
4ed976f6b5 i965: Move GEN >= 3 extensions into the "always on" list
This copy of the source file is only used for GEN >= 4, so extensions
that are enabled for GEN >= 3 are always enabled.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:20 -07:00
Ian Romanick
e621208e29 i915: Remove GEN >= 4 extension support
This copy of the source file is only used for GEN <= 3, so remove the
dead code.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:20 -07:00
Kenneth Graunke
745f6c692c i965: Split surface format code into a new file (brw_surface_formats.c).
brw_wm_surface_state.c has gotten rather large and unwieldy.  At this
point, it consists of two separate portions:

1. Surface format code

   This includes the giant table of surface formats and what features
   they support on each generation, as well as the code to translate
   between Mesa formats and hardware formats.

   This is used across all generations.

2. Binding table (SURFACE_STATE) related code.

   This is the code to generate SURFACE_STATE entries for renderbuffers,
   textures, transform feedback buffers, constant buffers, and so on, as
   well as the code to assemble them into binding tables.

   This is only used on Gen4-6; gen7_surface_state.c has Gen7+ code.

Since the two are logically separate, and one is reused on every
generation while the other is not, it makes a lot of sense to split
them out.  It should also make finding code easier.

No code is changed by this patch.  I simply copied the file then deleted
portions of both.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:11 -07:00
Alex Deucher
c309e64db8 radeonsi: add kabini pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:17:27 -04:00
Alex Deucher
b6b1346691 radeonsi: add bonaire pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:17:18 -04:00
Alex Deucher
d669992e35 radeonsi: disable 2D tiling on CIK for now
Causes GPU hangs.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:17:10 -04:00
Alex Deucher
1357624abc radeonsi: add llvm processor names for CIK
Requires updated llvm.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:17:00 -04:00
Alex Deucher
234d81e6b2 radeonsi: emit PA_SC_RASTER_CONFIG[_1] on cik
Use the golden values for each asic.

Todo: update Kabini and Kaveri.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:53 -04:00
Alex Deucher
9d8ad222c6 radeonsi: PA_CL_ENHANCE is privileged on CIK
Needs to be and is set by the kernel.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:46 -04:00
Alex Deucher
72c10be3a7 radeonsi: update surface sync packet emit for CIK
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:35 -04:00
Alex Deucher
f2a9bd8084 radeonsi: store chip class in the pm4 struct
Will be used for asic specific pm4 behavior.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:27 -04:00
Alex Deucher
3a47f1945f radeonsi: properly handle DB tiling setup on CIK
On CIK, DB switches back to using per-surface tiling
parameters rather than the tile index used on SI.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:17 -04:00
Alex Deucher
8c903f5df9 radeonsi: emit additional shader pgm rsrc registers for CIK
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:10 -04:00
Alex Deucher
59e4fe0b75 radeonsi: emit TA_BC_BASE_ADDR_HI for border color on CIK
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:03 -04:00
Alex Deucher
b363a45c54 radeonsi: fix VGT_PRIMITIVE_TYPE emit for CIK
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:15:54 -04:00
Alex Deucher
ecb679a8d3 radeonsi: register updates for CIK
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:15:46 -04:00
Alex Deucher
deb2358243 radeonsi: initial PM4 changes for CIK
note which packets are removed and add new ones.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:15:36 -04:00
Alex Deucher
f29f206c93 radeonsi: initial support for CIK chips
Add the infrastructure to differentiate them.
Just treat them like SI for now.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:15:28 -04:00
Alex Deucher
5b3f1ea933 radeonsi: rename SI chip class from TAHITI to SI
Covers the entire family.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:15:20 -04:00
Tom Stellard
47e35eff9d r600g: Fix build
Broken since 2840bec56f when opencl is
disabled.
2013-06-28 11:11:43 -07:00
Anuj Phogat
ee723ffabb mesa: Return ZeroVec/dummyReg instead of NULL pointer
Assertions are not sufficient to check for null pointers as they don't
show up in release builds. So, return ZeroVec/dummyReg instead of NULL
pointer in get_{src,dst}_register_pointer(). This should calm down the
warnings from static analysis tool.
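
A minimal sketch of the fallback pattern described above, assuming a hypothetical register file. The names `struct machine`, `get_src_register` and `zero_vec` are illustrative only and are not the actual Mesa code:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REGS 4

/* Hypothetical stand-in for the interpreter's register file. */
struct machine {
   float regs[MAX_REGS][4];
};

/* Shared read-only fallback, analogous to the ZeroVec named in the commit. */
static const float zero_vec[4] = { 0.0f, 0.0f, 0.0f, 0.0f };

/* Return a valid pointer for every index, including out-of-range ones,
 * so callers never dereference NULL even when assertions are compiled out. */
static const float *
get_src_register(const struct machine *m, int index)
{
   if (index < 0 || index >= MAX_REGS)
      return zero_vec;
   return m->regs[index];
}
```

An out-of-range index then reads zeros rather than crashing, which is presumably the behavior the static analysis tool wants to see guaranteed.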

Note: This is a candidate for the 9.1 branch.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 10:53:43 -07:00
Tom Stellard
bee49cb0ec mesa: Fix build with older gcc since update of glext.h
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 08:49:06 -07:00
Tom Stellard
2840bec56f r600g/compute: Accept LDS size from the LLVM backend
And allocate the correct amount before dispatching the kernel.

Tested-by: Aaron Watry <awatry@gmail.com>
2013-06-28 08:33:11 -07:00
Tom Stellard
2639fca1f0 r600g/compute: Move compute_shader_create() function into evergreen_compute.c
Tested-by: Aaron Watry <awatry@gmail.com>
2013-06-28 08:33:11 -07:00
Brian Paul
ba4979810f svga: pass svga_compile_key by reference instead of value
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-28 08:38:00 -06:00
Brian Paul
74e8a7d1dd svga: use switch statement in svga_shader_type()
Safer in case the PIPE_SHADER_x tokens get renumbered (as Marek
wanted to do).
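
A rough illustration of why a switch is more robust than arithmetic on token values. The enums and `translate_shader_type` below are hypothetical and do not reproduce the svga driver's definitions:

```c
#include <assert.h>

/* Hypothetical API-side tokens; their numeric values may be renumbered. */
enum api_shader {
   API_SHADER_VERTEX,
   API_SHADER_FRAGMENT,
   API_SHADER_GEOMETRY
};

/* Hypothetical device-side type codes with fixed values. */
enum hw_shader {
   HW_SHADER_INVALID = 0,
   HW_SHADER_VS = 1,
   HW_SHADER_PS = 2
};

/* Map each token explicitly instead of relying on its numeric value. */
static enum hw_shader
translate_shader_type(enum api_shader shader)
{
   switch (shader) {
   case API_SHADER_VERTEX:
      return HW_SHADER_VS;
   case API_SHADER_FRAGMENT:
      return HW_SHADER_PS;
   default:
      return HW_SHADER_INVALID;
   }
}
```

Because every case is named, reordering the API-side enum would not change the result of the translation.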

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-28 08:37:59 -06:00
Chia-I Wu
24b05ff158 ilo: clean up states that use ilo_view_surface
Use variable names whose meaning is easier to remember.
2013-06-28 15:01:00 +08:00
Chia-I Wu
2c9b6a2164 ilo: remove ilo_cbuf_state::count
We can derive it from enabled_mask.
2013-06-28 15:01:00 +08:00
Chia-I Wu
7ea3ed81c8 ilo: clean up ilo_set_constant_buffer()
Add loops that will be optimized away.
2013-06-28 15:01:00 +08:00
Chia-I Wu
11d283cde9 ilo: clean up states that take a start_slot
They are similar, so clean them up to make them look similar.
2013-06-28 15:00:42 +08:00
Vinson Lee
def634979d glsl: Initialize member variable is_ubo_var in constructor.
Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-06-27 21:51:32 -07:00
Chia-I Wu
20c691b936 ilo: use shorter names for dirty flags
The new names match those of ilo_context's members respectively, and are
shorter.
2013-06-28 10:44:51 +08:00
Chia-I Wu
cabc7b44c0 ilo: track if primitive restart has changed
Re-emit 3DSTATE_INDEX_BUFFER to enable/disable primitive restart.
2013-06-28 10:44:38 +08:00
Chia-I Wu
e071812e46 ilo: avoid potential dangling pointer dereference
Set pipe_draw_info to NULL after draw_vbo().
2013-06-28 10:11:49 +08:00
Ian Romanick
c74a7eb9c5 mesa: Remove GL_EXT_clip_volume_hint
As far as I can tell, no driver has enabled this extension since c6499a7
back in 2007.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-27 18:14:33 -07:00
Chad Versace
6b676e6634 i965,i915: Return early if miptree allocation fails
If allocation fails in intel_miptree_create_layout(), don't proceed to
dereference the miptree. Return an early NULL.
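
The early-return pattern can be sketched as follows, assuming a hypothetical `struct tree` and a `fail` flag that simulates an allocation failure. None of these names come from the driver:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical miniature of a miptree object. */
struct tree {
   int levels;
};

/* Stand-in for the layout step; 'fail' simulates an allocation failure. */
static struct tree *
tree_create_layout(bool fail)
{
   if (fail)
      return NULL;
   return calloc(1, sizeof(struct tree));
}

static struct tree *
tree_create(int levels, bool fail)
{
   struct tree *t = tree_create_layout(fail);

   /* Bail out before touching any field of a failed allocation. */
   if (t == NULL)
      return NULL;

   t->levels = levels;
   return t;
}
```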

Fixes static analysis error reported by Klocwork.

Note: This is a candidate for the 9.1 branch.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-27 13:16:47 -07:00
Roland Scheidegger
670f829102 llvmpipe: handle offset_clamp
This was just ignored (unless the draw module was handling it for some
reason, such as unfilled polys).
I'm not convinced of that code, putting the float for the clamp in the key
isn't really a good idea. Then again the other floats for depth bias are
already in there too anyway (should probably have a jit_context for the
setup function), so this is just a quick fix.
Also, the "minimum resolvable depth difference" used isn't really right as it
should be calculated according to the z values of the current primitive
and not be a constant (of course, this only makes a difference for float
depth buffers), at least for d3d10, so depth biasing is still not quite right.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-27 19:06:40 +02:00
Roland Scheidegger
b04a295a4a llvmpipe: remove never reached code for timestamp queries.
timestamp queries are always binned in an active scene, therefore
always have a result.
2013-06-27 19:06:40 +02:00
Roland Scheidegger
59b8689d37 llvmpipe: fix a bug in opaque optimization
If there are queries active the opaque optimization resetting the bin needs to
be disabled.
(Not really tested since the bug was discovered by code inspection not
an actual test failure.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-27 19:06:40 +02:00
Vinson Lee
f12e551810 radeonsi/compute: Fix memory leak in radeonsi_launch_grid.
Fixes "Resource leak" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-06-27 10:03:33 -07:00
Tom Stellard
0e990736f3 clover: Fix build with LLVM 3.4
Reported on IRC by lordheavy
2013-06-27 10:03:33 -07:00
Bill York
191795eaf1 docs: updated instructions for Mesa on Windows
Signed-off-by: Brian Paul <brianp@vmware.com>
2013-06-27 09:49:41 -06:00
Matthew McClure
e87fc11cac postprocess: handle partial initialization failures.
This patch fixes segfaults observed when enabling the post processing
features. When the format is not supported, or a texture cannot be
created, the code must gracefully handle failure and report the error to
the calling code for proper failure handling.

To accomplish this the following changes were made to the filters.h
prototypes:

- bool return for pp_init_func
- Added pp_free_func for filter specific resource destruction
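
One way to picture the two prototype changes listed above is a filter descriptor whose init callback reports failure and whose destroy callback undoes it. This is only a sketch: `struct filter`, `init_all` and the demo callbacks are hypothetical and do not match the real filters.h signatures:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical filter descriptor: init reports failure, destroy undoes init. */
struct filter {
   bool (*init)(int *state);
   void (*destroy)(int *state);
   int state;   /* 1 while initialized, 0 otherwise */
};

/* Demo callbacks used only to exercise the sketch. */
static bool init_ok(int *state)   { *state = 1; return true; }
static bool init_fail(int *state) { *state = 0; return false; }
static void release(int *state)   { *state = 0; }

/* Initialize every filter; on failure, tear down the ones already set up
 * and report the error to the caller. */
static bool
init_all(struct filter *filters, unsigned count)
{
   for (unsigned i = 0; i < count; i++) {
      if (!filters[i].init(&filters[i].state)) {
         while (i-- > 0)
            filters[i].destroy(&filters[i].state);
         return false;
      }
   }
   return true;
}
```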

Fixes segfaults from backtraces:

* util_destroy_blit
  pp_free

* u_transfer_inline_write_vtbl
  pp_jimenezmlaa_init_run
  pp_init

This patch also uses tgsi_alloc_tokens to allocate temporary tokens in
pp_tgsi_to_state, instead of allocating the array on the stack. This
fixes the following stack corruption segfault in pp_run.c:

* _int_free
  aaline_delete_fs_state
  pp_free

Bug Number: 1021843
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-27 09:44:29 -06:00
Brian Paul
482c43a946 glx: return True/False instead of GL_TRUE/GL_FALSE
Just to be consistent with the functions' Bool return type.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-27 07:48:19 -06:00
Brian Paul
d171bc9d19 glx: move declarations before code
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-27 07:48:18 -06:00
Brian Paul
d43548ca37 mesa: move declarations before code
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-27 07:48:18 -06:00
José Fonseca
15085b477b glsl: Use the C99 variadic macro syntax.
MSVC does not support the old GCC syntax.

See also
http://gcc.gnu.org/onlinedocs/gcc/Variadic-Macros.html
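
For reference, a small sketch contrasting the two syntaxes. `FORMAT_MSG` is a made-up macro for illustration, not one from the glsl sources:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Old GCC-only form, shown for comparison and kept in a comment so that
 * this file compiles everywhere:
 *
 *    #define FORMAT_MSG(buf, size, args...) snprintf(buf, size, args)
 */

/* C99 form: the variable part is written "..." and expanded with
 * __VA_ARGS__. */
#define FORMAT_MSG(buf, size, ...) snprintf((buf), (size), __VA_ARGS__)
```

With this definition, `FORMAT_MSG(buf, sizeof buf, "%s=%d", "x", 7)` expands to a plain `snprintf` call.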
2013-06-27 07:44:11 +01:00
José Fonseca
bcd6f3b23c scons: Add dependencies to all .xml files.
Should prevent stuck builds when only some of the included .xml files
change.
2013-06-27 07:25:10 +01:00
Chia-I Wu
9f3cfe6aaf ilo: plug a potential index buffer leak
This is harmless since st_context and u_vbuf both set index buffer to NULL
before destroying themselves.  But we do not want to rely on that behavior.
2013-06-27 11:46:58 +08:00
Roland Scheidegger
eabe068747 softpipe: honor predication for clear_render_target and clear_depth_stencil
trivial, copied from llvmpipe

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 23:17:53 +02:00
Roland Scheidegger
2e4da1f594 llvmpipe: add support for nested / overlapping queries
OpenGL doesn't support this but d3d10 does.
It is a bit of a pain as it is necessary to keep track of queries
still active at the end of a scene, which is also why I cheat a bit
and limit the number of simultaneously active queries to an (arbitrary)
16 (which simplifies things because we don't have to deal with a real
list that way). I can't think of a reason why you'd really want large
numbers of overlapping/nested queries so it is hopefully fine.
(This only affects queries which need to be binned.)

v2: don't copy the remainder of the array when deleting an entry; simply replace
the deleted entry with the last one (order doesn't matter).
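
The v2 deletion strategy is the usual "swap with last" removal for an unordered fixed array. A minimal sketch, assuming a hypothetical `struct active_list` of integer handles (the limit of 16 is taken from the message above; everything else is illustrative):

```c
#include <assert.h>

#define MAX_ACTIVE 16

/* Hypothetical fixed-size list of active query handles. */
struct active_list {
   int items[MAX_ACTIVE];
   unsigned count;
};

/* Append an item; returns 0 when the list is already full. */
static int
active_add(struct active_list *l, int item)
{
   if (l->count >= MAX_ACTIVE)
      return 0;
   l->items[l->count++] = item;
   return 1;
}

/* Remove an item by overwriting it with the last entry.  Order is not
 * preserved, but no other elements need to be moved. */
static int
active_remove(struct active_list *l, int item)
{
   for (unsigned i = 0; i < l->count; i++) {
      if (l->items[i] == item) {
         l->items[i] = l->items[--l->count];
         return 1;
      }
   }
   return 0;
}
```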

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 23:17:53 +02:00
Roland Scheidegger
0820342880 llvmpipe: rework query logic
Previously lp_rast_begin_query commands were always inserted into each bin,
and re-issued if the scene was restarted, while lp_rast_end_query commands
were executed for each still active query at the end of tile rasterization.
Also, the ps_invocations and vis_counter were set to zero when the respective
command was encountered.
This however cannot work for multiple queries of the same type (note that
occlusion counter and occlusion predicate, while different types, were also
affected).
So, change the logic to always set the ps_invocations and vis_counter to zero
at the start of tile rasterization, and then use "start" and "end" per-thread
query values when encountering the begin/end query commands instead, which
should work for multiple queries of the same type. This also means queries do
not have to be reissued in a new scene, however they still need to be finished
at end of tile rasterization, so a list of queries still active at the end of
a scene needs to be maintained.
Also while here don't bin the queries which don't do anything in rasterization.
(This change does not actually handle multiple queries of the same type yet,
as the list of active queries is just a simple fixed array and setup can still
only have one query active per type.)
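
The start/end scheme described above might look roughly like this for a single thread. The structures and function names are hypothetical simplifications, not the llvmpipe rasterizer's own:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread rasterizer state. */
struct thread_data {
   uint64_t vis_counter;
};

/* Hypothetical query object, reduced to a single thread. */
struct query {
   uint64_t start;
   uint64_t result;
};

/* The counter is reset once per tile, not once per query. */
static void tile_begin(struct thread_data *td) { td->vis_counter = 0; }

/* Stand-in for rasterization that passes n samples. */
static void shade_pixels(struct thread_data *td, uint64_t n) { td->vis_counter += n; }

/* Begin only records the current counter value... */
static void
query_begin(const struct thread_data *td, struct query *q)
{
   q->start = td->vis_counter;
}

/* ...and end accumulates the difference, so queries can overlap. */
static void
query_end(const struct thread_data *td, struct query *q)
{
   q->result += td->vis_counter - q->start;
}
```

Since no query ever resets the shared counter, two queries of the same type can be open at once without disturbing each other.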

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 23:17:53 +02:00
Eric Anholt
3dbba95b72 i965: Move the remaining intel code to the i965 directory.
Now that i915's forked off, they don't need to live in a shared directory.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Adam Jackson <ajax@redhat.com>
(and I hear second hand that idr is OK with it, too)
2013-06-26 12:28:26 -07:00
Eric Anholt
733d32f376 i915: Fork the shared code from i965.
Of the 15000 lines of code in intel/, we've identified 4000 lines that
are trivially unnecessary for i915, and another 1000 that are pointless for
i965, and expect to find more as time goes on.  Split the i915 driver off,
so that we can continue active development on i965 without worrying about
breaking i915.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Adam Jackson <ajax@redhat.com>
(and I hear second hand that idr is OK with it, too)
2013-06-26 12:28:25 -07:00
Eric Anholt
43a6795a1f i915: Remove dead symlink. 2013-06-26 12:28:25 -07:00
Eric Anholt
fc32d40534 glx: Fix another missed glMultiDrawElementsEXT const change.
The build was broken for me since
b7d9478f36.
2013-06-26 12:28:25 -07:00
Ian Romanick
c170c901d0 glsl: Move all var decls to the front of the IR list in reverse order
This has the (intended!) side effect that vertex shader inputs and
fragment shader outputs will appear in the IR in the same order that
they appeared in the shader code.  This results in the locations being
assigned in the declared order.  Many (arguably buggy) applications
depend on this behavior, and it matches what nearly all other drivers
do.

Fixes the (new) piglit test attrib-assignments.

NOTE: This is a candidate for stable release branches (and requires the
previous commit to prevent a regression in OpenGL ES 2.0 conformance
test stencil_plane_operation).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-26 12:27:23 -07:00
Ian Romanick
329cd6a9b1 i965: Be more careful with the interleaved user array upload optimization
The checks to determine when the data can be uploaded in an interleaved
fashion can be tricked by certain data layouts.  For example,

    float data[...];

    glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 16, &data[0]);
    glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, 16, &data[4]);
    glDrawArrays(GL_POINTS, 0, 1);

will hit the interleaved path with an incorrect size (16 bytes instead
of 32 bytes).  As a result, the data for attribute 1 never gets
uploaded.  The single element draw case is the only sensible case I can
think of for non-interleaved-that-looks-like-interleaved data, but there
may be others as well.

To fix this, make sure that the end of the element in the array being
checked is within the stride "window."  Previously the code would check
that the beginning of the element was within the window.
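
The corrected check can be sketched as below. `fits_in_stride_window` and its parameters are hypothetical names chosen for illustration; the driver's actual code is structured differently:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true when an attribute element starting at 'attr' lies entirely
 * inside the stride-sized window that begins at 'base'.  Testing the END
 * of the element (offset + element_size) is the point of the fix; testing
 * only the start offset would accept the layout from the example above. */
static bool
fits_in_stride_window(const unsigned char *base, const unsigned char *attr,
                      size_t element_size, size_t stride)
{
   if (attr < base)
      return false;
   return (size_t)(attr - base) + element_size <= stride;
}
```

For the example above, attribute 1 starts 16 bytes after attribute 0 and is 16 bytes long, so it ends at byte 32 and is rejected for a 16-byte stride.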

NOTE: This is a candidate for stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 12:27:23 -07:00
Brian Paul
b7d9478f36 mesa: add const qualifier to glMultiDrawElementsEXT() indices param
The 20130624 version of glext.h changed this to match the
glMultiDrawElements() function which already had the extra const
qualifier.

Fixes warnings/errors that seem to vary from one compiler to the next.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 13:12:01 -06:00
Brian Paul
15436adab0 mesa: remove const from glDebugMessageCallbackARB() function parameter
The new 20130624 version of glext.h removed the const qualifier on
the 'userParam' parameter.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 13:12:01 -06:00
Kenneth Graunke
dd0b99b0be i965/vs: Combine code generation's inst->opcode switch statements.
vec4_visitor::generate_code() switches on vec4_instruction::opcode and
calls into the brw_eu_emit.c layer to generate code for some of them.
It then has a default case which calls generate_vec4_instruction() to
handle the rest...which switches on opcode and handles the rest of the
cases.

The split apparently is that generate_code() handles the actual hardware
opcodes (BRW_OPCODE_*) while generate_vec4_instruction() handles the
virtual opcodes (SHADER_OPCODE_* and VS_OPCODE_*).  But this looks
fairly arbitrary, and it makes more sense to combine the two switches.

This patch moves the cases from generate_code() into the helper function
so that generate_code() isn't as large.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-26 11:25:13 -07:00
Kenneth Graunke
55272883ac i965: Remove broken source type assertions from brw_alu3().
Commit 526ffdfc03 attempted to generalize
the source register type assertions to allow D and UD.  However, the
src1 and src2 assertions actually checked src0.type against D and UD due
to a copy and paste bug.

It also began setting the source and destination register types based on
dest.type, ignoring src0/src1/src2.type completely.  BFE and BFI2 may
actually pass mixed D/UD types and expect them to be ignored, which is
arguably a bit sloppy, but not too crazy either.

This patch simply removes the source register assertions as those values
aren't used anyway.  It also clarifies the comment above the block that
sets the register types.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-06-26 11:25:13 -07:00
Kenneth Graunke
9321f3257f i965: Add back strict type assertions for MAD and LRP.
Commit 526ffdfc03 relaxed the type
assertions in brw_alu3 to allow D/UD types (required by BFE and BFI2).
This lost us the strict type checking for MAD and LRP, which require
all four types to be float.

This patch adds a new ALU3F wrapper which checks these once again.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-06-26 11:25:12 -07:00
Kenneth Graunke
4563dfe23a glsl: Streamline the built-in type handling code.
Over the last few years, the compiler has grown to support 7 different
language versions and 6 extensions that add new built-in types.  With
more and more features being added, some of our core code has devolved
into an unmaintainable spaghetti of sorts.

A few problems with the old code:
1. Built-in types are declared...where exactly?

   The types in builtin_types.h were organized in arrays by the language
   version or extension they were introduced in.  It's factored out to
   avoid duplicates---every type only exists in one array.  But that
   means that sampler1D is declared in 110, sampler2D is in core types,
   sampler3D is a unique global not in a list...and so on.

2. Spaghetti call-chains with weird parameters:

   generate_300ES_types calls generate_130_types which calls
   generate_120_types and generate_EXT_texture_array_types, which calls
   generate_110_types, which calls generate_100ES_types...and more

   Except that ES doesn't want 1D types, so we have a skip_1d parameter.
   add_deprecated also falls into this category.

3. Missing type accessors.

   Common types have convenience pointers (like glsl_type::vec4_type),
   but others may not be accessible at all without a symbol table (for
   example, sampler types).

4. Global variable declarations in a header file?

   #include "builtin_types.h" in two C++ files would break the build.

The new code addresses these problems.  All built-in types are declared
together in a single table, independent of when they were introduced.
The macro that declares a new built-in type also creates a convenience
pointer, so every type is available and it won't get out of sync.

The code to populate a symbol table with the appropriate types for a
particular language version and set of extensions is now a single
table-driven function.  The table lists the type name and GL/ES versions
when it was introduced (similar to how the lexer handles reserved
words).  A single loop adds types based on the language version.
Explicit extension checks then add additional types.  If they were
already added based on the language version, glsl_symbol_table simply
ignores the request to add them a second time, meaning we don't need
to worry about duplicates and can simply list types where they belong.
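
A table-driven population loop of the kind described might be sketched like this. It is written in C for brevity (the compiler itself is C++), and the struct, the `NEVER` sentinel and the three sample rows are illustrative assumptions rather than the real table:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NEVER 999u   /* sentinel: type does not exist in that API */

/* Hypothetical table row: type name plus the versions that introduced it. */
struct builtin_type_entry {
   const char *name;
   unsigned min_gl;   /* first desktop GLSL version with the type */
   unsigned min_es;   /* first GLSL ES version with the type */
};

/* A few sample rows; the real table is much longer. */
static const struct builtin_type_entry builtin_types[] = {
   { "float",     110, 100 },
   { "sampler1D", 110, NEVER },
   { "uvec2",     130, 300 },
};

/* Single loop: collect every type available for the given language
 * version.  Returns the number of names written to 'out'. */
static size_t
collect_types(bool is_es, unsigned version, const char **out, size_t max)
{
   size_t n = 0;
   for (size_t i = 0; i < sizeof(builtin_types) / sizeof(builtin_types[0]); i++) {
      unsigned min = is_es ? builtin_types[i].min_es : builtin_types[i].min_gl;
      if (version >= min && n < max)
         out[n++] = builtin_types[i].name;
   }
   return n;
}
```

Extension-provided types would be added by separate explicit checks after this loop, as the message describes.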

v2: Mark uvecs and shadow samplers as ES3 only, and 1DArrayShadow as
    unsupported in ES entirely.  Add a touch more doxygen.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-26 11:25:12 -07:00
Kenneth Graunke
818da74af5 glsl: Don't use random pointers as an array of glsl_type objects.
Using a random glsl_type convenience pointer as an array is a really bad
idea, for all the reasons mentioned in the previous commit.

The new glsl_type::bvec() function is simpler anyway.

Prevents breakage in the next commit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-26 11:25:12 -07:00
Kenneth Graunke
4530ed4f26 glsl: Stop being clever with pointer arithmetic when fetching types.
Currently, vector types are linked together closely: the glsl_type
objects for float, vec2, vec3, and vec4 are all elements of the same
array, in that exact order.  This makes it possible to obtain vector
types via pointer arithmetic on the scalar type's convenience pointer.
For example, float_type + (3 - 1) = vec3.

However, relying on this is extremely fragile.  There's no particular
reason the underlying type objects need to be stored in an array.  They
could be individual class members, possibly with padding between them.
Then the pointer arithmetic would break, and we'd get bad pointers to
non-heap allocated data, causing subtle breakage that can't be detected
by valgrind.  Cue insanity.

Or someone could simply reorder the type variables, causing us to get
the wrong type entirely.  Also cue insanity.

Writing this explicitly is much safer.  With the new helper functions,
it's a bit less code even.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-26 11:25:12 -07:00
Kenneth Graunke
d367a1cbdb glsl: Add simple vector type accessor helpers.
This patch introduces new functions to quickly grab a pointer to a
vector type.  For example:

   glsl_type::bvec(4)   returns   glsl_type::bvec4_type
   glsl_type::ivec(3)   returns   glsl_type::ivec3_type
   glsl_type::uvec(2)   returns   glsl_type::uvec2_type
   glsl_type::vec(1)    returns   glsl_type::float_type

This is less wordy than glsl_type::get_instance(GLSL_TYPE_BOOL, 4, 1),
which can help avoid extra word wrapping.
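
A minimal sketch of how such a helper might look, assuming a cut-down
mini_type in place of the real glsl_type (the struct and the free function
vec() below are hypothetical).  The point is that the lookup is explicit and
does not depend on the type objects being adjacent in memory:

```cpp
#include <cassert>

// Hypothetical miniature of glsl_type; the real class carries far more state.
struct mini_type {
   const char *name;
   unsigned vector_elements;
};

// Separate objects: nothing below relies on their order or placement.
static const mini_type float_type = { "float", 1 };
static const mini_type vec2_type  = { "vec2",  2 };
static const mini_type vec3_type  = { "vec3",  3 };
static const mini_type vec4_type  = { "vec4",  4 };

// Counterpart of glsl_type::vec(n): look the type up explicitly instead of
// computing float_type + (n - 1).
const mini_type *vec(unsigned components)
{
   static const mini_type *const table[] = {
      &float_type, &vec2_type, &vec3_type, &vec4_type
   };
   assert(components >= 1 && components <= 4);
   return table[components - 1];
}
```

Reordering or padding the four objects leaves vec() correct, which is not
true of pointer arithmetic on a convenience pointer.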

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-26 11:25:12 -07:00
Brian Paul
9a14e412d6 mesa: update glext.h to version 20130624
In glapi_priv.h we always need the typedef for the GLclampx type
since GL_OES_fixed_point is now defined in glext.h but the
GLclampx type is not.  GLclampx is not used by anything in glext.h
but we need it for GL ES dispatch.

This is a huge patch because the structure of the file has been
changed.

The following extensions are new, however:

GL_AMD_interleaved_elements
GL_AMD_shader_trinary_minmax
GL_IBM_static_data
GL_INTEL_map_texture
GL_NV_compute_program5
GL_NV_deep_texture3D
GL_NV_draw_texture
GL_NV_shader_atomic_counters
GL_NV_shader_storage_buffer_object
GL_NVX_conditional_render
GL_OES_byte_coordinates
GL_OES_compressed_paletted_texture
GL_OES_fixed_point
GL_OES_query_matrix
GL_OES_single_precision

And these extensions were removed:

GL_FfdMaskSGIX
GL_INGR_palette_buffer
GL_INTEL_texture_scissor
GL_SGI_depth_pass_instrument
GL_SGIX_fog_scale
GL_SGIX_impact_pixel_texture
GL_SGIX_texture_select

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-06-26 10:43:27 -06:00
Brian Paul
bc6eb8068f st/mesa: add casts to silence MSVC warnings 2013-06-26 10:42:59 -06:00
Brian Paul
202299d16e st/mesa: make rtt_level, face, slice unsigned to silence MSVC warnings 2013-06-26 10:42:59 -06:00
Brian Paul
2285645aa2 hud: add float casts to silence MSVC warnings 2013-06-26 10:42:59 -06:00
Brian Paul
87d5a16927 hud: include stdio.h since we use fprintf(), fscanf(), etc 2013-06-26 10:42:59 -06:00
Brian Paul
61964a9ceb hud: add cast to silence MSVC warning 2013-06-26 10:42:59 -06:00
Brian Paul
f06e60fde4 os: add cast in os_time_sleep() to silence MSVC warning 2013-06-26 10:42:59 -06:00
Brian Paul
21f8729c3d vega: add some casts to silence MSVC warnings 2013-06-26 10:42:59 -06:00
Brian Paul
4d452f1988 util: int/unsigned changes to silence some MSVC warnings 2013-06-26 10:42:59 -06:00
Brian Paul
bbdd7cfb8b util: add some casts to silence some MSVC warnings 2013-06-26 10:42:59 -06:00
Brian Paul
aab8ca8fd1 util: s/int/unsigned/ to silence some MSVC warnings 2013-06-26 10:42:58 -06:00
Maarten Lankhorst
e72cc26518 nvc0: set rsvd_kick correctly
This prevents trampling beyond the end of the command stream during flushes.

NOTE: This is a candidate for the stable branches.

Reported-by: Christoph Bumiller <christoph.bumiller@speed.at>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-06-26 16:50:08 +02:00
Maarten Lankhorst
30c2c34464 nvc0: fix push_space checks for video decoding 2013-06-26 16:18:42 +02:00
Vinson Lee
e6479b4330 ilo: Remove max_threads dead code path.
max_threads cannot be greater than 28. It is either 21 or 28.

Fixes "Logically dead code" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2013-06-26 21:51:07 +08:00
Jean-Sébastien Pédron
c6d52f2290 winsys/intel: fix typo in "ETIMEOUT"
Should be "ETIMEDOUT".

[olv: commit message slightly re-formatted]

Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2013-06-26 21:51:07 +08:00
Chia-I Wu
c610b67972 ilo: use a bitmask for enabled constant buffers
Looping over 4 * 13 constant buffers while in most cases only two are enabled
is stupid.
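
A minimal sketch of the bitmask idea, assuming one 32-bit mask per shader
stage with bit N set when constant buffer N is enabled (enabled_slots() and
that layout are assumptions for illustration, not the driver's code):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Return the indices of the set bits, visiting only enabled slots.
std::vector<unsigned> enabled_slots(uint32_t enabled_mask)
{
   std::vector<unsigned> slots;
   while (enabled_mask != 0) {
      // Find the lowest set bit.
      unsigned slot = 0;
      while (((enabled_mask >> slot) & 1u) == 0)
         slot++;
      assert(slot < 32);
      slots.push_back(slot);
      enabled_mask &= enabled_mask - 1;   // clear the lowest set bit
   }
   return slots;
}
```

The outer loop runs once per enabled buffer, so a mask with two bits set
costs two iterations regardless of how many slots exist.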
2013-06-26 21:50:26 +08:00
Maarten Lankhorst
9aebad618c vl/mpeg12: handle mpeg-1 bitstreams more correctly
Add support for D-frames.
Add support for slices ending on a different horizontal row of macroblocks.
2013-06-26 11:40:47 +02:00
Chia-I Wu
95c21f12f3 ilo: support PIPE_CAP_USER_INDEX_BUFFERS
We want to access the user buffer, if available, when primitive restart is
enabled and the restart index/primitive type is not natively supported.

And since we are handling index buffer uploads in the driver with this change,
we can also work around misalignment of index buffer offsets.
2013-06-26 16:42:46 +08:00
Chia-I Wu
5fb5d4f0a6 ilo: make pipe_draw_info a context state
Rename ilo_finalize_states() to ilo_finalize_3d_states(), and bind
pipe_draw_info to the context when it is called.  This saves us from having to
pass pipe_draw_info around in several places.
2013-06-26 16:42:46 +08:00
Chia-I Wu
3eb6754e94 ilo: support PIPE_CAP_USER_CONSTANT_BUFFERS
We need it for HUD support, and will need it for push constants in the future.
2013-06-26 16:42:45 +08:00
Eric Anholt
79385950f3 i915: Drop dead batch dumping code.
Batch dumping is now handled by shared code in libdrm.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:12 -07:00
Eric Anholt
57407bcaf8 intel: Drop little bits of dead code.
I noticed these while building the fork-i915 branch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:12 -07:00
Eric Anholt
88514d922e i965: Stop recomputing the miptree's size from the texture image.
We've already computed what the dimensions of the miptree are, and stored
it in the miptree.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:12 -07:00
Eric Anholt
820325b258 i965: Drop unused argument to translate_tex_format().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:11 -07:00
Eric Anholt
c20f973c4f i965/gen4-5: Stop using bogus polygon_offset_scale field.
The polygon offset math used for triangles by the WM is "OffsetUnits * 2 *
MRD + OffsetFactor * m" where 'MRD' is the minimum resolvable difference
for the depth buffer (~1/(1<<16) or ~1/(1<<24)), 'm' is the approximated
slope from the GL spec, and '2' is this magic number from the original
i965 code dump that we deviate from the GL spec by because "it makes glean
work" (except that it doesn't, because of some hilarity with 0.5 *
approximately 2.0 != 1.0.  go glean!).

This clipper code for unfilled polygons, on the other hand, was doing
"OffsetUnits * garbage + OffsetFactor * m", where garbage was MRD in the
case of 16-bit depth visual (regardless of the FBO's depth resolution), or
128 * MRD for 24-bit depth visual.

This change just makes the unfilled polygons behavior match the WM's
filled polygons behavior.
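
For reference, the WM formula quoted above can be written out as a small
sketch.  The function names are hypothetical, and MRD is taken as exactly
1/(1<<bits), which the message only gives as an approximation:

```cpp
#include <cassert>

// Approximate minimum resolvable difference for a 16- or 24-bit depth buffer.
double mrd(unsigned depth_bits)
{
   assert(depth_bits == 16 || depth_bits == 24);
   return 1.0 / double(1u << depth_bits);
}

// OffsetUnits * 2 * MRD + OffsetFactor * m, with the "magic" factor of 2
// carried over from the original i965 code.
double wm_polygon_offset(double offset_units, double offset_factor,
                         double m, unsigned depth_bits)
{
   return offset_units * 2.0 * mrd(depth_bits) + offset_factor * m;
}
```

The depth_bits argument here would come from the bound depth buffer, not
from the window system visual.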

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:11 -07:00
Eric Anholt
dba46831b0 i915: Use the current drawbuffer's depth for polygon offset scale.
There's no reason to care about the window system visual's depth for
handling polygon offset in an FBO, and it could only lead to pain.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:11 -07:00
Eric Anholt
c31aee99f3 intel: Add perf debug for glCopyPixels() fallback checks.
The separate function for the fallback checks wasn't particularly
clarifying things, so I put the improved checks in the caller.  (Note that
the dropped _mesa_update_state() had already happened once at the start of
the caller)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:11 -07:00
Eric Anholt
a2ca98b211 i965: Add debug to INTEL_DEBUG=blorp describing hiz/blit/clear ops.
I think we've all added instrumentation at one point or another to see
what's being called in blorp.  Now you can quickly get output like:

Testing glCopyPixels(depth).
intel_hiz_exec depth clear to mt 0x16d9160 level 0 layer 0
intel_hiz_exec depth resolve to mt 0x16d9160 level 0 layer 0
intel_hiz_exec hiz ambiguate to mt 0x16d9160 level 0 layer 0
intel_hiz_exec depth resolve to mt 0x16d9160 level 0 layer 0

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:11 -07:00
Eric Anholt
da00782ed8 ra: Fix register spilling.
Commit 551c991606 tried to avoid spilling
registers that were trivially colorable.  But since we do optimistic
coloring, the top of the stack also contains nodes that are not trivially
colorable, so we need to consider them for spilling (since they are some
of our best candidates).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58384
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63674
NOTE: This is a candidate for the 9.1 branch.
2013-06-26 01:07:11 -07:00
Eric Anholt
c6d74a4992 i965/fs: Dump IR when fatally not compiling due to bad register spilling.
It should never happen, but it does, and at this point, you're going to
_mesa_problem() and abort() (unless it's just in precompile).  Give the
developer something to look at.
2013-06-26 01:07:11 -07:00
Naohiro Aota
95e145aaee xmlpool/build: Make sure to set mo properly
Some shells do not set variables sequentially in a statement, i.e. "a=X
b=${a}" won't set "b" to "X" but to an empty value.

This patch introduces ";" to make sure "mo" is set properly before the
"lang" assignment.

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=471302
2013-06-25 21:22:56 -07:00
Eric Anholt
04e03d9645 i965: Remove the rest of brw_update_draw_buffer().
The last piece of code with an effect was flagging _NEW_BUFFERS.  Only,
that is already flagged from everything that calls this function: Mesa GL
state updates flag it before even calling down into the driver, and the
calls from the DRI2 window system framebuffer update path end up flagging
it as part of the ResizeBuffers() hook.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:22 -07:00
Eric Anholt
c39111509d i965: Stop updating FBO state on drawbuffers change.
The computed fields are updated appropriately as part of the normal draw
call path due to _NEW_BUFFERS being set.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:22 -07:00
Eric Anholt
9d523e3372 i965: Stop recomputing drawbuffer bounds on drawbuffer change.
For winsys FBOs, the bounds are appropriately updated immediately upon
_mesa_resize_framebuffer().  For user FBOs, they're updated as part of the
normal draw path state update due to _NEW_BUFFERS having been flagged.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
15c47481ba i965: Remove _NEW_DEPTH state flagging on drawbuffers change.
Of the places noting a _NEW_DEPTH dependency, all were already checking
for _NEW_BUFFERS if appropriate.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
94ecf913b4 intel: Stop doing special _NEW_STENCIL state flagging on drawbuffers.
2/3 packets depending on Stencil._Enabled already checked for
_NEW_BUFFERS, so just add _NEW_BUFFERS to the remaining one.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
3faccc42ad i965: Stop flagging viewport/scissor change on drawbuffers change.
The viewport (ctx->Viewport._WindowMap) doesn't change with drawable size
changes, and we update scissor (ctx->DrawBuffer->_Xmin and friends) on
_NEW_BUFFERS in things like brw_sf_state.c.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
438f85717d i965: Stop flagging _NEW_POLYGON on drawbuffers change.
Things like brw_sf.c that need to know about orientation are already
recomputing on _NEW_BUFFERS.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
b04c718ebd radeon: Remove gratuitous custom framebuffer resize code.
_mesa_resize_framebuffer(), the default value of the ResizeBuffers hook,
already checks for a window system framebuffer and walks the renderbuffers
calling AllocStorage().

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
17bc8fdb1d intel: Remove gratuitous custom framebuffer resize code.
_mesa_resize_framebuffer(), the default value of the ResizeBuffers hook,
already checks for a window system framebuffer and walks the renderbuffers
calling AllocStorage().

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
d7165b383d mesa: Remove the Initialized field from framebuffers.
This existed to tell the core not to call GetBufferSize, except that even
if you didn't set it nothing happened because nobody had a GetBufferSize.

v2: Remove two more instances of setting the field (from Brian)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:20 -07:00
Eric Anholt
bab755ad1b mesa: Remove Driver.GetBufferSize and its callers.
Only the GDI driver set it to non-NULL any more, and that driver has a
Viewport hook that should keep it limping along as well as it ever has.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:20 -07:00
Vinson Lee
61bfed2d09 glsl: Fix gl_shader_program::UniformLocationBaseScale assert.
commit 26d86d26f9 added
gl_shader_program::UniformLocationBaseScale. According to the code
comments in that commit, UniformLocationBaseScale "must be >=1".

UniformLocationBaseScale is of type unsigned. Coverity reported a "Macro
compares unsigned to 0" defect as well.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-06-25 18:45:01 -07:00
Brian Paul
0b994961ff svga: allow 3D transfers in svga_texture_transfer_map()
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-25 17:54:24 -06:00
Brian Paul
808da7d8ca svga: use new svga_define_texture_level() helper
To get array bounds checking.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-25 17:54:24 -06:00
Brian Paul
2cc27c3faa svga: fix layer/level mix-up in svga_mark_surface_dirty()
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-25 17:54:24 -06:00
Brian Paul
04e3969597 svga: use new svga_age_texture_view() helper
The function does array bounds checking.  Note, this exposes a
bug in the svga_mark_surface_dirty() function: we're calling
svga_age_texture_view() with a texture slice instead of mipmap
level.  This can lead to a failed assertion.  That'll be fixed next.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-25 17:54:24 -06:00
Brian Paul
a4e4a413e5 svga: add array index assertion in svga_validate_sampler_view() 2013-06-25 17:54:24 -06:00
Brian Paul
82d6a52530 svga: use svga_texture() helper instead of casting 2013-06-25 17:54:23 -06:00
José Fonseca
464c6949cb util/debug: Cleanup/improve debug_symbol_name_dbghelp.
- use mgwhelp -- the successor for bfdhelp which does not have a hard
  dependency on BFD, and works on 64bits.
- use a macro instead of hand-typing to dispatch DbgHelp functions
- dump line numbers
- dump module names when symbols are not available
- support 64bits.
- add comments

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-25 18:41:59 +01:00
José Fonseca
a26f834a39 util/debug: Make debug_backtrace_capture work for 64bit windows.
Rely on Windows' CaptureStackBackTrace to do the grunt work.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-25 18:41:59 +01:00
Zack Rusin
29dacd9803 draw: allow overflows in the llvm paths
Because our code couldn't handle it we were skipping rendering
if we detected overflows. According to the spec we should
still render but with all 0 vertices, which is what the llvm
code already does. So for the llvm paths let's enable processing
even if an overflow condition has been detected.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-25 11:57:01 -04:00
Zack Rusin
f96326b2f6 draw: avoid overflows in the llvm draw loop
Before, we could easily overflow if start+count>max integer. To
avoid it we can just iterate over the count. This makes sure
that we never crash, since most of the overflow conditions
are already handled.
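
The difference between the two loop shapes can be sketched in plain C++
(the real loop is emitted as LLVM IR; the functions below are hypothetical
and only count iterations):

```cpp
#include <cassert>
#include <cstdint>

// Iterate over the count: the loop bound never involves start + count.
uint64_t visit_by_count(uint32_t start, uint32_t count)
{
   uint64_t visited = 0;
   for (uint32_t i = 0; i < count; i++) {
      // Widen before adding so the element index itself cannot wrap.
      const uint64_t index = uint64_t(start) + i;
      assert(index >= start);
      visited++;
   }
   return visited;
}

// Fragile form: start + count wraps in 32 bits, so the end bound can land
// below start and the loop body never runs.
uint64_t visit_by_end(uint32_t start, uint32_t count)
{
   uint64_t visited = 0;
   const uint32_t end = start + count;   // may wrap around
   for (uint32_t i = start; i < end; i++)
      visited++;
   return visited;
}
```

With start = 0xFFFFFFF0 and count = 0x20, the first form visits 32 elements
while the second visits none.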

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-25 11:56:41 -04:00
Maarten Lankhorst
e2b02080d8 nvc0: do not set tiled mode on gart bo when fence debugging is used
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-06-25 13:34:15 +02:00
Chia-I Wu
c8240c9dea ilo: honor render condition in blitter
Make pass_render_condition() available for blitter, and check for render
condition in (and only in) clear(), clear_render_target(), and
clear_depth_stencil().
2013-06-25 15:38:07 +08:00
Chia-I Wu
5f4b769127 ilo: remove ilo_shader_internal.h from GEN6 pipeline
Replace direct shader accesses with ilo_shader_get_kernel_param() etc.
2013-06-25 13:51:59 +08:00
Chia-I Wu
63165df90f ilo: remove ilo_shader_internal.h from GEN7 pipeline
Replace direct shader accesses with ilo_shader_get_kernel_param() etc.
2013-06-25 13:51:59 +08:00
Chia-I Wu
855b684141 ilo: speed up ilo_shader_select_kernel_routing() a bit
Remember the order of the source attributes and avoid recomputation when it
does not change.
2013-06-25 13:51:59 +08:00
Chia-I Wu
9b18df6e08 ilo: move SBE setup code to ilo_shader.c
Add ilo_shader_select_kernel_routing() to construct 3DSTATE_SBE.  It is called
in ilo_finalize_states(), rather than in create_fs_state(), as it depends on
VS/GS and rasterizer states.

With this change, ilo_shader_internal.h is no longer needed for
ilo_gpe_gen6.c.
2013-06-25 13:51:58 +08:00
Chia-I Wu
c4fa24ff08 ilo: use ilo_shader_state exclusively in GPE
This allows us to remove ilo_shader_internal.h from ilo_gpe_gen7.c.  The
unfinished code in 3DSTATE_DS, 3DSTATE_HS, and INTERFACE_DESCRIPTOR_DATA is
partly or entirely removed.
2013-06-25 13:18:08 +08:00
Chia-I Wu
91cf6c1e92 ilo: map SO registers at shader compile time
The unmodified pipe_stream_output_info describes its outputs as if they are in
TGSI_FILE_OUTPUT.  Remap the register indices to where they appear in the VUE.

TGSI_SEMANTIC_PSIZE needs a little care because it is at the W channel.
2013-06-25 13:18:08 +08:00
Chia-I Wu
68522bf36c ilo: use ilo_shader_cso for FS
Add ilo_gpe_init_fs_cso() to construct 3DSTATE_PS and shader part of
3DSTATE_WM once and early for fragment shaders.
2013-06-25 13:18:08 +08:00
Chia-I Wu
639a2cddc6 ilo: use ilo_rasterizer_state exclusively in GPE
Replace pipe_rasterizer_state by ilo_rasterizer_state for the remaining GPE
functions for consistency.
2013-06-25 13:18:07 +08:00
Chia-I Wu
54ab03523b ilo: convert pipe_rasterizer_state to ilo_rasterizer_wm
Add ilo_gpe_init_rasterizer_wm() to construct fixed-function part of
3DSTATE_WM once in create_rasterizer_state().
2013-06-25 13:17:56 +08:00
Chia-I Wu
851202c319 ilo: use ilo_shader_cso for GS
Add ilo_gpe_init_gs_cso() to construct 3DSTATE_GS once and early for geometry
shaders.
2013-06-25 13:17:21 +08:00
Chia-I Wu
d209da5e33 ilo: introduce ilo_shader_cso for VS
When a new VS kernel is generated, a newly added function,
ilo_gpe_init_vs_cso(), is called to construct 3DSTATE_VS command in
ilo_shader_cso.  When the command needs to be emitted later, we copy the
command from the CSO instead of constructing it dynamically.
2013-06-25 12:42:04 +08:00
Chia-I Wu
5c8db569ab ilo: add functions to query shaders
Add ilo_shader_get_type() to query the type (PIPE_SHADER_x) of the shader.
Add ilo_shader_get_kernel_offset() and ilo_shader_get_kernel_param() to query
the cache offset and various kernel parameters of the selected kernel.
2013-06-25 12:28:54 +08:00
Chia-I Wu
96e2133e72 ilo: clean up finalize_shader_states()
Add ilo_shader_select_kernel() to replace the dependency table,
ilo_shader_variant_init(), and ilo_shader_state_use_variant().

With the changes, we no longer need to include ilo_shader_internal.h in
ilo_state.c.
2013-06-25 12:10:34 +08:00
Chia-I Wu
f0afedeb75 ilo: use multiple entry points for shader creation
Replace ilo_shader_state_create() by

 ilo_shader_create_vs()
 ilo_shader_create_gs()
 ilo_shader_create_fs()
 ilo_shader_create_cs()

Rename ilo_shader_state_destroy() to ilo_shader_destroy().  The old
ilo_shader_destroy() is renamed to ilo_shader_destroy_kernel().
2013-06-25 11:54:14 +08:00
Chia-I Wu
4d789c76dc ilo: move internal shader interface to a new header
Move it to ilo_shader_internal.h.  The goal is to make files not part of the
compiler include only ilo_shader.h eventually.
2013-06-25 11:51:26 +08:00
Brian Paul
e3cbb18321 gallium/hud: do not use free() for the free_query_data hook
That confuses Gallium's memory debugging code where CALLOC/MALLOC
must be matched with FREE, not free().

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-06-24 14:23:54 -06:00
Matthew McClure
e5bf19ac1c draw: check for out-of-memory conditions in the AA line module.
To prevent segfaults in the AA line module, the code will check for a
valid pointer to the aaline_stage in the draw context.

Fixes segfault from backtrace:

* aaline_stage_from_pipe
  aaline_delete_fs_state

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-24 08:36:47 -06:00
José Fonseca
06badea0da tests/graw: Fix typo in shader-leak.c 2013-06-24 15:29:25 +01:00
José Fonseca
a3d75db022 tools/trace: Fix syntax.
Cleaned/commented up the code, but forgot to actually test before
committing...
2013-06-24 15:28:48 +01:00
Richard Sandiford
5a0556f061 st/dri/sw: Fix pitch calculation in drisw_update_tex_buffer
swrastGetImage rounds the pitch up to 4 bytes for compatibility reasons
that are explained in drisw_glx.c:bytes_per_line, so drisw_update_tex_buffer
must do the same.
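
The rounding in question amounts to the following sketch (padded_pitch() is
a hypothetical name, not the function used in the driver):

```cpp
#include <cassert>
#include <cstdint>

// Bytes per row, rounded up to the next multiple of 4.
uint32_t padded_pitch(uint32_t width, uint32_t bytes_per_pixel)
{
   assert(bytes_per_pixel != 0);
   const uint32_t raw = width * bytes_per_pixel;
   return (raw + 3u) & ~3u;
}
```

On a 16-bit screen an odd width gives a raw pitch that is not a multiple of
4 (101 pixels is 202 bytes, padded to 204), so using the unpadded value
shifts each row a little further than the last.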

Fixes window skew seen while running firefox over vnc on a 16-bit screen.

NOTE: This is a candidate for the stable branches.

[ajax: fixed typo in comment]

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-24 09:52:24 -04:00
Adam Jackson
2151d893fb gallium: Fix llvmpipe on big-endian machines
Squashed commit of the following:

commit 0857a7e105bfcbc4d1431b2cc56612094c747ca3
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:07 2013 -0400

    gallivm: Fix lp_build_rgba8_to_fi32_soa for big endian

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 0d65131649a8aa140e2db228ba779d685c4333e3
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:07 2013 -0400

    gallivm: Fix big-endian machines

    This adds a bit-shift count to the format table, and adds the concept of
    vector or bitwise alignment on gathers.

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 9740bda9b7dc894b629ed38be9b51059ce90818f
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:07 2013 -0400

    llvmpipe: Fix convert_to_blend_type on big-endian

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit ae037c2de0f029e4e99371c0de25560484f0d8df
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    util: Convert color pack to packed formats

    This fixes them on big-endian.

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 5b05ac0c89ae092ea8ba5bba9f739708d7396b5c
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    graw-xlib: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 51396e7d098cb6ff794391cf11afe4dbf86dbea0
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    format: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 417b60bc66eb450e68a92ab0e47f76e292b385e6
Author: Adam Jackson <ajax@redhat.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    st/dri: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 0934b2e022a5e0847d312c40734e2b44cac52fd8
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    st/xlib: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit a307ea3c3716a706963acce7966b5e405ba11db9
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    gbm: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 53eebdd253e1960a645ea278f31d7ef6a6cf4aeb
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    tests: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 2f77fe3ee524945eacd546efcac34f7799fb3124
Author: Adam Jackson <ajax@redhat.com>
Date:   Tue Jun 18 13:07:37 2013 -0400

    gallium: Document packed formats

    Signed-off-by: Adam Jackson <ajax@redhat.com>

commit 1f1017159ce951f922210a430de9229f91f62714
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    gallium: Introduce 32-bit packed format names

    These are for interacting with buffers natively described in terms of
    bit shifts, like X11 visuals:

        uint32_t xyzw8888 = (x << 0) | (y << 8) | (z << 16) | (w << 24);

    Define these in terms of (endian-dependent) aliases to the array-style
    format names.

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 6cc7ab1ee66ed668da78c1d951dfd7782b4e786a
Author: Adam Jackson <ajax@redhat.com>
Date:   Mon Jun 3 12:10:32 2013 -0400

    gallium: Document format name conventions

    v2:
    - Fix a channel name thinko (Michel Dänzer)
    - Elaborate on SCALED versus INT
    - Add links to DirectX and FOURCC docs

    Signed-off-by: Adam Jackson <ajax@redhat.com>

commit df4d269e7fb62051a3c029b84147465001e5776e
Author: Adam Jackson <ajax@redhat.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    gallivm: Remove all notion of byte-swapping

    Signed-off-by: Adam Jackson <ajax@redhat.com>

Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-06-24 09:48:56 -04:00
Roland Scheidegger
d282f4ea9b llvmpipe: fix wrong results for queries not in a scene
The result isn't always 0 in this case (depends on query type),
so instead of special casing this just use the ordinary path (should result
in correct values thanks to initialization in query_begin/end), just
skipping the fence wait.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-22 17:09:37 +02:00
Brian Paul
a415aa9489 gallium/docs: more documentation for pipe_resource::array_size
It should never be zero and for cube/cube_arrays it should be a
multiple of six.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-22 08:50:15 -06:00
Brian Paul
cba7939790 svga: minor cleanups, comments in svga_tgsi_insn.c 2013-06-22 08:49:09 -06:00
Brian Paul
b03f394508 svga: add null ptr check in svga_get_tex_sampler_view()
Trivial.
2013-06-22 08:49:09 -06:00
José Fonseca
67bfdea933 tools/trace: Several tweaks/fixes to dump_state 2013-06-22 12:30:39 +01:00
José Fonseca
545d3d32d8 trace: Dump result of create_stream_output_target 2013-06-22 12:30:39 +01:00
Maarten Lankhorst
6aabd9490c vl/mpeg12: fix mpeg-1 bytestream parsing
This fixes the bytestream parsing of mpeg-1 stream, but still leaves
open a number of issues with the interpretation:
- IDCT mismatch control is not correct for MPEG-1.
- Slices do not have to start and end on the same horizontal row of macroblocks.
- picture_coding_type = 4 (D-pictures) is not handled.
- full_pel_*_vector is not handled.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-06-22 09:40:15 +02:00
Rob Clark
efdc6caaf5 freedreno/a3xx/compiler: ensure min # of cycles after bary instr
The results of a bary.f do not appear to be immediately available, but
there is no explicit sync bit.  Instead the compiler must just ensure
that there are a minimum number of instructions following the bary
before use of the result of the bary.  We aren't clever enough for that
so just throw in some nop's.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-21 15:37:05 -04:00
Rob Clark
d4aaa4439a freedreno/a3xx/compiler: add TGSI_OPCODE_ABS
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-21 15:37:05 -04:00
Rob Clark
fe4ae1163d freedreno/a3xx/compiler: add TGSI_OPCODE_DPH
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-21 15:37:05 -04:00
Rob Clark
3f965556b4 freedreno/a3xx/compiler: fix for replicating instructions
If we are accumulating result into tmp.x, and need a mov to final
destination, we want to move the .x component into all of the components
enabled from the real dest's writemask, ie. we want:

  MOV dst.xyzw tmp.xxxx

rather than:

  MOV dst.xyzw tmp.xyzw

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-21 15:37:05 -04:00
Eric Anholt
0343f20e2f mesa: Move the common _mesa_glsl_compile_shader() code to glsl/.
This code had no relation to ir_to_mesa.cpp, since it was also used by
intel and state_tracker, and most of it was duplicated with the standalone
compiler (which has periodically drifted from the Mesa copy).

v2: Split from the ir_to_mesa to shaderapi.c changes.

Acked-by: Paul Berry <stereotype441@gmail.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-21 10:04:30 -07:00
Eric Anholt
10c14d16d2 mesa: Move shader compiler API code to shaderapi.c
There was nothing ir_to_mesa-specific about this code, but it's not
exactly part of the compiler's core turning-source-into-IR job either.

v2: Split from the ir_to_mesa to glsl/ commit, avoid renaming the sh
    variable.

Acked-by: Paul Berry <stereotype441@gmail.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-21 10:04:29 -07:00
Eric Anholt
88398a817c mesa: Fix missing setting of shader->IsES.
I noticed this while trying to merge code with the builtin compiler, which
does set it.

Note that this causes two regressions in piglit in
default-precision-sampler.* which try to link without a vertex or fragment
shader, due to being run under the desktop glslparsertest binary (using
ARB_ES3_compatibility) that doesn't know about this requirement.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-21 10:04:29 -07:00
Eric Anholt
faf3dbad0d mesa: Use shared code for converting shader targets to short strings.
We were duplicating this code all over the place, and they all would need
updating for the next set of shader targets.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-21 10:04:29 -07:00
Eric Anholt
426ca34b7a glsl: Remove ir_print_visitor.h includes and usage
We have ir->print() to do the old declaration of a visitor and having the
IR accept the visitor (yuck!).  And now you can call _mesa_print_ir()
safely anywhere that you know what an ir_instruction is.

A couple of missing printf("\n")s are added in error paths -- when an
expression is handed to the visitor, it doesn't print '\n' (since it might
be a step in printing a whole expression tree).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-21 10:04:29 -07:00
Eric Anholt
2b049aa53e glsl: Make _mesa_print_ir() available from anything including ir.h.
No more forgetting to #include "ir_print_visitor.h" when doing temporary
debug code, or forgetting and leaving it in after removing your temporary
debug code.  Also, available from C code so you don't need to move the
caller to C++ just to call it (see also: ir_to_mesa.cpp).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-21 10:04:29 -07:00
Paul Berry
d0abac22c3 glsl: Make some files safe to include from C
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-21 10:04:28 -07:00
José Fonseca
2d7e837716 tools/trace: Quick instructions/notes.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:20 +01:00
José Fonseca
c14f516e58 tools/trace: Do a better job at comparing multi line strings.
For TGSI diffing.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:20 +01:00
José Fonseca
9b7d21f8f5 tools/trace: Tool to compare json state dumps.
Copied verbatim from apitrace's scripts/jsondiff.py
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:20 +01:00
José Fonseca
cc4ad695ca tools/trace: Tool to dump gallium state at any draw call.
Based on the code from the good old python state tracker.

Extremely handy to diagnose regressions in state trackers.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:20 +01:00
José Fonseca
a7bccb33b9 tools/trace: Defer blob hex-decoding.
To speed up parsing.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:19 +01:00
José Fonseca
a8f7e12d92 trace: Don't dump texture transfers.
Huge trace files with little value.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:19 +01:00
Chia-I Wu
bbd2d575e6 ilo: replace a boolean by bool
bool is used internally.  This is just cosmetic.
2013-06-20 11:40:20 +08:00
Chia-I Wu
8b2cba8f97 ilo: rename cache_seqno to uploaded
It has been used as a bool since shader cache rework.
2013-06-20 11:36:54 +08:00
Roland Scheidegger
ffebefa114 util: (trivial) add has_popcnt field
Not used yet but there's a couple of places in llvmpipe which should use this
(occlusion count is currently very inefficient if there's no cpu popcnt
instruction).
2013-06-19 23:47:36 +02:00
Roland Scheidegger
5c9aee111e llvmpipe: use 64bit counter for occlusion queries
Some APIs require 64bit and at least for 64bit archs the overhead
should be minimal.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:36 +02:00
Roland Scheidegger
dc5dc4fd94 llvmpipe: handle more queries
Handle PIPE_QUERY_GPU_FINISHED and PIPE_QUERY_TIMESTAMP_DISJOINT, and
also fill out the ps_invocations and c_primitives from the
PIPE_QUERY_PIPELINE_STATISTICS (the others in there should already
be handled). Note that ps_invocations isn't pixel exact, just 16 pixel
exact but I guess it's better than nothing.
Doesn't really seem to work correctly but there's probably bugs elsewhere.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:36 +02:00
Roland Scheidegger
bf5096303f softpipe: handle all queries, and change for the new disjoint semantics
The driver can do render_condition but wasn't handling the occlusion
and so_overflow predicates (though the latter might not work yet due
to gs support).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:36 +02:00
Roland Scheidegger
cdf89d0b5c gallium: fix PIPE_QUERY_TIMESTAMP_DISJOINT
The semantics didn't really make sense, matching neither d3d9
(though the docs are all broken there) nor d3d10. So make it match d3d10
semantics, which actually gives meaning to the "disjoint" part.
Drivers are fixed up in a very primitive way, I have no idea what could
actually cause the counter to become unreliable so just always return
FALSE for the disjoint part.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:35 +02:00
José Fonseca
a0a40805dd trace: Dump pipe_rasterizer_state::clip_halfz.
Trivial.
2013-06-19 18:16:16 +01:00
Brian Paul
1e16e48f88 svga: add some comments about primitive conversion
And clean up the svga_translate_prim() function with better
variable names.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
8b3d4efed8 indices: add some comments
This is pretty complicated code with few, if any, comments.  Here's a first stab.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
2e8c51c98f svga: reindent svga_tgsi.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
0de01a47dd svga: whitespace, comment, formatting fixes in svga_tgsi_emit.h
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
1f57349e20 svga: move some svga/tgsi functions
Move some functions from the svga_tgsi_insn.h header into the
svga_tgsi_insn.c file since they're only used there.  Plus, add
comments and fix formatting.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
3abd9285be svga: formatting fixes in svga_tgsi_insn.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:13 -06:00
Brian Paul
9e6c29bf12 mesa: wrap comments, code to 78 columns in multisample.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:13 -06:00
Brian Paul
bdd5a0c12b mesa: remove unused BITSET64 macros
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:13 -06:00
Maarten Lankhorst
f1cccd6ca0 nvc0: kill assert in ppp code
It's no longer always true, and the video tiling alignment should
ensure the alignment is handled correctly regardless.
2013-06-19 13:08:51 +02:00
Chia-I Wu
cf41fae96b ilo: rework shader cache
The new code makes the shader cache manage all shaders and be able to upload
all of them to a caller-provided bo as a whole.

Previously, we uploaded only the bound shaders.  When a different set of
shaders is bound, we had to allocate a new kernel bo to upload if the current
one is busy.
2013-06-19 16:46:42 +08:00
Emil Velikov
7f7b05d6b3 nv50: avoid crash on updating RASTERIZE_ENABLE state
When doing blit using the 3D engine, the rasterizer cso may be NULL.

Ported from nvc0 commit 8aa8b0539.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-06-19 00:02:24 +02:00
Kristian Høgsberg
712269d674 wayland: Handle global_remove event as well
We need to set up a handler for the global_remove event that gets sent
out when a global gets removed.  Without the handler we end up calling
a NULL pointer.

https://bugs.freedesktop.org/show_bug.cgi?id=65910

NOTE: This is a candidate for the stable branches.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2013-06-18 17:45:19 -04:00
Jordan Justen
adeda5afd4 gen7: fix GPU hang on WebGL texture-size test
When rendering to a texture with BaseLevel set, the miptree may be laid
out such that BaseLevel is in level 0 of the miptree (to avoid wasting
memory on unused levels between 0 and BaseLevel-1).  In that case, we
have to shift our render target's level down to the appropriate level of
the smaller miptree.
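
The level shift can be sketched as follows, in Python rather than the
driver's C. The names miptree_level and first_level are hypothetical;
first_level stands for the GL level that the miptree stores at its own
level 0.

```python
def miptree_level(gl_level, first_level):
    """Translate a GL texture level into a level of the smaller miptree."""
    # Levels below first_level are not stored in the miptree at all.
    assert gl_level >= first_level
    return gl_level - first_level
```

With BaseLevel = 2 stored at miptree level 0, GL level 3 maps to miptree
level 1.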

The WebGL test, in combination with meta code related to
glGenerateMipmap, also triggered a similar failure scenario.

This GPU hang regression was introduced by c754f7a8.

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=65324
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-18 14:06:46 -07:00
Eric Anholt
248fddecd8 intel: Remove unused IS_POWER_OF_TWO() macro.
The is_power_of_two() inline function has been used instead.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-06-18 12:08:08 -07:00
Zack Rusin
9542131b27 Revert "draw: clear the draw buffers in draw"
This reverts commit 41966fdb3b.
While it's a lot cleaner it causes regressions because
the draw interface is always called from the draw functions
of the drivers (because the buffers need to be mapped) which
means that the stream output buffers end up being cleared on
every draw rather than on setting.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-17 21:43:10 -04:00
Roland Scheidegger
8975dc798d llvmpipe: fixes for conditional rendering
honor render_condition for clear_render_target and clear_depth_stencil.
Also add minimal support for occlusion predicate, though it can't be active
at the same time as an occlusion query yet.
While here also switchify some large if-else (actually just mutually
exclusive if-if-if...) constructs.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-18 18:01:24 +02:00
Roland Scheidegger
793e8e3d7e gallium: add condition parameter to render_condition
For conditional rendering this makes it possible to skip rendering
if either the predicate is true or false, as supported by d3d10
(in fact previously it was sort of implied skip rendering if predicate
is false for occlusion predicate, and true for so_overflow predicate).
There's no cap bit for this as presumably all drivers could do it trivially
(but this patch does not implement it for the drivers using true
hw predicates, nvxx, r600, radeonsi, no change is expected for OpenGL
functionality).
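
A sketch of the resulting decision, in Python, assuming that the condition
argument names the predicate value for which rendering is skipped. The
function name should_render is hypothetical.

```python
def should_render(predicate, condition):
    """Return True if rendering proceeds under conditional rendering.

    Assumption: rendering is skipped when the predicate equals condition.
    """
    return predicate != condition
```

Under this assumption, condition=False reproduces the previously implied
occlusion-predicate behavior (skip when the predicate is false), and
condition=True reproduces the so_overflow behavior.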

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-18 18:01:24 +02:00
Chia-I Wu
443dc15cf7 ilo: construct depth/stencil command in create_surface()
Add ilo_gpe_init_zs_surface() to construct

 3DSTATE_DEPTH_BUFFER
 3DSTATE_STENCIL_BUFFER
 3DSTATE_HIER_DEPTH_BUFFER

at surface creation time.  This allows fast state emission in draw_vbo().
2013-06-18 16:23:13 +08:00
Eric Anholt
eb20215075 intel: Allow blorp CopyTexSubImage to nonzero destination slices.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-17 15:43:23 -07:00
Eric Anholt
746b57ef0e intel: Allow blit CopyTexSubImage to nonzero destination slices.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-17 15:43:23 -07:00
Eric Anholt
b0e3c3b852 intel: Directly implement blit glBlitFramebuffer instead of awkward reuse.
This gets us support for blitting to attachment types other than
textures.

v2: fix up comments from review by Kenneth.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-06-17 15:43:23 -07:00
Eric Anholt
815dce9282 intel: Move XRGB->ARGB blit logic into intel_miptree_blit().
Now any caller (such as glCopyPixels()) can benefit from it, and it only
changes the correct subset of the destination instead of a whole teximage.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-17 15:43:23 -07:00
Eric Anholt
04a5e940c9 intel: Fix Y tiling support for glCopyTexSubImage's alpha override.
Apparently we don't have any piglit tests for this, because it would have
assertion failed in a debug build, or just rendered wrong in a non-debug
build if the destination wasn't covering whole tiles.

v2: Use the new macros.

Reviewed-by: Paul Berry <stereotype441@gmail.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-06-17 15:43:23 -07:00
Eric Anholt
78c2fc5925 intel: Make batch macros for doing BCS_SWCTRL setup.
We're going to add more BCS_SWCTRL setup instances soon, and you have to
be careful to have the set and restore atomic with the rendering that's
done, so that our state doesn't leak out to other rendering processes.

v2: Rewrite the patch to have batch begin/advance macros so that magic
    numbers don't get sprinkled around (and so you don't mix up your
    do-I-need-to-reset vs what-do-I-reset-to logic, which I nearly did in
    the next patch when first writing it)

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-17 15:43:13 -07:00
Eric Anholt
b65b1c3148 mesa: Hide weirdness of 1D_ARRAY textures from Driver.CopyTexSubImage().
Intel had brokenness here, and I'd like to continue moving Mesa toward
hiding 1D_ARRAY's ridiculousness inside of the core, like we did with
MapTextureImage.  Fixes copyteximage 1D_ARRAY on intel.

There's still an impedance mismatch in meta when falling back to read and
texsubimage, since texsubimage expects coordinates into 1D_ARRAY as
(width, slice, 0) instead of (width, 0, slice).

v2: Fix offset of scanline reads from the source. (Thanks Brian!), replace
    dd.h comment with Paul's text and replace early exit with an assert.

Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Paul Berry <stereotype441@gmail.com> (v1)
2013-06-17 15:26:20 -07:00
Dave Airlie
9e8400f4c9 tgsi: text parser: fix parsing of array in declaration
I noticed this code didn't work as advertised while doing some passing around
of TGSI shaders and trying to reparse them, and things failing.

This seems to fix it here for at least the small test case I hacked into a
graw test.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-06-18 08:25:12 +10:00
Sven Joachim
0829b893a9 mesa: Fix ieee fp on Alpha
Commit 1f82bf12ed inadvertently broke it, checking for __IEEE_FLOAT on all
Alpha machines instead of only on VMS as before.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Signed-off-by: Sven Joachim <svenjoac@gmx.de>
2013-06-17 10:02:56 -07:00
Richard Sandiford
c132c2978b st/xlib: Fix XImage stride calculation
Fixes window skew seen while running gnome on a 16-bit screen over vnc.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-17 12:15:13 -04:00
Richard Sandiford
876fefe2ff st/xlib Fix XIMage bytes-per-pixel calculation
Fixes a crash seen while running gnome on a 16-bit screen over vnc.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-17 12:14:32 -04:00
Jonathan Gray
ebd68dd029 gallium: replace bswap_32 calls with util_bswap32
byteswap.h and bswap_32 aren't portable, replace them with calls to
gallium's util_bswap32 as suggested by Mark Kettenis.  Lets these files
build on OpenBSD.
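
For reference, a 32-bit byte swap can be written with plain shifts and masks,
which is what makes it portable. This Python sketch illustrates the operation
only; it is not the source of util_bswap32, and the name bswap32 is
hypothetical.

```python
def bswap32(x):
    """Reverse the byte order of a 32-bit unsigned value."""
    x &= 0xFFFFFFFF  # keep only the low 32 bits
    return (((x & 0x000000FF) << 24) |
            ((x & 0x0000FF00) << 8) |
            ((x & 0x00FF0000) >> 8) |
            ((x & 0xFF000000) >> 24))
```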

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-06-17 17:22:28 +02:00
Zack Rusin
7807763dd8 draw: fix a regression in computing max elt
gl can use elts without setting indices, in which case
our eltMax was set to 0 and always invoking the overflow
condition. So by default set eltMax to maximum, it will
be curbed by draw_set_indexes (if it ever comes) and if
not then it will let gl's glVertexPointer/glDrawArrays
work correctly. Fixes piglit's
triangle-rasterization-overdraw test.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-17 11:06:39 -04:00
Zack Rusin
41966fdb3b draw: clear the draw buffers in draw
Moves clearing of the draw so target buffers to the draw
module. They had to be cleared in the drivers before
which was quite messy.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-17 11:06:39 -04:00
Chia-I Wu
98bc4c62a6 ilo: add pipe-based copy method to ilo_blitter
It enables accelerated resource_copy_region() when blt-based method fails.
2013-06-17 18:28:58 +08:00
Chia-I Wu
ebfd7a61c0 ilo: add BLT-based blitting methods to ilo_blitter
Port BLT code in ilo_blit.c to BLT-based blitting methods of ilo_blitter.  Add
BLT-based clears.  The latter is verified with util_clear(), but it is not in
use yet.
2013-06-17 16:36:53 +08:00
Chia-I Wu
b4b3a5c6dc ilo: replace util_blitter by ilo_blitter
ilo_blitter is just a wrapper for util_blitter for now.  We will port BLT code
to ilo_blitter shortly.
2013-06-17 14:37:10 +08:00
Kenneth Graunke
6d7abafdc8 i965: Assume flexible hardware primitive restart exists in the future.
Primitive restart with an arbitrary cut index was first supported as of
Haswell.  It's very doubtful that they'd take that away in future
hardware, so we may as well alter the check now.
2013-06-14 22:58:18 -07:00
Chris Forbes
def84d8014 i965: Shrink Gen5 VUE map layout to be the same as Gen4.
The PRM suggests a larger layout, mostly to support having
gl_ClipDistance[] somewhere predictable for the fixed-function clipper
-- but it didn't actually arrive in Gen5.

Just use the same layout for both Gen4 and Gen5.

No Piglit regressions.

Improves performance in CS:S Video Stress Test by ~3%.

V2: - Remove now-useless function for determining the SF URB read offset
    - Remove now-unused BRW_VARYING_SLOT_POS_DUPLICATE

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-16 01:05:41 +12:00
Kenneth Graunke
1b77d2133c i965: Implement 16-wide math on G45 and Ironlake.
[chrisf:]
Improves performance in CS:S video stress test by about 2%.
No piglit regressions on Ironlake.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-06-16 00:47:50 +12:00
Matt Turner
fcaa48d9cc glsl: Disallow return with a void argument from void functions.
NOTE: This is a candidate for the stable branches.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:49 -07:00
Matt Turner
1a1b03e6bc glsl: Allow implicit conversion of return values.
Required by ARB_shading_language_420pack.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:49 -07:00
Matt Turner
876e16562b glsl: Add gl_{Max,Min}ProgramTexelOffset built-in constants.
Required by ARB_shading_language_420pack. Note that the 420pack spec
incorrectly specifies their values as (Min, Max) = (-7, 8) when they
should be (-8, 7) as listed in the GLSL 4.30 and ESSL 3.0 specs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:49 -07:00
Matt Turner
ed455cdb0b glsl: Allow swizzles on scalars.
Required by ARB_shading_language_420pack.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:49 -07:00
Matt Turner
a8492e8fe7 glsl: Allow .length() method on vectors and matrices.
Required by ARB_shading_language_420pack.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:49 -07:00
Todd Previte
cf7f424e18 mesa: Add infrastructure for ARB_shading_language_420pack.
v2 [mattst88]
  - Split infrastructure into separate patch.
  - Add preprocessor #define.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:48 -07:00
Chia-I Wu
bfa8d21759 ilo: fix for half-float vertex arrays
Commit 6fe0453c33 broke half-float vertex
arrays.  This reverts a part of that commit, and explains why.
2013-06-15 01:00:03 +08:00
Chia-I Wu
36ffd08706 ilo: add some assertions to help debugging
Assert that we do not support user vertex/index/constant buffers.  Issue a
warning when a sampler view is created for a resource without
PIPE_BIND_SAMPLER_VIEW.
2013-06-14 16:02:31 +08:00
Chia-I Wu
0d9afaad35 ilo: silence a compiler warning
The path should never be hit.
2013-06-14 15:36:30 +08:00
Vinson Lee
93534873b0 glsl: Fix null check in read_dereference.
Fixes "Logically dead code" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 22:13:34 -07:00
Chia-I Wu
399548b17f st/mesa: fix temp texture bindings in st_CopyPixels()
The temporary texture should have either PIPE_BIND_RENDER_TARGET or
PIPE_BIND_DEPTH_STENCIL set in addition to PIPE_BIND_SAMPLER_VIEW.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-06-14 08:46:04 +08:00
Zack Rusin
5507c11f85 gallium/draw: add limits to the clip and cull distances
There are strict limits on those registers. Define the maximums
and use them instead of magic numbers. Also allows us to add
some extra sanity checks.
Suggested by Brian.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-13 12:13:11 -04:00
Zack Rusin
b63eeaf7b7 draw: cleanup the distance culling code a bit
We don't need the clamped variable, because we can just
return early. We should also do the regular culling after
the distance culling passes.
All spotted by Brian.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-13 12:13:01 -04:00
Chia-I Wu
c7e9b15010 ilo: mapping a resource may make some states dirty
When a resource is busy and is mapped with
PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE, the underlying bo is replaced.  We need
to mark states affected by the resource dirty.

With this change, we no longer have to emit vertex buffers and index buffer
unconditionally.
2013-06-13 23:47:18 +08:00
Chia-I Wu
5f15050dc9 ilo: bump up PIPE_CAP_GLSL_FEATURE_LEVEL to 140
With UBO and TBO support, we are supposedly good to claim GLSL 1.40.
2013-06-13 23:47:18 +08:00
Chia-I Wu
4df85dbc06 ilo: initialize dirty flags in ilo_init_states()
Now that we have a function to initialize states, initialize dirty flags there
too.
2013-06-13 23:47:18 +08:00
Chia-I Wu
6057d7b7b5 ilo: re-emit states that involve resources
Even with hardware contexts, since we do not pin resources, we have to re-emit
the states so that the resources are referenced (by cp->bo) and their offsets
are updated in case they are moved.  This also allows us to eliminate cp flush
in is_bo_busy().
2013-06-13 12:58:47 +08:00
Chia-I Wu
b65bdc61bd ilo: fix for util_blitter_clear() changes
It has been broken since 17350ea979.
2013-06-13 12:58:47 +08:00
Manfred Ernst
bf2c074a2f mesa: Fix bug in unclamped float to ubyte conversion.
Problem: The IEEE float optimized version of UNCLAMPED_FLOAT_TO_UBYTE
in macros.h computed incorrect results for inputs in the range
0x3f7f0000 (=0.99609375) to 0x3f7f7f80 (=0.99803924560546875)
inclusive.  0x3f7f7f80 is the IEEE float value that results in 254.5
when multiplied by 255.  With rounding mode "round to closest even
integer", this is the largest float in the range 0.0-1.0 that is
converted to 254 by the generic implementation of
UNCLAMPED_FLOAT_TO_UBYTE.  The IEEE float optimized version
incorrectly defined the cut-off for mapping to 255 as 0x3f7f0000
(=255.0/256.0). The same bug was present in the function
float_to_ubyte in u_math.h.

Fix: The proposed fix replaces the incorrect cut-off value by
0x3f800000, which is the IEEE float representation of 1.0f. 0x3f7f7f81
(or any value in between) would also work, but 1.0f is probably
cleaner.

The patch does not regress piglit on llvmpipe and on i965 on sandy
bridge.
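
The effect of the cut-off can be reproduced with a small Python sketch that
emulates float32 arithmetic through the struct module. Only the two cut-off
constants come from this message. The scale-and-bias step (multiply by
255/256, add 32768.0, read the low mantissa byte) is an assumption about how
such a fast path is typically written, and the function names are
hypothetical.

```python
import struct

IEEE_ONE = 0x3F800000    # bit pattern of 1.0f, the corrected cut-off
OLD_CUTOFF = 0x3F7F0000  # bit pattern of 255.0/256.0, the old cut-off


def _f32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def _bits(x):
    """Return the float32 bit pattern of x as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def float_to_ubyte_fast(f, cutoff=IEEE_ONE):
    """Convert an unclamped float to 0..255 using bit-pattern comparisons."""
    bits = _bits(f)
    if bits & 0x80000000:  # sign bit set: negative inputs clamp to 0
        return 0
    if bits >= cutoff:     # at or above the cut-off: clamp to 255
        return 255
    # Assumed trick: near 32768.0 a float32 has a step of 1/256, so after
    # the addition the low mantissa byte holds f * 255 rounded to nearest.
    tmp = _f32(_f32(_f32(f) * (255.0 / 256.0)) + 32768.0)
    return _bits(tmp) & 0xFF
```

With the old cut-off, float_to_ubyte_fast(0.99609375, cutoff=OLD_CUTOFF)
returns 255, while the corrected cut-off gives 254, matching 0.99609375 * 255
= 254.00390625 rounded to the nearest integer.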

Tested-by Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-12 20:24:48 -07:00
Marek Olšák
3475b22133 st/dri: if flushing a drawable, don't set reason=SWAPBUFFERS
0 means SWAPBUFFERS.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
a713d7b1b9 st/dri: resolve the back buffer only in SwapBuffers
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
3b525036b9 st/dri: manually swap MSAA front and back buffers in SwapBuffers
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
b77316ad75 st/dri: always copy new DRI front and back buffers to corresponding MSAA buffers
This commit fixes these piglit tests with an MSAA visual forced on:
- read-front
- glx-copy-sub-buffer

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
fdf9d234e2 st/dri: refactor dri_msaa_resolve
The generic blit will be used by the following commit.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
6c6cfc02c9 st/dri: reuse depth-stencil and MSAA resources after DRI2 invalidate event
Page flipping generates an invalidate event every frame, causing reallocations
of all private resources (MSAA and depth-stencil).

Reusing the resources may improve performance (especially under memory
pressure).

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
683b065320 st/dri: fix MSAA resolving of buffers with height > width
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
526ebfa278 st/mesa: make generic CopyPixels path work with MSAA visuals
We have to use pipe->blit, not resource_copy_region, so that the read buffer
is resolved if it's multisampled. I also removed the CPU-based copying,
which just did format conversion (obsoleted by the blit).

Also, the layer/slice/face of the read buffer is taken into account (this was
ignored).

Last but not least, the format choosing is improved to take float and integer
read buffers into account.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
9ef44e6eb7 st/mesa: don't use blit_copy_pixels if an occlusion query is active
CopyPixels, just as DrawPixels, should count the samples that passed
depth test.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
79e421260a st/mesa: rework blit_copy_pixels to use pipe->blit
There were 2 issues with it:
- resource_copy_region doesn't allow different sample counts of both src
  and dst, which can occur if we blit between a window and a FBO, and
  the window has an MSAA colorbuffer and the FBO doesn't.
  (this was the main motivation for using pipe->blit)
- blitting from or to a non-zero layer/slice/face was broken, because
  rtt_face and rtt_slice were ignored.

blit_copy_pixels is now used even if the formats and orientation of
framebuffers don't match.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
4d59258856 r600g: upsample and downsample MSAA resources for transfers
We did downsample (=resolve) MSAA resources to make ReadPixels work with MSAA
GLX visuals, which was enough for read-only color-only transfers.

This commit makes write color transfers and depth-stencil transfers work
in a similar manner. It does downsampling in transfer_map and upsampling
in transfer_unmap.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
72a086b8b2 gallium/u_format: add a new helper for initializing pipe_blit_info::mask
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
d6d4a9a2e8 gallium/u_blitter: make clearing independent of the colorbuffer format
There isn't any difference between 32_FLOAT and 32_*INT in vertex fetching.
Both of them don't do any format conversion.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
17350ea979 gallium/u_blitter: make clearing independent of the number of bound colorbuffers
We can use the fragment shader TGSI property WRITES_ALL_CBUFS.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
de1c38299c gallium/util: make WRITES_ALL_CBUFS optional in the passthrough fragment shader
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
45595d5066 mesa: fix OES_EGL_image_external being partially allowed in the core profile
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-13 03:54:13 +02:00
Ian Romanick
cfa3c5ad82 glsl: Generate smaller values for uniform locations
Previously we would generate uniform locations as (slot << 16) +
array_index.  We do this to handle applications that assume the location
of a[2] will be +1 from the location of a[1].  This resulted in every
uniform location being at least 0x10000.  The OpenGL 4.3 spec was
amended to require this behavior, but previous versions did not require
locations of array (or structure) members be sequential.

We've now encountered two applications that assume uniform values will
be "small."  As far as we can tell, these applications store the GLint
returned by glGetUniformLocation in a int16_t or possibly an int8_t.

THIS BEHAVIOR IS NOT GUARANTEED OR IMPLIED BY ANY VERSION OF OpenGL.

Other implementations happen to have both these behaviors (sequential
array elements and small values) since OpenGL 2.0, so let's just match
their behavior.

Fixes "3D Bowling" on Android.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-12 16:30:29 -07:00
Ian Romanick
26d86d26f9 glsl: Add gl_shader_program::UniformLocationBaseScale
This is used by _mesa_uniform_merge_location_offset and
_mesa_uniform_split_location_offset to determine how the base and offset
are packed.  Previously, this value was hard coded as (1U<<16) in those
functions via the shift and mask contained therein.  The value is still
(1U<<16), but it can be changed in the future.

The next patch dynamically generates this value.
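
A sketch of the packing with a configurable scale, in Python. The function
names loosely mirror the Mesa helpers but are hypothetical, and the
multiply/divide form is an assumption about how a general scale replaces the
fixed shift and mask.

```python
def merge_location_offset(base, offset, scale=1 << 16):
    """Pack a uniform slot and an array offset into one location value."""
    # The offset must fit below the scale, or the split would be ambiguous.
    assert 0 <= offset < scale
    return base * scale + offset


def split_location_offset(location, scale=1 << 16):
    """Recover (base, offset) from a packed location value."""
    return divmod(location, scale)
```

With the default scale of 1 << 16, slot 3 and offset 2 pack to 0x30002; with
a smaller scale such as 4 the same pair packs to 14, which shows how a
dynamically chosen scale keeps location values small.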

NOTE: This is a candidate for stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-12 16:30:18 -07:00
Ian Romanick
5097f35841 glsl: Add a gl_shader_program parameter to _mesa_uniform_{merge,split}_location_offset
This will be used in the next commit.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-12 16:30:06 -07:00
Roland Scheidegger
4cce4efaa3 util: new util_fill_box helper
Use new util_fill_box helper for util_clear_render_target.
(Also fix off-by-one map error.)

v2: handle non-zero z correctly in new helper

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-13 00:41:43 +02:00
Roland Scheidegger
957c040eb8 gallivm: (trivial) remove duplicated code block (including comment)
2013-06-13 00:41:43 +02:00
Paul Berry
b09a754078 i965/gen7: Enable support for fast color clears.
This patch adds code to place mcs_state into INTEL_MCS_STATE_RESOLVED
for miptrees that are capable of supporting fast color clears.  This
will have no effect on buffers that don't undergo a fast color clear;
however, for buffers that do undergo a fast color clear, an MCS
miptree will be allocated (at the time of the first fast clear), and
will be used thereafter.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:07 -07:00
Paul Berry
ef9142d4a3 i965/gen7+: Disable fast color clears on shared regions.
In certain circumstances the memory region underlying a miptree is
shared with other miptrees, or with other code outside Mesa's control.
This happens, for instance, when an extension like GL_OES_EGL_image or
GLX_EXT_texture_from_pixmap is used to associate a miptree
with an image existing outside of Mesa.

When this happens, we need to disable fast color clears on the miptree
in question, since there's no good synchronization mechanism to ensure
that deferred clear writes get performed by the time the buffer is
examined from the other miptree, or from outside of Mesa.

Fortunately, this should not be a performance hit for most
applications, since most applications that use these extensions use
them for importing textures into Mesa, rather than for exporting
rendered images out of Mesa.  So most of the time the miptrees
involved will never experience a clear.

v2: Rework based on the fact that we have decided not to use an
accessor function to protect access to the region.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:07 -07:00
Paul Berry
67cd0f9703 i965/gen7+: Resolve color buffers when necessary.
Resolve color buffers that have been fast-color cleared:
    1. before texturing from the buffer (brw_predraw_resolve_buffers())
    2. before using the buffer as the source in a blorp blit
       (brw_blorp_blit_miptrees())
    3. before mapping the buffer's miptree (intel_miptree_map_raw(),
       intel_texsubimage_tiled_memcpy())
    4. before accessing the buffer using the hardware blitter
       (intel_miptree_blit(), do_blit_bitmap())

v2: Rework based on the fact that we have decided not to use an
accessor function to protect access to the region.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:07 -07:00
Paul Berry
e9dfcb38e9 i965/gen7+: Ensure that front/back buffers are fast-clear resolved.
We already had code in intel_downsample_for_dri2_flush() for
downsampling front and back buffers when multisampling was in use.
This patch extends that function to perform fast color clear resolves
when necessary.

To account for the additional functionality, the function is renamed
to simply intel_resolve_for_dri2_flush().

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:07 -07:00
Paul Berry
418aecea7d i965/blorp: Write blorp code to do render target resolves.
This patch implements the "render target resolve" blorp operation.
This will be needed when a buffer that has experienced a fast color
clear is later used for a purpose other than as a render target
(texturing, glReadPixels, or swapped to the screen).  It resolves any
remaining deferred clear operation that was not taken care of during
normal rendering.

Fortunately not much work is necessary; all we need to do is scale
down the size of the rectangle primitive being emitted, run the
fragment shader with the "Render Target Resolve Enable" bit set, and
ensure that the fragment shader writes to the render target using the
"replicated color" message.  We already have a fragment shader that
does that (the shader that we use for fast color clears), so for
simplicity we re-use it.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:07 -07:00
Paul Berry
fac32c0bd3 i965/blorp: Expand clear class hierarchy to prepare for RT resolves.
The fragment shaders that do color clears will be re-used to
perform so-called "render target resolves" (the resolves associated
with fast color clears).  To prepare for that, this patch expands the
class hierarchy for blorp params by adding
brw_blorp_const_color_params (which will be used for all blorp
operations where the fragment shader outputs a constant color).

Some other data structures and functions were also renamed to use
"const_color" nomenclature where appropriate.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:06 -07:00
Paul Berry
5e5d4e021f i965/gen7+: Implement fast color clear operation in BLORP.
Since we defer allocation of the MCS miptree until the time of the
fast clear operation, this patch also implements creation of the MCS
miptree.

In addition, this patch adds the field
intel_mipmap_tree::fast_clear_color_value, which holds the most recent
fast color clear value, if any. We use it to set the SURFACE_STATE's
clear color for render targets.

v2: Flag BRW_NEW_SURFACES when allocating the MCS miptree.  Generate a
perf_debug message if clearing to a color that isn't compatible with
fast color clear.  Fix "control reaches end of non-void function"
build warning.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:06 -07:00
Paul Berry
dd3f950115 i965/gen7+: Create helper functions for single-sample MCS buffers.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 10:45:42 -07:00
Paul Berry
460b7bc7a1 i965/gen7+: Set up MCS in SURFACE_STATE whenever MCS is present.
On Gen7+, MCS buffers are used both for compressed multisampled color
buffers and for "fast clear" of single-sampled color buffers.

Previous to this patch series, we didn't support fast clear, so we
only used MCS with multisampled color buffers.

As a first step to implementing fast clears, this patch modifies the
code that sets up SURFACE_STATE so that it configures the MCS buffer
whenever it is present, regardless of whether we are multisampling or
not.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 10:45:42 -07:00
Paul Berry
7e5cb4bc4c i965/gen7+: Create an enum for keeping track of fast color clear state.
This patch includes code to update the fast color clear state
appropriately when rendering occurs.  The state will also need to be
updated when a fast clear or a resolve operation is performed; those
state updates will be added when the fast clear and resolve operations
are added.

v2: Create a new function, intel_miptree_used_for_rendering() to
handle updating the fast color clear state when rendering occurs.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 10:45:42 -07:00
Paul Berry
8f5147c199 intel: Conditionally compile mcs-related code for i965 only.
This patch ifdefs out intel_mipmap_tree::mcs_mt when building the i915
(pre-Gen4) driver (MCS buffers aren't supported until Gen7, so there
is no need for this field in the i915 driver).  This should make it a
bit easier to implement fast color clears without undue risk to i915.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 10:45:42 -07:00
Paul Berry
a5efdca7b7 intel: Keep region name in intel_miptree_create_for_dri2_buffer().
When processing a buffer received from the X server,
intel_process_dri2_buffer() examines intel_region::name to determine
whether it's received a brand new buffer, or the same buffer it
received from the X server the last time it made a request.

However, this didn't work properly, because in the call to
intel_miptree_create_for_dri2_buffer(), we create a fresh intel_region
object to represent the buffer, and this was causing us to forget the
buffer's previous name.

This patch fixes things by copying over the region name when creating
the fresh intel_region object.

At the moment, this is just a minor performance optimization.
However, when fast color clears are added, it will be necessary to
ensure that the fast color clear state for a buffer doesn't get
discarded the next time we receive that buffer from the X server.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 10:45:42 -07:00
Chia-I Wu
adf324ad28 winsys/intel: make struct intel_bo alias drm_intel_bo
There is really nothing in struct intel_bo, and having it alias drm_intel_bo
makes the winsys impose almost zero overhead.

We can eliminate the overhead completely by making the functions static
inline, if needed.
2013-06-12 17:46:52 +08:00
Chia-I Wu
e7a14eea16 winsys/intel: reorganize functions
Move functions around to match the order of the declarations in the header.
2013-06-12 17:46:52 +08:00
Chia-I Wu
39226705b7 ilo: update winsys interface
The motivation is to kill tiling and pitch in struct intel_bo.  That requires
us to make tiling and pitch not queryable, and be passed around as function
parameters.
2013-06-12 17:46:52 +08:00
Chia-I Wu
cdfb2163c4 ilo: get rid of function tables in winsys
We are moving toward making struct intel_bo alias drm_intel_bo.  As a first
step, we cannot have function tables.
2013-06-12 17:46:52 +08:00
Chia-I Wu
6fe0453c33 ilo: access bo size directly
buf->bo_size is readily available, no need to go via buf->bo->get_size().
2013-06-12 17:46:52 +08:00
Chia-I Wu
3f79188854 ilo: remove unnecessary tex_set_bo/buf_set_bo
Merge the bodies to tex_create_bo/buf_create_bo respectively.
2013-06-12 17:46:52 +08:00
Kenneth Graunke
b00d61151d i965: Emit the depth/stencil state pointer directly, not via atoms.
See two commits ago for the rationale.  This allows us to delete the
whole gen7_cc_state.c file.

This does move these commands before the depth stall flushes from
brw_emit_depthbuffer, which may be a problem.  The documentation for
3DSTATE_DEPTH_BUFFER mentions that depth stall flushes are required
before changing any depth/stencil buffer state, but explicitly lists
3DSTATE_DEPTH_BUFFER, 3DSTATE_HIER_DEPTH_BUFFER, 3DSTATE_STENCIL_BUFFER,
and 3DSTATE_CLEAR_PARAMS.  It does not mention this particular packet
(_3DSTATE_DEPTH_STENCIL_STATE_POINTERS).

No observed Piglit regressions on Sandybridge or Ivybridge.

Together with the last two commits, this makes a cairo-gl benchmark
faster by 0.324552% +/- 0.258355% on Ivybridge.  No statistically
significant change on Sandybridge.  (Thanks to Eric for the numbers.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-11 15:42:17 -07:00
Kenneth Graunke
8ab15bacf4 i965: Emit the CC state pointer directly rather than via atoms.
See the previous commit for the rationale.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-11 15:42:17 -07:00
Kenneth Graunke
da1a896b0f i965: Emit the BLEND_STATE pointer directly rather than via atoms.
Previously, we would:
1. Emit the new indirect state.
2. Flag CACHE_NEW_BLEND_STATE.
3. Rely on later state atoms to notice CACHE_NEW_BLEND_STATE and emit a
   pointer to the new indirect state.

This is rather cumbersome: it requires two state atoms instead of one,
and there's a strict ordering dependency in the list.  Plus, the code
gets spread across two functions (or even files in the case of Gen7+).

Gen7+ has a packet to update just the blend state pointer, so it makes a
lot of sense to simply emit that right away.  Gen6 has a combined packet
which updates blending, the color calculator, and depth/stencil state;
however, each can still be modified independently.

This drops the Gen6 micro-optimization where we tried to only emit one
packet that changed all three states.  State updates are pretty cheap.

CACHE_NEW_BLEND_STATE is no longer necessary, so drop it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-11 15:42:16 -07:00
Zack Rusin
babe35a067 draw: implement distance culling
Works similarly to clip distance. If the cull distance is negative
for all vertices against a specific plane then the primitive
is culled.
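
A minimal sketch of the per-plane test described above, assuming the cull distances of one primitive for a single plane have already been gathered into an array. The function name and layout are hypothetical and are not the draw module's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when every vertex of the primitive has a negative cull
 * distance against one plane, i.e. the primitive lies entirely outside. */
static bool
primitive_is_culled(const float *cull_dist, size_t num_verts)
{
   assert(num_verts > 0);
   for (size_t i = 0; i < num_verts; i++) {
      if (!(cull_dist[i] < 0.0f))
         return false;   /* one vertex on the visible side keeps it */
   }
   return true;
}
```

Unlike clipping, nothing is cut: the primitive is either kept whole or dropped.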

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-10 22:04:28 -04:00
Zack Rusin
3d08eada34 gallium: add a cull distance semantic
cull distance is analogous to clip distance. If a register is
given this semantic, then the values in it are assumed to be a
float32 distance to a plane. Primitives will be completely
discarded if the plane distances for all of the vertices in
the primitive are < 0.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-10 22:04:28 -04:00
Zack Rusin
0a3779d955 draw: fix clipper invocation statistics
We need to figure out the number of invocations of the clipper
before the emit, because in the emit we are after clipping
where the number of primitives will be equal to number of clipper
invocations minus the clipped primitives. So our computations
were always off by the number of clipped primitives.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-10 22:04:28 -04:00
Zack Rusin
2b2e7bb133 draw: enable user plane clipping when clipdistance is used
Draw depended on clip_plane_enable being set in the rasterizer
to use clipdistance registers for clipping. That's really
unfriendly because it requires that rasterizer state to have
variants for every shader out there. Instead of depending on
the rasterizer, let's extract the info from the available state:
if a shader writes clipdistance then we need to use it and we
need to clip using a number of planes equal to the number
of written clipdistance components. This way clipdistances
just work.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-10 22:04:27 -04:00
Zack Rusin
c1a50f5ed7 draw: make sure clipdistances work with geometry shaders
We were always fetching the info from the vertex shader, but if
a geometry shader is present it should be used as the source of
that info.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-10 22:04:27 -04:00
Kenneth Graunke
3dacb7d40b Revert "i965: Disable unused pipeline stages once at startup on Gen7+."
This reverts commit 6c966ccf07.

Apparently causes GPU hangs.

Conflicts:
	src/mesa/drivers/dri/i965/brw_state.h
	src/mesa/drivers/dri/i965/brw_state_upload.c
2013-06-11 10:53:44 -07:00
Brian Paul
42adf5f0dd swrast: add texfetch code for some XBGR formats
Fixes piglit texture-packed-formats regression.  We need to implement
more XBGR formats here eventually, but many are UINT/SINT formats
which swrast doesn't handle yet anyway (integer textures).

Bugzilla https://bugs.freedesktop.org/show_bug.cgi?id=64935

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-11 08:26:38 -06:00
Brian Paul
91405e3502 mesa: add missing texture strings in tex_target_name()
And add a static assert for the future.
2013-06-10 16:35:35 -06:00
Alex Deucher
761320b197 winsys/radeon: add env var to disable VM on Cayman/Trinity
Set env var RADEON_VA=0 to disable VM on Cayman/Trinity.
Useful for debugging.

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-06-10 18:02:57 -04:00
Eric Anholt
fceff14450 mesa: Add a _mesa_problem to document a piglit failure on i965.
Having figured out what was going on with piglit fbo-depth copypixels
GL_DEPTH_COMPONENT32F (falling all the way back to swrast on CopyPixels to
a float depth buffer), I'm not inclined to fix the problem currently but
it seems worth saving someone else the debug time.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-06-10 14:04:25 -07:00
Eric Anholt
9a0bd682f9 i965/vs: Avoid the MUL/MACH/MOV sequence for small integer multiplies.
We do a lot of multiplies by 3 or 4 for skinning shaders, and we can avoid
the sequence if we just move them into the right argument of the MUL.

On pre-IVB, this means reliably putting a constant in a position where it
can't be constant folded, but that's still better than MUL/MACH/MOV.

Improves GLB 2.7 trex performance by 0.788648% +/- 0.23865% (n=29/30)

v2: Fix test for pre-sandybridge.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
2013-06-10 14:04:24 -07:00
Eric Anholt
d28e285d41 i965/vs: Allow copy propagation into MUL/MACH.
This is a trivial port of 1d6ead3804 from
the FS.

No significant performance difference on trex (misplaced the data, but it
was about n=20).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-06-10 14:04:24 -07:00
Eric Anholt
263a7e4cd9 i965/vs: Use the MAD instruction when possible.
This is different from how we do it in the FS - we are using MAD even when
some of the args are constants, because with the relatively unrestrained
ability to schedule a MOV to prepare a temporary with that data, we can
get lower latency for the sequence of instructions.

No significant performance difference on GLB2.7 trex (n=33/34), though it
doesn't have that many MADs.  I noticed MAD opportunities while reading
the code for the DOTA2 bug.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-06-10 14:04:24 -07:00
Richard Sandiford
1ff10f92e7 draw: Add A8R8G8B8 to draw_print_arrays
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>
2013-06-10 16:28:31 -04:00
Richard Sandiford
5876a4c71d draw: Fix type mismatch between draw_private.h and LLVM
draw_vertex_buffer declared the size field to be a size_t, but the LLVM
code used an int32 instead.  This caused problems on big-endian 64-bit
targets, because the first 32-bit chunk of the 64-bit size_t was always 0.
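
The effect can be reproduced on any host by writing the bytes in big-endian order explicitly. This is only an illustrative sketch; the helper names are made up and nothing here is taken from draw_private.h.

```c
#include <assert.h>
#include <stdint.h>

/* Write a 64-bit value into memory in big-endian byte order, the way a
 * 64-bit size_t field is laid out on a big-endian target. */
static void
store_be64(uint8_t out[8], uint64_t value)
{
   for (int i = 0; i < 8; i++)
      out[i] = (uint8_t)(value >> (56 - 8 * i));
}

/* Read four bytes as a big-endian 32-bit value, which is what a 32-bit
 * load at the given address sees on such a target. */
static uint32_t
load_be32(const uint8_t in[4])
{
   return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
          ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}
```

For any size below 4 GiB, a 32-bit load at the start of the field returns 0 and the real value sits in the second word. On little-endian targets the low word comes first, which is why the mismatch went unnoticed there.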

In one sense size_t seems like a good choice for a size, so one fix
would have been to try to get the LLVM code to use the equivalent of
size_t too.  However, in practice, the size is taken from things like ~0
or width0, both of which are int-sized, so it seemed simpler to make the
size field int-sized as well.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-10 16:26:14 -04:00
Richard Sandiford
337f21bc35 util: Use sizeof(void *) rather than 0 as the fallback cache line size
Without this, llvmpipe ends up giving a zero size to all uncompressed textures
on non-x86 systems, since align() cannot handle a 0 alignment.
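
To see why a zero alignment is fatal, consider the usual power-of-two rounding idiom. This is a sketch that assumes align() is implemented this way; the names below are hypothetical.

```c
#include <assert.h>

/* Round value up to a multiple of alignment, assuming alignment is a
 * power of two.  With alignment == 0 the mask ~(0 - 1) is 0, so the
 * result is always 0. */
static unsigned
align_up_pot(unsigned value, unsigned alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Fall back to the pointer size when no cache line size was detected. */
static unsigned
cacheline_or_fallback(unsigned detected)
{
   return detected ? detected : (unsigned)sizeof(void *);
}
```

With the fallback in place the alignment is never zero, so sizes are rounded up as intended instead of being collapsed to zero.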

Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-10 16:26:09 -04:00
Richard Sandiford
ba6cd796dd llvmpipe: Use saturating add/sub for UNORM formats
lp_build_add and lp_build_sub have fallback code for cases
that cannot be handled by known intrinsics.  For UNORM formats,
this code was using modulo rather than saturating arithmetic.
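
For 8-bit UNORM channels the difference looks like this in scalar C. It is a sketch of the arithmetic only, with hypothetical names; the real fallback is emitted as LLVM IR on vectors.

```c
#include <assert.h>
#include <stdint.h>

/* Modulo (wrapping) add: what the old fallback effectively computed. */
static uint8_t
add_modulo_u8(uint8_t a, uint8_t b)
{
   return (uint8_t)(a + b);
}

/* Saturating add: clamp to the largest representable value. */
static uint8_t
add_sat_u8(uint8_t a, uint8_t b)
{
   unsigned sum = (unsigned)a + (unsigned)b;
   return sum > 255u ? 255u : (uint8_t)sum;
}

/* Saturating subtract: clamp to zero instead of wrapping around. */
static uint8_t
sub_sat_u8(uint8_t a, uint8_t b)
{
   return a > b ? (uint8_t)(a - b) : 0;
}
```

Adding 200 and 100 wraps to 44 with modulo arithmetic but clamps to 255 (i.e. 1.0) when saturating, which is the result a UNORM format requires.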

This fixes some rendering issues for a gnome session on System z.
It also fixes various piglit tests on z, such as
spec/ARB_color_buffer_float/GL_RGBA8-render.

The patch deliberately doesn't tackle the more complicated
SNORM case.

Tested against piglit on x86_64 and System z with no regressions.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-10 16:20:45 -04:00
Kenneth Graunke
a0037cecd1 intel: Reserve less batchbuffer space.
Now that Gen6+ relies on hardware contexts, we don't need to record an
occlusion query value at the end of each batch.  That means we no longer
need to reserve space for the absurd number of PIPE_CONTROLs required to
do that on Sandybridge.

See commit 4e087de51a, which bumped this
up to 60 bytes.  This is not quite a revert, as it uses 24 bytes instead
of 16, and saves the comments.  As far as I can tell, the old value of
16 bytes was just wrong, so we shouldn't go back to that.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:58:51 -07:00
Kenneth Graunke
fc800f0c60 i965: Allocate push constant L3 space once at startup on Gen7+.
We always allocate the maximum amount of space and never change it, so
it makes sense to do it once.  Programming it on startup also lets us
skip re-programming it from BLORP.

This removes a tiny amount of overhead from our drawing loop.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:58:47 -07:00
Kenneth Graunke
6c966ccf07 i965: Disable unused pipeline stages once at startup on Gen7+.
This removes a tiny bit of code from our drawing loop.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:58:46 -07:00
Kenneth Graunke
b607d57630 i965: Don't emit PIPELINE_SELECT from BLORP.
Now that we emit invariant state at startup (and never select the media
pipeline), the 3D pipeline will always already be selected, even if BLORP
is the first operation.  So this is unnecessary.

v2: Fix unused variable warning (intel_context is no longer used).

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:58:44 -07:00
Kenneth Graunke
d671eb140f i965: Emit invariant state once at startup on Gen6+.
Now that we have hardware contexts, we can safely initialize our GPU
state once at startup, rather than needing a state atom with the
BRW_NEW_CONTEXT flag set.

This removes a tiny bit of code from our drawing loop.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:58:42 -07:00
Kenneth Graunke
33b90804ee i965: Delete some dead state atom prototypes.
These atoms don't actually exist.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:58:40 -07:00
Kenneth Graunke
233de8e8d3 i965: Change return type of check_state() to bool.
The existing code already returned a boolean; this just clarifies that.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:58:38 -07:00
Kenneth Graunke
650d5de6ea i965: Remove unused second parameter of brw_print_dirty_count().
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:58:29 -07:00
Kenneth Graunke
ca6b520f3a glsl: Allow the use of determinant() in GLSL 1.50.
We already implemented this for ES3, so we just need to turn it on.

Fixes 6 Piglit tests:
spec/glsl-1.50/compiler/built-in-functions/determinant-mat[234].{vert,frag}

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:54:57 -07:00
Kenneth Graunke
603940d5bb glcpp: Automatically #define GL_core_profile 1 on GLSL 1.50+.
Page 17 of the GLSL 1.50.11 specification states:
"There is a built-in macro definition for each profile the
 implementation supports.  All implementations provide the following
 macro:

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:54:56 -07:00
Kenneth Graunke
e203919a4e glsl: Parse "#version 150 core" directives.
Previously we only supported "#version 150".  This patch recognizes
"compatibility" to give the user a more descriptive error message.

Fixes Piglit's version-150-core-profile test.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:54:42 -07:00
Kenneth Graunke
f730b1f72a glsl: Bail on parsing if the #version directive is bogus.
If we didn't successfully parse the #version line, there's no point in
continuing with parsing and compiling: it's already failed.

Furthermore, it can actually be harmful: right after handling #version,
we call _mesa_glsl_initialize_types(), which checks state->es_shader and
language_version.  If it isn't valid, it hits an assertion failure.

Fixes Piglit's "invalid-version-es."  When processing "#version 110 es",
our code set state->es_shader and state->language_version = 110.  It
then properly determined that this was invalid and flagged an error.
Since we continued anyway, we hit the assertion mentioned above.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-10 10:50:12 -07:00
Chris Forbes
a2e3b1c4e2 dlist: fix save_SamplerParameteri
This was building the temporary array to pass to
save_SamplerParameteriv, and then not passing it.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-06-09 14:00:40 -07:00
Vinson Lee
ce1f85133d mesa: Prevent possible out-of-bounds read by save_SamplerParameteriv.
Fixes "Out-of-bounds access" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-08 13:32:53 -07:00
Maarten Lankhorst
26e047dec8 nvc0: fix up video buffer alignment requirements
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-06-08 20:11:33 +02:00
Rob Clark
e9edbf0a68 freedreno: better scissor fix
Actually respect rasterizer state.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-08 13:15:51 -04:00
Rob Clark
4af1dcbb7d freedreno: gmem bypass
The GPU (at least a3xx, but I think also a2xx) can render directly to
memory, bypassing tiling.  Although it can't do this if blend, depth,
and a few other features of the pipeline are enabled.  This direct
memory mode can be faster for some sorts of operations, such as simple
blits.  In particular, this significantly speeds up XA by avoiding the need to
pull the entire dest pixmap into GMEM, render tiles, and write it all
back out again.  This should also speed up resource copy-region and
blit.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-08 13:15:51 -04:00
Rob Clark
2855f3f7bc freedreno: add a3xx support
The adreno a3xx GPU is found in newer snapdragon devices, such as the
nexus4.  The a3xx is GLESv3 and OpenCL capable, although that is not
enabled yet in gallium.

Compared to a2xx, it introduces an entirely new unified shader ISA, and
re-shuffles all or nearly all of the registers.  The good news is that
(for the most part) the registers are more orthogonal, not combining
unrelated state in a single register.  And that there is a lot more
flexibility, so we don't need to patch and re-emit the shader like we
did on a2xx.

The shader compiler is currently quite dumb, there would be a lot of
room for improvement with an optimizing pass.  Despite that, with the
a320 in my nexus4 it seems to be ~2-3x faster compared to the a220 in my
HP touchpad.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-08 13:15:51 -04:00
Rob Clark
18c317b21d freedreno: prepare for a3xx
Split the parts that are specific to adreno a2xx series GPUs from the
parts that will be in common with a3xx, so that a3xx support can be
added more cleanly.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-08 13:15:51 -04:00
Roland Scheidegger
213c207b3a gallivm: work around slow code generated for interleaving 128bit vectors
We use 128bit vector interleave for untwiddling in the blend code (with
256bit vectors). llvm generates terrible code for this for some reason,
so instead of generating a shuffle for 2 128bit vectors use an
extract/insert shuffle instead (it only seems to matter we're not using
128bit wide vectors for the shuffle). This decreases instruction count of
the blend code generated for a rgba8 render target without blending from
169 to 113 with llvm 3.1 and from 136 to 114 in llvm 3.2/3.3, and I got
a ~8% (llvm 3.1) and ~5% (3.2/3.3) performance improvement in gears.
(The generated code is still not terribly good as we could actually avoid
the interleaving completely but llvm can't know this.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-08 17:33:51 +02:00
José Fonseca
0aca2c6b60 scons: Fix implicit python dependency discovery on Windows.
Probably due to CRLF endings, the discovery of python import statements
was not working on Windows builds, causing incremental builds to often
fail unless one wiped out the build directory.

NOTE: This is a candidate for stable branches.
2013-06-08 08:55:06 +01:00
Stéphane Marchesin
4f905d4900 st/xlib: Flush the front buffer before doing CopySubBuffer
We flush pending rendering before running CopySubBuffer, which
ensures that the right bits get to the screen.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-07 18:53:54 -07:00
Stéphane Marchesin
4e5416b0e2 st/xlib: Fix upside down coordinates for CopySubBuffer
The coordinates need to be inverted between glX and gallium.
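
A sketch of the inversion, assuming the glX rectangle is given relative to the bottom-left corner of the drawable and the gallium box relative to the top-left. The function name is hypothetical.

```c
#include <assert.h>

/* Convert the y origin of a sub-rectangle between a bottom-left and a
 * top-left coordinate system for a surface of the given height. */
static int
flip_rect_y(int surface_height, int y, int rect_height)
{
   assert(y >= 0 && rect_height >= 0 && y + rect_height <= surface_height);
   return surface_height - y - rect_height;
}
```

The conversion is its own inverse, so the same helper works in either direction.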

NOTE: This is a candidate for stable release branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-07 18:53:54 -07:00
Eric Anholt
3c21a7d3c9 mesa: Report core FBO incompleteness cases through GL_ARB_debug_output.
Just like we produce from inside the Intel driver, this can help provide
information quickly about FBO incompatibility problems (particularly when
using apitrace replay).

Currently, in driver-marked incompleteness cases, you'll get both the
driver message and the core message on Intel.  Until the other drivers are
fixed to produce output, I think this is better than not putting in a
message for driver-marked incomplete.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-07 16:05:42 -07:00
Paul Berry
9e3475b39a intel: flush fake front buffer if server is about to destroy it.
Fixes piglit test "spec/!OpenGL 1.0/gl-1.0-front-invalidate-back"

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-06-07 13:42:34 -07:00
Paul Berry
447df5eaba intel: flush fake front buffer more robustly.
When a fake front buffer is in use, if we request the front buffer
(using screen->dri2.loader->getBuffersWithFormat()), the X server
copies the real front buffer to the fake front buffer and returns the
fake front buffer.  We sometimes make redundant requests for the front
buffer (due to using a single counter to track invalidates for both
the front and back buffers), so there's a danger of pending front
buffer rendering getting overwritten when the redundant front buffer
request occurs.

Previous to this patch, intel_update_renderbuffers() worked around
that problem by sometimes doing intel_flush() and intel_flush_front()
before calling intel_query_dri2_buffers().  But it only did the
workaround when the front buffer was bound for drawing; it didn't do
it when the front buffer was bound for reading.

This patch moves the workaround code to intel_query_dri2_buffers(), so
that it happens in exactly the circumstances where it is needed.

This should fix some of the sporadic failures in Piglit tests
fbo-sys-blit and fbo-sys-sub-blit.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-06-07 13:26:43 -07:00
Paul Berry
03cc310313 intel: make intel_flush_front safe to call during initial MakeCurrent
The patch that follows will fix a bug that prevents
intel_flush_front() from being called often enough.  In doing so, it
will create a situation where intel_flush_front() is called during the
initial call to glXMakeCurrent().  In this circumstance,
ctx->DrawBuffer hasn't been initialized yet and is NULL.  Fortunately,
intel->front_buffer_dirty is false, so intel_flush_front() doesn't
actually need to do anything.  To avoid a segfault, swap the order of
terms in intel_flush_front()'s if statement.
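
The reordering relies on the short-circuit evaluation of &&. The following is a sketch with hypothetical types and names, not the driver's actual structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's framebuffer and context. */
struct framebuffer {
   bool is_winsys;
};

struct context {
   struct framebuffer *draw_buffer;   /* NULL during initial MakeCurrent */
   bool front_buffer_dirty;
};

/* Test the dirty flag first: && short-circuits, so draw_buffer is only
 * dereferenced when there is pending front buffer rendering. */
static bool
needs_front_flush(const struct context *ctx)
{
   return ctx->front_buffer_dirty && ctx->draw_buffer->is_winsys;
}
```

With the terms in the opposite order, the NULL draw buffer would be dereferenced before the dirty flag was ever consulted.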

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-06-07 13:26:36 -07:00
Eric Anholt
bc8bfdc42c mesa: Expose MAX_FRAGMENT_INPUT_COMPONENTS on ES3 and desktop 3.2.
piglit OpenGL ES 3.0/minmax now passes.  This was also one of the subcase
failures in OpenGL 3.2/minmax (and still is, because our value is too low
for 3.2, but at least we report what it is).

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-07 12:55:07 -07:00
Eric Anholt
7500ad23eb mesa: Expose texture array getters on GLES3.
Part of fixing piglit OpenGL ES 3.0/minmax.

v2: s/_gles3/_es3/ in extra name, for consistency (review by Matt).

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-06-07 12:55:06 -07:00
Eric Anholt
fd27e82ded mesa: Fix the return value of TEXTURE_BINDING_2D_ARRAY.
Noticed by inspection when reviewing the next commit.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-07 12:55:06 -07:00
Eric Anholt
11ace8a827 mesa: Expose texel offset limits in GLES3.
Part of fixing piglit OpenGL ES 3.0/minmax.

v2: s/_gles3/_es3/ in extra name, for consistency (review by Matt).

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-06-07 12:55:06 -07:00
Roland Scheidegger
fa8cefa892 util: add comment about bogus transfer flags 2013-06-07 21:15:01 +02:00
Roland Scheidegger
b47d13f425 util: fix util_clear_render_target and util_clear_depth_stencil layer handling
These functions must clear all bound layers, not just the first.
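
As a rough illustration of "all bound layers", the sketch below loops from the first to the last layer of a surface. The struct layer_range type and the cleared[] flag array are assumptions made for this example; the real helpers operate on Gallium surfaces and transfers.

```c
#include <assert.h>

/* Hypothetical description of the layers bound through a surface. */
struct layer_range {
   unsigned first_layer;
   unsigned last_layer; /* inclusive */
};

/* Marks every bound layer as cleared and returns how many were touched.
 * cleared[] stands in for the per-layer clear a real driver would do. */
unsigned clear_bound_layers(const struct layer_range *range,
                            unsigned char *cleared, unsigned total_layers)
{
   unsigned count = 0;

   assert(range->first_layer <= range->last_layer);
   assert(range->last_layer < total_layers);

   for (unsigned layer = range->first_layer; layer <= range->last_layer; layer++) {
      cleared[layer] = 1;
      count++;
   }
   return count;
}
```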

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-07 21:15:01 +02:00
Roland Scheidegger
201d7a352b llvmpipe: move create_surface/destroy_surface functions to lp_surface.c
Believe it or not but these two are actually the first two functions which
really belong in this file nowadays.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-07 21:15:01 +02:00
Roland Scheidegger
d8146f240e llvmpipe: add support for layered rendering
Mostly just make sure the layer parameter gets passed through to the right
places (and get clamped, can do this at setup time), fix up clears to
clear all layers and disable opaque optimization. Luckily don't need to
touch the jitted code.
(Clears invoked via pipe's clear_render_target method will not work however
since the pipe_util_clear function used for it doesn't handle clearing
multiple layers yet.)

v2: per Brian's suggestion, prettify var initialization and add some comments,
add assertion for impossible layer specification for surface.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-07 21:15:01 +02:00
Roland Scheidegger
0f4c08aea2 gallium/docs: fix up transfer description for 1d arrays, add cube map arrays
Transfers always use z/depth for layers, no matter if it's a 1d or 2d array
texture; we don't follow OpenGL's craziness there. Luckily this appears to
be only a doc bug, everyone is doing the right thing already.
While here also document z/depth parameter for cube map arrays.

v2: fix typo spotted by Eric Anholt

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-07 21:15:01 +02:00
Chia-I Wu
7916d5ed88 ilo: fix textureSize() for single-layered array textures
We returned 0 instead of 1 for the number of layers when the array texture is
single-layered.  This fixes it on GEN7+.
2013-06-08 01:39:47 +08:00
Chia-I Wu
d6c2708e1e util: add util_resource_is_array_texture()
Checking if array_size is greater than 1 is not enough for single-layered
array textures.
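
A sketch of why the target, rather than array_size, has to be checked. The enum and struct below are simplified, hypothetical stand-ins for Gallium's resource description, and is_array_texture() is an illustrative name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the texture target and resource. */
enum tex_target {
   TEX_1D, TEX_2D, TEX_3D, TEX_CUBE,
   TEX_1D_ARRAY, TEX_2D_ARRAY, TEX_CUBE_ARRAY
};

struct resource {
   enum tex_target target;
   unsigned array_size;
};

/* Decide by target: a single-layered array texture has array_size == 1,
 * just like a plain 2D texture, so array_size > 1 cannot tell them apart. */
bool is_array_texture(const struct resource *res)
{
   assert(res != NULL);
   switch (res->target) {
   case TEX_1D_ARRAY:
   case TEX_2D_ARRAY:
   case TEX_CUBE_ARRAY:
      return true;
   default:
      return false;
   }
}
```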

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-08 01:37:40 +08:00
Brian Paul
90fa71b277 docs: update some environment variable info
Drop the GALLIUM_NOSSE/PPC env vars, added ST_DEBUG and some of the
VMware SVGA driver env vars.
2013-06-07 10:12:32 -06:00
Arnas Milasevicius
3069357ef0 gallium: Remove draw_arrays() and draw_arrays_instanced() functions
Moved draw_arrays() to st_draw_feedback.c and removed draw_arrays_instanced().
draw_arrays() was used by nobody else.  Now there's just one "draw" entrypoint
into the draw module.

Signed-off-by: Brian Paul <brianp@vmware.com>
2013-06-07 09:29:29 -06:00
Brian Paul
14541dacab tgsi: replace tgsi_file_names tgsi_file_names[] with tgsi_file_name() function
This change came from the discovery that the STATIC_ASSERT that checks
the number of register file strings didn't actually work.

Similar changes could be made for the other string arrays in tgsi_string.c

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-07 09:23:24 -06:00
Chia-I Wu
97d641eb22 u_vbuf: fix index buffer leak
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-06-07 19:33:30 +08:00
Chris Forbes
06a503ca71 i965/vs: add support for emitting gl_ClipVertex
Removes the special-case suppression of gl_ClipVertex in the VUE map.

Also calculate vertex outcodes for user clip planes based on
gl_ClipVertex if written; otherwise gl_Position.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-07 20:50:33 +12:00
Chris Forbes
3615949990 i965/clip: Add support for gl_ClipVertex
When clipping triangles against a user clip plane, and gl_ClipVertex
is provided in the vertex, use it instead of hpos.

TODO: A similar change should be made at some point for line clipping.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-07 20:50:33 +12:00
Chia-I Wu
9b34a7f29a ilo: advertise PIPE_CAP_CUBE_MAP_ARRAY
It was supported but not advertised.  Also remove TODO tag for
PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT, as it is not a TODO.
2013-06-07 15:37:40 +08:00
Chia-I Wu
cde49c71a3 ilo: add support for TEX2/TXB2/TXL2 in fs
They were already supported, just being rejected in the TGSI translator.
2013-06-07 15:37:35 +08:00
Vinson Lee
f8df73f41c glsl linker: Initialize member variable interface_namespace.
Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-06 22:55:24 -07:00
Chia-I Wu
7142da6dd1 ilo: use slab allocator for transfers
Slab allocator is perfect for transfer.  Improved OpenArena performance by 1%
with several casual runs.
2013-06-07 13:23:43 +08:00
Chia-I Wu
09f62a13fc ilo: clean up states upon context destroy
We need to unreference resources that we referenced.
2013-06-07 11:28:21 +08:00
Chia-I Wu
7cbf0a410e ilo: unmap cp bo before destroying it
The BOs are mapped for their entire lifetimes on the chipsets we support, so
do not forget to unmap them.
2013-06-07 11:28:20 +08:00
Chia-I Wu
27804b2fc7 ilo: enable bo reuse
This magical line of code must have got lost at some point in the history...
2013-06-07 11:28:20 +08:00
Chia-I Wu
20d23b2275 ilo: construct 3DSTATE_SF in create_rasterizer_state()
Add ilo_rasterizer_sf and initialize it in create_rasterizer_state().
2013-06-07 11:13:16 +08:00
Chia-I Wu
3c2fea206f ilo: construct 3DSTATE_CLIP in create_rasterizer_state()
Add ilo_rasterizer_clip and initialize it in create_rasterizer_state().
2013-06-07 11:13:16 +08:00
Chia-I Wu
4006f4ce26 ilo: use emit_SURFACE_STATE() for render targets
Introduce ilo_surface_cso and initialize it in create_surface().  With the
change, we can emit SURFACE_STATE directly from the CSO and remove
emit_surf_SURFACE_STATE().  We do not deal with depth/stencil surfaces yet.
2013-06-07 11:13:16 +08:00
Chia-I Wu
5354dc7428 ilo: use emit_SURFACE_STATE() for constant buffers
Introduce ilo_cbuf_cso and initialize it in set_constant_buffer().  As
ilo_view_surface is embedded in ilo_cbuf_cso, switch to emit_SURFACE_STATE()
for constant buffers and remove emit_cbuf_SURFACE_STATE().
2013-06-07 11:13:16 +08:00
Chia-I Wu
2d82885d3c ilo: add emit_SURFACE_STATE() for sampler views
Introduce ilo_view_cso and initialize it in create_sampler_view().  Add
emit_SURFACE_STATE() to GPE, which can emit SURFACE_STATE from
ilo_view_surface.
2013-06-07 11:13:16 +08:00
Chia-I Wu
39e947569e ilo: add ilo_view_surface for SURFACE_STATE
Define struct ilo_view_surface for SURFACE_STATE construction and emission.
2013-06-07 11:13:15 +08:00
Courtney Goeltzenleuchter
c6983ea035 ilo: convert generic depth-stencil-alpha pipe state to ilo pipe state
Moving the work to create time reduces the work at emit time.
Saves time overall as create work is only done once.
Fix compiler warning in gen7_pipeline_sol.

[olv: remember pipe_alpha_state instead of pipe_depth_stencil_alpha_state in
      ilo_dsa_state]
2013-06-07 11:13:15 +08:00
Chia-I Wu
70e78211d6 ilo: introduce vertex element CSO
Introduce ilo_ve_cso and initialize it in create_vertex_elements_state().
This commit goes a step further by setting up mappings from HW VB to PIPE VB,
which we failed to do previously.  That allows us to support instanced
rendering.
2013-06-07 11:13:15 +08:00
Chia-I Wu
d4fa98db0c ilo: simplify emit_3DSTATE_DEPTH_BUFFER()
Remove hiz and dsa from the parameters.  We would know whether HiZ buffer
exists from ilo_texture once it is supported.  DSA state should not affect
3DSTATE_DEPTH_BUFFER.
2013-06-07 11:13:15 +08:00
Chia-I Wu
eea1be2072 ilo: introduce blend CSO
Introduce ilo_blend_cso and initialize it in create_blend_state().  This saves
us from having to construct hardware blend states in draw_vbo().
2013-06-07 11:13:15 +08:00
Chia-I Wu
b3c9e2161f ilo: introduce sampler CSO
Introduce ilo_sampler_cso and initialize it in create_sampler_state().  This
saves us from having to perform CPU-intensive calculations to construct
hardware sampler states in draw_vbo().
2013-06-07 11:13:15 +08:00
Chia-I Wu
99725d2f8a ilo: construct SCISSOR_RECT in set_scissor_states()
This allows us to memcpy() the state in draw_vbo().  Add ilo_init_states() and
ilo_cleanup_states() that are called when contexts are created and destroyed
respectively, and properly set the initial scissor state in ilo_init_states().
2013-06-07 11:13:15 +08:00
Chia-I Wu
e51806ee7a ilo: introduce viewport CSO
Introduce ilo_viewport_cso and initialize it in set_viewport_states().  This
saves us from having to perform CPU-intensive calculations to construct
hardware viewport states in draw_vbo().
2013-06-07 11:13:15 +08:00
Chia-I Wu
4228cf3746 ilo: switch to ilo states for shaders and resources
Define and use

 struct ilo_sampler_state;
 struct ilo_view_state;
 struct ilo_cbuf_state;
 struct ilo_resource_state;
 struct ilo_global_binding;

in ilo_context.
2013-06-07 11:13:15 +08:00
Chia-I Wu
94212915ee ilo: switch to ilo states for CC stage
Define and use

 struct ilo_dsa_state;
 struct ilo_blend_state;
 struct ilo_fb_state;

in ilo_context.
2013-06-07 11:13:15 +08:00
Chia-I Wu
29b938d9f4 ilo: switch to ilo states for WM stage
Define and use

 struct ilo_rasterizer_state;

in ilo_context.
2013-06-07 11:13:15 +08:00
Chia-I Wu
130364ad1d ilo: switch to ilo states for CLIP and SF stages
Define and use

 struct ilo_viewport_state;
 struct ilo_scissor_state;

in ilo_context.
2013-06-07 11:13:14 +08:00
Chia-I Wu
3bc8289f49 ilo: switch to ilo states for SOL stage
Define and use

 struct ilo_so_state;

in ilo_context.
2013-06-07 11:13:14 +08:00
Chia-I Wu
6b14b392d0 ilo: switch to ilo states for VF stage
Define and use

 struct ilo_vb_state;
 struct ilo_ve_state;
 struct ilo_ib_state;

in ilo_context.
2013-06-07 11:13:14 +08:00
Chia-I Wu
f0af292239 ilo: move hardware limits to ilo_gpe.h 2013-06-07 11:13:14 +08:00
Roland Scheidegger
644b8346fd draw: trivial fix comment typo 2013-06-06 23:51:39 +02:00
Roland Scheidegger
769449b3e8 gallium/tgsi: add missing string for layer semantic
Also report if a shader writes the layer semantic

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-06 23:51:38 +02:00
Roland Scheidegger
d0518c4c69 llvmpipe: bump 3d and cube map limits to 2048 and 8192 respectively
These should just work, required by d3d10. Too large resources will
get thrown out separately anyway.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-06 23:51:38 +02:00
Eric Anholt
38e77e545d glsl: Fix uniform buffer object counting.
We were counting uniforms located in UBOs against the default uniform
block limit, while not doing any counting against the specific combined
limit.

Note that I couldn't quite find justification for the way I did this, but
I think it's the only sensible thing: The spec talks about components, so
each "float" in a std140 block would count as 1 component and a "vec4"
would count as 4, though they occupy the same amount of space.  Since GPU
limits on uniform buffer loads are surely going to be about the size of
the blocks, I just counted them that way.
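
To make the difference concrete, here is a small sketch comparing declared components with size-based counting. It assumes the std140 rule that scalar array elements are padded to a 16-byte stride, and it assumes one component equals 4 bytes; the function names are made up for this example and are not the linker's.

```c
#include <assert.h>

enum {
   STD140_ARRAY_STRIDE = 16, /* std140 pads scalar array elements to a vec4 */
   BYTES_PER_COMPONENT = 4   /* assumption: one component is one 32-bit value */
};

/* Size in bytes of "float a[count]" inside a std140 block. */
unsigned std140_float_array_size(unsigned count)
{
   return count * STD140_ARRAY_STRIDE;
}

/* Count a block against the limit by the space it occupies,
 * not by the number of components the shader declared. */
unsigned components_by_block_size(unsigned block_size_bytes)
{
   assert(block_size_bytes % BYTES_PER_COMPONENT == 0);
   return block_size_bytes / BYTES_PER_COMPONENT;
}
```

Under these assumptions a block holding float a[16] declares 16 components but occupies 256 bytes, so it is charged as 64 components, the same as vec4 a[16].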

Fixes link failures in piglit
arb_uniform_buffer_object/maxuniformblocksize when ported to geometry
shaders on Paul's GS branch, since in that case the max block size is
bigger than the default uniform block component limit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-06-06 14:37:41 -07:00
Eric Anholt
93c8692ce9 glsl: Make a local variable to avoid restating this array lookup.
v2: Convert another instance of the array lookup. (caught by Tapani)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-06-06 14:37:40 -07:00
Kenneth Graunke
757ad82867 intel: Use the CHIPSET macro in the PCI ID tables for the device name.
Putting the human readable device names directly in the PCI ID list
consolidates things in one place.  It also makes it easy to customize
the name on a per-PCI ID basis without a huge code explosion.

Based on a patch by Kristian Høgsberg.

v2: Fix 830M/845G names and #undef CHIPSET (caught by Emil Velikov).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-06 14:28:35 -07:00
Kenneth Graunke
ea92b700df intel: Remove 'misc' parameter from CHIPSET macro in PCI ID tables.
This has never actually been used for anything.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-06 14:28:35 -07:00
Andreas Boll
8bc788ea9e build: Use PACKAGE_VERSION from autoconf
Both variables had the same value.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-06 19:07:22 +02:00
Andreas Boll
c0f7ccc136 build: Unify PACKAGE_VERSION on autotools, scons and Android
This patch unifies mesa's PACKAGE_VERSION on autotools, scons and
Android build systems.

Current behaviour is:
 - Autotools uses 9.2.0 as PACKAGE_VERSION
 - Scons and Android use 9.2-devel as PACKAGE_VERSION

With this patch all three build systems use 9.2.0-devel as
PACKAGE_VERSION.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-06 19:07:14 +02:00
Jonathan Gray
5bd808a2c7 radeon/winsys: correct RADEON_GEM_WAIT_IDLE use
RADEON_GEM_WAIT_IDLE is declared DRM_IOW but mesa
uses it with drmCommandWriteRead instead of drmCommandWrite
which leads to the ioctl being unmatched and returning an
error on at least OpenBSD.
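
The mismatch can be seen in how ioctl request numbers are encoded: the transfer direction is part of the value. The sketch below uses the standard _IOW and _IOWR macros with a hypothetical argument struct and command number (they are not the real radeon definitions):

```c
#include <assert.h>
#include <sys/ioctl.h>

/* Hypothetical argument struct and command number, for illustration only. */
struct example_wait_idle_args {
   unsigned handle;
   unsigned pad;
};

#define EXAMPLE_IOCTL_TYPE 'd'
#define EXAMPLE_WAIT_IDLE_NR 0x64

/* Request value when the command is declared write-only (user to kernel). */
unsigned long request_write_only(void)
{
   return _IOW(EXAMPLE_IOCTL_TYPE, EXAMPLE_WAIT_IDLE_NR,
               struct example_wait_idle_args);
}

/* Request value when the same command is issued as read/write. */
unsigned long request_read_write(void)
{
   return _IOWR(EXAMPLE_IOCTL_TYPE, EXAMPLE_WAIT_IDLE_NR,
                struct example_wait_idle_args);
}
```

The two values differ only in their direction bits, so a kernel that matches on the full request value would not recognise one as the other.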

Problem originally noticed in libdrm by Mark Kettenis.
Dave Airlie pointed out that mesa has the same issue.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
2013-06-06 11:01:18 +02:00
Mike Stroyan
962204961d configure.ac: Build dricommon for gallium swrast
When building dri-swrast, use gallium_check_st to set HAVE_COMMON_DRI.
Commit 07f2dee7 added setting of HAVE_COMMON_DRI in gallium_check_st.
But the dri-swrast case did not use gallium_check_st.
So dri/common was still not built.

v2: set HAVE_COMMON_DRI=yes instead of using gallium_check_st

NOTE: This is a candidate for the 9.1 branch.
      (Depends on 7de78ce5 and 07f2dee)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61821
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-06-06 08:54:07 +02:00
Rodrigo Vivi
ce67fb4715 i965: Adding more reserved PCI IDs for Haswell.
In the DDX commit, Chris mentioned the tendency we have of finding out about
more PCI IDs only when users report them. So let's add all new reserved Haswell IDs.

NOTE: This is a candidate for stable branches.

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=63701
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-05 10:44:15 -07:00
Rico Schüller
3998cfa933 mesa: remove outdated version lines in comments
Signed-off-by: Brian Paul <brianp@vmware.com>
2013-06-05 08:54:27 -06:00
Richard Sandiford
7bdf1f2f1a gallium: System z support
The main change is to use MCJIT rather than the old JIT, which will never
be supported for System z.  The endianness part is by example since the
patch was tested on a glibc system.

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
2013-06-05 08:36:24 -06:00
Roland Scheidegger
008fd03600 llvmpipe: improve alignment calculation for fetching/storing pixels
This was always doing per-pixel alignment which isn't necessary, except
for the buffer case (due to the per-element offset). The disabled code
for calculating it was incorrect because it assumed that the full
block would always be fetched, which may not be the case, so fix this up.
The original code failed, for instance, for r10g10b10a2: the alignment would
have been calculated as 4 (block_width) * 4 (bytes), so 16, but the actual
fetch may have only fetched 2 values at a time, hence only alignment 8.
It is unclear what exactly would happen in this case (alignment larger
than size to fetch).
So just use the (already calculated) fetch size instead and get alignment
from that which should always work, no matter if fetching 1,2 or 4 pixels.
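
The arithmetic in the r10g10b10a2 example can be written out as follows. This is a simplified reading of the message, not the llvmpipe code; both function names are hypothetical.

```c
#include <assert.h>

/* Old assumption: the whole block row is always fetched at once. */
unsigned block_based_alignment(unsigned bytes_per_pixel, unsigned block_width)
{
   return bytes_per_pixel * block_width;
}

/* Alignment taken from the size of the fetch that is actually issued. */
unsigned fetch_based_alignment(unsigned bytes_per_pixel, unsigned pixels_fetched)
{
   assert(pixels_fetched == 1 || pixels_fetched == 2 || pixels_fetched == 4);
   return bytes_per_pixel * pixels_fetched;
}
```

For a 4-byte format with block_width 4, the block-based value is 16, while a fetch of 2 pixels only justifies an alignment of 8.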

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-05 00:29:47 +02:00
Roland Scheidegger
ffe2a1ca3c llvmpipe: reduce alignment requirement for 1d resources from 4x4 to 4x1
For rendering to buffers, we cannot have any y alignment.
So make sure that tile clear commands only clear up to the fb width/height,
not more (do this for all resources actually as clearing more seems
pointless for other resources too). For the jit fs function, skip execution
of the lower half of the fragment shader for the 4x4 stamp completely,
for depth/stencil only load/store the values from the first row
(replace other row with undef).
For the blend function, also only load half the values from fs output,
replace the rest with undefs so that everything still operates on the
full 4x4 block to keep code the same between 4x1 and 4x4 (except for
load/store of course, which also needs to skip (store) or replace these
values with undefs (load)), at the cost of slightly less optimal code
being produced in some cases.
Also reduce 1d and 1d array alignment too, because they can be handled the
same as buffers so don't need to waste memory.

v2: don't try to run special blend code for 4x1, (very) slightly less
complexity if we just use the same code as for 4x4 which may or may not
make it easier to optimize in the future (as we care a lot more about 4x4
performance than 1d).

v3: don't use undef values for unused fs src outputs with llvm 3.1 as it
apparently can trigger a bug in llvm.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-05 00:29:47 +02:00
Roland Scheidegger
ef3e887084 llvmpipe: cleanup of generate_unswizzled_blend
Some parameters were used inconsistently, for instance not using
block_width/block_height/block_size for deriving the number of pixels
but rather relying on guesses from the number of fragment shaders etc,
so fix this up (no actual change in behavior since the block size stays
fixed). (Though most of the code would work with different block_height,
with three exceptions, one being the hacked r11g11b10 conversions and
twiddle code which only work with block_height 2 not 1, and the last
one being blend vector type not being 128bit wide.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-05 00:29:47 +02:00
Roland Scheidegger
44993c1808 gallivm: enhance special sse2 4x4f and 2x8f -> 1x16ub conversion
There's no good reason why it can't handle 2x4f->1x8ub, 1x4f->1x4ub and
1x8f->1x8ub cases, there might be legitimate reasons why we don't have
enough input vectors for a full destination vector, and using pack
intrinsics should still be much better than using generic conversion
(it looks like convert_alpha from the blend code might hit this though
I suspect it could be avoided).

v2: add another test vector format to lp_test_conv so this gets tested.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-05 00:29:46 +02:00
Roland Scheidegger
ce82523db9 gallivm: (trivial) fix lp_build_concat_n
The code was designed to handle no-op concat but failed (unless the
caller was using same pointer for src and dst).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-05 00:29:46 +02:00
Brian Paul
f270baf074 mesa: change MAX_PROGRAM_ADDRESS_REGS to 1, clamp to it in state tracker
We've never properly supported more than one address register.  There
isn't even a field in prog_src_register or prog_dst_register to indicate
which address register to use if RelAddr!=0.

In the state tracker, clamp MaxAddressRegs against MAX_PROGRAM_ADDRESS_REGS
since many gallium drivers do support more.
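
A minimal sketch of the clamp, assuming the driver reports its limit as an unsigned count. clamp_address_regs() is a made-up helper name; only MAX_PROGRAM_ADDRESS_REGS and its value of 1 come from the commit.

```c
#include <assert.h>

#define MAX_PROGRAM_ADDRESS_REGS 1 /* value given in the commit title */

/* Never expose more address registers than core Mesa can represent,
 * even if the gallium driver reports a larger limit. */
unsigned clamp_address_regs(unsigned driver_max)
{
   unsigned clamped = driver_max < MAX_PROGRAM_ADDRESS_REGS
                         ? driver_max
                         : MAX_PROGRAM_ADDRESS_REGS;

   assert(clamped <= MAX_PROGRAM_ADDRESS_REGS);
   return clamped;
}
```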

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65226

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-04 13:29:38 -06:00
Paul Berry
2fd785d126 intel: Don't try to blorp or blit CopyTexSubImage(1D_ARRAY).
Blorp and the hardware blitter can't be used to implement
CopyTexSubImage when the image type is 1D_ARRAY, because of a
coordinate system mismatch (the Y coordinate in the source image is
supposed to be matched up to the Z coordinate in the destination
texture).

The hardware blitter path (intel_copy_texsubimage) contained a perf
debug warning for this case, but it failed to actually fall back.  The
blorp path didn't even check.

Fixes piglit test "copyteximage 1D_ARRAY".

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-04 09:14:44 -07:00
Paul Berry
32d1f423bc i965/gen6+: Fix multisample assertions in CopyTexSubImage hw blitter path.
Commit 045612c (intel: Add an assert for glCopyTexSubImage() being
called on MSAA buffers) added an assertion to intel_copy_texsubimage()
to make sure that multisampling was not in use, based on the
assumption that glCopyTexSubImage() can't legally be used with
multisampling.

However, there is one case where glCopyTexSubImage() can legally be
used with multisampling: when the source buffer is a multisampled
window system buffer.  If the source and destination color formats
don't match, the blorp path will fail, so intel_copy_texsubimage()
will be called.  In this case, we need intel_copy_texsubimage() to
return false so that we fall back to meta to do the copy.  (The
multisampled source buffer won't cause a problem for the meta path,
because it uses glReadPixels, which forces a multisample resolve).

It's still safe to assert that the destination image is
single-sampled, because it's not legal to call glCopyTexSubImage() on
multisampled textures.

Fixes some failures with piglit tests "copyteximage
{1D,2D,CUBE,RECT,2D_ARRAY}" (with "samples=..." argument).

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-04 09:14:40 -07:00
Vinson Lee
7bafd88c15 mesa: Prevent possible out-of-bounds read by save_SamplerParameterfv.
Fixes "Out-of-bounds access" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-03 23:01:46 -07:00
Dave Airlie
0677ea063c i965: fix problem with constant out of bounds access (v3)
Okay I now understand why Frank would want to run away, this is
my attempt at fixing the CVE out of bounds access to constants
outside the range. This attempt converts any illegal constants
to constant 0 as per the GL spec, and is undefined behaviour.
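
The general shape of such a fix is a bounds check that substitutes zero. The sketch below shows that shape at run time with a hypothetical helper; the actual patch works inside the i965 compiler and may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the requested constant, or 0 when the index lies outside
 * the declared range, instead of reading past the end of the array. */
float read_constant_or_zero(const float *constants, size_t count, size_t index)
{
   assert(constants != NULL || count == 0);
   return index < count ? constants[index] : 0.0f;
}
```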

A future patch should add some debug for users to find this out,
but this needs to be backported to stable branches.

CVE-2013-1872

v2: drop the last hunk which was a separate fix (now in master).
hopefully fix the indentations.

v3: don't fail piglit, the whole 8/16 dispatch stuff was over
my head, and I spent a while figuring it out, but this one is
definitely safe, one piglit pass extra on my Ironlake.

NOTE: This is a candidate for stable branches.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-06-04 13:50:20 +10:00
Eric Anholt
bb525f1f11 intel: Fix copying of separate stencil data in glCopyTexSubImage().
We were copying the source stencil data onto the destination depth data.

Fixes piglit copyteximage other than 1D_ARRAY.

v2: Fix unintentional dropping of the "don't double-copy for packed
    depth/stencil" check.  While blorp is only supported on separate
    stencil hardware at the moment, hopefully that will change soon.
    Review by Jordan.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-06-03 14:22:54 -07:00
Eric Anholt
c937aea3d1 meta: Fix temporary image type for float depth/stencil.
Fixes assertion failure in piglit copyteximage.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-03 13:47:19 -07:00
Eric Anholt
f96de8ad96 intel: Fix performance regression from miptree blit changes.
When making v2 of da2880bea0, I carefully
checked all of the calls in that commit to see that I'd updated them, but
forgot to update the new calls in the later commits such as
e845c5cf7abce55759501a473459aff3bf25c9ca.  As a result, we were getting Y
tiled temporaries even though the whole point of the temporary was to
untile!

The steady state of the intro scene of lightsmark goes from 13 to 17 fps.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65154
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-03 13:47:18 -07:00
Carl Worth
610fe6da79 glcpp: Add test case for recently fixed loop-control underflow bug.
To trigger the bug, it suffices to have a line-continuation followed by
a newline and then a non-line-continuation backslash.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-03 13:33:32 -07:00
Carl Worth
d8eeb1d330 glcpp: Fix post-decrement underflow in loop-control variable
This loop-control condition with a post-decrement operator would lead to
an underflow of collapsed_newlines. This in turn would cause a subsequent
execution of the loop to labor inordinately trying to return the loop-control
variable to a value of 0 again.

Fix this by dis-intertwining the test and the decrement.
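
The difference between the two loop shapes can be shown in isolation. This is a sketch with made-up function names, not the glcpp code; the loop bodies stand in for emitting one newline per iteration.

```c
#include <assert.h>

/* Buggy shape: test and decrement are one expression, so the decrement
 * also runs on the final, failing test and leaves the counter at -1. */
int drain_post_decrement(int pending)
{
   assert(pending >= 0);
   while (pending--) {
      /* emit one newline */
   }
   return pending;
}

/* Fixed shape: test first, decrement inside the body; ends at exactly 0. */
int drain_separate(int pending)
{
   assert(pending >= 0);
   while (pending) {
      /* emit one newline */
      pending--;
   }
   return pending;
}
```

A counter left at -1 is non-zero, so the next pass through the buggy loop has to count all the way down through the negative range before the test fails again.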

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65112

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-03 13:33:31 -07:00
Chad Versace
7a9f4d3e71 i965: Fix glColorPointer(GL_FIXED)
When a gl_client_array is created with glColorPointer,
gl_client_array::Normalized is true. This caused the translation from the
gl_client_array's type to a BRW_SURFACEFORMAT to fail an assertion.

Fixes the spinning cube's color in Android 4.2's ApiDemos.apk,
"Graphics > OpenGL ES".

Fixes assertion failure in mesa-demos/src/egl/opengles1/tri_x11 on Haswell
and Ivybridge:
  brw_draw_upload.c:287: get_surface_type: Assertion `0' failed.

No Piglit regressions on Haswell.

Note: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42182
Issue: AXIA-2954
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-03 13:03:28 -07:00
Zack Rusin
e54c924a0e softpipe: draw_find_shader_output returns -1 on invalid outputs
It was changed from 0 to allow shader outputs at 0 that are
different from position.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-30 19:54:25 -04:00
Tom Stellard
124e1f91a7 radeonsi/compute: Upload work group, work item size in input buffer 2013-06-03 14:03:13 -04:00
Tom Stellard
3d831206a4 radeonsi/compute: Pass kernel arguments in a buffer v2
v2:
  - Fix memory leak in si_set_constant_buffer()
2013-06-03 14:03:08 -04:00
Tom Stellard
67e5c9ae0e radeonsi/compute: Implement un-binding of global buffers 2013-06-03 10:24:54 -04:00
Tom Stellard
d2472ceb92 radeonsi/compute: Support multiple kernels in a compute program 2013-06-03 10:24:54 -04:00
Tom Stellard
3f24190325 radeonsi/compute: Add missing PIPE_COMPUTE caps 2013-06-03 10:24:54 -04:00
Jordan Justen
c754f7a8fd i965 gen7: use SURFACE_STATE fields to select render level/layer
Rather than pointing the surface_state directly at a single
sub-image of the texture for rendering, we now point the
surface_state at the top level of the texture, and configure
the surface_state as needed based on this.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-02 20:39:38 -07:00
Jordan Justen
6bfd897fc4 mesa/texformat: add _mesa_tex_target_is_array function
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-02 20:39:38 -07:00
Jordan Justen
6a5469cff9 intel: add layered parameter to update_renderbuffer_surface
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-02 20:38:37 -07:00
Jordan Justen
8312caf673 intel_fbo: set gl_renderbuffer Depth field
Set the renderbuffer's Depth field to match the texture's
Depth when rendering to a texture.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-02 20:38:37 -07:00
Jordan Justen
a2d31371e9 intel: print image depth in debug message
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-02 20:38:37 -07:00
Brian Paul
e20a2df401 mesa: handle missing read buffer in _mesa_get_color_read_format/type()
We were crashing when GL_READ_BUFFER == GL_NONE.  Check for NULL
pointers and reorganize the code.  The spec doesn't say which error
to generate in this situation, but NVIDIA raises GL_INVALID_OPERATION.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65173
NOTE: This is a candidate for the stable branches.

Tested-by: Vedran Rodic <vrodic@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-06-02 18:12:07 -06:00
Brian Paul
dcc5b6bfb7 meta: move vertex array enables for mipmap generation
Before, on the second call to GenerateMipmap we were enabling two
vertex arrays for the current vertex array object, rather than
the private generate-mipmap vertex array object.  This caused
things to blow up elsewhere.

This patch moves the array enables into the block where the
generate-mipmap vertex array object is created, as we do in
the setup_ff_generate_mipmap() function.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60518
NOTE: This is a candidate for the stable branches.

Tested-by: core13@gmx.net
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-06-02 18:06:17 -06:00
Brian Paul
8588350dc0 mesa: fix hodge podge indentation, update comments in texformat.c 2013-06-02 18:06:17 -06:00
Roland Scheidegger
6b53e2b038 gallium: add support for layered rendering
Since pipe_surface already has all the necessary fields no interface
changes are necessary except adding a new shader semantic value
(TGSI_SEMANTIC_LAYER).
(Note that what GL knows as "gl_Layer" variable d3d10 is naming
"RENDER_TARGET_ARRAY_INDEX".)

v2: drop cap bit (just tied to geometry shader), add docs.
2013-06-01 20:03:59 +02:00
Roland Scheidegger
458a9a0f85 gallivm: fix out-of-bounds access with mirror_clamp_to_edge address mode
Surprising that this bug survived so long; we were missing a clamp (in the
linear filtering version).
(Valgrind complained a lot about invalid reads with piglit texwrap,
I've also seen spurious failures in this test which might have
happened due to this. Valgrind probably didn't complain before the
alignment reduction in llvmpipe to 4x4 since the test is using tiny
textures so the reads were still always well within allocated area.)
While here, also do an effective clamp (after half subtraction)
of [0,length-0.5] instead of [0, length-1] which saves an instruction
(the filtering weight could be different due to this, but only if
both texels point to the same max texel so it doesn't matter).
(Both changes are borrowed from PIPE_TEX_CLAMP_TO_EDGE case.)

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-01 20:03:59 +02:00
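The addressing described in the gallivm commit above (458a9a0f85) can be illustrated in scalar C. This is only a sketch under assumptions: the real code emits LLVM IR for vectors of coordinates, and the names `linear_coords` and `mirror_clamp_to_edge_linear` are hypothetical. It shows the coordinate being clamped to [0, length-0.5] after the half-texel subtraction, and the second texel index being clamped so it cannot read past the last texel.

```c
#include <assert.h>

/* Hypothetical result type: the two texel indices and the blend weight. */
struct linear_coords {
    int i0;
    int i1;
    float weight;
};

/*
 * Scalar sketch of linear filtering with a mirror-clamp-to-edge wrap.
 * s is the normalized texture coordinate, length the texture size in texels.
 */
static struct linear_coords mirror_clamp_to_edge_linear(float s, int length)
{
    struct linear_coords out;
    float max, u;

    assert(length > 0);

    /* Mirror around zero, scale to texels, shift to texel centers. */
    u = (s < 0.0f ? -s : s) * (float)length - 0.5f;

    /* Clamp to [0, length - 0.5] after the half subtraction. */
    max = (float)length - 0.5f;
    if (u < 0.0f)
        u = 0.0f;
    if (u > max)
        u = max;

    out.i0 = (int)u;                 /* u >= 0, so truncation equals floor */
    out.weight = u - (float)out.i0;

    /* Keep the second texel inside the texture. */
    out.i1 = out.i0 + 1;
    if (out.i1 > length - 1)
        out.i1 = length - 1;

    return out;
}
```

At the upper edge both indices refer to the last texel, so the weight of 0.5 produced there has no effect on the filtered result, which matches the remark in the commit message.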
Roland Scheidegger
f51fc7a71c llvmpipe: fix bogus assertions for buffer surfaces
One of the assertions made no sense for buffer rendertargets
(due to the union), so drop it. (The same assertion is already present in
the path for texture surfaces later.)

v2: make assertion completely accurate (suggested by Jose).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-01 20:03:59 +02:00
Kenneth Graunke
4405ff4055 i965: Fix haswell_upload_cut_index when there's no index buffer.
brw->ib.type is reset to -1 at the start of each batch.  If there's no
index buffer, it won't get updated to a sensible value, resulting in
_mesa_primitive_restart_index's "Invalid index buffer type" assertion
tripping.

Fixes a regression since 7c87a3b5da.

NOTE: This is a candidate for the 9.1 branch (and should be squashed).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65195
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-31 21:54:49 -07:00
Roland Scheidegger
869c5d438f llvmpipe: reduce alignment requirement for resources from 64x64 to 4x4
The overallocation was very bad especially for things like 1d array
textures which got blown up by a factor of 64. (Even ordinary smallish
2d textures benefit a lot from this, a mipmapped 64x64 rgba8 texture
previously used 7*16kB = 112kB instead of now ~22kB.)
4x4 is chosen because this is the size the jit functions run on, so
making it smaller is going to be a bit more complicated.
It is actually not strictly 4x4 pixel, since we'd want to avoid situations
where different threads are rendering to the same cacheline so we keep
cacheline size alignment in x direction (often 64bytes).
To make this work introduce new task width/height parameters and make
sure clears don't clear the whole tile if it's a partial tile. Likewise,
the rasterizer may produce fragments outside the 4x4 blocks present in a
tile, so don't call the jit function for them.
This does not yet fix rendering to buffers (which cannot have any y
alignment at all), and 1d/1d array textures are still overallocated by a
factor of 4.

v2: replace magic number 4 with LP_RASTER_BLOCK_SIZE, fix size of buffers
allocated (needed in case we render to them).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-31 20:21:05 +02:00
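The memory figures quoted in the llvmpipe alignment commit above (869c5d438f) can be reproduced with a small calculation. The sketch below is not llvmpipe code; `align_up` and `mip_chain_bytes` are hypothetical helpers, and it assumes a 64-byte cacheline for the row stride and a 4-row alignment for the height, as the message describes.

```c
#include <assert.h>
#include <stddef.h>

/* Round value up to the next multiple of alignment. */
static size_t align_up(size_t value, size_t alignment)
{
    assert(alignment > 0);
    return (value + alignment - 1) / alignment * alignment;
}

/*
 * Total bytes for the full mip chain of a square texture.
 * row_align_bytes: alignment applied to each level's row stride, in bytes.
 * height_align:    alignment applied to each level's height, in rows.
 */
static size_t mip_chain_bytes(size_t base_size, size_t bytes_per_pixel,
                              size_t row_align_bytes, size_t height_align)
{
    size_t total = 0;
    size_t size = base_size;

    assert(base_size > 0);

    for (;;) {
        size_t stride = align_up(size * bytes_per_pixel, row_align_bytes);
        size_t rows = align_up(size, height_align);

        total += stride * rows;
        if (size == 1)
            break;
        size /= 2;
    }
    return total;
}
```

For a mipmapped 64x64 rgba8 texture, `mip_chain_bytes(64, 4, 256, 64)` models the old 64x64 alignment and yields 114688 bytes (7 levels of 16 kB, 112 kB in total), while `mip_chain_bytes(64, 4, 64, 4)` models the new scheme and yields 22784 bytes, roughly the 22 kB quoted.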
Adam Jackson
e881c9a5dc llvmpipe: Remove x/y from cmd_bin
These were mostly just a waste of memory and cache pressure, and were
really only used for debugging.

This change reduces instruction count (as measured by callgrind's Ir
event) of gnome-shell-perf-tool on Ivybridge by 3.5% ± 0.015% (n=20).

Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-05-31 20:21:05 +02:00
Vadim Girlin
eb4c992ea5 r600g/sb: fix broken assert
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-31 22:11:42 +04:00
Andreas Boll
5ea43e6549 glapi: Add some missing static_dispatch="false" annotations to es_EXT.xml
This fixes the following build errors on powerpc:

  CC     glapi_dispatch.lo
  In file included from glapi_dispatch.c:90:0:
  ../../../../../src/mapi/glapi/glapitemp.h:1640:1: error: no previous
  prototype for 'glReadBufferNV' [-Werror=missing-prototypes]
  ../../../../../src/mapi/glapi/glapitemp.h:4198:1: error: no previous
  prototype for 'glDrawBuffersNV' [-Werror=missing-prototypes]
  ../../../../../src/mapi/glapi/glapitemp.h:6377:1: error: no previous
  prototype for 'glFlushMappedBufferRangeEXT'
  [-Werror=missing-prototypes]
  ../../../../../src/mapi/glapi/glapitemp.h:6389:1: error: no previous
  prototype for 'glMapBufferRangeEXT' [-Werror=missing-prototypes]
  ../../../../../src/mapi/glapi/glapitemp.h:6401:1: error: no previous
  prototype for 'glBindVertexArrayOES' [-Werror=missing-prototypes]
  ../../../../../src/mapi/glapi/glapitemp.h:6413:1: error: no previous
  prototype for 'glDeleteVertexArraysOES' [-Werror=missing-prototypes]
  ../../../../../src/mapi/glapi/glapitemp.h:6433:1: error: no previous
  prototype for 'glGenVertexArraysOES' [-Werror=missing-prototypes]
  ../../../../../src/mapi/glapi/glapitemp.h:6445:1: error: no previous
  prototype for 'glIsVertexArrayOES' [-Werror=missing-prototypes]

NOTE: This is a candidate for the 9.0 and 9.1 branches.

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-31 17:18:57 +02:00
Vinson Lee
171199b2b7 mesa: Add missing break statement in _mesa_choose_tex_format.
Fixes "Missing break in switch" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-30 23:12:32 -07:00
Alan Coopersmith
306f630e67 integer overflow in XF86DRIGetClientDriverName() [CVE-2013-1993 2/2]
clientDriverNameLength is a CARD32 and needs to be bounds checked before
adding one to it to come up with the total size to allocate, to avoid
integer overflow leading to underallocation and writing data from the
network past the end of the allocated buffer.

NOTE: This is a candidate for stable release branches.

Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-30 18:03:45 -07:00
Alan Coopersmith
2e5a268f18 integer overflow in XF86DRIOpenConnection() [CVE-2013-1993 1/2]
busIdStringLength is a CARD32 and needs to be bounds checked before adding
one to it to come up with the total size to allocate, to avoid integer
overflow leading to underallocation and writing data from the network past
the end of the allocated buffer.

NOTE: This is a candidate for stable release branches.

Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-30 18:03:39 -07:00
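The pattern behind the two CVE-2013-1993 commits above can be sketched as follows. This is an illustration, not the patched XF86DRI code: `alloc_wire_string` is a hypothetical helper, and the choice of `INT_MAX` as the upper bound is an assumption made for the example. The point is that the untrusted 32-bit length is checked before one is added to it.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate a zeroed buffer for a string of `length` bytes plus a NUL
 * terminator, where `length` is an untrusted 32-bit value from the wire.
 * Returns NULL if the length is out of bounds or the allocation fails.
 */
static char *alloc_wire_string(uint32_t length)
{
    /* Check before adding 1, so the size computation cannot wrap around. */
    if (length >= (uint32_t)INT_MAX)
        return NULL;

    return calloc((size_t)length + 1, 1);
}
```

Without the check, a length of 0xFFFFFFFF plus one wraps to zero in 32-bit arithmetic, so the allocation would be far smaller than the data subsequently copied into it.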
Brian Paul
51498a3e71 mesa: fix error checking of DXT sRGB formats in _mesa_base_tex_format()
For formats such as GL_COMPRESSED_SRGB_S3TC_DXT1_EXT we need to
have both the GL_EXT_texture_sRGB and GL_EXT_texture_compression_s3tc
extensions.  This patch adds the missing check for the latter.

Found when checking out https://bugs.freedesktop.org/show_bug.cgi?id=65173

NOTE: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-30 14:01:31 -06:00
Brian Paul
fb1785197f mesa: asst. whitespace, formatting fixes in teximage.c 2013-05-30 14:01:31 -06:00
Zack Rusin
978d5ed06b draw: fix vs/fs input/output mismatches
When we changed draw_find_shader_output to return -1 instead
of 0 for attribs that are not found, we broke the default behavior of
draw, which was to always redirect those to the first (0th) slot.
To preserve that behavior if draw_emit_vertex_attr notices a
mismatched vertex attrib, it just redirects it to the first slot
(instead of trying to use negative index in an array).

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-30 15:34:19 -04:00
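The fallback described in the draw commit above (978d5ed06b) amounts to treating a "not found" result of -1 as slot 0 instead of using it as an array index. A minimal sketch, with hypothetical names (`find_shader_output`, `fetch_attr`) that only mimic the shape of the draw module functions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lookup: index of `wanted` in `semantics`, or -1 if absent. */
static int find_shader_output(const int *semantics, size_t count, int wanted)
{
    for (size_t i = 0; i < count; i++) {
        if (semantics[i] == wanted)
            return (int)i;
    }
    return -1;
}

/*
 * Hypothetical emit step: a mismatched attribute is redirected to the
 * first slot rather than indexing the array with a negative value.
 * (Redirecting slots past the end is an extra safety check in this sketch.)
 */
static float fetch_attr(const float *outputs, size_t count, int slot)
{
    assert(count > 0);
    if (slot < 0 || (size_t)slot >= count)
        slot = 0;
    return outputs[slot];
}
```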
Anuj Phogat
0a70fdfb3f intel: Add multisample scaled blitting in blorp engine
In traditional multisampled framebuffer rendering, color samples must be
explicitly resolved via BlitFramebuffer before doing the scaled blitting
of the framebuffer. So, scaled blitting of a multisample framebuffer
takes two separate calls to BlitFramebuffer.

This patch implements the functionality of doing multisampled scaled
resolve using just one BlitFramebuffer call. Important changes involved
in this patch are listed below:
    - Use float registers to scale and offset texture coordinates.
    - Change offset computation to consider float coordinates.
    - Round the scaled coordinates down to nearest integer.
    - Modify src texture coordinates clipping to account for scaling.
    - Linear filter is not yet implemented in blorp. So, don't use
      blorp engine to do single sampled scaled blitting.

V3: Fix nearest filtering issue in scaled blits. Makes failing piglit
fbo-blit-stretch test and framebuffer_blit_functionality_magnifying_blit.test
in gles3 CTS pass.

Observed no piglit, gles3 CTS regressions on sandybridge & ivybridge with
this patch.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-30 10:50:30 -07:00
Anuj Phogat
6e28713a8d intel: Change the register type from UW to UD in blorp engine
These changes are required to implement scaled blitting in blorp
in my next patch.

No regressions observed in piglit quick-driver.tests with this patch.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-30 10:50:29 -07:00
Anuj Phogat
40e3298125 mesa: Implement ext_framebuffer_multisample_blit_scaled extension
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-30 10:50:29 -07:00
Kenneth Graunke
60f9b722ef Revert "i965: fix problem with constant out of bounds access (v2)"
This reverts commit 98dfd59a04.

The patch was clearly not Piglit tested, as it caused at least 225
tests to start crashing with assertion failures.  That was before my
desktop tanked and the test run died completely.
2013-05-29 23:31:09 -07:00
Courtney Goeltzenleuchter
8b1c9de166 ilo: simplify shader variant handling
Remove hash function on shader variants. Nature of variants limits them to a
small number and thus it's more efficient to just do a memory compare of the
actual shader structures rather than compute and compare hashes.
2013-05-30 13:58:40 +08:00
Dave Airlie
98dfd59a04 i965: fix problem with constant out of bounds access (v2)
This is my attempt at fixing this as the CVE is making RH security team
care enough to make me look at this. (please upstream, security fixes are
more important than whatever else you are doing, if for no other reason than
it saves me having to fix stuff I've no real clue about).

Since Frank's original fix was denied, here is my attempt to just
alias all constants that are out of bounds < 0 or > nr_params to constant 0,
hopefully this provides the undefined behaviour idr requires.

CVE-2013-1872

v2: drop the last hunk which was a separate fix (now in master).
hopefully fix the indentations.

NOTE: This is a candidate for stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-05-30 12:59:34 +10:00
Frank Henigman
02fe736cc0 intel: initialize fs_visitor::params_remap in constructor
Set fs_visitor::params_remap to NULL in the constructor.
This variable was potentially tested in fs_visitor::remove_dead_constants()
before being set.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Frank Henigman <fjhenigman@google.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-05-30 10:37:35 +10:00
Brian Paul
83aaf61e24 draw: add cast in debug_printf() to silence warning 2013-05-29 18:07:35 -06:00
Brian Paul
71682c1599 svga: add PIPE_CAP_MAX_VIEWPORTS to switch to silence warning 2013-05-29 18:07:11 -06:00
Zack Rusin
c08baef508 draw: make sure viewport index is fetched from leading vertex
Viewport index should only be used on a per primitive basis, so
instead of fetching it from each vertex, potentially making each
vertex in a primitive use a different viewport index, which is
obviously broken, make sure that we only fetch from the first
vertex in the primitive making the viewport index the same
for the entire primitive.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca<jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-25 09:49:20 -04:00
Zack Rusin
c88ce3480c llvmpipe: clamp scissors to be between 0 and max
We need to clamp to make sure invalid shader doesn't crash our
driver. The spec says to return 0-th index for everything that's
out of bounds.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca<jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-25 09:49:20 -04:00
Zack Rusin
d7d676252d draw: clamp the viewports to always be between 0 and max
If the viewport index is larger than the PIPE_MAX_VIEWPORTS,
then the first (0-th) viewport should be used.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca<jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-25 09:49:20 -04:00
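The rule in the draw commit above (d7d676252d) fits in a few lines. In this sketch `MAX_VIEWPORTS` is a hypothetical stand-in for PIPE_MAX_VIEWPORTS with an assumed value, and the index is taken as a signed integer on the assumption that a shader may write an arbitrary value:

```c
/* Hypothetical stand-in for PIPE_MAX_VIEWPORTS; the value is an assumption. */
#define MAX_VIEWPORTS 16

/* Return idx if it names a valid viewport, otherwise the first (0-th) one. */
static int clamp_viewport_index(int idx)
{
    return (idx >= 0 && idx < MAX_VIEWPORTS) ? idx : 0;
}
```

Despite the word "clamp", an out-of-range index does not saturate to the last viewport; it selects viewport 0, as the commit message states.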
Zack Rusin
26fe24c479 gallium/docs: adds documentation for multi viewport cap
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-25 09:49:20 -04:00
Zack Rusin
4b5595b38b draw: fixup draw_find_shader_output
draw_find_shader_output, like most of the code in draw, used to
depend on position always being at output slot 0, which meant
that any other attribute being at 0 could signify an error.
Unfortunately position can be at any of the output slots, thus
other attributes can occupy slot 0 and we need to mark the ones
which were not found by something else. This commit changes
draw_find_shader_output so that it returns -1 if it can't
find the given attribute and adjusts the code that depended
on it returning >0 whenever it correctly found an attrib.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca<jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-25 09:49:20 -04:00
Zack Rusin
97b8ae429e llvmpipe: implement support for multiple viewports
Largely related to making sure the rasterizer can correctly
pick out the correct scissor box for the current viewport.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca<jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-25 09:49:20 -04:00
Zack Rusin
7756aae815 draw: implement support for multiple viewports
This adds support for multiple viewports to the draw module.
Multiple viewports depend on the presence of geometry shaders
which can write the viewport index.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca<jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-25 09:49:20 -04:00
Zack Rusin
eaabb4ead0 gallium: Add support for multiple viewports
Gallium supported only a single viewport/scissor combination. This
commit changes the interface to allow us to add support for multiple
viewports/scissors.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: José Fonseca<jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-25 09:49:20 -04:00
Kenneth Graunke
e6efb900e7 mesa: Delete the ctx->Array._RestartIndex derived state.
It's incorrect and isn't used any longer.

v2: Actually flush vertices/flag _NEW_TRANSFORM on RestartIndex change.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-29 14:22:17 -07:00
Kenneth Graunke
51c0ffacb2 mesa: Ignore fixed-index primitive restart in ArrayElement().
GL_PRIMITIVE_RESTART_FIXED_INDEX is only supposed to apply to
glDrawElements*.  This code is for legacy drawing paths and display
lists, so it shouldn't apply.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-29 14:22:14 -07:00
Kenneth Graunke
a41478e3f6 st/mesa: Go back to using ctx->Array.RestartIndex, not _RestartIndex.
The derived _RestartIndex field is an attempt to support both
GL_PRIMITIVE_RESTART and GL_PRIMITIVE_RESTART_FIXED_INDEX (part of ES
3.0).  Gallium drivers don't appear to support ES 3.0 yet, so they don't
need to use it.  Plus, it's broken and going to go away soon.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-29 14:22:11 -07:00
Kenneth Graunke
49aba27973 i965: Fix can_cut_index_handle_restart_index() for byte/short types.
Pre-Haswell hardware doesn't support an arbitrary restart index, and
instead compares the index buffer value against 0xFF for byte-size
buffers, 0xFFFF for short-size buffers, or 0xFFFFFFFF for unsigned
integer buffers.

OpenGL allows the restart index to be an arbitrary unsigned integer.
When comparing against byte/short types, the index buffer value should
be promoted to a full 32-bit integer before doing the comparison.  The
restart index is /not/ supposed to be masked to byte/short size.

This means that with certain restart indexes, the comparison should
always fail.  For example, a restart index of 0xF000FFFF should never
match any byte/short index buffer values due to the extra high bits.

We must not enable hardware primitive restart in such a case.  For now,
fall back to software primitive restart as it's the simplest fix.  In
the future, we could detect restart indexes that will never match and
skip both hardware and software primitive restart.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-29 14:22:08 -07:00
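The check described in the i965 commit above (49aba27973) can be sketched as below. This is not the driver's can_cut_index_handle_restart_index(); the function name and the byte-size parameter are hypothetical. It compares the full 32-bit restart index against the fixed value the pre-Haswell hardware uses for each index size, without masking the restart index first.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if a fixed-function cut index can implement the given
 * restart index for index buffers of the given element size in bytes.
 * The restart index is compared at full 32-bit width, never masked.
 */
static bool hw_cut_index_matches(uint32_t restart_index,
                                 unsigned index_size_bytes)
{
    switch (index_size_bytes) {
    case 1:
        return restart_index == 0xFFu;
    case 2:
        return restart_index == 0xFFFFu;
    case 4:
        return restart_index == 0xFFFFFFFFu;
    default:
        return false;
    }
}
```

With this comparison a restart index of 0xF000FFFF is rejected for short indices, so the caller would fall back to software primitive restart, whereas masking it to 16 bits first would wrongly accept it.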
Kenneth Graunke
7c87a3b5da i965: Use the correct restart index for fixed index mode on Haswell.
The code that updates the ctx->Array._RestartIndex derived state mashed
it to 0xFFFFFFFF when GL_PRIMITIVE_RESTART_FIXED_INDEX was enabled
regardless of the index buffer type.  It's supposed to be 0xFF for byte,
0xFFFF for short, or 0xFFFFFFFF for integer types.

The new _mesa_primitive_restart_index() helper gets this right.

The hardware appears to compare against the full 32-bit value some of
the time, causing primitive restart not to occur when it should.  The
fact that it works some of the time is rather frightening.

Fixes sporadic failures in the ES 3 instanced_arrays_primitive_restart
conformance test when run in combination with other tests.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-29 14:22:06 -07:00
Kenneth Graunke
1569709663 vbo: Use the new primitive restart index helper function.
This gets the correct restart index for unsigned byte/short types when
using GL_PRIMITIVE_RESTART_FIXED_INDEX.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-29 14:22:04 -07:00
Kenneth Graunke
959d076b30 mesa: Add a helper function for determining the restart index.
The derived state approach currently used (_RestartIndex) doesn't work:
in the GL_PRIMITIVE_RESTART_FIXED_INDEX case, the restart index depends
on the index buffer's data type, and that isn't known until draw time.

The existing code also fails to obey the GL 4.3 rules which say that
FIXED_INDEX takes precedence over normal primitive restart.

This helper function correctly determines the restart index, and will
replace the derived state.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-29 14:22:02 -07:00
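The behaviour the helper commit above (959d076b30) describes can be sketched like this. It is not the actual _mesa_primitive_restart_index(); the enum, struct, and function names are hypothetical. The sketch shows the two properties the message calls out: the fixed restart index depends on the index type, which is only known at draw time, and fixed-index mode takes precedence over the user-specified restart index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index buffer element types. */
enum index_type { INDEX_UBYTE, INDEX_USHORT, INDEX_UINT };

/* Hypothetical subset of the primitive restart state. */
struct restart_state {
    bool primitive_restart;             /* GL_PRIMITIVE_RESTART */
    bool primitive_restart_fixed_index; /* GL_PRIMITIVE_RESTART_FIXED_INDEX */
    uint32_t restart_index;             /* value set by the application */
};

/* Determine the restart index for one indexed draw of the given type. */
static uint32_t restart_index_for_draw(const struct restart_state *st,
                                       enum index_type type)
{
    assert(st != NULL);

    /* Fixed-index mode wins over the application's restart index. */
    if (st->primitive_restart_fixed_index) {
        switch (type) {
        case INDEX_UBYTE:
            return 0xFFu;
        case INDEX_USHORT:
            return 0xFFFFu;
        case INDEX_UINT:
            return 0xFFFFFFFFu;
        }
    }
    return st->restart_index;
}
```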
Kenneth Graunke
37f278000c vbo: Ignore PRIMITIVE_RESTART_FIXED_INDEX for glDrawArrays().
The derived _PrimitiveRestart enable flag combines the PrimitiveRestart
and PrimitiveRestartFixedIndex enable flags.  However, DrawArrays is not
supposed to do FixedIndex restart:

From the OpenGL 4.3 Core specification, section 10.3.5 (page 302):
"If PRIMITIVE_RESTART_FIXED_INDEX is enabled, primitive restart is not
 performed for array elements transferred by any drawing command not
 taking a type parameter, including all of the *Draw* commands other
 than *DrawElements*."

The OpenGL ES 3.0 specification agrees by omission:
"When DrawElements, DrawElementsInstanced, or DrawRangeElements
 transfers a set of generic attribute array elements to the GL..."

Notably, DrawArrays is not included in the list of draw calls that
take PRIMITIVE_RESTART_FIXED_INDEX into consideration.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-29 14:21:51 -07:00
Eric Anholt
6220cc931f i965/vs: Fix implied_mrf_writes() for integer division pre-gen6.
Previously it would assertion fail in debug builds (though the correct
value was returned in a non-debug build).  Marking it as a candidate for
stable even though it has no current consumers in the stable branches, in
case one shows up in a later backport.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64727
NOTE: This is a candidate for stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-29 11:02:01 -07:00
Eric Anholt
0a0b323193 i965/fs: Fix test for smearing enabled on an instruction.
We were expanding the live range too far, breaking register_coalesce_2()
and compute_to_mrf() on 16-wide shaders.  Turning it back on improves
GLB2.7 performance by 0.239355% +/- 0.0850649% (n=398). shader-db stats
are:

total instructions in shared programs: 1627211 -> 1609262 (-1.10%)
instructions in affected programs:     450351 -> 432402 (-3.99%)

While 33 new 16-wide shaders are gained, 70 are lost.  Despite that,
tropics (the app that lost the most 16-wide) shows a .41% +/- .16%
(n=7/8, first-run outlier removed) performance improvement on my HSW.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-29 10:20:26 -07:00
Eric Anholt
9a31c4f9ac i965/fs: Fix segfault in instruction scheduling with LINTERP using last GRF.
The scheduler didn't know about uniform-type accesses, and if a uniform
access was last in a 16-wide, we'd walk off the end of the array.  This
never happened, because we'd never coalesce out all the GRFs, due to a bug
to be fixed in the next commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-29 10:16:44 -07:00
Eric Anholt
7e7600d10b mesa: Fix test for optimistic coloring being necessary.
i965 and radeon use ra_set_node_reg() to force payload registers to
specific registers while exposing those registers to the allocator still.
We were treating those register nodes as unsuccessfully allocated in the
ra_simplify() step, leading to walking the registers again to do
optimistic coloring even if there was nothing left to do.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-29 10:16:44 -07:00
Anthony G. Basile
22f1add968 gallium: fix build on uclibc system
execinfo.h and debug_symbol_name_glibc() are pure GNU-isms and do not
build on uclibc systems.  A previous patch addressed this issue, but
there was an error.  This patch corrects that error.  See

  https://bugs.freedesktop.org/show_bug.cgi?id=51782
  https://bugs.gentoo.org/show_bug.cgi?id=469768

Signed-off-by: Anthony G. Basile <blueness@gentoo.org>
Signed-off-by: Brian Paul <brianp@vmware.com>
2013-05-29 08:32:35 -06:00
Eric Anholt
4dea6cf215 intel: Enable blit glCopyTexSubImage/glBlitFramebuffer with sRGB.
Since the introduction of default-to-SARGB8 window system framebuffers,
non-blorp hardware lost blit acceleration for these two paths between the
window system and ARGB8888 textures.  Since we shouldn't be doing any
conversion anyway, just compatibility-check the linear variants of the
formats.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61954
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
2013-05-28 17:53:44 -07:00
Andreas Hartmetz
f43f07d588 radeonsi: Add ipo to LLVM_COMPONENTS
r600g needs it too, so add ipo in the common radeon_llvm_check().

radeonsi compiled and linked, but it failed at dynamic link time
with a missing symbol.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-05-28 17:08:00 -07:00
Roland Scheidegger
33fcce3682 llvmpipe: get rid of tiled/linear layout remains
Eliminate the rest of the no longer needed layout logic.
(It is possible some code could be simplified a bit further still.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-29 00:41:06 +02:00
Eric Anholt
b3abc93f47 intel: Remove dead intel_drawbuf_region().
Since the glBitmap() MRT change, it's unused.  There was basically no way
to responsibly use this function since MRT was introduced.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 13:06:58 -07:00
Eric Anholt
0a39cb88de intel: Fix format handling of blit glBitmap()
Any 32-bit format got ARGB8888 handling (including, say, GL_RG1616), and
anything else got 16-bit (including, say, GL_R8), which could potentially
hang the GPU by writing out of bounds.

NOTE: This is a candidate for the stable branches.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 13:06:58 -07:00
Eric Anholt
1cb8de6fff intel: Fix MRT handling of glBitmap().
We'd only hit color buffer 0 even if multiple draw buffers were bound.

NOTE: This is a candidate for the stable branches.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 13:06:57 -07:00
Eric Anholt
5f29dca070 intel: Rebuild PBO blit glTexImage() on top of miptrees.
This will ensure that we have resolves if we ever extend this to
glTexSubImage(), and fixes missing image start offset handling.

The texture buffer alloc ended up getting moved up, because we want to
look at the format of the image's actual mt to see if we'll end up
blitting the right thing, in the case of packed depth/stencil uploads.

This is the last caller of intelEmitCopyBlit() on a miptree-wrapped BO.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 13:06:57 -07:00
Eric Anholt
3c3e83014b intel: Rebuild PBO blit glReadPixels() on top of miptrees.
The previous code was missing depth resolves, that had only been prevented
due to no blitting of Y tiling.  The pair of flip args in the new blit
function means that we can just drop the pack->Invert fallback.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 13:06:57 -07:00
Eric Anholt
8c3392e274 intel: Rework intel_miptree_create_for_region() to wrap a BO.
I needed to do this for the PBO blit cases to use intel_miptree_blit().
But this also actually partially fixes a bug in EGLImage handling: We
can't share regions across contexts, because regions have a refcount that
isn't protected by a mutex, and different contexts can be simultaneously
accessed from multiple threads.  Now we just need to get regions out of
__DRIImage.  There was also a missing use of image->offset in the EGLImage
renderbuffer storage code.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 13:06:57 -07:00
Eric Anholt
e845c5cf7a intel: Make a temporary miptree for the blit path of miptree mapping.
In a bit of debug code, we no longer have the inter-slice x/y to print.
But I think the level/slice is more useful in this case for looking at
what's getting mapped, especially given that INTEL_DEBUG=blit will tell
you the other value.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 13:06:56 -07:00
Eric Anholt
4a13beef88 intel: Make a temporary miptree when doing blit uploads for glTexSubImage().
While this is a bit more CPU work, it also is less code to handle this
path, and fixes problems with 32k-pitch textures and missing resolves.

v2: Add error checking in new code.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 13:06:56 -07:00
Eric Anholt
da2880bea0 intel: Extend the force_y_tiling flag to allow forcing no tiling.
For a blit-uploaded temporary, it's faster on current hardware to memcpy
the data into a linear CPU mapping than to go through the GTT.

v2: Turn the not-fully-supported mask into 3 supported enum values.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Paul Berry <stereotype441@gmail.com> (v2)
Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v2)
2013-05-28 13:06:43 -07:00
Eric Anholt
045612c90e intel: Add an assert for glCopyTexSubImage() being called on MSAA buffers.
This is just in case someone else trips over this due to our weird reuse
of this code in glBlitFramebuffer().

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:44 -07:00
Eric Anholt
7638f5578e i965: Allow glCopyTexSubImage() on depth textures.
If the hw is pre-gen5 and can't blit depth, it'll cleanly error out.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:39 -07:00
Eric Anholt
48a22340cf i965: Prefer blorp glBlitFramebuffer() to the glCopyTexSubImage-based blit.
I think we've measured no performance difference from this in the past,
except that the blorp code can do things like multisample resolves.
Prevents piglit regression in the next commit when a testcase started
trying to do a multisampled resolve through the old glCopyTexSubImage()
path.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:35 -07:00
Eric Anholt
9720d436d1 i965: Consistently do depth resolves before blitting.
We were protected for a long time by the fact that depth was Y tiled and
you couldn't blit Y.  Now that we can blit Y, we were failing to resolve
depth in glCopyPixels().

Note in the comment about swrast, that the swrast map path does resolves
appropriately already.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:30 -07:00
Eric Anholt
6a7c27786c intel: Make a wrapper for intelEmitCopyBlit using miptrees.
I had previously asserted that it was hard to write a useful, simpler
blit function, but I think this might be it.

This has the side effect of extending the 32k pitch check to a few more
places that were missing it.

v2: Update comment for being moved inside intel_miptree_blit().

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:25 -07:00
Eric Anholt
0ae294bf7c intel: Rename intel_renderbuffer_tile_offsets.
This makes it more consistent with intel_miptree_get_tile_offsets().

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:21 -07:00
Eric Anholt
4e8eafd8f4 intel: Reduce intel_renderbuffer_tile_offsets to a thin wrapper.
Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:15 -07:00
Eric Anholt
5c85e1cf55 intel: Make intel_miptree_get_tile_offsets return a page offset.
Right now, the callers in i965 don't expect a nonzero page offset to
actually occur (since that's being handled elsewhere), but it seems
like a trap to leave it this way.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-28 12:40:00 -07:00
José Fonseca
4eaa0999b5 glsl: Fix MSVC build.
It appears that `sizeof(Class::member)` is either non-standard or
merely unsupported in MSVC.

So use `sizeof(instance->member)` instead, which is guaranteed to work
everywhere.

Also promote the assert to a static assert.

Trivial.
2013-05-28 13:56:18 +01:00
Marek Olšák
d4a06d77f5 mesa: fix GLSL program objects with more than 16 samplers combined
The problem is the sampler units are allocated from the same pool for all
shader stages, so if a vertex shader uses 12 samplers (0..11), the fragment
shader samplers start at index 12, leaving only 4 sampler units
for the fragment shader. The main cause is probably the fact that samplers
(texture unit -> sampler unit mapping, etc.) are tracked globally
for an entire program object.

This commit adapts the GLSL linker and core Mesa such that the sampler units
are assigned to sampler uniforms for each shader stage separately
(if a sampler uniform is used in all shader stages, it may occupy a different
sampler unit in each, and vice versa, an i-th sampler unit may refer to
a different sampler uniform in each shader stage), and the sampler-specific
variables are moved from gl_shader_program to gl_shader.

This doesn't require any driver changes, and it fixes piglit/max-samplers
for gallium and classic swrast. It also works with any number of shader
stages.
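
The difference between the two allocation schemes can be sketched roughly as
follows. This is a small Python model, not Mesa code: the function names and
MAX_UNITS_PER_STAGE are hypothetical, and the limit of 16 is only taken from
the subject line.

```python
MAX_UNITS_PER_STAGE = 16  # assumed per-stage limit, taken from the subject line


def assign_shared_pool(samplers_per_stage):
    """Old scheme: every stage draws unit indices from one program-wide pool."""
    next_unit = 0
    assignment = {}
    for stage, count in samplers_per_stage.items():
        assignment[stage] = list(range(next_unit, next_unit + count))
        next_unit += count
    return assignment


def assign_per_stage(samplers_per_stage):
    """New scheme: each stage numbers its own sampler units starting at 0."""
    return {stage: list(range(count))
            for stage, count in samplers_per_stage.items()}


def fits(assignment):
    """True if no stage uses a unit index at or above the per-stage limit."""
    return all(unit < MAX_UNITS_PER_STAGE
               for units in assignment.values() for unit in units)


usage = {"vertex": 12, "fragment": 8}
shared = assign_shared_pool(usage)    # fragment units start at index 12
separate = assign_per_stage(usage)    # fragment units start at index 0
```

With 12 vertex samplers and 8 fragment samplers, the shared pool runs past
the limit while the per-stage assignment does not.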

v2: - converted tabs to spaces
    - added an assertion to _mesa_get_sampler_uniform_value

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-28 13:05:30 +02:00
Marek Olšák
b4cb857dbf swrast: increase array size of TextureSample
to match the size of ctx->Texture.Unit, and it will also fix
piglit/max-samplers with the following commit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-28 13:05:30 +02:00
Marek Olšák
15a4b6db21 mesa: declare UniformBufferBindings as an array with a static size
Some Gallium drivers were crashing, because the array was not large enough.

v2: clamp the per-shader maximum in st/mesa, then sum them all up

NOTE: This is a candidate for the stable branches.
2013-05-28 13:05:30 +02:00
Michel Dänzer
cdad129f9c radeonsi: Enable GLSL 1.30 2013-05-28 11:20:53 +02:00
Michel Dänzer
0495adbac5 radeonsi: Handle TGSI TXQ opcode 2013-05-28 11:20:53 +02:00
Michel Dänzer
3623111960 radeonsi: Add support for TGSI TXF opcode 2013-05-28 11:20:53 +02:00
Michel Dänzer
beaa5eb03a radeonsi: Use tgsi_util_get_texture_coord_dim() 2013-05-28 11:20:53 +02:00
Michel Dänzer
0afeea5ad2 radeonsi: Handle TGSI_SEMANTIC_CLIPDIST 2013-05-28 11:20:16 +02:00
Michel Dänzer
784df2e115 radeonsi: Make border colour state handling safe for integer textures 2013-05-28 09:55:46 +02:00
Michel Dänzer
e369f40a9b radeonsi: Fix hardware state for dual source blending
Set up CB_SHADER_MASK register according to pixel shader exports, and enable
some minimal state for colour buffer 1 in case dual source blending is used.
2013-05-28 09:55:46 +02:00
Vadim Girlin
08810ca9ef r600g/sb: handle more cases for folding in gvn pass
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-28 05:24:53 +04:00
Christian König
5328c8001b st/vdpau: destroy handle table only when it's empty
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-27 18:18:32 +02:00
Christian König
f796b67431 st/vdpau: remove vlCreateHTAB from surface functions
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-27 18:18:32 +02:00
Christian König
8ea34fa0e8 st/vdpau: invalidate the handles on destruction
Fixes a problem with xbmc when switching channels.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-27 18:18:32 +02:00
Vadim Girlin
5de41575a1 r600g/sb: improve folding for SETcc
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 15:30:01 +04:00
Vadim Girlin
88e700329b r600g/sb: optimize CNDcc instructions
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 15:29:56 +04:00
Vadim Girlin
725671a83a r600g/sb: improve optimization of conditional instructions
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 15:19:20 +04:00
Chia-I Wu
5285c4c88e ilo: enable multiple constant buffers
This effectively enables uniform buffer object support.
2013-05-27 12:31:42 +08:00
Chia-I Wu
3a5dd39b1d ilo: add support for indirect access of CONST in FS
Unlike other register files, CONST is read with a message and indirect access
is easier to implement.
2013-05-27 12:30:51 +08:00
Chia-I Wu
8e7987cc49 ilo: add support for TBOs on GEN6
This hunk was missing in the last commit.
2013-05-27 12:30:42 +08:00
Chia-I Wu
11c9aaf30a ilo: advertise support for pure integer formats
For pure integer formats, neither filtering nor blending is needed.
2013-05-27 11:02:57 +08:00
Chia-I Wu
fb40aca879 ilo: add support for texture buffer objects
Take care of sampler views that have buffers as the underlying resources.
Update caps related to TBOs.
2013-05-27 11:02:57 +08:00
Chia-I Wu
441aa9326a tgsi: add buffer texture to tgsi_util_get_texture_coord_dim()
TGSI_TEXTURE_BUFFER is one-dimensional.  Assert that exec_tex() is never
called with TGSI_TEXTURE_BUFFER.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-27 11:02:06 +08:00
Vadim Girlin
63d09a0cb7 r600g/sb: improve handling of KILL instructions
This patch improves handling of unconditional KILL instructions inside
the conditional blocks, uncovering more opportunities for if-conversion.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 01:45:07 +04:00
Vadim Girlin
880f435a7e r600g/sb: fix peephole optimization for PRED_SETE
Fixes incorrect condition that prevented optimization for
PRED_SETE/PRED_SETE_INT.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 01:45:07 +04:00
Vadim Girlin
ff2a611699 r600g/sb: fix scheduling of PRED_SET instructions
PRED_SET instructions that update exec mask should be scheduled immediately
prior to the "if-then-else" block, because any instruction that is
inserted after alu clause with PRED_SET and before conditional block is
also conditionally executed by hw (exec mask is already updated at that
moment).

Probably it's better to make PRED_SET a part of conditional
"if-then-else" block in the IR to handle this more cleanly,
but for now this temporary solution should prevent the problem.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-27 01:45:07 +04:00
Vadim Girlin
44a117ab9a r600g/sb: fix handling of preloaded inputs for compute shaders
For compute shaders we need to let the backend know that
GPRs 0 and 1 are preloaded with some compute-specific input
values, otherwise any use of these regs without previous
definition is considered an undefined value and usually
is simply replaced with 0.
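
The effect described above might be modelled like this. It is only an
illustrative Python sketch under assumed names (value_of, definitions,
preloaded); the real sb backend tracks values quite differently.

```python
def value_of(reg, definitions, preloaded=frozenset()):
    """Resolve the value read from a GPR in this toy model.

    definitions: dict mapping a register index to the value defined for it.
    preloaded:   registers the hardware fills in before the shader starts.
    """
    if reg in definitions:
        return definitions[reg]
    if reg in preloaded:
        # Known input: keep a symbolic reference instead of folding it away.
        return ("input", reg)
    # No definition and not preloaded: treated as undefined, folded to 0.
    return 0


defs = {2: 42}
without_hint = value_of(0, defs)                 # folded to 0
with_hint = value_of(0, defs, preloaded={0, 1})  # kept as an input
```

Without the hint a read of GPR 0 collapses to 0; with GPRs 0 and 1 marked as
preloaded the read survives as a real input.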

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-25 22:56:53 +04:00
Brian Paul
fd9fe4470b xlib: add null ctx check in glXDestroyContext()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64934
NOTE: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-24 16:35:25 -06:00
Brian Paul
fd29e4acda st/glx: add null ctx check in glXDestroyContext()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64934
NOTE: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-24 16:35:25 -06:00
Brian Paul
db4580cbdf st/mesa: add switch cases for new IR enums to silence warnings 2013-05-24 16:35:25 -06:00
Brian Paul
820de34ceb st/glx/xlib: assorted whitespace, comment fixes 2013-05-24 16:35:24 -06:00
Vadim Girlin
8e41ced4b3 r600g/sb: fix incorrect assert
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-24 21:00:54 +04:00
Vadim Girlin
e9aa46e665 r600g/sb: relax some restrictions for FETCH instructions
This allows GVN rewrite pass to propagate non-const (register)
values to FETCH source operands, helping to eliminate unnecessary
copies in some cases.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-24 21:00:54 +04:00
Vadim Girlin
5a68a29706 r600g/sb: relax register allocation for compute shaders
We have to assume that all GPRs in compute shader can be indirectly
addressed because LLVM backend doesn't provide any indirect array info.
That's why for compute shaders GPR array is created that covers all used
GPRs (0..r600_bytecode::ngpr-1), but this seriously restricts register
allocation in sb.

This patch checks for actual use of indirect access in the shader and
if it's not used then GPR array is not created, so that regalloc is not
unnecessarily restricted.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-24 21:00:54 +04:00
Vadim Girlin
0b5b3f8816 r600g/sb: fix gpr array handling for compute shaders
Fixes segfault with bfgminer and R600_DEBUG=sbcl.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-24 16:45:58 +04:00
Vadim Girlin
d1e0dc6275 r600g/sb: fix buffer overflow in sb_ostream
Fixes segfault during bytecode dump with bfgminer kernel

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-24 16:40:58 +04:00
Tom Stellard
b1797c3a38 r600g/compute: Use common transfer_{map,unmap} functions for global resources
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-05-23 14:52:34 -07:00
Tom Stellard
65d67bcc4b r600g/compute: Use common transfer_{map,unmap} functions for kernel inputs
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-05-23 14:52:34 -07:00
Kenneth Graunke
062317d667 i965: Go back to using the kernel SOL reset feature.
It turns out the MI_LOAD_REGISTER_IMM approach doesn't work on Haswell,
and regressed essentially all the transform feedback Piglit tests.

This morally reverts eaa6fbe6d5.  However,
the code is still simpler than it was.  On BeginTransformFeedback, we
simply flush the batch and set the SOL reset flag so that the next batch
will start with zeroed offsets.  There's still no software counting.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64887
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-23 13:32:02 -07:00
Rob Clark
95670bdee2 freedreno: scissor fix
Don't assume the state-tracker will set the scissor after the
framebuffer state is changed.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-05-23 14:35:21 -04:00
Rob Clark
97fa811d14 freedreno: implement pipe->resource_copy_region()
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-05-23 14:35:21 -04:00
Kenneth Graunke
3ddfccb303 glsl linker: compare interface blocks during interstage linking
Verify that interface blocks match when linking separate shader
stages into a program.

Fixes piglit glsl-1.50 tests:
* linker/interface-blocks-vs-fs-member-count-mismatch.shader_test
* linker/interface-blocks-vs-fs-member-order-mismatch.shader_test

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2013-05-23 09:37:12 -07:00
Jordan Justen
4a0bcd90cf glsl linker: compare interface blocks during intrastage linking
Verify that interface blocks match when combining compilation
units at the same stage. (For example, when merging all vertex
shaders.)

Fixes piglit glsl-1.50 test:
* linker/interface-blocks-multiple-vs-member-count-mismatch.shader_test

v5 (Ken): Rename to link_interface_blocks.cpp and drop the separate .h
file for consistency with other linker code.  Remove "ok" variable.
Fold cross_validate_interface_blocks into its caller.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:12 -07:00
Jordan Justen
d6863acb9f glsl linker: support arrays of interface block instances
With this change we now support interface block arrays.
For example, cases like this:

out block_name {
    float f;
} block_instance[2];

This allows Mesa to pass the piglit glsl-1.50 test:
* execution/interface-blocks-complex-vs-fs.shader_test

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:12 -07:00
Jordan Justen
c30ca431ba glsl link_varyings: link interface blocks using the block name
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:12 -07:00
Jordan Justen
5ebf547312 glsl linker: remove interface block instance names
Convert interface blocks with instance names into flat
interface blocks without an instance name.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:12 -07:00
Jordan Justen
b24eeb078f glsl ast_to_hir: support in/out for interface blocks
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:12 -07:00
Jordan Justen
cb29a7095f glsl ast_to_hir: reject row/column_major for in/out interface blocks
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:12 -07:00
Jordan Justen
c00387497d glsl ast_to_hir: move uniform block symbols to interface blocks namespace
Uniform/interface blocks are a separate namespace from types.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:12 -07:00
Jordan Justen
3919c19468 glsl_symbol_table: add interface block namespaces
For interface blocks, there are three separate namespaces for
uniform, input and output blocks.
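
As a rough illustration of what separate namespaces buy us, here is a minimal
Python sketch. The class and method names are made up for this example and do
not match the glsl_symbol_table API.

```python
class InterfaceNamespaces:
    """Toy symbol table with one namespace per interface block mode."""

    def __init__(self):
        self._blocks = {"uniform": {}, "in": {}, "out": {}}

    def add(self, mode, name, block):
        """Register a block; returns False if the name is taken in that mode."""
        namespace = self._blocks[mode]
        if name in namespace:
            return False
        namespace[name] = block
        return True

    def get(self, mode, name):
        """Look up a block by mode and name, or None if it is not declared."""
        return self._blocks[mode].get(name)


table = InterfaceNamespaces()
added_in = table.add("in", "Data", {"f": "float"})
added_out = table.add("out", "Data", {"f": "float"})   # different namespace
added_again = table.add("in", "Data", {"g": "vec4"})   # clashes with first
```

The same block name can be declared once per mode, but a second declaration
in the same mode is refused.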

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:12 -07:00
Jordan Justen
9368604d99 glsl parser: allow in & out for interface block members
Previously, uniform blocks allowed the 'uniform' keyword
to be used with members of a uniform block. With interface
blocks 'in' can be used on 'in' interface block members and
'out' can be used on 'out' interface block members.

The basic_interface_block rule will verify that the same
qualifier type is used with the block and each member.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:11 -07:00
Jordan Justen
067cc08d6a glsl ast_to_hir: reject interpolation qualifiers for uniform blocks
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:11 -07:00
Jordan Justen
4410eba598 glsl parser: handle interface block member qualifier
An interface block member may specify the type:
in {
    in vec4 in_var_with_qualifier;
};

When specified with the member, it must match the
type of the interface block.

It can also omit the qualifier:
uniform {
    vec4 uniform_var_without_qualifier;
};

When the type is not specified with the member,
it will adopt the same type as the interface block.
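
The rule above can be summarised in a few lines. This is a hedged Python
sketch with a hypothetical helper name, not the actual parser code:

```python
def resolve_member_qualifier(block_qualifier, member_qualifier=None):
    """Return the qualifier a block member ends up with.

    A member with no qualifier inherits the block's; a member with a
    qualifier must agree with the block, otherwise it is an error.
    """
    if member_qualifier is None:
        return block_qualifier
    if member_qualifier != block_qualifier:
        raise ValueError(
            "member qualifier '%s' does not match block qualifier '%s'"
            % (member_qualifier, block_qualifier))
    return member_qualifier


explicit = resolve_member_qualifier("in", "in")   # matches the block
inherited = resolve_member_qualifier("uniform")   # adopted from the block
```

A mismatch such as an 'out' member inside an 'in' block raises an error in
this model.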

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:11 -07:00
Jordan Justen
4369acff5e glsl parser: on desktop GL require GLSL 150 for instance names
Interface blocks in GLSL 150 allow an instance name to be used.

v2:
 * use state->check_version

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:11 -07:00
Jordan Justen
d36cb3617c glsl parser: reject VS+in & FS+out interface blocks
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:11 -07:00
Jordan Justen
6d3d974e37 glsl: parse in/out types for interface blocks
Previously only 'uniform' was allowed for uniform blocks.

Now, in/out can be parsed, but it will only be allowed for
GLSL >= 150.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:11 -07:00
Jordan Justen
744c270406 glsl parser: rename uniform block to interface block
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:11 -07:00
Jordan Justen
c9f58544be glsl: rename ast_uniform_block to ast_interface_block
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-23 09:37:11 -07:00
Chris Forbes
7bfb4bea65 i965: Enable guardband clipping on Gen4/5.
Enables guardband clipping when the viewport covers the entire render
target.
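
The condition might look roughly like the sketch below. It is an
illustration in Python with a hypothetical function name, and it assumes
that "covers" means an exact match of origin and size.

```python
def can_use_guardband(viewport, target_size):
    """True if the viewport exactly matches the render target in this model.

    viewport:    (x, y, width, height)
    target_size: (width, height) of the render target
    """
    x, y, width, height = viewport
    target_width, target_height = target_size
    return (x == 0 and y == 0 and
            width == target_width and height == target_height)


full = can_use_guardband((0, 0, 300, 300), (300, 300))
partial = can_use_guardband((10, 10, 100, 100), (300, 300))
```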

No piglit regressions on Ironlake.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-24 08:00:47 +12:00
Chris Forbes
a3d8e7c57c ARB_fp: accept duplicate precision options
Relaxes the validation of

   OPTION ARB_precision_hint_{nicest,fastest};

to allow duplicate options. The spec says that both /nicest/ and
/fastest/ cannot be specified together, but could be interpreted
either way for respecification of the same option.

Other drivers (NVIDIA etc) accept this, and at least one Unity3D game
expects it to succeed (Kerbal Space Program).
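
The relaxed check amounts to something like the following. This is a Python
sketch with made-up names, not the ARB_fp parser itself:

```python
def select_precision_hint(options):
    """Pick the precision hint from a list of OPTION values.

    Repeating the same hint is accepted; mixing 'nicest' and 'fastest'
    is still rejected. Returns None when no hint was given.
    """
    chosen = None
    for option in options:
        if option not in ("nicest", "fastest"):
            raise ValueError("unknown precision hint: %s" % option)
        if chosen is not None and chosen != option:
            raise ValueError("conflicting precision hints")
        chosen = option
    return chosen


duplicate = select_precision_hint(["nicest", "nicest"])  # now accepted
```

Only a program that asks for both hints is refused in this model.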

V2: Add spec quote.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-24 07:50:51 +12:00
Vinson Lee
e3eeb72f24 ilo: Initialize need_flush in draw_vbo.
need_flush was uninitialized if hw3d->new_batch was true.

Fixes "Uninitialized scalar variable" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2013-05-23 15:31:42 +08:00
Vinson Lee
36e2c7cc1a radeon: Initialize variables in radeon_llvm_context_init.
'type' was not fully initialized when calling lp_build_context_init.

Fixes "Uninitialized scalar variable" defect reported by Coverity.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-05-22 23:06:23 -07:00
Eric Anholt
cf37e12024 intel: Count fragments in our blitter-based glBitmap() path.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59440
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-22 14:35:44 -07:00
Eric Anholt
0af614727a i965: Shut up more compiler warnings from vector insert/extract changes.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-22 14:35:25 -07:00
Roland Scheidegger
2b291eaa90 softpipe: change TEX_TILE_SIZE and NUM_TEX_TILE_ENTRIES
Initially we had NUM_TEX_TILE_ENTRIES of 50, however this was using too much
memory (mostly because the tile cache is operating on fixed max current
sampler views which could be fixed but that's another topic). So it was
decreased to 4. However this is a ridiculously low number which can't
actually really work (the number of tiles needed for as little as
a single quad with linear_mipmap_linear is 2 to 8 for a 2d texture, and
4 to 16 for a 3d texture), as it just about guarantees there will be
cache thrashing sometimes (just about always for 3d textures in fact, since
while there are 4 entries the cache is direct mapped).
So increase that number to 16 (which is still on the low side for direct
mapped cache though I guess using something like 4-way associativity would
be more effective than increasing this further) which has at least some good
chance to avoid thrashing. Since we don't want to increase memory requirements
however in turn decrease the tile size accordingly from 64 to 32 (as a bonus
point this also decreases the cost of texture thrashing which might still
happen sometimes).
I've seen a performance improvement on the order of a factor of ~200 (specifically,
drawing the first frame from the replay from bug 41787 needs "only" ~10s
instead of ~30min, meaning I can actually compare the output with other
drivers...) with this.
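
The numbers above can be checked with a little arithmetic. The Python sketch
below is only a model: the slot function (tile id modulo entry count) is an
assumption made for illustration and is not the hash softpipe really uses.

```python
def tile_cache_texels(num_entries, tile_size):
    """Texels held by a cache of square tiles."""
    return num_entries * tile_size * tile_size


def count_misses(num_entries, accesses):
    """Misses in a direct-mapped cache where slot = tile id % num_entries."""
    slots = [None] * num_entries
    misses = 0
    for tile in accesses:
        slot = tile % num_entries
        if slots[slot] != tile:
            slots[slot] = tile
            misses += 1
    return misses


old_texels = tile_cache_texels(num_entries=4, tile_size=64)   # 16384
new_texels = tile_cache_texels(num_entries=16, tile_size=32)  # 16384

# A quad that keeps cycling through 8 distinct tiles, ten times over.
accesses = list(range(8)) * 10
misses_4 = count_misses(4, accesses)    # 80: every access misses
misses_16 = count_misses(16, accesses)  # 8: only the first pass misses

# ~30 min down to ~10 s for the first frame.
speedup = (30 * 60) / 10                # 180.0
```

So 16 entries of 32x32 hold as many texels as 4 entries of 64x64, and in
this toy access pattern the thrashing disappears once the working set fits.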

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-22 22:57:27 +02:00
Roland Scheidegger
2f567fb7b5 softpipe: disambiguate TILE_SIZE / TEX_TILE_SIZE
These can be different (just like NUM_TEX_TILE_ENTRIES / NUM_ENTRIES),
though currently they aren't.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-22 22:57:27 +02:00
Roland Scheidegger
80e2cc0f97 llvmpipe: disable simple_shader optimization
This optimization disabled mask checks if the shader is simple enough.
While this should work correctly, the problem is that it can hide real issues
because shaders in practice are usually complex enough (8 instructions or 1
texture is already enough) so this doesn't get used, whereas dumbed-down
tests which should hit all the same code paths suddenly do something quite
different. This was the reason that bug 41787 could not be easily tracked as
stencil test not working correctly (piglit would in fact have failed some
tests without that optimization).
So disable it for now, it's unclear if it's much of a win in any case.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-22 22:57:27 +02:00
Roland Scheidegger
e108716429 llvmpipe: fix early depth test / late depth write stencil issues
We actually did early depth/stencil test and late depth/stencil write even
when the shader could kill the fragment (alpha test or discard). Since it
matters for the new stencil value if the fragment is killed by depth/stencil
test or by the shader (in which case it will not reach the depth/stencil
test) this simply cannot work (we also would possibly skip writing the new
stencil value due to mask checks but this is a secondary issue).
So use late depth test / late depth write instead in this case.
(No piglit changes as it doesn't seem to hit such bogus early depth test
/ late depth write path.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-22 22:57:27 +02:00
Roland Scheidegger
82d7733b52 llvmpipe: fix issue with not writing new stencil values
We did mask checks between depth/stencil testing and depth/stencil write.
This meant that if the depth/stencil test killed off all fragments we never
actually wrote the new stencil value. This issue affected all early/late
test/write combinations.
So move the mask check after depth/stencil write (for early depth test,
could do the same for late depth test but might not be worth it at that
point so just skip it there).
This addresses https://bugs.freedesktop.org/show_bug.cgi?id=41787.
Piglit does not hit this issue because of the simple_shader optimization
in generate_fs_loop() which means we're skipping the mask checks.
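
To make the ordering problem concrete, here is a toy Python model of one
quad. It assumes stencil func EQUAL, fail op INCR and pass op KEEP, and all
names are illustrative; the real code generates LLVM IR instead.

```python
def run_quad(stencil, ref, check_mask_before_write):
    """Process one quad and return (new_stencil, shader_ran)."""
    passed = [value == ref for value in stencil]
    # Stencil op result: failing fragments increment, passing ones keep.
    updated = [value if ok else value + 1
               for value, ok in zip(stencil, passed)]

    if check_mask_before_write and not any(passed):
        # Old order: bail out once no fragment is alive, which also
        # skips the stencil write.
        return list(stencil), False

    new_stencil = updated       # depth/stencil write happens first
    shader_ran = any(passed)    # mask check only after the write
    return new_stencil, shader_ran


old_order = run_quad([0, 0, 0, 0], ref=1, check_mask_before_write=True)
new_order = run_quad([0, 0, 0, 0], ref=1, check_mask_before_write=False)
```

When every fragment fails the test, the old order leaves the stencil buffer
untouched while the new order still applies the fail op.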

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-22 22:57:27 +02:00
Roland Scheidegger
3c91ef0f29 llvmpipe: (trivial) remove confusing code in stencil test
This was meant to disable some code which isn't needed when depth/stencil
isn't written. However, there's more code which wouldn't be needed in that
case so having the condition there was just odd (llvm will drop all the code
anyway).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-22 22:57:27 +02:00
Roland Scheidegger
5314f5d829 llvmpipe: fix bug in early depth test / late depth write handling
The wrong type was used if the format was less than 32 bits.
No piglit changes as it doesn't hit that path.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-22 22:57:27 +02:00
Alexander von Gluck IV
6d20e251f2 Haiku: Add Gallium winsys and target code
* We generate a static library for Haiku
  Gallium targets as our port system combines
  the compiled rendering code into a modular
  ar for each module (for example, our port
  system combines llvm libsoftpipe.a libllvmpipe.a
  into a single ar for the Haiku build system).
  I'd like the Gallium hgl target scons build
  system to do this some day, however how is
  beyond me at the moment. This is a first step.
2013-05-22 14:31:44 -05:00
Chia-I Wu
ff68f61bed ilo: set more fields of 3DSTATE_DEPTH_BUFFER
Set lod/layer related fields of 3DSTATE_DEPTH_BUFFER.  Since we always point
to a single level/layer, those fields are always zero and this commit
effectively makes no change.

While at it, make it easier to disable manual slice offset calculation.
2013-05-22 20:25:57 +08:00
Chia-I Wu
f3da711bea ilo: correctly set view extent in SURFACE_STATE
The view extent was set to be the same as the depth while it should be set to
the number of layers.  It makes a difference for 3D textures.

Also use this as a chance to clean up the code.
2013-05-22 18:12:01 +08:00
Chia-I Wu
bbb30398e5 ilo: avoid unnecessary emission of SO states
No need to emit 3DSTATE_SO_BUFFER and 3DSTATE_SO_DECL_LIST when SO is
disabled.  As the implicit flush done by the commands is also gone, emit an
explicit flush.
2013-05-22 18:09:17 +08:00
Eric Anholt
08f87ac333 i965: Skip etc-to-rgb transcode on BayTrail.
The hardware does it, so no need for this workaround.

Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-20 23:04:32 -07:00
Eric Anholt
c245efe7e8 mesa: Remove extension checking from ChooseTexFormat.
This should already be handled by _mesa_base_tex_format() calls in
TexImage*.
2013-05-21 15:20:28 -07:00
Eric Anholt
36e7c01101 mesa: Add ChooseTexFormat support for the new XBGR formats. 2013-05-21 15:20:28 -07:00
Kenneth Graunke
b29381567a i965: Split BeginTransformFeedback hook into Gen6 and Gen7+ variants.
Most of the work in BeginTransformFeedback is only necessary on Gen6.
We may as well just skip it on Gen7+.

v2: Add an intel->gen == 6 assert.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-21 13:29:40 -07:00
Kenneth Graunke
64a87f29ce i965: Kill software primitive counting entirely.
Now that we have hardware contexts, we don't need to continually
reprogram the GS_SVBI_INDEX registers.  They're automatically saved and
restored with the context, so they can just increment over time.  We
only need to reset them when starting transform feedback.

There's also no reason to delay until the next drawing operation; we can
just emit the packet immediately.  However, this means we must drop the
initialization in brw_invariant_state, as BeginTransformFeedback may
occur before the first drawing in a context.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-21 13:29:27 -07:00
Kenneth Graunke
647fc0c50b i965: Remove software geometry query code.
EXT_transform_feedback isn't yet supported on Gen4-5, so none of this
query code is actually used.  This also means we can remove some of the
surrounding support code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-21 13:29:25 -07:00
Kenneth Graunke
b863d44451 i965: Delete unused brw->sol.offset_0_batch_start field.
This was only used for the non-hardware context code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-21 13:29:24 -07:00
Kenneth Graunke
eaa6fbe6d5 i965: Stop using the kernel SOL reset feature.
We can just do it ourselves with MI_LOAD_REGISTER_IMM.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-21 13:29:22 -07:00
Kenneth Graunke
6837ebd00f i965: Remove dead code for Gen7 SOL without hardware contexts.
Failing to get a hardware context now means failing to load the driver,
so this code will never get hit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-21 13:29:19 -07:00
Kenneth Graunke
58765bb481 i965: Add a macro for accessing the SO_WRITE_OFFSET[0-3] registers.
Using a function-like macro makes it easy to loop over all four streams.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-21 13:29:06 -07:00
Ian Romanick
0ba1e65fb6 docs: Import 9.1.3 release notes, add news item.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-21 13:16:56 -07:00
Michel Dänzer
d42a2df19c radeonsi: Fix user clip planes
4 more little piglits.

NOTE: This is a candidate for the 9.1 branch.
2013-05-21 17:50:13 +02:00
Michel Dänzer
e3befbca5e radeonsi: Handle TGSI_SEMANTIC_CLIPVERTEX
17 more little piglits.

NOTE: This is a candidate for the 9.1 branch.
2013-05-21 17:50:13 +02:00
Michel Dänzer
eb19163a4d radeonsi: Initial support for multiple constant buffers
Just enough to support an additional internal constant buffer for the user
clip planes.

NOTE: This is a candidate for the 9.1 branch.
2013-05-21 17:50:12 +02:00
Michel Dänzer
4730dea5f5 radeonsi: Fix handling of TGSI_SEMANTIC_PSIZE
Two more little piglits.

NOTE: This is a candidate for the 9.1 branch.
2013-05-21 17:50:12 +02:00
Marek Olšák
2eac0aa1d8 radeonsi: increase array size for shader inputs and outputs
and add assertions to prevent buffer overflow. This fixes corruption
of the si_shader struct.

NOTE: This is a candidate for the 9.1 branch.

[ Cherry-pick of r600g commit da33f9b919 ]

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-05-21 17:47:44 +02:00
Brian Paul
9772284df2 xlib: check for null ctx pointer in glXIsDirect()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64745
Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-21 07:35:12 -06:00
Brian Paul
1e9875acbe st/glx/xlib: check for null ctx pointer in glXIsDirect()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64745
Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-21 07:35:12 -06:00
José Fonseca
8cabc7be1d scons: Don't force stabs debug format for Mingw.
- recent gdb handles DWARF fine (tested both with version
  7.1.90.20100730 from mingw-w64 project, and 7.5-1 from mingw project)

- http://people.freedesktop.org/~jrfonseca/bfdhelp/ was updated to
  handle DWARF

- stabs requires ugly hacks to prevent compilation failures

- mixing stabs/dwarf prevents proper backtraces (which is inevitable,
  given that the MinGW C runtime is pre-built with DWARF)

For example, without this change I get:

  (gdb) bt
  #0  _wassert (_Message=0xf925060 L"Num < NumOperands && \"Invalid child # of SDNode!\"",
      _File=0xf60b488 L"llvm/include/llvm/CodeGen/SelectionDAGNodes.h", _Line=534)
      at ../../../../mingw-w64-crt/misc/wassert.c:51
  #1  0x0368996b in _assert (_Message=0x39d7ee4 "Num < NumOperands && \"Invalid child # of SDNode!\"",
      _File=0x39d7e94 "llvm/include/llvm/CodeGen/SelectionDAGNodes.h", _Line=534)
      at ../../../../mingw-w64-crt/misc/wassert.c:44
  #2  0x00000004 in ?? ()
  #3  0x00000004 in ?? ()
  #4  0x0f60b488 in ?? ()
  #5  0x00000000 in ?? ()

While with this change I get:

  (gdb) bt
  #0  _wassert (_Message=0xfb982e8 L"Num < NumOperands && \"Invalid child # of SDNode!\"",
      _File=0xefbcb40 L"llvm/include/llvm/CodeGen/SelectionDAGNodes.h", _Line=534)
      at ../../../../mingw-w64-crt/misc/wassert.c:51
  #1  0x039c996b in _assert (_Message=0x3d17f24 "Num < NumOperands && \"Invalid child # of SDNode!\"",
      _File=0x3d17ed4 "llvm/include/llvm/CodeGen/SelectionDAGNodes.h", _Line=534)
      at ../../../../mingw-w64-crt/misc/wassert.c:44
  #2  0x033111cc in getOperand (Num=4, this=<optimized out>)
      at llvm/include/llvm/CodeGen/SelectionDAGNodes.h:534
  #3  getOperand (i=4, this=<optimized out>)
      at llvm/include/llvm/CodeGen/SelectionDAGNodes.h:779
  #4  llvm::SelectionDAG::getNode (this=0xf00cb08, Opcode=79, DL=..., VT=..., N1=..., N2=...)
      at llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:2859
  #5  0x03377b20 in llvm::SelectionDAGBuilder::visitExtractElement (this=0xfb45028, I=...)
      at llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:2803
  [...]

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-21 12:34:19 +01:00
Chia-I Wu
2b7463cf3a ilo: use BLT engine to copy between textures
Emit XY_SRC_COPY_BLT to do the job.  Since ETC1 textures cannot be mapped for
reading, as is required by util_copy_resource_region, this fixes copying of
ETC1 textures.
2013-05-21 12:02:55 +08:00
Chia-I Wu
c44ebb4ef4 ilo: use BLT engine to copy between buffers
Emit (possibly multiple) SRC_COPY_BLT to copy between buffers of arbitrary
sizes.
2013-05-21 11:47:20 +08:00
Chia-I Wu
731cafe7b2 ilo: refactor blitter_xy_color_blt()
Add gen6_XY_COLOR_BLT() and let blitter_xy_color_blt() call the function.  Not
sure if this path is still being hit by any application.
2013-05-21 11:47:20 +08:00
Chia-I Wu
0d42a9e941 ilo: replace cp hooks by cp owner and flush callback
The problem with cp hooks is that when we switch from 3D ring to 2D ring, and
when there are active queries, we will emit 3D commands to 2D ring because
the new-batch hook is called.

This commit introduces the idea of cp owner.  When the cp is flushed, or when
another owner takes place, the current owner is notified, giving it a chance
to emit whatever commands there need to be.  With this mechanism, we can
resume queries when the 3D pipeline owns the cp, and pause queries when it
loses the cp.  Ring switch will just work.

As we still need to know when the cp bo is reallocated, a flush callback is
added.
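
The ownership hand-off described above can be sketched roughly as follows. This is a minimal illustration, not ilo's code: the names cp_owner, cp_set_owner, cp_flush and pause_queries are hypothetical, and the real driver also has to resume queries and submit the batch.

```c
#include <stddef.h>

/* Hypothetical owner record: `release` is called when ownership is lost. */
struct cp_owner {
   void (*release)(void *data);
   void *data;
};

/* Hypothetical command parser state; only the owner is modelled here. */
struct cp {
   const struct cp_owner *owner;
};

/* Example release callback: counts how often queries were paused. */
void
pause_queries(void *data)
{
   *(int *)data += 1;
}

/* Hand the cp to a new owner, notifying the current one first. */
void
cp_set_owner(struct cp *cp, const struct cp_owner *owner)
{
   if (cp->owner == owner)
      return;
   if (cp->owner && cp->owner->release)
      cp->owner->release(cp->owner->data);
   cp->owner = owner;
}

/* Flushing also takes the cp away from its current owner. */
void
cp_flush(struct cp *cp)
{
   cp_set_owner(cp, NULL);
   /* A real implementation would submit the batch here. */
}
```

With this shape, a ring switch is just another cp_set_owner() call: the 3D pipeline gets a chance to pause its queries before any 2D command is emitted.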
2013-05-21 11:47:20 +08:00
Chia-I Wu
a04d8574c6 ilo: hardware contexts are only for the render ring
The hardware context should not be passed for bo execution when the ring is
not the render ring.  Rename hw_ctx to render_ctx for clarity.
2013-05-21 11:47:19 +08:00
Chia-I Wu
1ed7b825cf ilo: update format mappings
Add more PIPE_FORMAT -> BRW_SURFACEFORMAT mappings, and update
surface_format_info from i965.
2013-05-21 11:47:19 +08:00
Chia-I Wu
bd8090a5af ilo: update headers from i965
Mainly for MI_LOAD_REGISTER_IMM and BCS_SWCTRL.
2013-05-21 11:47:19 +08:00
Anuj Phogat
06cd89a88c i965: Fix build failure
meta.h should be included in brw_state_upload.c to get access to
function _mesa_meta_in_progress().
2013-05-20 16:15:57 -07:00
Kenneth Graunke
f09b91f782 i965: Implement transform feedback query support in hardware on Gen6+.
Now that we have hardware contexts and can use MI_STORE_REGISTER_MEM,
we can use the GPU's pipeline statistics counters rather than going out
of our way to count primitives in software.

Aside from being simpler, this also paves the way for Geometry Shaders,
which can output an arbitrary number of primitives on the GPU.  It will
also allow us to use hardware primitive restart when these queries are
in use.

The GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query is easy: it
corresponds to the SO_NUM_PRIMS_WRITTEN/SO_NUM_PRIMS_WRITTEN0_IVB
counters.

The GL_PRIMITIVES_GENERATED query is trickier.  Gen provides several
statistics registers which /almost/ match the semantics required:
- IA_PRIMITIVES_COUNT
  The number of primitives fetched by the VF or IA (input assembler).
  This undercounts when GS is enabled, as it can output many primitives.
- GS_PRIMITIVES_COUNT
  The number of primitives output by the GS.  Unfortunately, this
  doesn't increment unless the GS unit is actually enabled, and it
  usually isn't.
- SO_PRIM_STORAGE_NEEDED*_IVB
  The amount of space needed to write primitives output by transform
  feedback.  These naturally only work when transform feedback is on.
  We'd also have to add the counters for all four streams.
- CL_INVOCATION_COUNT
  The number of primitives processed by the clipper.  This doesn't work
  if the GS or SOL throw away primitives for rasterizer discard.
  However, it does increment even if the clipper is in REJECT_ALL mode.

Dynamically switching between counters would be painfully complicated,
especially since GS, rasterizer discard, and transform feedback can all
be switched on and off repeatedly during a single query.

The most usable counter is CL_INVOCATION_COUNT.  The previous two
patches reworked rasterizer discard support so that all primitives hit
the clipper, making this work.

v2: Occlusion query bug fixes removed and squashed in earlier patches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:18 -07:00
Kenneth Graunke
037a901a5b i965: Handle rasterizer discard in the clipper rather than GS on Gen6.
This has more of a negative impact than the previous patch, as on Gen6
passing primitives through to the clipper means we actually have to make
the GS thread write them to the URB.

I don't see another good solution though, and rasterizer discard is not
the most common of cases, so hopefully it won't be too terrible.

v2: Add a perf_debug; resolve rebase conflicts on the brw dirty flags;
    remove the rasterizer_discard field from brw_gs_prog_key.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:18 -07:00
Kenneth Graunke
d1e4e9960c i965: Handle rasterizer discard in the clipper rather than SOL on Gen7.
In order to implement the GL_PRIMITIVES_GENERATED query in a sane
fashion on our hardware, we can't discard primitives until the clipper.
The patch after next explains the rationale.

By setting the clipper to REJECT_ALL mode, all primitives get thrown away,
so rendering is still appropriately disabled.

This may negatively impact performance in the rasterizer discard case,
but it's unclear how much and this hasn't been observed to be a
bottleneck in any application we've looked at.  The clipper is the very
next stage in the pipeline, so I don't think it will be terrible.

v2: Add a perf_debug; resolve rebase conflicts on the brw dirty flags.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:18 -07:00
Kenneth Graunke
5ebe9523f9 i965: Disable clipper statistics when meta operations are in progress.
We don't currently use the clipper statistics, but we'll soon use
CL_INVOCATION_COUNT to implement the GL_PRIMITIVES_GENERATED query.
The number of primitives generated is not supposed to be altered during
operations such as glGenerateMipmap.

Prevents spec/EXT_transform_feedback/generatemipmap prims_generated
from breaking when we start using pipeline statistics registers to
implement the GL_PRIMITIVES_GENERATED query in a few commits.

v2: Use the BRW_NEW_META_IN_PROGRESS flag for correct state handling.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:18 -07:00
Kenneth Graunke
b96f93c453 i965: Create a BRW_NEW_META_IN_PROGRESS state flag.
This will allow us to disable statistics during meta operations.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-20 13:03:18 -07:00
Kenneth Graunke
bbf86712f8 i965: Add #defines for the pipeline statistics counter registers.
These come from the Ivybridge PRM, Volume 1, Part 3.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:18 -07:00
Kenneth Graunke
e32cd5ffbb i965: Rely on hardware contexts for query objects on Gen6+.
Hardware contexts greatly simplify the query object code.  The pipeline
statistics counters get saved and restored with the context, which means
that we don't need to worry about other workloads polluting them.

This means that we can simply write a single pair of values (one at
BeginQuery and one at EndQuery) rather than a series of pairs.  This
also means we don't need to worry about the BO getting full.  We also
don't need to delay BO allocation and starting snapshot until the first
draw.
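
As a rough sketch of the single-pair scheme, assuming a 64-bit counter that is snapshotted once at BeginQuery and once at EndQuery (the struct and function names are hypothetical, not the driver's):

```c
#include <stdint.h>

/* Hypothetical pair of counter snapshots stored in the query BO. */
struct query_snapshots {
   uint64_t begin;   /* counter value written at BeginQuery */
   uint64_t end;     /* counter value written at EndQuery */
};

/* The query result is simply the difference of the two snapshots.
 * Unsigned subtraction wraps modulo 2^64, so the delta stays correct
 * even if the counter wrapped once between the snapshots. */
uint64_t
query_result(const struct query_snapshots *q)
{
   return q->end - q->begin;
}
```

Because the counter is saved and restored with the context, nothing needs to be accumulated across batches, which is why one pair is enough.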

The generation split here is a little off: technically, Ironlake can also
support hardware contexts.  However, the kernel currently doesn't, and
even if it were to do so someday, we'd need to wait a while before
bumping the kernel requirement to take advantage of it.

v2: Incorporate Paul's feedback.
- Clarify which functions are Gen4/5-only via assertions and comments.
- Change how driver hook initialization happens.
- Update comments.
- Squash a bug fix from a later commit here where it belongs.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:18 -07:00
Kenneth Graunke
72b1e440dd i965: Disable pixel statistics in BLORP.
BLORP is used for operations like glClear, glCopyTexImage, and
glBlitFramebuffer which aren't supposed to contribute fragments toward
occlusion queries.

This prevents Piglit tests from breaking in the next commit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:17 -07:00
Kenneth Graunke
92d2f5acfa i965: Require hardware contexts (and thus Kernel 3.6) on Gen6+.
Hardware contexts are necessary to reasonably support OpenGL 3.2.
In particular, we currently maintain software counters for transform
feedback buffer offsets and counters, which relies on knowing the number
of primitives generated.  Geometry shaders violate that assumption.

At the time of writing, Debian has moved to Kernel 3.8, which means most
people probably have a newer kernel by now.  It's also worth noting that
this patch won't land until Mesa 10 which is currently targeted for
September.  By that point, even more people will have a newer kernel.

Also, don't bother trying to allocate contexts on pre-Gen6, as it
currently will always fail, and if this changes in the future, we'll
need to reevaluate our hw_ctx/gen checks.

This patch leaves the code for flagging BRW_NEW_CONTEXT on new
batchbuffers if hw_ctx == NULL since that still occurs pre-Gen6.

Also remove the Gen7+ check for kernel 3.3, since it's now redundant.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:17 -07:00
Kenneth Graunke
50e60bf8da i965: Bump kernel requirement to 3.3 on Ivybridge.
Kernel 3.3 introduced the SOL reset execbuf parameter, needed for GL 3.0
on Ivybridge.  Bumping the requirement will give an obvious error
message rather than simply reporting GL 2.1.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-20 13:03:17 -07:00
Vincent Lejeune
9fd7ea786c r600g/llvm: fix cubemap lod/bias
2013-05-20 20:23:19 +02:00
Vincent Lejeune
9a95fb1605 r600g/llvm: Fix texelFetchOffset-2D
2013-05-20 20:23:14 +02:00
Vincent Lejeune
32c9cbb38f r600g/llvm: Fix cubearray textureSize
2013-05-20 20:23:09 +02:00
Vincent Lejeune
9c2943601e r600g/llvm: Factorize code loading from const buffer.
2013-05-20 20:23:04 +02:00
Kenneth Graunke
01b79b2e3b i965: Add cases for ir_triop_vector_insert that assert.
brw_link_shader() unconditionally calls lower_vector_insert() with true
as the second parameter.  This means that both constant and variable
indexed expressions will get lowered, so we should never see this in the
backend.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-20 10:19:48 -07:00
Kenneth Graunke
e1e8876797 i965: Add cases for ir_binop_vector_extract that assert.
do_vec_index_to_swizzle() should remove any vector extract operations
with a constant index.  It's unconditionally called from
do_common_optimization().

do_vec_index_to_cond_assign() should remove the rest, and it is
unconditionally called from brw_link_shader().  This means that we
should never see ir_binop_vector_extract in the backend.

Silences compiler warnings.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-20 10:19:30 -07:00
Roland Scheidegger
f6beb4c6b6 llvmpipe: enable z32s8x24 format
Now that we can handle it both for sampling and as depth/stencil, enable it.
Passes nearly all additional piglit tests which are now performed, with two
exceptions (one being a framebuffer blit which fails for all other formats
including stencil too, as we don't support stencil blits, the other reporting
an unexpected GL error, so it doesn't look to be llvmpipe's fault).
2013-05-18 00:32:45 +02:00
Roland Scheidegger
070a9afb54 llvmpipe: handle z32s8x24 depth/stencil format
We need to split up the depth and stencil values in this case, and there's
some new logic required to handle float depth and stencil simultaneously.
Also make sure we get the 64bit zs clear values and masks propagated
correctly.
2013-05-18 00:32:33 +02:00
Roland Scheidegger
f3ad716e8f llvmpipe: get rid of unused tiled/linear logic
We have been rendering to linear color buffers for quite some time, and since
switching to linear depth buffers all the tiled/linear logic was unused.
So get rid of (most of) it - there's still some LAYOUT_NONE things and
late allocation of resources which probably could be simplified.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-18 00:32:27 +02:00
Roland Scheidegger
87978518e9 llvmpipe: fix bogus handling of first_layer when setting up texture sampling
The code avoided first_layer parameter in the sampler interface (and needing
to do another calculation at runtime) by fixing up the base texture pointer
instead. Unfortunately, this didn't actually work as we have mip-first
texture layout so fixing up the base ptr by a fixed amount is very wrong if
there are mipmaps present. The wrong offsets caused misrendering and crashes.
Fix this by just adjusting the individual mip level offsets instead.
Spotted by Jose.
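
To illustrate why a single base-pointer fixup fails with a mip-first layout, here is a minimal sketch. It assumes each mip level stores all of its layers contiguously and that the per-layer stride shrinks with the level; the struct and function names are hypothetical, not llvmpipe's.

```c
#include <stddef.h>

#define MAX_LEVELS 16

/* Hypothetical description of a mip-first array texture. */
struct mip_layout {
   unsigned num_levels;
   size_t mip_offset[MAX_LEVELS];    /* byte offset where each level starts */
   size_t layer_stride[MAX_LEVELS];  /* bytes per layer at each level */
};

/* Compute per-level offsets that start at `first_layer`.  Each level
 * needs its own adjustment because the layer stride differs per level;
 * shifting the base pointer by first_layer * layer_stride[0] would only
 * be right for level 0. */
void
apply_first_layer(const struct mip_layout *tex, unsigned first_layer,
                  size_t out_offsets[MAX_LEVELS])
{
   for (unsigned l = 0; l < tex->num_levels; l++)
      out_offsets[l] = tex->mip_offset[l] +
                       (size_t)first_layer * tex->layer_stride[l];
}
```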

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-18 00:32:18 +02:00
Roland Scheidegger
d7e811c0b0 gallivm: handle z32s8x24 format for sampling
Since we can only sample either depth or stencil but not both, only load
the required bits, which makes things a bit easier (it requires special
handling since the format doesn't fit into 32bit).
The logic for deciding if depth or stencil should be sampled is a bit odd,
but seems to be what other drivers and statetrackers do: if it's a format with
both depth and stencil (or just with depth) then sample depth; for sampling
stencil, a sampler view format with only stencil is required.
Also while here fix up stencil sampling for other formats as well, though
this isn't supported by mesa (ARB_stencil_texturing), and while blits would
use it they don't work either since they'd also need stencil export.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-18 00:31:49 +02:00
Roland Scheidegger
0346e9b3bb st/mesa: fix weird UCMP opcode use for bool ubo load
I don't know what this code was trying to do, but whatever it was, it couldn't
have worked: negation of integer boolean inputs, while not specified as
outright illegal (not yet at least), won't do anything since it doesn't affect
the result of comparison with zero at all. In fact it looks like the whole
instruction can just be omitted.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-05-18 00:31:49 +02:00
Eric Anholt
a5b0452400 mesa: Make FinishRenderTexture just take the renderbuffer being finished.
Now that the rb has a reference to the teximage, we didn't need anything
else out of the attachment.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-17 13:04:05 -07:00
Eric Anholt
e98c39c109 mesa: Track the TexImage being rendered to in the gl_renderbuffer.
We keep having to pass the attachments around with our gl_renderbuffers
because that's the only way to find what the gl_renderbuffer actually
refers to.  This is a step toward removing that (though drivers still need
the Zoffset as well).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-17 13:04:05 -07:00
Eric Anholt
7b085d1bfa radeon: Remove dead radeon_wrap_texture().
I should have killed this in my previous cleanup.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-17 13:04:04 -07:00
Eric Anholt
c810e67c55 mesa: Make gl_renderbuffers backed by EGL images use FinishRenderTexture.
This is the opportunity that radeon and intel drivers rely on for flushing
render targets that may get reused as textures.  Before EGL, that only
happened for GL_TEXTURE attachments.

Fixes piglits:
KHR_gl_renderbuffer_image/renderbuffer-texture
OES_EGL_image/renderbuffer-texture

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-17 13:04:04 -07:00
José Fonseca
6166ffeaf7 gallivm: Eliminate 8.8 fixed point intermediates from AoS sampling path.
This change was meant as a stepping stone to using the PMADDUBSW SSSE3
instruction, but this refactoring by itself yields a 10% speedup on
texture-intensive shaders (e.g., Google Earth's ocean water w/o S3TC on
an Ivy Bridge machine) while yielding exactly the same results,
whereas PMADDUBSW only gave an extra 5%, at the expense of
2 bits of precision in the interpolation.

I believe that the speedup of this change comes from the reduced register
pressure (as 8.8 fixed point intermediates take twice the space of 8bit
unorm).

Also, not dealing with 8.8 simplifies lp_bld_sample_aos.c code
substantially -- it's no longer necessary to have code duplicated for
low and high register halves.

Note about lp_build_sample_mipmap(): the path for num_quads > 1 is never
executed (as it is faster on AVX to split the 256bit wide texture
computation into two 128bit chunks, in order to leverage integer
opcodes).  This path might be useful in the future, so in order to
verify this change did not break that path I had to apply this change:

  @@ -1662,11 +1662,11 @@ lp_build_sample_soa(struct gallivm_state *gallivm,
         /*
          * we only try 8-wide sampling with soa as it appears to
          * be a loss with aos with AVX (but it should work).
          * (It should be faster if we'd support avx2)
          */
  -      if (num_quads == 1 || !use_aos) {
  +      if (/* num_quads == 1 || ! */ use_aos) {

            if (num_quads > 1) {
               if (mip_filter == PIPE_TEX_MIPFILTER_NONE) {
                  LLVMValueRef index0 = lp_build_const_int32(gallivm, 0);
                  /*

and then run texfilt mesademo:

  LP_NATIVE_VECTOR_WIDTH=256 ./texfilt

Ran whole piglit without regressions.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-17 20:23:00 +01:00
José Fonseca
5aaa4bafe0 gallivm: Add and use lp_build_lerp_3d.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-17 20:22:50 +01:00
Tom Stellard
e230d9debb radeon/llvm: Run standard optimization passes on compute shader modules
The SROA and function inliner passes are especially important, because
they optimize away unsupported features: functions and indirect
private memory access.
2013-05-17 07:38:01 -07:00
Kenneth Graunke
ccb041fe8e intel: Don't spam "intelReadPixels: fallback to swrast" in non-PBO case.
When an application is using PBOs, we attempt to use the BLT engine to
perform ReadPixels.  If that fails due to some restrictions, it's useful
to raise a performance warning.

In the non-PBO case, we always use a CPU mapping since getting the data
into client memory requires a CPU-side copy.  This is a very common case,
so raising a performance warning is annoying.  In particular, apitrace's
image dumping code hits this path, causing it to print hundreds of
thousands of performance warnings via ARB_debug_output.  This tends to
obscure actual errors or other important messages.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-16 22:35:01 -07:00
Paul Berry
46ea804107 intel: Do a depth resolve before copying images between miptrees.
When intel_finalize_mipmap_tree() calls intel_miptree_copy_teximage()
to reassemble a depth miptree that has been broken apart into pieces
(to deal with misalignment of levels/layers within the miptree), it
just copies the depth data, not the HiZ data.  This is reasonable,
since the alignment restrictions of HiZ are a large part of the reason
why the miptree had to be broken apart in the first place.  However,
in order for the depth copy to be sufficient, we need to do a depth
resolve first, to make sure any deferred depth writes that are in the
HiZ buffer get performed.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=64662 and
https://bugs.freedesktop.org/show_bug.cgi?id=64659.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-05-16 14:42:54 -07:00
Niels Ole Salscheider
7e17e72cb7 r600g: fixup for MSAA texture support checking
Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
2013-05-16 12:03:47 -07:00
José Fonseca
4f518e1738 llvmpipe: Temporary workaround to prevent segfault on array textures.
2013-05-16 15:14:10 +01:00
José Fonseca
cb9913cdab gallivm: Support pointers in lp_build_print_value().
Trivial.
2013-05-16 15:14:10 +01:00
Chia-I Wu
435aea6f32 ilo: emit 3DSTATE_STENCIL_BUFFER on GEN7+
Whether HiZ is enabled or not, separate stencil is supported and enforced on
GEN7+.  Now that we support separate stencil resources, we know how to emit
3DSTATE_STENCIL_BUFFER.
2013-05-16 18:33:59 +08:00
Chia-I Wu
6b894e6900 ilo: add support for stencil resources on GEN7+
For allocations, we need to support stencil-only and separate stencil
resources.  For mapping, we need to support software tiling and
packing/unpacking for separate stencil resources.
2013-05-16 18:20:17 +08:00
Chia-I Wu
5c9b69d259 winsys/intel: test for and expose address swizzling
Without knowing whether addresses are swizzled or not, we cannot manipulate a
tiled surface in CPU.
2013-05-16 11:24:59 +08:00
Marek Olšák
639d0f73c1 st/mesa: handle texture_from_pixmap and other surface-based textures correctly
There were 2 issues with it:
1) The texture format which should be used for texturing was only set
   in gl_texture_image::TexFormat, which wasn't used for sampler views.
2) Textures are sometimes reallocated under some circumstances
   in st_finalize_texture, which is unacceptable if the texture comes
   from a window system.

The issues are resolved as follows:
1) If surface_based is true (texture_from_pixmap, etc.), store the format
   in a new variable st_texture_object::surface_format.
2) Don't reallocate a surface-based texture in st_finalize_texture.

Also don't use st_ChooseTextureFormat in st_context_teximage, because
the format is dictated by the caller.

This fixes the glx-tfp piglit test.

Reviewed-by: Adam Jackson <ajax@redhat.com>
2013-05-15 20:22:48 +02:00
Marek Olšák
5a3fac4d26 r600g: cleanup MSAA texture support checking
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-05-15 20:20:32 +02:00
Marek Olšák
61c995bc47 r600g: rewrite FMASK allocation, fix FMASK texturing with 2 and 4 samples
This fixes and enables texturing with compressed MSAA colorbuffers
on Evergreen and Cayman. For the first time, multisample textures work
on Cayman.

This requires the libdrm flag RADEON_SURF_FMASK.

v2: require libdrm_radeon 2.4.45

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-05-15 20:19:45 +02:00
Eric Anholt
61506257f6 i965: Fill in brw_format_for_mesa_format for some non-rendering formats.
This should have no change on driver operation, but it means that when you
wonder why some format isn't supported natively, you can just look at the
table above, instead of wondering if maybe there's an appropriate entry in
the surface formats table that is already supported.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-15 09:43:46 -07:00
Eric Anholt
9db9bc3aa1 i965: Use native RGB_FLOAT16 support when available.
Previously we would expand it to RGBA_FLOAT16.  This format now comes out
as framebuffer incomplete, but it seems worth the memory savings if that's
what people are asking for (and GL3 does list it under "texture-only"
color formats)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-15 09:43:46 -07:00
Eric Anholt
645b610b62 intel: Add support for blitting 6 byte-per-pixel formats.
The next commit introduces what is apparently our first one, which tripped
over this in glReadPixels.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-15 09:43:45 -07:00
Eric Anholt
028c11e8e3 i965: Use the Mesa surface formats for float RGB surfaces.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-15 09:43:45 -07:00
Eric Anholt
2e057076a8 i965: Use the new XRGB UNORM formats.
This is a step on the way to removing some of our code for forcing alpha
to 1, but I want easy bisecting so I'll add groups of formats separately.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-15 09:43:45 -07:00
José Fonseca
2a43dfda95 draw: More defensive coding in DRAW_GET_IDX.
Doesn't make a difference ATM, but just in case.
2013-05-15 16:59:28 +01:00
José Fonseca
1883e1d3e9 draw: Fix vsplit regression when the ib can be used directly.
`ib` is no longer offset by `istart`.

Trivial.
2013-05-15 16:57:44 +01:00
Chris Forbes
53a5f11f0d mesa: Stop clamping stencil reference value at specification time
All drivers now clamp this to the appropriate range for the bound
stencil buffer when emitting stencil state.

NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-15 22:04:53 +12:00
Chris Forbes
978f91b829 swrast: Use accessor for stencil reference values
NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-15 22:04:53 +12:00
Chris Forbes
db8a84de87 st: Use accessor for stencil reference values
NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-15 22:04:53 +12:00
Chris Forbes
c411f40cba radeon: Use accessor for stencil reference values
V2: Drop spurious mask with 0xff.

NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-15 22:04:34 +12:00
Chris Forbes
7bbe9b78ae nouveau: Use accessor for stencil reference values
NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-15 22:01:08 +12:00
Chris Forbes
f819ec46d5 intel: Use accessor for stencil reference values
NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-15 22:01:06 +12:00
Chris Forbes
96a1bf1ba3 mesa: Use accessor for stencil reference values in glGet
NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-15 22:01:03 +12:00
Chris Forbes
38f65162af mesa: add accessor for effective stencil ref
Clamps the stencil reference value to the range representable in the
currently-bound draw framebuffer's stencil attachment.
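
The clamp itself amounts to something like the sketch below. The function name and signature are hypothetical (the actual accessor reads the bit depth from the bound draw framebuffer); only the arithmetic is meant to be illustrative.

```c
#include <stdint.h>

/* Hypothetical helper: clamp a stencil reference value to the range
 * [0, 2^stencil_bits - 1] representable in a stencil buffer with the
 * given bit depth. */
int32_t
clamp_stencil_ref(int32_t ref, unsigned stencil_bits)
{
   /* Avoid shifting past the width of the type for very deep buffers. */
   const int32_t max = (stencil_bits >= 31)
      ? INT32_MAX
      : (int32_t)((1u << stencil_bits) - 1u);

   if (ref < 0)
      return 0;
   if (ref > max)
      return max;
   return ref;
}
```

Doing the clamp at use time rather than at specification time means the stored value survives a later change of stencil attachment.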

V2: Add spec quote.

NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-15 22:00:55 +12:00
Chia-I Wu
c68424bac4 ilo: clean up transfer format conversion
Map the bo directly, instead of calling transfer_map().
2013-05-15 15:21:50 +08:00
Chia-I Wu
cb57da421a ilo: rework transfer mapping method choosing
Always check if a bo is busy in choose_transfer_method() since we always need
to map it in either map() or unmap().  Also determine how a bo is mapped in
choose_transfer_method().
2013-05-15 15:21:50 +08:00
Chia-I Wu
b6c307744f ilo: refactor transfer mapping
Add tex_get_box_offset() to compute transfer offset from the pipe_box.  Add
tex_get_slice_stride() to compute slice stride for a transfer.
2013-05-15 15:21:50 +08:00
Chia-I Wu
5af8641ce0 ilo: no writeback without PIPE_TRANSFER_WRITE
We should not write staging data back when PIPE_TRANSFER_WRITE is not set.
2013-05-15 15:08:54 +08:00
Chia-I Wu
46bb33bc21 ilo: minor cleanups for transfers
Rename some functions and reorder some code.
2013-05-15 15:08:54 +08:00
Chia-I Wu
ca349e0217 ilo: simplify ilo_texture_get_slice_offset()
Always return a tile-aligned offset.  Also fix for W tiling.
2013-05-15 15:08:54 +08:00
Zack Rusin
013424678e draw/gs: fix extracting of the clip
The indices are not consecutive when using the geometry shader,
which means we were extracting nonexistent values. Create
an array of linear indices and always use it instead of the passed
indices. Found by Jose.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-14 04:04:08 -04:00
Kenneth Graunke
a6961f391a docs: Mark a few things as in progress.
2013-05-14 12:22:40 -07:00
Zack Rusin
5104ed3dbf draw: try to prevent overflows on index buffers
Pass in the size of the index buffer, when available, and use it
to handle out of bounds conditions. The behavior in the case of
an overflow needs to be the same as with other overflows in the
vertex processing pipeline meaning that a vertex should still
be generated but all attributes in it set to zero.
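
A minimal sketch of that behavior, assuming 16-bit indices and a fixed vertex size of four floats (both are illustrative choices, and fetch_indexed_vertex is a hypothetical name, not the draw module's function):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_ATTRIBS 4   /* illustrative vertex size, in floats */

/* Look up element `i` of an index buffer holding `index_count` entries
 * and copy the vertex it names into `out`.  If the index-buffer position
 * or the vertex it refers to is out of bounds, a vertex is still
 * produced, with every attribute set to zero. */
void
fetch_indexed_vertex(const uint16_t *indices, size_t index_count,
                     const float *vertices, size_t vertex_count,
                     size_t i, float out[NUM_ATTRIBS])
{
   memset(out, 0, NUM_ATTRIBS * sizeof(float));

   if (i >= index_count)
      return;               /* read past the end of the index buffer */

   size_t v = indices[i];
   if (v >= vertex_count)
      return;               /* index names a vertex that does not exist */

   memcpy(out, vertices + v * NUM_ATTRIBS, NUM_ATTRIBS * sizeof(float));
}
```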

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-14 03:10:56 -04:00
Zack Rusin
d5250da818 draw: use the total number of vertices for statistics
The number of vertices to fetch doesn't necessarily equal the
total number of input vertices, e.g. we might want to fetch
a single vertex but then draw it twice. Let's use the correct
number of input vertices in the statistics.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-14 03:10:33 -04:00
Zack Rusin
29853ab7b8 draw: don't crash on vertex buffer overflow
We would crash when the stride was bigger than the size of the buffer.
The correct behavior is to just fetch zeros in this case.
Unfortunately, with user_buffers there's no way to validate the size
because currently we're just not getting it. Adjust the draw interface
to pass the size along with the mapped buffer, which works perfectly
for buffer-backed vertex_buffers and, in the future, will allow
us to plumb user_buffer sizes through the same interface.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-14 03:09:32 -04:00
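The "fetch zeros instead of crashing" rule from commit 29853ab7b8 above can be sketched as follows. This is a Python model under stated assumptions, not the draw interface itself: the function name, the 32-bit float layout, and the parameter list are all hypothetical.

```python
import struct

def fetch_floats(buffer, size, stride, index, count):
    """Fetch `count` little-endian 32-bit floats for vertex `index`.

    `size` is the buffer size in bytes, passed along with the mapped
    buffer so that the fetch can be validated.
    """
    offset = index * stride
    nbytes = 4 * count
    # If the element does not fit inside the buffer, return zeros
    # rather than reading out of bounds.
    if offset < 0 or offset + nbytes > size:
        return (0.0,) * count
    return struct.unpack_from("<%df" % count, buffer, offset)
```

With a 16-byte buffer and a stride of 64, vertex 0 is still fetched normally, while vertex 1 falls outside the buffer and comes back as zeros.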
Zack Rusin
386327c48f gallivm/soa: implement indirect addressing in immediates
The support is analogous to the way we handle indirect addressing
in temporaries, except that we don't have to worry about storing
(after declarations) and thus we'll be able to keep using the old
code when indirect addressing isn't used. In other words, we're
still using constants directly, unless the instruction has an
immediate register with indirect addressing.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-14 03:09:15 -04:00
Zack Rusin
2866525b86 draw/gs: don't bind the tgsi state if we're using llvm paths
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-14 03:08:56 -04:00
Vinson Lee
ff256ec068 gallivm: Fix build with LLVM >= 3.4 r181680.
Tested-by: Laurent Carlier <lordheavym@gmail.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-05-14 09:06:14 -07:00
José Fonseca
36385c0bdf mesa/st: Temporary workaround for fdo bug 64568.
Effectively reverting the problematic hunk of
commit 614ee25077
2013-05-14 17:02:53 +01:00
Alex Deucher
29b8d6a1da radeonsi: add Hainan pci ids
Note: this is a candidate for the 9.1 branch

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-05-14 10:51:10 -04:00
Alex Deucher
d188f14941 radeonsi: update r600_get_llvm_processor_name for hainan
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-05-14 10:51:10 -04:00
Alex Deucher
4045c3d060 radeonsi: add support for hainan chips
Note: this is a candidate for the 9.1 branch

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-05-14 10:51:10 -04:00
José Fonseca
c475ae5d3d draw: Fix io_ptr/num_prims name in IR.
Trivial.
2013-05-14 15:36:37 +01:00
José Fonseca
2f3d939e36 graw/tgsi_dump: Fix gdb macro.
The macro was relying on "tokens" local variable to exist.
2013-05-14 15:36:37 +01:00
Vadim Girlin
560ddad261 r600g/sb: add missing cases for ARUBA chips
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-14 17:36:25 +04:00
Vadim Girlin
ecde4b07e2 r600g/sb: get rid of standard c++ streams
Static initialization of internal libstdc++ data related to iostream
causes segfaults with some apps.

This patch replaces all uses of std::ostream and std::ostringstream in sb
with custom lightweight classes.

Prevents segfaults with ut2004demo and probably some other old apps.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-14 17:36:25 +04:00
Vadim Girlin
57d1be0d2d r600g/sb: separate bytecode decoding and parsing
Parsing and IR construction are required for optimization only;
they are unnecessary if we only need to print a shader dump.
This should make the new disassembler more tolerant of any new
features in the bytecode.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-14 17:36:25 +04:00
Christian König
e195d301ae vl/vdpau: fix PresentationQueueQuerySurfaceStatus
The last queued surface always keeps displaying.

Fixing a problem with XBMC.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-14 15:16:15 +02:00
Chia-I Wu
176ad54c04 ilo: rework ilo_texture
Use ilo_buffer for buffer resources and ilo_texture for texture resources.  A
major cleanup is necessitated by the separation.
2013-05-14 16:07:22 +08:00
Chia-I Wu
768296dd05 ilo: rename ilo_resource to ilo_texture
In preparation for the introduction of ilo_buffer.
2013-05-14 16:01:25 +08:00
Chia-I Wu
528ac68f7a ilo: move transfer-related functions to a new file
Resource mapping is distinct from resource allocation, and is going to get
more and more complex.  Move the related functions to a new file to make the
separation clear.
2013-05-14 16:01:20 +08:00
Rodrigo Vivi
888fc7a891 i965: Add missing Haswell GT3 Desktop to IS_HSW_GT3 check.
NOTE: This is a candidate for stable branches.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 17:00:46 -07:00
Jordan Justen
a16a2d7147 i965: write layer if gl_Layer is used in VS
This is enabled by the AMD_vertex_shader_layer extension.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-13 13:57:57 -07:00
Jordan Justen
220f70667d glsl: add AMD_vertex_shader_layer support
This GLSL extension requires that AMD_vertex_shader_layer be
enabled by the driver.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-13 13:57:52 -07:00
Jordan Justen
c9e981b8fb extensions: add AMD_vertex_shader_layer
This extension will require driver support, so it must
be enabled by the driver.

http://www.opengl.org/registry/specs/AMD/vertex_shader_layer.txt

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-13 13:57:03 -07:00
Chad Versace
1776eeedd3 mesa: Expose GL_OES_texture_npot on GLES1
Mesa's extension table incorrectly lists GL_OES_texture_npot as
ES2-only. It's also an ES1 extension. This patch adds ES1 to the
extension's API mask.

From the GL_OES_texture_npot spec:
    OpenGL ES 1.0 or OpenGL ES 2.0 is required. This extension is
    written against OpenGL ES 1.1.12 and OpenGL ES 2.0.25.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:08:37 -07:00
Ian Romanick
a61a0dbed2 glsl: Death to array dereferences of vectors!
Now that all the places that used to generate array dereferences of
vectors have been changed to generate either ir_binop_vector_extract or
ir_triop_vector_insert (or both), remove all support for dealing with
this deprecated construct.

As an added safeguard, modify ir_validate to reject ir_dereference_array
of a vector.

v2: Convert tabs to spaces.  Suggested by Eric.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:19 -07:00
Ian Romanick
1e773626ee glsl: Generate correct ir_binop_vector_extract code for out and inout parameters
Like with type conversions on out parameters, some extra copies need to
occur to handle these cases.  The fundamental problem is that
ir_binop_vector_extract is not an lvalue, but out and inout parameters
must be lvalues.  A previous patch dealt with a similar problem in the
LHS of ir_assignment.

v2: Convert tabs to spaces.  Suggested by Eric.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:19 -07:00
Ian Romanick
c3bb07f875 glsl: Use vector-insert and vector-extract on elements of gl_ClipDistanceMESA
Variable indexing into vectors using ir_dereference_array is being
removed, so this lowering pass has to generate something different.

v2: Convert tabs to spaces.  Suggested by Eric.

v3: Simplify code slightly by assuming that elements of
gl_ClipDistanceMESA will always be vec4.  Suggested by Paul.

v4: Fairly substantial rewrite based on the rewrite of "glsl: Convert
lower_clip_distance_visitor to be an ir_rvalue_visitor"

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-13 12:05:19 -07:00
Ian Romanick
d13fbeea96 glsl: Remove some stale comments about ir_call
ir_call was changed long ago to be a statement rather than an
expression.  That makes this comment no longer valid.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-13 12:05:19 -07:00
Ian Romanick
065da16508 glsl: Convert lower_clip_distance_visitor to be an ir_rvalue_visitor
Right now the lower_clip_distance_visitor lowers variable indexing into
gl_ClipDistance into variable indexing into both the array
gl_ClipDistanceMESA and the vectors of that array.  For example,

    gl_ClipDistance[i] = f;

becomes

    gl_ClipDistanceMESA[i >> 2][i & 3] = f;

However, variable indexing into vectors using ir_dereference_array is
being removed.  Instead, ir_expression with ir_triop_vector_insert will
be used.  The above code will become

    gl_ClipDistanceMESA[i >> 2] =
        vector_insert(gl_ClipDistanceMESA[i >> 2], i & 3, f);

In order to do this, an ir_rvalue_visitor will need to be used.  This
commit is really just a refactor to get ready for that.

v4: Split the least amount of refactor from the rest of the code
changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-05-13 12:05:19 -07:00
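The rewrite described in commit 065da16508 above can be modelled in a few lines of Python. This is only a sketch of the index arithmetic: `vector_insert` here is a plain function standing in for ir_triop_vector_insert, and the packed array is a list of 4-element lists standing in for gl_ClipDistanceMESA.

```python
def vector_insert(vec, index, value):
    """Return a copy of `vec` with component `index` replaced by `value`."""
    result = list(vec)
    result[index] = value
    return result

def store_clip_distance(clip_distance_mesa, i, f):
    """Model of `gl_ClipDistance[i] = f` on the packed array of vec4s."""
    # i >> 2 selects the vec4, i & 3 selects the component inside it.
    clip_distance_mesa[i >> 2] = vector_insert(clip_distance_mesa[i >> 2], i & 3, f)
```

Because the whole vec4 is rebuilt and assigned, no variable index into a vector is ever needed on the left-hand side.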
Ian Romanick
3acb21517b glsl: Generate ir_binop_vector_extract for indexing of vectors
Now ir_dereference_array of a vector will never occur in the RHS of an
expression.

v2: Add back the { } around the if-statement body to make it more
readable.  Suggested by Eric.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:19 -07:00
Ian Romanick
89704eb1b0 glsl: Convert ir_binop_vector_extract in the LHS to ir_triop_vector_insert
The ast_array_index code can't know whether to generate an
ir_binop_vector_extract or an ir_triop_vector_insert.  Instead it will
always generate ir_binop_vector_extract, and the LHS and RHS have to be
re-written.

v2: Convert tabs to spaces.  Suggested by Eric.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:19 -07:00
Ian Romanick
ee7a6dad30 glsl: Add lowering pass for ir_triop_vector_insert
This will eventually replace do_vec_index_to_cond_assign.  This lowering
pass is called in all the places where do_vec_index_to_cond_assign or
do_vec_index_to_swizzle is called.

v2: Use WRITEMASK_* instead of integer literals.  Use a more concise
method of generating broadcast_index.  Both suggested by Eric.

v3: Use a series of scalar compares instead of a single vector compare.
Suggested by Eric and Ken.  It still uses 'if (cond) v.x = y;' instead
of conditional assignments because ir_builder doesn't do conditional
assignments, and I'd rather keep the code simple.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:19 -07:00
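The "series of scalar compares" mentioned in v3 of commit ee7a6dad30 above might look roughly like this when written out by hand. It is a Python model of the lowered form for a vec4, not the pass itself; the function name is hypothetical, and leaving the vector unchanged for an out-of-range index is an assumption of this sketch.

```python
def lowered_vector_insert(vec, index, value):
    """Model of vector_insert(vec, index, value) after lowering for a vec4."""
    x, y, z, w = vec
    # One scalar compare per component: 'if (cond) v.x = y;' style.
    if index == 0:
        x = value
    if index == 1:
        y = value
    if index == 2:
        z = value
    if index == 3:
        w = value
    return [x, y, z, w]
```

Each branch writes a single fixed component, so hardware without variable indexing into registers can execute the result.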
Ian Romanick
b881ddba7d glsl: Lower ir_binop_vector_extract to conditional moves
Lower ir_binop_vector_extract with a non-constant index to a series of
conditional moves.  This is exactly like ir_dereference_array of a
vector with a non-constant index.

v2: Convert tabs to spaces.  Suggested by Eric.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:19 -07:00
Ian Romanick
943de9cdea glsl: Lower ir_binop_vector_extract to swizzle
Lower ir_binop_vector_extract with a constant index to a swizzle.  This
is exactly like ir_dereference_array of a vector with a constant index.

v2: Convert tabs to spaces.  Suggested by Eric.

v3: Correctly call convert_vector_extract_to_swizzle in
ir_vec_index_to_swizzle_visitor::visit_enter(ir_call *ir).  Suggested by
Ken.

v4: Use CLAMP instead of MIN2(MAX2()).  Suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:18 -07:00
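For the constant-index case in commit 943de9cdea above, the lowering amounts to picking one swizzle component. The Python sketch below assumes the index is clamped to the valid component range (the commit only says CLAMP replaced MIN2(MAX2())); the helper names are hypothetical.

```python
def clamp(value, lo, hi):
    """Equivalent of a CLAMP(value, lo, hi) macro."""
    return max(lo, min(value, hi))

def extract_to_swizzle(num_components, const_index):
    """Return the swizzle letter that replaces vector_extract(v, const_index)."""
    # Assumption: out-of-range constants are clamped to [0, num_components - 1].
    return "xyzw"[clamp(const_index, 0, num_components - 1)]
```

For example, extracting component 2 of a vec4 becomes the swizzle `.z`.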
Ian Romanick
63e1147ea1 glsl: Refactor part of convert_vec_index_to_cond_assign
Use a first function that extracts the vector being indexed and the index
from the deref.  Call the second function that does the real work.

Coming patches will add a new ir_expression for variable indexing into a
vector.  Having the lowering pass split into two functions will make it
much easier to lower the new ir_expression.

v2: Convert tabs to spaces.  Suggested by Eric.

v3: Move some bits from a later patch back to this patch so that it
actually compiles.  Suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:18 -07:00
Ian Romanick
dafd6918f3 glsl: Add ir_triop_vector_insert
The new opcode is used to generate a new vector with a single field from
the source vector replaced.  This will eventually replace
ir_dereference_array of vectors in the LHS of assignments.

v2: Convert tabs to spaces.  Suggested by Eric.

v3: Add constant expression handling for ir_triop_vector_insert.  This
prevents the constant matrix inversion tests from regressing.  Duh.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:18 -07:00
Ian Romanick
f274a2ca87 glsl: Add ir_binop_vector_extract
The new opcode is used to get a single field from a vector.  The field
index may not be constant.  This will eventually replace
ir_dereference_array of vectors.  This is similar to the extractelement
instruction in LLVM IR.

http://llvm.org/docs/LangRef.html#extractelement-instruction

v2: Convert tabs to spaces.  Suggested by Eric.

v3: Add array index range checking to ir_binop_vector_extract constant
expression handling.  Suggested by Ken.

v4: Use CLAMP instead of MIN2(MAX2()).  Suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-13 12:05:18 -07:00
Paul Berry
b0bb6103d2 glsl: Fix "make check" breakage after adding options to do_common_optimization.
Commit b765740 (glsl: Pass struct shader_compiler_options into
do_common_optimization.) added a new parameter to
do_common_optimization() but didn't update test_optpass.cpp, causing
"make check" to break.

This patch makes the proper updates to test_optpass.cpp so that the
build succeeds again.
2013-05-13 07:55:37 -07:00
Kenneth Graunke
e413d3f15c glsl: Add a pass to flip matrix/vector multiplies to use dot products.
This pass flips (matrix * vector) operations to (vector *
matrixTranspose) for certain built-in matrices (currently
gl_ModelViewProjectionMatrix and gl_TextureMatrix).

This is equivalent, but results in dot products rather than multiplies
and adds.  On some hardware, this is more efficient.

This pass is conditionalized on ctx->mvp_with_dp4, the flag drivers set
to indicate they prefer dot products.

Improves performance in Lightsmark by 1.01131% +/- 0.162069% (n = 10)
on a Haswell GT2 system.  Passes Piglit on Ivybridge.

v2: Use struct gl_shader_compiler_options instead of plumbing through
    another boolean flag for this purpose.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-12 09:36:46 -07:00
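The equivalence that commit e413d3f15c above relies on can be checked numerically. The sketch below is plain Python, with matrices stored as lists of columns in the GLSL column-major sense; the function names are hypothetical and nothing here reflects how the pass is implemented.

```python
def transpose(m):
    """Transpose a matrix stored as a list of columns."""
    return [list(row) for row in zip(*m)]

def matrix_times_vector(m, v):
    """M * v as multiplies and adds: a sum of scaled columns."""
    result = [0.0] * len(m[0])
    for column, scalar in zip(m, v):
        for r in range(len(result)):
            result[r] += column[r] * scalar
    return result

def vector_times_matrix(v, m):
    """v * M as dot products: one dot(v, column) per result component."""
    return [sum(a * b for a, b in zip(v, column)) for column in m]
```

`matrix_times_vector(m, v)` and `vector_times_matrix(v, transpose(m))` give the same result; the second form needs one dot product per component, which is what a DP4-preferring backend wants.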
Kenneth Graunke
72a0b7a435 i965/vs: Set the PreferDP4 shader compiler option.
Doing matrix multiplies with DP4s is fewer instructions than MUL/ADD,
especially since we don't support MAD in the vertex shader.

Not observed to improve performance in any fixed function applications,
but is useful for the next patch.

I've left this unset for the fragment shader because the scalar backend
can't use DP4 and does have MAD support.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-12 09:36:44 -07:00
Kenneth Graunke
bbf029f7cf mesa: Move the mvp_with_dp4 flag to ShaderCompilerOptions.
This flag essentially tells the compiler whether it prefers
dot products or multiply/adds for matrix operations.  As such,
ShaderCompilerOptions seems like the right place for it.

This also lets us specify it on a per-stage basis.  This patch makes all
existing users set the flag for the Vertex Shader stage only, as it's
currently only used for fixed-function vertex programs.  That will
change soon, and I wanted to preserve the existing behavior.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-12 09:36:43 -07:00
Kenneth Graunke
b765740a66 glsl: Pass struct shader_compiler_options into do_common_optimization.
do_common_optimization may need to make choices about whether to emit
certain kinds of instructions.  gl_context::ShaderCompilerOptions
contains exactly that information, so it makes sense to pass it in.

Rather than passing the whole array, pass the structure for the stage
that's currently being worked on.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-12 09:36:41 -07:00
Kenneth Graunke
6bb9acfb4e glsl: Initialize ctx->ShaderCompilerOptions in standalone scaffolding.
This code is copied from _mesa_init_shader_state().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-12 09:36:39 -07:00
Kenneth Graunke
1c95cea40b glsl: Copy _mesa_shader_type_to_index() to standalone scaffolding.
We can't include shaderobj.h from the standalone utilities, so we
unfortunately have to copy this function.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-12 09:36:18 -07:00
Kenneth Graunke
a67b18e5a7 mesa: Add comments about bit-ordering of new XRGB/XBGR formats.
Marek added these new formats in commit f9fa725690, but
without comments relating to the packing.  Sometimes the naming is
confusing, so these comments are helpful in determining whether two
formats are compatible.

The new comments are based on my reading of format_unpack.c.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-05-12 09:32:42 -07:00
Marek Olšák
f486c52f9e st/mesa: remove dependency on _NEW_BUFFER_OBJECT for vertex arrays
_NEW_BUFFER_OBJECT means glBufferData was called. We can just set our own
flag in BufferData.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:59:20 +02:00
Marek Olšák
b88cebb634 st/mesa: don't check for _NEW_PROGRAM when binding UBOs
Probably copied from i965. However st/mesa has its flags ST_NEW_xxx_PROGRAM.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:45:02 +02:00
Marek Olšák
a17e87d4eb st/mesa: fix a couple of issues in st_bind_ubos
- don't reference a buffer for a local variable
  (that's never useful unless it can be the only reference to the buffer)
- check if the buffer is not NULL
- set buffer_size as specified with BindBufferRange

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:45:02 +02:00
Marek Olšák
1ba1d617bf st/mesa: restore the transfer_inline_write path for BufferData
Version 2 that shouldn't crash.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:45:02 +02:00
Marek Olšák
6a2ad679e6 st/mesa: initialize Const.MaxColorAttachments
NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:45:02 +02:00
Marek Olšák
52cb395bb1 gallium: add PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE for GL
v2: fix typo 65535 -> 65536

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:45:01 +02:00
Marek Olšák
b6d3373442 st/mesa: consolidate setting MaxTextureImageUnits
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:45:01 +02:00
Marek Olšák
614ee25077 st/mesa: initialize all program constants and UBO limits
Also simplify UBO support checking.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:45:01 +02:00
Marek Olšák
d90f04a65b glsl: fix the value of gl_MaxFragmentUniformVectors
NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:45:01 +02:00
Marek Olšák
77d8fbcfd4 mesa: add & use a new driver flag for UBO updates instead of _NEW_BUFFER_OBJECT
v2: move the flagging from intel_bufferobj_data to intel_bufferobj_alloc_buffer

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-11 23:45:01 +02:00
Marek Olšák
081c789c3e mesa: skip _MaxElement computation unless driver needs strict bounds checking
If Const.CheckArrayBounds is false, the only code using _MaxElement is
glDrawRangeElements, so I changed it and explained in the code why
_MaxElement is not very useful there.

BTW, the big magic number was copied to the letter
from _mesa_update_array_max_element.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-11 23:45:01 +02:00
Marek Olšák
db38e9a0e1 mesa: remove unused gl_array_object::NewArray
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-11 23:45:01 +02:00
Marek Olšák
74ca7f0974 mesa: remove unused gl_constants::MaxColorTableSize
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-11 23:45:01 +02:00
Marek Olšák
286d06ddc4 mesa: unify MaxVertexVaryingComponents and MaxGeometryVaryingComponents
The limits should not be different and OpenGL requires both to be at least 32,
which is also the maximum limit on radeon.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-11 23:45:01 +02:00
Marek Olšák
5e78433eec mesa: move max texture image unit constants to gl_program_constants
Const.MaxTextureImageUnits -> Const.FragmentProgram.MaxTextureImageUnits
Const.MaxVertexTextureImageUnits -> Const.VertexProgram.MaxTextureImageUnits
etc.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-11 23:45:01 +02:00
Marek Olšák
d27d29f1a6 mesa: consolidate definitions of max texture image units
Shaders are unified on most hardware (= same limits in all stages).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-11 23:44:55 +02:00
Vinson Lee
5471e3949c ilo: Initialize read_back in transfer_map_sys.
Fixes "Uninitialized scalar variable" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2013-05-10 15:29:40 +08:00
Marek Olšák
da33f9b919 r600g: increase array size for shader inputs and outputs
and add assertions to prevent buffer overflow. This fixes corruption
of the r600_shader struct.

NOTE: This is a candidate for the stable branches.
2013-05-10 03:23:31 +02:00
Chí-Thanh Christopher Nguyễn
121c2c8983 targets/dri-i915: Force c++ linker in all cases
NOTE: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=461696
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-05-09 17:04:27 -07:00
Ben Widawsky
fc98c47115 i965: Actually use the user timeout in glClientWaitSync.
Use the new libdrm functionality to actually do timed waits on the sync
object.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-05-09 16:41:44 -07:00
Paulo Zanoni
f1d2b37317 i965: make GT3 machines work as GT3 instead of GT2
We were not allowed to say the "GT3" name, but we really needed to
have the PCI IDs because too many people had such machines, so we had
to make the GT3 machines work as GT2.

Let's just say that GT2_PLUS was short for GT2_PLUS_1 :)

NOTE: This is a candidate for stable branches.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-09 15:11:53 -07:00
Kenneth Graunke
d0b82b1add i965: Add chipset limits for the Haswell GT3 variant.
NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
2013-05-09 15:11:53 -07:00
Kenneth Graunke
eca2251f42 i965: Update URB partitioning code for Haswell's GT3 variant.
Haswell's GT3 variant offers 32kB of URB space for push constants, while
GT1 and GT2 match Ivybridge, providing 16kB.  Update the code to reserve
the full 32kB on GT3.

v2: Specify push constant size correctly.  I thought GT3 reinterpreted
    the value as multiples of 2kB, but it doesn't.  You simply have to
    program an even number.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-09 15:11:52 -07:00
Kenneth Graunke
c56eba5adb i965: Delete dead intel_span.c symlink. 2013-05-09 15:11:52 -07:00
Eric Anholt
0f3068a58b i965/vs: Make virtual grf live intervals actually cover their used range.
This is the same change as the previous commit to the FS.  A very few VSes
are regressed by 1 or 2 instructions, which look recoverable with a bit
more dead code elimination.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-09 14:38:05 -07:00
Eric Anholt
e290372542 i965/fs: Make virtual grf live intervals actually cover their used range.
Previously, we would sometimes not consider a write to a register to
extend the end of the interval, nor would we consider a read before a
write to extend the start.  This made for a bunch of complicated logic
related to how to treat the results when dead code might be present.
Instead, just extend the interval and fix dead code elimination to know
how to remove it.

Interestingly, this actually results in a tiny bit more optimization:
total instructions in shared programs: 1391220 -> 1390799 (-0.03%)
instructions in affected programs:     14037 -> 13616 (-3.00%)

v2: Fix a theoretical problem with the simd16 workaround if dst == src,
    where we would revert the bump of the live range.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
2013-05-09 14:38:05 -07:00
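The idea in commit e290372542 above, that an interval should cover every read and every write of a register, can be sketched for straight-line code as follows. This is a simplified Python model: the instruction representation is hypothetical, and control flow (which the real backend has to handle) is ignored.

```python
def live_intervals(instructions):
    """Compute {register: (start, end)} over a list of (dst, srcs) tuples.

    Every write and every read extends the interval, including a write
    whose result is never used and a read that precedes any write.
    """
    intervals = {}
    for ip, (dst, srcs) in enumerate(instructions):
        touched = list(srcs)
        if dst is not None:
            touched.append(dst)
        for reg in touched:
            start, end = intervals.get(reg, (ip, ip))
            intervals[reg] = (min(start, ip), max(end, ip))
    return intervals
```

With intervals defined this way, a dead write simply shows up as an interval that nothing reads, which dead code elimination can then remove.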
Marek Olšák
dd6152b6ca docs: document GALLIUM_HUD and LIBGL_SHOW_FPS 2013-05-09 23:28:05 +02:00
Courtney Goeltzenleuchter
daa90f91ff ilo: Add support for HW primitive restart.
Tell Gallium that ilo supports primitive restart.
Update ilo_draw_vbo to check whether the indexed primitive being
rendered can actually be supported in HW. If not, break it up into
individual prims, similar to what Mesa does.

[olv: a minor fix after rebasing and formatting]
2013-05-10 00:06:14 +08:00
Brian Paul
009d79734f svga: misc whitespace and comment fixes in svga_cmd.c 2013-05-09 07:43:46 -06:00
Brian Paul
60c71cce3f docs: remove ^M chars from GL3.txt 2013-05-09 07:43:46 -06:00
Brian Paul
e0144019c0 st/mesa: generate GL_OUT_OF_MEMORY if we can't create the index buffer
Before, if we failed to allocate the index buffer we'd silently
return from st_draw_vbo() without drawing anything.  We should
raise GL_OUT_OF_MEMORY to give some indication that something went
wrong.

Note: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-05-09 07:43:46 -06:00
Chia-I Wu
a8e4614071 ilo: add support for PIPE_FORMAT_ETC1_RGB8
It is decompressed to and stored as PIPE_FORMAT_R8G8B8X8_UNORM on-the-fly.
2013-05-09 16:05:48 +08:00
Chia-I Wu
183ea823fd ilo: support mapping with a staging system buffer
It can be used for unpacking compressed textures on-the-fly or to support
explicit transfer flushing.
2013-05-09 16:05:47 +08:00
Chia-I Wu
baa44db065 ilo: allow for different mapping methods
We want or need to use a different mapping method when the resource is
busy, when the bo format differs from the requested format, and so on.
2013-05-09 16:05:47 +08:00
Chia-I Wu
7cca1aac9d ilo: allow bo format to differ from that requested
For separate stencil buffer or formats not supported natively, the real format
of the bo may differ from that requested.
2013-05-09 16:05:47 +08:00
Stéphane Marchesin
1c56fc1025 draw/llvm: Add additional llvm optimization passes
It helps a bit with vertex shader performance on i915g
(a couple percent faster with openarena).

I have tried most other passes, and they weren't showing
any measurable improvement. Note that my vertex shaders
didn't have loops, so maybe the loop optimizations could
still be useful in the future.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-08 22:05:54 -07:00
Eric Anholt
0b0d6f97cf i965: Sync brw_format_for_mesa_format() table with new Mesa formats.
I'm not filling them all in, to prevent any breakage in this commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-08 15:31:07 -07:00
Eric Anholt
2755946427 i965: Update the surface formats table from the current specs.
Unfortunately the surface formats table is now splattered across multiple
chapters.  All surface format enums from brw_defines.h are present, but
only support for them that is mentioned in the public specs is included
here.

v2 (from Ken): Mark R32G32B32A32_SFIXED as unsupported on Ivybridge.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-08 15:31:06 -07:00
Eric Anholt
5d89487eb2 i965: Add surface format defines from the public specs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-08 14:27:30 -07:00
Fabian Bieler
4e9c7f9c5a mesa/program: Don't copy propagate from swizzles.
Do not propagate a copy if source and destination are identical.

Otherwise code like

MOV TEMP[0].xyzw, TEMP[0].wzyx
MOV TEMP[1].xyzw, TEMP[0].xyzw

is changed to

MOV TEMP[0].xyzw, TEMP[0].wzyx
MOV TEMP[1].xyzw, TEMP[0].wzyx

This fixes Piglit test shaders/glsl-copy-propagation-self-2 for drivers that
use Mesa IR.

NOTE: This is a candidate for the stable branches.
Signed-off-by: Fabian Bieler <fabianbieler@fastmail.fm>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-08 13:59:19 -07:00
Fabian Bieler
e1ff753d67 mesa/st: Don't copy propagate from swizzles.
Do not propagate a copy if source and destination are identical.

Otherwise code like

MOV TEMP[0].xyzw, TEMP[0].wzyx
MOV TEMP[1].xyzw, TEMP[0].xyzw

is changed to

MOV TEMP[0].xyzw, TEMP[0].wzyx
MOV TEMP[1].xyzw, TEMP[0].wzyx

This fixes Piglit test shaders/glsl-copy-propagation-self-2 for gallium drivers.

NOTE: This is a candidate for the stable branches.
Signed-off-by: Fabian Bieler <fabianbieler@fastmail.fm>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-08 13:59:14 -07:00
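The miscompile described in the two copy-propagation commits above (4e9c7f9c5a and e1ff753d67) is easy to reproduce with a toy interpreter. The Python sketch below only models the two MOV instructions from the commit messages; `mov` and `run` are hypothetical helpers, not part of Mesa.

```python
COMPONENTS = "xyzw"

def mov(temps, dst, src, swizzle):
    """Model of: MOV TEMP[dst].xyzw, TEMP[src].<swizzle>"""
    source = temps[src]
    temps[dst] = tuple(source[COMPONENTS.index(c)] for c in swizzle)

def run(propagate_copy):
    """Execute the two-instruction sequence and return TEMP[1]."""
    temps = {0: (1.0, 2.0, 3.0, 4.0), 1: (0.0, 0.0, 0.0, 0.0)}
    mov(temps, 0, 0, "wzyx")
    # Propagating the copy replaces TEMP[0].xyzw with TEMP[0].wzyx, which
    # applies the swizzle a second time to the already-swizzled register.
    mov(temps, 1, 0, "wzyx" if propagate_copy else "xyzw")
    return temps[1]
```

Without propagation TEMP[1] ends up as (4, 3, 2, 1); with the bad propagation it ends up as (1, 2, 3, 4), which is why the copy must not be propagated when source and destination are the same register.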
Eric Anholt
5d06c9ea0f i965: Fix hangs on HSW since the gen6 blorp fix.
The constant packets for gen6 are too small for gen7, and while IVB seems
happy with them, HSW blows up.  Fix it by emitting the correct packets on
gen7, for all stages.

v2: Include the packets instead of just skipping them.
NOTE: This is a candidate for the stable branches.
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-08 10:23:41 -07:00
Chad Versace
2878f4685c egl/android: Fix error condition for EGL_ANDROID_image_native_buffer
Emit EGL_BAD_CONTEXT if the user passes a context to
eglCreateImageKHR(type=EGL_ANDROID_image_native_buffer).

From the EGL_ANDROID_image_native_buffer spec:
  * If <target> is EGL_NATIVE_BUFFER_ANDROID and <ctx> is not
    EGL_NO_CONTEXT, the error EGL_BAD_CONTEXT is generated.

Note: This is a candidate for the stable branches.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-05-08 08:44:05 -07:00
Stéphane Marchesin
38d2a16c01 i915: Use Y tiling for textures
This basically reverts commit
2acc719374.

With the previous change, we're not batchbuffer limited any
longer. So we actually start seeing a performance difference
between X and Y tiling. X tiling is funny because it is
faster for screen-aligned quads but slower in games. So let's
use Y tiling which is 10% faster overall.
2013-05-08 02:07:00 -07:00
Stéphane Marchesin
fc24c7aede i915g: Optimize batchbuffer sizes
Now that we don't throttle at every batchbuffer, we can shrink
the size of batchbuffers to achieve early flushing. This gives
a significant speed boost in a lot of games (on the order of
20%).
2013-05-08 02:06:56 -07:00
Stéphane Marchesin
7f7c7fda83 i915g: Add more PIPE_CAP_* support 2013-05-08 01:37:55 -07:00
Chia-I Wu
00035670de ilo: remove our own type inference
tgsi_opcode_infer_{src,dst}_type() works just fine.
2013-05-08 11:33:34 +08:00
Chia-I Wu
b74af51a46 ilo: use tgsi_util_get_texture_coord_dim()
And remove toy_tgsi_get_texture_coord_dim().
2013-05-08 11:07:46 +08:00
Chia-I Wu
75a48a53d8 tgsi: fix operand type of TGSI_OPCODE_NOT
It should be TGSI_TYPE_UNSIGNED, not TGSI_TYPE_FLOAT.

Fixed also gallivm not_emit_cpu() to use uint build context.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2013-05-08 11:03:49 +08:00
Chia-I Wu
1f970816b1 tgsi: refactor tgsi_opcode_infer_src_type()
Call tgsi_opcode_infer_type() from tgsi_opcode_infer_src_type().

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2013-05-08 11:03:47 +08:00
Chia-I Wu
364feb327d tgsi: refactor tgsi_opcode_infer_dst_type()
Move the body of tgsi_opcode_infer_dst_type() to a new helper function,
tgsi_opcode_infer_type(), and call the helper function from
tgsi_opcode_infer_dst_type().  The diff looks complicated simply because the
code is moved around.

A following commit will make tgsi_opcode_infer_src_type() call
tgsi_opcode_infer_type().

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2013-05-08 11:03:43 +08:00
Chia-I Wu
8a52453f5d tgsi: reorder opcodes in opcode type inference
Reorder opcodes by their assigned numbers.  This makes it easier to see the
differences between tgsi_opcode_infer_src_type() and
tgsi_opcode_infer_dst_type().

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2013-05-08 11:03:24 +08:00
Chia-I Wu
61d57ec276 tgsi: clean up exec_tex()
Make use of tgsi_util_get_texture_coord_dim() to replace the big switch table.

There is a subtle difference with this change.  When TXP is used with an array
texture, the layer is now also projected.  This behavior matches the TGSI doc.
Since GLSL does not allow TXP on an array texture, I am not sure which
behavior is correct or preferred.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2013-05-08 11:00:07 +08:00
Chia-I Wu
80857d2c8b tgsi: add tgsi_util_get_texture_coord_dim()
This util function returns the dimension of the texture coordinates for a
texture target, and the location of the shadow reference value.

For example, when the texture target is TGSI_TEXTURE_SHADOW2D, the dimension
of the texture coordinates is 2, and the location of the ref value is 2
(that is, the Z channel).

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2013-05-08 10:58:53 +08:00
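
A rough C sketch of the kind of lookup the tgsi_util_get_texture_coord_dim() commit above describes. The enum, the function name `texture_coord_dim`, and its signature are hypothetical stand-ins, and only a handful of targets are covered; the SHADOW2D values follow the example in the commit message.

```c
#include <assert.h>

/* Hypothetical stand-ins for a few TGSI texture targets. */
enum tex_target { TEX_1D, TEX_2D, TEX_3D, TEX_SHADOW2D };

/* Returns the number of coordinate channels and stores the channel index of
 * the shadow reference value in *shadow_ref (-1 when there is none). */
static int texture_coord_dim(enum tex_target target, int *shadow_ref)
{
    assert(shadow_ref != 0);
    *shadow_ref = -1;
    switch (target) {
    case TEX_1D:
        return 1;
    case TEX_2D:
        return 2;
    case TEX_3D:
        return 3;
    case TEX_SHADOW2D:
        *shadow_ref = 2; /* the Z channel */
        return 2;
    }
    return 0;
}
```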
Bryan Cain
14a0bb81fe nv50: initialize kick_notify callback in nv50_create
Fixes infinite loop on startup in Portal and Left 4 Dead 2.

NOTE: This is a candidate for the 9.0 and 9.1 branches.
2013-05-07 17:01:59 -05:00
Eric Anholt
3f09e528d5 i965: Use Y-tiled blits to untile for cached mappings of miptrees.
Fixes a regression in firefox's unaccelerated compositing path for WebGL
with the introduction of Y tiling.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64213
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-07 11:45:45 -07:00
Eric Anholt
d641a01d98 i965: Add support for Y-tiled blits on gen6+.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-07 11:45:45 -07:00
Eric Anholt
7a74808d78 i965: Count occlusion query samples for CopyPixels using the 2D engine.
We accidentally "fixed" the piglit test for this when introducing Y
tiling, since this path stopped being executed.  In reenabling this path
for Y tiling, we ended up regressing it again, so just fix it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59439
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-05-07 11:45:45 -07:00
Robert Bragg
f8c3242682 egl/wayland: Implement EGL_EXT_swap_buffers_with_damage
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2013-05-07 17:07:50 +01:00
Robert Bragg
6425b14515 egl: Add extension infrastructure for EGL_EXT_swap_buffers_with_damage
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2013-05-07 17:07:45 +01:00
Robert Bragg
95dda0d649 egl: Update to revision 21254 of eglext.h
This pulls in EGL_EXT_swap_buffers_with_damage.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2013-05-07 17:07:44 +01:00
Roland Scheidegger
65102b708b gallium: more tgsi documentation updates
Adds the remaining integer opcodes, and some opcodes are moved to more
appropriate places, along with getting rid of the (already nearly empty)
ps_2_x section. Though the CAP bits for some of these are still a bit in
the air so the documentation isn't quite as watertight as is desirable.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-07 16:13:23 +02:00
Vinson Lee
4ba9c9c5be ilo: Add missing break statement in aos_tex TGSI_OPCODE_TEX2 case.
Fixes "Missing break in switch" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2013-05-07 12:15:48 +08:00
Vadim Girlin
c9cf83b587 r600g/sb: optimize some cases for CNDxx instructions
We can replace CNDxx with MOV (and possibly eliminate it after
propagation) in the following cases:

If src1 is equal to src2 in CNDxx instruction then the result doesn't
depend on condition and we can replace the instruction with
"MOV dst, src1".

If src0 is const then we can evaluate the condition at compile time and
also replace it with MOV.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-07 04:40:26 +04:00
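
The two CNDxx cases in the commit above can be sketched in C as follows. The `struct operand` type and `fold_cnde` helper are hypothetical, and the sketch assumes CNDE semantics of `dst = (src0 == 0) ? src1 : src2`; it is not the r600g/sb implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand: either a register number or a known constant. */
struct operand {
    bool is_const;
    float value; /* valid when is_const */
    int reg;     /* valid when !is_const */
};

static bool same_operand(const struct operand *a, const struct operand *b)
{
    if (a->is_const != b->is_const)
        return false;
    return a->is_const ? a->value == b->value : a->reg == b->reg;
}

/* Tries to fold "CNDE dst, src0, src1, src2" into "MOV dst, *out".
 * Assumed semantics: dst = (src0 == 0) ? src1 : src2. */
static bool fold_cnde(const struct operand *src0, const struct operand *src1,
                      const struct operand *src2, const struct operand **out)
{
    assert(out != 0);
    if (same_operand(src1, src2)) {
        *out = src1; /* result does not depend on the condition */
        return true;
    }
    if (src0->is_const) {
        *out = (src0->value == 0.0f) ? src1 : src2; /* compile-time select */
        return true;
    }
    return false;
}
```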
Vadim Girlin
46dfad8b36 r600g/sb: fix memory leaks
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-07 04:40:26 +04:00
Vadim Girlin
1c28e7c5a1 r600g/sb: fix kcache handling on r6xx
Use the same limit for kcache constants in alu group on r6xx as on other
chips (two const pairs). Relaxing this will require additional checks to
make sure that all 4 consts in the group come from 2 kcache sets (clause
limit), probably without noticeable improvements of shader performance.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-07 04:40:26 +04:00
Eric Anholt
03ef60681e intel: Remove renderbuffer delete setup from texture wrapping.
This is already set by intel_new_renderbuffer().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-06 14:34:27 -07:00
Eric Anholt
77a405dba7 mesa: Make Mesa core set up wrapped texture renderbuffer state.
Everyone was doing effectively the same thing, except for some funky code
reuse in Intel, and swrast mistakenly recomputing _BaseFormat instead of
using the texture's _BaseFormat.  swrast's sRGB handling is left in place,
though it should be done by using _mesa_get_render_format() at render time
instead (as-is, it will miss updates to GL_FRAMEBUFFER_SRGB).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-06 14:34:14 -07:00
Eric Anholt
5b190d19d3 intel: Simplify renderbuffer-for-texture width setup.
We're looking for the logical width of our level, which is what
image->Width2/Height2 is.  The previous code relied on MSAA textures being
only level 0.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-06 14:33:43 -07:00
Eric Anholt
749a92786d mesa: Make core Mesa allocate the texture renderbuffer wrapper.
Every driver did the same thing.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-06 14:33:38 -07:00
Eric Anholt
5b9609f59a i965: Use brw_blorp_blit_miptrees() for CopyTexSubImage().
Now that depth resolves are handled there, we don't need to make the
temporary renderbuffer.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-06 14:33:33 -07:00
Eric Anholt
40956c5519 i965: Move blorp resolve setup into brw_blorp_blit_miptrees().
There was some comment about trying to avoid marking resolves in
updownsample, but if the downsample is never actually rendered to, then
the required resolve tracked in the downsample will never be executed, so
who cares?

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-06 14:33:27 -07:00
Tom Stellard
730c90a70e gallivm: Fix build for LLVM < 3.3
The C API versions of the LLVM multithreaded functions were added in
LLVM 3.3.
2013-05-06 11:17:03 -07:00
Tom Stellard
bb94d4d8fe r600g/llvm: Parse config values in register / value pairs
Rather than relying on a predetermined order for the config values.
2013-05-06 10:54:52 -07:00
Tom Stellard
df27320560 r600g/llvm: Don't feed LLVM output through r600_bytecode_build()
The LLVM backend emits raw ISA now, so we can just use its output
unmodified.
2013-05-06 10:54:52 -07:00
Tom Stellard
e917ed96ae r600g/llvm: Don't emit CALL_FS for vertex shaders
The LLVM backend takes care of this now.
2013-05-06 10:54:52 -07:00
Matt Turner
1d09a8c3cd i965: Lower bitfieldInsert.
v2: Only lower bitfieldInsert to BFM+BFI (and don't lower
    bitfieldExtract at all) since three-source instructions are now
    usable in the vertex shader.
v3: Lower bitfield_insert in the same pass with everything else, since
    it doesn't produce any instructions to be lowered (the other two
    lowering passes that were in a previous iteration of this series
    emitted subtractions which needed to be lowered).

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2]
2013-05-06 10:17:14 -07:00
Matt Turner
acd2bccd85 i965/vs: Add support for bit instructions.
v2: Rebase on LRP addition.
    Use fix_3src_operand() when emitting BFE and BFI2.
    Add BFE and BFI2 to is_3src_inst check in
      brw_vec4_copy_propagation.cpp.
    Subtract result of FBH from 31 (unless an error) to convert
      MSB counts to LSB counts

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:14 -07:00
Matt Turner
1f0f26d60c i965/fs: Add support for bit instructions.
Don't bother scalarizing ir_binop_bfm, since its results are
identical for all channels.

v2: Subtract result of FBH from 31 (unless an error) to convert
    MSB counts to LSB counts.
v3: Use op0->clone() in ir_triop_bfi to prevent (var_ref
    channel_expressions) from appearing multiple times in the IR.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2]
2013-05-06 10:17:14 -07:00
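
The "subtract result of FBH from 31 (unless an error)" step from the two bit-instruction commits above can be illustrated with a small software model. `fbh_model` and `find_msb` are hypothetical names; the model assumes an unsigned input and an all-ones error value when no bit is set, and it does not claim to reproduce the hardware instruction exactly.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of an FBH-style result for unsigned input: the position of
 * the highest set bit counted from the MSB end, or 0xFFFFFFFF when no bit
 * is set. */
static uint32_t fbh_model(uint32_t x)
{
    if (x == 0)
        return UINT32_MAX;
    uint32_t count = 0;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        count++;
    }
    assert(count <= 31);
    return count;
}

/* findMSB() counts from the LSB end, so convert unless an error was reported. */
static int32_t find_msb(uint32_t x)
{
    uint32_t r = fbh_model(x);
    return r == UINT32_MAX ? -1 : (int32_t)(31 - r);
}
```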
Matt Turner
fa958182b7 i965: Add support for emitting and disassembling bit instructions.
Specifically
   bfe - for bitfieldExtract()
   bfi1 and bfi2 - for bitfieldInsert()
   bfrev - for bitfieldReverse()
   cbit - for bitCount()
   fbh - for findMSB()
   fbl - for findLSB()

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:14 -07:00
Matt Turner
c71bee757b i965: Print the correct dst and shared-src types for 3-src instructions.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:14 -07:00
Matt Turner
526ffdfc03 i965/gen7: Set src/dst types for 3-src instructions.
Also update asserts to allow BFE and BFI2, which take (unsigned)
doubleword arguments.

v2: Allow BRW_REGISTER_TYPE_UD for src1 and src2 as well.
    Assert that src2.type (instead of src0.type) matches dest.type since
    it's the primary argument and src0 and src1 might correctly have
    different types.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v1]
2013-05-06 10:17:13 -07:00
Matt Turner
2305047823 i965: Add 3-src destination and shared-source type macros.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:13 -07:00
Matt Turner
4049d48e02 i965: Add Gen7+ fields to brw_instruction and add comments.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:13 -07:00
Matt Turner
dafd050883 glsl: Add a pass to lower bitfield-insert into bfm+bfi.
i965/Gen7+ and Radeon/Evergreen+ have bfm/bfi instructions to implement
bitfieldInsert() from ARB_gpu_shader5.

v2: Add ir_binop_bfm and ir_triop_bfi to st_glsl_to_tgsi.cpp.
    Remove spurious temporary assignment and dereference.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:13 -07:00
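
A hedged C sketch of the mask-then-merge split behind lowering bitfieldInsert() into bfm+bfi, as in the commit above. The helpers `bfm`, `bfi`, and `bitfield_insert` are hypothetical; in particular, this sketch shifts `insert` into position before the merge, which is an assumption and may differ from how the IR opcodes or the hardware instructions divide the work.

```c
#include <assert.h>
#include <stdint.h>

/* Step 1 (BFM-like): build a mask of `bits` ones starting at `offset`. */
static uint32_t bfm(unsigned bits, unsigned offset)
{
    assert(offset < 32 && bits <= 32 - offset);
    uint32_t ones = (bits == 32) ? UINT32_MAX : ((1u << bits) - 1u);
    return ones << offset;
}

/* Step 2 (BFI-like): take the masked bits from `insert`, the rest from
 * `base`. `insert` is assumed to be shifted into position already. */
static uint32_t bfi(uint32_t mask, uint32_t insert, uint32_t base)
{
    return (insert & mask) | (base & ~mask);
}

/* bitfieldInsert(base, insert, offset, bits) expressed via the two steps. */
static uint32_t bitfield_insert(uint32_t base, uint32_t insert,
                                unsigned offset, unsigned bits)
{
    uint32_t mask = bfm(bits, offset); /* also validates offset and bits */
    return bfi(mask, insert << offset, base);
}
```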
Matt Turner
9c04b8c28c glsl: Add constant evaluation of bit built-ins.
v2: Order bits from LSB end (31 - count) for ir_unop_find_msb.
v3: Add ir_triop_bitfield_extract as an exception to the op[0]->type ==
    op[1]->type assertion in ir_constant_expression.cpp.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2]
2013-05-06 10:17:13 -07:00
Matt Turner
499d8c6545 glsl: Add support for new bit built-ins in ARB_gpu_shader5.
v2: Move use of ir_binop_bfm and ir_triop_bfi to a later patch.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:13 -07:00
Matt Turner
44d3287ecd glsl: Add new bit built-ins IR and prototypes from ARB_gpu_shader5.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:13 -07:00
Matt Turner
f9e37879eb glsl: Rework ir_reader to handle expressions with four operands.
Needed to support the bitfieldInsert() built-in added by
ARB_gpu_shader5.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:12 -07:00
Matt Turner
f99f78e49a mesa: Add infrastructure for ARB_gpu_shader5.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-05-06 10:17:12 -07:00
Tom Stellard
914d797797 radeon/llvm: Always build libradeonllvm as static
This library is very small, so there is not much to gain from building
it as a shared library.  Also, when linking statically with LLVM, a
shared libradeonllvm exports LLVM symbols and creates problems when
used with other shared objects that also link statically to LLVM.

Reviewed-by: Mathias.Froehlich@web.de
2013-05-06 09:06:10 -07:00
Tom Stellard
024fe6852a radeon/llvm: Use LLVM C API for compiling LLVM IR to ISA v2
The LLVM C API is considered stable and should never change, so it
is much more desirable to use than the LLVM C++ API, which is constantly in
flux.

v2:
  - Split target initialization and lookup into separate functions

Reviewed-by: Mathias.Froehlich@web.de
2013-05-06 09:06:06 -07:00
Tom Stellard
55eb8eaaa8 gallivm: Move LLVMStartMultithreaded() static initializer into gallivm
This does not solve all of the problems with using LLVM in a
multithreaded environment, but it should help in some cases.

Reviewed-by: Mathias.Froehlich@web.de
2013-05-06 09:06:03 -07:00
Tom Stellard
7cc98ea88f radeon/llvm: Don't use the global context when parsing LLVM IR
This leads to crashes when multiple threads try to compile compute
shaders at the same time.

Fixes a crash in bfgminer when using more than one thread.
2013-05-06 09:06:00 -07:00
Eric Anholt
bd850cb4f2 i965: Remove GL_ARB_color_buffer_float from GL core contexts.
Of the 3 controls in the extension, one was kept in GL core and the other
two were explicitly deprecated and the reasonable default behavior was
encoded in the spec.  By not exposing the extension, we avoid shader
recompiles when switching between float and unorm color buffers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-06 09:01:51 -07:00
Tom Stellard
ec143dc0b1 r600g/llvm: Update radeon family mappings for LLVM backend
New processors were added to the backend to distinguish between
GPUs with and without vertex caches.
2013-05-06 08:22:24 -07:00
Chia-I Wu
5cca6b6280 android: libsync is needed on Android 4.2+ for any driver
Add libsync not only for MESA_BUILD_CLASSIC, but also for MESA_BUILD_GALLIUM.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-05-06 07:20:08 -07:00
Chia-I Wu
da109d56d5 android: add ilo to the build system
It can be selected with

  BOARD_GPU_DRIVERS := ilo

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2013-05-06 07:20:07 -07:00
Eric Anholt
739b88330c glsl: Flip around "if" statements with empty "then" blocks.
This cleans up some funny-looking code in some unigine shaders I was
looking at.  Also slightly helps on planeshift and a few shaders in an
upcoming Valve release.

total instructions in shared programs: 1653715 -> 1653587 (-0.01%)
instructions in affected programs:     16550 -> 16422 (-0.77%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-05 13:20:42 -07:00
Chia-I Wu
008346273c ilo: correctly set return types of sampler messages
Correctly set the types of the temporaries.  We do not want type conversions
when moving the results to the final destinations.
2013-05-05 14:36:39 +08:00
Vincent Lejeune
b42fe195a2 r600g/llvm: Undefines unrequired texture coord values
This is a port of "r600g:mask unused source components for SAMPLE"
patch from Vadim Girlin.
2013-05-04 23:38:50 +02:00
Maarten Lankhorst
c4150123aa nvc0: fixup video decoding with 2D_ARRAY
Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
2013-05-04 20:56:23 +02:00
Chia-I Wu
8c347d4e57 gallium: fix type of flags in pipe_context::flush()
It should be unsigned, not enum pipe_flush_flags.

Fixed a build error:

  src/gallium/state_trackers/egl/android/native_android.cpp:426:29: error:
  invalid conversion from 'int' to 'pipe_flush_flags' [-fpermissive]

v2: replace all occurrences of enum pipe_flush_flags by unsigned

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>

[olv: document the parameter now that the type is unsigned]
2013-05-04 17:32:10 +08:00
Eric Anholt
cbf3462c35 i965: Enable fast clears on non-8x4-aligned sizes.
Improves glb2.7 performance at a misaligned size by 2.3% +/- 0.7% (n=11).
The workaround was to avoid bad primitive/surface sizes, but that's worked
around as of a14dc4f92c.  (One might note
that pre-gen7 we don't know that the right half of an 8x4 at the right
edge is actually our pixels, but we're already clobbering those pixels for
depth resolves anyway and more work would be required to avoid that).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-05-03 20:59:51 -07:00
Brian Paul
76084907fb vbo: add comments, const qualifiers
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul
0baf32508a mesa: whitespace, formatting fixes, etc in api_arrayelt.c
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul
7c9e5afe81 vbo: use new no-op ArrayElement in _mesa_noop_vtxfmt_init()
As we do for the other commands which can appear between glBegin/End.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul
7b762305d5 mesa: change ctx->Driver.NeedFlush to GLbitfield and update comment
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul
36c83ccca0 mesa: change ctx->Driver.SaveNeedFlush to boolean, and document it.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul
af30987a69 vbo: update comments for vbo_save_NotifyBegin()
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul
4ea05bcba6 vbo: implement primitive merging for glBegin/End sequences
A surprising number of apps and benchmarks have poor code like this:

glBegin(GL_LINE_STRIP);
glVertex(v1);
glVertex(v2);
glEnd();
// Possibly some no-op state changes here
glBegin(GL_LINE_STRIP);
glVertex(v3);
glVertex(v4);
glEnd();
// repeat many, many times.

The above sequence can be converted into:

glBegin(GL_LINES);
glVertex(v1);
glVertex(v2);
glVertex(v3);
glVertex(v4);
glEnd();

Similarly for GL_POINTS, GL_TRIANGLES, etc.

Merging was already implemented for GL_QUADS in the display list code.
Now other prim types are handled and it's also done for immediate mode.

In one case:
                                 before   after
-----------------------------------------------
number of st_draw_vbo() calls:     141      45
number of _mesa_prims issued:     7520     632

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
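
The merging rule from the "implement primitive merging" commit above could look roughly like this in C. The `enum prim_mode`, `struct prim`, and `try_merge` names are hypothetical; the sketch only handles independent-primitive modes plus the two-vertex line-strip case shown in the commit message, and it is not the vbo code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical primitive modes and draw record; not Mesa's types. */
enum prim_mode {
    PRIM_POINTS, PRIM_LINES, PRIM_LINE_STRIP, PRIM_TRIANGLES, PRIM_TRIANGLE_STRIP
};

struct prim {
    enum prim_mode mode;
    unsigned start, count; /* vertex range [start, start + count) */
};

/* Independent-primitive modes can be concatenated without changing output. */
static bool mode_is_mergeable(enum prim_mode mode)
{
    return mode == PRIM_POINTS || mode == PRIM_LINES || mode == PRIM_TRIANGLES;
}

/* A two-vertex line strip draws exactly one line segment. */
static void canonicalize(struct prim *p)
{
    if (p->mode == PRIM_LINE_STRIP && p->count == 2)
        p->mode = PRIM_LINES;
}

/* Appends `next` to `prev` if both use the same mergeable mode and their
 * vertex ranges are contiguous. Returns true when merged. */
static bool try_merge(struct prim *prev, struct prim *next)
{
    assert(prev && next);
    canonicalize(prev);
    canonicalize(next);
    if (prev->mode != next->mode || !mode_is_mergeable(prev->mode))
        return false;
    if (prev->start + prev->count != next->start)
        return false;
    prev->count += next->count;
    return true;
}
```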
Brian Paul
3702d25082 vbo: create a few utility functions for merging primitives
To be used by following commit.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Zack Rusin
a232afdbfb draw/pt: adjust overflow calculations
gallium lies. buffer_size is not actually buffer_size but available
size, which is 'buffer_size - buffer_offset' so by adding buffer
offset we'd incorrectly compute overflow.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 07:07:33 -04:00
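
To illustrate the draw/pt overflow adjustment above: if the size handed to the check is already `buffer_size - buffer_offset`, the buffer offset must not be added a second time. The function `fetch_overflows` and its parameters are hypothetical; this is a sketch of the idea, not the draw module's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when reading `elem_size` bytes for vertex `index` would run
 * past the end of the buffer. `available_size` is assumed to already be
 * "buffer_size - buffer_offset", so buffer_offset is not added again. */
static bool fetch_overflows(uint32_t available_size, uint32_t src_offset,
                            uint32_t stride, uint32_t index, uint32_t elem_size)
{
    assert(elem_size > 0);
    /* 64-bit arithmetic so the sum itself cannot wrap around. */
    uint64_t end = (uint64_t)src_offset + (uint64_t)stride * index + elem_size;
    return end > available_size;
}
```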
Zack Rusin
8490d21cbe tgsi/ureg: make the dst register match the src indirection
In ureg src registers could have an indirect register that was
either a temp or an addr register, while dst registers allowed
only addr. That made moving between them a little difficult so
make them behave the same way and allow temp's and addr registers
as indirect files for both (tgsi supports it, just ureg didn't).

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 07:07:33 -04:00
Roland Scheidegger
23025ed15d gallium: tgsi documentation updates and clarification for integer opcodes.
A lot of them were missing. Others were moved from the Compute ISA
to a new Integer ISA section as that seemed more appropriate.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-03 21:36:28 +02:00
Roland Scheidegger
ae507b6260 llvmpipe: get rid of depth swizzling.
Eliminating this we no longer need to copy between linear and swizzled layout.
This is probably not quite ideal since it's a bit more work for now; we could do
some optimizations by moving depth testing outside the fragment shader loop
(but that is tricky for early depth test as we have neither the mask nor the
interpolated z in the right order handy).
The large amount of tile/untile code is no longer needed and will be deleted
in the next commit.
No piglit regressions.
v2: change a forgotten LAYOUT_NONE to LAYOUT_LINEAR.
v3: fix (bogus) uninitialized variable warnings, add comments, fix a bad type

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-03 21:36:20 +02:00
Lauri Kasanen
e495d88453 r600g: Correctly initialize the shader key, v2
Assigning a struct only copies the members - any padding is left as is.

Thus this code:

struct foo_t foo;
foo = bar;

leaves the padding of foo intact, ie uninitialized random garbage.

This patch fixes constant shader recompiles by initializing the struct
to zero. For completeness, memcpy is used to copy the key to the shader
struct.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Lauri Kasanen <cand@gmx.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-05-03 19:28:57 +02:00
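
A small C sketch of the pattern the r600g shader key commit above describes: zero the whole struct before filling it in, and copy it with memcpy(), so that byte-wise comparison is not disturbed by padding. `struct shader_key` and the helper names are hypothetical. Strictly speaking, the C standard leaves padding bytes unspecified after a member store, so this relies on how compilers behave in practice.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical key with padding between `a` and `b` on typical ABIs. */
struct shader_key {
    char a;
    int b;
};

/* Build a key whose padding bytes are zero, so a whole-struct memcmp() is
 * meaningful: zero everything first, then assign the members. */
static void key_init(struct shader_key *key, char a, int b)
{
    assert(key != NULL);
    memset(key, 0, sizeof(*key));
    key->a = a;
    key->b = b;
}

/* Copy with memcpy() so the (zeroed) padding is carried over too. */
static void key_copy(struct shader_key *dst, const struct shader_key *src)
{
    memcpy(dst, src, sizeof(*dst));
}

/* Byte-wise comparison, as a shader cache lookup might do. */
static int key_equal(const struct shader_key *x, const struct shader_key *y)
{
    return memcmp(x, y, sizeof(*x)) == 0;
}
```

Without the memset(), two keys with identical members could still compare unequal, which is what triggers the needless recompiles mentioned in the commit.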
Lauri Kasanen
5ff81cfd86 st/xvmc/tests: Fix build failure, v2
v2: Removed extra libs as requested by Matt Turner.

Signed-off-by: Lauri Kasanen <cand@gmx.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-05-03 19:14:54 +02:00
Andreas Boll
e62be5de53 scons: remove nouveau build
One build system for linux/unix only drivers should be enough.
Additionally the nouveau target was disabled anyway.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-03 18:44:57 +02:00
Andreas Boll
4ca44f2c5e scons: remove radeon build
One build system for linux/unix only drivers should be enough.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48694

Acked-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-03 18:44:43 +02:00
Alex Deucher
4539f8e20a r600g: don't emit surface_sync after FLUSH_AND_INV_EVENT
It shouldn't be needed since the FLUSH_AND_INV_EVENT has already
made sure the destination caches are flushed.  Additionally,
we didn't previously emit the surface_sync until this commit:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=e5e4c07e7964a3258ed02b530bcdc24c0650204b
Emitting them together causes hangs in compute on cayman/TN
and hangs in Heaven on evergreen.

Note: this patch is a candidate for the 9.1 branch, but requires:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=156bcca62c9f4e79e78929f72bc085757f36a65a
as well.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-05-03 10:55:05 -04:00
Vadim Girlin
41005d7bd2 r600g/sb: zero-initialize bytecode structs
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:42 +04:00
Vadim Girlin
f92bd0958e r600g/sb: fix constant propagation in gvn pass
Fixes the bug that prevented propagation of literals in some cases.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:42 +04:00
Vadim Girlin
3c201a22ca r600g/sb: don't run unnecessary passes
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:42 +04:00
Vadim Girlin
48ba5712f5 r600g/sb: silence warnings with gcc 4.8
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:42 +04:00
Vadim Girlin
c49b6d7f27 r600g/sb: fix handling of interference sets in post_scheduler
post_scheduler clears the interference set for reallocatable values when
the value becomes live for the first time, and then updates it to take into
account the modified order of operations, but this was not handled properly
if the value first appears as a source in a copy operation.

Fixes issues with webgl demo: http://madebyevan.com/webgl-water/

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:42 +04:00
Vadim Girlin
e16ef1f454 r600g/sb: fix allocation of indirectly addressed input arrays
Some inputs may be preloaded into predefined GPRs,
so we can't reallocate arrays with such inputs.

Fixes issues with webgl demo: http://oos.moxiecode.com/js_webgl/snake/

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:41 +04:00
Vadim Girlin
a6fe055fa7 r600g/sb: use hex instead of binary constants
This should fix build issues with GCC < 4.3

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:41 +04:00
Vadim Girlin
4ca67dbf0c r600g: use old shader disassembler by default
New disassembler is not completely isolated yet from further processing
in r600g/sb that is not required for printing the dump, so it has higher
probability to fail in case of any unexpected features in the bytecode.

This patch adds an "sbdisasm" flag for R600_DEBUG that enables the new
disassembler in r600g/sb for shader dumps when shader optimization
is not enabled.

If shader optimization is enabled, new disassembler is used by default.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:41 +04:00
Christian König
b4b3041132 radeon/uvd: enable interlaced buffers by default
Kills tiling on UVD buffers, but we currently don't really need that.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-03 11:00:21 +02:00
Christian König
85b0880a17 vl/idct: fix for commit 7d2f2a0c89
We still need the option for handling 3D textures as well.

Should fix: https://bugs.freedesktop.org/show_bug.cgi?id=64143

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-03 11:00:21 +02:00
Christian König
379753869d vl/buffers: fix typo in function name
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-03 11:00:20 +02:00
Christian König
9c353ea293 radeon/uvd: fix some MPEG4 artifacts
Still not perfect, but a step in the right direction.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-03 11:00:20 +02:00
José Fonseca
abbbc9b667 draw: Update for u_assembled_primitive -> u_assembled_prim rename.
Mesa build is too complex to rely on successful builds. On refactorings
it is always a good idea to use git grep to prevent missing cases:

  $ git grep u_assembled_primitive
  src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline_llvm.c:      u_assembled_primitive(in_prim);
2013-05-03 08:35:17 +01:00
Chia-I Wu
8b2a967e32 st/egl: fix build errors on Android 4.2
The differences from the previous releases that affect st/egl are

 - logging macros are prefixed with an 'A'
 - dequeueBuffer() and enqueueBuffer() require an additional argument for
   fence fd, acquired from libsync

Additionally, include gralloc_drm.h with extern "C".
2013-05-03 13:04:00 +08:00
Chia-I Wu
7346ab3b43 ilo: use u_reduced_prims_for_vertices()
We do not need our own prim_count() anymore.
2013-05-03 11:59:10 +08:00
Chia-I Wu
f87dccdc19 util/prim: add u_reduced_prims_for_vertices()
The function returns the number of reduced/tessellated primitives for the
given vertex count.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:59:10 +08:00
Chia-I Wu
90d5190594 util/prim: assorted fixes for u_decomposed_prims_for_vertices()
Switch to '>=' for comparisons, and it becomes obvious that the comparison for
PIPE_PRIM_QUAD_STRIP was wrong.

Add minimum vertex count check for PIPE_PRIM_LINE_LOOP.  Return 1 for
PIPE_PRIM_POLYGON with 3 vertices.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:59:10 +08:00
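
The kind of vertex-count checks touched by the u_decomposed_prims_for_vertices() commit above can be sketched as follows. The enum and the `decomposed_prims` function are hypothetical and cover only a subset of primitive types; the '>=' comparisons, the line-loop minimum, and the polygon case follow the commit message, while the remaining formulas are the usual ones and are assumptions here.

```c
#include <assert.h>

/* Hypothetical subset of primitive types; not the PIPE_PRIM_* enum. */
enum prim {
    P_POINTS, P_LINES, P_LINE_LOOP, P_LINE_STRIP, P_TRIANGLES,
    P_TRIANGLE_STRIP, P_QUADS, P_QUAD_STRIP, P_POLYGON
};

/* Number of primitives drawn for `nr` vertices; 0 below the minimum count. */
static unsigned decomposed_prims(enum prim prim, unsigned nr)
{
    switch (prim) {
    case P_POINTS:         return nr;
    case P_LINES:          return nr / 2;
    case P_LINE_LOOP:      return nr >= 2 ? nr : 0;
    case P_LINE_STRIP:     return nr >= 2 ? nr - 1 : 0;
    case P_TRIANGLES:      return nr / 3;
    case P_TRIANGLE_STRIP: return nr >= 3 ? nr - 2 : 0;
    case P_QUADS:          return nr / 4;
    case P_QUAD_STRIP:     return nr >= 4 ? (nr - 2) / 2 : 0;
    case P_POLYGON:        return nr >= 3 ? 1 : 0;
    }
    assert(!"unknown primitive type");
    return 0;
}
```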
Chia-I Wu
30671cecc0 util/prim: use vertex count info in u_validate_pipe_prim()
As a side effect, primitives with adjacency are now correctly validated.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:59:10 +08:00
Chia-I Wu
ddf0e3930f util/prim: fix the name of the include guard
It should be U_PRIM_H, not U_BLIT_H.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:59:10 +08:00
Chia-I Wu
5dd3bd70a1 draw: use u_assembled_prim() instead of u_assembled_primitive()
The latter function is also removed as a result of the change.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:59:10 +08:00
Chia-I Wu
185692e72c util/prim: clean up and add comments
Move together (or add) functions to decompose/reduce/assemble a primitive,
give them consistent names, and document them.  Add u_prim_vertex_count() so
that the vertex count information can be used elsewhere.

u_assembled_primitive() will be removed in a follow-on commit.

[olv: fix a warning when -Wold-style-declaration is enabled]

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:58:57 +08:00
Chia-I Wu
64913002e4 util/prim: fix primitive trimming for triangles with adjacency
Fix for PIPE_PRIM_TRIANGLES_ADJACENCY and PIPE_PRIM_TRIANGLE_STRIP_ADJACENCY.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:39:12 +08:00
Eric Anholt
573d8813fd i965/vs: Add instruction scheduling.
While this is ignorant of dependency control, it's still good for a 0.39%
+/- 0.08% performance improvement on GLBenchmark 2.7 (n=548)

v2: Rewrite as a subclass of the base class for the FS instruction
    scheduler, inheriting the same latency information.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:47 -07:00
Eric Anholt
3b00a6acac i965: Move most of the FS instruction scheduler code to a general class.
About half of this is shareable with the VS code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:43 -07:00
Eric Anholt
ce22dd75b7 i965: Pull a couple of FS scheduling functions out to methods.
These will get virtualized as we add VS scheduling support.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:39 -07:00
Eric Anholt
ee0223ba2a i965: Move FS instruction scheduling to a non-FS-specific file.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:35 -07:00
Eric Anholt
ab04f3b2d7 i965: Share the register file enum between the two backends.
I need this so I can look at vec4 and fs registers' files from the same
.cpp file without namespaces.  As far as I can tell we never rely on the
particular numerical values of the files, though I thought it sounded like
a good idea when doing the VS (it turns out having 0 be BAD_FILE is nicer).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:31 -07:00
Eric Anholt
63c8155b09 i965: Make dump_instructions be a virtual method of the visitor.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:26 -07:00
Eric Anholt
74e670d0a3 i965/vs: Do round-robin register allocation on gen6+ like we do in the FS.
This will free instruction scheduling to make better choices.  No
statistically significant performance difference on GLB2.7 (n=93).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:09 -07:00
Rob Bradford
15e64de9e6 wayland: Make eglQueryBufferWL succeed for width and height requests too
Following the addition of the EGL_WIDTH and EGL_HEIGHT this function should
return EGL_TRUE for those requested attributes too.
2013-05-02 16:46:04 -04:00
Zack Rusin
396b861ceb draw/gs: don't crash when vs/gs signatures don't match
Instead of crashing, just fill zeros at the input slots that don't
match. That's the mandated behavior and it avoids debug asserts.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-02 02:43:42 -04:00
Zack Rusin
999cd79c9e tgsi: allow negation of all integer types
It's valid because we reuse certain arithmetic operations
for both signed and unsigned types (e.g. uadd, umad, which
are somewhat unfortunately named).

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-02 02:43:42 -04:00
Eric Anholt
1dfea559c3 i965: Fix SNB GPU hangs when a blorp batch is the first thing to execute.
The GPU apparently goes looking for constants even though there are no
shader stages enabled, and gets stuck because we haven't told it there are
no constants to collect.  If any other user of the 3D pipeline had run
(even the Render accel of the X server!) since power on, then the in-GPU
constant buffers would have been set up with some contents we didn't use,
and we would succeed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56416
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Dave Airlie <airlied@redhat.com>
NOTE: This is a candidate for the stable branches.
2013-05-02 11:27:37 -07:00
Tom Stellard
156bcca62c r600g: Don't set the dest cache bits on surface sync for R600_CONTEXT_FLUSH_AND_INV
We are already emitting a EVENT_TYPE_CACHE_FLUSH_AND_INV_EVENT packet
when this flush flag is set, so flushing the dest caches with a
SURFACE_SYNC should not be necessary.

The motivation for this change is that emitting a SURFACE_SYNC packet with
the CB bits set was causing compute shaders to hang on Cayman.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-05-02 09:00:37 -07:00
Tom Stellard
5752be0cb7 r600g/compute: Fix build error in debug code
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-05-02 09:00:37 -07:00
Armin K
cd84353d57 radeon: Fix build with LLVM 3.3
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-05-02 09:00:37 -07:00
Armin K
4742f9b00b gallivm: Fix build with LLVM 3.3
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-05-02 09:00:37 -07:00
Brian Paul
fcfbf4a19f mesa: update comments, simplify code in vtxfmt.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:16 -06:00
Brian Paul
5dc0081ade mesa: update GLvertexformat comments
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:16 -06:00
Brian Paul
200e09e393 mesa: remove GLvertexformat::EvalMesh1(), EvalMesh2()
See previous commit comments.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:16 -06:00
Brian Paul
0f365b2d77 mesa: remove GLvertexformat::Rectf()
As with the glDraw* functions, this doesn't have to be in GLvertexformat.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:16 -06:00
Brian Paul
49993a1a9d mesa: simplify dispatch for glDraw* functions
Remove all the glDraw* functions from the GLvertexformat structure.
The point of that dispatch struct is to handle all the functions which
dispatch differently depending on whether we're inside glBegin/End.
glDraw* are never allowed inside glBegin/End so we can remove those
entries.

This simplifies the code paths and gets rid of quite a bit of code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:16 -06:00
Brian Paul
79679e258b vbo: add new vbo_initialize_exec_dispatch(), vbo_initialize_save_dispatch()
First step in simplifying the vertex array / glDraw dispatch code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
d0102500bd mesa: remove _MESA_INIT_EVAL_VTXFMT() macro
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
43b3d3bc25 mesa: remove _MESA_INIT_ARRAYELT_VTXFMT() macro
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
95188fd10f mesa: remove _MESA_INIT_DLIST_VTXFMT() macro
Just expand the code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
84e62b7358 mesa: change _mesa_inside_dlist_begin_end() to handle PRIM_UNKNOWN
If the currently compiled primitive state is PRIM_UNKNOWN we should
not return true from _mesa_inside_dlist_begin_end().  This lets us
simplify the calls to that function.

Note, the call to _mesa_inside_dlist_begin_end() in vbo_save_EndList()
should have probably been checking for PRIM_UNKNOWN too, but it wasn't.
So there's no code change.
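
One way to picture the check is the sketch below. It assumes the "outside" and "unknown" states are numbered above the largest real primitive; the constant values and the function name are hypothetical and only illustrate the idea, they are not Mesa's definitions.

```c
#include <stdbool.h>

/* Hypothetical values; the real constants live in Mesa's headers. */
enum {
   SKETCH_PRIM_MAX = 13,
   SKETCH_PRIM_OUTSIDE_BEGIN_END = SKETCH_PRIM_MAX + 1,
   SKETCH_PRIM_UNKNOWN = SKETCH_PRIM_MAX + 2
};

/* True only when a real primitive is currently being compiled. */
static bool
sketch_inside_dlist_begin_end(unsigned current_save_primitive)
{
   /* Both OUTSIDE_BEGIN_END and UNKNOWN sit above PRIM_MAX, so neither
    * is reported as being inside glBegin/glEnd. */
   return current_save_primitive <= SKETCH_PRIM_MAX;
}
```

With this shape, callers no longer need a separate test for the unknown state.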

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
daf19f28c6 mesa: add names of geometry shader prims in gl_enums.py
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
5472ae1fa9 vbo: fix initial value of ctx->Driver.CurrentSavePrimitive
This is set during context creation/initialization.  We know we're
not inside glBegin/glEnd at this point so use PRIM_OUTSIDE_BEGIN_END.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
ecea61e414 vbo: fix error detection in vbo_save_playback_vertex_list()
The old code didn't make sense.  The clause in question did the
same thing as the next else-if clause.  If we're already executing
a glBegin/End pair and we're starting a new primitive, that's an
error.

Fixes more failures in piglit gl-1.0-beginend-coverage test.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
a07437dc28 mesa: comments, formatting fixes in dlist code
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
e880b7cbf8 vbo: remove redundant vfmt->Begin = _save_Begin assignment
The same assignment appears later in the function.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
3e7c16997a mesa: don't install glDraw* functions into the BeginEnd dispatch table
Functions like glDrawArrays, glDrawElements, etc. are illegal between
glBegin/glEnd and should generate GL_INVALID_OPERATION.

Fixes several piglit gl-1.0-beginend-coverage failures.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
d6f3ef92d7 vbo: fix parameter validation for saving dlist glDraw* functions
The _save_OBE_DrawArrays/Elements/RangeElements() functions are
called when building a display list and we know we're outside
glBegin/End.

We shouldn't call the normal _mesa_validate_DrawArrays/Elements()
functions here because those functions only work properly in immediate
mode or during dlist execution.  At dlist compile time, we can't call
_mesa_update_state(), etc. and examine the current state since it won't
apply when the list is executed later.

Fixes several failures in piglit's gl-1.0-beginend-coverage test.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
94c7caf406 mesa: add missing error check in _mesa_EndList()
If we're in GL_COMPILE_AND_EXECUTE mode and inside glBegin, calling
glEndList() should generate an error.

Fixes a failure in piglit's gl-1.0-beginend-coverage test.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
c1a5c5c13d mesa: remove unused PRIM_INSIDE_UNKNOWN_PRIM constant
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
d5bdce1142 mesa: simplify save_Begin() error checking
The old code was hard to understand and not entirely correct.
Note that PRIM_INSIDE_UNKNOWN_PRIM is no longer set anywhere so
we'll be able to remove that next.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:15 -06:00
Brian Paul
bb459f6295 mesa: refactor _mesa_valid_prim_mode()
...in terms of new _mesa_is_valid_prim_mode().  We need a mode validator
function that doesn't depend on current state for the display list code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:14 -06:00
Brian Paul
8be093e2f6 mesa: fix CurrentSavePrimitive <= GL_POLYGON tests
Use the new PRIM_MAX value instead so that new geometry shader primitive
types are accounted for.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:14 -06:00
Brian Paul
cce6e30613 mesa: adjust PRIM_x constants for geometry shaders
These values pertain to display lists, and the new types of geometry
shader primitives can be used in display lists.

And add new PRIM_MAX constant for follow-on changes.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:14 -06:00
Brian Paul
aa782f260d mesa: fix save_ShadeModel() logic and add new comments
This removes the test for _mesa_inside_dlist_begin_end().
If ctx->Driver.CurrentSavePrimitive==PRIM_UNKNOWN (the initial value),
_mesa_inside_dlist_begin_end() will, confusingly, return TRUE.
So we didn't set the ctx->ListState.Current.ShadeModel value and it
remained in its indeterminate state.

This didn't affect correctness, but it defeated the intended optimization
of dropping redundant glShadeModel() state changes in order to
coalesce sequences of drawing commands.

Verified with new piglit gl-1.0-dlist-shademodel test.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-02 09:03:14 -06:00
Adam Jackson
16296cc843 gallivm: Fix altivec intrinsics for 8xi16 add/sub
Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-05-02 10:34:08 -04:00
Lauri Kasanen
35c5b95b94 r600/sb: Fix build failure with non-standard libdrm installation prefix
Just like radeon/uvd, r600/sb fails to find the libdrm includes.

Signed-off-by: Lauri Kasanen <cand@gmx.com>
2013-05-02 14:57:00 +02:00
Lauri Kasanen
e2b985dc0f radeon/uvd: Fix build failure with non-standard libdrm installation prefix
Without this patch, radeon_uvd failed to find the libdrm includes:

In file included from radeon_uvd.c:48:
../../winsys/radeon/drm/radeon_winsys.h:44:35: error:
libdrm/radeon_surface.h: No such file or directory

Signed-off-by: Lauri Kasanen <cand@gmx.com>
2013-05-02 14:54:03 +02:00
Jordan Justen
02f2bce08d mesa: implement glFramebufferTexture
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-01 16:18:25 -07:00
Jordan Justen
5da8288911 mesa: add Layered field to framebuffers
When checking framebuffer completeness, we test each attachment.
We verify that all attachments are consistent in terms of layers.

1. They must all be layered, or all non-layered
2. If they are layered, they must match in depth
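
The two rules above can be sketched as a small consistency check. The struct and function names here are hypothetical stand-ins for illustration, not the fields or helpers this commit adds.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-attachment summary used only in this sketch. */
struct sketch_attachment {
   bool layered;    /* attached as a whole layered texture */
   unsigned depth;  /* number of layers when layered */
};

/* Returns true when all attachments agree on layering and layer count. */
static bool
sketch_layers_consistent(const struct sketch_attachment *att, size_t count)
{
   for (size_t i = 1; i < count; i++) {
      /* Rule 1: all layered, or all non-layered. */
      if (att[i].layered != att[0].layered)
         return false;
      /* Rule 2: layered attachments must match in depth. */
      if (att[0].layered && att[i].depth != att[0].depth)
         return false;
   }
   return true;
}
```

Comparing every attachment against the first one is enough, because both rules are equalities.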

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-01 15:31:48 -07:00
Jordan Justen
a62808085a mesa: add renderbuffer attachment Layered field
If glFramebufferTexture is used, then the framebuffer attachment is
layered.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-01 15:31:44 -07:00
Jordan Justen
a05e201d4a mesa: add renderbuffer Depth field
With glFramebufferTexture, a renderbuffer may support
all layers of the texture, so we need the depth of the
renderbuffer to check for consistency which is required
for framebuffer completeness.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-01 15:30:48 -07:00
Andreas Boll
b8e41db053 mesa: add usage examples to get-pick-list and shortlog scripts
NOTE: This is a candidate for the stable branches.
2013-05-01 21:42:02 +02:00
Andreas Boll
df01201132 docs: add info about bugzilla_mesa.sh script 2013-05-01 21:42:02 +02:00
Andreas Boll
ca79b72c00 mesa: Add a script to generate the list of fixed bugs
This list appears in the fixed bugs section of the release notes.

v2: Add usage examples

NOTE: This is a candidate for the stable branches.
2013-05-01 21:42:02 +02:00
Andreas Boll
f6aab27d43 scons: remove IN_DRI_DRIVER
Not used anymore.
2013-05-01 21:34:48 +02:00
Andreas Boll
be0fec4f5b build: remove unused API_DEFINES
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-01 21:34:48 +02:00
Brian Paul
7f8434b866 configure: remove IN_DRI_DRIVER
Not used anymore.

v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - split patch into two patches
    - remove more unused code

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-01 21:34:48 +02:00
Brian Paul
4ede5fb0c6 configure: remove FEATURE_GL/ES1/ES2
Not used anymore.

v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - split patch into two patches

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-01 21:34:48 +02:00
Andreas Boll
6b8f55c4da intel: use automake conditionals for defining FEATURE_{ES1,ES2}
Removes the need of API_DEFINES.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-01 21:34:48 +02:00
Andreas Boll
afa33a001a egl-static: use automake conditionals for defining FEATURE_{GL,ES1,ES2}
Removes the need of API_DEFINES.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-01 21:34:48 +02:00
Andreas Boll
3537d853d0 intel: remove executable bit from C file
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-01 21:34:48 +02:00
Brian Paul
aaab450d22 docs: s/Aprile/April/ 2013-05-01 13:17:21 -06:00
Andreas Boll
85e5bc106c docs: fix 9.1.2 release notes 2013-05-01 21:01:48 +02:00
Marek Olšák
8eef6ad2e2 vbo: fix possible use-after-free segfault after a VAO is deleted
This is like the fifth attempt to fix the issue.

Also with the new "validating" flag, we can set recalculate_inputs to FALSE
earlier in vbo_bind_arrays, because _mesa_update_state won't change it.

NOTE: This is a candidate for the stable branches.

v2: fixed a typo

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-01 20:08:53 +02:00
Kenneth Graunke
b5b6460c40 i965/vs: Fix textureGrad() with shadow samplers on Haswell.
The shadow comparator needs to be loaded into the Z component of the
last DWord.

Fixes es3conform's shadow_execution_vert and oglconform's
shadow-grad advanced.textureGrad.1D tests on Haswell.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-01 10:42:51 -07:00
Kenneth Graunke
e2f887b243 i965: Lower textureGrad() for samplerCubeShadow.
According to the Ivybridge PRM, Volume 4 Part 1, page 130, in the
section for the sample_d message: "The r coordinate contains the faceid,
and the r gradients are ignored by hardware."

This doesn't match GLSL, which provides gradients for all of the
coordinates.  So we would need to do some math to compute the face ID
before using sample_d.  We currently don't have any code to do that.

However, we do have a lowering pass that converts textureGrad to
textureLod, which solves this problem.  Since textureGrad on three
components is sufficiently obscure, it's not a performance path.

For now, only handle samplerCubeShadow; we need tests for samplerCube
and samplerCubeArray.

Fixes es3conform's shadow_comparison_frag test on Haswell.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-05-01 10:42:51 -07:00
Christian König
163b4da874 radeon/uvd: fix quant scan order for mpeg2
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-01 13:33:46 +02:00
Christian König
3aafe2437d st/vdpau: fix background handling in the mixer
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-01 13:33:46 +02:00
Christian König
7d2f2a0c89 vl/buffer: use 2D_ARRAY instead of 3D textures
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-01 13:33:46 +02:00
Christian König
e27f87b549 vl/compositor: cleanup background clearing
Add an extra parameter to specify if we should clear the render target.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-01 13:33:46 +02:00
Brian Paul
236ea7900f swrast: add casts for ImageSlices pointer arithmetic
MSVC doesn't like pointer arithmetic with void * so use GLubyte *.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-01 11:53:02 +01:00
Chia-I Wu
22c5e048bd ilo: fix PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS
On GEN7+, is->dev.has_gen7_sol_reset is required.
2013-05-01 17:41:39 +08:00
Chia-I Wu
16f81fcf1e ilo: enable SO support on GEN7 2013-05-01 17:36:44 +08:00
Chia-I Wu
d26f70e208 ilo: reset SO write offsets for new SO targets
When the SO targets are changed and no appending is requested, we need to send
SOL_RESET on GEN7+.
2013-05-01 17:36:44 +08:00
Chia-I Wu
68e1f76e46 ilo: correctly program SO states for GEN7
With the commands supported by GPE, we can finally program the states.
2013-05-01 17:36:44 +08:00
Chia-I Wu
9557cd39e2 ilo: implement GEN7 SO GPE functions
They were just stubs before.
2013-05-01 17:36:09 +08:00
Chia-I Wu
9069a3b065 ilo: add gen6_pipeline_update_max_svbi()
Move max_svbi calculation to a helper function and make it available for other
GENs.
2013-05-01 17:35:43 +08:00
Chia-I Wu
252a21c2cc ilo: expose register indices of OUTs in ilo_shader
pipe_stream_output_info tells us which of OUT[i] needs to be written out.
We need the info to map OUT[i] to VUE offset.
2013-05-01 17:34:49 +08:00
Chia-I Wu
440557db4e ilo: allow one-off flags to be specified for CP
It will be used for SOL_RESET on GEN7.
2013-05-01 16:03:44 +08:00
Chia-I Wu
dd62e7bc02 ilo: fix tiling/size for special-purpose resources
We do not allocate such resources yet though.
2013-05-01 12:00:32 +08:00
Chia-I Wu
7726e9500c ilo: use UMS layout for render targets
As we do not advertise MSAA support, this change should not make any
difference yet.
2013-05-01 11:56:43 +08:00
Chia-I Wu
334abed828 ilo: support and prefer compact array spacing
There is no reason to waste the memory when the HW can support compact array
spacing (ARYSPC_LOD0).
2013-05-01 11:31:15 +08:00
Chia-I Wu
ce188bb252 ilo: move device limits to ilo_dev_info or to GPEs
It seems a bit weird to have device limits in a context.
2013-05-01 11:23:11 +08:00
Chia-I Wu
bef98f9c3a ilo: use ilo_dev_info in toy compiler
We need only dev->gen, but it makes sense to expose other information to the
compiler.
2013-05-01 11:22:57 +08:00
Chia-I Wu
51d749e7e2 ilo: use ilo_dev_info in GPE and 3D pipeline
We need only dev->gen and dev->gt, but it makes sense to expose other
information to the pipeline.
2013-05-01 11:22:20 +08:00
Chia-I Wu
bb1f635dcc ilo: add ilo_dev_info shared by the screen and contexts
The struct is used to describe the device information, such as PCI ID, GEN,
GT, etc.
2013-05-01 11:20:41 +08:00
Chia-I Wu
355f3f7ab5 ilo: fix indentation of ilo_gpe_gen*.h 2013-05-01 11:20:32 +08:00
Kenneth Graunke
6c5cf8baa1 glsl: Ignore redundant prototypes after a function's been defined.
Consider the following shader:

    vec4 f(vec4 v) { return v; }
    vec4 f(vec4 v);

The prototype exactly matches the signature of the earlier definition,
so there's absolutely no point in it.  However, it doesn't appear to
be illegal.  The GLSL 4.30 specification offers two relevant quotes:

"If a function name is declared twice with the same parameter types,
 then the return types and all qualifiers must also match, and it is the
 same function being declared."

"User-defined functions can have multiple declarations, but only one
 definition."

In this case the same function was declared twice, and there's only one
definition, which fits both pieces of text.  There doesn't appear to be
any text saying late prototypes are illegal, so presumably it's valid.

Unfortunately, it currently triggers an assertion failure:
ir_dereference_variable @ <p1> specifies undeclared variable `v' @ <p2>

When we process the second line, we look for an existing exact match so
we can enforce the one-definition rule.  We then leave sig set to that
existing function, and hit sig->replace_parameters(&hir_parameters),
unfortunately nuking our existing definition's parameters (which have
actual dereferences) with the prototype's bogus unused parameters.

Simply bailing out and ignoring such late prototypes is the safest
thing to do.

Fixes Piglit's late-proto.vert as well as 3DMark/Ice Storm for Android.

NOTE: This is a candidate for stable branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2013-04-30 16:43:42 -07:00
Ian Romanick
abfe486b9e docs: Import 9.1.2 release notes, add news item.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-04-30 15:33:29 -07:00
Matt Turner
1b6281443d build: Remove libws_xlib.la from GALLIUM_PIPE_LOADER_LIBS.
The three users of GALLIUM_PIPE_LOADER_LIBS (OpenCL, gallium-gbm,
gallium tests) don't appear to need libws_xlib.la.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 14:03:32 -07:00
Matt Turner
460996b937 build: Remove libpipe_loader.la from GALLIUM_PIPE_LOADER_LIBS.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 14:03:32 -07:00
Matt Turner
538e10f3ea build: Remove HAVE_PIPE_LOADER_SW.
It guarded the function prototype of pipe_loader_sw_probe, whose use (in
pipe_loader.c) and definition (in pipe_loader_sw.c) were not guarded.
Both are built into libpipe_loader.la if HAVE_LOADER_GALLIUM, which is
enable_gallium_loader in configure.ac.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 14:03:32 -07:00
Matt Turner
ea6caf4cdf build: Remove libws_null.la from GALLIUM_PIPE_LOADER_LIBS.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 14:03:32 -07:00
Matt Turner
242809942f build: Rename PIPE_LOADER_HAVE_XCB to HAVE_PIPE_LOADER_XCB.
For consistency, since we already have HAVE_PIPE_LOADER_{SW,DRM}.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 14:03:32 -07:00
Matt Turner
657cfe6252 configure.ac: Remove unused HAVE_PIPE_LOADER_XLIB macro.
Added in e1364530 but never used.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 14:03:31 -07:00
Paul Berry
bdf13dc832 i965: Stop passing num_samples to intel_miptree_alloc_hiz().
The number of samples is already available in the miptree data
structure, so there's no need to pass it in.

I suspect this may fix a subtle bug because in one case
(intel_renderbuffer_update_wrapper) we were always passing zero for
num_samples, even though the buffer in question was not guaranteed to
be single-sampled.  But I wasn't able to find a failing test case.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-30 13:46:57 -07:00
Zack Rusin
d48054ff22 draw: don't crash if GS doesn't emit anything
Technically it's legal for a geometry shader to not emit any
vertices. It's silly, but perfectly legal, so let's make draw
stop crashing if it happens.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-27 17:28:04 -04:00
Eric Anholt
e56095dc2e i965: Implement color clears using a simple shader in blorp.
The upside is less CPU overhead in fiddling with GL error handling, the
ability to use the constant color write message in most cases, and no GLSL
clear shaders appearing in MESA_GLSL=dump output.  The downside is more
batch flushing and a total recompute of GL state at the end of blorp.
However, if we're ever going to use the fast color clear feature of CMS
surfaces, we'll need this anyway since it requires very special state
setup.

This increases the fail rate of some of the GLES3conform ARB_sync tests,
because of the initial flush at the start of blorp.  The tests already
intermittently failed (because it's just a bad testing procedure), and we
can return it to its previous fail rate by fixing the initial flush.

Improves GLB2.7 performance 0.37% +/- 0.11% (n=71/70, outlier removed).

v2: Rename the key member, use the core helper for sRGB, and use
    BRW_MASK_* enums, fix comment and indentation (review by Paul).
v3: Rewrite a comment, drop a silly temporary variable (review by Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-30 11:59:23 -07:00
Eric Anholt
e34c857639 mesa: Make a Mesa core function for sRGB render encoding handling.
v2: const-qualify ctx, and add a comment about the function (recommended
    by Brian and Kenneth).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-04-30 11:59:23 -07:00
Eric Anholt
db31bc5cfb i965: Don't flush the batch at the end of blorp.
Improves GLB2.7 performance 0.13% +/- 0.09% (n=104/105, outliers removed).
More importantly, once color glClear()s are done through blorp in the next
commit, this reduces regression in GLES3 conformance tests that rely on
queueing up many glClear()s and having the GPU report being still busy in
an ARB_sync query after that.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-30 11:59:23 -07:00
Vadim Girlin
fb1eed9ec5 r600g/sb: remove unused code
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:48 +04:00
Vadim Girlin
3f18dd818f r600g/sb: collect shader statistics
Collects various statistical information for each shader
and total stats for contexts.

Printed with R600_DEBUG=sb,sbstat

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:48 +04:00
Vadim Girlin
6ba7a162b6 r600g/sb: don't propagate dead values in GVN pass
In some cases we use value::gvn_source field to link values that
are known to be equal before gvn pass (e.g. results of DOT4 in different
slots of the same alu group), but then source value may become dead later
and this confuses further passes.

This patch resets value::gvn_source to NULL in the dce_cleanup pass
if it points to dead value.

Fixes segfault during shader optimization with ETQW.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:48 +04:00
Vadim Girlin
3e476c311f r600g/sb: use simple heuristic to limit register pressure
It's not a complete register pressure tracking, yet it helps to prevent
register allocation problems in some cases where they were observed.

The problems are uncovered by false dependencies between fetch instructions
introduced by some recent changes in TGSI and/or default backend.
Sometimes we have code like this:

...
SAMPLE R5.xyzw, R5.xyzw
... store R5.xyzw somewhere
MOV R5.x, <next x coord>
MOV R5.y, <next y coord>
SAMPLE R5.xyzw, R5.xyzw
... <may be repeated a lot of times>

With 2D resources, z and w in SAMPLE src reg aren't used and can be simply
masked, but shader backend doesn't have this information, so it's
considered as data dependency by optimization algorithms.
2013-04-30 21:50:48 +04:00
Vadim Girlin
6d6c8c88a3 r600g/sb: improve error checking in ra_coalesce pass 2013-04-30 21:50:47 +04:00
Vadim Girlin
188c893e65 r600g/sb: use source bytecode in case of optimization errors 2013-04-30 21:50:47 +04:00
Vadim Girlin
ad1df471d0 r600g: plug in optimizing backend
Optimization is enabled with "R600_DEBUG=sb".

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:47 +04:00
Vadim Girlin
2cd7691793 r600g/sb: initial commit of the optimizing shader backend 2013-04-30 21:50:47 +04:00
Vadim Girlin
fbb065d629 r600g: use enum type for domains field in struct r600_resource
This prevents the problems when the header is included in C++ code.
2013-04-30 21:50:47 +04:00
Vadim Girlin
d5b30fd036 r600g: add new flags to isa instruction tables 2013-04-30 21:50:47 +04:00
Vadim Girlin
a919424215 r600g: always create reverse lookup isa tables 2013-04-30 21:50:47 +04:00
Vadim Girlin
7d555f2f4c r600g: mask unused source components for SAMPLE
This results in more clean shader code and may improve the quality of
optimized code produced by r600-sb due to eliminated false dependencies
in some cases.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:47 +04:00
Eric Anholt
df410863d7 intel: Remove the last spans code!
The remaining bits happen to do nothing that
_swrast_span_render_start()/finish() don't do.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:45 -07:00
Eric Anholt
526cf46666 intel: Move the S8 offset calc function near its remaining usage.
It's not really span code ever since we stopped using spans for S8.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:45 -07:00
Eric Anholt
e7c5e9949b intel: Ensure renderbuffers are current when mapping them.
In the case of rendering to windows in X, we would render to stale buffers
(or not render at all!) if you hit a MapRenderbuffer as the first thing
done to your window after new buffers are ready to be collected in DRI2.

I think this also covers the weird comment about irb->mt being missing
sometimes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:45 -07:00
Eric Anholt
0e8ef74c5f mesa: Add a clarifying comment about rowStride of compressed textures.
I always forget how we do this for compressed textures.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:45 -07:00
Eric Anholt
3750ff9e5f mesa: Remove the Map field from texture images.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
adf958d9c2 swrast: Always use MapTextureImage for mapping textures for swrast.
Now that everything goes through ImageSlices[], we can rely on the
driver's existing texture mapping function.

A big block of code goes away on Radeon that looks like it was to deal with
the validate that happened at SpanRenderStart, which no longer occurs since we
don't need validation for the MapTextureImage hook.

v2: Rewrite comment about ImageSlices, fix duplicated swImages, touch up
    unmap loop.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
ea05e259c9 nouveau: Replace swrast_texture_image->Map usage with ->Buffer.
This code is trying to deal with providing a map in the case that
AllocTexImageBuffer was called, which is hooked up to the swrast variant.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
b78e48289f nouveau: Just use MapTextureImage instead of duplicating the logic.
MapTextureImage has the exact same logic, except it can also handle
swrast-allocated buffers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
f91823f026 swrast: Make a teximage's stored RowStride be in terms of bytes per row.
For hardware drivers with pitch alignment requirements, a
non-power-of-two-sized texture format won't end up being an integer number
of pixels per row.  Also, avoids having to change our units between
MapTextureImage's rowStride and swrast's RowStride.
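
To illustrate why a stride counted in pixels breaks down, here is a minimal
sketch. It assumes a hypothetical 64-byte pitch alignment and a 3-byte texel;
align_pitch() and texel_offset() are made-up names for this example, not Mesa
functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round a row pitch in bytes up to a power-of-two
 * alignment, as a hardware driver with pitch requirements might. */
static uint32_t align_pitch(uint32_t bytes, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (bytes + alignment - 1) & ~(alignment - 1);
}

/* Byte offset of texel (x, y) when the row stride is stored in bytes.
 * No division by the texel size is needed, so a stride that is not a
 * whole number of texels is still represented exactly. */
static uint32_t texel_offset(uint32_t row_stride_bytes,
                             uint32_t bytes_per_texel,
                             uint32_t x, uint32_t y)
{
    return y * row_stride_bytes + x * bytes_per_texel;
}
```

With a 10-texel row of 3-byte texels, the 30-byte row is padded to 64 bytes,
and 64 is not a multiple of 3, so no integer pixel count describes that stride.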

This doesn't fully convert the compressed texel fetch path, but does make
sure we don't drop any bits (not that we'd expect to).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
35e179b18c swrast: Replace use of teximage Map in 1D/2D paths with ImageSlices[0].
This gets us ready for the Map field to die.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
0c883e46d8 swrast: Replace ImageOffsets with an ImageSlices pointer.
This is a step toward allowing drivers to use their normal mapping paths,
instead of requiring that all slice mappings come from an aligned offset
from the first slice's map.

This incidentally fixes missing slice handling in FXT1 swrast.

v2: Use slice height helper function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:43 -07:00
Eric Anholt
e7ecc11311 swrast: Reuse _swrast_free_texture_image_buffer from drivers.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:43 -07:00
Eric Anholt
0a484f1006 swrast: Move ImageOffsets allocation to shared code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:43 -07:00
Eric Anholt
f709c31c67 swrast: Clean up and explain the mapping process.
v2: Move slice height calculation to a helper function (recommended by Brian).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:43 -07:00
Eric Anholt
741e540055 swrast: Factor out texture slice counting.
This function is going to get used a lot more in upcoming patches.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:42 -07:00
Eric Anholt
dca4178130 radeon: Remove some dead teximage mapping code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:42 -07:00
Eric Anholt
0de08fb594 radeon: Add missing swrast field initialization.
This is the equivalent of intel's
80513ec8b4.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:42 -07:00
Vincent Lejeune
a6a4b70e2d r600g/llvm: Fix opencl build 2013-04-30 16:38:47 +02:00
Alexander von Gluck IV
f1361ed084 Gallium: Use mmap on Haiku for executable memory vs malloc
* Haiku now has DEP enabled by default.
2013-04-29 23:22:35 -05:00
Alexander von Gluck IV
60cc73c333 Mapi: Use mmap on Haiku for executable memory vs malloc
* Haiku now has DEP enabled by default.
2013-04-29 23:22:35 -05:00
Alexander von Gluck IV
39bdf08628 Mesa: Use mmap on Haiku for executable memory vs malloc
* Haiku now has DEP enabled by default.
2013-04-29 23:22:35 -05:00
Vincent Lejeune
51e9bfdc48 r600g/llvm: get use_kill from compiler shader 2013-04-30 02:17:18 +02:00
Eric Anholt
a79786af64 i965/fs: Print out the estimated cycle count in INTEL_DEBUG=wm
This could be used by shader-db for hopefully more accurate regression
testing.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-29 11:44:35 -07:00
Eric Anholt
61ca2c4f73 i965/fs: Allow LRPs with uniform registers.
Improves GLB2.7 performance on my HSW by 0.671455% +/- 0.225037% (n=62).

v2: Make is_valid_3src() a method of the fs_reg. (recommended by Ken)

Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-04-29 11:41:35 -07:00
Eric Anholt
de7e8b1d01 intel: Be more conservative in disabling tiling to save memory.
Improves GLB2.7 trex performance 1.01985% +/- 0.721366% on my IVB (n=10)
and by 3.38771% +/- 0.584241% (n=15) on my HSW, due to a 32x32 ARGB8888
cubemap going from untiled to tiled.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-29 11:41:34 -07:00
Eric Anholt
73bc6061f5 i965: Disable Z16 on contexts that don't require it.
It appears that Z16 on Intel hardware is in fact slower than Z24, so
people are getting surprisingly hurt when trying to use Z16 as a
performance-versus-precision tradeoff, or when they're targeting GLES2 and
that's all you get.

GL 3.0+ have Z16 on the list of required exact format sizes, but GLES
doesn't, so choose the better-performing layout in that case.  Improves
GLB 2.7 trex performance at 1920x1080 by 10.7% +/- 1.1% (n=3) on my IVB
system.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-29 11:41:34 -07:00
Eric Anholt
e409889213 intel: Report FBO incompleteness causes through GL_ARB_debug_output.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-29 11:41:34 -07:00
Eric Anholt
6ae473221a intel: Fold the one last function intel_tex_format.c into the caller.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-29 11:41:34 -07:00
Eric Anholt
40b207b62f mesa: Fix error checking for GS UBO getters.
These are supposed to be present if both things are available, but we were
enabling them if either one was.
2013-04-29 11:41:34 -07:00
Eric Anholt
072709da91 mesa: Add a clarifying comment about EXTRA_ error checking. 2013-04-29 11:41:34 -07:00
Eric Anholt
eac1199604 mesa: Add an extra clarifying set of braces to getter checking.
For this multi-page single statement, my thought at the end was that the
next block was mis-indented, rather than that the dropped indentation
actually indicated the end of the loop.
2013-04-29 11:41:33 -07:00
Eric Anholt
2534f0a57d mesa: Fix error checking for getters consisting of only API versions.
In almost all of our cases, getters that are turned on for only some API
variants will have an extension listed as one of the things that can
enable it, and thus api_check gets set.  For extra_gl30_es3 (used for
NUM_EXTENSIONS, MAJOR_VERSION, MINOR_VERSION) on a GL 2.1 context, though,
we would check twice, not find either one, but never actually throw the
error.
2013-04-29 11:41:33 -07:00
Eric Anholt
d63a10afcc mesa: Clarify the names of error checking variables for glGet.
There's no reason to actually count these things, so the integer ++
behavior was just confusing.
2013-04-29 11:41:33 -07:00
Eric Anholt
4df1b986d3 i915: Add support for GL_EXT_texture_sRGB and GL_EXT_texture_sRGB_decode.
This brings the driver up to GL 2.1.
2013-04-29 11:41:33 -07:00
Eric Anholt
97217a40f9 i915: Always enable GL 2.0 support.
There's no point in shipping a non-GL2 driver today.
2013-04-29 11:41:33 -07:00
Eric Anholt
eb062ab07f i915: Correctly set the OQ counter bits.
While we may provide the extension, we need to tell applications that they
can't actually use it:

            An implementation can either set QUERY_COUNTER_BITS_ARB to the
            value 0, or to some number greater than or equal to n.  If an
            implementation returns 0 for QUERY_COUNTER_BITS_ARB, then the
            occlusion queries will always return that zero samples passed the
            occlusion test, and so an application should not use occlusion
            queries on that implementation.
2013-04-29 11:41:33 -07:00
Kenneth Graunke
5e46482993 i965: Move is_math/is_tex/is_control_flow() to backend_instruction.
These are entirely based on the opcode, which is available in
backend_instruction.  It makes sense to only implement them in one
place.

This changes the VS implementation of is_tex() slightly, which now
accepts FS_OPCODE_TXB and SHADER_OPCODE_LOD.  However, since those
aren't generated in the VS anyway, it should be fine.

This also makes is_control_flow() available in the VS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-29 11:10:50 -07:00
Zack Rusin
a6e7c22664 draw/so: fix overflow calculation
only report overflow for missing targets if they're actually being
used. if the targets are missing but are not being used by any
slot in the stream output declaration we should correctly just
ignore them.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 03:48:36 -04:00
José Fonseca
220ef8295c llvmpipe: Fix queries when screen->num_threads == 0.
That is, when llvmpipe is run in single-threaded mode.

Trivial.

Tested with

  LP_NUM_THREADS=0 glean --run results --overwrite --quick --tests occluQry
2013-04-29 15:40:06 +01:00
José Fonseca
c4bea00fb3 Revert "st/mesa: add a simple path to BufferData if it only discards buffer contents"
This reverts commit 5649f886f7.

It causes segfaults when size is zero.
2013-04-29 15:13:57 +01:00
Jerome Glisse
c7a13dc5f5 r600g: force full cache for hyperz
It seems that in some cases allowing half cache usage confuses the GPU
and triggers a lockup. Force full cache use.

Should fix :
https://bugs.freedesktop.org/show_bug.cgi?id=59592
https://bugs.freedesktop.org/show_bug.cgi?id=60848
https://bugs.freedesktop.org/show_bug.cgi?id=60969
https://bugs.freedesktop.org/show_bug.cgi?id=61747
https://bugs.freedesktop.org/show_bug.cgi?id=62466
https://bugs.freedesktop.org/show_bug.cgi?id=62669
https://bugs.freedesktop.org/show_bug.cgi?id=62721
https://bugs.freedesktop.org/show_bug.cgi?id=63124

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-04-29 10:06:29 -04:00
Rob Clark
3900a0e4df freedreno: fix rebase screw-up
Add back 2nd arg to emit_vertexbufs() which got lost in rebase.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-29 07:36:27 -04:00
Chris Forbes
79f786f936 i965/fs: Don't try to use bogus interpolation modes pre-Gen6.
Interpolation modes other than perspective-barycentric-pixel-center (and
their associated coefficients in the WM payload) only exist in Gen6 and
later.

Unfortunately, if a varying was declared as `centroid`, we would blindly
read the nonexistent values, and so produce all manner of bad behavior
-- texture swimming, snow, etc.

Fixes rendering in Counter-Strike Source and Team Fortress 2 on
Ironlake.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-30 06:50:16 +12:00
Matt Turner
a8eed0299d i965/vs: Fix order of source arguments to LRP.
The order of arguments matches DirectX, and is backwards from GLSL's
mix() built-in.
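
A minimal sketch of the operand-order difference, in plain C with scalar
floats. The function names are hypothetical and only model the arithmetic:
glsl_mix() follows the GLSL definition x*(1-a) + y*a, and lrp() follows the
DirectX-style order where the interpolant is the first source.

```c
#include <assert.h>

/* GLSL-style mix(): blends from x to y as a goes from 0 to 1. */
static float glsl_mix(float x, float y, float a)
{
    return x * (1.0f - a) + y * a;
}

/* LRP with DirectX-style operands: src0 is the interpolant,
 * result = src0 * src1 + (1 - src0) * src2. */
static float lrp(float src0, float src1, float src2)
{
    return src0 * src1 + (1.0f - src0) * src2;
}

/* Self-check: mix(x, y, a) corresponds to lrp(a, y, x), that is, the
 * sources are passed in reverse order. */
static void check_lrp_matches_mix(float x, float y, float a)
{
    float diff = lrp(a, y, x) - glsl_mix(x, y, a);
    assert(diff > -1e-6f && diff < 1e-6f);
}
```

Passing the mix() arguments to lrp() unreversed gives a different result in
general.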

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63983
2013-04-28 14:38:14 -07:00
Zack Rusin
3bba787879 llvmpipe: stop crashing when one of the so targets is null
Fixes a crash when one of the so targets is null.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 01:19:12 -04:00
Zack Rusin
0031cde1e1 draw/so: indicate overflow when buffer is missing
We were crashing if one of the buffers wasn't set, we should
just treat it as an overflow. It's useful when using so
statistics because it allows one to figure out how much data
would be generated by so without actually writing any of it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 01:19:07 -04:00
Zack Rusin
f9f57312de gallivm: fix indirect addressing of temps in soa mode
we weren't adding the soa offsets when constructing the indices
for the gather functions. That meant that we were always returning
the data in the first vertex/primitive/pixel in the SoA structure
and not correctly fetching from all structures.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 01:18:51 -04:00
Zack Rusin
3093ac6f4f tgsi/ureg: Add a function to return the number of outputs
We already hold the variable, just weren't providing access
to it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-26 23:05:45 -04:00
Zack Rusin
53d36d5fb0 draw/so: Fix overflow calculations
We weren't taking the buffer offset, destination offset or the
stride into consideration so we were frequently writing into
an overflown buffer.
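
A rough sketch of an overflow test that accounts for all three quantities.
The struct and function below are hypothetical and are not the draw module's
actual code; the sketch also assumes each output fits within one vertex's
stride.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one stream-output write; not a Mesa struct. */
struct so_write {
    size_t buffer_size;   /* total size of the target buffer, in bytes */
    size_t buffer_offset; /* where stream output starts inside the buffer */
    size_t stride;        /* bytes consumed per emitted vertex */
    size_t dst_offset;    /* offset of this output within one vertex */
    size_t num_bytes;     /* bytes written by this output */
};

/* Returns true if writing this output for vertex number `vertex`
 * would run past the end of the buffer. */
static bool so_write_overflows(const struct so_write *w, size_t vertex)
{
    size_t start = w->buffer_offset + vertex * w->stride + w->dst_offset;

    /* Assumption of this sketch: an output never spills past its stride. */
    assert(w->dst_offset + w->num_bytes <= w->stride);
    return start + w->num_bytes > w->buffer_size;
}
```

For a 64-byte buffer with a 16-byte offset and 16-byte stride, only three
vertices fit; ignoring the buffer offset would wrongly admit a fourth.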

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 23:04:26 -04:00
Zack Rusin
d996622cfa draw/llvm: fix viewport transformations
This was a very serious bug. We were always doing the viewport
transformations on the first output of the vertex shader. That means
that every application that was storing position in anything but
OUT[0] was outputting untransformed vertices and had broken output
for whatever it was storing at OUT[0]. Correctly take into
consideration where the vertex position is actually stored.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 23:01:46 -04:00
Zack Rusin
5d9ef5b365 gallium: increase the number of available stream output decls
There can be more stream output decls than shader outputs because
individual components from them can be split and distributed
among different so buffers.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 23:01:23 -04:00
Zack Rusin
562835bcdf llvmpipe: implement so_overflow query
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 22:58:54 -04:00
Brian Paul
49dda2d92f mesa: fix the compressed TexSubImage size checking code
Before, we'd incorrectly generate an error if we tried to
replace a non-4x4 block near the edge of a NPOT compressed texture.
For example, if the dest image was 15 texels wide and xoffset=12
and width=3 we'd incorrectly generate GL_INVALID_OPERATION.
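
The intended rule can be sketched for one dimension as follows. This is an
illustrative helper with a hypothetical name, not the code in this patch: the
offset must sit on a block boundary, and a size that is not a whole number of
blocks is only accepted when the region ends exactly at the image edge.

```c
#include <assert.h>
#include <stdbool.h>

/* Check one dimension (x or y) of a compressed sub-image update.
 * block_size is 4 for the 4x4 block formats discussed here. */
static bool subimage_dim_ok(int offset, int size, int image_size,
                            int block_size)
{
    assert(block_size > 0);

    /* The region must lie inside the image. */
    if (offset < 0 || size < 0 || offset + size > image_size)
        return false;

    /* The update must start on a block boundary. */
    if (offset % block_size != 0)
        return false;

    /* A partial block is only acceptable when it touches the image edge. */
    if (size % block_size != 0 && offset + size != image_size)
        return false;

    return true;
}
```

With image_size=15, offset=12 and size=3 the check passes, while size=2 at
the same offset is rejected because it stops short of the edge.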

Verified with new tests added to piglit s3tc-errors test.

Note: This is a candidate for the stable branches.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 16:22:30 -06:00
Brian Paul
ff74cf62b1 llvmpipe: replace LP_MAX_THREADS with screen->num_threads in query code
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 16:22:24 -06:00
Brian Paul
38a751cbe8 llvmpipe: bump LP_MAX_THREADS to 16
On the mesa-users list, Burlen Loring reported a speed-up with 16 cores
and his test/app.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 16:22:12 -06:00
Brian Paul
8fbc36ff48 mesa: updated read_buffer_enum_to_index() comment
Remove the part about the value of gl_framebuffer::Name.
2013-04-26 08:30:25 -06:00
Christian König
e3ac293daa r600/uvd: stop advertising MPEG4 on UVD 2.x chips v2
That is just not supported by the hardware.

v2: fix compare

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-26 15:35:36 +02:00
Christian König
2c2c54b819 radeon/uvd: stop using anonymous unions
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-04-26 15:35:36 +02:00
Tapani Pälli
12b0bfa6e9 mesa: fix type comparison errors in sub-texture error checking code
patch fixes a crash that happens if glTexSubImage2D is called with a
negative xoffset.

NOTE: This is a candidate for stable branches.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-26 06:47:44 -06:00
José Fonseca
c5e8573762 Revert "draw: Yield zeros for LLVM fetches of non-existing vertex elements."
After more thought/discussion, it seems it is better to handle this sort
of stuff in the state tracker.

So this reverts commit 12096f334b, except the
variant->key -> key shorthands.
2013-04-26 12:15:39 +01:00
Chia-I Wu
5816a471af ilo: add the driver to the build system
Add ilo to targets/egl-static and add a new target dri-ilo.  Update autoconf
and automake rules.
2013-04-26 16:20:52 +08:00
Chia-I Wu
825aa60707 ilo: compile VS/GS/FS with the toy compiler 2013-04-26 16:20:52 +08:00
Chia-I Wu
7118ff8bb0 ilo: add a toy shader compiler
This is a simple shader compiler that performs almost zero optimizations.  The
generated code is usually much larger compared to that generated by i965.
The generated code also requires many more registers.

Function-wise, it lacks register spilling and does not support most TGSI
indirections.  Other than those, it works alright.
2013-04-26 16:20:52 +08:00
Chia-I Wu
0fa2d0e98a ilo: hook up pipe context GPGPU functions
This just adds a stub.
2013-04-26 16:16:43 +08:00
Chia-I Wu
cf8f3dd373 ilo: hook up pipe context video functions
This just hooks them up with auxiliary/vl layer.
2013-04-26 16:16:43 +08:00
Chia-I Wu
12dd397d0c ilo: add support for time/occlusion/primitive queries 2013-04-26 16:16:43 +08:00
Chia-I Wu
e6186b0769 ilo: hook up pipe context 3D functions 2013-04-26 16:16:43 +08:00
Chia-I Wu
5b310f6230 ilo: add GEN7 support for 3D pipeline 2013-04-26 16:16:43 +08:00
Chia-I Wu
91ce766c35 ilo: add 3D pipeline for GEN6
The 3D pipeline is a high-level interface to emit 3D commands and states.  It
uses GEN6 GPE to do the real work.
2013-04-26 16:16:43 +08:00
Chia-I Wu
67233b56d6 ilo: add GEN7 GPE 2013-04-26 16:16:43 +08:00
Chia-I Wu
d3602dfac6 ilo: add GEN6 GPE
GEN6 GPE (Graphics Processing Engine) is a low-level interface to emit 3D
commands and states.
2013-04-26 16:16:43 +08:00
Chia-I Wu
72357cf3bb ilo: hook up pipe context query functions
None of the query types are supported yet.
2013-04-26 16:16:43 +08:00
Chia-I Wu
8f949bc1da ilo: hook up pipe context transfer functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
0754ff33e3 ilo: hook up pipe context blit functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
89d1702b9b ilo: hook up pipe context state functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
520af66797 ilo: add functions to manage shaders
This commit adds the shader cache, shader state, shader variants, etc.  It does
not add the shader compiler though.
2013-04-26 16:16:42 +08:00
Chia-I Wu
86940bf41c ilo: hook up pipe context flush function 2013-04-26 16:16:42 +08:00
Chia-I Wu
eed1e5a407 ilo: add command parser
The command parser manages batch buffers and command submissions.
2013-04-26 16:16:42 +08:00
Chia-I Wu
3a4a570c34 ilo: hook up pipe screen resource functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
b50e68cb67 ilo: hook up pipe screen format functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
babb2b5c50 ilo: hook up pipe_screen param and fence functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
e74d67738d ilo: add debug flags settable through ILO_DEBUG 2013-04-26 16:16:42 +08:00
Chia-I Wu
63b5720105 ilo: new pipe driver for Intel GEN6+
This commit adds some boilerplate code.  The header files found under include/
are copied from i965.
2013-04-26 16:16:41 +08:00
Chia-I Wu
380e6875b8 winsys/intel: new winsys for intel
This is a wrapper for libdrm_intel to allow the pipe driver to stay OS
agnostic.
2013-04-26 15:49:00 +08:00
José Fonseca
542c5b3703 gallivm: Fix trivial out-of-bounds indirection in lp_build_cube_lookup().
Courtesy of clang:

  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1483:10: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
           tmp[2] = lp_build_swizzle_aos(coord_bld, ddx_ddy[1], swizzle02);
           ^   ~
  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
           LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
           ^
  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1487:56: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
              rho_vec = lp_build_add(coord_bld, rho_vec, tmp[2]);
                                                       ^   ~
  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
           LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
           ^
  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1491:56: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
              rho_vec = lp_build_max(coord_bld, rho_vec, tmp[2]);
                                                       ^   ~
  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
           LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
           ^
2013-04-26 08:44:37 +01:00
Matt Turner
0c1d87b0d7 i965/vs: Add support for LRP instruction.
Only 13 affected programs in shader-db, but they were all helped.

total instructions in shared programs: 368877 -> 368851 (-0.01%)
instructions in affected programs:     1576 -> 1550 (-1.65%)

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-25 18:27:39 -07:00
Matt Turner
c0f67a127b i965/vs: Add a function to fix-up uniform arguments for 3-src insts.
Three-source instructions have a vertical stride overloaded to 4, which
prevents directly using vec4 uniforms as arguments. Instead we need to
insert a MOV instruction to do the replication for the three-source
instruction.

With this in place, we can use three-source instructions in the vertex
shader. While some thought needs to go into deciding whether it's better
to use a three-source instruction rather than a sequence of equivalent
instructions (when one or more sources are uniforms or immediates), this
will allow us to skip a lot of ugly lowering code and use the BFE and
BFI2 instructions directly.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-25 18:27:39 -07:00
Jerome Glisse
abb96fdea7 winsys/radeon: consolidate tracing into winsys v2
This moves the tracing timeout and printing into winsys and adds
a debug environment variable for it (R600_DEBUG=trace_cs).

Lots of files touched because of winsys API changes.

v2: Do not write lockup file if ib uniq id does not match last one

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-25 18:36:31 -04:00
Tom Stellard
53fbae7eac r600g/compute: Removed unused and untested code
There was a lot of code in evergreen_compute_internal.c that was not
being used at all and most of it was duplicating code from other parts
of the driver.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-25 13:32:22 -07:00
Tom Stellard
f986087d5c r600g/compute: Use a constant buffer to store kernel parameters v2
v2:
  - Fix usage of set_constant_buffer()
  - Fix typo in comment

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-25 13:32:17 -07:00
Tom Stellard
ffadc71afb r600g: Add evergreen_emit_cs_constant_buffers() v2
v2:
  - Bump R600_NUM_ATOMS

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-25 13:25:00 -07:00
Tom Stellard
83a00a1de8 r600g/compute: Don't use radeon_winsys::buffer_wait() after dispatching a kernel
The state tracker should be responsible for waiting for the kernel to
finish.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-25 13:24:51 -07:00
Tom Stellard
09e47f7a25 r600g/compute: Fix input buffer size calculation
Buffer size should be in bytes not dwords.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-25 13:24:24 -07:00
Adam Jackson
904b03824b linux: Don't emit a .note.ABI-tag section anymore (#26663)
We don't support pre-2.6 kernels anyway - the install docs say 2.6.28
for DRI - and apparently this confuses ld.so's sorting when multiple
libGLs are installed.  Just remove it.

Note: this is a candidate for the stable branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-04-25 15:51:35 -04:00
Rob Clark
73de07cbbc freedreno: use writecombine buffers
Better than uncached for writes, which are common for vertex buffer
upload, etc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-25 15:10:56 -04:00
Rob Clark
f706d4d340 freedreno: don't patch and re-emit same shader as much
New textures or vertex buffers don't always require patching and
re-emitting the shaders.  So do a better job of figuring out when we
actually have to patch the shader.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-25 15:10:56 -04:00
Eric Anholt
578987ce1c i965: Avoid recompiles for fragment clamping on non-clamping APIs.
Removes 75/78 state-dependent recompiles in GLB2.7 (the remaining 3 are
due to FBO-rendering size predictions).  We currently expose
GL_ARB_color_buffer_float on GL core, so we may mis-predict there, but I'm
about to send a patch for removing that silly extension in that case.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-25 12:03:00 -07:00
Alex Deucher
b5145ca2a8 radeonsi: add new SI pci ids
Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-25 14:22:46 -04:00
Alex Deucher
b3a856dfa9 r600g: add new richland pci ids
Note: this is a candidate for the stable branches.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-25 14:21:15 -04:00
José Fonseca
12096f334b draw: Yield zeros for LLVM fetches of non-existing vertex elements.
If a bug in an app/state-tracker causes vertex buffer to fetch vertex
elements that are not bound, simply return zeros instead of crashing.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-25 16:16:21 +01:00
José Fonseca
28e6a272fc trace: Only close trace files on exit.
Many applications don't exit cleanly, others may create and destroy a
screen multiple times, so we only write </trace> tag and close at exit
time.
2013-04-25 14:18:33 +01:00
José Fonseca
74d1153c9c graw: Set the vertex shader constant buffer.
We were setting the fragment shader, which wasn't needed.
2013-04-25 14:06:50 +01:00
José Fonseca
e88a1dba09 graw: Simple utilities to dump and disassemble TGSI tokens.
Useful for core dumps, where calling tgsi_dump() from gdb is not an
alternative.
2013-04-25 13:03:06 +01:00
José Fonseca
1687932d2b scons: Support clang.
clang supports most gcc options / extensions, with some exceptions.

The biggest advantage of using clang is that compilation times are much
shorter.

One can tell scons to use clang when building by invoking it as

   CC=clang CXX=clang++ scons libgl-xlib
2013-04-25 11:59:01 +01:00
José Fonseca
f0c296773d util/u_sse: Fix _mm_shuffle_epi8 prototype for clang.
Clang does not support __artificial__. Instead match precisely what's
in the clang headers.
2013-04-25 11:59:01 +01:00
José Fonseca
45a60e2e7a scons: Remove redundant code.
-fvisibility=hidden is already elsewhere for the whole tree.
2013-04-25 11:59:01 +01:00
Chris Forbes
8fd0190278 mesa: fix bogus comment about PrimitiveRestart fields
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-25 20:49:25 +12:00
Chris Forbes
447bf1fb52 i965: report correct sample positions
From low to high bits, the sample positions are packed y0,x0,y1,x1...
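
As a sketch of that packing: assuming each coordinate is a 4-bit field (an
assumption of this example, as are the struct and function names), sample N
occupies byte N of the packed word, with y in the low nibble and x in the high
nibble.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical unpacked sample position, in 4-bit sub-pixel units. */
struct sample_pos {
    unsigned x;
    unsigned y;
};

/* Extract sample `sample` from a word packed as y0,x0,y1,x1,... from
 * the low bits upward, assuming 4 bits per coordinate. */
static struct sample_pos unpack_sample_pos(uint32_t packed, unsigned sample)
{
    struct sample_pos pos;

    assert(sample < 4); /* a 32-bit word holds four x/y pairs */
    pos.y = (packed >> (8 * sample)) & 0xf;
    pos.x = (packed >> (8 * sample + 4)) & 0xf;
    return pos;
}
```

Reading x from the low nibble instead would swap the two coordinates, which
is the kind of mistake the ordering above rules out.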

Fixes arb_texture_multisample-sample-position piglit.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-04-25 20:47:54 +12:00
Rob Clark
49a7624973 freedreno: fix bogus IMM const reg index
We were assigning incorrect const register for immediates, and
potentially writing immediate const to the wrong location.  This fixes
an incorrect-rendering bug with xonotic.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
9495ee12c6 freedreno: clear fixes and debugging
Set a few extra registers to make sure we are in proper state for
clearing.  And also add some debug options to mark all state dirty in
clear and gmem operations to aid in debugging.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
d5d6ec8843 freedreno: fix texture fetch type
There is a bit we need to set for 2D vs 3D fetch, to tell the hw whether
there are two or three valid input components.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
d086bb22bc freedreno: fix temp register usage
The previous approach of using the dst register as an intermediate
temporary doesn't work in a lot of cases.  For example, if the dst
register is the same as one of the src registers.

For now, just simplify it and always allocate a new register to use as
an intermediate.  In some cases this will result in more registers used
than required.  I think the best solution would be to implement an
optimization pass to reduce the number of registers used, which would
also solve the problem we have now of not being able to use GPRs that
are assigned for TGSI_FILE_INPUT.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
7a837da556 freedreno: add noop driver
It is useful for debugging.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
eec37f1cdc freedreno: use u_math macros/helpers more
Get rid of a few self-defined macros:
  ALIGN() -> align()
  min() -> MIN2()
  max() -> MAX2()

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
38d8b02eba freedreno: implement fd_screen_destroy()
Oops, didn't notice that I had left it stubbed out.

Also, make things fail a bit more gracefully when things go wrong.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
a64e2d9d9f freedreno: set SWAP bit based on format
Really this should be set based on buffer format, not on color vs
depth/stencil.  Probably there should be more formats that set the bit
as we add support for more render target formats.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Tom Stellard
d9a32b84e3 radeon/llvm: Fix segfault with a specific libelf implementation
The libelf implementation that is distributed here:
http://www.mr511.de/software/english.html
requires calling elf_version() prior to calling elf_memory()

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-24 16:51:25 -07:00
Alex Deucher
5bbeae7a3d r600g: use CP DMA for buffer clears on evergreen+
Lighter weight than using streamout.  Only evergreen
and newer asics support embedded data as src with
CP DMA.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-24 18:54:31 -04:00
Chia-I Wu
9d0ad4c2f2 i965/gen7: fix encoding of (huge) surface size for BRW_SURFACE_BUFFER
Unlike GEN6, the bits of entry count are distributed like this

  width  = (entry_count & 0x0000007f);       /* bits [6:0] */
  height = (entry_count & 0x001fff80) >> 7;  /* bits [20:7] */
  depth  = (entry_count & 0x7fe00000) >> 21; /* bits [30:21] */

The maximum entry count is still limited to 2^27.

This was noted while going over the PRM.  No test is impacted, because
1<<20 (the bit that moved) is much larger than GL_UNIFORM_BLOCK_MAX_SIZE,
GL_MAX_TEXTURE_BUFFER_SIZE, or MAX_*_UNIFORM_COMPONENTS.
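
The bit layout above can be sketched as a pair of helpers. The struct and function names are hypothetical; only the masks and shifts come from the commit message, and whether the caller passes the count itself or count minus one is left to the caller:

```c
#include <stdint.h>

/* Hypothetical container for the three SURFACE_STATE size fields. */
struct buffer_surface_dims {
    uint32_t width;
    uint32_t height;
    uint32_t depth;
};

/* Split an entry count into the gen7 width/height/depth fields. */
static struct buffer_surface_dims split_entry_count_gen7(uint32_t entry_count)
{
    struct buffer_surface_dims d;
    d.width  =  entry_count & 0x0000007f;        /* bits [6:0]   */
    d.height = (entry_count & 0x001fff80) >> 7;  /* bits [20:7]  */
    d.depth  = (entry_count & 0x7fe00000) >> 21; /* bits [30:21] */
    return d;
}

/* Inverse of the split, useful for checking that no bit is lost. */
static uint32_t join_entry_count_gen7(struct buffer_surface_dims d)
{
    return d.width | (d.height << 7) | (d.depth << 21);
}
```

For an entry count of 1<<20, the bit lands in the height field (height == 8192) rather than in depth.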

v2: Explain more in the commit message (by anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 12:56:17 -07:00
Chia-I Wu
75d402b211 i965/gen7: fix 3DSTATE_LINE_STIPPLE_PATTERN
The inverse repeat count should take up bits 31:15 and is in U1.16.  Fixes
the "Restarting lines within a single Begin/End block" subtest of piglit
linestipple, and gets the other failing subtests much closer to passing.
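
As a rough illustration of the encoding, the sketch below packs 1/repeat as a U1.16 fixed-point value into bits 31:15 of a dword. The function name is hypothetical, the other fields of the packet are left out, and the caller is assumed to pass a repeat count of at least 1:

```c
#include <stdint.h>

/* Pack the inverse repeat count (U1.16: 1 integer bit, 16 fraction bits)
 * into bits 31:15.  repeat_count must be >= 1. */
static uint32_t pack_inverse_repeat_gen7(unsigned repeat_count)
{
    uint32_t inv = (uint32_t)((1.0 / repeat_count) * (1 << 16));
    return inv << 15;
}
```

A repeat count of 1 encodes as 0x80000000 (exactly 1.0), and a repeat count of 2 encodes as 0x40000000 (0.5).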

v2: Rewrite commit message with more detailed piglit info (by anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 12:56:17 -07:00
Chia-I Wu
bc98950a2a i965: fix SURFACE_STATE dumping
Wrong fields were used when dumping width and height.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 12:56:17 -07:00
Matt Turner
d611f12d82 i965: Remove strange comments about math functions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-24 12:51:36 -07:00
Matt Turner
0c16c12e46 i965: Remove traces of nonexistent TAN math function.
Never existed? At least never supported. Doesn't appear in 965, G45,
or ILK documentation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-24 12:51:36 -07:00
Paul Berry
5bb90cfceb glsl: Teach basic block analysis about break/continue/discard.
Previously, the only kind of ir_jump that would terminate a basic
block was "return".  However, the other possible types of ir_jump
("break", "continue", and "discard") should terminate a basic block
too.  This patch modifies basic block analysis so that it terminates a
basic block on any type of ir_jump, not just ir_return.
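
The rule can be sketched on a flat instruction list. The enum and helpers are hypothetical stand-ins rather than the GLSL IR classes, and the sketch ignores the other things that end a basic block in the real pass (control flow constructs, for example):

```c
#include <stddef.h>

/* Hypothetical instruction kinds; the last four play the role of ir_jump. */
enum ir_kind {
    IR_ASSIGN,
    IR_CALL,
    IR_RETURN,
    IR_BREAK,
    IR_CONTINUE,
    IR_DISCARD
};

/* Any jump ends a basic block, not only "return". */
static int is_jump(enum ir_kind k)
{
    return k == IR_RETURN || k == IR_BREAK ||
           k == IR_CONTINUE || k == IR_DISCARD;
}

/* Count the basic blocks in a straight-line instruction list. */
static size_t count_basic_blocks(const enum ir_kind *insts, size_t n)
{
    size_t blocks = 0;
    int in_block = 0;

    for (size_t i = 0; i < n; i++) {
        if (!in_block) {          /* first instruction of a new block */
            blocks++;
            in_block = 1;
        }
        if (is_jump(insts[i]))    /* the jump is the block's last instruction */
            in_block = 0;
    }
    return blocks;
}
```

With the old "return only" rule, the sequence assign, break, assign would have been treated as a single block; here it splits into two.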

Fixes piglit test dead-code-break-interaction.shader_test.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-24 09:57:37 -07:00
Paul Berry
70ca263623 glsl: Add virtual function ir_instruction::as_jump()
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-24 09:57:37 -07:00
Tom Stellard
f64058803a r600g/llvm: Pass struct r600_bytecode to r600_llvm_compile
This way we don't need to update the function signature every time we
emit a new config value.  This also fixes the build with
--enable-opencl.
2013-04-24 12:42:41 -04:00
José Fonseca
e29525f79f winsys/sw/xlib: Prevent shared memory segment leakage.
Running piglit with this was causing all sorts of weird stuff to happen
to my desktop (Chromium webpages became blank, Qt Creator flickered,
etc).  I tracked this down to shared memory segment leakage when GL is
not shutdown properly. The segments can be seen running `ipcs` and
looking for nattch==0.

This change fixes this by calling shmctl(IPC_RMID) soon after creation
(which does not remove the segment immediately, but simply marks it for
removal when no more processes are attached).
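
A minimal sketch of the pattern using only the System V calls, with no X server involved. The function name is hypothetical; in the real winsys the IPC_RMID call comes after XShmAttach(), so that the server side is attached as well:

```c
#include <stddef.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Create a segment, attach it, mark it for removal right away, and keep
 * using it.  Returns 0 on success, -1 on failure. */
static int shm_create_use_and_release(size_t size)
{
    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (id < 0)
        return -1;

    void *addr = shmat(id, NULL, 0);

    /* Mark for removal now: the segment is destroyed once the last
     * attachment goes away, even if the process dies uncleanly. */
    int rm = shmctl(id, IPC_RMID, NULL);

    if (addr == (void *)-1)
        return -1;

    /* The mapping stays usable after IPC_RMID while we are attached. */
    memset(addr, 0xab, size);
    int ok = rm == 0 && ((unsigned char *)addr)[size - 1] == 0xab;

    shmdt(addr);
    return ok ? 0 : -1;
}
```

After such a call, `ipcs` should show no leftover segment with nattch==0 from this process.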

This matches src/mesa/drivers/x11/xm_buffer.c behaviour.

v2:
- move shmctl(IPC_RMID) after XShmAttach() for *BSD, per Chris Wilson
- remove stray debug printfs, spotted by Ian Romanick

NOTE: This is a candidate for stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-24 16:54:58 +01:00
Zack Rusin
1a87473998 draw/gs: preserve leading vertex info for gs
We need to handle the leading vertex information when
assembling primitives for the geometry shader otherwise
the resulting triangles will have vertices at incorrect
input locations.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-23 06:17:59 -04:00
Laurent Carlier
addf00e2ad r200: fix build regression introduced with 9a32203e16
Signed-off-by: Laurent Carlier <lordheavym@gmail.com>
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2013-04-24 16:48:29 +02:00
Christian König
c5c754d184 radeonsi: cleanup disabling tiling for UVD v3
Should fix: https://bugs.freedesktop.org/show_bug.cgi?id=63702

v2: add a comment that this is just a workaround
v3: fix typo in comment

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-24 11:07:26 +02:00
Chad Versace
d3dfce3276 egl/dri2: Fix min/max swap interval of configs
The commit below exposed a bug in dri2_add_config.

    commit 3998f8c6b5
    Author: Ralf Jung <post@ralfj.de>
    Date:   Tue Apr 9 14:09:50 2013 +0200

	egl/x11: Fix initialisation of swap_interval

This little code snippet near the bottom of dri2_add_config,

    if (double_buffer) {
       ...
       conf->base.MinSwapInterval = dri2_dpy->min_swap_interval;
       conf->base.MaxSwapInterval = dri2_dpy->max_swap_interval;
    }

it never did what it claimed to do. The assignment never changed the value
of conf->base.MaxSwapInterval, because dri2_dpy->max_swap_interval was,
until the above exposing commit, uninitialized here. That is,
conf->base.MaxSwapInterval was 0 before and after assignment. Ditto for
the min swap interval.

Above the troublesome code snippet, the call to _eglFilterArray rejects
the config as unmatching if its swap interval bounds differ from the base
config's.  Before the exposing commit, at the call to _eglFilterArray, the
swap interval bounds were always [0,0], and hence no config was rejected
due to swap interval.

After the exposing commit, _eglFilterArray incorrectly rejected some
configs, which prevented dri2_egl_config::dri_double_config from getting
set for the rejected config, which resulted in a NULL pointer getting
passed into dri2CreateNewDrawable, and then segfault.

The solution: set the swap interval bounds before _eglFilterArray.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63447
Tested-by: Lu Hua <huax.lu@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-24 08:05:13 +02:00
Kenneth Graunke
cef31bb290 mesa: Add unpack functions for A/I/L/LA [U]INT8/16/32 formats.
NOTE: This is a candidate for stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:13:02 -07:00
Kenneth Graunke
995051ee34 mesa: Add unpack functions for R/RG/RGB [U]INT8/16/32 formats.
NOTE: This is a candidate for stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:13:00 -07:00
Kenneth Graunke
531be501de mesa: Add an unpack function for ARGB2101010_UINT.
v2: Remove extra parenthesis (suggested by Brian).

NOTE: This is a candidate for stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:12:58 -07:00
Kenneth Graunke
b1fded54c9 mesa: Fix unpack function for ETC2_SRGB8_PUNCHTHROUGH_ALPHA1.
We accidentally set MESA_FORMAT_ETC2_RGB8_PUNCHTHROUGH_ALPHA1 twice,
rather than setting the RGB8 and SRGB8 formats.

NOTE: This is a candidate for stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:12:50 -07:00
Kenneth Graunke
097b39276c mesa: Fix up some final license word wrapping issues by hand.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:07:14 -07:00
Kenneth Graunke
f0cb66b699 mesa: Restore 78-column wrapping of license text in C++-style comments.
The previous commit introduced extra words, breaking the formatting.

This text transformation was done automatically via the following shell
command:
$ git grep 'THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY' | sed 's/:.*$//' | xargs -I {} sh -c 'vim -e -s {} < vimscript2'

where 'vimscript2' is a file containing:
/THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY/;/^ *$/ !fmt -w 78 -p '// '
:wq

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:07:12 -07:00
Kenneth Graunke
3d8d5b298a mesa: Restore 78-column wrapping of license text in C-style comments.
The previous commit introduced extra words, breaking the formatting.

This text transformation was done automatically via the following shell
command:
$ git grep 'THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY' | sed 's/:.*$//' | xargs -I {} sh -c 'vim -e -s {} < vimscript'

where 'vimscript' is a file containing:
/THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY/;/\*\// !fmt -w 78 -p ' * '
:wq

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:07:09 -07:00
Kenneth Graunke
96ff2edc73 mesa: Add "OR COPYRIGHT HOLDERS" to license text disclaiming liability.
This brings the license text in line with the MIT License as published
on the Open Source Initiative website:

http://opensource.org/licenses/mit-license.php

Generated automatically by the following shell command:
$ git grep 'THE AUTHORS BE LIABLE' | sed 's/:.*$//g' | xargs -I '{}' \
  sed -i 's/THE AUTHORS/THE AUTHORS OR COPYRIGHT HOLDERS/' {}

This introduces some wrapping issues, to be fixed in the next commit.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:07:06 -07:00
Kenneth Graunke
ca29382dc3 mesa: Change "BRIAN PAUL OR IBM" to "THE AUTHORS" in license text.
See previous commit for the rationale.  These weren't caught by the
automatic conversion due to the "OR IBM" addition.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:07:04 -07:00
Kenneth Graunke
dd404bc94f mesa: Change "BRIAN PAUL" to "THE AUTHORS" in license text.
Generated automatically by the following shell command:
$ git grep 'BRIAN PAUL BE LIABLE' | sed 's/:.*$//g' | xargs -I '{}' \
  sed -i 's/BRIAN PAUL/THE AUTHORS/' {}

The intention here is to protect all authors, not just Brian Paul.  I
believe that was already the sensible interpretation, but spelling it
out is probably better.

More practically, it also prevents people from accidentally copy &
pasting the license into a new file which says Brian is not liable when
he isn't even one of the authors.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-23 22:06:38 -07:00
Brian Paul
cab19eced5 mesa: make _mesa_save_vtxfmt_init() static
It's called from nowhere else.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-23 21:12:25 -06:00
Brian Paul
71ee003041 docs: document issue with Viewperf proe-05 test 6 2013-04-23 21:09:17 -06:00
Brian Paul
f74da3e988 mesa: use new _mesa_inside_dlist_begin_end() function
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-23 21:09:17 -06:00
Brian Paul
976b529b7c mesa: use new _mesa_inside_begin_end() function
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-23 21:09:17 -06:00
Marek Olšák
9a32203e16 mesa: remove unused opcodes AND, DP2A, NOT, NRM3, NRM4, OR, PRINT, XOR
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:24 +02:00
Marek Olšák
3140d132ef mesa: don't flush vertices and don't flag _NEW_COLOR in ClearColor, ClearIndex
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:24 +02:00
Marek Olšák
9f3985238f mesa: don't flush vertices and don't flag _NEW_COLOR for GL_CLAMP_READ_COLOR
There used to be a derived state _ClampReadColor, so setting _NEW_COLOR
made sense. The state is gone now.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:24 +02:00
Marek Olšák
43dac2700c mesa: don't flag _NEW_DEPTH in Begin/EndQuery if driver implements the functions
We don't want to set the flag for Gallium.

I think only swrast needs the flag to be set for occlusion queries.

v2: fix stats_wm updates in i965

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:23 +02:00
Marek Olšák
629813d9de mesa: don't flush vertices and don't flag _NEW_DEPTH in ClearDepth
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:23 +02:00
Marek Olšák
3975f52eb4 mesa: don't flush and don't flag _NEW_STENCIL in ClearStencil, ActiveStencilFace
The functions don't affect driver state. There is no code that would rely
on vertices being flushed prior to changing the states, and no code that
would check for _NEW_STENCIL before using the states.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:23 +02:00
Marek Olšák
1e3b422685 mesa: don't set _NEW_BUFFERS in GenerateMipmap and BlitFramebuffer
Neither function changes the framebuffer in any way
(if mesa_meta is not used)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:23 +02:00
Marek Olšák
d883d00878 mesa: remove _NEW_PACKUNPACK
No driver checks the flag. Nobody uses it.

I also removed the FLUSH_VERTICES calls, because PixelStorei has no effect
on rendering.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:23 +02:00
Marek Olšák
99bd76d834 mesa: convert _NEW_RASTERIZER_DISCARD to a driver flag
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:23 +02:00
Marek Olšák
b95cbe5e80 mesa,i965: use NewDriverState to communicate TFB state changes with the driver
_NEW_TRANSFORM_FEEDBACK is not used by core Mesa, so it can be removed.
Instead, a new private flag is added to i965 to serve the same purpose.

If you're new to this:

* When creating a context, you can set private dirty flags
  in gl_context::DriverFlags, e.g.:
    ctx->DriverFlags.NewStateX = BRW_NEW_STATE_X;

* When StateX is changed, core Mesa does:
    ctx->NewDriverState |= ctx->DriverFlags.NewStateX;

* When you have to draw, read and clear ctx->NewDriverState.

* Pros: not touching NewState, the driver decides the mapping between
  GL states and hw state groups, unlimited number of flags in core Mesa
  (still limited number of flags in the driver though)
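
The three steps above can be sketched end to end. The structs below are simplified stand-ins for gl_context and its DriverFlags, and DRV_NEW_STATE_X is a hypothetical driver-private bit:

```c
#include <stdint.h>

/* Hypothetical driver-private dirty bit. */
#define DRV_NEW_STATE_X (1ull << 0)

/* Simplified stand-ins for gl_context::DriverFlags and gl_context. */
struct driver_flags_sketch {
    uint64_t NewStateX;
};

struct context_sketch {
    struct driver_flags_sketch DriverFlags;
    uint64_t NewDriverState;
};

/* Driver, at context creation: choose which private bit StateX maps to. */
static void driver_init_context(struct context_sketch *ctx)
{
    ctx->DriverFlags.NewStateX = DRV_NEW_STATE_X;
    ctx->NewDriverState = 0;
}

/* Core: StateX changed, so OR in whatever bit the driver registered. */
static void core_state_x_changed(struct context_sketch *ctx)
{
    ctx->NewDriverState |= ctx->DriverFlags.NewStateX;
}

/* Driver, at draw time: read and clear the accumulated dirty bits. */
static uint64_t driver_draw(struct context_sketch *ctx)
{
    uint64_t dirty = ctx->NewDriverState;
    ctx->NewDriverState = 0;
    return dirty;
}
```

The core side never needs to know what DRV_NEW_STATE_X means; it only forwards the bit the driver handed it.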

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:23 +02:00
Marek Olšák
ef39bc4f2e mesa: remove redundant _NEW_BUFFERS setting in ReadBuffer
already set by _mesa_readbuffer

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-24 03:23:23 +02:00
Marek Olšák
5649f886f7 st/mesa: add a simple path to BufferData if it only discards buffer contents
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-24 03:23:23 +02:00
Marek Olšák
d23c7455ae st/mesa: depth-stencil-alpha state also depends on _NEW_BUFFERS
because the code looks at the visual if there is a depth or stencil buffer
before enabling depth or stencil, respectively.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-24 03:23:23 +02:00
José Fonseca
2737abb44e gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center.
Squashed commit of the following:

commit 04c5fa2cbb8e89d6f2fa5a75af1cca03b1f6b852
Author: José Fonseca <jfonseca@vmware.com>
Date:   Tue Apr 23 17:37:18 2013 +0100

    gallium: s/lower_left_origin/bottom_edge_rule/

commit 4dff4f64fa83b9737def136fffd161d55e4f1722
Author: José Fonseca <jfonseca@vmware.com>
Date:   Tue Apr 23 17:35:04 2013 +0100

    gallium: Move diagram to docs.

commit 442a63012c8c3c3797f45e03f2ca20ad5f399832
Author: James Benton <jbenton@vmware.com>
Date:   Fri May 11 17:50:55 2012 +0100

    gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center.

    This change is necessary to achieve correct results when using OpenGL
    FBOs.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-23 19:42:47 +01:00
Marek Olšák
b692076420 r600g: initialize CMASK and HTILE with the GPU using streamout
This fixes a crash when a resource cannot be mapped to the CPU's address space
because it's too big.

This puts a global pipe_context in r600_screen, which is guarded by a mutex,
so that we can use pipe_context when there isn't one around.
Hopefully our multi-context support is solid.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

NOTE: This is a candidate for the 9.1 branch.
2013-04-23 20:26:20 +02:00
Marek Olšák
1ba46bbb4c gallium/u_blitter: implement buffer clearing
Although this might be useful for ARB_clear_buffer_object,
I need it for initializing resources in r600g.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

v2: comment cleanups

NOTE: This is a candidate for the 9.1 branch.
2013-04-23 20:26:20 +02:00
Vincent Lejeune
edd90a19ca r600/llvm: Read stacksize from config header 2013-04-23 19:52:29 +02:00
Vincent Lejeune
a7f73f5155 /bin/bash: q: command not found 2013-04-23 19:52:02 +02:00
Tom Stellard
a0c8942bb4 radeon/llvm: Fix build with LLVM >= r180063 2013-04-23 11:53:05 -04:00
Tom Stellard
ead4db420e gallivm: Fix build with LLVM >= r180063 2013-04-23 11:53:05 -04:00
Zack Rusin
1fb8c3ce55 draw: use the prim count for ia primitives
Number of vertices to fetch doesn't always equal the number of input
vertices. To correctly compute the number of IA primitives we need
to use the total number of input vertices, not only those that
need to be fetched.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-22 20:36:07 -04:00
Zack Rusin
76587d2e5e tgsi/scan: set correct input limits for geometry shader
TGSI geometry shader input declarations are of the IN[][2] format
and the dimensions of the array have to be deduced from the input
primitive property.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-22 20:36:07 -04:00
Zack Rusin
913ed25f18 draw: add code to reset instance dependent data
We want to be able to reset certain parts of the pipeline,
in particular the input primitive index, but only either with
separate invocations of the draw_vbo or new instances. In all
other cases (e.g. new invocations due to primitive restart)
that data needs to be preserved. Add a function through which
we can reset instance dependent data.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-22 20:36:07 -04:00
Zack Rusin
2aad06844f softpipe: fix streamout with an empty geometry shader
Same approach as in the llvmpipe, if the geometry shader is
null and we have stream output then attach it to the vertex
shader right before executing the draw pipeline.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-22 20:36:07 -04:00
Andreas Boll
723b78397f configure.ac: Allow OpenGL ES1 and ES2 only with enabled OpenGL
Building OpenGL ES1 and/or ES2 without OpenGL is not supported on mesa
9.0.x

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-23 03:16:10 +02:00
Matt Turner
7be536bb19 i965/fs: Don't save value returned by emit() if it's not used.
Probably a copy-n-paste mistake.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-22 15:34:32 -07:00
Brian Paul
4d5827ea83 mesa: Remove extra MapBufferRange in create_beginend_table()
Looks like a copy&paste typo.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-22 12:38:04 -06:00
José Fonseca
7c1bf8e381 gallium: Add a new clip_halfz rasterizer state.
gl_rasterization_rules lumps too many different flags.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-22 18:39:06 +01:00
Kenneth Graunke
95c83824e6 i965: Fix a mistake in the comments for software counters.
The code doesn't set brw->query.obj to NULL, it sets query->bo to NULL.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-22 10:34:49 -07:00
José Fonseca
c0538860bf gallivm: Fix assignment of unsigned values to OUT register.
TEMP is not the only register file that accepts unsigned. OUT too.

Actually, what determines the appropriate type of the destination value is
not the opcode, but rather the register.

Also cleanup/simplify code.  Add a few more asserts, but also make
code more robust by handling it gracefully if an assert fails.

This fixes segfault / assertion in the included vert-uadd.sh graw shader.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-22 18:23:42 +01:00
Matt Turner
ec646e4654 i965: Apply CMP NULL {Switch} work-around to other Gen7s.
Listed in the restrictions section of CMP, but not on the work-arounds
page.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-22 09:45:10 -07:00
Brian Paul
6654b9d1eb st/mesa: minor indentation fixes 2013-04-22 10:08:06 -06:00
Eric Anholt
47c0b5ecdd mesa: Introduce a globally-available minify() macro.
This matches u_minify()'s behavior, for consistency.
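
For reference, a sketch of the behaviour being matched, assuming u_minify() computes max(1, value >> levels). It is written as a function with a hypothetical name rather than as the actual macro:

```c
/* Size of a mipmap dimension `levels` steps below the base level,
 * clamped so that it never drops below 1. */
static unsigned minify_sketch(unsigned value, unsigned levels)
{
    unsigned v = value >> levels;
    return v > 1 ? v : 1;
}
```

For example, a 256-wide base level is 32 wide at level 3, and a 1-wide dimension stays 1 at every level.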

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-21 12:28:04 -07:00
Eric Anholt
1842dd08b8 mesa: Generalize TexStorage allocator between swrast and intel.
This should be reusable for other non-gallium drivers, so we can make the
extension always be available.

v2: Add a more detailed comment than the old function had (recommended
    by Brian).

Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2013-04-21 12:28:04 -07:00
Eric Anholt
e86170c2b8 mesa: Add performance debug for meta code.
I noticed a fallback in regnum through sysprof, and wanted a nicer way to
get information about it.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-21 12:28:03 -07:00
Eric Anholt
cbe8b75b58 intel: Mention how much data we're trying to subdata in perf debug.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-21 12:28:03 -07:00
José Fonseca
9fb5b2f45c Revert "gallivm: Emit vector selects."
It caused numerous regressions (LLVM 3.1) in blending. In particular:

 - lp_test_blend

    type=u8nx16 rgb_func=sub rgb_src_factor=zero rgb_dst_factor=inv_src_color alpha_func=rev_sub alpha_src_factor=one alpha_dst_factor=const_color ...  MISMATCH
     Src:  0  0  0 b5 49 29  0 a2  0 21 de  0 c3 1b ec  0
     Src1: 2d 85 14  0 f8  0 79 a1 99  0 d8  0 59 16  0  0
     Dst:  0 a9 97  0 c0  0 78  0  0 8b aa f0 bd  0 78 f6
     Con: 7d  0 c0  0  0 bb 77  0  0  0 50  0 40 51  0  0
     Res:  0  0  0  0  0 29  0  0  0  0 c8  0 97 1b e3  0
     Ref:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
    type=u8nx16 rgb_func=max rgb_src_factor=one rgb_dst_factor=inv_const_color alpha_func=min alpha_src_factor=zero alpha_dst_factor=inv_src1_alpha ...  MISMATCH
     Src:  d  0  0 e9  0 37 35 f0 62  0  0 b2 e9 f7  0 5c
     Src1: 8f  0 bf  0 a8  5  0  0 c4  0 d7  7 92  a  0 17
     Dst: cb  0 1e  0  0  0 19 8e  0 4d  0  0  0  0  3 46
     Con: aa 5a 5f 8f  0  0 bc 92  0 88  0  0 b7 8a c0 88
     Res: 44  0 13  0  0  0  7 8e  0 24  0  0  0  0  1 40
     Ref: 44  0 13  0  0 37 35  0 62 24  0  0 e9 f7  1  0

This reverts commit 1e266c7ef0.
2013-04-21 09:07:19 +01:00
José Fonseca
d8a4c4c524 llvmpipe: verify function on blend test. 2013-04-21 08:53:31 +01:00
José Fonseca
a79990bec0 llvmpipe: Don't support Z32_FLOAT_S8X24_UINT texture sampling support either.
Because we don't support it, and the u_format fallback doesn't work for
zs formats.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-20 23:25:36 +01:00
José Fonseca
c08b04992a llvmpipe: Ignore depth-stencil state if format has no depth/stencil.
Prevents assertion failures inside the driver for such state combinations.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-20 23:25:36 +01:00
José Fonseca
f701a5a0fe gallivm: Disable LLVM 2.7 workaround on other versions.
2.7 was a particularly trouble ridden release.

Furthermore, the bug can no longer be reproduced ever since the
first_level state was taken into account.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-20 23:25:36 +01:00
José Fonseca
1e266c7ef0 gallivm: Emit vector selects.
They are supported on LLVM 3.1, at least on x86. (I haven't tested on PPC
though.)

Actually lp_build_linear_mip_levels() already has been emitting them for
some time.

This avoids intrinsics, which tend to be an obstacle for certain
optimization passes.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-20 23:25:36 +01:00
Rob Clark
26b39df08f freedreno: move ir -> ir2
There will be a new IR for a3xx, which has a very different shader ISA
(more scalar oriented).  So rename to avoid conflicts later when I start
adding a3xx support to the gallium driver.

Signed-off-by: Rob Clark <Rob Clark robdclark@freedesktop.org>
2013-04-20 17:59:41 -04:00
Rob Clark
d8134792ae freedreno: cleanup some cruft left over from fdre
The standalone shader assembler needed some meta-data to know about
attributes/varyings/etc, to do the shader linkage.  We don't need these
parts with gallium/tgsi, so just get rid of it.

Signed-off-by: Rob Clark <Rob Clark robdclark@freedesktop.org>
2013-04-20 17:31:47 -04:00
Roland Scheidegger
85974e5fee gallivm: implement switch opcode
Should be able to handle all things which make this tricky to implement.
Fallthroughs, including most notably into/out of default, should be handled
correctly but are quite a mess.
If we see largely unoptimized switches in the wild we should probably think
about some "real" switch optimization pass, e.g. things like this:

switch
case1
someinst
brk
case2
default
case3
someinst
brk
case4
someinst
endswitch

are legal, but the pointless case2/case3 statements not only cause condition
evaluation but will turn this into a "fake" fallthrough case (because
mask and defaultmask are already updated for case2 when default is
encountered) requiring executing code twice.
If default is at the end, though, there's never any code re-execution. If
default is not at the end, but there's no fallthrough into it (not even a
fake one) and out of it, there's no code re-execution either.

v2: add comments, and use enum for break type instead of magic boolean.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-20 02:27:53 +02:00
Roland Scheidegger
8f5d4283c0 gallivm: use uint build context for mask instead of float
Unsurprisingly no one was using it except for grabbing builder.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-20 02:27:53 +02:00
Roland Scheidegger
107550e71a gallivm/tgsi: fix up breakc
It seems there was a typo in gallivm breakc handling (I am actually still
not sure it is really needed but otherwise that statement really should go
away). Also fix the wrong src argument type, even though they weren't really
used.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-20 02:27:53 +02:00
Roland Scheidegger
e8d1b26a82 svga: remove TGSI_OPCODE_BREAKC instruction translation
While initially that opcode probably was meant for something along the
lines of sm3 break_comp it has never worked that way (not even the
argument count was right) and now the opcode has quite different
semantics so just remove it. (Discovered by Jose Fonseca)
2013-04-20 02:27:53 +02:00
Roland Scheidegger
794579105a gallium: document breakc and switch/case/default/endswitch
The docs were missing; the opcode-from-hell switch in particular is anything
but obvious.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-20 02:27:53 +02:00
Roland Scheidegger
443950c6aa gallivm: increase nesting limit to 66
This is still not really correct, since at least for sm 4.0
the nesting limit is 64 per subroutine, and subroutine nesting itself
has a limit of 32, so since we have a flat stack we'd need 32*64.
But this should probably be better fixed with per-subroutine stacks,
since otherwise these structures get really big (like 100kB for the
lp_exec_mask).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-20 02:27:53 +02:00
Zack Rusin
12eab7cc56 draw: implement primitive assembler
Input assembler needs to be able to decompose adjacency primitives
into something that can be understood by the rest of the pipeline.
The specs say that the adjacency primitives are *only* visible
in the geometry shader, for everything else they need to be
decomposed. Which in most of the cases is not an issue, because
the geometry shader always decomposes them for us, but without
geometry shader we were passing unchanged adjacency primitives
to the rest of the pipeline and causing crashes everywhere. This
commit introduces a primitive assembler which, if geometry
shader is missing and the input primitive is one of the
adjacency primitives, decomposes them into something
that the rest of the pipeline can understand.
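
A sketch of the decomposition for two of the adjacency primitive types, following the vertex ordering in the OpenGL geometry shader rules (the adjacent vertices are simply dropped). The function names are hypothetical and the strip variants are not covered:

```c
#include <stddef.h>

/* GL_LINES_ADJACENCY: each group of 4 indices (a0, v0, v1, a1) becomes
 * the plain line v0-v1.  Returns the number of indices written to out. */
static size_t decompose_lines_adj(const unsigned *in, size_t count,
                                  unsigned *out)
{
    size_t n = 0;
    for (size_t i = 0; i + 4 <= count; i += 4) {
        out[n++] = in[i + 1];
        out[n++] = in[i + 2];
    }
    return n;
}

/* GL_TRIANGLES_ADJACENCY: each group of 6 indices holds the triangle at
 * positions 0, 2 and 4; positions 1, 3 and 5 are the adjacent vertices. */
static size_t decompose_tris_adj(const unsigned *in, size_t count,
                                 unsigned *out)
{
    size_t n = 0;
    for (size_t i = 0; i + 6 <= count; i += 6) {
        out[n++] = in[i];
        out[n++] = in[i + 2];
        out[n++] = in[i + 4];
    }
    return n;
}
```

The output can then be fed to the rest of the pipeline as ordinary lines or triangles.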

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-18 11:51:22 -07:00
Zack Rusin
e4752d0f56 util/prim: fix decomposed counts for adjacency primitives
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-18 11:37:37 -07:00
Zack Rusin
c1299204ad draw/so: uses the correct index with the pre clipped coordinates
pre_clip_pos is a float[4] we just used (*float)[4] to be able to
jump within the array of vertex_headers with it. So if the idx
happened to be anything but 0, we'd actually read from some garbage
in memory. Change it to just be a simple pointer instead of casting
it to something that it's not. As suggested by Jose.
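
To illustrate why any idx other than 0 went wrong, the sketch below compares the byte offset reached by indexing through a float (*)[4] with the offset of the same field in vertex idx. The struct is a hypothetical, simplified stand-in for vertex_header:

```c
#include <stddef.h>

/* Hypothetical, simplified vertex layout. */
struct vertex_sketch {
    unsigned flags;
    float clip[4];
    float pre_clip_pos[4];
    float data[2][4];
};

/* Offset reached by treating pre_clip_pos as float (*)[4] and indexing
 * it with idx: this steps 16 bytes per index. */
static size_t offset_with_float4_indexing(size_t idx)
{
    return offsetof(struct vertex_sketch, pre_clip_pos)
         + idx * sizeof(float[4]);
}

/* Offset of pre_clip_pos inside vertex idx: this steps a whole vertex. */
static size_t offset_with_vertex_stride(size_t stride, size_t idx)
{
    return idx * stride + offsetof(struct vertex_sketch, pre_clip_pos);
}

/* Plain-pointer access: step by the vertex stride, then take the field. */
static const float *pre_clip_pos_at(const void *verts, size_t stride,
                                    size_t idx)
{
    const struct vertex_sketch *v =
        (const struct vertex_sketch *)((const char *)verts + idx * stride);
    return v->pre_clip_pos;
}
```

The two offsets agree only for idx == 0, which matches the symptom described above.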

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-18 11:36:38 -07:00
Eric Anholt
8b2662e900 glapi: Add counter information for glBufferData(), like glBufferSubData().
This causes this function to become asynchronous with glthread.
2013-04-19 10:13:00 -07:00
Eric Anholt
1a3ea852ea glapi: Add parameter count information for uniforms.
This is the kind of information that would have been present for GLX, if
GLX supported modern GL.  This allows these entrypoints to get automatic
asynchronous marshalling code generated for glthread.
2013-04-19 10:13:00 -07:00
Paul Berry
57b7c20ca5 glapi: skip padding in get_called_parameter_string
This bug is currently benign, since get_called_parameter_string() is
currently only used for functions that return true for
glx_function.has_different_protocol(), and none of those functions
include padding.  However, in order to implement marshalling of GL API
functions, we'll need to use get_called_parameter_string() far more
often.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-19 10:12:36 -07:00
Paul Berry
fe955dc6b6 mesa: Fix up program_parse.y to avoid uninitialized $$
Without this patch, $$.negate, $$.rgba_valid, and $$.xyzw_valid take
on garbage values.  At the moment this problem is benign (the garbage
values happen to be zero), but in my experiments executing GL
operations on a background thread, the garbage values change, leading
to piglit failures.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-19 10:12:27 -07:00
Eric Anholt
ea6cf2b686 mesa: Use quotes on bool driconf options to prevent stdbool.h breakage.
Since stdbool.h's "true" and "false" are #defines, they got expanded when
used as macro arguments, and that expanded value was stored in the
XML string, producing XML that driconf would then fail to parse.

Currently no drivers include stdbool along with driconf, but I keep
accidentally doing so on intel as we move towards using normal C.

v2: rebase on master.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-04-19 10:10:22 -07:00
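A minimal sketch of the stdbool.h problem described in ea6cf2b686. The macro names (OPT_BEGIN, OPT_BEGIN_Q, SOME_FLAG, SOME_FLAG_Q) are hypothetical stand-ins, not the real driconf macros. It assumes a stdbool.h in which "true" is a macro for 1 (C99 through C17); under that assumption the unquoted form stringifies to default="1", while the quoted form keeps default="true" regardless.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the option macros (not the real driconf ones). */
#define OPT_BEGIN(name, def)   "<option name=\"" #name "\" default=\"" #def "\">"
#define OPT_BEGIN_Q(name, def) "<option name=\"" #name "\" default=" #def ">"

/* The wrappers forward "def", so it is macro-expanded before the inner
 * macro stringifies it with #def. */
#define SOME_FLAG(def)   OPT_BEGIN(some_flag, def)
#define SOME_FLAG_Q(def) OPT_BEGIN_Q(some_flag, def)

/* With a macro "true", this becomes default="1" instead of default="true". */
static const char unquoted_xml[] = SOME_FLAG(true);

/* A string literal is not a macro, so it survives and brings its own quotes. */
static const char quoted_xml[] = SOME_FLAG_Q("true");
```

Passing the value as a string literal sidesteps the expansion entirely, which is why quoting at the call site is enough.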
Brian Paul
cecbfce5eb svga: whitespace, comment fixes in svga_pipe_query.c 2013-04-19 10:04:11 -06:00
Brian Paul
ef1b2b8da7 svga: whitespace, comment fixes in svga_pipe_fs/vs.c 2013-04-19 10:03:56 -06:00
José Fonseca
dbb690872e gallivm: Fix half floats with MCJIT.
Prevents:

  LLVM ERROR: Cannot select: intrinsic %llvm.x86.vcvtph2ps.128
2013-04-19 10:13:19 +01:00
Matt Turner
e87015f508 Revert "i965: Check reg.nr for BRW_ARF_NULL instead of reg.file."
This reverts commit ecdda414d3.

Commit was supposed to be a simple typo fix. Clearly needs more
investigating.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63688
2013-04-18 21:52:27 -07:00
Matt Turner
34efd9295e configure.ac: Remove gallium-g3dvl flag.
It's next to useless, since it just allows you to turn off VDPAU and
XvMC with a single switch. Just check whether Gallium drivers are
enabled instead.

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-04-18 21:52:26 -07:00
Jerome Glisse
d0e9aaa31c radeonsi: add support for compressed texture v2
Most tests pass; the remaining issues are with border color and swizzle.

Based on ircnick<maelcum> patch.

v2: Restaged commit hunk

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-04-18 17:25:38 -04:00
Jerome Glisse
dc21e30a62 radeonsi: add 2d tiling support for texture v3
v2: Remove left over code
v3: Properly restage the commit so hunks from the first one are not in
    the second one.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-04-18 17:25:38 -04:00
Vadim Girlin
f732036f12 gallium: handle drirc disable_glsl_line_continuations option
NOTE: This is a candidate for the 9.1 branch

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-19 01:05:03 +04:00
José Fonseca
b72ff373fb llvmpipe: Take into consideration all current constant buffers when mapping.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-04-18 20:48:12 +01:00
Christoph Bumiller
78eaaff696 nv50: add remaining RGBX formats
Not all are supported as render targets.

The state tracker fallback of using RGBA instead of RGBX currently
fails for blending, we could work around this by clearing their alpha
to 1 and modifying the color mask to disable writing alpha.
2013-04-18 21:04:22 +02:00
Christoph Bumiller
729abfd0f5 st/mesa: optionally apply texture swizzle to border color v2
This is the only sane solution for nv50 and nvc0 (really, trust me),
but since on other hardware the border colour is tightly coupled with
texture state they'd have to undo the swizzle, so I've added a cap.

The dependency of update_sampler on the texture updates was
introduced to avoid doing the apply_depthmode to the swizzle twice.

v2: Moved swizzling helper to u_format.c, extended the CAP to
provide more accurate information.
2013-04-18 20:35:40 +02:00
Christoph Bumiller
246ff8f887 nv50: set BORDER_COLOR_SRGB in sampler objects 2013-04-18 20:35:40 +02:00
Christoph Bumiller
2d5d054752 nv50: fix 4th component of Lx_SINT/UINT formats 2013-04-18 20:35:40 +02:00
Tom Stellard
3b20170b2f r600g: Fix build with --enable-opencl 2013-04-18 11:24:48 -07:00
Brian Paul
877e3c1d42 mesa: enable GL_ARB_texture_float if TEXTURE_FLOAT_ENABLED is defined
Per message on mesa-users list, this wasn't working before.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-18 10:41:08 -06:00
Roland Scheidegger
50cbcf0c46 gallivm: change cubemaps / derivatives handling, take 55
Turns out the previous "fix" for handling per-pixel face selection and
derivatives didn't work out that well - the derivatives were wrong by
quite a bit, in theory transformation of the derivatives into cube space
should work, but would be _a lot_ more work than the "simplified" transform
used.
So, for explicit derivatives, I'm just giving up and going back to not honoring
them.
For implicit derivatives (and the fake explicit ones) however we try
something a little different, we just calculate rho as we would for a 3d
texture, that is after scaling the coords by the inverse major axis.
This gives the same results as calculating the derivs after projection of
the coords to the same face as long as all pixels hit the same face (and
only without no_rho_approx, otherwise it should be a bit worse). And when
not all pixels are hitting the same face, the results aren't so hot but
not catastrophically bad (I believe not off by more than a factor of 2 without
no_rho_approx and not more than sqrt(2) with no_rho_approx). I think this is
better than just picking the wrong face but who knows...

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-18 17:06:43 +02:00
Roland Scheidegger
0d07f05ee8 gallivm: Add no_rho_approx debug option
This calculates rho exactly as
sqrt(max((ds/dx)^2 + (dt/dx)^2 + (dr/dx)^2, (ds/dy)^2 + (dt/dy)^2 + (dr/dy)^2))
instead of max(|ds/dx|, |dt/dx|, |dr/dx|, |ds/dy|, |dt/dy|, |dr/dy|)
(for 3 coords; 2 coords work analogously, and for 1 coord there's no point doing
the exact version), for both implicit and explicit derivatives.
While such an approximation seems to be allowed in OpenGL, some APIs may be less
forgiving, and the error can be quite large (sqrt(2) for 2 coords, sqrt(3) for
3 coords, so wrong by nearly one mip level in the latter case).
This also helps to single out "real" bugs from "expected" ones, so it is debug
only. At least combined with no_brilinear I didn't really see much of a
performance difference, but I only tested with a debug build; at least with
implicit mipmaps the instruction count is almost exactly the same, though the
instructions are more complex (1 sqrt and mul/adds instead of and/max mostly).
The code when the option isn't set stays exactly the same.

v2: rename no_rho_opt to no_rho_approx.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-18 17:04:01 +02:00
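A short worked bound for the error figures quoted in 0d07f05ee8, written for the 3-coordinate case. The symbols rho_exact and rho_approx are names introduced here for illustration; the bound itself is just the standard relation between the Euclidean norm and the maximum norm of a 3-vector.

```latex
\rho_{\mathrm{exact}} = \sqrt{\max\!\left(
  \left(\tfrac{\partial s}{\partial x}\right)^2 + \left(\tfrac{\partial t}{\partial x}\right)^2 + \left(\tfrac{\partial r}{\partial x}\right)^2,\;
  \left(\tfrac{\partial s}{\partial y}\right)^2 + \left(\tfrac{\partial t}{\partial y}\right)^2 + \left(\tfrac{\partial r}{\partial y}\right)^2
\right)}

\rho_{\mathrm{approx}} = \max\!\left(
  \left|\tfrac{\partial s}{\partial x}\right|, \left|\tfrac{\partial t}{\partial x}\right|, \left|\tfrac{\partial r}{\partial x}\right|,
  \left|\tfrac{\partial s}{\partial y}\right|, \left|\tfrac{\partial t}{\partial y}\right|, \left|\tfrac{\partial r}{\partial y}\right|
\right)

% For any 3-vector v: \|v\|_\infty \le \|v\|_2 \le \sqrt{3}\,\|v\|_\infty, hence
\rho_{\mathrm{approx}} \;\le\; \rho_{\mathrm{exact}} \;\le\; \sqrt{3}\,\rho_{\mathrm{approx}}

% The level of detail is \log_2 \rho, so the approximation can be off by at most
\log_2 \sqrt{3} \approx 0.79 \ \text{mip levels (3 coords)}, \qquad
\log_2 \sqrt{2} = 0.5 \ \text{mip levels (2 coords)}
```

The upper bound is reached when all components of the dominant derivative have equal magnitude, which matches the "nearly one mip level" remark for 3 coords.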
José Fonseca
a930136977 llvmpipe: Support half integer pixel center fs coord.
Tested with graw/fs-fragcoord 2/3, and piglit
glsl-arb-fragment-coord-conventions.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-18 14:18:25 +01:00
José Fonseca
b191be52f2 llvmpipe: Remove the static interpolation.
No longer used.

If we ever want the old behavior we can run a loop unroller pass.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-18 14:18:22 +01:00
José Fonseca
6e833d4d09 gallivm: Drop pos arg from lp_build_tgsi_soa.
Never used.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-18 14:18:13 +01:00
Andreas Boll
34bec4a251 docs: update release notes for 9.2
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-18 09:36:57 +02:00
José Fonseca
392f6cfced ralloc: Move declarations before statements.
Trivial.  Should fix MSVC build.
2013-04-18 06:21:04 +01:00
Emil Velikov
c7b88ed16e configure: enable vdpau and xvmc detection, with gallium
Currently the vdpau and xvmc detection code is enabled for all builds. The
state trackers exist only within gallium. Enable detection whenever at least
one gallium driver is selected.

v2: removed stray '-a'
[mattst88 v3]: Removed stray $.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63645
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-04-17 18:19:34 -07:00
Matt Turner
ecdda414d3 i965: Check reg.nr for BRW_ARF_NULL instead of reg.file.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-17 18:19:34 -07:00
Matt Turner
60e4c99488 i965: Implement work-around for CMP with null dest on Haswell.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-17 18:19:34 -07:00
Stuart Abercrombie
1a59cc777f i915g: Release old fragment shader sampler views with current pipe
We were trying to use a destroy method from a deleted context.
This fix is based on what's in the svga driver.

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
2013-04-17 18:15:12 -07:00
Paul Berry
417d8917d4 i965/vec4: Fix hypothetical use of uninitialized data in attribute_map[].
Fixes issue identified by Klocwork analysis:

    'attribute_map' array elements might be used uninitialized in this
    function (vec4_visitor::lower_attributes_to_hw_regs).

The attribute_map array contains the mapping from shader input
attributes to the hardware registers they are stored in.
vec4_vs_visitor::setup_attributes() only populates elements of this
array which, according to core Mesa, are actually used by the shader.
Therefore, when vec4_visitor::lower_attributes_to_hw_regs() accesses
the array to lower a register access in the shader, it should in
principle only access elements of attribute_map that contain valid
data.  However, if a bug ever caused the driver back-end to access an
input that was not flagged as used by core Mesa, then
lower_attributes_to_hw_regs() would access uninitialized memory, which
could cause illegal instructions to get generated, resulting in a
possible GPU hang.

This patch makes the situation more robust by using memset() to
pre-initialize the attribute_map array to zero, so that if such a bug
ever occurred, lower_attributes_to_hw_regs() would generate a (mostly)
harmless access to r0.  In addition, it adds assertions to
lower_attributes_to_hw_regs() so that if we do have such a bug, we're
likely to discover it quickly.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-17 17:41:55 -07:00
Dave Airlie
47bd6e46fe ralloc: don't write to memory in case of alloc fail.
For some reason I made this happen under indirect rendering,
I think we might have a leak, valgrind gave out, so I said I'd
fix the basic problem.

NOTE: This is a candidate for stable branches.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-18 09:50:42 +10:00
Brian Paul
815ca0bf38 mesa: generate glGetInteger/Boolean/Float/Doublev() code for all APIs
No longer pass -a flag to the get_hash_generate.py script to specify
OpenGL, ES1, ES2, etc.  This updates the autoconf, scons and android
build files too (so we can bisect).

This is the last of the API-dependent conditional compilation in
core Mesa.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-17 17:33:40 -06:00
Brian Paul
9835d90596 mesa: remove mfeatures.h
No longer needed.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-17 17:33:40 -06:00
Brian Paul
b76f6d9557 mesa: remove #include "mfeatures.h" from numerous source files
None of the remaining FEATURE_x symbols in mfeatures.h are used anymore.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-17 17:33:40 -06:00
Brian Paul
c6e00b6f6c glapi: no longer emit #include "mfeatures.h" in generated files
None of the symbols in mfeatures.h are used anymore.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-17 17:33:40 -06:00
Brian Paul
7fd12a8ae1 mesa: remove FEATURE_remap_table from remap.[ch]
It was always defined.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-17 17:33:39 -06:00
Brian Paul
0bcced7716 glapi: remove FEATURE_remap_table test (it's always defined)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-17 17:33:39 -06:00
Zack Rusin
8e7f7e9693 draw/so: respect leading/provoking vertex info
we were ignoring leading/provoking vertex settings which was
breaking decomposition of some strips.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-17 15:43:50 -07:00
Zack Rusin
6bb217a489 softpipe/so: use the correct variable for reporting stream out
we were using the wrong vars, reporting incorrect stream output
statistics.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-17 15:28:54 -07:00
Zack Rusin
cb58c79efb gallivm/gs: fix indirect addressing in geometry shaders
We were always treating the vertex index as a scalar but when the
shader is using indirect addressing it will be a vector of indices
for each channel. This was causing some nasty crashes inside
LLVM.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-17 15:28:54 -07:00
Brian Paul
02039066a8 st/wgl: fix issue with SwapBuffers of minimized windows
If a window's minimized we get a zero-size window.  Skip the SwapBuffers
in that case to avoid some warning messages with the VMware svga driver.
Internal bug #996695

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-17 16:23:19 -06:00
Ian Romanick
505ac6ddc6 intel: Don't dereference a NULL pointer if calloc fails
The caller of NewTextureObject does the right thing if NULL is returned,
so this function should do the right thing too.

NOTE: This is a candidate for stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-17 14:12:46 -07:00
Eric Anholt
50064164a4 i965: Trim trailing whitespace in brw_defines.h.
It was all over the formats section I wanted to edit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-17 14:12:01 -07:00
Laurent Carlier
867f71db6b r200: fix build failure introduced with cbbcb0247e
Signed-off-by: Brian Paul <brianp@vmware.com>
2013-04-17 13:48:40 -06:00
Brian Paul
1079475481 st/mesa: clean up formatting in st_cb_msaa.c
Insert blank lines, wrap lines, remove trailing whitespace, etc.
2013-04-17 12:28:13 -06:00
Brian Paul
3350ca223e mesa: remove gl_context::_TriangleCaps
No longer used anywhere.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:42 -06:00
Brian Paul
cbbcb0247e mesa: remove DD_TRI_LIGHT_TWOSIDE flag
v2: use conditional operator instead of bit shifting

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:42 -06:00
Brian Paul
c9bb052e31 mesa: remove DD_TRI_UNFILLED flag
Use alternate code in intel, r200, radeon drivers.
v2: use conditional operator instead of bit shifting

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:41 -06:00
Brian Paul
56dc53ed5b mesa: remove DD_TRI_SMOOTH flag
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:41 -06:00
Brian Paul
b32fb8ac9e mesa: remove DD_TRI_STIPPLE flag
Make it a local macro for the i915 driver.
v2: use conditional operator instead of bit shifting

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:41 -06:00
Brian Paul
dfb1474aac mesa: remove DD_TRI_OFFSET flag
Make it a local macro for the i915 driver.
v2: use conditional operator instead of bit shifting

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:40 -06:00
Brian Paul
c6a81448f8 mesa: remove DD_POINT_ATTEN flag
For the i915 driver, make it a local macro.
v2: use conditional operator instead of bit shifting

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:40 -06:00
Brian Paul
4f57fbb507 mesa: remove DD_POINT_SMOOTH flag
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:40 -06:00
Brian Paul
8ac8ae8360 mesa: remove DD_LINE_STIPPLE flag
For the i915 driver, make it a local macro.
v2: use conditional operator instead of bit shifting

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:40 -06:00
Brian Paul
55b2033f0a mesa: remove DD_SEPARATE_SPECULAR flag
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:39 -06:00
Brian Paul
c1c5d689c5 mesa: remove unused DD_LINE_SMOOTH flag
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-17 11:59:39 -06:00
Zack Rusin
f01f754ca1 draw/gs: make sure geometry shaders don't overflow
The specification says that the geometry shader should exit if the
number of emitted vertices is bigger or equal to max_output_vertices and
we can't do that because we're running in the SoA mode, which means that
our storing routines will keep getting called on channels that have
overflown (even though they will be masked out, but we just can't skip
them).
So we need some scratch area where we can keep writing the overflown
vertices without overwriting anything important or crashing.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-16 23:38:47 -07:00
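A minimal sketch of the scratch-area idea from f01f754ca1, reduced to a single scalar channel. The struct, the function name and the limit of 4 vertices are hypothetical; the real draw module works on SoA vectors. The point shown is that the store always executes, and an overflowing emit is redirected to one extra slot instead of being skipped.

```c
#include <assert.h>
#include <string.h>

enum { MAX_OUTPUT_VERTICES = 4 };   /* hypothetical limit for this sketch */

struct gs_output {
    /* One extra slot past the limit serves as the scratch area. */
    float verts[MAX_OUTPUT_VERTICES + 1][4];
    unsigned emitted;
};

static void gs_emit_vertex(struct gs_output *out, const float pos[4])
{
    assert(out->emitted <= MAX_OUTPUT_VERTICES);

    /* The store is unconditional, as in SoA code where it cannot be skipped.
     * Once the limit is reached, writes land in the scratch slot. */
    unsigned slot = out->emitted < MAX_OUTPUT_VERTICES
                        ? out->emitted
                        : MAX_OUTPUT_VERTICES;
    memcpy(out->verts[slot], pos, sizeof out->verts[slot]);

    /* Only in-range emits are counted. */
    if (out->emitted < MAX_OUTPUT_VERTICES)
        out->emitted++;
}
```

Emitting six vertices into this structure leaves the first four intact and the count at four; the last two only ever touch the scratch slot.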
Zack Rusin
be497ac9d3 draw/gs: Return early if the passed geometry shader is null
Can happen if we were using stream output without geometry
shader, by returning early we avoid a crash.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-16 23:38:47 -07:00
Zack Rusin
80ee4a407a draw: implement pipeline statistics in the draw module
This is a basic implementation of the pipeline statistics in the
draw module. The interface is similar to the stream output statistics
and also requires that the callers explicitly enable it.
Included is the implementation of the interface in llvmpipe and
softpipe. Only softpipe enables the pipeline statistics capability
though because llvmpipe is lacking gathering of the fragment shading
and rasterization statistics.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-16 23:38:47 -07:00
Zack Rusin
b739376cff gallivm/gs: fix the end primitive calls
The issue with SOA execution and end_primitive opcode is that it
can be executed both when we haven't emitted any vertices, in
which case we don't want to emit an empty primitive, and when
the execution mask is zero and the execution should be skipped. We
handled only the latter of those conditions. Now we're combining the
execution mask with a mask created from emitted vertices to handle
both cases. As a result we don't need the pending_end_primitive
flag which was broken because it was static and could be affected
by both above mentioned conditions at run-time.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-16 23:38:46 -07:00
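A minimal sketch of the mask combination described in b739376cff, using plain bitmasks over four channels. The function name and the per-channel counter array are hypothetical, and the sketch assumes the counters hold the vertices emitted into the primitive that is about to be ended; the real code builds these masks as LLVM vector values.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_CHANNELS = 4 };

/* Returns the channels on which end_primitive should actually take effect:
 * the channel must be active and must have emitted at least one vertex. */
static uint32_t end_primitive_mask(uint32_t exec_mask,
                                   const unsigned emitted_verts[NUM_CHANNELS])
{
    assert((exec_mask >> NUM_CHANNELS) == 0);

    uint32_t has_verts = 0;
    for (unsigned ch = 0; ch < NUM_CHANNELS; ch++) {
        if (emitted_verts[ch] > 0)
            has_verts |= 1u << ch;
    }
    /* Both conditions are folded into one run-time mask, so no static
     * "pending" flag is needed. */
    return exec_mask & has_verts;
}
```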
Zack Rusin
93627e33cc tgsi/exec: geometry shaders are executed on a single primitive
which means that our execution mask in GS is equal to 1 not 0xf.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-16 23:38:46 -07:00
Zack Rusin
88db6f0a73 tgsi/exec: fix the udiv and umod instructions
Same as with llvmpipe: we can't be dividing/modding by zero, and we
need to make sure that dividing/modding by zero produces 0xffffffff.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-16 23:38:46 -07:00
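A minimal scalar sketch of the behavior stated in 88db6f0a73. The helper names are hypothetical and this is not the tgsi_exec code itself; it only shows the rule that a zero divisor must not reach the hardware divide and must yield 0xffffffff.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned divide that never traps: a zero divisor yields 0xffffffff. */
static uint32_t udiv_or_all_ones(uint32_t num, uint32_t den)
{
    return den != 0 ? num / den : UINT32_MAX;
}

/* Unsigned modulo with the same convention for a zero divisor. */
static uint32_t umod_or_all_ones(uint32_t num, uint32_t den)
{
    return den != 0 ? num % den : UINT32_MAX;
}
```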
José Fonseca
b8f6858fcb gallivm: JIT symbol resolution with linux perf.
Details on docs/llvmpipe.html

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-17 16:50:52 +01:00
José Fonseca
35ef27d485 draw: Silence uninitialized var warnings.
Trivial.
2013-04-17 16:50:52 +01:00
Vincent Lejeune
2b9ed257c0 r600g/llvm: Use gprcount from llvm 2013-04-17 17:24:29 +02:00
Anuj Phogat
484b89ace9 intel: Add a null pointer check before dereferencing the pointer
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-17 08:17:47 -07:00
Emil Velikov
b03f6de63b docs: Update 'Making new mesa release'
Add a note to update PACKAGE_VERSION for Android and scons builds

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-17 08:48:15 -06:00
Emil Velikov
91984a732e docs: Add some missing release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-17 08:48:15 -06:00
Emil Velikov
cf9bf1d4a6 docs: move specs to a separate folder
Handle legacy/obsolete specs as well
List all specs in extensions.html
Mark 'OLD' extensions as obsolete in extensions.html
Update the spec location in old relnotes

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-17 08:48:14 -06:00
Emil Velikov
5fd3b3b085 docs: restructure release notes into separate folder
relnotes-*html > relnotes/*html
RELNOTES-* > relnotes/*
fix links, css and frames

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-17 08:48:14 -06:00
José Fonseca
50b3fc6204 gallium: Disambiguate TGSI_OPCODE_IF.
TGSI_OPCODE_IF condition had two possible interpretations:

- src.x != 0.0f

  - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was false for either
    vertex or fragment shaders
  - gallivm/llvmpipe
  - postprocess
  - vl state tracker
  - vega state tracker
  - most old drivers
  - old internal state trackers
  - many graw examples

- src.x != 0U

  - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was true for both
    vertex and fragment shaders
  - tgsi_exec/softpipe
  - r600
  - radeonsi
  - nv50

And drivers that use draw module also were a mess (because Mesa would
emit float IFs, but draw module supports native integers so it would
interpret IF arg as integers...)

This sort of works if the source argument is limited to float +0.0f or
+1.0f, integer 0, but would fail if source is float -0.0f, or integer in
the float NaN range.  It could also fail if source is integer 1, and
hardware flushes denormalized numbers to zero.

But with this change there are now two opcodes, IF and UIF, with clear
meaning.

Drivers that do not support native integers do not need to worry about
UIF.  However, for backwards compatibility with old state trackers and
examples, it is advisable that native integer capable drivers also
support the float IF opcode.

I tried to implement this for r600 and radeonsi based on the surrounding
code.  I couldn't do this for nouveau, so I just shunted IF/UIF
together, which matches the current behavior.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>

v2:
- Incorporate Roland's feedback.
- Fix r600_shader.c merge conflict.
- Fix typo in radeon, spotted by Michel Dänzer.
- Incorporate Christoph Bumiller's patch to handle TGSI_OPCODE_IF(float)
  properly in nv50/ir.
2013-04-17 10:54:08 +01:00
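A minimal sketch of why the two interpretations in 50b3fc6204 disagree. The helper names are hypothetical; cond_float models the IF test (src.x != 0.0f) and cond_uint models the UIF test (src.x != 0U) on the same 32 bits. The float -0.0f has the bit pattern 0x80000000, so it is "false" as a float and "true" as an integer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Float interpretation (IF): the branch is taken when src.x != 0.0f. */
static int cond_float(float x)
{
    return x != 0.0f;
}

/* Integer interpretation (UIF): the branch is taken when the raw bits
 * of src.x are nonzero. */
static int cond_uint(float x)
{
    uint32_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);
    return bits != 0u;
}

/* Builds a float that carries the given bit pattern. */
static float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

For +0.0f and +1.0f both helpers agree, which is why the ambiguity mostly went unnoticed; float_from_bits(0x80000000u) is the simplest input on which they differ.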
José Fonseca
f61b7da80e gallium: Eliminate TGSI_OPCODE_IFC.
Never used or implemented.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-17 10:54:08 +01:00
Kenneth Graunke
e7965598b7 i965: Enable the Bay Trail platform.
This patch adds PCI IDs for Bay Trail (sometimes called Valley View).
As far as the 3D driver is concerned, it's very similar to Ivybridge,
so the existing code should work just fine.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-16 15:08:12 -07:00
Christian König
13ddf9baf2 r600/uvd: cleanup disabling tiling on pre EG asics
Set transfer flag instead of fiddling with the tiling params directly.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-04-16 22:36:51 +02:00
Christian König
7490eeb3d6 autoconf: enable detection of vdpau and xvmc by default
Since we now have UVD support we should enable them by default.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-16 22:36:20 +02:00
Ian Romanick
025f03f3b7 mesa/swrast: Move memory allocation outside the blit loop
Assume the maximum pixel size (16 bytes per pixel).  In addition to
moving redundant malloc and free calls outside the loop, this fixes a
potential resource leak when a surface is mapped and the malloc fails.
This also makes blit_nearest look a bit more like blit_linear.

v2: Use MAX_PIXEL_BYTES instead of 16.  Suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-16 10:18:14 -07:00
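A minimal sketch of the allocation pattern described in 025f03f3b7. The function copy_rows is hypothetical and much simpler than the swrast blit; MAX_PIXEL_BYTES is the name mentioned in the commit, with the value 16 taken from the commit message. The temporary row is allocated once, sized for the largest pixel, before the loop and before any other resource is acquired.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { MAX_PIXEL_BYTES = 16 };   /* largest pixel size, per the commit message */

/* Copies an image row by row through a temporary row buffer.
 * Returns 0 on success, -1 if the allocation fails. */
static int copy_rows(unsigned char *dst, const unsigned char *src,
                     size_t width, size_t height, size_t pixel_bytes)
{
    assert(pixel_bytes <= MAX_PIXEL_BYTES);
    if (width == 0 || height == 0)
        return 0;

    /* One allocation, done before the loop and sized for the worst case. */
    unsigned char *row = malloc(width * MAX_PIXEL_BYTES);
    if (!row)
        return -1;   /* nothing else is held yet, so nothing can leak */

    size_t row_bytes = width * pixel_bytes;
    for (size_t y = 0; y < height; y++) {
        memcpy(row, src + y * row_bytes, row_bytes);
        memcpy(dst + y * row_bytes, row, row_bytes);
    }

    /* A single free after the loop, so no iteration sees a released buffer. */
    free(row);
    return 0;
}
```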
Ian Romanick
a27c6e1aea mesa/swrast: Move free calls outside the attachment loop
This was originally discovered by Klocwork analysis:

    Possible memory leak. Dynamic memory stored in 'srcBuffer0'
    allocated through function 'malloc' at line 566 can be lost at line
    746

However, I think the problem is actually much worse.  Since the memory
is freed after the first pass through the loop, the released buffer may
be used on the next iteration!

NOTE: This is a candidate for stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-16 10:13:48 -07:00
Ian Romanick
6758498eb7 mesa/swrast: Refactor no-memory error checking in blit_linear
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-16 10:13:10 -07:00
Martin Andersson
4c3ed79566 r600g: Workaround for a hardware bug with nested loops on Cayman
There is a hardware bug on Cayman where a BREAK/CONTINUE followed by
LOOP_STARTxxx for nested loops may put the branch stack into a state
such that ALU_PUSH_BEFORE doesn't work as expected. Work around this
by replacing the ALU_PUSH_BEFORE with a PUSH + ALU.

Fixes piglit tests EXT_transform_feedback/order*

v2: Use existing loop count and improve comment
v3: [Vadim Girlin] Set jump address for PUSH instructions

NOTE: This is a candidate for the 9.1 branch

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-16 18:02:11 +04:00
Marek Olšák
8616b224bf gallium/hud: fix FPS computation for framerate > 4.2k 2013-04-16 13:56:47 +02:00
Marek Olšák
332af88c39 gallium/hud: increase vertex buffer size for background black rectangles
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-16 13:56:47 +02:00
Marek Olšák
0108114619 gallium/hud: update the contents of GALLIUM_HUD=help
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-16 13:56:47 +02:00
Marek Olšák
30284f8892 gallium/hud: remove pipeline-statistics- prefix in query names
for the env var string not to be awfully long

v2: fix bug in indexing of "name"

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-16 13:56:47 +02:00
Marek Olšák
dfe5367f0f r600g: implement pipeline statistics query 2013-04-16 13:56:47 +02:00
Marek Olšák
817723baf8 winsys/radeon: use query_value for timestamp, remove query_timestamp 2013-04-16 13:56:47 +02:00
Marek Olšák
413ca78af3 r600g: add a debug flag for printing virtual addresses of resources 2013-04-16 13:56:47 +02:00
Marek Olšák
05fa3595e0 r600g: add a query returning the amount of time spent during bo_map sync. 2013-04-16 13:56:47 +02:00
Matt Turner
b3f1f665b0 build: Get rid of GALLIUM_WINSYS_DIRS
configure still uses it to print the enabled winsys.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:05:55 -07:00
Matt Turner
3a6e548a85 build: Get rid of GALLIUM_TARGET_DIRS
configure still uses it to print the enabled targets.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:05:55 -07:00
Matt Turner
2f7a37d858 build: Build pipe-loader before gallium tests
And don't build it from other Makefiles. That's awful, and breaks
distclean.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:05:55 -07:00
Matt Turner
0d3b1b0e2e build: Get rid of GALLIUM_MAKE_DIRS
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:05:55 -07:00
Matt Turner
69b69b1a0b build: Stop using GALLIUM_STATE_TRACKERS_DIRS for SUBDIRS
configure still uses it to print the enabled state trackers.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:04:26 -07:00
Matt Turner
13a7010c21 build: Get rid of DRIVER_DIRS
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:04:26 -07:00
Matt Turner
8341effd4a build: Stop AC_SUBST'ing DRI_DIRS and GALLIUM_DRIVERS_DIRS
Neither is used in the Makefile.ams.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:04:26 -07:00
Matt Turner
70531b4a25 build: Remove GALLIUM_DIRS
It's always constant anyway.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:04:26 -07:00
Matt Turner
a9676ae44a build: Get rid of SRC_DIRS
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:04:26 -07:00
Matt Turner
691c30404d build: Get rid of CORE_DIRS
A step toward working make dist/distcheck.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:04:25 -07:00
Matt Turner
d5e9426b96 build: Move src/mapi/mapi/* to src/mapi/
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:04:25 -07:00
Matt Turner
3c690524e2 build: Rename sources.mak -> Makefile.sources
For the sake of consistency.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-15 12:04:25 -07:00
Tom Stellard
d50343dff1 radeonsi: Read config values from the .AMDGPU.config ELF section
Instead of emitting configuration values (e.g. number of gprs used) in a
predefined order, the LLVM backend now emits these values in
register/value pairs.  The first dword contains the register address and
the second dword contains the value to write.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-15 10:54:30 -07:00
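A minimal sketch of reading the register/value pair layout described in d50343dff1. The function name and the register addresses used with it are hypothetical, and the sketch assumes the section contents have already been loaded into an array of 32-bit words; it does not parse ELF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scans (register address, value) dword pairs for a given register.
 * Returns 1 and stores the value if found, 0 otherwise. */
static int config_lookup(const uint32_t *dwords, size_t num_dwords,
                         uint32_t reg, uint32_t *value)
{
    assert(num_dwords % 2 == 0);   /* pairs: the section must be even-sized */

    for (size_t i = 0; i + 1 < num_dwords; i += 2) {
        if (dwords[i] == reg) {    /* first dword: register address */
            *value = dwords[i + 1];/* second dword: value to write  */
            return 1;
        }
    }
    return 0;
}
```

Because each value is tagged with its register, the reader no longer depends on the order in which the backend emits them, and unknown registers can simply be skipped.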
Tom Stellard
9277b04c02 radeon/llvm: Handle ELF formatted binary output from the LLVM backend 2013-04-15 10:54:29 -07:00
Tom Stellard
7782d19cdc radeon/llvm: Use a struct for storing compiled code 2013-04-15 10:13:10 -07:00
Roland Scheidegger
1d6eb23f2d gallivm: fix small but severe bug in handling multiple lod level strides
The value for the second quad was inserted in the wrong place for the
following shuffle. This meant the row or image stride was undefined, which is
quite catastrophic: it can lead to bogus texels being fetched or just segfault.
This code is only hit for the SoA path currently; still, it is surprising it
didn't crash more or cause more visible issues (I think llvm used a
broadcast shuffle for the undefined parts of the vector, hence the undefined
value for the second quad was just the same as that from the first quad,
so as long as both quads hit the same mip level everything was fine, and since
lower mips always have the same large stride it made it less likely to
hit out-of-bound memory in case of differing lods).

Note: this is a candidate for stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-15 15:23:40 +02:00
Francisco Jerez
02b808b08a clover: Fix usage of incorrect object as destination in clEnqueueCopyBufferToImage.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
2013-04-13 14:24:10 +02:00
Francisco Jerez
1a8ad6c2e3 clover: Define platform class and merge with device_registry.
Null platform IDs are OK according to the spec, but some applications have
been reported to get paranoid and assume that our NULL platform is unusable.

As it doesn't hurt to have device enumeration separate from the rest of the
device code (quite the opposite, it makes the code cleaner), make the API use
an actual platform object that keeps track of the available devices instead of
the former NULL pointer.

Reported-and-reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
2013-04-13 14:20:16 +02:00
Francisco Jerez
6ace452055 clover: Add missing fields to the module serializer.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
2013-04-13 14:12:49 +02:00
Eric Anholt
1658efc42c i965: Shut up the last release build warning.
I don't see a sensible value to use in this path, but we shouldn't ever
hit this outside of developer new-texture-target enabling.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:14 -07:00
Eric Anholt
dcb1b89c65 i965: Silence one more compile warning.
We don't want to store this thing in the class, and we do need the
definition to be at the top of the function and held onto until the end
here, so there's not much to do besides (void) reference it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:14 -07:00
Eric Anholt
dea70404eb i965: Fix a warning in the release build.
This was copy and pasted from can_reswizzle_dst(), and we can just fold it
in instead to avoid the warning.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:14 -07:00
Eric Anholt
28170c5b7f i965: Fix an unused variable warning in the release build.
I think this actually clarifies what's going on in the asserts a bit,
given how many regions we've got floating around.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:13 -07:00
Eric Anholt
248175ab3b i965: Fix an unused variable warning in the release build.
It's used in an assert, but we have this as a member of the class anyway.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:13 -07:00
Eric Anholt
6cec233c62 intel: Return failure properly in the texsubimage blit path.
We assert that failure doesn't happen, but it fixes a warning in the
release build and it would at least give working behavior for a user by
falling back to the normal texsubimage path.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:13 -07:00
Eric Anholt
b681a89588 intel: Fix a warning in the release build.
This was silly -- checking that we didn't overflow the array by dividing
the array size by 2 and then multiplying it back up by 2.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:13 -07:00
Eric Anholt
1433936fe5 intel: Fix an unused variable warning in the release build.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:13 -07:00
Eric Anholt
9167ba8584 intel: Improve diagnostics for emit_linear_blit failure path.
This fixes unused variable warnings in the release build, and should be
more useful if it ever triggers.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:13 -07:00
Eric Anholt
aceba66795 i965: Fix error path for MCS allocation.
Asserts don't stop execution in release builds, so we would continue on to
use an uninitialized format value.  Just take the failure path, which
appears to continue up the call stack for a while.
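
A minimal sketch of the pattern, assuming a hypothetical format enum and
helper (not the actual i965 code): the unexpected case returns failure
explicitly instead of relying on an assert that is compiled out under
-DNDEBUG.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical format names, for illustration only. */
enum mcs_format {
   MCS_FORMAT_4X,
   MCS_FORMAT_8X
};

/* Pick a format for the given sample count.  Returns false, leaving *out
 * untouched, when the count is not supported. */
static bool
choose_mcs_format(unsigned num_samples, enum mcs_format *out)
{
   switch (num_samples) {
   case 4:
      *out = MCS_FORMAT_4X;
      return true;
   case 8:
      *out = MCS_FORMAT_8X;
      return true;
   default:
      /* An assert alone here would vanish in a release build and the
       * caller would go on to read an uninitialized format.  Report
       * failure so the caller can take its error path. */
      return false;
   }
}
```

The caller is then expected to check the return value before using the
format.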

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:12 -07:00
Eric Anholt
331766b9a2 i830: Move assert-only code into the assert.
The call has no side effects, and moving it into the assert cleans up a
compile warning in the release build.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:12 -07:00
Eric Anholt
adf251406b i965/fs: Fix some untriggered optimization bugs with uncompressed/sechalf.
We have this support for firsthalf/sechalf instructions, which would be
called in the !has_compr4 (aka original gen4) 16-wide case.  We currently
only support 16-wide for gen5+, so we weren't tripping over this, but it
would have been a problem if we ever try to enable it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:12 -07:00
Eric Anholt
eaca8a94e2 i965/fs: Add basic-block-level dead code elimination.
This is a poor substitute for proper global dead code elimination that
could replace both our current paths, but it was very easy to write.  It
particularly helps with Valve's shaders that are translated out of DX
assembly, which has been register allocated and thus have a bunch of
unrelated uses of the same variable (some of which get copy-propagated
from and then left for dead).
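
A minimal sketch of one way a block-local pass like this can work; it is
not the actual i965 implementation.  The instruction struct and register
count are hypothetical, and it assumes every write is a full write with no
side effects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_REGS 8
#define NO_REG (-1)

/* Hypothetical, simplified instruction: one destination, two sources. */
struct inst {
   int dst;      /* register written, or NO_REG */
   int src[2];   /* registers read, or NO_REG */
   bool dead;    /* set by the pass when the write is never read */
};

/* Walk one basic block forward.  For each register, remember the last
 * write that has not been read yet.  If the register is written again
 * before any read, that earlier write is dead.  Writes still pending at
 * the end of the block are kept, since a later block may read them.
 * Returns the number of instructions marked dead. */
static size_t
eliminate_dead_writes(struct inst *block, size_t count)
{
   int pending[NUM_REGS];
   size_t removed = 0;

   for (int r = 0; r < NUM_REGS; r++)
      pending[r] = -1;

   for (size_t i = 0; i < count; i++) {
      /* Sources are read before the destination is written. */
      for (int s = 0; s < 2; s++) {
         if (block[i].src[s] != NO_REG)
            pending[block[i].src[s]] = -1;
      }

      int dst = block[i].dst;
      if (dst != NO_REG) {
         if (pending[dst] >= 0) {
            block[pending[dst]].dead = true;
            removed++;
         }
         pending[dst] = (int)i;
      }
   }
   return removed;
}
```

A single pass does not chase chains of dead values; a real pass would
typically be rerun until it stops making progress.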

shader-db results:
total instructions in shared programs: 1735753 -> 1731698 (-0.23%)
instructions in affected programs:     492620 -> 488565 (-0.82%)

v2: Fix comment typo

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:12 -07:00
Eric Anholt
36d0fde603 i965/fs: Remove incorrect note of writing attr in centroid workaround.
This instruction doesn't update its IR destination, it just moves from
payload to f0.  This caused the dead code elimination pass I'm adding to
dead-code-eliminate the first step of interpolation.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:12 -07:00
Eric Anholt
2cb7f1e766 i965/fs: Add a helper function for checking for partial register updates.
These checks were all over, and every time I wrote one I had to try to
decide again what the cases were for partial updates.

v2: Fix inadvertent reladdr check removal.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:12 -07:00
Eric Anholt
df25b4f3cf mesa: Add a macro to bitset for determining bitset size.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:12 -07:00
Eric Anholt
b5a0f59c0f i965: Fix compiler warnings since the introduction of texture multisample.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-12 16:32:11 -07:00
Ian Romanick
1faaa411c7 mesa: Don't leak gl_context::BeginEnd at context destruction
The other dispatch tables (Exec and Save) are freed, but BeginEnd is
never freed.  This was found by inspection while investigating the leak of
shared state in _mesa_initialize_context.

NOTE: This is a candidate for stable branches

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-12 16:24:48 -07:00
Ian Romanick
6e06550e4e mesa: Don't leak shared state when context initialization fails
Back up at line 1017 (not shown in patch), we add a reference to the
shared state.  Several places after that may divert to the error
handler, but, as far as I can tell, nothing ever unreferences the shared
state.

Fixes issue identified by Klocwork analysis:

    Resource acquired to 'shared->TexMutex' at line 1012 may be lost
    here. Also there is one similar error on line 1087.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-12 16:24:48 -07:00
Ian Romanick
f730c210b8 egl/dri2: NULL check value returned by dri2_create_surface
dri2_create_surface can fail for a variety of reasons, including bad
input data.  Dereferencing the NULL pointer and crashing is not okay.

Fixes issue identified by Klocwork analysis:

    Pointer 'surf' returned from call to function 'dri2_create_surface'
    at line 285 may be NULL and will be dereferenced at line 291.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-12 16:24:48 -07:00
Ian Romanick
2cc0b3294a mesa: NULL check the pointer before trying to dereference it
Duh.

Fixes issues identified by Klocwork analysis:

    Pointer 'table' returned from call to function 'calloc' at line 115
    may be NULL and will be dereferenced at line 117.

and

    Suspicious dereference of pointer 'table' before NULL check at line
    119.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-12 16:24:48 -07:00
Ian Romanick
ee55b845d2 glsl: Fix hypothetical NULL dereference related to process_array_type
Ensure that process_array_type never returns NULL, and let
process_array_type handle the case where the supplied base type is NULL.

Fixes issues identified by Klocwork analysis:

    Pointer 'type' returned from call to function 'get_type' at line
    1907 may be NULL and may be dereferenced at line 1912.

and

    Pointer 'field_type' checked for NULL at line 4160 will be
    dereferenced at line 4165. Also there is one similar error on line
    4174.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-12 16:24:44 -07:00
Ian Romanick
278c9af85e glsl: Fix hypothetical NULL dereference in ast_process_structure_or_interface_block
Fixes issue identified by Klocwork analysis:

    Pointer 'field_type' returned from call to function 'glsl_type' at
    line 4126 may be NULL and may be dereferenced at line 4139.  Also
    there are 2 similar errors on line(s) 4165, 4174.

In practice, it should be impossible to actually get NULL in here
because a syntax error would have already caused compilation to halt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-12 16:24:39 -07:00
Tom Stellard
c6a86fb563 r300g: Fix bug in OMOD optimization
https://bugs.freedesktop.org/show_bug.cgi?id=60503

NOTE: This is a candidate for the stable branches.
2013-04-12 08:33:31 -07:00
Emil Velikov
ac1118d53c nvc0: set ret variable if launch desc allocation failed
Pointed out by gcc

nve4_compute.c: In function 'nve4_launch_grid':
nve4_compute.c:511:7: warning: 'ret' may be used uninitialized in
 this function [-Wmaybe-uninitialized]
    if (ret)
       ^

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Edit by Christoph Bumiller:
Set it to -1 to indicate failure and only when it's actually required.
2013-04-12 17:15:14 +02:00
Emil Velikov
48bcb94dc3 nvc0: bail out early during nve4_compute_setup()
Exit gracefully rather than trying to create a random object, whenever the
chipset is unknown

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-04-12 17:10:11 +02:00
Emil Velikov
e28c266682 nvc0: compile nve4_cache_split_name() only in debug build
As otherwise it is unused - pointed out by gcc

nve4_compute.c:586:20: warning: 'nve4_cache_split_name' defined but not used [-Wunused-function]
 static const char *nve4_cache_split_name(unsigned value)
                    ^

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-04-12 17:09:03 +02:00
Emil Velikov
249f3d73cf nv50/codegen: do not emitATOM() if the subOp is unknown
For debug build we'll hit the assert, for release we are going to emit random data
as subOp is used uninitialised. Spotted by gcc

codegen/nv50_ir_emit_nv50.cpp: In member function 'void nv50_ir::CodeEmitterNV50::emitATOM(const nv50_ir::Instruction*)':
codegen/nv50_ir_emit_nv50.cpp:1554:12: warning: 'subOp' may be used uninitialized in this function [-Wmaybe-uninitialized]
    uint8_t subOp;
            ^

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-04-12 17:08:26 +02:00
Christoph Bumiller
4da54c91d2 nvc0: implement multisample textures 2013-04-12 13:02:18 +02:00
Christoph Bumiller
71c1c8a9b8 nvc0: patch up TEX cases with 5 or 6 sources on nve4
Hackishly fixes alignment requirement of 2nd tuple for now.
2013-04-12 11:41:35 +02:00
Christoph Bumiller
2b62ba7cb0 nvc0: fix 2D engine MS2 resolve 2013-04-12 11:41:35 +02:00
Christoph Bumiller
69804c2ab8 nv50,nvc0: add RGBX16/32_FLOAT formats 2013-04-12 11:41:35 +02:00
Matt Turner
195a6cca3c i965/vs: Print error if vertex shader fails to compile.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-11 17:22:07 -07:00
Matt Turner
32a8e87766 i965: NULL check prog on shader compilation failure.
Also change if (shader) to if (prog) for consistency.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-11 17:21:13 -07:00
José Fonseca
ed9687cf1b scons: Add st_cb_msaa.c to source list. 2013-04-11 22:37:34 +01:00
Dave Airlie
f024c72476 r600g: add get_sample_position support (v3)
v2: I rewrote this to use the sample positions properly.
v3: rewrite properly to use bitfield to cast back to signed ints

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-11 21:09:29 +01:00
Dave Airlie
f152da6bf9 st/mesa: add support for ARB_texture_multisample (v3)
This adds support to the mesa state tracker for ARB_texture_multisample.

Hardware doesn't seem to use different texture instructions, so
I don't think we need to create one for TGSI at this time.

Thanks to Marek for fixes to sample number picking.

v2: idr pointed out a bug in how we picked the max sample counts,
use new internal format chooser interface to pick proper answers.
v3: use st_choose_format directly, it was okay, fix anding of masks.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-11 21:09:29 +01:00
Dave Airlie
1d90ee5ef5 st/mesa: add support for get sample position
This just calls into the gallium interface.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-11 21:09:28 +01:00
Dave Airlie
cc906396c7 gallium: add get_sample_position interface
This is to be used to implement glGet GL_SAMPLE_POSITION.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-11 21:09:28 +01:00
Dave Airlie
184278a804 r600g: fix two issues in compressed msaa reading code
I've no idea when sample_chan would ever be 4 here, but 4 is most
definitely wrong, array textures have it as 3 as well.

Also the cayman code though unused is obviously wrong.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-11 21:09:27 +01:00
Paul Berry
e9fa3a9448 i965/vs: Don't hardcode DEBUG_VS in generic vec4 code.
Since the vec4_visitor and vec4_generator classes are going to be
re-used for geometry shaders, we can't enable their debug
functionality based on (INTEL_DEBUG & DEBUG_VS) anymore.  Instead, add
a debug_flag boolean to these two classes, so that when they're
instantiated the caller can specify whether debug dumps are needed.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:26 -07:00
Paul Berry
defdb310b7 i965/vs: Generalize computation of array strides in preparation for GS.
Geometry shader inputs are arrays, but they use an unusual array
layout: instead of all array elements for a given geometry shader
input being stored consecutively, all geometry shader inputs are
interleaved into one giant array.  As a result, the array stride we
use to access geometry shader inputs must be equal to the size of the
input VUE, rather than the size of the array element.
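
A minimal sketch of the difference between the two layouts.  The function
names and the choice of units (registers) are assumptions made for
illustration, not the driver's actual interface.

```c
#include <assert.h>

/* Conventional layout: all elements of one array are consecutive, so the
 * stride is the size of a single element. */
static unsigned
packed_offset(unsigned base, unsigned index, unsigned element_size)
{
   return base + index * element_size;
}

/* Interleaved layout: each input vertex contributes one whole VUE, and a
 * given input lives at the same offset inside every VUE.  Stepping the
 * array index therefore steps over a full VUE. */
static unsigned
interleaved_offset(unsigned attr_offset_in_vue, unsigned vertex,
                   unsigned vue_size)
{
   return vertex * vue_size + attr_offset_in_vue;
}
```

With a 5-register VUE, for example, the input at offset 2 for vertex 3
lands at 3 * 5 + 2 = 17, whereas a packed array of 1-register elements
would place element 3 at base + 3.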

This patch introduces a new virtual function,
vec4_visitor::compute_array_stride(), which will allow geometry shader
compilation to specialize the computation of array stride to account
for the unusual layout of geometry shader input arrays.  It also
renames the local variable that the ir_dereference_array visitor uses
to store the stride, to avoid confusion.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:26 -07:00
Paul Berry
444fce6398 i965/vs: Generalize attribute setup code in preparation for GS.
This patch introduces a new function,
vec4_visitor::lower_attributes_to_hw_regs(), which replaces registers
of type ATTR in the instruction stream with the hardware registers
that store those attributes.  This logic will need to be common
between the vertex and geometry shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:26 -07:00
Paul Berry
28fe02ce6e i965/vs: Generalize vertex emission code in preparation for GS.
This patch introduces a new function, vec4_visitor::emit_vertex(),
which contains the code for emitting vertices that will need to be
common between the vertex and geometry shaders.

Geometry shaders will need to use a different message header, and a
different opcode, for their URB writes, so we introduce virtual
functions emit_urb_write_header() and emit_urb_write_opcode() to take
care of the GS-specific behaviours.

Also, since vertex emission happens at the end of the VS, but in the
middle of the GS, we need to be sure to only call
emit_shader_time_end() during VS vertex emission.  We accomplish this
by moving the call to emit_shader_time_end() into the VS
implementation of emit_urb_write_opcode().

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
7214451bdc i965/vs: rename vec4_generator::generate_vs_instruction.
Since this function is going to get used for geometry shaders too, it
deserves a more generic name: generate_vec4_instruction.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
9bb6840b28 i965/vs: Generalize data structures pointed to by vec4_generator.
This patch removes the following field from vec4_generator, since it
is not used:

- struct brw_vs_compile *c

And changes the following field:

- struct gl_vertex_program *vp => struct gl_program *prog

With these changes, vec4_generator no longer refers to any VS-specific
data structures.  This will pave the way for re-using it for geometry
shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

v2: Use the name "prog" rather than "p".

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
4d773603d3 i965/vs: Rename vec4_generator::prog to shader_prog.
The next patch is going to change the type of vec4_generator::vp from
struct gl_vertex_program * to struct gl_program *, and rename it.  The
sensible name to change it to is vec4_generator::prog.  However, prog
is already used.  Since the existing vec4_generator::prog is of type
struct gl_shader_program, it makes sense to rename it to shader_prog.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
5743bea0ba i965/vs: move VS-specific data members to vs_vec4_visitor.
This patch moves the following data structures from vec4_visitor to
vec4_vs_visitor, since they contain VS-specific data:

- struct brw_vs_compile *c (renamed to vs_compile)
- struct brw_vs_prog_data *prog_data (renamed to vs_prog_data)
- src_reg *vp_temp_regs
- src_reg vp_addr_reg

Since brw_vs_compile and brw_vs_prog_data also contain vec4-generic
data, the following pointers are added to the base class, to allow it
to access the vec4-generic portions of these data structures:

- struct brw_vec4_compile *c
- struct brw_vec4_prog_key *key
- struct brw_vec4_prog_data *prog_data

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

v2: Use shorter names in the base class and longer names in the
derived class.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
0ce95222af i965/vs: move ARB_vertex_program functions to vec4_vs_visitor.
This patch moves functions from vec4_visitor to vec4_vs_visitor that
deal with ARB (assembly) vertex programs.  There's no point in having
these functions in the base class since we don't intend to support
assembly programs for the GS stage.  The following functions are
moved:

- setup_vp_regs
- get_vp_dst_reg
- get_vp_src_reg

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
42a3d63dd4 i965/vs: Add virtual function make_reg_for_system_value().
The system values handled by vec4_visitor::visit(ir_variable *) are
VS-specific (vertex ID and instance ID).  This patch moves the
handling of those values into a new virtual function,
make_reg_for_system_value(), so that this VS-specific code won't be
inherited by geometry shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
8941f73c7c i965/vs: Make some vec4_visitor functions virtual.
This patch makes the following vec4_visitor functions virtual, since
they will need to be implemented differently for vertex and geometry
shaders.  Some of the functions are renamed to reflect their generic
purpose, rather than their VS-specific behaviour:

- setup_attributes
- emit_attribute_fixups (renamed to emit_prolog)
- emit_vertex_program_code (renamed to emit_program_code)
- emit_urb_writes (renamed to emit_thread_end)

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
e9be5a05f7 i965/vs: Make vec4_vs_visitor class derived from vec4_visitor.
This patch just creates the derived class; later patches will migrate
VS-specific functions and data structures from the base class into the
derived class.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:25 -07:00
Paul Berry
5fff3752c8 i965/vs: split brw_vs_prog_data into generic and VS-specific parts.
This will allow the generic parts to be re-used for geometry shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

v2: Put urb_read_length and urb_entry_size in the generic struct.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:24 -07:00
Paul Berry
0c994f181c i965/vs: split brw_vs_prog_key into generic and VS-specific parts.
This will allow the generic parts to be re-used for geometry shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:24 -07:00
Paul Berry
d7af636473 i965/vs: split brw_vs_compile into generic and VS-specific parts.
This will allow the generic parts to be re-used for geometry shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:24 -07:00
Paul Berry
09cd6e06d2 i965/vs: Remove brw_vs_prog_data pointer from brw_vs_compile.
In patches that follow, we'll be splitting structs brw_vs_prog_data
and brw_vs_compile into a vec4-generic base struct and a VS-specific
derived struct (this will allow the vec4-generic code to be re-used
for geometry shaders).  Having brw_vs_compile point to
brw_vs_prog_data makes it difficult to do this cleanly.

Fortunately most of the functions that use brw_vs_compile (those in
the vec4_visitor class) already have access to brw_vs_prog_data
through a separate pointer (vec4_visitor::prog_data).  So all we have
to do is use that pointer consistently, and plumb prog_data through
the few remaining functions that need access to it.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:24 -07:00
Paul Berry
deffbbed4e i965: Generalize computation of VUE map in preparation for GS.
This patch modifies the arguments to brw_compute_vue_map() so that
they no longer bake in the assumption that we are generating a VUE map
for vertex shader outputs.  It also makes the function non-static so
that we can re-use it for geometry shader outputs.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:24 -07:00
Paul Berry
b29613371c i965/vs: Make type of vec4_visitor::vp more generic.
The vec4_visitor functions don't use any VS specific data from
vec4_visitor::vp.  So rename it to "prog" and change its type from
struct gl_vertex_program * to struct gl_program *.  This will allow
the code to be re-used for geometry shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

v2: Use the name "prog" rather than "p".

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:24 -07:00
Paul Berry
fe97f26c86 i965: Rename backend_visitor::prog to shader_prog.
The next patch is going to change the type of vec4_visitor::vp from
struct gl_vertex_program * to struct gl_program *, and rename it.  The
sensible name to change it to is vec4_visitor::prog.  However, prog is
already used in backend_visitor (which vec4_visitor derives from).
Since backend_visitor::prog is of type struct gl_shader_program *, it
makes sense to rename it to shader_prog.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-11 09:25:24 -07:00
Paul Berry
5b0bd8ece8 glsl: Fix (and validate) comment above glsl_type::name.
The comment above glsl_type::name claimed that it could sometimes be
NULL.  This was wrong--it is never NULL.  Many error handling paths
would segfault if it were.  (Anonymous structs are assigned names like
"#anon_struct_0001"--see the ast_struct_specifier constructor in
glsl_parser_extras.cpp.)

Fix the comment and add assertions to validate that it really is never
NULL.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-04-11 09:25:24 -07:00
Christian König
5b2855bfe7 radeon/uvd: add UVD implementation v5
Just everything you need for UVD with r600g and radeonsi.

v2: move UVD code to radeon subdir, clean up build system additions,
    remove an unused SI function, disable tiling on SI for now.
v3: some minor indentation fix and rebased
v4: dpb size calculation fixed
v5: implement proper fall-back in case the kernel doesn't support UVD,
    based on patches from Andreas Boll but cleaned up a bit more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-11 17:10:28 +02:00
Christian König
f91e4d2c9d radeon/winsys: add uvd ring support to winsys v3
Separated from UVD patch for clarity.

v2: sync with next tree for 3.10
v3: as pointed out by Andreas Boll check for drm minor >= 32

http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-3.10-wip

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-04-11 17:10:01 +02:00
Dave Airlie
cb12bf7606 st/mesa: fix UBO offsets.
Reported and tested by degasus on #radeon.

Note: This is a candidate for the 9.1 branch

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-11 15:20:19 +10:00
Ralf Jung
3998f8c6b5 egl/x11: Fix initialisation of swap_interval
The EGLConfig attributes EGL_MIN/MAX_SWAP_INTERVAL were incorrectly set to
0 and 0. This prevented clients from setting the swap interval to a
reasonable value, like 1 or 2.

Swap interval worked correctly in Mesa 9.0. The commit below introduced
the bug.

    commit 7e9bd2b2ed
    Author: Eric Anholt <eric@anholt.net>
    Date:   Tue Sep 25 14:05:30 2012 -0700
	egl: Add support for driconf control of swapinterval.

Note: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63078
[chadv: Wrote commit message]
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-10 19:16:45 -07:00
Kenneth Graunke
cbe24ff7c8 intel: Fall back to X-tiling when larger than estimated aperture size.
If a region is larger than the estimated aperture size, we map/unmap it
by copying with the BLT engine.  Which means we can't use Y-tiling.

Fixes Piglit max-texture-size and tex3d-maxsize, which regressed in my
recent change to use Y-tiling by default on Gen6+.  This was due to a
botched merge conflict resolution.

v2: Return a mask of valid tilings from intel_miptree_select_tiling.
    This allows us to avoid the X-tiling fallback if Y-tiling is actually
    mandatory.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-10 16:54:31 -07:00
Kenneth Graunke
eef3dff3fd intel: Refactor code in intel_miptree_choose_tiling().
This reduces the nesting level slightly, and in my opinion, makes it a
bit easier to follow.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-10 16:54:31 -07:00
Kenneth Graunke
ba38ac062c intel: Move the max_gtt_map_object_size estimation to intel_context.
We need to know this in order to decide what tiling mode to use.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-10 16:54:31 -07:00
Fredrik Höglund
fb69dbb0d1 r600g: Add support for GL_ARB_texture_buffer_range
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-11 00:10:45 +02:00
Paul Berry
42767dc22f i965/blorp: Remove unnecessary test in gen7_blorp_emit_depth_stencil_config.
gen7_blorp_emit_depth_stencil_config() is only called when
params->depth.mt is non-null.  Therefore, it's not necessary to do an
"if (params->depth.mt)" test inside it.  The presence of this if test
was misleading static analysis tools (and briefly, me) into thinking
that gen7_blorp_emit_depth_stencil_config() might sometimes access
uninitialized data and dereference a null pointer.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-10 13:17:53 -07:00
Marek Olšák
34c3f98641 r600g: fix valgrind warning on Cayman
Warning: "Conditional jump or move depends on uninitialised value(s)".
2013-04-10 21:56:51 +02:00
Zack Rusin
fe29f99293 gallivm/tgsi: handle untyped moves
both mov and ucmp can be used to move variables of any type.
correctly note that about ucmp in the tgsi_info and make
sure gallivm can handle that by correctly casting the untyped
moves.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-10 12:37:17 -07:00
Zack Rusin
d56f2d5267 gallivm: fix loops and conditionals within GS
We were using simple temporaries, without using alloca or phi
nodes which meant that on every iteration of the loop our
temporaries, which were holding the number of vertices and
primitives which were emitted, were being reset to zero. Now
we're using alloca to allocate those variables to preserve
them across conditionals.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-10 12:33:59 -07:00
Zack Rusin
c1cd19c3b8 llvmpipe: implement PIPE_QUERY_SO_STATISTICS
We were missing the implementation of PIPE_QUERY_SO_STATISTICS
query, this change implements it on top of the existing
facilities.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-10 12:32:56 -07:00
Zack Rusin
7466e0b6c8 gallivm: fix unsigned divide and remainder opcodes
We want to both make sure we never divide by zero to not generate
sigfpe and that divide by zero is guaranteed to return 0xffffffff.
Based on José's idea.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-10 12:31:22 -07:00
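
The guarded divide described in the commit above can be modelled on scalars roughly as follows. This is only a sketch under assumptions: the function name udiv_guarded is hypothetical, and the real change emits LLVM IR over vectors rather than plain C. The mask trick shown here is one way to get both properties the message asks for (no trap, and 0xffffffff for a zero divisor).

```c
#include <stdint.h>

/* Hypothetical scalar model of a divide that never traps and returns
 * 0xffffffff when the divisor is zero. */
static uint32_t
udiv_guarded(uint32_t a, uint32_t b)
{
   /* All ones when the divisor is zero, all zeros otherwise. */
   uint32_t div_mask = (b == 0) ? UINT32_MAX : 0;

   /* OR-ing the mask in turns a zero divisor into 0xffffffff, so the
    * division itself can never raise SIGFPE. */
   uint32_t q = a / (b | div_mask);

   /* OR-ing the mask into the quotient forces 0xffffffff for b == 0 and
    * leaves the quotient untouched otherwise. */
   return q | div_mask;
}
```

For a non-zero divisor the mask is zero and both OR operations are no-ops, so the common path costs only a compare and two ORs on top of the divide.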
Zack Rusin
1ad4a4eeb3 gallivm: fix breakc
We break when the mask values are 0, not 1, and it is a bit comparison,
not a floating point comparison. This fixes both.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-10 12:25:34 -07:00
Chad Versace
e4484a0309 intel/hsw: Enable hiz (v2)
Enable hiz by setting intel_context::has_hiz.  However, to work around
a hardware bug, we selectively enable hiz for only nicely aligned miptree
slices.

No Piglit regressions on Haswell 0x0d26 rev07 when based atop
mesa-master-4ad3601.

Improves the performance of GLB27_TRex_C24Z16_FixedTimeStep by 18.52%
(hsw-0x0d26-rev07; kernel-3.9.0-rc1; GLBenchmark 2.7.0 Release a68901;
samples=3).

v2: Replace the check for IS_HASWELL(devid) in intel_miptree_slice_has_hiz()
    with a conditional set of has_hiz. [for anholt]

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-10 10:55:26 -07:00
Chad Versace
916d1ea7dc i965: Remove brw_context::depthstencil::hiz_mt
After recent refactorings, the field is written but no longer read.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-10 10:55:10 -07:00
Chad Versace
2d3bbc576c intel: Replace checks for hiz_mt with intel_has*hiz()
When appropriate, replace each check `hiz_mt != NULL` with either a call
to intel_miptree_slice_has_hiz() or intel_renderbuffer_has_hiz().  No
behavioral change.

This prepares for selectively enabling hiz on individual miptree slices
for Haswell.

This refactoring had several side effects.

  1. To prevent new warnings about discarding the const qualifier,
     I removed 'const' from some variable declarations in
     intel_validate_framebuffer().  The alternative was to add const
     qualifiers to multiple function signatures in the
     intel_renderbuffer_has_hiz call graph. Since the dominant convention
     in the Intel code is to not qualify function parameters as const,
     I chose to remove rather than add const qualifiers.

  2. I changed the signature of brw_emit_depth_stencil_hiz() by replacing
     `struct intel_mipmap_tree *hiz_mt` with `bool hiz`. The function used
     hiz_mt mostly as a boolean indicator of the presence of hiz, so the
     signature change is consistent with the patch's goal.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-10 10:55:10 -07:00
Chad Versace
5b79705526 i965: Change signature of brw_get_depthstencil_tile_masks()
Add new parameters `depth_level` and `depth_layer`, which specify depth
miptree's slice of interest.  A following patch will pass the new
parameters through to intel_miptree_slice_has_hiz().

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-10 10:55:10 -07:00
Chad Versace
87f4541bc1 i965/blorp: Add fields brw_blorp_mip_info::level,layer
The new fields define the 2D miptree slice to be used. A following patch
will pass the new fields through to intel_miptree_slice_has_hiz().

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-10 10:55:10 -07:00
Chad Versace
2a416a9b1b intel: Add field intel_mipmap_slice::has_hiz
On Haswell, HiZ will selectively be enabled on individual miptree slices
to work around a hardware bug. The new field 'has_hiz' indicates if HiZ is
enabled for a given slice.

Also add two new accessor functions for this field.
  intel_miptree_slice_has_hiz
  intel_renderbuffer_has_hiz

The new field and accessor functions are not yet used. Also, this patch
introduces no behavioral change because, in this patch,
intel_miptree_alloc_hiz() sets has_hiz for all slices.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-10 10:55:10 -07:00
Chad Versace
a14dc4f92c i965/blorp: Align rectangle primitive for hiz ops
The hardware docs and the simulator require that the rectangle primitive
emitted during fast depth clears and hiz resolves must be aligned to 8x4
pixels.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-10 10:55:10 -07:00
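
As an illustration of the 8x4 alignment requirement in the commit above, a minimal sketch follows. The names align_up, struct rect and align_hiz_rect are hypothetical, and the rectangle is assumed to start at the origin; the actual blorp code may compute its coordinates differently.

```c
#include <stdint.h>

/* Round v up to the next multiple of a; a must be a power of two. */
static uint32_t
align_up(uint32_t v, uint32_t a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Hypothetical rectangle type, not the driver's own structure. */
struct rect {
   uint32_t x0, y0, x1, y1;
};

/* Grow a width x height rectangle anchored at (0, 0) so that its far
 * corner lands on an 8-pixel column and 4-pixel row boundary. */
static struct rect
align_hiz_rect(uint32_t width, uint32_t height)
{
   struct rect r = { 0, 0, align_up(width, 8), align_up(height, 4) };
   return r;
}
```

With these assumptions a 300x300 surface yields a 304x300 rectangle: the width is rounded up to a multiple of 8, while 300 is already a multiple of 4.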
Eric Anholt
d5f7aebac2 i965/vs: Use GRFs for pull constant offsets on gen7.
This allows the computation of the offset to get written directly into the
message source.

shader-db results:
total instructions in shared programs: 3308390 -> 3283025 (-0.77%)
instructions in affected programs:     442998 -> 417633 (-5.73%)

No difference in GLB2.7 low res (n=9).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-10 09:45:21 -07:00
Eric Anholt
3badbf7f7f i965/vs: When asked to make a dst_reg for a src.xxxx, just write to src.x.
We have several places in our pull constant handling where we make a
temporary src_reg for an int, and then turn it into a dst.  In doing so,
we were writing to the dst.xyzw, so we never register coalesced it with a
later mov from dst.x to real_dst.x.

These extra channels written would be removed if we had channel-wise DCE
in the backend, but we don't.  Fix it for now by just not writing these
extra channels that won't get used.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-04-10 09:45:21 -07:00
Eric Anholt
007a88ed24 i965/gen6: Reduce updates of transform feedback offsets with HW contexts.
The software-tracked transform feedback offsets (svbi_0_starting_index)
are incorrect in the presence of primitive restart, so we were actually
updating it with a bogus value if the batch wrapped and we emitted the
packet again during a single transform feedback.  By reducing state
emission, we avoid the bug.

Fixes piglit OpenGL 3.1/primitive-restart-xfb flush
Reviewed-by: Paul Berry <stereotype441@gmail.com>
NOTE: This is a candidate for the 9.1 branch.
2013-04-10 09:45:21 -07:00
Eric Anholt
62a18da341 i965/gen7: Skip resetting SOL offsets at batch start with HW contexts.
The software-tracked transform feedback offsets (svbi_0_starting_index)
are incorrect in the presence of primitive restart, so we can't reliably
compute offsets for our buffer pointers after a batch flush.  Thanks to HW
contexts, our transform feedback offsets are now saved, so we can just
keep using the ones from before the batch wrap.

Fixes piglit OpenGL 3.1/primitive-restart-xfb flush
Reviewed-by: Paul Berry <stereotype441@gmail.com>
NOTE: This is a candidate for the 9.1 branch.
2013-04-10 09:45:21 -07:00
Christian König
ccf3e8fc9b radeonsi: remove sampler writemask v3
v2: fix intrinsic name as well
v3: LLVM revision incremented as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-10 10:41:29 +02:00
Niels Ole Salscheider
31f14f3def pipe-loader: Fix out of source build
Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
2013-04-10 09:45:04 +02:00
Brian Paul
b74b510d64 st/mesa: remove #if FEATURE_GL/ES tests
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-09 18:43:40 -06:00
Brian Paul
c04e0b9f4b mesa: remove old comment about FEATURE_GL
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-09 18:43:40 -06:00
Brian Paul
f490c6839b mesa: remove #ifdef FEATURE_ES2, add some comments instead
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-09 18:43:40 -06:00
Brian Paul
9dc6f76e44 st/mesa: remove #include mfeatures.h
None of these were needed.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-09 18:43:40 -06:00
Brian Paul
04bd972fc3 docs: initial 9.2 release notes file 2013-04-09 18:30:23 -06:00
Brian Paul
acd4fb8b5a st/osmesa: re-use buffers in OSMesaMakeCurrent()
Rather than creating a new buffer each time.  Fixes problems found
with vtk.

Tested-by: Kevin H. Hobbs <hobbsk@ohio.edu>
2013-04-09 18:30:23 -06:00
Marek Olšák
4f1fd920c9 mesa: update derived framebuffer state in GetMultisamplefv
This makes sure that ctx->DrawBuffer->Visual.samples is up-to-date.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-10 02:01:16 +02:00
Marek Olšák
b6475f9437 mesa: fix glGet queries depending on derived framebuffer state (v2)
"ctx->DrawBuffer->Visual" might be invalid if (NewState &_NEW_BUFFERS) != 0.

v2: also fix:
    - RGBA_INTEGER_MODE_EXT
    - RGBA_FLOAT_MODE_ARB (also check API support)
    - FRAMEBUFFER_SRGB_CAPABLE_EXT

NOTE: This is a candidate for stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-10 02:01:16 +02:00
Paul Berry
34efd9214d i965/gen7.5: Allow HW primitive restart for all primitive types.
Gen7.5 (Haswell) hardware supports primitive restart for all primitive
types.  It also handles all possible primitive restart indices.
Rather than specialize both can_cut_index_handle_restart_index() and
the switch statement in can_cut_index_handle_prims() for Haswell, just
return early if the hardware is Haswell because we know it can handle
everything.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-09 15:37:36 -07:00
Paul Berry
a7388f8e6f i965: Only use brw_draw.c's trim() function when necessary.
brw_draw.c contains a trim() function which modifies the vertex count
for quads and quad strips in order to discard dangling vertices.  In
principle this shouldn't be necessary, because hardware from Gen4 onward is
capable of discarding dangling vertices by itself.  However, it's
necessary because as a hack to speed up rendering on Gen 4-5, we
sometimes convert quads to trifans and quad strips to tristrips.  The
trim() function isn't necessary on Gen6 and up.

This patch documents why and when the trim() function is necessary,
and avoids calling it when it's not needed.

This will avoid creating problems when we enable hardware support for
primitive restart of quads and quad strips on Haswell.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-09 15:37:35 -07:00
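
The trimming of dangling vertices described in the commit above could look roughly like the sketch below. The enum and function names are hypothetical stand-ins (the driver works with GL primitive enums), and the exact conditions in brw_draw.c may differ; the point is only that quads consume vertices in groups of four and quad strips in pairs after the first two.

```c
/* Hypothetical primitive kinds for illustration only. */
enum prim_kind {
   PRIM_QUADS,
   PRIM_QUAD_STRIP,
   PRIM_OTHER
};

/* Return the number of vertices that form complete primitives. */
static unsigned
trim_vertex_count(enum prim_kind prim, unsigned count)
{
   switch (prim) {
   case PRIM_QUADS:
      /* Each quad needs exactly four vertices. */
      return count - count % 4;
   case PRIM_QUAD_STRIP:
      /* A strip needs four vertices for the first quad and two more
       * for every further quad. */
      return count > 3 ? count - count % 2 : 0;
   default:
      /* Other primitive types are passed through unchanged. */
      return count;
   }
}
```

This matters when quads are drawn as trifans or quad strips as tristrips: a triangle primitive would happily consume the leftover vertices and draw an extra triangle that the quad primitive would have discarded.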
Paul Berry
56ce7fa4b8 i965/vs: Fix DEBUG_SHADER_TIME when VS terminates with 2 URB writes.
The call to emit_shader_time_end() before the second URB write was
conditioned with "if (eot)", but eot is always false in this code
path, so emit_shader_time_end() was never being called for vertex
shaders that performed 2 URB writes.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-09 12:15:08 -07:00
Christian König
462647453c st/vdpau: fix subtitle related bug v2
Drawing subtitles didn't increase the dirty area of the surface.

Reported and tested by freeedrich on irc.

v2: don't clear the surface

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-04-09 21:11:32 +02:00
Paul Berry
5306af2113 glsl/linker: Reduce scope of non-flat integer varying fix.
In the mailing list discussion of "glsl/linker: fix varying packing
for non-flat integer varyings." (commit 7862bde), we concluded that
since the bug only applies to integral variables, it is safer to just
apply the bug fix to integer varyings.  I forgot to make the change
before pushing the patch upstream.  (Note: we aren't aware of any bugs
in commit 7862bde; it just seems wise to be on the safe side).

This patch makes the change.  Assuming commit 7862bde gets
cherry-picked back to 9.1, this commit should be cherry-picked too.

NOTE: This is a candidate for the 9.1 release branch.
2013-04-09 10:37:16 -07:00
Paul Berry
32d2b2aa2c glsl/linker: Adapt flat varying handling in preparation for geometry shaders.
When a varying is consumed by transform feedback, but is not used by
the fragment shader, assign_varying_locations() sets its interpolation
type to "flat" in order to ensure that lower_packed_varyings never has
to deal with non-flat integral varyings (the GLSL spec doesn't require
integral vertex outputs to be flat if they aren't consumed by the
fragment shader).

A similar situation will arise when geometry shader support is added,
since the GLSL spec only requires integral vertex shader outputs to be
flat when they are consumed by the fragment shader.  This patch
modifies the linker to handle this situation too.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-09 10:25:57 -07:00
Paul Berry
8687c40c2d glsl: Document lower_packed_varyings' "flat" requirement with an assert.
To minimize the variety of type conversions that lower_packed_varyings
needs to perform, it assumes that integral varyings are always
qualified as "flat".  link_varyings.cpp takes care of ensuring that
this is the case (even in the circumstances where GLSL doesn't require
it).

This patch documents the assumption with an assertion, for ease in
future debugging.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-09 10:25:19 -07:00
Paul Berry
7862bde8af glsl/linker: fix varying packing for non-flat integer varyings.
Commit dfb57e7 (glsl: Fix error checking on "flat" keyword to match
GLSL ES 3.00, GLSL 1.50) relaxed the rules for integral varyings: they
only need to be declared as "flat" if they are fragment shader
inputs.  This allowed for the possibility of a vertex shader output
being a non-flat integer, provided that it was not matched to a
fragment shader input.  A non-contrived situation where this might
arise is if a vertex shader generates some integral outputs which are
consumed by transform feedback, but not by the fragment shader.

Unfortunately, lower_packed_varyings assumes that *all* integral
varyings are flat, regardless of whether they are consumed by the
fragment shader.  As a result, attempting to create a non-flat
integral vertex output of a size that required packing (i.e. a size
other than ivec4 or uvec4) would cause an assertion failure in
lower_packed_varyings.

This patch prevents the assertion failure by forcing vertex shader
outputs to be "flat" whenever they are not consumed by the fragment
shader.  This should have no effect on rendering since the "flat"
keyword only affects the behaviour of fragment shader inputs.

Fixes piglit test "spec/EXT_transform_feedback/nonflat-integral".

NOTE: This is a candidate for the 9.1 release branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-09 10:25:15 -07:00
Paul Berry
778ce82b71 glsl: Check the size of ir_print_visitor's mode[] array with STATIC_ASSERT.
ir_print_visitor::visit(ir_variable *)'s mode[] array needs to match
the declaration of the enum ir_variable_mode.  It's hard to verify
that at compile time, but at least we can use a STATIC_ASSERT to make
sure it's the right size.

This required adding ir_var_mode_count to the enum.
2013-04-09 10:19:22 -07:00
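
The size check described in the commit above can be sketched as follows. Everything here is an illustrative stand-in: the enum var_mode, the mode_names table and the ARRAY_SIZE macro are hypothetical, and C11's _Static_assert is used in place of Mesa's STATIC_ASSERT macro.

```c
/* Hypothetical enum with a trailing count member, mirroring the idea of
 * adding ir_var_mode_count to the real enum. */
enum var_mode {
   VAR_AUTO,
   VAR_UNIFORM,
   VAR_IN,
   VAR_OUT,
   VAR_MODE_COUNT   /* must stay last */
};

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* One printable prefix per enum value, in declaration order. */
static const char *const mode_names[] = { "", "uniform ", "in ", "out " };

/* Compilation fails if someone adds an enum value without adding a
 * string, or the other way round. */
_Static_assert(ARRAY_SIZE(mode_names) == VAR_MODE_COUNT,
               "mode_names[] is out of sync with enum var_mode");

static const char *
mode_name(enum var_mode m)
{
   return mode_names[m];
}
```

As the commit message notes, this only verifies the size of the table. It cannot detect two entries listed in the wrong order.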
Paul Berry
67f226e179 glsl: Fix ir_print_visitor's handling of interpolation qualifiers.
This patch updates the interp[] array to match the enum
glsl_interp_qualifier.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: Add a STATIC_ASSERT to make sure the array is the correct size.
This required adding INTERP_QUALIFIER_COUNT to the enum.
2013-04-09 10:19:11 -07:00
Johannes Obermayr
c295874129 autotools: Better describe which cases OProfileJIT is required.
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2013-04-09 17:38:42 +01:00
Brian Paul
4ad360133c softpipe: misc updates to image dumping in softpipe_flush() 2013-04-09 08:27:53 -06:00
Vinson Lee
04ffce3004 tgsi: Ensure struct tgsi_ind_register field Index is initialized.
Fixes uninitialized scalar variable defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-08 18:59:34 -07:00
Martin Andersson
a8246927e3 r600g: Fix UMAD on Cayman
The multiplication part of tgsi_umad did not work on Cayman, because it did
not populate the correct vector slots.

This fixed hardlocks in the EXT_transform_feedback/order tests.

NOTE: This is a candidate for the stable branches.
(might not be easy to cherry-pick though)

Signed-off-by: Marek Olšák <maraeo@gmail.com>
2013-04-09 03:09:37 +02:00
Kenneth Graunke
b76539aabe intel: Remove the texture_tiling driconf option.
This option can force textures to be untiled.  However, on Gen6+, depth
buffers must be Y-tiled.  MSAA buffers also must be Y-tiled.  So setting
this option on even a trivial application like glxgears causes assertion
failures in a debug build, and likely GPU hangs in a release build.

It's just giving users a license to shoot themselves in the foot.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 16:15:07 -07:00
Kenneth Graunke
55ecc448b9 i965: Prefer Y-tiling on Gen6+.
In the past, we preferred X-tiling for color buffers because our BLT
code couldn't handle Y-tiling.  However, the BLT paths have been largely
replaced by BLORP on Gen6+, which can handle any kind of tiling.

We hadn't measured any performance improvement in the past, but that's
probably because compressed textures were all untiled anyway.

Improves performance in GLB27_TRex_C24Z16_FixedTime by 7.69231%.

v2: Rebase on top of Eric's untiled-for-larger-than-aperture changes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 16:15:07 -07:00
Kenneth Graunke
40e30c1ca1 i965: Use tiling even for compressed textures.
The code has no rationale for why we would force compressed textures to
be untiled, and it appears to work fine.  Git archeology indicates that
it's been that way dating back to when we first started tiling.

Improves performance in GLB27_TRex_C24Z16_FixedTimeStep at 1280x720 by
10.0529% +/- 0.573075% (n=12).  Improves performance in Xonotic by
4.56409% +/- 0.27965% (n=3).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 16:15:07 -07:00
Chad Versace
f709198b10 intel: Refactor selection of miptree tiling
This patch (1) extracts from intel_miptree_create() the spaghetti logic
that selects the tiling format, (2) rewrites that spaghetti into a lucid
form, and (3) moves it to a new function, intel_miptree_choose_tiling().
No behavioral change.

As a bonus, it is now evident that the force_y_tiling parameter to
intel_miptree_create() does not really force Y tiling.

v2 (Ken): Rebase on top of Eric's untiled-for-larger-than-aperture
changes.  This required passing in the miptree.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 16:15:06 -07:00
Chad Versace
aa391976df intel: Allocate hiz in intel_renderbuffer_move_to_temp()
When moving the renderbuffer to a new miptree, we neglected to allocate
the hiz buffer for the new miptree. Oops.

Fixes all Piglit depthstencil-render-miplevels tests from crash to pass on
Sandybridge.

Note: This is a candidate for the 9.1 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-08 16:09:26 -07:00
Dave Airlie
d0bf48f8e9 st/mesa: fix levels in initial texture creation
calim pointed out we were getting mipmap levels for array multisamples,
this didn't make sense. So then I noticed this function takes last_level
so we are passing in a too high value here.

I think this should fix the case he was seeing.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-08 23:56:06 +01:00
Ian Romanick
58d93e3247 glsl: Don't early-out for error-type inputs
Check the type of the array operand and the index operand before doing
other checks.  This simplifies the code a bit now (eliminating the
error_emitted parameter), and enables some later functional changes.

The shader

uniform float x[6];
uniform sampler2D s;
void main() { gl_Position.x = xx[s + 1]; }

still generates (only) the two expected errors:

0:3(33): error: `xx' undeclared
0:3(39): error: Operands to arithmetic operators must be numeric

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 15:17:05 -07:00
Ian Romanick
a131b87706 glsl: Don't emit spurious errors for constant indexes of the wrong type
Previously the shader

uniform float x[6];
void main() { gl_Position.x = x[1.0]; }

would have generated the errors

0:2(33): error: array index must be integer type
0:2(36): error: array index must be < 6

Now only

0:2(33): error: array index must be integer type

will be generated.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 15:17:05 -07:00
Ian Romanick
a70d2f05dc glsl: Collect all of the non-constant index error checks together
This puts all of the checks together for easier reading.  It also means
that all the checks are blocked on array->type->is_array.  Shortly this
will allow elimination of some is_error check work-arounds in this
function.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 15:17:05 -07:00
Ian Romanick
f9d8ca2817 glsl: Minor code compaction in _mesa_ast_array_index_to_hir
Also, document the reason for not checking for type->is_array in some of
the bound-checking cases.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 15:17:05 -07:00
Ian Romanick
2c333a878c glsl: Don't return a value from check_builtin_array_max_size
The last consumer of the return value was changed to not use it by the
previous commit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 15:17:05 -07:00
Ian Romanick
666fafc144 glsl: Remove some unnecessary uses of error_emitted
The error_emitted flag is used in semantic checking to prevent spurious
cascading errors.  For example,

void foo(sampler2D s, float a)
{
    float x = a + (1.2 + s);

    ...
}

should only generate a single error.  Without the error_emitted flag for
the first error, "a + ..." would also generate an error.

However, a bunch of cases in _mesa_ast_array_index_to_hir that were
setting error_emitted would mask legitimate errors.  For example,

    vec4 a[7];
    float b = a[3.14];

should generate two errors (float index and type mismatch in assignment).
The uses of error_emitted would cause only the first to be emitted.

This patch removes most of the places in _mesa_ast_array_index_to_hir
that would set the error_emitted flag.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 15:17:05 -07:00
Ian Romanick
46934adb8d glsl: Refactor handling of ast_array_index to a separate function
I love 800+ line switch-statements as much as the next guy... Future
commits will make changes to this part of the AST-to-HIR conversion, and
extracting this code will make that a bit easier.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 15:17:05 -07:00
Ian Romanick
cd39ae7394 glsl: Make check_builtin_array_max_size externally visible
A future commit will try to use this function in a different file.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 15:17:05 -07:00
Eric Anholt
ca9a7d975a intel: Avoid making tiled miptrees we won't be able to blit.
Doing so was breaking miptree mapping, which we really need to be able to
handle.  With this change, intel_miptree_map_direct() falls through to
doing a CPU mapping on the buffer like we need.

With the previous 2 patches, all of these should be fixed:
piglit max-texture-size (all 3 patches required!)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37871
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44958
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53494

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 11:49:33 -07:00
Eric Anholt
dfed115090 intel: Do temporary CPU maps of textures that are too big to GTT map.
This still fails, since 8192*4bpp == 32768, which is too big to use the
blitter on.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-08 11:49:25 -07:00
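
The arithmetic in the commit above (an 8192-pixel-wide row at 4 bytes per pixel gives a 32768-byte pitch) can be checked with a small sketch. The limit used here is an assumption inferred from the message, which only says that 32768 is too big; the constant and function names are hypothetical and do not come from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed largest row pitch, in bytes, that the blitter accepts. The
 * commit message only establishes that 32768 is too large. */
#define ASSUMED_BLIT_MAX_PITCH 32767u

/* Return true if a row of width_px pixels fits the assumed pitch limit. */
static bool
can_blit_row_pitch(uint32_t width_px, uint32_t bytes_per_pixel)
{
   uint32_t pitch = width_px * bytes_per_pixel;
   return pitch <= ASSUMED_BLIT_MAX_PITCH;
}
```

Under this assumption an 8192-wide RGBA8888 texture fails the check by a single byte, while a 4096-wide one (16384-byte pitch) passes.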
Eric Anholt
b3a3cb9611 intel: Add support for writing to our linear-temporary-CPU-map case.
This will be used for handling updates of large textures.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-04-08 11:49:20 -07:00
Kenneth Graunke
97e40a524e intel: Remove check for kernel 2.6.29.
Now that we require 2.6.39, there's no need to also check for 2.6.29.
Calling drm_intel_bufmgr_gem_enable_fenced_relocs() without checking
should be safe, as it simply sets a flag.

This does remove the check for zero fences available, but that doesn't
seem worth checking.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 11:03:08 -07:00
Kenneth Graunke
394edb5af5 intel: Require kernel 2.6.39 for relaxed relocation support.
Chris Wilson's relaxed relocation patch landed in March 2011.  Anyone
running pre-3.0 kernels probably isn't going to get the latest Mesa
anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 11:03:08 -07:00
Kenneth Graunke
d7fd5696e6 i965: Remove a few BRW_STATE_... enum values.
These were likely used for BRW_NEW_... dirty bit flags at one point, but
they're unused now.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 11:03:08 -07:00
Kenneth Graunke
79c27e7528 i965: Remove brw->vb.info and struct brw_vertex_info.
Nobody uses this value, so there's no need to set it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 11:03:08 -07:00
Kenneth Graunke
b29dc25572 i965: Remove the BRW_NEW_INPUT_DIMENSIONS flag.
When I removed the proj_attrib_mask optimization, I also removed the
last consumer of this bit without realizing it.

Since nobody uses it, there's no point in flagging it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-08 11:03:08 -07:00
Matt Turner
2e177bc8a5 register_allocate: Fix the type of best_benefit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-08 10:30:40 -07:00
Tom Stellard
a5a76782d5 radeon/llvm: Bump minimum LLVM version to 3.3 2013-04-08 07:43:34 -07:00
Niels Ole Salscheider
b336f51cc7 clover: Fix linkage of libOpenCL
Clover needs the irreader component of llvm

v2: Check for irreader component
irreader is only available with LLVM 3.3 >= 177971

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
2013-04-08 07:08:10 -07:00
Vincent Lejeune
5019af2145 r600g/llvm: Add support for native isa for pre EG
This fixes bug 62756 :
https://bugs.freedesktop.org/show_bug.cgi?id=62756#c12
2013-04-08 15:11:59 +02:00
Marek Olšák
eff66bc9f8 gallium/util: add const to a parameter of util_max_layer 2013-04-06 23:57:15 +02:00
Marek Olšák
08275b25cc st/mesa: don't expose ARB_color_buffer_float without driver support in GL core
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-06 23:57:12 +02:00
Marek Olšák
3264c3e997 mesa: allow drivers not to expose ARB_color_buffer_float in GL core profile
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-06 23:57:10 +02:00
Marek Olšák
9d4f67600b mesa: move updating clamp control derived state out of mesa_update_state_locked
It has 2 dependencies: glClampColor and the framebuffer, we might just as well
do the update where those two are changed.

v2: cosmetic changes from Brian's email

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-06 23:57:09 +02:00
Marek Olšák
755648c37f mesa: don't set _ClampFragmentColor to TRUE if it has no effect
This should reduce shader recompilations with drivers that emulate fragment
color clamping, because we want the clamping to be enabled only if there is
a signed normalized or floating-point colorbuffer.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-06 23:57:06 +02:00
Marek Olšák
21d407c1b8 mesa: refactor clamping controls, get rid of _ClampReadColor
v2: cosmetic changes from Brian's email

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-06 23:57:04 +02:00
Chris Forbes
c4629ad3f9 mesa: don't memcmp() off the end of a cache key.
Reported-by: `per` in #intel-gfx

The size of the cache key varies, so store the actual size as well as
the key blob itself, rather than just assuming it's the same as the size
passed in.

NOTE: This is a candidate for stable branches.

V2: Don't leave silly holes in structure; use unsigned instead of GLuint.
V3: Fix missing case for `last` match.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-04-06 18:30:08 +13:00
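
The fix described in the commit above amounts to remembering each key's length and comparing lengths before comparing bytes. A minimal sketch follows, assuming a hypothetical struct cache_item and helper cache_item_matches; the real cache structure in Mesa has more fields and different names.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical cache entry that stores the size of its key blob. */
struct cache_item {
   unsigned hash;
   unsigned keysize;   /* length of the blob that key points to */
   const void *key;
};

/* Compare a stored entry against a lookup key without reading past the
 * end of either blob. */
static bool
cache_item_matches(const struct cache_item *c, unsigned hash,
                   const void *key, unsigned keysize)
{
   /* The size test runs before memcmp(), so memcmp() is only reached
    * when both blobs are exactly keysize bytes long. */
   return c->hash == hash &&
          c->keysize == keysize &&
          memcmp(c->key, key, keysize) == 0;
}
```

Without the stored size, a lookup with a longer key than the stored one would make memcmp() read beyond the stored blob, which is the overrun the commit title refers to.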
Tom Stellard
302f53dc20 radeonsi: Add compute support v3
v2:
  - Only dump shaders when env variable is set.

v3:
  - Don't emit VGT registers

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-05 18:43:34 -04:00
Tom Stellard
4f7fe2cf2c radeonsi: Set TCL1_ACTION_ENA when invalidating the texture cache
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-05 18:43:34 -04:00
Tom Stellard
0ccf82c557 radeonsi: Remove si_pm4_inval_vertex_cache()
This function is a holdover from r600g and is identical to
si_pm4_inval_texture_cache(), so it is not needed.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-05 18:43:34 -04:00
Tom Stellard
c5e5b3401c gallium: PIPE_COMPUTE_CAP_IR_TARGET - allow drivers to specify a processor v2
This target string now contains four values instead of three.  The old
processor field (which was really being interpreted as arch) has been split
into two fields: processor and arch.  This allows drivers to pass a
more detailed description of the hardware to compiler frontends.

v2:
  - Adapt to libclc changes

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2013-04-05 18:43:34 -04:00
Wladimir
1a868acbec util: add ETC as compressed format
Add UTIL_FORMAT_LAYOUT_ETC to util_format_is_compressed. It was missing.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-05 16:14:51 -06:00
Brian Paul
de99b6d117 gallium/u_blitter: fix is_blit_generic_supported() stencil checking
Don't check if there's sampler support for stencil if we're not
going to actually blit/copy stencil values.  Fixes the case where
we mistakenly said we can't support a blit of depth values from
S8Z24 to X8Z24.

Also, rename the is_stencil variable to dst_has_stencil to improve
readability.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-05 16:14:51 -06:00
Alexander Monakov
9cda356004 Honor GLX_DONT_CARE in MATCH_MASK
NOTE: This is a candidate for stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47478
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62999
Bugzilla: http://bugs.winehq.org/show_bug.cgi?id=26763
2013-04-05 14:32:45 -07:00
Rob Clark
aac7f06ad8 freedreno: use autogenerated register defs
Switch to use the envytools generated headers for register/bitfield
definitions.  This is the first step in preparing to add a3xx support,
since it avoids having conflicting names for a3xx and a2xx registers.
And since I'm using envytools for a3xx it is simpler to just use it for
everything.

This shouldn't cause any functional change, it is really just a lot of
renaming.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-04-05 14:33:16 -04:00
José Fonseca
1fefc65d20 st/wgl: Install our windows message hook to threads created before the ICD is loaded.
Otherwise we will not receive destroy windows events, causing framebuffers
to leak.

This happens particularly with java and jogl.

Tested with java + jogl, MATLAB.

VMware Internal Bug Number: 1013086.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-05 18:27:54 +01:00
Adam Jackson
ca70de9bd2 llvmpipe: Work without sse2 if llvm is new enough
At least on llvm 3.2 this appears to work fine.  Tested on an Athlon XP
2600+, which has sse and 3dnow but not sse2.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-04-05 11:32:53 -04:00
Jerome Glisse
b8998f976e winsys/radeon: add command stream replay dump for faulty lockup v3
Build time option, set RADEON_CS_DUMP_ON_LOCKUP to 1 in radeon_drm_cs.h to
enable it.

When enabled, after each cs submission the code tries to detect a lockup by
waiting for one of the buffers of the cs to become idle.  After a timeout it
considers that the cs triggered a lockup and writes a radeon_lockup.c file
in the current directory that has all the information needed to replay the cs.

To build this file:
gcc -O0 -g radeon_lockup.c -ldrm -o radeon_lockup -I/usr/include/libdrm

v2: Add radeon_ctx.h file to mesa git tree
v3: Slightly improve dumped file for easier editing, only dump first faulty cs

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-04-05 10:22:05 -04:00
Brian Paul
5192262833 st/xlib: add HUD support for xlib/GLX
For the softpipe and llvmpipe drivers.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-04 17:00:42 -06:00
Brian Paul
f5071783c1 gallium/hud: add GALLIUM_HUD_PERIOD env var
To set the graph update rate, in seconds.  The default update rate
has also been changed to 1/2 second.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-04 17:00:42 -06:00
Brian Paul
6211c45186 gallium/hud: initialize sampler state
The default wrap mode (PIPE_TEX_WRAP_REPEAT) is incompatible with
unnormalized texcoords (at least for softpipe).

v2: use PIPE_TEX_WRAP_CLAMP_TO_EDGE

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-04 17:00:42 -06:00
Kenneth Graunke
edc52a8f28 glsl: Add an optimization pass to flatten simple nested if blocks.
GLBenchmark 2.7's shaders contain conditional blocks like:

if (x) {
    if (y) {
        ...
    }
}

where the outer conditional's then clause contains exactly one statement
(the nested if) and there are no else clauses.  This can easily be
optimized into:

if (x && y) {
    ...
}

This saves a few instructions in GLBenchmark 2.7:

    total instructions in shared programs: 11833 -> 11649 (-1.55%)
    instructions in affected programs:     8234 -> 8050 (-2.23%)

It also helps CS:GO slightly (-0.05%/-0.22%).  More importantly,
however, it simplifies the control flow graph, which could enable other
optimizations.
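
The transformation above can be sketched on a toy AST.  This is an
illustrative Python model only, not the actual pass (which works on GLSL
IR); the If class and flatten_nested_if() are hypothetical names, and
conditions are plain strings for readability:

```python
from dataclasses import dataclass, field


@dataclass
class If:
    """Toy 'if' node: a condition string, a then-list and an else-list."""
    cond: str
    then: list = field(default_factory=list)
    els: list = field(default_factory=list)


def flatten_nested_if(stmt):
    """Rewrite `if (a) { if (b) { body } }` as `if (a && b) { body }`.

    Applies only when the outer then-clause holds exactly one statement
    (the nested if) and neither if has an else clause.
    """
    if not isinstance(stmt, If):
        return stmt
    # Handle children first so deeper nests collapse bottom-up.
    stmt.then = [flatten_nested_if(s) for s in stmt.then]
    stmt.els = [flatten_nested_if(s) for s in stmt.els]
    if (not stmt.els and len(stmt.then) == 1
            and isinstance(stmt.then[0], If) and not stmt.then[0].els):
        inner = stmt.then[0]
        return If(f"({stmt.cond}) && ({inner.cond})", inner.then)
    return stmt
```

An outer if with an else clause, or with more than one statement in its
then-clause, is returned unchanged.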

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-04 15:38:19 -07:00
Kenneth Graunke
967514ce68 i965: Use a variable for the push constant size in kB.
This clarifies that the offset of 2 is actually 16 kB / 8kB units.
It also keys both computations off of a single variable, which should
make it easier to change in the future.
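
As a small worked example of the arithmetic (Python used purely for
illustration; the constant and function names are made up and do not
match the driver's):

```python
PUSH_CONSTANT_KB = 16  # hypothetical name for the single size variable
UNIT_KB = 8            # the offset is expressed in 8 kB units


def push_constant_offset_units(push_constant_kb=PUSH_CONSTANT_KB):
    """Convert the push constant size in kB into an offset in 8 kB units."""
    if push_constant_kb % UNIT_KB != 0:
        raise ValueError("size must be a multiple of the 8 kB unit")
    return push_constant_kb // UNIT_KB
```

With the 16 kB size this yields the previously hard-coded 2.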

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-04-04 15:38:19 -07:00
Kenneth Graunke
8cdb2d32ec i965: Turn brw->urb.vs_size and gs_size into local variables.
These variables are only used within a single function, so we may as
well make them local variables.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-04-04 15:38:19 -07:00
Kenneth Graunke
b99ad7f02c i965: Remove BRW_NEW_WM_INPUT_DIMENSIONS dirty bit.
This was only produced by the brw_wm_input_dimensions atom, which was
removed in the previous commit.  So there's no need for the dirty bit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-04 15:38:19 -07:00
Kenneth Graunke
d198546bac i965: Delete brw_vs_constval.c and the brw_wm_input_sizes atom.
This was only used to compute proj_attrib_mask, which was removed by the
previous commit.  That makes this dead code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-04 15:38:19 -07:00
Kenneth Graunke
705c8247fa i965: Remove now dead brw_wm_prog_key::proj_attrib_mask field.
The previous commit removed the last user of this field, so there's no
longer any point in setting it.  Removing this should eliminate
state-dependent recompiles, and make the precompile more reliable.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-04 15:38:19 -07:00
Kenneth Graunke
7183568869 i965: Remove fixed-function texture projection avoidance optimization.
This optimization attempts to avoid extra attribute interpolation
instructions for texture coordinates where the W-component is 1.0.

Unfortunately, it requires a lot of complexity: the brw_wm_input_sizes
state atom (all the brw_vs_constval.c code) needs to run on each draw.
It computes the input_size_masks array, then uses that to compute
proj_attrib_mask.  Differences in proj_attrib_mask can cause
state-dependent fragment shader recompiles.  We also often fail to guess
proj_attrib_mask for the fragment shader precompile, causing us to
needlessly compile it twice.

Furthermore, this optimization only applies to fixed-function programs;
it does not help modern GLSL-based programs at all.  Generally, older
fixed-function programs run fine on modern hardware anyway.

The optimization has existed in some form since the initial commit.  When
we rewrote the fragment shader backend, we dropped it for a while.  Eric
readded it in commit eb30820f26 as part of
an attempt to cure a ~1% performance regression caused by converting the
fixed-function fragment shader generation code from Mesa IR to GLSL IR.
However, no performance data was included in the commit message, so it's
unclear whether or not it was successful.

Time has passed, so I decided to re-measure this.  Surprisingly,
Eric's OpenArena timedemo actually runs /faster/ after removing this and
the brw_wm_input_sizes atom.  On Ivybridge at 1024x768, I measured a
1.39532% +/- 0.91833% increase in FPS (n = 55).  On Ironlake, there was
no statistically significant difference (n = 37).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-04 15:38:19 -07:00
Kenneth Graunke
32726b1af6 i965: Use ctx->Stencil._WriteEnabled in DEPTH_STENCIL_STATE.
This is the same computation as the _WriteEnabled flag, so we may as
well use it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-04-04 15:38:19 -07:00
Kenneth Graunke
01bd29d681 i965: Fix stencil write enable flag in 3DSTATE_DEPTH_BUFFER on Gen7+.
ctx->Stencil.WriteMask is a statically sized array of 3 elements.
Checking it against 0 actually is a NULL check, and can never fail,
which meant that we always said stencil writes were enabled.

Use the new core Mesa derived state flag to fix this.

NOTE: This is a candidate for stable branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-04-04 15:38:18 -07:00
Kenneth Graunke
1e3235d36e mesa: Add new ctx->Stencil._WriteEnabled derived state flag.
i965 needs to know whether stencil writes are enabled in several places,
and gets the test wrong sometimes.  While we could create a function to
compute this, it seems generally useful enough to warrant a new piece of
derived state.  Also, all the plumbing is already in place.
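
A rough sketch of what such a derived flag computes, in illustrative
Python.  The exact rule is an assumption here (stencil test enabled and a
non-zero write mask on an active face), and the parameter names are
hypothetical:

```python
def stencil_write_enabled(enabled, write_mask, two_sided=False, back_face=1):
    """Derive a 'stencil writes enabled' flag from the raw stencil state.

    write_mask is a per-face sequence of masks.  The values must be
    inspected; testing the array object itself (as a C `WriteMask != 0`
    check on an array does) is always true and says nothing.
    """
    front = write_mask[0] != 0
    back = two_sided and write_mask[back_face] != 0
    return bool(enabled and (front or back))
```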

NOTE: This is a candidate for stable branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-04-04 15:38:18 -07:00
Roland Scheidegger
9eef86bb55 gallivm: some minor cube map cleanup
The ar_ge_as_at variable was just very very confusing since the condition
was actually the other way around (as_at_ge_ar). So change the condition
(and the selects depending on it) to match the variable name.
And also change the chosen major axis in case the coord values are the
same. OpenGL doesn't care one bit which one is chosen in this case but
it looks like dx10 would require z chosen over y, and y chosen over x
(previously did x chosen over y, y chosen over z). Since it's all the
same effort just honor dx10's wishes. (Though actually, for some preferred
orderings, we could save one (or two with derivatives) selects since the
tnewx and tnewz (and the corresponding dmax values) are the same.)
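
The new tie-breaking order can be sketched as follows.  This is scalar
illustrative Python, not the vectorized gallivm code, and major_axis() is
a hypothetical name:

```python
def major_axis(x, y, z):
    """Pick the cube map major axis, preferring z over y over x on ties."""
    ax, ay, az = abs(x), abs(y), abs(z)
    if az >= ax and az >= ay:
        return "z"
    if ay >= ax:
        return "y"
    return "x"
```

Using >= rather than > in both comparisons is what gives z and then y
priority when magnitudes are equal.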

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-04 23:22:10 +02:00
Eric Anholt
b6e9b54d06 i965: Ask the register allocator to round-robin through registers.
The way we were allocating registers before, packing into low register
numbers for Ironlake, resulted in an overly-constrained dependency graph
for instruction scheduling.  Improves GLBenchmark 2.1 performance by
4.5% +/- 0.7% (n=26).  No difference on my old GLSL demo (n=20).  No
difference on nexuiz (n=15).
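
To illustrate why packing constrains scheduling, here is a toy allocator
in Python.  It is a sketch under simplifying assumptions (values given as
(start, end) live ranges sorted by start, no spilling), not the
graph-coloring allocator Mesa uses; allocate() is a hypothetical name:

```python
def allocate(num_regs, lifetimes, round_robin):
    """Assign a register to each (start, end) live range.

    Packed mode always searches from register 0; round-robin mode starts
    the search just after the most recently assigned register.
    """
    busy_until = [-1] * num_regs  # a register is free once busy_until < start
    next_start = 0
    assigned = []
    for start, end in lifetimes:
        first = next_start if round_robin else 0
        for k in range(num_regs):
            reg = (first + k) % num_regs
            if busy_until[reg] < start:
                break
        else:
            raise RuntimeError("out of registers (a real allocator would spill)")
        busy_until[reg] = end
        assigned.append(reg)
        if round_robin:
            next_start = (reg + 1) % num_regs
    return assigned
```

For three short, non-overlapping values, packed allocation reuses
register 0 every time, so the scheduler sees false dependencies between
otherwise independent instructions; round-robin spreads them over
registers 0, 1 and 2.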

v2: Fix off-by-one bug that made the change only work for 16-wide on i965.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-04 12:51:06 -07:00
Zack Rusin
be9a42e980 llvmpipe: implement ucmp
and add a test for it

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-04 12:09:55 -07:00
Paul Berry
5db2249493 Avoid spurious GCC warnings in STATIC_ASSERT() macro.
GCC 4.8 now warns about typedefs that are local to a scope and not
used anywhere within that scope.  This produced spurious warnings with
the STATIC_ASSERT() macro (which used a typedef to provoke a compile
error in the event of an assertion failure).

This patch switches to a simpler technique that avoids the warning.

v2: Avoid GCC-specific syntax.  Also update p_compiler.h.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-04 09:52:18 -07:00
Erik Faye-Lund
456f40e18d freedreno: document debug flag
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
2013-04-04 10:41:50 -06:00
Brian Paul
e95514c0ea st/wgl: add HUD support
v2: fix a few minor issues spotted by Jose.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-04 10:41:35 -06:00
Brian Paul
0c1dcf906d st/wgl: make stw_current_context() non-static
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-04 08:50:16 -06:00
Brian Paul
92e5e45ff1 util: add debug_memory_check_block(), debug_memory_tag()
The former just checks that the given block is valid by checking
the header and footer.

The latter sets the memory block's tag.  With extra debug code, we
can use that for monitoring/checking particular allocations.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-04 08:50:15 -06:00
Brian Paul
a408ea9692 gallium/hud: replace malloc w/ MALLOC
To match the FREE() call used later.  Fixes things on Windows.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-04 08:50:15 -06:00
Vincent Lejeune
9276961223 r600g/llvm: Workaround for wrong tex.offset_* 2013-04-04 16:03:04 +02:00
Roland Scheidegger
ce5096a0a9 gallivm: honor explicit derivatives values for cube maps.
This is trivial now, though need to make sure we pass all the necessary
derivative values (which is 3 each for ddx/ddy not 2).
Passes piglit arb_shader_texture_lod-texgradcube test.

v2: add the forgotten abs() for all incoming derivatives (discovered
by new piglit arb_shader_texture_lod-texgradcube test, though more by
luck as it was failing only for exactly one pixel...).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-04 01:03:42 +02:00
Roland Scheidegger
f621015cb5 gallivm: do per-pixel cube face selection (finally!!!)
This proved to be tricky, the problem is that after selection/mirroring
we cannot calculate reasonable derivatives (if not all pixels in a quad
end up on the same face the derivatives could get "randomly" exceedingly
large).
However, it is actually quite easy to simply calculate the derivatives
before selection/mirroring and then transform them similar to
the cube coordinates (they only need selection/projection, but not
mirroring as we're not interested in the sign bit, of course). While
there is a tiny bit more work to do (need to calculate derivs for 3
coords instead of 2, and additional selects) it also simplifies things
somewhat for the coord selection itself (as we save some broadcast aos
shuffles, and we don't need to calculate the average vector) - hence if
derivatives aren't needed this should actually be faster.
Also, this has the benefit that this will (trivially) work for explicit
derivatives too, which we completely ignored before that (will be in a
separate commit for better trackability).
Note that while the way for getting rho looks very different, it should
result in "nearly" the same values as before (the "nearly" is only because
before the code would choose the face based on an "average" vector and hence
the derivatives calculated according to this face, where now (for implicit
derivatives) the derivatives are projected on the face selected for the
first (top-left) pixel in a quad, so not necessarily the same face).
The transformation done might not quite be state-of-the-art, calculating
length(dx,dy) as max(dx,dy) certainly isn't either but this stays the
same as before (that is I think a better transform would _somehow_ take
the "derivative major axis" into account so that derivative changes in
the major axis wouldn't get ignored).
Should solve some accuracy problems with cubemaps (can easily be seen with
the cubemap demo when switching wrapping/filtering), though we still don't
do seamless filtering to fix it completely (so not per-sample but per-pixel
is certainly better than per-quad and already sufficient for accurate
results with nearest tex filter).

As for performance, it seems to be a tiny bit faster too (maybe 3% or so
with cubemap demo). Which I'd have expected with nearest/nearest filtering
where this will be less instructions, but the difference seems to actually
be larger with linear/linear_mipmap_linear where it is slightly more
instructions, probably the code appears less serialized allowing better
scheduling (on a sandy bridge cpu). It actually seems to be now at least
as fast as the old path using a conditional when using 128bit vectors too
(that is probably more a result of testing with a newer cpu though), for now
that old path is still there but unused.
No piglit regressions.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-04 01:03:42 +02:00
Roland Scheidegger
bdfbeb9633 gallivm: minor rho calculation optimization for 1 or 3 coords
Using a different packing for the single coord case should save a shuffle.
Plus some minor style fixes.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-04 01:03:42 +02:00
Roland Scheidegger
067a0ae420 gallivm: use f16c hw support for float->half and half->float conversion
Should be way faster of course on cpus supporting this (includes AMD
Bulldozer and Jaguar cores, Intel Ivy Bridge and up (except budget models)).
Passes piglit fbo-blending-formats GL_ARB_texture_float -auto on Ivy Bridge.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-04 01:03:42 +02:00
Zack Rusin
302df7cc85 draw/llvmpipe: allow independent so attachments to the vs
When geometry shaders are present, one needs to be able to create
an empty geometry shader with stream output that needs to be
resolved later and attached to the currently bound vertex shader.
Let's add support for it to llvmpipe and draw. draw allows attaching
independent stream output info to any vertex shader and llvmpipe
resolves at draw time which vertex shader the given empty geometry
shader should be linked to.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
246e68735f llvmpipe: reset so buffers when not appending
We need to reset the internal state of the so buffers or we'll
keep appending even though we're not supposed to.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
7ca65a68e1 draw: remove unused function
we use draw_set_mapped_so_targets nowadays

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
b16ae0f792 draw/llvm: use an enum instead of magic numbers
I think this was there before and got accidentally
removed during a merge. Same code as for the GS
context, which is also using an enum instead of
hardcoded numbers.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
49b7d933f8 draw/gs: cleanup some debugging code
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
822c21c776 draw/so: maintain an exact number of written vertices
It's quite helpful during the rendering when we know
exactly the count of the vertices available in the
buffer.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
d8543bd752 draw: Implement support for primitive id
We were largely ignoring primitive id.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
f6bfb62c50 draw/so: Fix bogus assert
We do support so with multiple primitives.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
e6fc635351 draw/gs: Fix memory corruption with multiple primitives
We were flushing with incorrect number of primitives. TGSI exec
can only work with a single primitive at a time. Plus the fetching
with multiple primitives on llvm paths wasn't copying the last
element.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
f313b0c850 gallivm: cleanup the gs interface
Instead of void pointers use a base interface.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Brian Paul
ac114c6824 svga: add new memory-used HUD query
To track the amount of memory used by all pipe_resources (textures
and buffers).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 11:02:47 -06:00
Brian Paul
a69efa9482 util: add new util_resource_size() function in u_resource.[ch]
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 11:02:47 -06:00
Brian Paul
a3cccdec90 util: move functions from u_resource.c to u_transfer.c
The functions are prototyped in u_transfer.h and are related to the
other functions in u_transfer.c.

The next patch will re-use the u_resource.c file for new code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 11:02:47 -06:00
Vincent Lejeune
159d934066 r600g/llvm: Do not override llvm provided stack_size 2013-04-03 18:39:49 +02:00
Vincent Lejeune
097a6ecdfe r600g/llvm: Do not change cf_alu inst when adding alus 2013-04-03 18:22:40 +02:00
Marek Olšák
ff01e0db0e radeonsi: add more cases for copying unsupported formats to resource_copy_region
Ported from r600g commit:

8891b2f9c9

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

NOTE: This is a candidate for the 9.1 branch.
2013-04-03 10:58:33 -04:00
Brian Paul
3838edaf5d svga: add HUD queries for number of draw calls, number of fallbacks
The fallbacks count is the number of drawing calls that use a "draw"
module fallback, such as polygon stipple.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 09:56:08 -06:00
Brian Paul
49ed1f3cb3 svga: refactor occlusion query code
This is in preparation for adding new query types for the HUD.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 09:56:07 -06:00
Brian Paul
a9ae7e9c28 gallium/hud: try L8 texture for font if I8 format isn't supported 2013-04-03 09:44:57 -06:00
Brian Paul
0289ebaa0f svga: add case for PIPE_CAP_QUERY_PIPELINE_STATISTICS 2013-04-03 08:19:44 -06:00
Brian Paul
7e28debb6f st/mesa: rewrite comment in st_manager.c 2013-04-03 08:16:36 -06:00
Christoph Bumiller
80eef069f0 nv50,nvc0: remove MS resolve formats hack
Mesa now allows BlitFramebuffer resolve between RGBA and BGRA.
2013-04-03 13:19:15 +02:00
Christoph Bumiller
4de70bf43c nvc0: fix 128 bit compressed storage type selection 2013-04-03 12:54:44 +02:00
Christoph Bumiller
8e1dd58a7e nvc0: place staging textures in GART and map them directly 2013-04-03 12:54:44 +02:00
Christoph Bumiller
ba9b0b682f nv50: account for pesky prefetch in size calculation of linear textures 2013-04-03 12:54:44 +02:00
Christoph Bumiller
f0a0d59f0f nvc0: honour scaled coordinates setting for linear textures 2013-04-03 12:54:44 +02:00
Christoph Bumiller
d801545964 nvc0: fix for 2d engine R source formats writing RRR1 and not R001 2013-04-03 12:54:43 +02:00
Christoph Bumiller
6417d56c19 nv50,nvc0: disable DEPTH_RANGE_NEAR/FAR clipping during blit
We send position.z == 0, DEPTH_RANGE may be some arbitrary range
not including 0 (for example in piglit's hiz tests).
2013-04-03 12:54:43 +02:00
Christoph Bumiller
e45c969fe5 st/mesa: fix bitmap,drawpix,drawtex for PIPE_CAP_TGSI_TEXCOORD
NOTE: Changed the semantic index for the drawtex coordinate to
be the texture unit index instead of always 0.
Not sure if this is correct but since the value seems to depend
on the unit it would make sense to use different varying slots.
2013-04-03 12:54:43 +02:00
Christoph Bumiller
2a8145d36b nouveau: accelerate buffer copies in resource_copy_region 2013-04-03 12:54:43 +02:00
Christoph Bumiller
3ed4bbd769 nvc0: demagic some of the NVE4_COMPUTE_UPLOAD methods
It's actually the same as P2MF.
2013-04-03 12:54:43 +02:00
Christoph Bumiller
fb0334adb3 nvc0: read PM counters for each warp scheduler separately 2013-04-03 12:54:43 +02:00
Christoph Bumiller
7bac075f25 nvc0: add some metrics to driver specific queries 2013-04-03 12:54:43 +02:00
Christoph Bumiller
198f514aa6 nvc0: add some driver statistics queries 2013-04-03 12:54:43 +02:00
Christoph Bumiller
7628cc247f nvc0: disable compressed storage type 0xdb for now
Single-sample color compression doesn't seem that useful anyway.
2013-04-03 12:54:43 +02:00
Christoph Bumiller
ea12fc3f6c nvc0: use correct hw query for PRIMITIVES_GENERATED
It was the same as SO_STATISTICS[1] before.
2013-04-03 12:54:43 +02:00
Christoph Bumiller
6bca4e7085 nvc0: use fence to check state of queries that don't write sequence
This still isn't optimal, since the fence will signal a bit late,
but better than checking on the bo, which may never be ready if it
is shared (which is likely).
2013-04-03 12:54:43 +02:00
Christoph Bumiller
3d2790cead gallium/hud: add support for PIPE_QUERY_PIPELINE_STATISTICS
Also, renamed "pixels-rendered" to "samples-passed" because the
occlusion counter increments even if colour and depth writes are
disabled, or (on some implementations) for killed fragments that
passed the depth test when PS early_fragment_tests is set.
2013-04-03 12:54:43 +02:00
Christoph Bumiller
c620aad71c gallium/docs: fix definition of PIPE_QUERY_SO_STATISTICS
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-03 12:54:43 +02:00
Christoph Bumiller
f35e96d973 gallium: add PIPE_CAP_QUERY_PIPELINE_STATISTICS
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-03 12:54:43 +02:00
Paul Berry
41e4bccc75 i965: Reduce code duplication in handling of depth, stencil, and HiZ.
This patch consolidates duplicate code in the brw_depthbuffer and
gen7_depthbuffer state atoms.  Previously, these state atoms contained
5 chunks of code for emitting the _3DSTATE_DEPTH_BUFFER packet (3 for
Gen4-6 and 2 for Gen7).  Also a lot of logic for determining the
appropriate buffer setup was duplicated between the Gen4-6 and Gen7
functions.

This refactor splits the code into three separate functions:
brw_emit_depthbuffer(), which determines the appropriate buffer setup
in a mostly generation-independent way, brw_emit_depth_stencil_hiz(),
which emits the appropriate state packets for Gen4-6, and
gen7_emit_depth_stencil_hiz(), which emits the appropriate state
packets for Gen7.

Tested using Piglit on Gen5-7 (no regressions).

v2: Re-word some comments.  Fix an assertion that incorrectly
prohibited packed depth/stencil formats on Gen6 (these are allowed
provided that HiZ is disabled).

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-02 15:19:13 -07:00
Paul Berry
2ad0ed6349 Revert "glsl: Replace constant-index vector array accesses with swizzles"
This reverts commit dbf94d105a, which
was working around a bug in the handling of array indexing when
constant folding built-in functions.  Now that the constant folding
bug has been fixed, the workaround is no longer needed.
2013-04-02 12:24:16 -07:00
Paul Berry
7d4f1e6467 glsl: Fix array indexing when constant folding built-in functions.
Mesa constant-folds built-in functions by using a miniature GLSL
interpreter (see
ir_function_signature::constant_expression_evaluate_expression_list()).
This interpreter had a bug in its handling of array indexing, which
caused expressions like "m[i][j]" (where m is a matrix) to be handled
incorrectly.  Specifically, it incorrectly treated j as indexing into
the whole matrix (rather than indexing just into the vector m[i]); as
a result the offset computed for m[i] was lost and m[i][j] was treated
as m[j][0].
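
The offset computation can be sketched like this.  It is illustrative
Python, not the interpreter's actual code; it assumes the matrix is
stored as a flat column-major array (m[i] is column i), and both function
names are hypothetical:

```python
def component_offset_fixed(i, j, rows):
    """Flat offset of m[i][j]: column i starts at i * rows, then add j."""
    return i * rows + j


def component_offset_buggy(i, j, rows):
    """Old behaviour: the offset for m[i] is dropped and j indexes the
    whole matrix, which yields m[j][0]."""
    return j * rows + 0
```

For a mat3 (rows = 3), m[1][2] should read flat element 5, while the
buggy version reads element 6, which is m[2][0].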

Fixes piglit tests inverse-mat[234].{vert,frag}.

NOTE: This is a candidate for the 9.1 and 9.0 branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57436
2013-04-02 12:24:08 -07:00
Roland Scheidegger
450950c57a gallivm: bring back optimized but incorrect float to smallfloat optimizations
Conceptually the same as previously done in float_to_half.
Should cut down number of instructions from 14 to 10 or so, but
will promote some NaNs to Infs, so it's disabled.
It gets a bit tricky though handling all the cases correctly...
Passes basic tests either way (though there are no tests testing special
cases, but some manual tests injecting them seemed promising).

v2: style and comment fixes suggested by Jose

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-02 18:24:31 +02:00
Roland Scheidegger
3febc4a1cd gallivm: consolidate code for float-to-half and float-to-packed conversion.
This replaces the existing float-to-half implementation.
There are definitely a couple of differences - the old implementation
had unspecified(?) rounding behavior, and could at least in theory
construct Inf values out of NaNs. NaNs and Infs should now always be
properly propagated, and rounding behavior is now towards zero
(note this means too large but non-Infinity values get propagated to max
representable value, not Infinity).
The implementation will definitely not match util code, however (which
does nearest rounding, which also means too large values will get
propagated to Infinity).
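
The difference between the two rounding behaviours can be demonstrated
with a scalar sketch.  This is illustrative Python, not the gallivm
implementation; the function names are hypothetical, float_to_half_trunc()
flushes values below the half normal range to zero for brevity, and
Python's struct "e" format stands in for a round-to-nearest conversion:

```python
import math
import struct


def float_to_half_trunc(x):
    """Convert to half-float bits, rounding toward zero."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = (bits >> 16) & 0x8000
    exp = (bits >> 23) & 0xFF
    mant = bits & 0x7FFFFF
    if exp == 0xFF:
        # Inf stays Inf, NaN stays NaN (quiet bit set).
        return sign | 0x7C00 | (0x200 if mant else 0)
    half_exp = exp - 127 + 15
    if half_exp >= 31:
        # Too large but finite: clamp to the max representable value.
        return sign | 0x7BFF
    if half_exp <= 0:
        # Simplification for this sketch: no denormal handling.
        return sign
    # Dropping the low 13 mantissa bits truncates toward zero.
    return sign | (half_exp << 10) | (mant >> 13)


def half_bits_to_float(h):
    """Decode 16 half-float bits into a Python float."""
    return struct.unpack("<e", struct.pack("<H", h))[0]


def float_to_half_nearest(x):
    """Round-to-nearest conversion; overflow becomes Infinity."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)
```

For example, 70000.0 becomes 65504.0 (the largest finite half) when
rounding toward zero, but Infinity under nearest rounding.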

Also fix a bogus round mask probably leading to rounding bugs...
v2: fix a logic bug in handling infs/nans.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-02 18:24:31 +02:00
Vadim Girlin
9be624b3ef r600g: don't reserve more stack space than required v5
A reduced stack size allows more threads to run in some cases,
improving performance for the shaders that use stack (that is, for the
shaders with control flow instructions). E.g. with unigine-based apps.

v4: implement exact computation taking into account wavefront size
v5: add cases for RV620, RS880

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-02 19:34:14 +04:00
Vadim Girlin
7e04227f39 r600g: fix range handling for tgsi input declarations v2
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-02 19:34:14 +04:00
Marek Olšák
f8502b7e71 gallium/hud: do .xxxx swizzling for the font texture in the fragment shader
This allows using L8 and R8 for the font if I8 isn't supported.

Tested-by: Brian Paul <brianp@vmware.com>
2013-04-02 16:57:57 +02:00
Brian Paul
98b64cc20f hud: flush/unmap the vertex buffer before drawing
The VMware svga driver is picky about making sure the VBO is unmapped
before drawing.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-02 08:17:28 -06:00
Brian Paul
bdd3770b78 draw: use pipe_transfer_unmap() to match pipe_transfer_map() 2013-04-02 08:17:28 -06:00
Roland Scheidegger
9b329f4c09 gallivm: fix signed small float to float conversion
Introduced by 5f41e08cf3,
just a silly typo.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=62921.
2013-04-02 13:21:07 +02:00
Christian König
a0dca4409a radeonsi: add instance divisor support v3
v2: reduce key size, don't copy key around too much.
v3: remove key size reduction

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-02 13:01:43 +02:00
Christian König
cf9b31f78a radeonsi: add start instance support
This works differently than on R600; we need to add the start instance manually.
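
What adding the start instance manually amounts to can be sketched as
below.  This is illustrative Python assuming the usual GL base-instance
rule (element = floor(instance / divisor) + base instance);
instanced_fetch_index() is a hypothetical name, not a driver function:

```python
def instanced_fetch_index(instance_id, start_instance, divisor):
    """Index of the per-instance attribute element to fetch."""
    if divisor == 0:
        raise ValueError("divisor 0 means per-vertex, not per-instance, fetching")
    return start_instance + instance_id // divisor
```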

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-02 13:01:43 +02:00
Christian König
e4ed58763a radeonsi: add instanceid support
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-02 13:01:43 +02:00
Christian König
83df955ca9 radeon/llvm: move system value fetching to common code
This should be used by both SI and R600.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-02 13:01:42 +02:00
Michel Dänzer
c6efb4870b radeonsi: Handle arbitrary 2-byte formats in resource_copy_region
Fixes mplayer -vo vdpau OSD.

NOTE: This is a candidate for the 9.1 branch.

Reported-by: Igor Vagulin <igor.vagulin@gmail.com>

Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Christian König <christian.koenig@amd.com>
2013-04-02 11:42:35 +02:00
Maarten Lankhorst
6d20c646d6 nvc0: Fix fd leak in nvc0_create_decoder
NOTE: This is a candidate for the 9.0 and 9.1 branches.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-04-02 10:25:26 +02:00
Aras Pranckevicius
b2eee0869f GLSL: fix lower_jumps to report progress properly
A fix for lower_jumps progress reporting, very much like the similar fix in
c1e591eed.

NOTE: This is a candidate for stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-04-01 16:57:17 -07:00
Eric Anholt
62501c3af8 i965/fs: Allow CSE on pre-gen7 varying-index uniform loads
All the other expression types allowed here have inst->mlen == 0, and this
one has implied MRF writes for all of its payload, so nothing else in the
implementation should need to change.

Reduces SEND messages for loading from pull constants in kwin's Lanczos
shader from 16 to 6.  (Due to a deficiency in constant propagation, I
can't use the hack I did in the previous commit to test the performance
change)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61554
NOTE: This is a candidate for the 9.1 branch.
2013-04-01 16:17:26 -07:00
Eric Anholt
70b27e0e4b i965/fs: Use LD messages for pre-gen7 varying-index uniform loads
This comes at a minor performance cost at the moment (-3.2% +/- 0.2%, n=14 on
my GM45 forced to load all uniforms through the varying-index path), but we
get a whole vec4 at a time to reuse in the next commit.

v2: Fix comment about channels in the other message.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.1 branch.
2013-04-01 16:17:26 -07:00
Eric Anholt
ce316f62ef i965/fs: Don't double-emit SEND dependency workarounds at control flow.
We weren't setting needs_dep[i] in the loops, so we'd continue on to
potentially add the same workaround MOVs to the later basic block
boundaries, too.  We can either set needs_dep[i] to exit through the
normal path, or we can just return since we know we're done.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 16:17:26 -07:00
Eric Anholt
3cf69b2284 i965/fs: Bake regs_written into the IR instead of recomputing it later.
For sampler messages, it depends on the target gen, and on gen4
SIMD16-sampler-on-SIMD8-execution we were returning 4 instead of 8 like we
should.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.1 branch.
2013-04-01 16:17:26 -07:00
Eric Anholt
8edc7cbe64 i965/fs: Clean up the setup of gen4 simd16 message destinations.
I think this makes it much more obvious what's going on here.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 16:17:26 -07:00
Eric Anholt
9f43b84928 i965/fs: Do CSE on gen7's varying-index pull constant loads.
This is our first CSE on a regs_written() > 1 instruction, so it takes a
bit of extra fixup.  Reduces the number of loads on kwin's Lanczos shader
from 12 to 2.

v2: Fix compiler warning (false positive on possibly-uninitialized variable)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61554
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
NOTE: This is a candidate for the 9.1 branch.
2013-04-01 16:17:25 -07:00
Eric Anholt
dca5fc1435 i965/fs: Improve performance of varying-index uniform loads on IVB.
Like we have done for the VS and for constant-index uniform loads, we use
the sampler engine to get caching in front of the L3 to avoid tickling the
IVB L3 bug.  This is also a bit of a functional change, as we're now
loading a vec4 instead of a single dword, though we're not taking
advantage of the other 3 components of the vec4 (yet).

With the driver hacked to always take the varying-index path for all
uniforms, improves performance of my old GLSL demo by 315% +/- 2% (n=4).
This is a major fix for some blur shaders in compositors from the
varying-index uniforms support I introduced in 9.1.

v2: Move old offset computation into the pre-gen7 path.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61554
NOTE: This is a candidate for the 9.1 branch.
2013-04-01 16:17:25 -07:00
Eric Anholt
bc0e1591f6 i965/fs: Avoid inappropriate optimization with regs_written > 1.
Right now we don't have anything with regs_written() > 1 and !inst->mlen,
but that's about to change.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 16:17:25 -07:00
Eric Anholt
740350c982 i965: Make the fragment shader pull constants index by dwords, not vec4s.
We want to load vec4s, since loading a vec4 instead of a dword is
basically no increased latency.  But for variable indexed access, the
previous requirement of aligned vec4s for a sampler LD was hard to
implement.

Note that this change only affects those messages that use the surface
format, like sampler LDs, but not the untyped data cache loads we've
used in other cases.

No significant performance difference on my GLSL demo with uniforms forced
to take the varying pull constants path (n=4).

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 16:17:25 -07:00
Eric Anholt
2f41a60145 i965: Make the constant surface interface take a normal byte size.
This puts the rounding-up logic into the function itself instead of all
the callers having to manage it.  Also drop an "unused" comment in gen4,
as the stride *is* used for texbos (and will be for uniforms soon).

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 16:17:25 -07:00
Eric Anholt
8c694dfe64 i965/fs: Move varying uniform offset computation into the helper func.
I'm going to want to change the math for gen7 using sampler LD
instructions in a way that gets CSE to occur like we'd hope.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 16:17:25 -07:00
Eric Anholt
59e858861c i965/fs: Remove creation of a MOV instruction that's never used.
We weren't inserting it into the list, so it did nothing.  This line was
replaced by the MOV/MUL block above.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 16:17:24 -07:00
Eric Anholt
1d6ead3804 i965/fs: Allow constant propagation into MACH.
This happens quite a bit with varying-index uniform loads.  We could also
do better by avoiding the MACH entirely, but there's no reason not to at
least take this step.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 16:17:24 -07:00
Vincent Lejeune
50fd9c4544 r600g/llvm: Update LLVM_REVISION.txt 2013-04-01 23:50:20 +02:00
Vincent Lejeune
8c8c4e3977 r600g/llvm: Use stack_size provided from llvm. 2013-04-01 23:43:57 +02:00
Vincent Lejeune
4ac0d85ca6 r600g/llvm: uses function attribute to pass shader type 2013-04-01 23:43:42 +02:00
Vincent Lejeune
af38695f51 r600g/llvm: Add support for cf_alu native encode 2013-04-01 23:43:27 +02:00
Haixia Shi
bc0cc2944f ACTIVE_UNIFORM_MAX_LENGTH should include 3 extra characters for arrays.
If the active uniform is an array, then the length of the uniform name should
include the three extra characters for the "[0]" suffix, which is required by
the GL 4.2 spec to be appended to the uniform name in glGetActiveUniform().

This avoids the situation where the output buffer does not have enough space
to hold the "[0]" suffix, resulting in an incomplete array specification like
"foobar[0".

NOTE: This is a candidate for the 9.1 branch.

Change-Id: I41e87ba347a7169eec8c575596cc3416adbe0728
Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-04-01 13:39:13 -07:00
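The length rule described in bc0cc2944f can be sketched in C as follows. This is a minimal illustration and not the Mesa code: `uniform_name_buf_len` and its parameters are hypothetical names, and the only assumption is that the reported length counts the terminating NUL as well as the "[0]" suffix.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: buffer size (including the NUL terminator) needed
 * to return a uniform's name, counting the "[0]" suffix that is appended
 * to the names of array uniforms. */
static size_t
uniform_name_buf_len(const char *name, bool is_array)
{
   size_t len = strlen(name) + 1;   /* name plus terminating NUL */

   if (is_array)
      len += 3;                     /* room for "[0]" */

   return len;
}
```

With this rule, a caller that sizes its buffer from the maximum length has room for "foobar[0]" rather than the truncated "foobar[0".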
Matt Turner
e2b40e253b i965/fs: Fix bad interaction between tex swizzles and textureQueryLOD.
Reported-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 13:11:43 -07:00
Eric Anholt
4ee892ee8a i965: Remove the old brw_optimize() code.
This is now done in the VS backend before instruction emit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 11:36:06 -07:00
Eric Anholt
4fee05b020 i965/vs: Add a pass to set dependency control fields on instructions.
This is a more aggressive version of the old brw_optimize() path.  Reduces
cycles spent in the vertex shader on minecraft by 18.6% +/- 10.0% (n=15).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 11:36:05 -07:00
Eric Anholt
229a51cdbe i965: Dump shader source for linked shader programs.
We dump shader source in ir_to_mesa.cpp, and we dump linked programs here,
but we had no reference from the linked programs to their source.  This
was preventing improvement of shader-db to use linked shader programs
instead of individual shader files (which is bogus, because it means we
optimize out VS outputs, and don't interpolate FS inputs!)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-01 11:30:36 -07:00
Mike Lothian
777a7f2003 clover: Fix build with LLVM 3.3 2013-04-01 10:50:23 -07:00
Brian Paul
1165ff1af1 llvmpipe: use triangle subdivision to avoid fixed-point overflow issues
If we're drawing to a surface that's 2048 x 2048 pixels or larger there's
danger of fixed-point overflow in the triangle rasterization code.  That
leads to various rendering glitches.

Rather than implement some intricate changes to the rasterization code,
simply subdivide triangles into smaller subtriangles to avoid the issue.
Only do this when the drawing surface is larger than 2048 by 2048.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-01 08:40:35 -06:00
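The idea in 1165ff1af1 can be illustrated with a small C sketch. The commit message does not say which subdivision scheme llvmpipe uses, so the midpoint split below is an assumption, and `subdivide_tri`, `struct vec2` and `max_extent` are hypothetical names. The function only counts the subtriangles it would hand to the rasterizer.

```c
struct vec2 { float x, y; };

static float min3(float a, float b, float c)
{
   float m = a < b ? a : b;
   return m < c ? m : c;
}

static float max3(float a, float b, float c)
{
   float m = a > b ? a : b;
   return m > c ? m : c;
}

static struct vec2 midpoint(struct vec2 p, struct vec2 q)
{
   struct vec2 m = { (p.x + q.x) * 0.5f, (p.y + q.y) * 0.5f };
   return m;
}

/* Recursively split a triangle at its edge midpoints until its bounding
 * box is no larger than max_extent in either direction.  Returns the
 * number of subtriangles produced.  max_extent must be positive. */
static unsigned
subdivide_tri(struct vec2 a, struct vec2 b, struct vec2 c, float max_extent)
{
   float w = max3(a.x, b.x, c.x) - min3(a.x, b.x, c.x);
   float h = max3(a.y, b.y, c.y) - min3(a.y, b.y, c.y);

   if (max_extent <= 0.0f || (w <= max_extent && h <= max_extent))
      return 1;   /* small enough: rasterize this triangle directly */

   struct vec2 ab = midpoint(a, b);
   struct vec2 bc = midpoint(b, c);
   struct vec2 ca = midpoint(c, a);

   /* Three corner triangles plus the centre triangle. */
   return subdivide_tri(a, ab, ca, max_extent) +
          subdivide_tri(ab, b, bc, max_extent) +
          subdivide_tri(ca, bc, c, max_extent) +
          subdivide_tri(ab, bc, ca, max_extent);
}
```

Each level halves the bounding box, so the coordinates that reach the fixed-point rasterization code stay within a bounded range regardless of the surface size.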
Brian Paul
95df2b2883 mesa: remove platform checks around __builtin_ffs, __builtin_ffsll
Use the __builtin_ffs, __builtin_ffsll functions whenever we have GCC,
not just for specific platforms.  Fixes Solaris build.

Note: This is a candidate for the stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62868
Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-01 08:40:35 -06:00
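A minimal sketch of the pattern described in 95df2b2883: key the builtin on the compiler rather than on the platform. The wrapper name `find_first_set` and the fallback loop are assumptions for illustration; only `__builtin_ffs` and the `__GNUC__` check come from GCC itself.

```c
/* Hypothetical wrapper: returns the 1-based index of the least
 * significant set bit, or 0 if no bit is set. */
static int
find_first_set(int value)
{
#if defined(__GNUC__)
   /* Available whenever the compiler is GCC, on any platform. */
   return __builtin_ffs(value);
#else
   /* Portable fallback for other compilers. */
   unsigned v = (unsigned) value;
   int pos = 1;

   if (v == 0)
      return 0;

   while ((v & 1u) == 0) {
      v >>= 1;
      pos++;
   }
   return pos;
#endif
}
```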
Brian Paul
99811c344b docs: add a new page documenting known application issues
Let's try to update this when we find other broken applications...

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-01 08:40:35 -06:00
Brian Paul
fe30fa9ad6 drirc: set always_have_depth_buffer for Topogon
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-01 08:18:09 -06:00
Adam Jackson
e26d5940ff gallivm: Minor comment cleanup
Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-04-01 09:45:38 -04:00
Dave Airlie
135bb3c1a9 mesa: fix texture storage multisample prototypes harder.
I just noticed the warnings since I fixed the other bit.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-01 19:54:56 +10:00
Vincent Lejeune
c3fb34ee8d r600g/llvm: Update LLVM_REVISION 2013-03-31 21:37:20 +02:00
Vincent Lejeune
67a8ee7aaa r600g/llvm: use native encode for tex 2013-03-31 21:35:47 +02:00
Dave Airlie
5b36bc05be glapi: fix storage multisample build errors
Reported on #radeon by udovdh

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-03-31 20:41:28 +10:00
Chris Forbes
2a528889a3 docs: mark ARB_texture_storage_multisample done
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-31 22:19:42 +13:00
Chris Forbes
d25b4d5e90 i965: enable ARB_texture_storage_multisample on Gen6+
This can be enabled everywhere that ARB_texture_multisample is
supported -- ARB_texture_storage is supported on everything.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-31 22:19:40 +13:00
Chris Forbes
e0015c819c mesa: allow multisample texture targets in [Get]TexParameter*
ARB_texture_storage_multisample allows texture parameters to be
queried for TEXTURE_2D_MULTISAMPLE and TEXTURE_2D_MULTISAMPLE_ARRAY
targets.

Some parameters may also be set, with the following exceptions:

- TEXTURE_BASE_LEVEL may not be set to a nonzero value; generates
   INVALID_OPERATION

- any state which appears in the `per-sampler` state table may not
  be set; generates INVALID_OPERATION

V2: Don't introduce bogus handling of TEXTURE_MAX_LEVEL

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-31 22:19:36 +13:00
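The two exceptions listed in e0015c819c can be sketched as a small validation function. Everything named here (`sketch_pname`, `sketch_error`, `check_multisample_tex_parameter`) is hypothetical and exists only for this illustration, and the set of per-sampler parameters is deliberately not exhaustive.

```c
#include <stdbool.h>

/* Hypothetical parameter names and error codes for this sketch only. */
enum sketch_pname {
   PNAME_BASE_LEVEL,
   PNAME_MIN_FILTER,   /* per-sampler state */
   PNAME_MAG_FILTER,   /* per-sampler state */
   PNAME_WRAP_S        /* per-sampler state */
};

enum sketch_error { SKETCH_NO_ERROR, SKETCH_INVALID_OPERATION };

static bool
is_sampler_state(enum sketch_pname pname)
{
   return pname == PNAME_MIN_FILTER ||
          pname == PNAME_MAG_FILTER ||
          pname == PNAME_WRAP_S;
}

/* Validate setting a parameter on a multisample texture target. */
static enum sketch_error
check_multisample_tex_parameter(enum sketch_pname pname, int value)
{
   /* Per-sampler state may not be set on multisample targets. */
   if (is_sampler_state(pname))
      return SKETCH_INVALID_OPERATION;

   /* The base level may only be (re)set to zero. */
   if (pname == PNAME_BASE_LEVEL && value != 0)
      return SKETCH_INVALID_OPERATION;

   return SKETCH_NO_ERROR;
}
```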
Chris Forbes
b15c558c85 mesa: improve reported function name in Tex*Multisample
Now that there are 4 variants, just pass the function name into
teximagemultisample rather than reconstructing it.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-31 22:19:34 +13:00
Chris Forbes
9cbfe98bfc mesa: add enable bit for ARB_texture_storage_multisample
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-31 22:19:32 +13:00
Chris Forbes
719974b54c glapi: add definition of ARB_texture_storage_multisample
Adds XML for the extension, dispatch_sanity enabling, and the two new
entrypoints. These are both implemented by calling the shared
teximagemultisample() with immutable=GL_TRUE.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-31 22:19:28 +13:00
Chris Forbes
788b0f8535 mesa: add support for immutable textures to teximagemultisample()
The new entrypoints will come later, but this adds the actual logic for
supporting immutable multisample textures:

- The immutability flag is set as desired.
- Attempting to modify an immutable multisample texture produces
  INVALID_OPERATION.

Note: The extension spec does not mention adding this behavior to
TexImage*Multisample, but it seems like the reasonable thing to do.

V2: - Cover missing error cases (unsized formats; texture object zero)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[V1] Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-31 22:19:22 +13:00
Chris Forbes
7f32b9560b mesa: extract _mesa_is_legal_tex_storage_format helper
This is about to be used in teximagemultisample() when immutable=true.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-31 22:19:13 +13:00
Kenneth Graunke
fdc5941972 mesa: Delete VERT_ATTRIB_GENERIC_NV and VERT_BIT_GENERIC_NV macros.
These haven't been used since we deleted NV_vertex_program support.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-30 19:19:45 -07:00
Eric Anholt
0967c362bf i965: Fix an inconsistency in the VUE map with gl_ClipVertex on gen4/5.
We are intentionally not allocating a slot for gl_ClipVertex.  But by
leaving the bit set in the slots_valid, the fragment shader's computation
of where varyings are in urb entry coming out of the SF would be off by
one.  Fixes rendering in Freespace 2 SCP, and improves rendering in TF2.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62830
Tested-by: Joaquín Ignacio Aramendía <samsagax@gmail.com>
NOTE: This is a candidate for the 9.1 branch.
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-03-30 17:24:18 -07:00
Eric Anholt
9dd19575d3 intel: Remove a never-taken debug print path.
Alessandro Pignotti noted when I added this code in commit
0e723b135b that it's in the else block for
"if (busy)", so this debug print couldn't happen.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-30 17:23:50 -07:00
Brian Paul
c34bbe110d st/mesa: add ir_lod case in GLSL->TGSI code to silence warning 2013-03-29 17:21:33 -06:00
Ian Romanick
e0131196ca glsl: Generate masked write instead of vector array index for UBO lowering
When reading a column from a row-major matrix, we would slot the single
value read into the vector using an ir_dereference_array of the vector
with a constant index.  This will (eventually) get optimized to a
masked-write, so just generate the masked write in the first place.

v2: Remove unused variable 'chan'.  Suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Eric Anholt <eric@anholt.net>
2013-03-29 12:01:14 -07:00
Ian Romanick
65cc68f430 glsl: Replace open-coded dot-product with dot
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Paul Berry <stereotype441@gmail.com>
2013-03-29 12:01:11 -07:00
Ian Romanick
dbf94d105a glsl: Replace constant-index vector array accesses with swizzles
Search and replace:

    ][0] -> ].x
    ][1] -> ].y
    ][2] -> ].z
    ][3] -> ].w

Fixes piglit tests inverse-mat[234].{vert,frag}.  These tests call the
inverse function with constant parameters and expect proper constant
folding to happen.  My suspicion is that this patch papers over some bug
in constant propagation involving array accesses.

Either way, all of these accesses eventually get lowered to swizzles.
This cuts out the middle man (saving a trivial amount of CPU).

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Paul Berry <stereotype441@gmail.com>
2013-03-29 12:01:07 -07:00
Ian Romanick
c770faea0a glsl: Add missing bool case in glsl_type::get_scalar_type
Since the case was missing bvec4->get_scalar_type() would return bvec4,
but vec4->get_scalar_type() would return float.

NOTE: This is a candidate for stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-29 12:01:01 -07:00
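The bug class fixed in c770faea0a can be shown with a simplified C model. The real code is a C++ method on `glsl_type`; the `type_desc` struct, the `base_type` enum and the function below are hypothetical stand-ins that only mirror the shape of the switch.

```c
/* Hypothetical, simplified model of a GLSL type. */
enum base_type { BASE_UINT, BASE_INT, BASE_FLOAT, BASE_BOOL, BASE_STRUCT };

struct type_desc {
   enum base_type base;
   unsigned components;   /* 1 for scalars, 2..4 for vectors */
};

/* Return the scalar type underlying a scalar or vector type. */
static struct type_desc
get_scalar_type(struct type_desc t)
{
   struct type_desc scalar = t;

   switch (t.base) {
   case BASE_UINT:
   case BASE_INT:
   case BASE_FLOAT:
   case BASE_BOOL:        /* without this case, bvec4 came back as bvec4 */
      scalar.components = 1;
      return scalar;
   default:
      return t;           /* other types are returned unchanged */
   }
}
```

Dropping the `BASE_BOOL` label makes boolean vectors fall through to the default branch, which reproduces the asymmetry between bvec4 and vec4 described in the commit message.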
Kenneth Graunke
57a502518e i965: Fix INTEL_DEBUG=shader_time for fragment shaders with discards.
"discard" instructions generate HALT instructions which jump to a final
HALT near the end of the shader.  Previously, fs_generator created this
final jump target when it saw the first FS_OPCODE_FB_WRITE, causing it
to jump right before the FB write epilogue.  This is normally good.

However, INTEL_DEBUG=shader_time also has an epilogue section which
records the final timestamp.  The frontend emits IR for this just before
FS_OPCODE_FB_WRITE.  Unfortunately, this led to the following ordering:

1. Shader Time Epilogue
2. Final HALT (where discards jump)
3. Framebuffer Write Epilogue

This meant that discarded pixels completely skipped the shader time
epilogue, causing no ending timestamp to be written.  This obviously
led to inaccurate results.

This patch adds a new FS_OPCODE_PLACEHOLDER_HALT in the IR stream just
before any epilogue sections.  This is where the final HALT should be
generated, and makes it easy to ensure the correct ordering:

1. Final HALT
2. Shader Time Epilogue
3. Framebuffer Write Epilogue

For shaders that don't discard, this opcode compiles away to nothing.
The scheduler adds barrier dependencies to make sure that it doesn't
get moved above any FS_OPCODE_DISCARD_JUMP instructions.

One 8-wide shader in GLBenchmark 2.7 dropped from 2291.67 Gcycles to
a mere 5.13 Gcycles.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-29 11:39:32 -07:00
Eric Anholt
20d846ce8b i965: Add names for all instructions to dump_instruction() in FS and VS.
I'd previously added the minimum names to understand my dumps, but this
makes dumps in general much easier to read.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-29 11:39:21 -07:00
Matt Turner
ed6186f0e8 i965: Enable ARB_texture_query_lod.
v2: Support Ironlake as well.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-29 10:21:14 -07:00
Matt Turner
b8aa9f7d3a i965/fs: Generate LOD sampler message from ir_lod.
v2: Support Ironlake as well.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-29 10:21:14 -07:00
Dave Airlie
110ca8b1f3 glsl: Implement ARB_texture_query_lod
v2 [mattst88]:
   - Rebase.
   - #define GL_ARB_texture_query_lod to 1.
   - Remove comma after ir_lod in ir.h for MSVC.
   - Handled ir_lod in ir_hv_accept.cpp, ir_rvalue_visitor.cpp,
     opt_tree_grafting.cpp.
   - Rename textureQueryLOD to textureQueryLod, see
     https://www.khronos.org/bugzilla/show_bug.cgi?id=821
   - Fix ir_reader of (lod ...).
v3 [mattst88]:
   - Rename textureQueryLod to textureQueryLOD, pending resolution of
     Khronos 821.
   - Add ir_lod case to ir_to_mesa.cpp.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-29 10:20:26 -07:00
Matt Turner
0e0ab8a071 i965/fs: Use measured Gen7 instruction timings on Gen6.
x before
+ after
+------------------------------------------------------------------------------+
|   x                                   x   +                                  |
|   xx  ++                              x   +                                  |
|   xx  ++ +                           xx   ++                                 |
|x xxx x+++++          +           xxx x*x+*+++ +         x                   +|
|   |_____|____________A______A____M____M_|_______|                            |
+------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
    x  23       8083.78       8287.83       8205.55     8162.7461     68.307951
    +  23       8107.56       8358.74       8224.33     8186.1765     71.506301
    No difference proven at 95.0% confidence

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-29 10:13:27 -07:00
Matt Turner
f085b21b25 i965/fs: Increase and document MAD latency on Gen7.
58% of mad(8) generated in shader-db are reading registers from the same
bank.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-29 10:13:27 -07:00
Matt Turner
414ea2f560 i965/fs: Add LRP instruction latency.
Set its latency to what happens to be the default floating-point
instruction latency. One day we may want to handle latency based on
register bank information.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-29 10:13:27 -07:00
Matt Turner
ad4507b355 i965/fs: Add Haswell cycle timings
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-29 10:13:27 -07:00
Matt Turner
7997e59b65 i965: Note that write-after-write dependencies are blocking.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-29 10:13:26 -07:00
Matt Turner
f91e371fee i965: Reword comment about the shared mathbox.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-29 10:13:26 -07:00
Roland Scheidegger
5f41e08cf3 gallivm: consolidate some half-to-float and r11g11b10-to-float code
Similar enough that we can try to use shared code.
v2: fix a stupid bug using wrong variable causing mayhem with Inf and NaNs.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-29 16:39:40 +01:00
Chris Forbes
4412f3bc13 mesa: provide default implementation of QuerySamplesForFormat
Previously at least i915 failed to provide an implementation, but
exposed ARB_internalformat_query anyway, leading to crashes when
QueryInternalformativ was called.

Default implementation just returns 1 for everything, so is suitable for
any driver which does not support multisampling.

V2: - Move from intel to core mesa.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-29 20:54:36 +13:00
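A sketch of what a "returns 1 for everything" default could look like, for 4412f3bc13. The function name, the parameter types and the 16-entry array are assumptions; the sketch assumes the hook fills in a list of supported sample counts and returns how many entries it wrote.

```c
#include <stddef.h>

/* Hypothetical default for drivers without multisampling: whatever the
 * target or format, report a single supported sample count of 1. */
static size_t
query_samples_for_format_default(unsigned target, unsigned internal_format,
                                 int samples[16])
{
   (void) target;
   (void) internal_format;

   samples[0] = 1;
   return 1;   /* one entry written */
}
```

Installing such a default in the core means a driver that advertises the extension but provides no hook no longer leaves a NULL function pointer to be called.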
Christoph Bumiller
ee624ced36 nvc0: implement MP performance counters
There's more, but this only adds most of the counters that are
handled directly by the shader processors.
The other counter domains are not handled on the multiprocessor and
there are no FIFO object methods for configuring them.
Instead, they have to be programmed by the kernel via PCOUNTER, and
the interface for this isn't in place yet.
2013-03-29 00:33:01 +01:00
Christoph Bumiller
480359bcf6 nvc0: enable compression when supported 2013-03-29 00:33:01 +01:00
Christoph Bumiller
25722e3454 nvc0: use NOUVEAU_GETPARAM_GRAPH_UNITS to get MP count 2013-03-29 00:33:00 +01:00
Christoph Bumiller
443b247878 nv50,nvc0: fix 3d blits, restore viewport after blit 2013-03-29 00:33:00 +01:00
Christoph Bumiller
090e73fc46 nv50: fix 3D render target setup 2013-03-29 00:33:00 +01:00
Brian Paul
b54ce3738a llvmpipe: put .bmp extension on dumped image files 2013-03-28 17:17:26 -06:00
Brian Paul
e90c56bc4e llvmpipe: add 'f' suffix to 1.0 in fixed_to_float() 2013-03-28 17:17:26 -06:00
Brian Paul
499aa3ddb4 draw: fix some build breakage when LLVM is not used
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62883
Tested-by: Vinson Lee <vlee@freedesktop.org>
2013-03-28 17:15:58 -06:00
Marek Olšák
9ad9141917 mesa: handle STATE_CURRENT_ATTRIB_MAYBE_VP_CLAMPED for parameter printing
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-28 20:02:50 +01:00
Kenneth Graunke
9fe47756b3 i965: Tidy shader time printing code by using printf's field widths.
We can use %-6s%-6s rather than manually counting characters, resulting
in much more readable code.

This necessitates a small secondary change: using "total fs16" and ""
now causes the "" string to be padded out to 6 characters, resulting in
too much whitespace.  Splitting it into "total" and "fs16" produces the
same output as before.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:44 -07:00
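The field-width behaviour that 9fe47756b3 relies on can be demonstrated with standard `snprintf`. The helper name `format_label` is hypothetical; the format string is the one quoted in the commit message.

```c
#include <stdio.h>

/* Hypothetical helper: left-justify two labels in 6-character columns. */
static int
format_label(char *buf, size_t size, const char *stage, const char *kind)
{
   /* "%-6s" pads a shorter string with spaces on the right up to 6
    * characters; a longer string is printed in full, not truncated. */
   return snprintf(buf, size, "%-6s%-6s", stage, kind);
}
```

Passing "total" and "fs16" yields two 6-character columns (12 characters in total). Passing "total fs16" and "" yields the 10-character first string followed by 6 padding spaces, which is the extra whitespace the commit message mentions.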
Eric Anholt
6192e9b377 i965/vs: Include URB payload setup in shader_time.
This much more accurately reflects the cost of the vertex shader, since
the payload setup is often a significant fraction of the instructions in
the VS.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:41 -07:00
Eric Anholt
55feb19704 i965/vs: Use a send from a 2-register VGRF for shader time writes.
This will let us emit it later, after we're setting up MRFs for the
URB write.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:37 -07:00
Eric Anholt
130138030a i965/vs: Teach copy propagation about sends from GRFs.
This incidentally also teaches it a bit about gen6 math -- we now allow
unswizzled, unmodified GRF temps as the sources for math.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:34 -07:00
Eric Anholt
c3a22d42a8 i965/vs: Prepare split_virtual_grfs() for the presence of SENDs from GRFs.
v2: Fix silly bool handling, and don't add new tabs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:29 -07:00
Eric Anholt
47e795d861 i965/fs: Include everything but the final FB write in shader_time.
Previously, if you just wrote a constant color to the render target, no
time got noted at all.  This is convenient for doing single-instruction
timings, but not so much for actual program analysis.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:23 -07:00
Eric Anholt
5c5218ea61 i965/fs: Switch shader_time writes to using GRFs.
This avoids conflicts between shader_time and FB writes, so we can include
more of the program under our profiling.  This does mean hiding more of
the message setup from the optimizer, which doesn't have a way to handle
multi-reg sends from GRFs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:15 -07:00
Eric Anholt
5c039543db i965: Provide more detailed information to match shader_time to programs.
Ken asked me the other day what -1 vs 0 vs 3 vs other meant in our shader
names, and I realized that it was really unclear.  I'd like to do even
better, like noting which one is the clear shader, but that would require
exposing the metaops struct to the driver.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:11 -07:00
Eric Anholt
d2ba1c24b4 i965: Track ARB program state along with GLSL state for shader_time.
This will let us do much better printouts for non-GLSL programs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 11:46:01 -07:00
Marek Olšák
a19f6e880a st/dri: fix crash with HUD and single buffering 2013-03-28 18:17:21 +01:00
Marek Olšák
6b5dfa42c9 st/mesa: remove leftover printfs from ReadPixels
Oops, I thought I had removed all debugging code.
2013-03-28 18:17:21 +01:00
Eric Anholt
eda434921d i965/fs: Improve performance of copy propagation dataflow using bitsets.
Reduces compile time of l4d2's slowest shader by 17.8% +/- 1.3% (n=10).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-28 09:48:50 -07:00
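For readers unfamiliar with bitset-based dataflow, here is a generic C sketch of the technique named in eda434921d. The commit message does not give the actual equations or data structures, so the textbook transfer function `liveout = copy | (livein & ~kill)` and all names below (`block_sets`, `update_liveout`, `WORDS`) are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define WORDS 2   /* room for 128 tracked copies in this sketch */

/* Hypothetical per-basic-block sets, one bit per tracked copy. */
struct block_sets {
   uint64_t copy[WORDS];      /* copies generated in this block */
   uint64_t kill[WORDS];      /* copies invalidated in this block */
   uint64_t livein[WORDS];    /* copies available on entry */
   uint64_t liveout[WORDS];   /* copies available on exit */
};

/* Apply liveout = copy | (livein & ~kill), 64 copies per operation.
 * Returns true if liveout changed, so the caller knows to iterate again. */
static bool
update_liveout(struct block_sets *b)
{
   bool progress = false;

   for (int w = 0; w < WORDS; w++) {
      uint64_t out = b->copy[w] | (b->livein[w] & ~b->kill[w]);

      if (out != b->liveout[w]) {
         b->liveout[w] = out;
         progress = true;
      }
   }
   return progress;
}
```

The gain comes from handling a whole machine word of copies with a few bitwise operations instead of looping over the entries one by one.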
Zack Rusin
d066133a76 llvmpipe/draw: Fix texture sampling in geometry shaders
We weren't correctly propagating the samplers and sampler views
when they were related to geometry shaders.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:02 -07:00
Zack Rusin
186a6bffdd draw/llvm: Cleanup the store debugging code
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:02 -07:00
Zack Rusin
10964fc73d draw: Allocate the output buffer for output primitives
We were allocating the output buffer but using the input
primitives. We need to allocate that buffer using the
maximum number of output, not input, primitives.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:02 -07:00
Zack Rusin
f20f981553 gallivm: Implement the breakc instruction
Required by more modern examples. Like BRK but with a condition.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:02 -07:00
Zack Rusin
b66ffcf2f8 gallivm: implement implicit primitive flushing
TGSI semantics currently require an implicit endprim at the end
of GS if an ending primitive hasn't been emitted.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:02 -07:00
Zack Rusin
e96f4e3b85 gallium/llvm: implement geometry shaders in the llvm paths
This commit implements code generation of the geometry shaders in
the SOA paths. All the code is there but bugs are likely present.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:02 -07:00
Zack Rusin
edcebe665d draw/gs: Fetch more than one primitive per invocation
Allows executing gs on up to 4 primitives at a time. Will also be
required by the llvm code because there we definitely don't want
to flush with just a single primitive.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:01 -07:00
Zack Rusin
014c4d1cd7 draw/gs: Abstract the portions of GS that are tgsi specific
To be able to add llvm paths later on we need to have some common
interface for them.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:01 -07:00
Zack Rusin
a85c83e427 draw/llvm: Remove unused gs_constants from jit_context
The member was never used and we'll need to handle it differently
because gs will also need samplers/textures setup.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:01 -07:00
Zack Rusin
90ee8de700 graw/gs: add missing max output vertices to all tests
A few tests were missing this crucial property.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-27 03:53:01 -07:00
Jerome Glisse
3f7d9710e8 radeonsi: add cs tracing v3
Same as on r600: trace cs execution by writing the cs offset after each
state. This allows pinpointing a lockup inside the command stream and
narrows down the scope of lockup investigation.

v2: Use WRITE_DATA packet instead of WRITE_MEM
v3: Remove useless nop packet

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-03-27 11:38:02 -04:00
Chris Forbes
21a2dfa55d mesa: only check sample count if we actually wanted multisampling
Fixes various test fallout from 90b5a2425a on Pineview, which claims to
support ARB_internalformat_query but doesn't actually provide the
driverfunc.

That driver is still broken [GetInternalformativ will still segfault!]
but it was silly to be going through the sample count logic in the
nonmultisampling case at all.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-27 07:49:12 +13:00
Christian König
c77159cc11 radeon/llvm: document LLVM commit
We need at least that revision to work correctly now.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-26 15:08:00 +01:00
Christian König
1c10018925 radeonsi: add preloading for all samplers
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-03-26 12:57:43 +01:00
Christian König
0f6cf2bc79 radeonsi: add preloading of all constants
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-03-26 12:57:40 +01:00
Christian König
44e3224554 radeonsi: mark most intrinsics as readnone/nounwind
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-03-26 12:57:36 +01:00
Christian König
206f059e1f radeonsi: mark all loads as constant
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-03-26 12:57:33 +01:00
Christian König
86f6fc2f1d radeonsi: remove wqm intrinsic
Now the backend handles that itself.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-03-26 12:57:30 +01:00
Christian König
6249db73ea radeon/llvm: remove unneeded inclusion
The include isn't needed and the file has moved with LLVM master.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-03-26 12:57:23 +01:00
Christian König
0f001fbff1 glsl_to_tgsi: avoid creating arrays if driver doesn't support them
Avoid creating arrays if we replace indirect addressing anyway.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-26 10:22:27 +01:00
Christian König
462de2e65f glsl_to_tgsi: make simplify_cmp work with arrays
Even when we have arrays it is possible for simplify_cmp
to work on temps, just not on arrays.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=62696

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-26 10:22:27 +01:00
Marek Olšák
98a8e5b87e gallium/docs: document get_driver_query_info 2013-03-26 01:37:40 +01:00
Marek Olšák
8ddae684af r600g: add a driver query returning the amount of requested VRAM and GTT memory 2013-03-26 01:28:19 +01:00
Marek Olšák
2504380aaf r600g: add a driver query returning the number of draw_vbo calls
between begin_query and end_query
2013-03-26 01:28:19 +01:00
Marek Olšák
e40c634bd2 st/dri: integrate the HUD
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-26 01:28:19 +01:00
Marek Olšák
c91cf7d7d2 gallium: implement a heads-up display module
Reviewed-by: Brian Paul <brianp@vmware.com>

v2: lots of cosmetic changes
2013-03-26 01:28:19 +01:00
Marek Olšák
8ddcd715b7 gallium: add interface for driver queries like performance counters, etc.
The pipe query interface is reused. The list of available queries can be
obtained using pipe_screen::get_driver_query_info.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-26 01:28:19 +01:00
Marek Olšák
9cec5edea7 gallium/tgsi: fix valgrind warning
"Conditional jump or move depends on uninitialised value(s)"

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-26 01:28:19 +01:00
Marek Olšák
17003b44b7 st/mesa: fix crash with blit-based GetTexImage
https://bugs.freedesktop.org/show_bug.cgi?id=62573

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-03-26 01:28:19 +01:00
Marek Olšák
d1b91e309b cso: add constant buffer save/restore feature for postprocessing
Postprocessing is an internal meta op and should restore the states
it changes.
2013-03-26 01:28:18 +01:00
Marek Olšák
35c522dce4 radeonsi: fix crash while binding a NULL constant buffer
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-03-26 01:28:18 +01:00
Marek Olšák
a2378daf83 r600g: fix crash while binding a NULL constant buffer 2013-03-26 01:28:18 +01:00
Marek Olšák
53228fe2a8 r300g: fix crash while binding a NULL constant buffer 2013-03-26 01:28:18 +01:00
Martin Andersson
92855bcc95 r600g: Use virtual address for PIPE_QUERY_SO* in r600_emit_query_end
Virtual address is used for PIPE_QUERY_SO* queries in
r600_emit_query_begin, but not in r600_emit_query_end.

This will trigger a GPU fault when one of those queries is
made and virtual address is enabled.

Note: this is a candidate for the 9.1 branch

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-25 18:18:23 -04:00
Rob Clark
634fb837ef freedreno: use u_debug for debug env vars
Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-03-25 15:05:44 -04:00
Jordan Justen
e207c33020 glsl ir: add as_dereference_record
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-25 11:35:56 -07:00
Brian Paul
eb92f89587 gallium: undef PACKAGE_* macros to silence warnings
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-25 12:24:11 -06:00
Brian Paul
c0f16df938 gallivm: init vars to silence warnings
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-25 12:24:11 -06:00
Brian Paul
35aefe9226 swrast: init vars to silence warnings
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-25 12:24:11 -06:00
Rob Clark
980f1cf8a1 freedreno: prefer sw upload for textures
Since we are UMA, in most cases the GPU blit doesn't make much sense for
texture upload.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-03-25 13:05:44 -04:00
Rob Clark
732b0b5ebc freedreno: track maximal scissor bounds
Optimize out parts of the render target that are scissored out by taking
into account maximal scissor bounds in fd_gmem_render_tiles().

This is a big win on things like gnome-shell which frequently do partial
screen updates.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-03-25 13:05:44 -04:00
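The scissor-tracking idea in 732b0b5ebc can be illustrated with a minimal sketch. The type and function names below (`struct rect`, `max_scissor_reset`, `max_scissor_add`, `tile_needs_render`) are hypothetical and are not taken from the freedreno source; the sketch only assumes that the driver accumulates the union of all scissor rectangles used in a frame and skips tiles outside that union.

```c
#include <stdbool.h>

/* Hypothetical rectangle type; the max edges are exclusive. */
struct rect {
   int minx, miny, maxx, maxy;
};

/* Start a frame with an empty (inverted) bounding box. */
static void max_scissor_reset(struct rect *max, int width, int height)
{
   max->minx = width;
   max->miny = height;
   max->maxx = 0;
   max->maxy = 0;
}

/* Grow the bounding box so it covers the scissor used by one draw. */
static void max_scissor_add(struct rect *max, const struct rect *scissor)
{
   if (scissor->minx < max->minx) max->minx = scissor->minx;
   if (scissor->miny < max->miny) max->miny = scissor->miny;
   if (scissor->maxx > max->maxx) max->maxx = scissor->maxx;
   if (scissor->maxy > max->maxy) max->maxy = scissor->maxy;
}

/* A tile only needs rendering if it overlaps the accumulated bounds. */
static bool tile_needs_render(const struct rect *max, const struct rect *tile)
{
   return tile->minx < max->maxx && tile->maxx > max->minx &&
          tile->miny < max->maxy && tile->maxy > max->miny;
}
```

With a partial screen update, most tiles fail the overlap test and can be skipped, which is presumably where the gnome-shell win comes from.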
Adrian Marius Negreanu
8a4750fe5e android: fix Android.mk bug in mesa/drivers/dri/common
Target-specific variables are undefined when used as prerequisites.
Use secondary expansion instead.

I noticed this when building the patch:
     i965: Add a driconf option to disable flush throttling

Signed-off-by: Adrian Marius Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-25 09:52:19 -07:00
Eric Anholt
712bac1f41 mesa: Disable validate_ir_tree() on release builds.
Since half of ir_validate uses assert() (the other half uses printf() then
abort()), there's not much use in calling it in a release build.  Cuts
6.3% of the startup time of TF2.

NOTE: This is a candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-25 08:50:38 -07:00
Roland Scheidegger
92b8a37fdf gallivm: move code for dealing with rgb9e5 and r11g11b10 formats to own file
This is really not generic conversion stuff and the code is very particular to
these formats.
2013-03-24 22:54:45 +01:00
Vinson Lee
7d0c1f2437 llvmpipe: Fix assertions with assignment instead of comparison.
Fixes assign instead of compare defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-03-24 14:49:22 -07:00
Paul Berry
a593a1b276 i965: Shrink brw_vue_map struct.
This patch changes the arrays in brw_vue_map (which only ever contain
values from -1 to 58) from ints to signed chars.  This reduces the
size of the struct from 488 bytes to 136 bytes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: fix STATIC_ASSERT to use 127 instead of 128.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-24 10:55:28 -07:00
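The size reduction in a593a1b276 can be reproduced with a small sketch. The field layout and the slot count of 59 are assumptions (the commit only states that the stored values range from -1 to 58); on a typical 64-bit ABI these assumed layouts come out at 488 and 136 bytes, matching the figures in the message.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed slot count: the commit only says values range from -1 to 58. */
#define SLOT_COUNT 59

/* Hypothetical "before" layout: one int per table entry. */
struct vue_map_int {
   uint64_t slots_valid;
   int varying_to_slot[SLOT_COUNT];
   int slot_to_varying[SLOT_COUNT];
   int num_slots;
};

/* Hypothetical "after" layout: one signed char per table entry. */
struct vue_map_char {
   uint64_t slots_valid;
   signed char varying_to_slot[SLOT_COUNT];
   signed char slot_to_varying[SLOT_COUNT];
   int num_slots;
};

/* Every stored value (-1 .. SLOT_COUNT - 1) must fit in a signed char,
 * so the largest index is compared against 127, not 128. */
_Static_assert(SLOT_COUNT - 1 <= 127, "slot index must fit in signed char");

/* Bytes saved per struct instance by the narrower element type. */
static size_t vue_map_bytes_saved(void)
{
   return sizeof(struct vue_map_int) - sizeof(struct vue_map_char);
}
```

The compile-time check mirrors the v2 note: a signed char tops out at 127, so that is the bound to assert against.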
Paul Berry
0a0deb92d9 i965/fs: Rename vp_outputs_written to input_slots_valid.
With the introduction of geometry shaders, fragment inputs will no
longer come exclusively from the vertex shader; sometimes they come
from the geometry shader.  So the name "vp_outputs_written" will
become a misnomer.  This patch renames vp_outputs_written to
input_slots_valid, to reflect the true meaning of the bitfield from
the fragment shader's point of view: it indicates which of the
possible input slots contain valid data that was written by the
previous shader stage.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 10:55:28 -07:00
Paul Berry
bf9bfe838e i965: Use brw.vue_map_geom_out instead of VS output VUE map where appropriate.
This patch modifies post-GS pipeline stages (transform feedback, clip,
sf, fs) to refer to the VUE map through brw->vue_map_geom_out rather
than brw->vs.prog_data->vue_map.  This ensures that when geometry
shader support is added, these pipeline stages will consult the
geometry shader output VUE map when appropriate, rather than the
vertex shader output VUE map.

v2: Fixed some stale "CACHE_NEW_VS_PROG" comments.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 10:55:27 -07:00
Paul Berry
463ef47b16 i965: Store the geometry output VUE map in brw_context.
Currently, the GPU pipeline has one active VUE map in effect at any
given time--the one representing the layout of vertex data coming from
the vertex shader.  However, when geometry shaders are added, they
will have their own independent VUE map.  Later pipeline stages (clip,
sf, fs) will need to consult the geometry shader VUE map if a geometry
shader is in use, and the vertex shader VUE map otherwise.

This patch adds a new field to brw_context, vue_map_geom_out, which
contains the VUE map that should be used by later pipeline stages.  It
also adds a new state flag, BRW_NEW_VUE_MAP_GEOM_OUT, which is
signalled whenever the contents of the VUE map changes.

Since we don't support geometry shaders yet, vue_map_geom_out is
currently set only by the brw_vs_prog state atom.

v2: Don't set vue_map_geom_out in do_vs_prog--that's redundant and
possibly problematic for precompiles.  Only set it in
brw_upload_vs_prog.  Also, make a copy instead of using a
pointer--this makes it possible to detect when the VUE map hasn't
changed, so we can avoid redundant state uploads.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 10:55:27 -07:00
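The v2 note in 463ef47b16 (keep a copy rather than a pointer so unchanged maps can be detected) can be sketched as follows. All names here (`struct vue_map`, `struct context`, `NEW_VUE_MAP_GEOM_OUT`, `upload_vue_map`) are hypothetical stand-ins, not the i965 definitions.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, padding-free stand-in for the real VUE map. */
struct vue_map {
   signed char slot_to_varying[8];
   int num_slots;
};

#define NEW_VUE_MAP_GEOM_OUT 0x1u  /* hypothetical dirty bit */

struct context {
   struct vue_map vue_map_geom_out;  /* a copy, not a pointer */
   unsigned dirty;
};

/* Store the latest map and flag it dirty only if its contents changed.
 * memcmp is safe here because the sketch struct has no padding bytes. */
static bool upload_vue_map(struct context *ctx, const struct vue_map *latest)
{
   if (memcmp(&ctx->vue_map_geom_out, latest, sizeof(*latest)) == 0)
      return false;  /* identical contents: skip the redundant upload */

   ctx->vue_map_geom_out = *latest;
   ctx->dirty |= NEW_VUE_MAP_GEOM_OUT;
   return true;
}
```

Had the context held only a pointer, a map rebuilt in place with identical contents would be indistinguishable from a changed one, so every upload would have to raise the flag.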
Paul Berry
8fbc22e880 i965: Move brw_vs_prog_data::outputs_written into VUE map.
Future patches will allow for there to be separate VUE maps when both
a geometry shader and a vertex shader are in use.  When this happens,
we will want to have correspondingly separate outputs_written
bitfields.  Moving outputs_written into the VUE map will make this
easy.

For consistency with the terminology used in the VUE map, the bitfield
is renamed to "slots_valid" in the process.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 10:55:27 -07:00
Paul Berry
76ba30800d i965/gen7: Use WE_all mode when enabling channel masks for URB write.
Gen7 adds mask bits to the message header for a URB write which allow
the write to apply only to certain channels.  We don't use this
functionality, so to ensure that the entire write always occurs, we
emit an OR instruction to set the mask bits.

With the advent of geometry shaders, URB writes won't just happen at
the end of a thread; they will happen in mid-thread too.  Thus, we can
no longer rely on channel 0 being enabled, so we need to emit the OR
instruction in WE_all mode to ensure that it is executed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 10:55:27 -07:00
Paul Berry
8371c68a4b i965: Rename BRW_VARYING_SLOT_MAX -> BRW_VARYING_SLOT_COUNT.
The new name clarifies that it represents *one more* than the maximum
possible brw_varying_slot value.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 10:55:27 -07:00
Paul Berry
ec9c3882d9 i965: Clarify nomenclature: vert_result -> varying
This patch removes the terminology "vert_result" from the i965 driver,
replacing it with "varying".  The old terminology, "vert_result", was
confusing because (a) it referred to the enum gl_vert_result, which no
longer exists (it was replaced with gl_varying_slot), and (b) it
implied a vertex output, but with the advent of geometry shaders, it
could be either a vertex or a geometry output, depending what shaders
are in use.  The generic term "varying" is less confusing.

No functional change.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: Whitespace fixes.
2013-03-23 22:47:54 -07:00
Chris Forbes
f56fb9d248 i965: bump MAX_DEPTH_TEXTURE_SAMPLES to 4/8
Bump MAX_DEPTH_TEXTURE_SAMPLES to match what GetInternalformativ is
claiming. Since that limit is what is actually enforced now, this
doesn't actually change anything except the queried value.

There's still no piglits verifying that multisample depth textures work,
but this works in the Unigine demos.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 16:38:18 +13:00
Chris Forbes
2405da174e mesa: use _mesa_check_sample_count() for multisample textures
Extends _mesa_check_sample_count() to properly support the
TEXTURE_2D_MULTISAMPLE and TEXTURE_2D_MULTISAMPLE_ARRAY targets, which
have subtly different limits than renderbuffers.

This resolves the remaining TODO in the implementation of
TexImage*DMultisample.

V2: - Don't introduce spurious block.
    - Do this in multisample.c instead.
    - Fix typo in error message.
    - Inline spec quotes

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 16:38:18 +13:00
Chris Forbes
90b5a2425a mesa: helper for checking renderbuffer sample count
Pulls the checking of the sample count into a helper function, and
extends the existing logic to include the interactions with both
ARB_texture_multisample and ARB_internalformat_query.

_mesa_check_sample_count() checks a desired sample count against
a combination of target/internalformat, and returns the error enum
to be produced, if any. Unfortunately the conditions are messy and the
errors vary.

V2: - Tidy up spurious block.
    - Move _mesa_check_sample_count() to multisample.c instead; It
      doesn't really belong in fbobject.c or teximage.c.
    - Inlined spec quotes

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 16:38:18 +13:00
Chris Forbes
86b8380600 mesa: allow internalformat_query with multisample texture targets
Now that we support ARB_texture_multisample, there are multiple targets
accepted for this query, and they may have target-dependent limits, so
pass the target to the driverfunc.

For example, the sampling hardware may not be able to do general
texelFetch() for some format/sample count combination, but the driver
may still be able to implement a reasonable resolve operation, so it can
be supported for renderbuffers.

V2: - Don't break Gallium compile.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-24 16:38:18 +13:00
Dmitry Cherkassov
3cc2629b3b clover: add dynamic_cast results checking down in clSetKernelArgument() code path.
Signed-off-by: Dmitry Cherkassov <dcherkassov@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
2013-03-24 02:43:34 +01:00
Roland Scheidegger
b50e362dbb gallivm: Add code for rgb9e5 shared exponent format to float conversion
And use this (and the code for r11g11b10 packed float to float conversion)
in the soa texturing code (the generated code looks quite good).
Should probably be an order of magnitude faster than using the fallback
(not measured).
Tested with piglit texwrap GL_EXT_packed_float and
GL_EXT_texture_shared_exponent respectively (didn't find much else using
it).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-24 02:09:02 +01:00
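For reference, the rgb9e5 decode that b50e362dbb generates code for can be written as a scalar sketch. This is not the gallivm implementation (which emits vectorized LLVM IR); it only follows the shared-exponent format definition, with an exponent bias of 15 and 9 mantissa bits per channel. The helper names `pow2f` and `rgb9e5_to_float` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Build the float 2^e directly from its IEEE 754 bit pattern.
 * Valid for the normal range needed here (e from -24 to 7). */
static float pow2f(int e)
{
   uint32_t bits = (uint32_t)(e + 127) << 23;
   float f;
   memcpy(&f, &bits, sizeof f);
   return f;
}

/* Unpack one rgb9e5 texel: bits 0-8 red, 9-17 green, 18-26 blue,
 * 27-31 shared exponent.  Each channel is mantissa * 2^(exp - 15 - 9). */
static void rgb9e5_to_float(uint32_t packed, float rgb[3])
{
   const float scale = pow2f((int)(packed >> 27) - 15 - 9);

   rgb[0] = (float)(packed & 0x1ff) * scale;
   rgb[1] = (float)((packed >> 9) & 0x1ff) * scale;
   rgb[2] = (float)((packed >> 18) & 0x1ff) * scale;
}
```

Because the three channels share one scale factor, the conversion is a mask, a shift and a multiply per channel, which maps naturally onto SoA vector code.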
Marek Olšák
3e10ab6b22 gallium,st/mesa: don't use blit-based transfers with software rasterizers
The blit-based paths for TexImage, GetTexImage, and ReadPixels aren't very
fast with software rasterizers. Now Gallium drivers have the ability to turn
them off.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-23 13:19:16 +01:00
Marek Olšák
25e3094058 st/mesa: implement blit-based ReadPixels
Initial version contributed by: Martin Andersson <g02maran@gmail.com>

This is only used if the memcpy path cannot be used and if no transfer ops
are needed. It's pretty similar to our TexImage and GetTexImage
implementations.

The motivation behind this is to be able to use ReadPixels every frame and
still have at least 20 fps (or 60 fps with a powerful GPU and CPU)
instead of 0.5 fps.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-23 13:17:05 +01:00
Marek Olšák
d702c67ba5 mesa: add common format-independent memcpy-based ReadPixels path
I'll need the _mesa_readpixels_needs_slow_path function for the blit-based
version, but it's also useful to have this memcpy-based path in one place
and not scattered across several functions.

v2: add "const" to function parameters

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-23 13:17:05 +01:00
Marek Olšák
f8855a4214 mesa: add helper func for checking combined depthstencil buffers from st/mesa
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-23 13:17:05 +01:00
Marek Olšák
2dc2066b90 mesa: add a common function returning transfer ops for ReadPixels
I'll need both new functions for later. For now, it consolidates the code
for determining what the transfer ops should be and makes it a little bit
smarter.

v2: added "const"

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-23 13:17:05 +01:00
Marek Olšák
b2a4573c14 mesa: handle HALF_FLOAT like FLOAT in get_tex_rgba
NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-23 13:17:05 +01:00
Roland Scheidegger
b101a094b5 llvmpipe: add EXT_packed_float render target format support
New conversion code to handle conversion from/to r11g11b10 AoS to/from
SoA floats, and also add code for conversion from rgb9e5 AoS to float SoA
(which works pretty much the same as r11g11b10 except for the packing).
(This code should also be used for texture sampling instead of
relying on u_format conversion but it's not yet, so rgb9e5 is unused.)
Unfortunately a crazy amount of hacks is necessary to get the conversion
code running in llvmpipe's generate_unswizzled_blend, which isn't well
suited for formats where the storage representation has nothing to do
with what's needed for blending (moreover, the conversion will convert
from packed AoS values, which is the storage format, to float SoA values,
because this is much more natural for the conversion, and likewise from
SoA values to packed AoS values - but the "blend" (which includes
trivial things like partial mask) works on AoS values, so incoming fs
values will go SoA->AoS, values from destination will go packed
AoS->SoA->AoS, then do blend, then AoS->SoA->packed AoS which probably
isn't the most efficient way though the shuffles are probably bearable).

Passes piglit fbo-blending-formats (with GL_EXT_packed_float parameter),
still need to verify Inf/NaNs (where most of the complexity in the
conversion comes from actually).

v2: drop the (very bogus) rgb9e5 part, and do component extraction
in the helper code for r11g11b10 to float conversion, making the code
slightly more compact (suggested by Jose), now that there are no other
callers left this works quite well. (Could do the same for the
opposite way but it's less than ideal there, final part of packing
needs to be done in caller anyway and there'd be another conditional.)

v3: minor style and comment fixes. Also fix a potential issue with
negative zero being potentially returned by max(src, zero) as we
don't have well-defined min/max behavior (fortunately no additional cost).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-22 20:10:53 +01:00
Michel Dänzer
31009b4521 r600g: Honour legacy debugging environment variables
This helps minimize confusion / effort when moving between branches or
helping others.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-03-22 10:29:49 +01:00
Matt Turner
81e585fabe docs: Mark ARB_ES3_compatibility as done. 2013-03-21 15:59:21 -07:00
Rob Clark
eab8d6cbdb freedreno: add pipe->blit
Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-03-21 17:33:51 -04:00
Paul Berry
eea30dff43 i965: Add a driconf option to disable flush throttling.
Normally when submitting the first batch buffer after a flush, we
check whether the GPU has completed processing of the first batch
buffer of the previous frame.  If it hasn't, we wait for it to finish
before submitting any more batches.  This prevents GPU-heavy and
CPU-light applications from racing too far ahead of the current frame,
but at the expense of possibly lower frame rates.  Sometimes when
benchmarking we want to disable this mechanism.

This patch adds the driconf option "disable_throttling" to disable the
throttling mechanism.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-21 13:24:43 -07:00
Matt Turner
12dc4be8a6 mesa: Implement TEXTURE_IMMUTABLE_LEVELS for ES 3.0.
NOTE: This is a candidate for the 9.1 branch.
Fixes piglit's texture-immutable-levels test.
Reported-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-21 11:04:41 -07:00
Adam Jackson
38aa8ec937 glx: Build with VISIBILITY_CFLAGS in automake
Note: This is a candidate for the stable branches.

Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-03-21 13:21:18 -04:00
Brian Paul
3804d67723 scons: check for existence of 'MSVC_VERSION' in env
Evidently, MSVC_VERSION isn't always defined so check for it before
checking the MSVC version.

Suggested by Jose.
2013-03-21 09:24:40 -06:00
Brian Paul
10393038f8 softpipe: silence some asst. MSVC type warnings in sp_tex_sample.c 2013-03-21 09:24:35 -06:00
Brian Paul
b2d3f364db softpipe: silence some MSVC signed/unsigned warnings 2013-03-21 09:24:35 -06:00
Brian Paul
2e3200d463 softpipe: silence some MSVC float/double warnings 2013-03-21 09:24:35 -06:00
Brian Paul
f7b07fd25c rbug: silence some MSVC signed/unsigned warnings 2013-03-21 09:24:35 -06:00
Brian Paul
bfc8b8fac5 postprocess: silence some MSVC float/int warnings 2013-03-21 09:24:35 -06:00
Brian Paul
8bd5692a5d meta: fix incorrect slice, r coordinate computation
The arithmetic to convert a 3D texture slice to an R coordinate was
incorrect.  Found when MSVC warned of a divide by zero.

Note that we don't actually ever hit this path.  We don't decompress
slices of 3D textures and we don't support 3D mipmap generation yet.
2013-03-21 09:24:35 -06:00
Brian Paul
a940c93aac vega: fix MSVC warning about missing return statement 2013-03-21 09:24:35 -06:00
Brian Paul
52edca9df9 meta: minor indentation fix 2013-03-21 08:28:26 -06:00
Michel Dänzer
032e5548b3 radeonsi: Emit pixel shader state even when only the vertex shader changed
Fixes random failures with piglit glsl-max-varyings.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-03-21 15:12:31 +01:00
Chad Versace
e34fe8bd20 android: Define PACKAGE_VERSION/BUGREPORT in CFLAGS
This fixes the Android build. Commit 439c3d4 broke it.

CC: Adrian M Negreanu <adrian.m.negreanu@intel.com>
CC: Matt Turner <mattst@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-20 15:11:41 -07:00
Kenneth Graunke
d24819dce8 i965/vs: Add IR dumping for immediates.
This makes dump_instructions more useful.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-20 10:40:44 -07:00
Kenneth Graunke
095c3755ee glsl: Add built-in functions for GLSL 1.50.
This makes basic built-in functions work in GLSL 1.50.  It supports
everything except the new Geometry Shader functions.

The new 150.glsl file is 140.glsl plus ARB_texture_multisample.glsl;
150.frag is identical to 140.frag except for the #version bump.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-03-20 10:38:40 -07:00
Kenneth Graunke
bcdda04349 glsl: Add sampler2DMS/sampler2DMSArray types to GLSL 1.50.
GLSL 1.50 includes support for the new sampler types introduced by
the ARB_texture_multisample extension.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-03-20 10:38:38 -07:00
Kenneth Graunke
f1ca2ed538 glsl: Bump standalone compiler versions to 1.50.
The version bumps are necessary in order to compile built-ins for 1.50.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-03-20 10:38:20 -07:00
Kenneth Graunke
d86efc075e i965: Don't use texture swizzling to force alpha to 1.0 if unnecessary.
Commit 33599433c7 began setting the texture swizzle mode to XYZ1 for
RED, RG, and RGB textures in order to force alpha to 1.0 in case we
actually stored the texture as RGBA.

This had an unforeseen performance implication: the shader precompile
assumes that the texture swizzle mode will be XYZW for non-shadow
sampler types.  By setting it to XYZ1, this means every shader used with
a RED, RG, or RGB texture has to be recompiled.  This is a very common
case.

Unfortunately, there's no way to improve the precompile, since RGBA
textures still need XYZW, and there's no way to know by looking at
the shader source what texture formats might be used.

However, we only need to smash alpha to 1.0 if the texture's memory
format actually has alpha bits.  If not, the sampler already returns 1.0
for us without any special swizzling.  XRGB8888, for example, is a very
common case where this occurs.

This partially fixes a performance regression since commit 33599433c7.
More work is required to fully fix it in all cases.  This at least helps
Warsow.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Carl Worth <cworth@cworth.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-20 10:37:34 -07:00
Kenneth Graunke
2dd22130cd i965: Don't print a fatal-looking message if intelCreateContext fails.
With the old context creation mechanism, an application asked the GL to
give it a context.  Failing to produce a context was a fatal error.

Now, with GLX_ARB_create_context, the application can request a specific
version.  If it's higher than the maximum version we support, context
creation will fail.  But this is a normal error that applications
recover from.

In particular, the new glxinfo tries to create OpenGL 4.3, 4.2, 4.1,
4.0, 3.3, and 3.2 contexts before finally succeeding at creating a 3.1
context.  This led to it printing the following message 6 times:
"brwCreateContext: failed to init intel context"

There's no need to alarm users (and developers) with such a message.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-20 10:37:34 -07:00
Eric Anholt
1f112ccf02 i965/gen7: Align all depth miplevels to 8 in the X direction.
On an INTEL_DEBUG=perf piglit run on IVB, reduces the instances of "HW
workaround: blit" (the printouts from the misaligned-depth workaround
blits) from 725 to 675.

It doesn't totally eliminate the workaround blit, because we still have
problems with Y offsets that we can't fix (since texturing can only align
miplevels up to 2 or 4, not 8).

No regressions on piglit/es3conform on IVB.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-20 10:18:44 -07:00
Christoph Bumiller
529dbbfcf7 nvc0: fix max varying count, move CLIPVERTEX,FOG out of the way
The card spews an error if I use all 128 generic slots.
Apparently the real limit isn't just dictated by the address space
layout.
2013-03-20 12:25:21 +01:00
Christoph Bumiller
8acaf862df gallium: add TGSI_SEMANTIC_TEXCOORD,PCOORD v3
This makes it possible to identify gl_TexCoord and gl_PointCoord
for drivers where sprite coordinate replacement is restricted.

The new PIPE_CAP_TGSI_TEXCOORD decides whether these varyings
should be hidden behind the GENERIC semantic or not.

With this patch only nvc0 and nv30 will request that they be used.

v2: introduce a CAP so other drivers don't have to bother with
the new semantic

v3: adapt to the introduction of the gl_varying_slot enum
2013-03-20 12:25:21 +01:00
Ian Romanick
3eaf823b90 docs: import release notes for 9.1.1, add news item
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-19 17:46:30 -07:00
Kristian Høgsberg
939789e48d gallium-egl: Fix compile errors introduced in de315f76a
The commit changed API in a helper library shared by both egl_dri2 and
the gallium egl state tracker, but only egl_dri2 was updated to use the
new interface.

Tested-by: Giulio Camuffo <giuliocamuffo@gmail.com>
2013-03-19 20:17:47 -04:00
Paul Berry
995bbc2256 i965/fs: Avoid unnecessary recompiles due to POS bit of proj_attrib_mask.
Previous to this patch, when using fixed function fragment shading,
bit VARYING_BIT_POS of brw_wm_prog_key::proj_attrib_mask was being set
differently during precompiles and normal usage.  During precompiles
it was being set only if the fragment shader reads from window
position (which it never does), so it was always being set to 0.
During normal usage it was being set if the vertex shader writes to
all 4 components of gl_Position (which it usually does), so it was
usually being set to 1.  As a result, we were almost always doing an
extra recompile for the fixed function fragment shader.

The recompile was totally unnecessary, though, because
brw_wm_prog_key::proj_attrib_mask is only consulted for
fs_visitor::emit_general_interpolation(), which isn't used for
VARYING_SLOT_POS.

This patch avoids the unnecessary recompile by always setting bit
VARYING_BIT_POS of brw_wm_prog_key::proj_attrib_mask to 1.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-19 16:56:58 -07:00
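The recompile described in 995bbc2256 comes down to a program-cache key that differs in one bit between precompile and draw time. A minimal sketch, assuming the position bit is bit 0 of the mask and using hypothetical names (`struct prog_key`, `make_key`, `keys_match`) rather than the real brw_wm_prog_key:

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT_POS (UINT64_C(1) << 0)  /* assumed position of the POS bit */

/* Hypothetical, stripped-down program key. */
struct prog_key {
   uint64_t proj_attrib_mask;
};

/* Build a key; with force_pos_bit the POS bit is always set, which is
 * the behaviour the patch introduces. */
static struct prog_key make_key(uint64_t mask, bool force_pos_bit)
{
   struct prog_key key = { mask };
   if (force_pos_bit)
      key.proj_attrib_mask |= BIT_POS;
   return key;
}

/* A cached program is reused only when the keys are identical. */
static bool keys_match(const struct prog_key *a, const struct prog_key *b)
{
   return a->proj_attrib_mask == b->proj_attrib_mask;
}
```

Since the bit never influences the generated code for the position slot, pinning it to 1 on both paths lets the precompiled program be found at draw time.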
Paul Berry
db81d3b8f7 ff_fragment_shader: Don't do unnecessary (and dangerous) uniform setup.
Previously, right after calling _mesa_glsl_link_shader(), the fixed
function fragment shader code made several calls with the ostensible
purpose of setting up uniforms for the fragment shader it just
created.

These calls are unnecessary, since _mesa_glsl_link_shader() calls
driver->LinkShader(), which takes care of calling these functions (or
their equivalent).  Also, they are dangerous to call after
_mesa_glsl_link_shader() has returned, because on back-ends such as
i965 which do precompilation, _mesa_glsl_link_shader() may have
already cached pointers to the existing uniform structures; attempting
to set up the uniforms again invalidates those cached pointers.

It was only by sheer coincidence that this wasn't manifesting itself
as a bug.  It turns out that i965's precompile mechanism was always
setting bit 0 of brw_wm_prog_key::proj_attrib_mask to 0 for fixed
function fragment shaders, but during normal usage this bit usually
gets set to 1.  As a result, the precompiled shader (with its invalid
uniform pointers) was not being used.

I'm about to introduce some changes that cause bit 0 of
proj_attrib_mask to be set consistently between precompilation and
normal usage, so to avoid regressions I need to get rid of the
dangerous duplicate uniform setup code first.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-19 16:56:56 -07:00
Paul Berry
0af56c9d53 i965: Avoid unnecessary copy when depthstencil workaround invoked by clear.
Since apps typically begin rendering with a call to glClear(), it is
likely that when brw_workaround_depthstencil_alignment() moves a
miplevel to a temporary buffer, it can avoid doing a blit, since the
contents of the miplevel are about to be erased.

This patch adds the necessary plumbing to determine when
brw_workaround_depthstencil_alignment() is being called as a
consequence of glClear(), and avoids the unnecessary blit when it is
safe to do so.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: Eliminate unnecessary call to _mesa_is_depthstencil_format().  Fix
handling of depth buffer in depth/stencil format.

v3: Use correct bitfields for clear_mask.  Fix handling of depth
buffer in depth/stencil format when hardware uses separate stencil.
When invalidating, make sure we still reassociate the image to the new
miptree.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-19 16:56:51 -07:00
Alex Deucher
49c1fc7044 r600g: don't emit SQ_DYN_GPR_RESOURCE_LIMIT_1 on cayman
Doesn't exist on the asic and will cause a CS rejection
if VM is disabled.

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-19 18:13:27 -04:00
Alex Deucher
a9914117ea r600g: emit DB_SRESULTS_COMPARE_STATE0 on r6xx/r7xx
Not using HiS yet, but matches what we do on evergreen+.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-19 18:13:26 -04:00
Brian Paul
c45d22e26a winsys/svga: improve error/debug message output
Use vmw_printf() just for extra debugging info (off by default).
Use vmw_error() for real errors/failures/etc that we definitely
want to report.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-19 15:18:38 -06:00
Brian Paul
460a4444e8 tgsi: fix uninitialized declaration array fields
Fixes a few regressions since the TGSI array changes.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-19 15:15:37 -06:00
Kristian Høgsberg
1670737436 egl_dri2: Lower __DRI_IMAGE version requirement back to 1
We check the extension version manually instead and verify that we have
the createImageFromFds function before enabling prime fd passing.
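
A manual check of this kind might look like the sketch below. The struct layout and the names `image_ext_sketch`, `sketch_create_stub` and `can_enable_prime_fd` are hypothetical, not the real `__DRI_IMAGE` interface, and the minimum version is passed in as a parameter because the exact number is not stated here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a versioned extension table. */
struct image_ext_sketch {
    int version;
    void *(*create_image_from_fds)(void);  /* NULL when not provided */
};

/* Stub used only so the sketch has a non-NULL entry point to point at. */
static inline void *sketch_create_stub(void)
{
    return NULL;
}

/* Enable the feature only when the table is new enough AND the
 * entry point is actually filled in. */
static inline bool can_enable_prime_fd(const struct image_ext_sketch *ext,
                                       int min_version)
{
    return ext != NULL &&
           ext->version >= min_version &&
           ext->create_image_from_fds != NULL;
}
```

Checking the function pointer as well as the version guards against a table that reports a new version but leaves the entry empty.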
2013-03-19 16:13:38 -04:00
Maarten Lankhorst
7c3d8301af radeon/llvm: Do not link against libgallium when building statically.
NOTE: This is a candidate for the 9.1 branch.

Tested-by: Vincent Lejeune <vljn@ovi.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-03-19 20:20:33 +01:00
Matt Turner
322c840bea gles2: Add an ABI-check test
Checks that no functions are exported that are not part of the ABI.

Note that currently we are exporting functions that are aliased to
functions that are part of the ABI. They shouldn't be exported, but the
XML descriptions don't adequately describe this case.
2013-03-19 12:04:32 -07:00
Matt Turner
569bd281c1 gles1: Add an ABI-check test
Checks that no functions are exported that are not part of the ABI.

Note that currently we are exporting functions that are aliased to
functions that are part of the ABI. They shouldn't be exported, but the
XML descriptions don't adequately describe this case.
2013-03-19 12:04:31 -07:00
Andreas Boll
182895c4e6 gallium/egl: fix out-of-tree build
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/15-fix-oot-build.diff;h=7040999a22d3937d0578cfd85ee2c71d7dc614bb;hb=refs/heads/ubuntu%2B1

NOTE: This is a candidate for the 9.1 branch.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-19 18:12:38 +01:00
Andreas Boll
92e6260c19 osmesa: fix out-of-tree build
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/14-fix-osmesa-build.diff;h=00581d0e1833c5492d9050e1bf3d5e658cad782e;hb=refs/heads/ubuntu%2B1

v2: Move the added line immediately after -I$(top_srcdir)/src/mapi

NOTE: This is a candidate for the 9.1 and 9.0 branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-19 18:12:38 +01:00
Andreas Boll
06fff296e9 build: Enable x86 assembler on Hurd.
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/10-hurd-configure-tweaks.diff;h=984e17df1b8afdf8e4b36bee96aa5ab6a5691021;hb=refs/heads/ubuntu%2B1

Thanks to Pino Toscano.

v2: Don't bother with x86_64. AFAICT GNU/Hurd doesn't support it so far.

NOTE: This is a candidate for stable branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Acked-by: Matt Turner <mattst88@gmail.com>
2013-03-19 18:12:38 +01:00
Andreas Boll
7962f28c43 mesa: use ieee fp on s390 and m68k
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/02_use-ieee-fp-on-s390-and-m68k.patch;h=d3d6c1d7fec3c72ecf320706167deb61c52636c3;hb=refs/heads/ubuntu%2B1

Fixes Debian bug #349437.

Patch written by David Nusinow.

NOTE: This is a candidate for stable branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
2013-03-19 18:12:37 +01:00
Roland Scheidegger
5af7b45986 gallivm: fix return opcode handling in main function of a shader
If we're in some conditional or loop we must not return, or the code
after the condition is never executed.
(v2): And, we also can't just continue as nothing happened, since the
mask update code would later check if we actually have a mask, so we
need to remember that there was a return in main where we didn't exit
(to illustrate this, a ret in an if clause would cause a mask update
which is still ok as we're in a conditional, but after the endif the
mask update code would drop the mask hence bringing execution back to
pixels which should have their execution mask set to zero by the ret).
Thanks to Christoph Bumiller for figuring this out.

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=62357.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-19 18:04:05 +01:00
Rob Clark
afc1b7c21f freedreno: clear fixes
Some fixes for clearing only depth or only stencil.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-03-19 10:49:30 -04:00
Christian König
90862c8507 radeonsi: enable indirect addressing
Fixing 16 piglit tests.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-19 15:16:18 +01:00
Christian König
5e616cf2c5 radeonsi: implement indirect addressing of constants
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-19 15:16:18 +01:00
Christian König
f5298b0a65 radeonsi: switch to using resource descriptors for constants v2
v2: remove superfluous mask, use buffer_size instead of constant

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-19 15:16:18 +01:00
Christian König
c05483fc00 radeon/llvm: rework input fetch and output store
Cleanup the code and implement indirect addressing.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-19 15:16:18 +01:00
Brian Paul
b51f8593d8 tgsi: add initializer data to fix MSVC compile error 2013-03-19 07:55:48 -06:00
Christian König
897303f8ff tgsi: add ArrayID documentation v2
v2: further improve the text with comments from Christoph Bumiller.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Christian König
21190fbd56 tgsi: use separate structure for indirect address v2
To further improve the optimization of source and destination
indirect addressing we need the ability to store a reference
to the declaration of the addressed operands.

Since most of the fields in tgsi_src_register don't apply to
an indirect addressing operand, replace it with a separate
tgsi_ind_register structure and so make room for extra information.

v2: rename Declaration to ArrayID, put the ArrayID into () instead of []

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Christian König
16caeff2a5 tgsi: add ArrayID to declarations
Remember which declarations are declared as "arrays" and so
can be indirectly addressed. ArrayIDs start at 1 because, for
compatibility reasons, zero is treated as no array present.
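
The "zero means no array" convention can be sketched as follows. The struct and helper names (`decl_sketch`, `decl_is_array`, `next_array_id`) are hypothetical and do not reflect the actual TGSI declaration layout.

```c
#include <stdbool.h>

/* Hypothetical, simplified declaration record. */
struct decl_sketch {
    unsigned first;     /* first register index covered */
    unsigned last;      /* last register index covered */
    unsigned array_id;  /* 0 = not an array, >= 1 = array identifier */
};

/* A zero-initialised declaration is "not an array" by default, which
 * keeps code that never sets the field working unchanged. */
static inline bool decl_is_array(const struct decl_sketch *decl)
{
    return decl->array_id != 0;
}

/* Hand out identifiers starting at 1 so that 0 is never a valid ID. */
static inline unsigned next_array_id(unsigned *counter)
{
    *counter += 1;
    return *counter;
}
```

Starting the counter at zero and pre-incrementing means the first array declared receives ID 1, the second ID 2, and so on.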

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Christian König
d3e07bed90 tgsi: remove TGSI_FILE_(IMMEDIATE|TEMP)_ARRAY
Nobody seems to be using it, and only nv50 had a partial implementation.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Christian König
affdff230b glsl_to_tgsi: remove indirect addressing limitations
They shouldn't be necessary any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Christian König
3f67251e3d glsl_to_tgsi: allocate arrays separately v2
Instead of allocating everything as temporaries, use the
new array allocation functions.

v2: fix bug in simplify_cmp, declare arrays on demand

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Christian König
433b2ca46b glsl_to_tgsi: use get_temp for all allocations
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Christian König
506d400275 tgsi/ureg: implement support for array temporaries
Don't bother with free temporaries, just allocate them at
the end and also emit them in their own declaration.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:32 +01:00
Christian König
52947b93b2 tgsi/ureg: cleanup local temporary emission v2
Instead of emitting each temporary separately, emit them in a chunk.

v2: keep separate function for emitting temps

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-19 13:38:31 +01:00
Andreas Boll
36320bfa54 radeon/llvm: Link against libgallium.la to fix an undefined symbol
Ported from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/119-libllvmradeon-link.patch;h=ee47f8a07dbf33c32f8b57faed923680ed6648fb;hb=refs/heads/ubuntu%2B1

Fixes a regression introduced with
f70c385351

NOTE: This is a candidate for the 9.1 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62434
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-03-19 12:07:51 +01:00
Kristian Høgsberg
de315f76a2 wayland: Add prime fd passing as a buffer sharing mechanism
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
2013-03-18 21:15:41 -04:00
Kristian Høgsberg
2356e28452 Add dri image entry point for creating image from fd
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
2013-03-18 21:03:54 -04:00
Kristian Høgsberg
664fe6dc84 wayland: allocate a __DRIimage for the color buffer
No functional change here, but this will let us query the image
for an fd handle later.

Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
2013-03-18 21:03:46 -04:00
Rob Clark
4e8f5c52bb DRI2: HACK: no GLX_INTEL_swap_event if no ScheduleSwap
If ddx does not support swap, don't advertise it.  This is a hack to
work around current xservers which advertise this extension even when it
is clearly not supported.  When:

http://lists.x.org/archives/xorg-devel/2013-February/035449.html

is merged in upstream xserver and makes its way into most distros then
this hack can be removed.  In the meantime, it is required to allow
gnome-shell/clutter/etc to work properly with a DDX driver which does
not support ScheduleSwap.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-03-18 14:16:43 -04:00
Paul Berry
5a13e051d9 i965/blorp: Add INTEL_DEBUG=blorp flag.
This debug flag prints out the native GEN assembly for a blitting
shader produced using BLORP.  Hopefully this should be useful in
developing additional BLORP features.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-18 09:27:25 -07:00
Alex Deucher
2da8ee16a8 r600g: properly set non_disp tiling mode for DMA (v2)
Needs to be set for depth, stencil, and fmask just
like other blocks.

v2: drop additional cayman bits for now

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-17 13:32:48 -04:00
Alex Deucher
4409758a04 r600g: Use blitter rather than DMA for 128bpp on cayman (v3)
On cayman, 128bpp surfaces require non_disp ordering for hw
access to both linear and tiled surfaces.  When we use the 3D
engine we can set the non_disp ordering on both the tiled and
linear sides (via CB or texture), but when we use the DMA
engine, we can only set the non_disp ordering on the tiled
side, so after a L2T operation with the DMA engine, the data
ends up in the wrong order on the tiled side.

v2: cayman/TN only

v3: fix comments

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=60802

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-17 13:32:48 -04:00
Paul Berry
346a1b9bb9 i965: Simplify separate stencil check
The only format returned by _mesa_get_format_base_format() that
satisfies _mesa_is_depthstencil_format() is GL_DEPTH_STENCIL, so we
can simplify the check.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-16 10:15:51 -07:00
Maarten Lankhorst
f70c385351 gallium/build: Fix visibility CFLAGS in automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Fix formatting - use one CFLAG per line

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59238
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-03-16 12:45:22 +01:00
José Fonseca
49ae9b08d4 scons: Warn when using MSVS versions prior to 2012.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-15 19:55:54 +00:00
Paul Berry
c5d5827951 i965: Apply depthstencil alignment workaround when doing fast clears.
Fast depth clears have the same depth/stencil alignment requirements
as other drawing operations.  Therefore, we need to call
brw_workaround_depthstencil_alignment() from both the clear and
drawing paths.

Without this fix, we get image corruption if the following conditions
hold: (a) the first ever drawing operation to a depth miplevel (or the
first drawing operation after having used the texture for sampling) is
a clear, (b) the depth miplevel has a size that is eligible for fast
depth clears, and (c) the depth miplevel has an offset within the
miptree that isn't 8x8 aligned.
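
Condition (c) amounts to a simple alignment test, sketched here. The helper name `offset_is_8x8_aligned` is hypothetical; the sketch only assumes that "8x8 aligned" means both offsets are multiples of 8.

```c
#include <stdbool.h>

/* Returns true when both the x and y offsets of a miplevel within its
 * miptree are multiples of 8. 8 is a power of two, so the low three
 * bits of an aligned offset are all zero. */
static inline bool offset_is_8x8_aligned(unsigned x_offset, unsigned y_offset)
{
    return (x_offset & 7u) == 0 && (y_offset & 7u) == 0;
}
```

A miplevel whose offset fails this test is the case where the workaround has to run before the clear.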

Fixes piglit "depthstencil-render-miplevels" tests with size 273.

NOTE: This is a candidate for stable branches

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-15 11:52:33 -07:00
Paul Berry
eed6baf762 Replace gl_frag_attrib enum with gl_varying_slot.
This patch makes the following search-and-replace changes:

gl_frag_attrib -> gl_varying_slot
FRAG_ATTRIB_* -> VARYING_SLOT_*
FRAG_BIT_* -> VARYING_BIT_*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:26:17 -07:00
Paul Berry
f117abe664 Get rid of _mesa_frag_attrib_to_vert_result().
Now that there is no difference between the enums that represent
vertex outputs and fragment inputs, there's no need for a conversion
function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:26:07 -07:00
Paul Berry
10a131211e Get rid of _mesa_vert_result_to_frag_attrib().
Now that there is no difference between the enums that represent
vertex outputs and fragment inputs, there's no need for a conversion
function.  But we still need to be able to detect when a given vertex
output has no corresponding fragment input.  So it is replaced by a
new function, _mesa_varying_slot_in_fs(), which tells whether the
given varying slot exists as an FS input or not.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:25:57 -07:00
Paul Berry
827c074fb1 mtypes.h: Modify gl_frag_attrib to refer to new gl_varying_slot enum.
This paves the way for eliminating the gl_frag_attrib enum entirely.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:25:46 -07:00
Paul Berry
a6d807c86f Replace gl_geom_result enum with gl_varying_slot.
This patch makes the following search-and-replace changes:

gl_geom_result -> gl_varying_slot
GEOM_RESULT_* -> VARYING_SLOT_*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:25:36 -07:00
Paul Berry
d453225efc mtypes.h: Modify gl_geom_result to refer to new gl_varying_slot enum.
This paves the way for eliminating the gl_geom_result enum entirely.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:25:26 -07:00
Paul Berry
d7c60a4a4f Replace gl_geom_attrib enum with gl_varying_slot.
This patch makes the following search-and-replace changes:

gl_geom_attrib -> gl_varying_slot
GEOM_ATTRIB_* -> VARYING_SLOT_*
GEOM_BIT_* -> VARYING_BIT_*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:25:15 -07:00
Paul Berry
094bcf399c mtypes.h: Modify gl_geom_attrib to refer to new gl_varying_slot enum.
This paves the way for eliminating the gl_geom_attrib enum entirely.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:25:05 -07:00
Paul Berry
36b252e947 Replace gl_vert_result enum with gl_varying_slot.
This patch makes the following search-and-replace changes:

gl_vert_result -> gl_varying_slot
VERT_RESULT_* -> VARYING_SLOT_*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:24:54 -07:00
Paul Berry
9e729a79b0 mtypes.h: Modify gl_vert_result to refer to new gl_varying_slot enum.
This paves the way for eliminating the gl_vert_result enum entirely.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:24:44 -07:00
Paul Berry
8a076c5f05 mtypes.h: Add new gl_varying_slot enum, and bitfield defines.
Future patches will make use of the enum.  It will eventually take the
place of the existing enums gl_vert_result, gl_geom_attrib,
gl_geom_result, and gl_frag_attrib, all of which represent essentially
the same information but using inconsistent values.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:24:34 -07:00
Paul Berry
6bec74bfd9 i965: Change fragment input related bitfields to 64-bit.
This patch updates the bitfields brw_context::wm.input_size_masks,
tracker::size_masks, and brw_wm_prog_key::proj_attrib_mask, all of
which are indexed by gl_frag_attrib, from 32-bit to 64-bit.

This paves the way for supporting geometry shaders, and for merging
the gl_frag_attrib and gl_vert_result enums.  The combination of these
two will require at least 55 bits in the bitfields.
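
A minimal sketch of why 32 bits is not enough once slot indices can reach 32 or more. The helper names (`slot_bit64`, `mask_has_slot`, `truncate_mask32`) are hypothetical and are not Mesa identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit for a given slot in a 64-bit mask. The shift is performed on a
 * 64-bit operand so that slots >= 32 are well defined. */
static inline uint64_t slot_bit64(unsigned slot)
{
    return UINT64_C(1) << slot;
}

static inline bool mask_has_slot(uint64_t mask, unsigned slot)
{
    return (mask & slot_bit64(slot)) != 0;
}

/* Storing the mask in a 32-bit field silently drops every slot >= 32. */
static inline uint32_t truncate_mask32(uint64_t mask)
{
    return (uint32_t)mask;
}
```

With 55 bits needed, slot 54 is representable in the 64-bit mask but disappears entirely after truncation to 32 bits.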

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
2013-03-15 09:24:30 -07:00
Alex Deucher
03eef7f8ef r600g: add Richland APU pci ids
Note: this is a candidate for the stable branches.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-15 09:24:14 -04:00
Brian Paul
fec8733d4e st/dri: add support for the always_have_depth_buffer option
This involved adding another driOptionCache to dri_screen.  The
existing one just held the default values.  But now we also need
to have the values from the DRI config file so that we can get at
the always_have_depth_buffer config option, which is per-screen.
2013-03-15 07:05:01 -06:00
Brian Paul
5d1b3097e2 driconf: add a miscellaneous section and always_have_depth_buffer option
This option is needed for some applications that neglect to request
a depth buffer when choosing a visual/fbconfig.

The Linux app Topogun is an example of this problem.
2013-03-15 07:04:13 -06:00
Brian Paul
b3d184bac6 driconf: reorder options, reformat comments, etc
Move the options into the proper section (Debug, Quality, Performance,
etc).

Update comments and add some whitespace to improve readability.
2013-03-15 07:04:08 -06:00
Philipp Brüschweiler
c07c18081e wayland: fix segfault when using software rendering
wayland_roundtrip() was given an incorrect parameter.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=62362

Note: This is a candidate for the stable branches.

Signed-off-by: Brian Paul <brianp@vmware.com>
2013-03-15 06:50:23 -06:00
Brian Paul
f4a2c29d93 softpipe: fix up NUM_ENTRIES confusion
There were two different NUM_ENTRIES #defines for the framebuffer
tile cache and the texture tile cache.  Rename the latter to fix
the warnings:

In file included from sp_flush.c:40:0:
sp_tex_tile_cache.h:76:0: warning: "NUM_ENTRIES" redefined
sp_tile_cache.h:78:0: note: this is the location of the previous definition
In file included from sp_context.c:50:0:
sp_tex_tile_cache.h:76:0: warning: "NUM_ENTRIES" redefined
sp_tile_cache.h:78:0: note: this is the location of the previous definition

Also, replace occurrences of NUM_ENTRIES with Element() macro to
be safer.
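
The reason an element-count macro is safer than a shared constant can be sketched like this. `ARRAY_COUNT`, `tile_cache_sketch` and `cache_entry_count` are hypothetical names used only for illustration, and the array length 50 is arbitrary.

```c
#include <stddef.h>

/* Element count derived from the array itself, so it cannot drift out
 * of sync with the declaration or collide with another header's
 * constant of the same name. Only valid for true arrays, not pointers. */
#define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical cache with a fixed number of entries. */
struct tile_cache_sketch {
    int entries[50];
};

static inline size_t cache_entry_count(const struct tile_cache_sketch *cache)
{
    return ARRAY_COUNT(cache->entries);
}
```

If the array length in the struct changes, every loop bounded by `ARRAY_COUNT(cache->entries)` follows automatically.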

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-14 18:17:18 -06:00
Brian Paul
2f6970ae97 st/osmesa: silence some optimized build warnings 2013-03-14 18:09:42 -06:00
Brian Paul
6a9d7659d6 draw: init pre_clip_pos = NULL to fix optimized build warning 2013-03-14 18:09:42 -06:00
Brian Paul
622b1fcc18 glx: init screen = 0 to fix optimized build warning 2013-03-14 18:09:42 -06:00
Kenneth Graunke
91df4d746b i965: Make INTEL_DEBUG=shader_time use the RAW surface format.
Untyped Atomic Operation messages are illegal for non-RAW formats.  The
IVB hardware proceeds happily (after all, who cares what the format of the
surface is if you're doing untyped ops on it?), but later hardware
apparently doesn't.  The simulator for gen7 does complain, though.

v2: Rebase against updates to previous patches. (by anholt)

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-14 12:30:40 -07:00
Kenneth Graunke
125b34cffb i965: Specialize SURFACE_STATE creation for shader time.
This is basically a copy and paste of gen7_create_constant_surface, but
with the parameters filled in to offer a simpler interface.

It will diverge shortly.

I didn't bother adding it to the vtable for now since shader time is only
exposed on Gen7+.

v2: Replace tabs in the new code (by anholt)
    Add back dropped memset() and add a comment about HSW channel selects.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-14 12:30:40 -07:00
Kenneth Graunke
f27a220cad i965: Fix INTEL_DEBUG=shader_time for Haswell.
Haswell's "Data Cache" data port is a single unit, but split into two
SFIDs to allow for more message types without adding more bits in the
message descriptor.

Untyped Atomic Operations are now message 0010 in the second data cache
data port, rather than 6 in the first.

v2: Use the #defines from the previous commit. (by anholt)

NOTE: This is a candidate for the 9.1 branch.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
2013-03-14 12:30:40 -07:00
Eric Anholt
a2d08f170a i965: Add definitions for gen7+ data cache messages.
We were sparsely using some of these message types, but I'll just fill
them all in now.  It will be used for fixing shader_time on HSW.

v2: Add missing MEDIA_BLOCK_READ.

NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-14 12:30:39 -07:00
Eric Anholt
db3a0f13ef i965: Split shader_time entries into separate cachelines.
This avoids some snooping overhead between EUs processing separate shaders
(so VS versus FS).
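
The idea can be sketched with offsets. The 64-byte cacheline size and all names here are assumptions made for illustration; the actual stride used by the driver is not stated in this message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed cacheline size in bytes; the real value is hardware specific. */
#define SKETCH_CACHELINE_BYTES 64u

/* Tightly packed layout: neighbouring entries share a cacheline. */
static inline size_t entry_offset_packed(size_t index, size_t entry_bytes)
{
    return index * entry_bytes;
}

/* Padded layout: each entry starts on its own cacheline. */
static inline size_t entry_offset_padded(size_t index)
{
    return index * SKETCH_CACHELINE_BYTES;
}

static inline bool same_cacheline(size_t offset_a, size_t offset_b)
{
    return offset_a / SKETCH_CACHELINE_BYTES ==
           offset_b / SKETCH_CACHELINE_BYTES;
}
```

With the padded layout, writers updating different entries no longer touch the same cacheline, at the cost of extra memory per entry.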

Improves performance of a minecraft trace with shader_time by 28.9% +/-
18.3% (n=7), and performance of my old GLSL demo by 93.7% +/- 0.8% (n=4).

v2: Add a define for the stride with a comment explaining its units and
    why.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-14 12:30:39 -07:00
José Fonseca
a35a19a6ea scons: Define _ALLOW_KEYWORD_MACROS on MSVC builds.
scons/llvm.py defines inline globally to workaround issues with LLVM C
binding headers, so the only way to avoid
aggravating xkeycheck.h errors is to set _ALLOW_KEYWORD_MACROS.

This fixes MSVC 2012 build with LLVM.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-14 19:01:10 +00:00
José Fonseca
6a3d77e13d softpipe: Shrink context size.
- each softpipe_tex_tile_cache is 50*64*64*4*4 = 3,276,800 bytes
- each softpipe_context has 3*32 softpipe_tex_tile_cache, i.e., each softpipe
  context is 314,572,800 bytes, i.e., 300MB

That is, in a 32-bit process (around 3GB virtual memory max), we can
only fit 10 contexts.
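
The arithmetic above can be checked with a short sketch. The function names are hypothetical; the factors are the ones quoted in this message, and "3GB" is taken to mean 3 GiB.

```c
#include <stdint.h>

/* Size of one texture tile cache, using the factors quoted above. */
static inline uint64_t tex_tile_cache_bytes(void)
{
    return UINT64_C(50) * 64 * 64 * 4 * 4;
}

/* Each context holds 3 * 32 texture tile caches. */
static inline uint64_t context_bytes(void)
{
    return UINT64_C(3) * 32 * tex_tile_cache_bytes();
}

/* Whole contexts that fit in a given amount of address space. */
static inline uint64_t contexts_that_fit(uint64_t address_space_bytes)
{
    return address_space_bytes / context_bytes();
}
```

Calling `contexts_that_fit(UINT64_C(3) << 30)` reproduces the figure of 10 contexts for roughly 3GB of virtual memory.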

This change is a short-term hack to shrink the context size.  Longer
term we'll need to change how the texture cache works.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-14 11:59:53 +00:00
Christian König
ce3aa0e775 radeon/llvm: fix LLVM dependencies
Since commit 1c4f283151 we obviously depend on this.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-14 12:38:54 +01:00
Anuj Phogat
d78dcdf103 mesa: Fix FB blitting in case of zero size src or dst rect
Framebuffer blitting operation should be skipped if any of the
dimensions (width/height) of src/dst rect is zero.
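
The check can be sketched as below. The helper name `blit_is_empty` is hypothetical and this is not the code in _mesa_BlitFramebuffer; it only assumes rectangles are given as corner coordinates, as in glBlitFramebuffer.

```c
#include <stdbool.h>

/* A rectangle given by two corners has zero width or height exactly
 * when the matching coordinates are equal. Comparing for equality also
 * copes with flipped rectangles, where X1 < X0 or Y1 < Y0. */
static inline bool blit_is_empty(int srcX0, int srcY0, int srcX1, int srcY1,
                                 int dstX0, int dstY0, int dstX1, int dstY1)
{
    return srcX0 == srcX1 || srcY0 == srcY1 ||
           dstX0 == dstX1 || dstY0 == dstY1;
}
```

As noted in V2 below, a check like this belongs after the error checks so that invalid calls still report their errors.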

V2: Move the dimension check after error checking in _mesa_BlitFramebuffer.

Fixes: fbblit(negative.nullblit.zeroSize) in Intel oglconform
https://bugs.freedesktop.org/show_bug.cgi?id=59495

Note: Candidate for all the stable branches.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-03-13 17:58:09 -07:00
Roland Scheidegger
1826659272 tgsi: fix sample_d emit for arrays
Those cases were apparently forgotten.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-14 00:22:55 +01:00
Roland Scheidegger
9e93d7c4fd llvmpipe: don't assert when trying to render to surfaces with multiple layers
Instead, just warn when creating the surface; rendering will simply happen
to the first layer.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-14 00:22:30 +01:00
Roland Scheidegger
81e728982d softpipe: don't assert when creating surfaces with multiple layers
We can't handle them yet, however we can safely just warn (we will
just render to the first layer, which is fine since we can't handle
the rendertarget system value either).
Also make behavior more predictable with buffer surfaces
(it would sometimes hit bogus asserts because of the union in the surface,
instead create the surface but assert when trying to set a buffer
in the framebuffer).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-14 00:21:56 +01:00
José Fonseca
4889315619 llvmpipe: Fix geometry shader token leak.
Trivial. Matches softpipe's code.
2013-03-13 21:46:50 +00:00
Tom Stellard
c95177ea88 radeon/llvm: Add missing license headers
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-13 16:01:31 +00:00
Tom Stellard
1c4f283151 radeon/llvm: Make radeon_llvm_util.cpp a C file
All the functions in this file are now implemented in C.
2013-03-13 16:01:31 +00:00
Tom Stellard
3958c104c6 radeon/llvm: Optimize radeon_llvm_strip_unused_kernels()
Just delete unused kernels rather than marking them as internal and
running the GlobalDCE pass.

Also implement this function in C and inline it into
radeon_llvm_get_kernel_module()
2013-03-13 16:01:31 +00:00
Tom Stellard
2ace79dce5 radeon/llvm: Implement radeon_llvm_get_kernel_module() using the C API 2013-03-13 16:01:31 +00:00
Tom Stellard
b34b8576ec radeon/llvm: Implement radeon_llvm_get_num_kernels() using the C API 2013-03-13 16:01:31 +00:00
Tom Stellard
7e9abbea15 radeon/llvm: Implement radeon_llvm_parse_bitcode() using C API
Also make the function static since it is not used anywhere else.
2013-03-13 16:01:30 +00:00
Tom Stellard
97bfcddde0 r600g/llvm: Move llvm wrapper functions into the radeon directory 2013-03-13 16:01:30 +00:00
Jon TURNEY
28e1693630 Properly check GLX_INDIRECT_RENDERING in glapi/tests/check_table
Actually use $DEFINES, so we can see if GLX_INDIRECT_RENDERING is defined

If GLX_INDIRECT_RENDERING is defined,  _GLAPI_SKIP_PROTO_ENTRY_POINTS will
be defined, and libglapi won't contain the 'protocol entry points', so we
should provide stubs in check_table.cpp
2013-03-13 14:55:52 +00:00
Jon TURNEY
ed8ddd57e9 Fix glapi/tests/check_table.cpp for standardized OpenGL function names
It looks like this has been broken since commit
1a1db1746d "Standardize names of OpenGL
functions."

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
2013-03-13 14:53:49 +00:00
Jon TURNEY
c7a319182f Fix out-of-tree build of 'make check' in src/mapi/glapi/tests/
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
2013-03-13 14:53:36 +00:00
José Fonseca
cff70dcfb2 scons: Define PACKAGE_VERSION/BUGREPORT globally.
Fixes the scons build.
2013-03-13 13:13:37 +00:00
Vinson Lee
a6bb7a9495 tests: Add $(top_srcdir)/include to AM_CPPFLAGS.
Fixes this build error with make check.

  CC     collision.o
In file included from ../../../../../src/mesa/main/hash_table.h:34:0,
                 from collision.c:31:
../../../../../src/mesa/main/compiler.h:51:53: fatal error: c99_compat.h: No such file or directory

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-03-12 23:14:39 -07:00
José Fonseca
f7ef83cdf4 scons: Define PACKAGE_xxx
Should get the builds going again.
2013-03-13 01:29:47 +00:00
Brian Paul
6f86b934e6 docs: rewrite the OSMesa info / instructions
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-12 19:04:43 -06:00
Brian Paul
79eac7da6b configure: wire-up new OSMesa gallium state tracker and target
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-12 19:04:43 -06:00
Brian Paul
be51f123c9 target/osmesa: add new Makefile.am
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-12 19:04:43 -06:00
Brian Paul
94263da46e targets/osmesa: new OSMesa gallium target
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-12 19:04:43 -06:00
Brian Paul
7114b6a92d st/osmesa: add new Makefile.am
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-12 19:04:43 -06:00
Brian Paul
73436a909e st/osmesa: new OSMesa gallium state tracker
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-12 19:04:43 -06:00
Brian Paul
3c3668c5a1 st/mesa: add PIPE_FORMAT_R16G16B16A16_UNORM renderbuffer support
To allow rendering in 16-bit/channel RGBA buffers.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-12 19:04:42 -06:00
José Fonseca
c526e1728f scons: Re-add ',' 2013-03-13 00:31:03 +00:00
José Fonseca
7bff1cc3f6 autotools: Add missing top-level include dir.
Fixes autotools build failure.  Not sure if there are more, as I have
difficulties in building the full tree.
2013-03-13 00:25:09 +00:00
Matt Turner
5c6e1e97b3 configure.ac: Alphabetize freedreno makefiles. 2013-03-12 17:09:55 -07:00
Matt Turner
d89ef39418 build: Get rid of dead MESA_ASM_FILES variable
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-12 17:02:54 -07:00
Matt Turner
bd0c9d07d0 mesa/build: Get rid of dead ALL_FILES variable
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-12 17:02:47 -07:00
Matt Turner
51e065a96c xmlpool/.gitignore: Remove 'Makefile'
Handled by top level .gitignore.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-12 17:02:40 -07:00
Matt Turner
e59fc3faa5 mesa: Use PACKAGE_BUGREPORT macro.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-12 17:02:33 -07:00
Matt Turner
9065bab37e mesa: Remove unused version #defines from version.h.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-12 17:02:28 -07:00
Matt Turner
439c3d4e31 mesa: Replace MESA_VERSION with PACKAGE_VERSION.
One fewer place to have to update.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-12 17:02:21 -07:00
Zack Rusin
42c1b33f6d draw/so: Fix stream output with geometry shaders
If a geometry shader is present, its stream output info should
be used instead of the VS's, and we shouldn't use the pre-clipped
coordinates.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-12 16:22:26 -07:00
José Fonseca
57cd1d1454 include: Fix build with VS 11 (i.e, 2012).
NOTE: Candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-12 22:07:10 +00:00
José Fonseca
70fe7c6d3e mesa,gallium,egl,mapi: One definition of C99 inline/__func__ to rule them all.
We were in four already...

NOTE: Candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-12 22:06:27 +00:00
José Fonseca
96b3ca89b1 scons: Allows choosing VS 10 or 11.
NOTE: Candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-12 22:04:04 +00:00
Michel Dänzer
4dca602521 radeonsi: Fix off-by-one for maximum vertex element index in some cases
In cases where the vertex element size is smaller than the vertex buffer
stride, the previous calculation could end up 1 too low. This would result
in the GPU using index 0 instead of the maximum index for those elements,
which would be visible as intermittent distorted triangles.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-12 18:25:54 +01:00
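The off-by-one described in 4dca602521 can be illustrated with a minimal sketch. The function names are hypothetical, and the "naive" form is only an assumption about the kind of calculation that goes wrong, not the driver's previous code. Counting whole strides ignores that the last element only needs `elem_size` bytes, not a full stride.

```c
#include <assert.h>

/* Largest index obtained by counting whole strides in the buffer.
 * This underestimates when elem_size < stride and the buffer ends
 * with a partial stride that still holds a complete element. */
static unsigned max_index_naive(unsigned buf_size, unsigned offset,
                                unsigned stride)
{
   assert(stride > 0 && buf_size >= offset + stride);
   return (buf_size - offset) / stride - 1;
}

/* Largest index i such that offset + i * stride + elem_size <= buf_size. */
static unsigned max_index_exact(unsigned buf_size, unsigned offset,
                                unsigned stride, unsigned elem_size)
{
   assert(stride > 0 && buf_size >= offset + elem_size);
   return (buf_size - offset - elem_size) / stride;
}
```

For a 40-byte buffer with a 16-byte stride and 8-byte elements, elements start at bytes 0, 16 and 32, so the largest valid index is 2, while the stride-counting form yields 1.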
Christoph Bumiller
8aa8b0539e nvc0: avoid crash on updating RASTERIZE_ENABLE state
When doing a blit with the 3D engine, the rasterizer or zsa cso may
be NULL.
2013-03-12 12:55:37 +01:00
Christoph Bumiller
4d28aff48f gallium/tests: check format in compute tests, make selectable 2013-03-12 12:55:37 +01:00
Christoph Bumiller
e2dded78ea nvc0: add MP trap handler for nve4 2013-03-12 12:55:37 +01:00
Christoph Bumiller
ae59a7d35d nvc0: they removed the NTID,NCTAID,GRIDID registers on nve4 2013-03-12 12:55:37 +01:00
Christoph Bumiller
e066f2f62f nvc0: implement compute support for nve4 2013-03-12 12:55:37 +01:00
Christoph Bumiller
75f1f852b0 nvc0/ir: try to fix CAS (CompareAndSwap) 2013-03-12 12:55:37 +01:00
Christoph Bumiller
18fdfbdc32 nv50/ir: add CCTL (cache control) op 2013-03-12 12:55:37 +01:00
Christoph Bumiller
9db7e09cb4 nvc0/ir/emit: fix emission of large address offsets 2013-03-12 12:55:36 +01:00
Christoph Bumiller
175c185941 nvc0: add SHADER/COMPUTE_RESOURCE bind flags to format table 2013-03-12 12:55:36 +01:00
Christoph Bumiller
19ea0bd521 nouveau: align PIPE_BIND_SHADER,COMPUTE_RESOURCEs to 256 bytes 2013-03-12 12:55:36 +01:00
Christoph Bumiller
47f2179844 nv50,nvc0: copy writable flag on surface creation 2013-03-12 12:55:36 +01:00
Christoph Bumiller
7a91d3a2a4 nv50/ir: add support for different sampler and resource index on nve4
And remove non-working code for indirect sampler/resource selection.
Will be added back later.

Includes code from "nv50/ir/tgsi: Resource indirect indexing" by
Francisco Jerez (when mixing the R and S handles we can only specify
them via a register, i.e. indirectly, unless we upload all the used
handle combinations to c[] space, which we don't for now).
2013-03-12 12:55:36 +01:00
Christoph Bumiller
99e4eba669 nv50/ir: implement splitting of 64 bit ops after RA 2013-03-12 12:55:36 +01:00
Christoph Bumiller
ac9f19e485 nvc0/ir: skip back edges when determining latest sched value 2013-03-12 12:55:36 +01:00
Christoph Bumiller
f07c46a4f4 nvc0/ir: use large issue delay after RET, too 2013-03-12 12:55:36 +01:00
Christoph Bumiller
b23ec3f8ba nv50/ir: fix size adjustment for sched info for multiple functions 2013-03-12 12:55:36 +01:00
Christoph Bumiller
d39169cb6d nv50/ir: print function inputs and outputs 2013-03-12 12:55:36 +01:00
Christoph Bumiller
1b4faa2b17 nv50/ir/ssa: add a few comments regarding RenamePass 2013-03-12 12:55:36 +01:00
Francisco Jerez
1535b754fb nv50/ir/tgsi: Exclude local declarations from function prototypes. 2013-03-12 12:55:36 +01:00
Christoph Bumiller
9b563ef3f7 nv50/ir/opt: try to make use of SUCLAMP addend 2013-03-12 12:55:36 +01:00
Christoph Bumiller
a788be19e5 nv50/ir: don't assert on type in Modifier.applyTo if it is 0 2013-03-12 12:55:35 +01:00
Christoph Bumiller
c3a5bc0bdf nv50/ir: add support for barriers
nv50 part by Francisco Jerez.
2013-03-12 12:55:35 +01:00
Christoph Bumiller
a0a25191f2 nv50/ir/tgsi: add support for atomics 2013-03-12 12:55:35 +01:00
Christoph Bumiller
c2dfcd7f0e nv50/ir/tgsi: handle TGSI_OPCODE_LOAD,STORE
Squashed and (heavily) modified original patches by Francisco Jerez:
nv50/ir/tgsi: Implement resource LOAD/STORE (wip).
nv50/ir/tgsi: Emit SUST/SULD for surface access, and add CB LOAD/STORE support
nv50/ir/tgsi: Fix/clean up the LOAD/STORE handling code.

Left out for now:
nv50/ir/tgsi: Resource indirect indexing

Treating raw, read-only surfaces as constant buffers (CBs) was removed
because CBs are limited to a size of 64 KiB, which isn't desirable, and
because this decision should probably be made by the state tracker.
If we used a number of CB slots for surfaces, it might find that we
cannot accommodate the advertised limit.
2013-03-12 12:55:35 +01:00
Christoph Bumiller
d105b3df14 nvc0/ir: don't replace load from input in COMPUTE progs with VFETCH 2013-03-12 12:55:35 +01:00
Christoph Bumiller
4506ed28de nvc0/ir: implement lowering of surface ops for nve4 2013-03-12 12:55:35 +01:00
Christoph Bumiller
8ac68b071d nvc0/ir: add formatted surface load lib code, move to extra header
OpenGL is nice and makes the user specify a format with an image unit.
OpenCL is evil and doesn't, and what's better than adding a huge load
of functions that we call indirectly to handle the conversion ?
2013-03-12 12:55:35 +01:00
Christoph Bumiller
ce1951daed nv50/ir: extend moveSources for delta < 0 2013-03-12 12:55:35 +01:00
Christoph Bumiller
c0fc3463e9 nvc0/ir: lower atomics in s[] 2013-03-12 12:55:35 +01:00
Christoph Bumiller
9c196779bc nvc0/ir/emit: implement INSBF, EXTBF, PERMT and ATOM 2013-03-12 12:55:35 +01:00
Christoph Bumiller
c8f0c43f7a nv50/ir/emit: handle OP_ATOM 2013-03-12 12:55:35 +01:00
Christoph Bumiller
d6c95f6819 nvc0/ir/target: some ops can't be predicated, e.g. CALL 2013-03-12 12:55:35 +01:00
Christoph Bumiller
1ed507ca46 nv50/ir/opt: CALLs cannot load 2013-03-12 12:55:35 +01:00
Christoph Bumiller
c893b94060 nv50/ir: add support for indirect BRA,CALL 2013-03-12 12:55:34 +01:00
Christoph Bumiller
efe55075b5 nvc0/ir/emit: implement move to and logic ops on predicates 2013-03-12 12:55:34 +01:00
Christoph Bumiller
ce7610f7d5 nvc0/ir/emit: implement surface related ops 2013-03-12 12:55:34 +01:00
Christoph Bumiller
3741b7d844 nv50/ir: initialize CodeEmitters' specialized target fields 2013-03-12 12:55:34 +01:00
Christoph Bumiller
b0fc2f13ec nv50/ir/opt: make optimization aware of atomics, barriers, surface ops 2013-03-12 12:55:34 +01:00
Christoph Bumiller
22b762f9b4 nv50/ir: add various new OPs that will be needed for compute 2013-03-12 12:55:34 +01:00
Francisco Jerez
c82714c593 nv50/ir: Rename "mkLoad" to "mkLoadv" for consistency. 2013-03-12 12:55:34 +01:00
Christoph Bumiller
cc30ce8160 nv50/ir: fix comparison of system values 2013-03-12 12:55:34 +01:00
Francisco Jerez
4ddfdcea04 nv50/ir/tgsi: Translate grid-related system parameters. 2013-03-12 12:55:34 +01:00
Francisco Jerez
8446c31d0e nv50/ir/tgsi: Accept COMPUTE programs. 2013-03-12 12:55:34 +01:00
Christoph Bumiller
e9294e11b4 nv50/ir/ra: make sure all used function inputs get assigned a reg
A live range [0, 0) counts as empty. For function inputs this can
be a problem, so insert a nop at the beginning to make it [0, 1).
This is a bit of a hack but also the most simple solution.
2013-03-12 12:55:34 +01:00
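Why an empty live range is a problem in e9294e11b4 can be sketched with half-open intervals. The struct and function names below are hypothetical and not taken from the nv50 register allocator. A range `[0, 0)` overlaps nothing, so an allocator that only looks at overlaps would never see it as interfering.

```c
#include <assert.h>
#include <stdbool.h>

/* Half-open live range [begin, end). Hypothetical type for illustration. */
struct live_range {
   int begin;
   int end;
};

static bool range_is_empty(struct live_range r)
{
   return r.begin >= r.end;
}

/* Two half-open ranges overlap if each starts before the other ends. */
static bool ranges_overlap(struct live_range a, struct live_range b)
{
   assert(a.begin <= a.end && b.begin <= b.end);
   return a.begin < b.end && b.begin < a.end;
}
```

With a nop inserted at the start of the function, an input that was `[0, 0)` becomes `[0, 1)` and now overlaps any other value live at instruction 0.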
Christoph Bumiller
ee431b12ec nv50/ir/ra: also add pre-existing MERGE,SPLIT to constraint list 2013-03-12 12:55:34 +01:00
Christoph Bumiller
f1dfa414f4 nv50/ir/ra: fix confusion with conditional RegisterSet::occupy 2013-03-12 12:55:34 +01:00
Christoph Bumiller
d995f44f0b nv50/ir/ra: swap copyCompound args if src is compound and dst isn't 2013-03-12 12:55:33 +01:00
Francisco Jerez
95ad9bca2f nv50/ir/ra: Fix maxGPR calculation for programs with multiple functions. 2013-03-12 12:55:33 +01:00
Francisco Jerez
ca04e71024 nv50/ir/ra: Fix traversal before the beginning of the active list in buildRIG. 2013-03-12 12:55:33 +01:00
Francisco Jerez
fe17d8a7c0 nv50/ir/ra: Fix RegisterSet::occupy(const Value *v). 2013-03-12 12:55:33 +01:00
Francisco Jerez
49ded0e132 nv50/ir/ra: Fix argument const-ness in RegisterSet::idToUnits and idToBytes 2013-03-12 12:55:33 +01:00
Francisco Jerez
5959d4247a nv50/ir/opt: Fix tryPropagateBranch for BBs with several exit branches.
Comments and "if (bf->cfg.incidentCount() == 1)" condition added
by Christoph Bumiller.
2013-03-12 12:55:33 +01:00
Francisco Jerez
572bf83ec0 nv50/ir: Clean up references to function values before destroying them. 2013-03-12 12:55:33 +01:00
Francisco Jerez
12f65e38c0 nouveau: Bail out from nouveau_fence_wait if flushing the pushbuf fails. 2013-03-12 12:55:33 +01:00
Vinson Lee
543d032885 mesa: Use correct functions for enum conversion.
Fixes mixing enum types defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-11 23:44:10 -07:00
Rob Clark
6173cc19c4 freedreno: gallium driver for adreno
Currently works on a220.  Others in the a2xx family look pretty similar
and should be pretty straightforward to support with the same driver.

The a3xx has a new shader ISA, and while many registers appear similar,
the register addresses have been completely shuffled around.  I am not
sure yet whether it is best to support with the same driver, but
different compiler, or whether it should be split into a different
driver.

v1: original
v2: build file updates from review comments, and remove GPL licensed
    header files from msm kernel
v3: smarter temp/pred register assignment, fix clear and depth/stencil
    format issues, resource_transfer fixes, scissor fixes

Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-03-11 21:53:24 -04:00
José Fonseca
44a8e51354 d3d1x: Remove.
Unused/unmaintained.

Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
2013-03-12 00:35:06 +00:00
José Fonseca
7db60f049f nv50: Remove nv0_ir_from_sm4.*
Unused, depends on d3d1x.

Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
2013-03-12 00:35:06 +00:00
Roland Scheidegger
5c41d1c222 gallivm: clean up passing derivatives around
Previously, the derivatives were calculated and passed in a packed form
to the sample code (for implicit derivatives, explicit derivatives were
packed to the same format).
There are several reasons why this wasn't such a good idea:
1) the derivatives may not even be needed (not as bad as it sounds since
llvm will just throw the calculations needed for them away but still)
2) the special packing format really shouldn't be part of the sampler
interface
3) depending what the sample code actually does the derivatives will
be processed differently, hence there is no "ideal" packing. For cube
maps with explicit derivatives (which we don't do yet) for instance the
packing looked downright useless, and for non-isotropic filtering we'd
need different calculations too.

So, instead just pass the derivatives as is (for explicit derivatives),
or let the rho calculating sample code calculate them itself. This still
does exactly the same packing stuff for implicit derivatives for now,
though explicit ones are handled in a more straightforward manner (quick
estimates show performance should be quite similar, though it is much
easier to follow and also does the rho calculation per-pixel until the
end, which we eventually need for spec compliance anyway).

No piglit changes.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-12 00:24:22 +01:00
Chad Versace
b7262ac7ea i965: Fix typo in doxygen hyperlink
s/brw_state_upload/brw_upload_state/

Found because the link was broken.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-11 16:01:19 -07:00
Eric Anholt
11b8df0c01 mesa: Reduce memory usage for reg alloc with many graph nodes (part 2).
After the previous fix that almost removes an allocation of 4*n^2
bytes, we can use a bitset to reduce another allocation from n^2 bytes
to n^2/8 bytes.

Between the previous commit and this one, the peak heap size for an
oglconform ARB_fragment_program max instructions test on i965 goes from
4GB to 255MB.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55825
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:54 -07:00
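The n^2 to n^2/8 reduction in 11b8df0c01 comes from storing one bit per node pair instead of one byte. The following is a minimal sketch of such a packed adjacency matrix. All names are hypothetical and this is not the actual Mesa register allocator code.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define WORD_BITS (sizeof(unsigned) * CHAR_BIT)

/* Hypothetical n x n adjacency matrix, one bit per (i, j) pair. */
struct adjacency_bits {
   unsigned n;
   unsigned *words;
};

/* Bytes needed for n * n bits, rounded up to whole words. */
static size_t adjacency_bits_bytes(unsigned n)
{
   size_t bits = (size_t)n * n;
   return ((bits + WORD_BITS - 1) / WORD_BITS) * sizeof(unsigned);
}

static bool adjacency_bits_init(struct adjacency_bits *a, unsigned n)
{
   assert(n > 0);
   a->n = n;
   a->words = calloc(1, adjacency_bits_bytes(n));
   return a->words != NULL;
}

static void adjacency_bits_set(struct adjacency_bits *a, unsigned i, unsigned j)
{
   size_t bit = (size_t)i * a->n + j;
   assert(i < a->n && j < a->n);
   a->words[bit / WORD_BITS] |= 1u << (bit % WORD_BITS);
}

static bool adjacency_bits_test(const struct adjacency_bits *a,
                                unsigned i, unsigned j)
{
   size_t bit = (size_t)i * a->n + j;
   assert(i < a->n && j < a->n);
   return (a->words[bit / WORD_BITS] >> (bit % WORD_BITS)) & 1u;
}
```

For 1024 nodes this needs 1024 * 1024 / 8 = 131072 bytes, against 1048576 bytes for a byte-per-pair array.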
Eric Anholt
6aa3afbfd6 mesa: Reduce the memory usage for reg alloc with many graph nodes (part 1)
We were allocating an adjacency_list entry for every possible
interference that could get created, but that usually doesn't happen.
We can save a lot of memory by resizing the array on demand.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:54 -07:00
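Growing the adjacency list on demand, as 6aa3afbfd6 describes, can be sketched with a doubling array. The type and function names are hypothetical, and the growth policy (start at 4, double when full) is an assumption for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-node adjacency list that starts empty. */
struct adj_list {
   unsigned *items;
   unsigned count;
   unsigned capacity;
};

/* Append a neighbour, growing the array only when it is full. */
static bool adj_list_add(struct adj_list *l, unsigned node)
{
   assert(l->count <= l->capacity);
   if (l->count == l->capacity) {
      unsigned new_cap = l->capacity ? l->capacity * 2 : 4;
      unsigned *p = realloc(l->items, new_cap * sizeof(*p));
      if (!p)
         return false;
      l->items = p;
      l->capacity = new_cap;
   }
   l->items[l->count++] = node;
   return true;
}
```

A node with few interferences then costs a handful of entries rather than one slot per possible neighbour.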
Eric Anholt
5daf867f6c i965/fs: Improve CSE performance by expiring some available expressions.
We're already walking the list, and we can easily know when something
has no reason to be in the list any longer, so take a brief extra step
to reduce our worst-case runtime (an oglconform test that emits the
maximum instructions in a fragment program).  I don't actually know what
the worst-case runtime was, because it was too long and I got bored.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:54 -07:00
Eric Anholt
f179f419d1 i965/fs: Improve live variables calculation performance.
We can execute way fewer instructions by doing our boolean manipulation
on an "int" of bits at a time, while also reducing our working set size.

Reduces compile time of L4D2's slowest shader from 4s to 1.1s
(-72.4% +/- 0.2%, n=10)

v2: Remove redundant masking (noted by Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:54 -07:00
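Doing the boolean manipulation "an int of bits at a time", as in f179f419d1, can be sketched as follows. The update rule used here is the textbook dataflow equation `livein |= use | (liveout & ~def)`; treating it as what the i965 backend computes is an assumption, and all names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define BITS_PER_WORD (sizeof(unsigned) * CHAR_BIT)

/* Number of words needed to hold one bit per variable. */
static unsigned words_for_vars(unsigned num_vars)
{
   return (unsigned)((num_vars + BITS_PER_WORD - 1) / BITS_PER_WORD);
}

/* Update livein for one block, one word (many variables) per iteration.
 * Returns true if any bit changed, so the caller can iterate to a
 * fixed point. */
static bool update_livein(unsigned *livein, const unsigned *liveout,
                          const unsigned *use, const unsigned *def,
                          unsigned num_words)
{
   bool progress = false;

   assert(livein && liveout && use && def);
   for (unsigned w = 0; w < num_words; w++) {
      unsigned new_in = livein[w] | use[w] | (liveout[w] & ~def[w]);
      if (new_in != livein[w]) {
         livein[w] = new_in;
         progress = true;
      }
   }
   return progress;
}
```

Each loop iteration handles as many variables as there are bits in an `unsigned`, instead of one boolean per variable.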
Eric Anholt
4dc7e6dcbf i965/fs: Also do the gen4 SEND dependency workaround against other SENDs.
We were handling the dependency workaround for the first written reg
of a send preceding the one we're fixing up, but didn't consider the other
regs.  Thus if you had two sampler calls that got allocated to the same
set of regs, one might, rarely, overwrite the other.  This was occurring in
XBMC's GLSL shaders.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44567
NOTE: This is a candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:53 -07:00
Eric Anholt
4c1fdae0a0 i965/fs: Switch to using sampler LD messages for uniform pull constants.
When forcing the compiler to always generate pull constants instead of
push constants (in order to have an easy to use testcase), improves
performance of my old GLSL demo 23.3553% +/- 1.42968% (n=7).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60866
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:53 -07:00
Eric Anholt
1323772543 i965/fs: Fix broken rendering in large shaders with UBO loads.
The lowering process creates a new vgrf on gen7 that should be represented
in live interval analysis.  As-is, it was getting a conflicting allocation
with gl_FragDepth in the dolphin emulator, producing broken rendering.

NOTE: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61317
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:53 -07:00
Eric Anholt
c588cd2031 i965/fs: Add a comment about an implementation detail.
I was going to fix the code above like the previous commit, but we already
had that covered (otherwise all our uniform access would have been broken,
unlike just pull constants).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:53 -07:00
Eric Anholt
f10f5e4980 i965/fs: Fix register allocation for uniform pull constants in 16-wide.
We were allowing a compressed instruction to write a register that
contained the last use of a uniform pull constant (either UBO load or push
constant spillover), so it would get half its values smashed.

Since we need to see the actual instruction to decide this, move the
pre-gen6 pixel_x/y logic here, which should improve the performance of
register allocation since virtual_grf_interferes() is called more than
once per instruction.

NOTE: This is a candidate for the stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61317
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:53 -07:00
Eric Anholt
f09a8e17e5 intel: Remove some unused debug flags.
I was looking at the list to see what might be interesting to document for
application developers, and it turns out some are completely dead.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-11 12:11:53 -07:00
Zack Rusin
7295fad204 draw/gs: Correctly iterate the emitted primitives
We were assuming that each emitted primitive had the same
number of vertices. That is incorrect. Emitted primitives
can have an arbitrary number of vertices. Simply increment
index on iteration to fix it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-07 20:16:07 -08:00
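The iteration fix in 7295fad204 amounts to keeping a running vertex index instead of multiplying the primitive number by a fixed vertex count. A minimal sketch, assuming a hypothetical per-primitive length array (the names are not from the draw module):

```c
#include <assert.h>

/* Record the first vertex of each emitted primitive and return the
 * total number of vertices. prim_lengths[p] is the vertex count of
 * primitive p and may differ between primitives. */
static unsigned primitive_starts(const unsigned *prim_lengths,
                                 unsigned num_prims, unsigned *starts)
{
   unsigned vertex = 0;

   assert(prim_lengths && starts);
   for (unsigned p = 0; p < num_prims; p++) {
      starts[p] = vertex;
      vertex += prim_lengths[p];   /* advance by this primitive's length */
   }
   return vertex;
}
```

For lengths 3, 5 and 4 the primitives start at vertices 0, 3 and 8. A fixed-size assumption of 3 vertices would have placed the third primitive at vertex 6.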
Zack Rusin
e5406f7058 tgsi/exec: Correctly reset NumOutputs before parsing the shader
Whenever we're binding the shaders we're incrementing NumOutputs,
assuming the parser spots an output declaration, but we were never
resetting the variable. That means that each subsequent bind of
a geometry shader would add its number of outputs to the number
of outputs bound by all previously run shaders and our indexes
would get completely messed up.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-07 20:16:00 -08:00
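The accumulation bug fixed in e5406f7058 can be sketched with a counter that is cleared before each parse. The types and names below are hypothetical stand-ins, not the tgsi_exec structures.

```c
#include <assert.h>

/* Hypothetical declaration kinds and machine state for illustration. */
enum decl_kind { DECL_INPUT, DECL_OUTPUT };

struct machine {
   unsigned num_outputs;
};

/* Count output declarations for the shader being bound. The counter is
 * reset first so earlier binds do not leak into this one. */
static void bind_shader(struct machine *m, const enum decl_kind *decls,
                        unsigned num_decls)
{
   assert(m && decls);
   m->num_outputs = 0;
   for (unsigned i = 0; i < num_decls; i++) {
      if (decls[i] == DECL_OUTPUT)
         m->num_outputs++;
   }
}
```

Without the reset, binding a shader with two outputs twice would leave the counter at 4 rather than 2.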
Roland Scheidegger
9060c835fd draw/llvm: another quick hack for drawing with no position output
Also need to skip things if we have no cv value but pos value
(happens with geometry shaders enabled).
Needs a round of cleanup, though.
2013-03-11 17:07:51 +01:00
Roland Scheidegger
ef17cc9cb6 softpipe: don't use samplers with prebaked sampler and sampler_view state
This is needed for handling the dx10-style sample opcodes.
This also simplifies the logic by getting rid of sampler variants
completely (sampler_views though OTOH have sort of variants because
some of their state is different depending on the shader stage they
are bound to).
No significant performance difference (openarena run:
840 frames in 459.8 seconds vs. 840 frames in 460.5 seconds).

v2: fix reference counting bug spotted by Jose.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-11 17:07:51 +01:00
Roland Scheidegger
f33c744fb9 tgsi: emit code for SVIEWINFO and SAMPLE_I
Can handle them since the single sampler interface was introduced.

v2: simplify txf/sample_i handling a bit according to Brian's feedback.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-11 17:07:51 +01:00
Roland Scheidegger
7b3a0bb45d tgsi: fix wrong reg used for unit for TGSI_OPCODE_TXF
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-11 17:07:51 +01:00
Tom Stellard
a0676968b9 r600g/llvm: Fix build 2013-03-11 11:10:51 -04:00
Marek Olšák
e4e655fd11 r600g: add debug options disabling various copy-buffer-related features
This will be invaluable for debugging and bug reports.
2013-03-11 13:44:46 +01:00
Marek Olšák
4b69c1a92d mesa: don't allocate a texture if width or height is 0 in CopyTexImage
NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-11 13:44:14 +01:00
Marek Olšák
68ed4c9c89 gallium/util: attempt to fix blitting multisample texture arrays
We don't have a test for this yet, but obviously the swizzle was wrong.
2013-03-11 13:43:36 +01:00
Marek Olšák
52efa01de0 r600g: allocate FMASK right after the texture, so that it's aligned with it
This avoids the kernel CS checker errors with MSAA textures.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
2c339f8015 r600g: remove r600.h, move the stuff elsewhere (mostly to r600_pipe.h)
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
ec7d775790 r600g: remove r600_hw_context_priv.h, move the stuff to r600_pipe.h
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
1724ef8908 r600g: remove deprecated state management code
It's nice to see so much code that did pretty much nothing go away.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
65cbf89567 r600g: atomize pixel shader
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
63042af933 r600g: atomize vertex shader
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
167263ecb1 r600g: inline r600_pipe_shader function
also change names of other functions, so that they make sense

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
65b2a449bc r600g: dump vertex elements state along with the fetch shader 2013-03-11 13:43:36 +01:00
Marek Olšák
3f0a51d677 gallium/util: dump instance_divisor 2013-03-11 13:43:36 +01:00
Marek Olšák
3832059b10 r600g: remove bytecode dumping
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
4bf0ebdd4f r600g: use a single env var R600_DEBUG, disable bytecode dumping
Only the disassembler is used to dump shaders. Here are a few examples
of how to use R600_DEBUG.

Log compute info:
  R600_DEBUG=compute

Dump all shaders:
  R600_DEBUG=fs,vs,gs,ps,cs

Dump pixel shaders only:
  R600_DEBUG=ps

Disable Hyper-Z:
  R600_DEBUG=nohyperz

Disable the LLVM backend:
  R600_DEBUG=nollvm

Or use any combination of the above, or print all options:
  R600_DEBUG=help

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
2ca73bc7f7 r600g: cleanup #include recursion between r600_pipe.h and evergreen_compute.h
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-11 13:43:36 +01:00
Marek Olšák
43d3e0cd3d r600g: don't check for R600_ENABLE_S3TC env var 2013-03-11 13:43:36 +01:00
Stefan Brüns
b21a9d46e4 glapi/gen: Remove duplicate PYTHON_FLAGS
PYTHON_GEN calls python with PYTHON_FLAGS

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
2013-03-09 16:24:51 -08:00
Frank Henigman
89559c50e7 i965: Link i965_dri.so with C++ linker.
Force C++ linking of i965_dri.so by adding a dummy C++ source file.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-08 21:21:53 -08:00
Maxence Le Doré
ba588dd45d gallium/util: Correct shift value for TSC feature detection.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-08 21:21:53 -08:00
Matt Turner
07f2dee731 configure.ac: Build dricommon for DRI gallium drivers
Commit 67ef7559 added an || test "x$enable_dri" check in an attempt to
get the DRI common bits built in some necessary cases. That change was
inappropriate as it made these common DRI pieces be built
unconditionally, so some builds were broken.

Subsequently, commit 998d975e3 changed the "|| test" to a "-a"
conjunction within the existing test invocation. This made the '-a
"x$enable_dri" = xyes' clause have no effect, (as it was inside an
enclosing test for the same condition). So the new breakage from
commit 67ef7559 was addressed, but the original problems were
regressed.

The immediately preceding commit removed the redundant condition.

Now, finally this commit fixes the original problem as described in
the commit message of 67ef7559: this code should be compiled when
using the DRI state tracker. In order to do so, the HAVE_*_DRI
conditionals must be moved after the last assignment of HAVE_COMMON_DRI.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61821
Tested-by: Stéphane Marchesin <marcheu@chromium.org>
2013-03-08 21:21:46 -08:00
Matt Turner
7de78ce5e5 configure.ac: Remove redundant checks of enable_dri.
The whole block is enclosed inside if test "x$enable_dri" = xyes.
2013-03-08 21:20:43 -08:00
Matt Turner
79a0977241 mesa: Allow ETC2/EAC formats with ARB_ES3_compatibility.
Fixes piglit's oes_compressed_etc2_texture-miptree tests on Desktop GL.
Reported-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-03-08 21:20:39 -08:00
Stéphane Marchesin
1662178863 i915g: Use PIPE_FLUSH_END_OF_FRAME to trigger throttling
This helps with jittering, instead of throttling at every command
buffer we only throttle once a frame.
2013-03-08 19:34:50 -08:00
Stéphane Marchesin
d815e8af39 i915g: Update TODO 2013-03-08 19:34:43 -08:00
Brian Paul
728240b64d docs: document another Viewperf bug 2013-03-08 10:35:46 -07:00
Jan de Groot
17f1cb1d99 dri/nouveau: fix crash in nouveau_flush
https://bugs.freedesktop.org/show_bug.cgi?id=61947

Note: this is a candidate for the stable branches
2013-03-07 19:55:07 +01:00
Brian Paul
057c46d791 draw: add const qualifier to silence compiler warning 2013-03-07 08:11:12 -07:00
Brian Paul
9915636fb8 llvmpipe: remove the power of two sizeof(struct cmd_block) assertion
It fails on 32-bit systems (I only tested on 64-bit).  Power of two
size isn't required, so just remove the assertion.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-07 06:28:23 -07:00
Brian Paul
c2665aacdd vbo: fix crash found with shared display lists
This fixes a crash when a display list is created in one context
but executed from a second one.  The vbo_save_context::vertex_store
member will be NULL if we never created a display list with the
context.  Just check for that before dereferencing the pointer.

Fixes http://bugzilla.redhat.com/show_bug.cgi?id=918661

Note: This is a candidate for the stable branches.
2013-03-07 06:28:23 -07:00
Alan Hourihane
5984a911f9 mesa: fix glGetInteger*(GL_SAMPLER_BINDING).
If the sampler object has been deleted on another context, an
alternative context may reference the old sampler. So ensure the sampler
object still exists.

Note: this is a candidate for the stable branch.

Signed-off-by: Alan Hourihane <alanh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-07 10:13:40 +00:00
Christian König
eddf33f711 radeon/llvm: document LLVM commit
We need at least that revision to work correctly now.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-07 10:06:24 +01:00
Christian König
a7a899584c radeon/llvm: enable LICM and DCE pass v2
LICM stands for Loop Invariant Code Motion. Instructions that
do not depend on the loop index are moved outside of the loop body.

DCE is DeadCodeElimination.

v2: updated commit msg, thx to Vincent.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-07 10:03:22 +01:00
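What LICM does, as described in a7a899584c, can be shown at the source level with a hand-written before/after pair. This is only an illustration in C with hypothetical function names. The actual pass works on LLVM IR, not on C source.

```c
#include <assert.h>

/* Before: a * b does not depend on i but is recomputed every iteration. */
static void scale_in_loop(float *out, const float *in, unsigned n,
                          float a, float b)
{
   assert(out && in);
   for (unsigned i = 0; i < n; i++) {
      float k = a * b;
      out[i] = in[i] * k;
   }
}

/* After: the invariant computation is hoisted out of the loop body. */
static void scale_hoisted(float *out, const float *in, unsigned n,
                          float a, float b)
{
   float k = a * b;

   assert(out && in);
   for (unsigned i = 0; i < n; i++)
      out[i] = in[i] * k;
}
```

Both functions produce the same results. The second performs the multiplication `a * b` once instead of `n` times.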
Christian König
e4188ee13d radeonsi: add LLVMNoUnwindAttribute to intrinsic
So LLVM can better eliminate dead code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-07 10:03:22 +01:00
Christian König
0666ffddd2 radeonsi: rework input interpolation
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-07 10:03:22 +01:00
Christian König
c497321d31 radeonsi: remove SI.vs.load.buffer.index
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-07 10:03:22 +01:00
Christian König
55fe5ccb39 radeon/llvm: make SGPRs proper function arguments v2
v2: remove unrelated changes

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-07 10:03:22 +01:00
Christian König
b8f4ca3d85 radeon/llvm: replace shader type intrinsic with function attribute
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-07 10:03:22 +01:00
Christian König
de80e560bc radeonsi: switch to v*i8 for resources and samplers v2
v2: remove unrelated changes

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-03-07 10:03:22 +01:00
Christian König
2cb54833d0 r600g/llvm: Update CONSTANT_BUFFER address space definition
To match recent LLVM changes.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-07 10:03:11 +01:00
Zack Rusin
2532147f8b draw/llvm: fix inputs to the geometry shader
We can't clip and viewport transform the vertices before we let
the geometry shader process them. Let's make sure the generated
vertex shader has both disabled if a geometry shader is present.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-05 20:13:08 -08:00
Bryan Cain
8c74380b2d draw: use geometry shader info in clip_init_state if appropriate
Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-05 20:13:08 -08:00
Bryan Cain
30f246bf2c draw: account for separate shader objects in geometry shader code
The geometry shader code seems to have been originally written with the
assumptions that there are the same number of VS outputs as GS outputs and
that VS outputs are in the same order as their corresponding GS inputs. Since
TGSI uses separate shader objects, these are both wrong assumptions. This
was causing several valid vertex/geometry shader combinations to either render
incorrectly or trigger an assertion.

Conflicts:
	src/gallium/auxiliary/draw/draw_gs.c

Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-05 20:13:08 -08:00
Alan Hourihane
cf0b4a30fc Unreference sampler object when it's currently bound to texture unit.
This change specifically unbinds a sampler object from the texture unit
if it's bound to a unit. The spec calls for default object when deleting
sampler objects which are currently bound.

Note: this is a candidate for the stable branches

Signed-off-by: Alan Hourihane <alanh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-06 18:10:12 +00:00
Brian Paul
b21f8e364b llvmpipe: fix incorrect 'j' array index in dummy texture code
Use 0 instead.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-03-06 10:34:09 -07:00
Brian Paul
975d31f60d llvmpipe: remove unused cmd_block_list struct 2013-03-06 10:34:09 -07:00
Brian Paul
a51b81558f llvmpipe: add some scene limit sanity check assertions
Note: This is a candidate for the stable branches.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-06 10:34:09 -07:00
Brian Paul
a31ebdffa0 llvmpipe: tweak CMD_BLOCK_MAX and LP_SCENE_MAX_SIZE
We advertise a max texture/surfaces size of 8K x 8K but the old values
for these limits didn't actually allow us to handle that surface size.

For 8K x 8K we'll have 16384 bins.  Each bin needs at least one cmd_block
object which was 2192 bytes in size.  Since 16384 * 2192 exceeded
LP_SCENE_MAX_SIZE we'd silently fail in lp_scene_new_data_block() and not
draw the complete scene.

By reducing CMD_BLOCK_MAX to 29 we get nice 512-byte cmd_blocks.  And
by increasing LP_SCENE_MAX_SIZE to 9 MB we can allocate enough command
blocks for 8K x 8K, plus a few regular data blocks.

Fixes the (improved) piglit fbo-maxsize test.

Note: This is a candidate for the stable branches.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-03-06 10:34:09 -07:00
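The sizing arithmetic in a31ebdffa0 can be checked with a short sketch. The bin count and the two block sizes come from the commit message; reading "9 MB" as 9 * 1024 * 1024 bytes is an assumption, and the names below are illustrative rather than llvmpipe's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Figures quoted in the commit message. */
enum {
   NUM_BINS           = 16384, /* bins for an 8K x 8K surface */
   OLD_CMD_BLOCK_SIZE = 2192,  /* bytes per cmd_block before the change */
   NEW_CMD_BLOCK_SIZE = 512    /* bytes per cmd_block with CMD_BLOCK_MAX = 29 */
};

/* Assumption: "9 MB" means 9 * 1024 * 1024 bytes. */
static const size_t SCENE_MAX_SIZE = (size_t)9 * 1024 * 1024;

/* Minimum bytes needed to give every bin one command block. */
static size_t min_cmd_block_bytes(size_t num_bins, size_t block_size)
{
   /* Guard against overflow of the multiplication. */
   assert(block_size == 0 || num_bins <= (size_t)-1 / block_size);
   return num_bins * block_size;
}
```

With 2192-byte blocks the command blocks alone need 35,913,728 bytes (about 34 MiB). With 512-byte blocks they need 8,388,608 bytes, which leaves 1 MiB under a 9 MiB limit for regular data blocks.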
Kenneth Graunke
492693c0a5 i965: Don't fill buffer with zeroes.
This was only necessary because our bounds checking was off by one, and
thus we read an extra pair of values.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-06 08:27:54 -08:00
Kenneth Graunke
89e5c8e0fa i965: Fix off-by-one in query object result gathering.
If we've written N pairs of values to the buffer, then last_index = N,
but the values are 0 .. N-1.  Thus, we need to use <, not <=.

This worked anyway because we fill the buffer with zeroes, so we just
added an extra (0 - 0) to our results.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-06 08:27:47 -08:00
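The off-by-one fixed in 89e5c8e0fa can be illustrated with a minimal sketch. It assumes the results are stored as interleaved (start, end) pairs in a flat array; the function name and layout are hypothetical, not the i965 driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum (end - start) over the first num_pairs pairs in results[].
 * Assumed layout: results[2*i] is the start value, results[2*i + 1] the end.
 */
static uint64_t sum_pairs(const uint64_t *results, unsigned num_pairs)
{
   uint64_t total = 0;

   assert(results != NULL || num_pairs == 0);

   /* Valid pairs are 0 .. num_pairs-1, so the bound is '<', not '<='. */
   for (unsigned i = 0; i < num_pairs; i++)
      total += results[2 * i + 1] - results[2 * i];

   return total;
}
```

With `<=` the loop would also read the pair at index num_pairs. That extra read is harmless only when the slot happens to hold zeroes, which is why the zero fill could be dropped once the bound was corrected.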
Christian König
886c5085e3 radeon/llvm: fix trivial warnings
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-06 12:08:54 +01:00
Christian König
a212483437 radeonsi: fix trivial warning
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-03-06 12:07:40 +01:00
Eric Anholt
88b20d5834 intel: Improve the matching (more formats!) for TexImage from PBOs.
Mesa core is the place for encoding what format/type matches a mesa
format, so rely on that.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-05 16:02:38 -08:00
Eric Anholt
731d474d98 intel: Improve the test for readpixels blit path format checking.
We were allowing things like copying RG1616 to a user's ARGB8888
format, while we were denying anything that wasn't ARGB8888 or
RGB565.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-05 16:02:38 -08:00
Eric Anholt
3c7e96ff01 intel: Fold intel_region_copy() into its one caller.
This is similar code to intel_miptree_copy_slice, but the knobs
are all set differently.

v2: fix whitespace

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-05 16:02:38 -08:00
Eric Anholt
7604debabb intel: Transition intel_region_map() to being a miptree operation.
I'm trying to move us away from the region structure, and all the
callers are currently dereferencing a miptree to get the region.

In this change, the map_refcount is dropped.  However, the bo->virtual is
itself map refcounted, so that's already dealt with.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-05 16:02:38 -08:00
Eric Anholt
f4f288f317 intel: Remove num_mapped_regions tracking.
The point of tracking the value was removed in February 2012
(65b096aedd), and this should have
been removed at the same time.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-05 16:02:38 -08:00
Eric Anholt
3c9532314c intel: Remove the struct intel_region reuse hash table.
I don't see any reason for it -- it was introduced with the DRI2
invalidate work by krh in 2010 with no explanation.  I suspect it was
something about wanting the same drm_intel_bo struct underneath multiple
openings of the BO within one process, but that's covered by libdrm at
this point.  As far as the struct region goes, it is not threadsafe, so
multiple contexts sharing a region could have mixed up the map_count and
assertion failed or worse.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-05 16:02:37 -08:00
José Fonseca
e77234be39 scons: Provide shorthand aliases for software winsyses. 2013-03-05 23:06:13 +00:00
José Fonseca
3950953f93 scons: Fix llvm-config not found error message.
"% llvm_version" is bogus copy'n'paste cruft.
2013-03-05 23:06:13 +00:00
Ian Romanick
674f9239b9 mesa: Modify candidate search string
Several commits on master for the 9.1 branch had "NOTE" messages in a
slightly different format.

NOTE: This is a candidate for stable branches

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-05 14:54:11 -08:00
Eric Anholt
65afa11dc6 mesa: Remove the special enum for _mesa_error debug output.
Now all the per-message enums from mtypes are gone.  Now we can extend
unique message IDs into all generators of debug output without having to
update mtypes.h for each one.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:25:01 -08:00
Eric Anholt
d9249935db mesa: Remove the enum for the oom-within-debug-output case.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:25:01 -08:00
Eric Anholt
6816f67de6 mesa: Remove now-unused gl_winsys_error and gl_shader_error enums.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:25:00 -08:00
Eric Anholt
c72cf53817 mesa: Report ARB_debug_output for both shader errors and warnings.
This ends up reusing the dynamic ID support, so a silly enum gets to go
away.  We don't assign good IDs to different messages yet, but at least
that's tractable now.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:25:00 -08:00
Eric Anholt
f0a191ca0f intel: Add missing perf debug for a stall on mapping a BO.
I was testing the ARB_debug_output code and wrote an obvious sample that
should have hit this, and got confused that my ARB_debug_output was
broken.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:25:00 -08:00
Eric Anholt
14cec07177 i965: Make perf_debug() output to GL_ARB_debug_output in a debug context.
I tried to ensure that performance in the non-debug case doesn't change
(we still just check one condition up front), and I think the impact is
small enough in the debug context case to warrant including all of it.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:25:00 -08:00
Eric Anholt
0a1c6bcfb0 intel: Finish renaming fallback_debug() to perf_debug().
They're about to change to handle GL_ARB_debug_output, so just make one
function.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:25:00 -08:00
Eric Anholt
807eedf70f intel: Hook up the WARN_ONCE macro to GL_ARB_debug_output.
This doesn't provide detailed error type information, but it's important
to get these relatively severe but rare error messages out to the
developer through whatever mechanism they are using.

v2: Rebase on new WARN_ONCE additions.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
2013-03-05 14:25:00 -08:00
Eric Anholt
3025680578 mesa: Add support for GL_ARB_debug_output with dynamic ID allocation.
We can emit messages now without always having to use the same ID for
each, or having a giant table of all possible errors in mtypes.h.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:24:59 -08:00
Eric Anholt
7beb93456d mesa: Merge handling of application-provided and built-in error sources.
I want to have dynamic IDs so that we don't need to add to mtypes.h for
every error we might want to add.  To do so, I need to get rid of the
static arrays and actually support all the crazy filtering of dynamic IDs
that we already support for application-provided error sources.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:24:59 -08:00
Eric Anholt
88831a8d99 mesa: Fix _mesa_problem() on context destroy after application debug output
This was apparently not noticed because we don't have any testing of
application-generated debug output.  However, as I'm changing the
GL-generated debug output to use the same path as
application/middleware-generated debug output, this obviously became an
issue.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:24:59 -08:00
Eric Anholt
e0d1e3b785 mesa: Move debug type/severity enums to mesa core.
These will get reused by new ARB_debug_output messages in drivers/core,
instead of having the caller pass GL enums and have us immediately
switch-statement those into enums.

Source enums will be handled in the next commit, because the way
different sources are handled at the moment is pretty strange.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:24:59 -08:00
Eric Anholt
c42148d16e mesa: Replace open-coded _mesa_lookup_enum_by_nr().
The new one doesn't have the same behavior for GL_NO_ERROR, but we don't
produce errors with GL_NO_ERROR as the error type.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:24:59 -08:00
Eric Anholt
e022461c64 mesa: Remove extra #define MAXSTRING duplicating MAX_DEBUG_MESSAGE_LENGTH.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-03-05 14:24:59 -08:00
Marcin Slusarz
f4ebcd133b dri/nouveau: NV17_3D class is not available for NV1a chipset
Should fix https://bugs.freedesktop.org/show_bug.cgi?id=60510

Note: this is a candidate for the stable branches

Acked-by: Francisco Jerez <currojerez@riseup.net>
2013-03-05 21:19:17 +01:00
Roland Scheidegger
b9eb573600 tgsi: handle projection modifier for array textures.
This partly reverts 6ace2e41da.
Apparently with GL_MESA_texture_array fixed-function texturing
with texture arrays is possible, and hence we have to handle TXP.
(Though no one seems to know the semantics, softpipe now does what
it did before, which is to NOT project the array coord, llvmpipe
for instance however indeed does project the array coord. Unlike
before it will project the comparison coord for shadow1d array, as
that clearly was an error.)
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=61828.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-05 20:10:37 +01:00
Roland Scheidegger
be6d18ba5e st/mesa: translate ir offset parameters for non-TXF opcodes.
Otherwise the state tracker will crash if the texture instructions
have offsets.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-05 20:10:37 +01:00
Matt Turner
523b07e320 configure.ac: Remove stale comment about --x-* arguments.
Should have been removed with e273ed37.

Note: This is a candidate for the 9.1 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-05 11:02:36 -08:00
Matt Turner
35189d768b configure.ac: Don't check for X11 unconditionally.
X11 is already checked conditionally below.

Fixes OSMesa-only configurations to not require X11.

Note: This is a candidate for the 9.1 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-05 11:02:22 -08:00
Alan Hourihane
196443f3f5 Add missing GL_TEXTURE_CUBE_MAP entry in _mesa_legal_texture_dimensions
This was hit on the glTexStorage2D() path.

Note: this is a candidate for the stable branches

Signed-off-by: Alan Hourihane <alanh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-05 17:22:44 +00:00
Jon TURNEY
87fdcd87b1 Fix out-of-tree build of 'make check' in src/mesa/main/tests
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-03-05 13:33:16 +00:00
Dave Airlie
e21460b4d5 u_blitter: don't create illegal shaders for 1D/3D/RECT/CUBE MSAA
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-03-04 22:23:08 +00:00
Daniel Martin
998d975e38 Fix build of swrast only without libdrm
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Daniel Martin <consume.noise@gmail.com>
2013-03-04 10:11:01 -08:00
Brian Paul
b1390c7992 mesa: flush current state when querying GL_EDGE_FLAG
Fixes http://bugs.freedesktop.org/show_bug.cgi?id=61395

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-04 08:41:45 -07:00
Jakub Bogusz
e29124717e vdpau-softpipe: Build correct source file - vl_winsys_xsp.c
Copy-and-paste problem introduced by commit 7f24483e.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-03 22:53:26 -08:00
Kenneth Graunke
b88f74d63d i965: Fix Crystal Well PCI IDs.
The second digit was off by one, which meant we accidentally treated
GTn as GT(n-1).  This also meant no support for GT1 at all.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-03 13:53:58 -08:00
Vincent Lejeune
83e7d111af r600g: Check comp_mask before merging export instructions
Fixes a (rare) bug uncovered by LLVM where consecutive exports were
merged even if they had incompatible masks.
2013-03-03 21:39:51 +01:00
Vadim Girlin
138b5b9a12 r600g: fix check_and_set_bank_swizzle for cayman
Tested-by: Vincent Lejeune <vljn at ovi.com>
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
2013-03-03 21:38:49 +01:00
Brian Paul
0b6e72f8d7 st/mesa: add switch case for ir_txf_ms to silence warning
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-03-02 05:52:40 -07:00
Brian Paul
2ea0e30bed mesa: add switch case for ir_txf_ms to silence warning
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 05:52:28 -07:00
Kenneth Graunke
cf0c0a7782 i965: Pull query BO reallocation out into a helper function.
We'll want to reuse this for non-occlusion queries in the future.

Plus, it's a single logical task, so having it as a helper function
clarifies the code somewhat.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:04 -08:00
Kenneth Graunke
961c9b8cac i965: Replace the global brw->query.bo variable with query->bo.
Again, eliminating a global variable in favor of a per-query object
variable will help in a future where we have more queries in hardware.

Personally, I find this clearer: there's just the query object's BO,
rather than two variables that usually shadow each other.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:04 -08:00
Kenneth Graunke
614944b897 i965: Turn if (query->bo) into an assertion.
The code a few lines above calls brw_emit_query_begin() if !query->bo,
and that creates query->bo.  So it should always be non-NULL.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:04 -08:00
Kenneth Graunke
981a22b62b i965: Unify query object BO reallocation code.
If we haven't allocated a BO yet, we need to do that.  Or, if there
isn't enough room to write another pair of values, we need to gather up
the existing results and start a new one.  This is simple enough.

However, the old code was awkwardly split into two blocks, with a
write_depth_count() placed in the middle.  The new depth count isn't
relevant to gathering the old BO's data, so that can go after the
reallocation is done.  With the two blocks adjacent, we can merge them.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:04 -08:00
Kenneth Graunke
90feda81de i965: Use query->last_index instead of the global brw->query.index.
Since we already have an index in the brw_query_object, there's no need
to also keep a global variable that shadows it.

Plus, if we ever add support for more types of queries that still need
the per-batch before/after treatment we do for occlusion queries, we
won't be able to use a single global variable.  In contrast, per-query
object variables will work fine.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:04 -08:00
Kenneth Graunke
ec5d502ec3 i965: Remove brw_query_object::first_index field as it's always 0.
brw->query.index is initialized to 0 just a few lines before it's
copied to first_index.

Presumably the idea here was to reuse the query BO for subsequent
queries of the same type, but since that doesn't happen, there's no need
to have the extra code complexity.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:04 -08:00
Kenneth Graunke
d92c7d8eed i965: Add a pile of comments to brw_queryobj.c.
This code was really difficult to follow, for a number of reasons:

- Queries were handled in four different ways (TIMESTAMP writes a single
  value, TIME_ELAPSED writes a single pair of values, occlusion queries
  write pairs of values for the start and end of each batch, and other
  queries are done entirely in software).  It turns out that there are
  very good reasons each query is handled the way it is, but there were
  insufficient comments explaining the rationale.

- It wasn't immediately obvious which functions were driver hooks
  and which were helper functions.  For example, brw_query_begin() is
  a driver hook that implements glBeginQuery() for all query types, but
  the similarly named brw_emit_query_begin() is a helper function that's
  only relevant for occlusion queries.

Extra explanatory comments should save me and others from constantly
having to ask how this code works and why various query types are
handled differently.

v2: Incorporate Eric's feedback: change "as soon as possible" to "the
    results will be present when mapped."

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:04 -08:00
Kenneth Graunke
d1b34baf9b i965: Write TIMESTAMP query values into the first buffer element.
For timestamp queries, we just write a single value to a BO.  The
natural place to write that is element 0, so we should do that.

Previously, we wrote it into element 1 (the second slot) leaving
element 0 filled with garbage.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:03 -08:00
Kenneth Graunke
3d71f4fbac i965: Implement the new QueryCounter() hook.
This moves the GL_TIMESTAMP handling out of EndQuery.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:03 -08:00
Kenneth Graunke
dfb056b892 mesa: Add a new QueryCounter() hook for TIMESTAMP queries.
In OpenGL, most queries record statistics about operations performed
between a defined beginning and ending point.  However, TIMESTAMP
queries are different: they immediately return a single value, and there
is no start/stop mechanism.

Previously, Mesa implemented TIMESTAMP queries by calling EndQuery
without first calling BeginQuery.  Apparently this is DirectX
convention, and Gallium followed suit.  I personally find the asymmetry
jarring, however---having BeginQuery and EndQuery handle a different set
of enum values looks like a bug.  It's also a bit confusing to mix the
one-shot query with the start/stop model.

So, add a new QueryCounter driver hook for implementing TIMESTAMP.  For
now, fall back to EndQuery to support drivers that don't do the new
mechanism.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-03-01 22:09:03 -08:00
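The fallback described in dfb056b892 can be sketched as a hook table with an optional entry. The struct and function names here are simplified, hypothetical stand-ins and not Mesa's real driver function table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical query object that only counts which hook ran. */
struct query {
   int end_calls;
   int counter_calls;
};

/* Hypothetical, reduced driver hook table. */
struct driver_hooks {
   void (*EndQuery)(struct query *q);
   void (*QueryCounter)(struct query *q); /* NULL if the driver lacks it */
};

/* Record a one-shot TIMESTAMP: prefer the dedicated hook, else fall back. */
static void record_timestamp(const struct driver_hooks *drv, struct query *q)
{
   assert(drv->QueryCounter != NULL || drv->EndQuery != NULL);

   if (drv->QueryCounter)
      drv->QueryCounter(q);
   else
      drv->EndQuery(q); /* legacy path: EndQuery without BeginQuery */
}

/* Fake driver callbacks used to exercise both paths. */
static void fake_end(struct query *q)     { q->end_calls++; }
static void fake_counter(struct query *q) { q->counter_calls++; }
```

A driver that fills in QueryCounter never sees a stray EndQuery for a timestamp, while a driver that leaves it NULL keeps working through the old path.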
Roland Scheidegger
6ace2e41da tgsi: add texel offsets and derivatives to sampler interface
Something I never got around to implementing, but this is the tgsi execution
side for implementing texel offsets (for ordinary texturing) and explicit
derivatives for sampling (though I guess the ordering of the components
for the derivs parameters is debatable).
There is certainly a runtime cost associated with this.
Unless there are different interfaces used depending on the "complexity"
of the texture instructions, this is impossible to avoid.
Offsets are always active (I think checking if they are active or not is
probably not worth it since it should mostly be an add), whereas the
sampler_control is extended for explicit derivatives.
For now softpipe (the only user of this) just drops all those new values
on the floor (which is the part I never implemented...).

Additionally this also fixes (discovered by accident) inconsistent
projective divide for the comparison coord - the code did do the
projection for shadow2d targets, but not shadow1d ones. This also
drops checking for projection modifier on array targets, since they
aren't possible in any extension I know of (hence we don't actually
know if the array layer should also be divided or not).

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-02 02:54:31 +01:00
Roland Scheidegger
c7c7186045 draw: additional fix for the no-position case with llvm
Similar fix to what is done for the non-llvm case, we could otherwise still
hit the stages (near certainly with gs) which crash. It is probably a much
better idea to skip trying to draw at that point anyway.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-02 02:54:31 +01:00
Roland Scheidegger
ea8b2ae8a5 draw: fix no position output in non-llvm pipeline.
It seems easiest (and best) if we simply skip all the later stages
(after stream output).
(This is different to the llvm case at least for now where we will
simply try to render garbage, though both behaviors should be correct.)
Fixes piglit glsl-1.40-tf-no-position with softpipe.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-02 02:54:31 +01:00
Roland Scheidegger
de0593e333 draw/llvm: skip clipping and viewport transform if there's no position output
With glsl 1.40 writing position is not required (useful for transform
feedback, though in fact it's still possible to rasterize such geometry
even if the results aren't too well defined).
Prevents crashes in that case. Fixes piglit glsl-1.40-tf-no-position.
Not quite sure this is 100% correct as it also skips clipdistance
clipping which could still work (but not sure if the result would
really be needed?)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-03-02 02:54:31 +01:00
Roland Scheidegger
2ef13e7c55 llvmpipe: don't assert on illegal surface creation.
Since c8eb2d0e82 llvmpipe checks if it's
actually legal to create a surface. The opengl state tracker doesn't quite
obey this so for now just warn instead of assert.
Also warn instead of disabled assert when creating sampler views
(same reasoning).

Addresses https://bugs.freedesktop.org/show_bug.cgi?id=61647.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-02 02:54:31 +01:00
Roland Scheidegger
4c12276607 llvmpipe: bump glsl version to 140
texel offsets should have been the last missing feature for 130, and in
fact 140 as well (last there were texture buffers). In any case we still
don't do OpenGL 3.0 (missing MSAA which will be difficult,
plus EXT_packed_float, ARB_depth_buffer_float and EXT_framebuffer_sRGB).

v2: bump to 140 instead - we have everything except we crash when not writing
to gl_Position (but softpipe crashes as well) so let's just say this is a bug
instead. Also (by Dave Airlie's suggestion) update llvm-todo.txt.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-02 02:54:30 +01:00
Roland Scheidegger
b3b3b389fa gallivm: add support for texel offsets for ordinary texturing.
This was previously only handled for texelFetch (much easier).
Depending on the wrap mode this works slightly differently (for somewhat
efficient implementation), hence have to do that separately in all roughly
137 places - it is easy if we use fixed point coords for wrapping, however
some wrapping modes are near impossible with fixed point (the repeat stuff)
hence we have to normalize the offsets if we can't do the wrapping in
unnormalized space (which is a division which is slow but should still be
much better than the alternative, which would be integer modulo for wrapping
which is just unusable). This should still give accurate results in all
cases that really matter, though it might be not quite conformant behavior
for some apis (but we have much worse problems there anyway even without
using offsets).
(Untested, no piglit test.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-03-02 02:54:30 +01:00
Brian Paul
a99eb5c83f svga: always link with C++
Even when we don't have LLVM since there's other C++ code
in the resulting DRI driver object.

Note: This is a candidate for the stable branches.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-01 17:31:32 -07:00
Brian Paul
f6c0612618 st/mesa: convert ir_triop_lrp to TGSI_OPCODE_LRP
AFAICT, all gallium drivers implement TGSI_OPCODE_LRP.
Tested with softpipe, llvmpipe, svga drivers.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-01 17:31:32 -07:00
Chris Forbes
7616586cff docs: Mark some things done in GL3.txt 2013-03-02 12:02:25 +13:00
Martin Andersson
d96d8ed910 winsys/radeon: Only add bo to hash table when creating flink
The problem is that we mix bo handles and flinked names in the hash
table. Because kms type handles are not flinked they should not be
added to the hash table. If we do that we will sooner or later
get a situation where we will overwrite a correct entry because
the bo handle was the same as a flinked name.

Note: this is a candidate for the stable branches.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-01 17:52:40 -05:00
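The key collision described in d96d8ed910 can be shown with a toy lookup table. Everything below is a hypothetical illustration (a direct-indexed array instead of a real hash table, invented names) and not the radeon winsys code: the point is only that the table is keyed by flink name, so a buffer is entered when it gets a flink name and never under its handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 64 /* toy limit: names must be below this */

/* Hypothetical buffer object. */
struct bo {
   uint32_t handle;     /* per-process kernel handle */
   uint32_t flink_name; /* global name, 0 = not flinked */
};

/* Lookup table keyed by flink name only. */
static struct bo *name_table[TABLE_SIZE];

static struct bo *lookup_by_name(uint32_t name)
{
   return name < TABLE_SIZE ? name_table[name] : NULL;
}

/* Called when a flink name is created for bo: the only insertion point. */
static void bo_set_flink_name(struct bo *bo, uint32_t name)
{
   assert(name != 0 && name < TABLE_SIZE);
   bo->flink_name = name;
   name_table[name] = bo;
}
```

If a buffer whose handle is 3 were also inserted under key 3, it would replace the entry for the buffer whose flink name is 3. Restricting insertion to bo_set_flink_name() keeps the two number spaces apart.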
Chris Forbes
1d4dbeeaec i965: enable ARB_texture_multisample on Gen6+
V2: Works on Ivy Bridge now too, so this can be 6+.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-02 11:40:50 +13:00
Chris Forbes
26c8479474 i965/fs: add support for ir_txf_ms on Gen6+
On Gen6, lower this to `ld` with lod=0 and an extra sample_index
parameter.

On Gen7, use `ld2dms`. We don't support CMS yet for multisample
textures, so we just hardcode MCS=0. This is ignored for IMS and UMS
surfaces.

Note: If we do end up emitting specialized shaders based on the MSAA
layout, we can emit a slightly shorter message here in the UMS case.

Note: According to the PRM, `ld2dms` takes one more parameter, lod.
However, it's always zero, and including it would make the message too
long for SIMD16, so we just omit it.

V2: Reworked completely, added support for Gen7.
V3: - Introduce sample_index parameter rather than reusing lod
    - Removed spurious whitespace change
    - Clarify commit message
V4: - Fix comment style
    - Emit SHADER_OPCODE_TXF_MS on Gen6. This was benignly wrong since
      it lowers to `ld` anyway on this gen, but still wrong.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-02 11:40:50 +13:00
Chris Forbes
6883c8845d i965/vs: add support for ir_txf_ms on Gen6+
On Gen6, lower this to `ld` with lod=0 and an extra sample_index
parameter.

On Gen7, use `ld2dms`. This takes an additional MCS parameter to support
compressed multisample surfaces, but we're not enabling them for
multisample textures for now, so it's always ignored and can be safely
omitted.

V2: Reworked completely, added support for Gen7.
V3: - Use new sample_index, sample_index_type rather than reusing lod
    - Clarify commit message.
V4: - Fix comment style

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-02 11:40:49 +13:00
Chris Forbes
f52ce6a0ca i965: add a new virtual opcode: SHADER_OPCODE_TXF_MS
This is very similar to the TXF opcode, but lowers to `ld2dms` rather
than `ld` on Gen7.

V4: - add SHADER_OPCODE_TXF_MS to is_tex() functions, so regalloc thinks
      it actually writes the correct number of registers. Otherwise in
      nontrivial shaders some of the registers tend to get clobbered,
      producing bad results.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-03-02 11:40:49 +13:00
Chris Forbes
555dc6d74d i965: take the target into account for Gen7 MSAA modes
Gen7 has an erratum affecting the ld_mcs message, making it unsafe to
use when the surface doesn't have an associated MCS.

From the Ivy Bridge PRM, Vol4 Part1 p77 ("MCS Enable"):

   "If this field is disabled and the sampling engine <ld_mcs>
   message is issued on this surface, the MCS surface may be
   accessed. Software must ensure that the surface is defined
   to avoid GTT errors."

To allow the shader to treat all surfaces uniformly, force UMS if the
surface is to be used as a multisample texture, even if CMS would have
been possible.

V3: - Quoted erratum text

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-03-02 11:39:42 +13:00
Chris Forbes
8cc26ae993 i965: Support multisampling in surface_state for textures
The surface_state setup for renderbuffers already worked; only the
texturing side needed work. BLORP does something similar, but does its
own surface_state setup.

On Gen6, we just need to set the correct sample count.

On Gen7: - set the correct sample count
         - set the correct layout mode
         - set GEN7_SURFACE_ARYSPC_LOD0 if it's set in the miptree.

V2: - Clarify commit message
    - Rebased onto Paul's physical/logical dims cleanup
    - Added Gen7 support

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-03-02 11:35:24 +13:00
Chris Forbes
e62b6a10bc i965: add support for multisample textures
V2: - Fix for state moving from texobj to image
    - Rebased onto Paul's logical/physical cleanup
    - Fixed missing quantization of sample count
    - Fold in IMS renderbuffer wrapper fixes from later in the series
    - Use correct physical slice offset for UMS/CMS surfaces on Gen7

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-03-02 11:35:24 +13:00
Chris Forbes
575d3870bb mesa: implement TexImage*Multisample
V2: - fix formatting issues
    - generate GL_OUT_OF_MEMORY if teximage cannot be allocated
    - fix for state moving from texobj to image

V3: - remove ridiculous stencil hack
    - alter format check to not allow a base format of STENCIL_INDEX
    - allow width/height/depth to be zero, to deallocate the texture
    - don't forget to call _mesa_update_fbo_texture

V4: - fix indentation
    - don't throw errors on proxy texture targets

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2013-03-02 11:35:24 +13:00
Chris Forbes
61d42ffef4 mesa: support multisample textures in framebuffer completeness check
- sample count must be the same on all attachments
- fixedsamplepositions must be the same on all attachments
(renderbuffers have fixedsamplepositions=true implicitly; only
multisample textures can choose to have it false)

V2: - fix wrapping to 80 columns, debug message, fix for state moving
      from texobj to image.
    - stencil texturing tweaks tidied up and folded in here.

V3: - Removed silly stencil hacks entirely; the extension doesn't
      actually make stencil-only textures legal at all.
    - Moved sample count / fixed sample locations checks into
      existing attachment-type-specific blocks, as suggested by Eric

V4: - Removed stencil hacks which were missed in V3 (thanks Eric)
    - Don't move the declaration of texImg; only required pre-V3.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[V2] Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:35:22 +13:00
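The two completeness rules listed in 61d42ffef4 amount to comparing every attachment against the first one. The sketch below uses a hypothetical, simplified attachment struct; it is not Mesa's framebuffer validation code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified description of one framebuffer attachment. */
struct attachment {
   unsigned samples;
   bool fixed_sample_positions; /* implicitly true for renderbuffers */
};

/* True if all attachments agree on sample count and fixed sample positions. */
static bool multisample_state_matches(const struct attachment *att, size_t count)
{
   assert(att != NULL || count == 0);

   for (size_t i = 1; i < count; i++) {
      if (att[i].samples != att[0].samples ||
          att[i].fixed_sample_positions != att[0].fixed_sample_positions)
         return false;
   }
   return true;
}
```

A mix of a renderbuffer and a multisample texture created with fixedsamplepositions set to false fails the second comparison even when the sample counts agree.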
Chris Forbes
032896cbf9 i965: expose sample positions
Moves the definition of the sample positions out of
gen6_emit_3dstate_multisample, and unpacks them in
gen6_get_sample_position.

V2: Be consistent about `sample position` rather than `location`.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:35:20 +13:00
Chris Forbes
569c4a9f1c i965: add support for sample mask on Gen6+
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:35:17 +13:00
Chris Forbes
1822496f3a mesa: implement sample mask
V2: - fix multiline comment style
    - stop using ASSERT_OUTSIDE_BEGIN_END_AND_FLUSH since that
      doesn't exist anymore.

V3: - check for the extension being enabled
    - tidier flagging of _NEW_MULTISAMPLE
    - fix weird indentation in get.c

V4: - move flush later in SampleMaski()

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:35:16 +13:00
Chris Forbes
7c1017e292 mesa: implement GetMultisamplefv
Actual sample locations deferred to a driverfunc since only the driver
really knows where they will be.

V2: - pass the draw buffer to the driverfunc; don't fallback to pixel
      center if driverfunc is missing.
    - rename GetSampleLocation to GetSamplePosition
    - invert y sample position for winsys FBOs, at Paul's suggestion
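
The y inversion in V2 can be sketched as follows, assuming sample
positions are expressed in the [0, 1] range within a pixel. The function
name and the is_winsys_fbo flag are hypothetical, chosen for illustration.

```python
from typing import Tuple


def sample_position_for_query(pos: Tuple[float, float],
                              is_winsys_fbo: bool) -> Tuple[float, float]:
    """Flip y within the pixel when the draw buffer is a window-system FBO."""
    x, y = pos
    if is_winsys_fbo:
        # Window-system buffers are stored upside down relative to
        # user FBOs, so mirror the position around the pixel center.
        y = 1.0 - y
    return (x, y)
```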

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:35:13 +13:00
Chris Forbes
abb5429537 i965: expose new max sample counts
V2: For now, only expose a depth sample count of 1, since there are
possible unresolved interactions with HiZ.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:35:08 +13:00
Chris Forbes
db5d5c30a6 mesa: add new max sample count state
- GL_MAX_COLOR_TEXTURE_SAMPLES
- GL_MAX_DEPTH_TEXTURE_SAMPLES
- GL_MAX_INTEGER_SAMPLES

V2: initialize limits to 1 in _mesa_init_constants as suggested by Brian
and Paul

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:34:58 +13:00
Chris Forbes
ffb53b4f03 glsl: add support for ARB_texture_multisample
V2: - emit `sample` parameter properly for multisample texelFetch()
    - fix spurious whitespace change
    - introduce a new opcode ir_txf_ms rather than overloading the
      existing ir_txf further. This makes doing the right thing in
      the driver somewhat simpler.

V3: - fix weird whitespace

V4: - don't forget to include the new opcode in tex_opcode_strs[]
      (thanks Kenneth for spotting this)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[V2] Reviewed-by: Eric Anholt <eric@anholt.net>
[V2] Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:33:54 +13:00
Chris Forbes
16af0aca09 tests: add ARB_texture_multisample enums to table
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:33:42 +13:00
Chris Forbes
d04a4dd003 mesa: add texobj support for ARB_texture_multisample
Adds the new texture targets, and per-image state for GL_TEXTURE_SAMPLES
and GL_TEXTURE_FIXED_SAMPLE_LOCATIONS.

V2: - Allow multisample texture targets in glInvalidateTexSubImage too.
      This was already partly there, but I missed it the first time around
      since the interaction is defined in a newer extension. Fixed weird
      indentation.
    - Allow multisample array textures in glFramebufferTextureLayer.
      This was overlooked as the tests originally only used 2d
      multisample textures.

V3: - Set min/mag filters sensibly for multisample textures. This
      can't actually be changed by the user, so it's more sensible to
      initialize it correctly than to hack around it being bogus later.

V4: - Tidy up initial min/mag filter setup. Setup in
      _mesa_initialize_texture_object was bogus, but benign since
      finish_texture_init() clobbered everything with correct values. For V4,
      just do the setup in finish_texture_init().

V5: - Don't break glPopAttrib(GL_TEXTURE_BIT)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[V2] Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:33:27 +13:00
Chris Forbes
0f83e415e4 glapi: add ARB_texture_multisample
Adds new enums, dispatch machinery, and stubs for the 4 new entrypoints.

V2: - Drop placeholder
    - Align enum values
    - Remove explicit exec=mesa; it *is* the dispatch flavor we want,
      but it's also the default. I misunderstood how this worked before;
      after actually reading the generator it makes good sense.

V3: - Squash in stubs for new entrypoints, and dispatch_sanity tweaks,
      so we don't get build breakage between those patches.

V4: - Fix various remaining whitespace issues

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[1/3 V2] Reviewed-by: Matt Turner <mattst88@gmail.com>
[V3] Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-02 11:33:20 +13:00
Eric Anholt
c0674fa5cd intel: Use the new "ctx" local variable I just added some more.
Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-01 12:10:22 -08:00
Eric Anholt
e15c21a957 i965: Make sRGB-capable framebuffers by default.
The GLX extension lets you expose visuals that explicitly guarantee you
that the GL_FRAMEBUFFER_SRGB_CAPABLE flag will be set, but we can set
the flag even while the visual doesn't provide the guarantee.  This
appears to be consistent with other implementations, as we've seen
several apps now that don't require an srgb visual and assume sRGB will
work without checking the GL_FRAMEBUFFER_SRGB_CAPABLE flag.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55783
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60633
Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
2013-03-01 12:10:16 -08:00
Eric Anholt
973ddc897d intel: Fix software copying of miptree faces for weird formats.
Now that we have W-tiled S8, we can't just region_map and poke at bits --
there has to be some swizzling.  Rely on intel_miptree_map to get that job
done.  This should also get the highest performance path we know of for the
mapping (interesting if I get around to finishing movntdqa some day).

v2: Fix stale name of the bit in a comment.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-01 11:50:03 -08:00
Eric Anholt
6d6bd2ac7c intel: Add a flag for miptree mapping to disable transcoding.
I want to reuse intel_miptree_map() to replace some region mapping that's
broken for separate stencil, but doing so would result in new demands on
ETC transcode that we actually don't want to happen.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-01 11:50:03 -08:00
Eric Anholt
e63c959451 i965: Add WARN_ONCE for depthstencil workarounds we shouldn't be hitting.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-03-01 11:50:03 -08:00
Alex Deucher
a40ba43d78 r600g: enable CP DMA on 6xx
Tested across several 6xx parts, no piglit regressions.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-03-01 12:11:31 -05:00
Marek Olšák
58bd926d9e r600g: don't require dword alignment with CP DMA for buffer transfers
which is a leftover from the days when we used streamout to copy buffers

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-03-01 13:46:32 +01:00
Marek Olšák
89e2898e9e r600g: always map uninitialized buffer range as unsynchronized
Any driver can implement this simple and efficient optimization.
Team Fortress 2 hits it always. The DISCARD_RANGE codepath is not even used
with TF2 anymore, so we avoid a ton of useless buffer copies.
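
A rough sketch of the idea, under the assumption that the driver tracks a
single conservative byte range that has ever been written. ValidRange and
can_map_unsynchronized() are hypothetical names, not the driver's API.

```python
class ValidRange:
    """Half-open [start, end) byte range known to hold initialized data."""

    def __init__(self) -> None:
        self.start = 0
        self.end = 0  # start == end means "nothing written yet"

    def add(self, start: int, end: int) -> None:
        """Grow the tracked range so that it also covers [start, end)."""
        if end <= start:
            return
        if self.end <= self.start:
            self.start, self.end = start, end
        else:
            self.start = min(self.start, start)
            self.end = max(self.end, end)

    def intersects(self, start: int, end: int) -> bool:
        return max(self.start, start) < min(self.end, end)


def can_map_unsynchronized(valid: ValidRange, offset: int, size: int) -> bool:
    """A write to bytes nobody has initialized cannot race with the GPU."""
    return not valid.intersects(offset, offset + size)
```

Tracking one merged range is deliberately conservative: a gap between two
written regions is treated as initialized, which costs a synchronization
but never skips one that is needed.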

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>

NOTE: This is a candidate for the 9.1 branch.
2013-03-01 13:46:32 +01:00
Marek Olšák
44f37261fc gallium/util: add helper code for 1D integer range
Reviewed-by: Brian Paul <brianp@vmware.com>

v2: cosmetic changes based on Brian's review

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>

NOTE: This is a candidate for the 9.1 branch. (the next patch depends on it)
2013-03-01 13:46:32 +01:00
Marek Olšák
8f192a3c9e r600g: cleanup deprecated register tables
These registers are either already emitted elsewhere or moved to start_cs.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-03-01 13:46:32 +01:00
Marek Olšák
f0636bc982 r600g: unify vgt states
The states were split because we thought it caused a hardlock. Now we know
the hardlock was caused by something else and has since been fixed.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-03-01 13:46:32 +01:00
Marek Olšák
e5a250fdf9 r600g: flush and invalidate htile cache when appropriate
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>

NOTE: This is a candidate for the 9.1 branch.
2013-03-01 13:46:32 +01:00
Marek Olšák
6f25de6711 r600g: atomize streamout enabling
This doesn't fix any issue we know of, but there indeed is a weak spot
in draw_vbo where streamout can fail. After streamout is enabled,
the need_cs_space call can flush the context, which causes the streamout
to be disabled right after it was enabled and bad things happen.

One way to fix it is to atomize the beginning part, so that no context flush
can happen between streamout enabling and the first drawing.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-03-01 13:46:32 +01:00
Marek Olšák
9dd18f43a4 r600g: use async DMA with a non-zero src offset
probably a typo

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>

NOTE: This is a candidate for the 9.1 branch.
2013-03-01 13:46:32 +01:00
Marek Olšák
c77917d35f r600g: pad the DMA CS to a multiple of 8 dwords
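
The padding arithmetic can be sketched like this. The helper names and
the NOP placeholder value are assumptions made for illustration; the real
packet encoding is not shown in this log.

```python
from typing import List

HYPOTHETICAL_NOP = 0  # placeholder dword, not the real DMA NOP encoding


def padded_length(num_dwords: int, align: int = 8) -> int:
    """Round num_dwords up to the next multiple of align."""
    return (num_dwords + align - 1) // align * align


def pad_command_stream(cs: List[int], align: int = 8) -> List[int]:
    """Return a copy of cs extended with NOPs to an aligned length."""
    return cs + [HYPOTHETICAL_NOP] * (padded_length(len(cs), align) - len(cs))
```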
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>

NOTE: This is a candidate for the 9.1 branch.
2013-03-01 13:46:32 +01:00
Jordan Justen
782d4f0f3c intel: Enable __DRI_API_OPENGL_CORE api with dri2 contexts
Without this set, dri_util.c:dri2CreateContextAttribs
will reject requests to create a context with
__DRI_API_OPENGL_CORE.

This prevents a 3.2 core profile context from being created
even when MESA_GL_VERSION_OVERRIDE=3.2 is used.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 21:51:00 -08:00
Jordan Justen
fde59a27fb intel: update max versions based on MESA_GL_VERSION_OVERRIDE
If the override version is >= 3.1, then update the
max_gl_core_version. Otherwise, update max_gl_compat_version.
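
The selection rule can be sketched as below. This assumes the override
string has the plain "MAJOR.MINOR" form; parse_override() and
apply_override() are hypothetical helpers, not Mesa functions.

```python
from typing import Optional, Tuple

Version = Tuple[int, int]


def parse_override(text: Optional[str]) -> Optional[Version]:
    """Parse "3.2" into (3, 2); return None when no override is set."""
    if not text:
        return None
    major, minor = text.split(".")
    return int(major), int(minor)


def apply_override(override: Optional[Version],
                   max_core: Version,
                   max_compat: Version) -> Tuple[Version, Version]:
    """Route the override to the core or the compat maximum."""
    if override is None:
        return max_core, max_compat
    if override >= (3, 1):
        return override, max_compat  # 3.1 and later: core profile limit
    return max_core, override        # earlier versions: compat limit
```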

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 21:50:56 -08:00
Jordan Justen
c4e059a359 mesa version: add _mesa_get_gl_version_override
This will allow other code to get access to the override
version before a context is available.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 21:50:50 -08:00
Jordan Justen
500b69e797 glsl: allow GLSL compiler version to be overridden to 1.50
Although GLSL 1.50 compiler support is not available,
this change will allow MESA_GLSL_VERSION_OVERRIDE=150 to be
used while 1.50 support is being developed.

Since no drivers claim 1.50 GLSL support, this change should
only impact Mesa when MESA_GLSL_VERSION_OVERRIDE=150 is set.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 21:49:59 -08:00
Matt Turner
4154ac066f i965/fs: Put immediate operand as src2
Immediate operands can only be src2 in 2-source instructions. Fixes
piglit failures since 0a1d145e (oops!).

Spotted-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-28 16:29:30 -08:00
Chad Versace
809fdc211f intel: Remove intel_mipmap_tree::wraps_etc
The field was equivalent to (etc_format != MESA_FORMAT_NONE), and
therefore duplicate information.

This patch removes the field and replaces all references to it with
`etc_format != MESA_FORMAT_NONE`.

No Piglit ETC test regresses on Intel Sandybridge.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-02-28 15:22:41 -08:00
Matt Turner
c001985cbf ir_to_mesa: Translate ir_triop_lrp to OPCODE_LRP.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:19:00 -08:00
Matt Turner
428503fcdf i965/vs: Assert that ir_triop_lrp was lowered.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:19:00 -08:00
Matt Turner
f78a7ff6b2 i965/fp: Use the LRP instruction for OPCODE_LRP.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:19:00 -08:00
Kenneth Graunke
0a1d145e5f i965/fs: Use the LRP instruction for ir_triop_lrp when possible.
v2 [mattst88]:
   - Add BRW_OPCODE_LRP to list of CSE-able expressions.
   - Fix op_var[] array size.
   - Rename arguments to emit_lrp to (x, y, a) to clear confusion.
   - Add LRP function to brw_fs.cpp/.h.
   - Corrected comment about LRP instruction arguments in emit_lrp.
v3 [mattst88]:
   - Duplicate MAD code for LRP instead of using a function pointer.
   - Check for != GRF instead of == IMM in emit_lrp.
   - Lower LRP on gen < 6.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:19:00 -08:00
Kenneth Graunke
015a48743d i965: Add support for emitting the LRP instruction.
Like MAD, this is another three-source instruction.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:18:59 -08:00
Matt Turner
af2c64063e glsl: Optimize ir_triop_lrp(x, y, a) with a = 0.0f or 1.0f
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:18:59 -08:00
Kenneth Graunke
93066ce129 glsl: Convert mix() to use a new ir_triop_lrp opcode.
Many GPUs have an instruction to do linear interpolation which is more
efficient than simply performing the algebra necessary (two multiplies,
an add, and a subtract).

Pattern matching or peepholing this is more desirable, but can be
tricky.  By using an opcode, we can at least make shaders which use the
mix() built-in get the more efficient behavior.

Currently, all consumers lower ir_triop_lrp.  Subsequent patches will
actually generate different code.
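
For reference, the algebra that mix() stands for, plus the trivial folds
for a weight of exactly 0 or 1, can be sketched as follows. Both function
names and the tuple used to represent an unfolded operation are
illustrative assumptions, not the compiler's IR.

```python
def mix_arith(x: float, y: float, a: float) -> float:
    """mix() spelled out: two multiplies, one add, one subtract."""
    return x * (1.0 - a) + y * a


def simplify_lrp(x, y, a):
    """Fold lrp(x, y, a) when the weight is exactly 0.0 or 1.0."""
    if a == 0.0:
        return x
    if a == 1.0:
        return y
    # Otherwise keep a single three-operand operation for the backend.
    return ("lrp", x, y, a)
```

Hardware with a native linear-interpolation instruction can replace the
four arithmetic operations in mix_arith() with one instruction.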

v2 [mattst88]:
   - Add LRP_TO_ARITH flag to ir_to_mesa.cpp. Will be removed in a
     subsequent patch and ir_triop_lrp translated directly.
v3 [mattst88]:
   - Move changes from the next patch to opt_algebraic.cpp to accept
     3-src operations.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:18:59 -08:00
Kenneth Graunke
18281d6088 glsl: Rework ir_reader to handle expressions with three operands.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:18:59 -08:00
Kenneth Graunke
1afd33ec05 glsl: Consolidate ir_expression constructors that use explicit types.
Previously, we had separate constructors for one, two, and four operand
expressions.  This patch consolidates them into a single constructor
which uses NULL default parameters.

The unary and binary operator constructors had assertions to verify that
the caller supplied the correct number of operands for the expression,
but the four-operand version did not.  Since get_num_operands for
ir_quadop_vector returns the number of vector_elements, we can safely
add that without breaking the semantics of ir_quadop_vector.

This also paves the way for expressions with three operands.  Currently,
none can be constructed since get_num_operands() never returns 3.
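
The shape of the consolidated constructor can be sketched like this. It
is an illustration only: the operation table is a made-up subset, the
class is hypothetical, and a ValueError stands in for the assertions
described above.

```python
_FIXED_OPERAND_COUNTS = {"unop_neg": 1, "binop_add": 2}  # hypothetical subset


def get_num_operands(op: str, vector_elements: int = 1) -> int:
    """Operand count for op; the vector op takes one operand per element."""
    if op == "quadop_vector":
        return vector_elements
    return _FIXED_OPERAND_COUNTS[op]


class Expression:
    """One constructor for 1 to 4 operands, using None as the default."""

    def __init__(self, op, op0, op1=None, op2=None, op3=None,
                 vector_elements=1):
        self.op = op
        self.operands = [op0, op1, op2, op3]
        expected = get_num_operands(op, vector_elements)
        supplied = sum(o is not None for o in self.operands)
        leading = all(o is not None for o in self.operands[:expected])
        if supplied != expected or not leading:
            raise ValueError(
                f"{op} expects {expected} operand(s), got {supplied}")
```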

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-28 13:18:59 -08:00
Matt Turner
f0213b1242 i965/vs/gen7: Allow MATH instructions to have MRF as a destination
total instructions in shared programs: 346873 -> 346847 (-0.01%)
instructions in affected programs:     364 -> 338 (-7.14%)

(All affected shaders are from Lightsmark)

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-28 13:18:59 -08:00
Matt Turner
4eeb9ded9d i965/fs/gen7: Allow MATH instructions to have MRF as a destination
total instructions in shared programs: 1376297 -> 1375626 (-0.05%)
instructions in affected programs:     35977 -> 35306 (-1.87%)

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-28 13:18:59 -08:00
Matt Turner
d5c3aa89dc i965/gen7: Relax restrictions on fake MRFs
Gen6 has write-only MRF registers, and for ease of implementation we
partition off 16 general purpose registers to act as MRFs on Gen7.

Knowing that our Gen7 MRFs are actually GRFs, we can do things we can't
do with real MRFs:
   - read from them;
   - return values directly to them from a send instruction; and
   - compute directly to them with math instructions.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-28 13:18:59 -08:00
Matt Turner
b9f6795e34 i965/fs: Remove duplicate scan_inst->mlen check
Is already checked 20 lines below.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-28 13:18:59 -08:00
Tom Stellard
aa1c734b3c clover: Fix build with LLVM 3.3 v2
v2:
  - Fix order that the clang libraries are passed to the linker to avoid
    missing symbol errors.

Acked-by: Francisco Jerez <currojerez@riseup.net>
2013-02-28 16:01:23 -05:00
Jordan Justen
6f1538f8b4 attrib: push/pop FRAGMENT_PROGRAM_ARB state
This requirement was added by ARB_fragment_program

When the Steam overlay is enabled, this fixes:
* Menu corruption with the Puddle game
* The screen going black on Rochard when
  the Steam overlay is accessed

NOTE: This is a candidate for the 9.0 and 9.1 branches.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-28 09:29:45 -08:00
Keith Kriewall
efd8311a54 scons: Fix Windows build with LLVM 3.2
Fixes fdo bug 61299

NOTE: This is a candidate for the stable branches.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2013-02-28 15:40:02 +00:00
Adam Sampson
2506b03503 autotools: oprofilejit should be included in the list of LLVM components required
NOTE: This is a candidate for the stable branch.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2013-02-28 15:37:09 +00:00
Jerome Glisse
6bc7605745 r600g: workaround hyperz lockup on evergreen
This workaround disables hyperz if writes to the zbuffer are disabled. Somehow
using hyperz when not writing to the zbuffer triggers a GPU lockup. See:

https://bugs.freedesktop.org/show_bug.cgi?id=60848

Candidate for 9.1

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-02-28 09:48:05 -05:00
Jordan Justen
c6ae10887e texobj: add verbose api trace messages to several routines
Motivated by wanting to see if GenTextures was called by an
application while debugging another Steam overlay issue.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-27 23:02:12 -08:00
Roland Scheidegger
c8eb2d0e82 llvmpipe: check buffers in llvmpipe_is_resource_referenced.
Now that buffers can be used as textures or render targets
make sure they aren't skipped.

Fix suggested by Jose Fonseca.

v2: added a couple of assertions so we can actually guarantee
we check the resources and don't skip them. Also added some comments
that this is actually a lie due to the way the opengl buffer api works.
2013-02-28 03:39:54 +01:00
Roland Scheidegger
686f6c69bd llvmpipe: support rendering to buffer render targets.
Unfortunately not usable from OpenGL, and no cap bit.
Pretty similar to a 1d texture, though allows specifying a start element.

v2: also fix up renderbuffer width (which will get promoted to fb width)
to be the number of elements

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-28 03:39:54 +01:00
Roland Scheidegger
2fcd3638be util: fix issues with util_clear_render_target.
For PIPE_BUFFER we need coord adjustments for the transfer.
And for pure integer formats util_pack_color just crashes,
need to handle that differently due to clear colors being ints/uints.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-28 03:39:53 +01:00
Roland Scheidegger
6b35c2b110 softpipe/draw/tgsi: simplify driver/tgsi sampler interface
Use a single sampler adapter instead of per-sampler-unit samplers,
and just pass along texture unit and sampler unit in the calls.
The reason is that for dx10-style sample opcodes pre-wired
samplers including all the texture state aren't really feasible (and for
sample_i/sviewinfo we don't even have samplers).
Of course right now softpipe doesn't actually do anything more than
just look up all its pre-wired per-texunit/per-samplerunit sampler as
it did before so this doesn't really achieve much except one more
function call, however this is now all softpipe's fault (fixing that in
a way which doesn't suck is still an unsolved problem).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-28 03:39:53 +01:00
Maxence Le Doré
0845d16976 gallivm: fix mis-matching AOS instruction emission
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2013-02-27 20:23:01 +00:00
Jon TURNEY
f816a9f522 glx: Fix glXCreateWindow() when GLX_DIRECT_RENDERING is undefined
glXCreateWindow() and glXCreatePbuffer() always fail when built without
GLX_DIRECT_RENDERING defined since commit 48331047.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
2013-02-27 13:36:19 -05:00
Francisco Jerez
4deefd9ba6 configure.ac: Clarify the description of the --with-opencl-libdir parameter a little.
https://bugs.freedesktop.org/show_bug.cgi?id=61415

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
2013-02-27 12:27:13 +01:00
Vinson Lee
f987d23b28 radeonsi: Fix memory leak in si_set_constant_buffer.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-02-26 20:03:11 -08:00
Vinson Lee
f88ed1658c st/vega: Fix memory leak in combine_shaders.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-26 20:01:58 -08:00
Kristian Høgsberg
112ccfab44 egl/wayland: Don't block on EGL_DEFAULT_DISPLAY under wayland
Normally the application will own the main event queue and be responsible
for moving events.  In case of EGL_DEFAULT_DISPLAY, EGL opens the display
and has to own the main queue so it can move the events itself.
Call wl_display_dispatch_pending() to take ownership.
2013-02-26 12:49:49 -05:00
Ian Romanick
68a147e9a9 egl: Allow 24-bit visuals for 32-bit RGBA8888 configs
Previously only the 32-bit X visual would match the 32-bit RGBA8888
configs.  This resulted in every config with alpha getting the "magic"
visual whose alpha is used by the compositor.  This also resulted in no
multisample visuals being advertised.  How many ways could we lose?

This patch inverts the problem... now you can't get the visual with
alpha used by the compositor even if you want it.  I think we need to
invent a new value for EGL_TRANSPARENT_TYPE that apps can use to get
this.  I'm surprised that there isn't already a choice for
EGL_TRANSPARENT_ALPHA.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Tian Ye <yex.tian@intel.com>
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59783
2013-02-26 09:42:31 -08:00
Brian Paul
e2148ab043 st/mesa: remove some conditionals in update_raster_state()
Just use simple assignments.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-26 09:16:52 -07:00
Alex Deucher
e5e4c07e79 r600g: add missing emit_flush for R600_CONTEXT_FLUSH_AND_INV case
We set the cp_coher_cntl bits but never emit them.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-26 10:30:26 -05:00
Alex Deucher
d54bc5d227 r600g: synchronize streamout buffers on r6xx too (v3)
Streamout buffers need to be synchronized on r6xx as
well.

v2: Add DEST flush as well.
v3: drop DEST flush

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-26 10:30:10 -05:00
Brian Paul
62329d77b8 winsys/null: fix var typo templet->templat 2013-02-26 08:20:16 -07:00
Brian Paul
02bf645111 svga: fix comment typos 2013-02-26 08:20:16 -07:00
Marek Olšák
d8d58bdcb9 r300g: implement 3D transfers
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=61351
2013-02-26 01:14:20 +01:00
Marek Olšák
3857f450a6 gallium/util: add helper util_max_layer from r600g 2013-02-26 01:14:05 +01:00
Roland Scheidegger
52c44cee1e llvmpipe: (trivial) get rid of old function prototypes.
llvmpipe_init_screen/context_texture_funcs have long been replaced
with the respective "resource" funcs.
2013-02-25 20:38:23 +01:00
Roland Scheidegger
c0ba1080df draw: make sure pipeline is revalidated when sampler views or samplers change.
Since with llvm execution parts of sampler view and sampler state are baked into
the shader, we need to revalidate otherwise the wrong shader might get used.
(Not completely sure but I think this would not be required for non-llvm case,
along with everything else in these functions.)
This caused bugs in piglit arb_texture_buffer_object-formats, because we never
noticed that the view format changed.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-25 20:38:23 +01:00
Roland Scheidegger
20183177a5 llvmpipe: support GL_ARB_texture_buffer_object/GL_ARB_texture_buffer_range
This also fixes not honoring first/last_layer view parameters for array
textures, plus not honoring last_level view parameter for all textures
(neither is really used by OpenGL).
This mostly passes piglit arb_texture_buffer_object tests (it needs, however,
glsl 140 version override, plus GL 3.1 override, the latter only because
mesa does not allow ARB_tbo in non-core contexts).
Most arb_texture_buffer_object tests pass, with the exception of
arb_texture_buffer_object-formats. With "arb" parameter it passes most weirdo
formats before it segfaults in the state tracker, this looks to be some issue
with using legacy formats in core context (fails the same in softpipe).
With "core" parameter it passes with "fs", however fails with "vs" (for most
formats). This will be fixed later (debugging shows we're completely missing
the shader recompile depending on format).

v2: based on Jose's feedback, fix comments, variable/function names.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-25 20:38:23 +01:00
Eric Anholt
50a5d5dea0 i965: Fix the W value of deprecated pointcoords on pre-gen6.
When you didn't have a texcoord array bound (or a non-1 current w
attrib), we were telling the fragment shader that it could just use "1"
instead of doing expensive pre-gen6 math to invert it.  If you drew the
point with a non-1 W value, then you'd get the right size (since all the
vertex computations worked), but we'd mis-interpolate the coordinate
across the face.

Fixes the mesa pointsprite demo on GM45.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30232
Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Note: This is a candidate for the stable branches.
2013-02-25 11:21:44 -08:00
Tapani Pälli
3cdb548bfb mesa/es: NULL check in EGLImageTargetTexture2DOES
check that pointer passed is valid and return error if not.

Note: This is a candidate for the stable branches.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-02-25 09:17:31 -08:00
Tapani Pälli
331967c773 mesa: add missing case in _mesa_GetTexParameterfv()
missing case GL_REQUIRED_TEXTURE_IMAGE_UNITS_OES is required
by OES_EGL_image_external extension.

Note: This is a candidate for the stable branches.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-02-25 09:17:20 -08:00
Andreas Boll
533dc3b690 docs: add news item for mesa-demos 8.1.0 release 2013-02-25 11:31:08 +01:00
Andreas Boll
d209926666 docs: import release notes for 9.1, add news item 2013-02-25 10:47:02 +01:00
Jordan Justen
0486d50320 glsl: Remove VS output varyings which are optimized out of the FS
Previously when an input varying was optimized out of the
FS we would still retain it as an output of the VS.

We now build a hash of live FS input varyings rather
than looking in the FS symbol table. (The FS symbol table
will still contain the optimized out varyings.)
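
The pruning step can be sketched as a set lookup. The function name and
the always_keep parameter (for outputs such as gl_Position that the
fragment shader never reads as an input) are assumptions made for this
illustration.

```python
from typing import Iterable, List


def prune_vs_outputs(vs_outputs: Iterable[str],
                     fs_live_inputs: Iterable[str],
                     always_keep: Iterable[str] = ("gl_Position",)) -> List[str]:
    """Keep only VS outputs that a live FS input still consumes."""
    live = set(fs_live_inputs) | set(always_keep)
    return [name for name in vs_outputs if name in live]
```

Building the set from the inputs that survived optimization, rather than
from the symbol table, is what lets dead varyings drop out.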

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-23 16:20:28 -08:00
Vinson Lee
f6487e8911 vl: Fix off-by-one error in device_name_length allocation.
Fixes out-of-bounds write reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel@daenzer.net>
2013-02-23 14:57:05 -08:00
John Kåre Alsaker
65aa1a194d llvmpipe: Fix creation of shared and scanout textures.
NOTE: This is a candidate for the stable branches.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2013-02-23 18:36:58 +00:00
José Fonseca
fdb88967e3 util/u_blitter: Set pipe_sampler_state::normalized_coords correctly.
We might want to revisit the normalized_coords semantics, but this is
the current expected behavior.

Fixes fdo bug 61091.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-23 18:36:57 +00:00
Brian Paul
2557d3f9c3 svga: remove some extraneous whitespace 2013-02-23 08:20:36 -07:00
Brian Paul
840d6faf68 st/mesa: fix debug_printf() format string warning
Use %td for ptrdiff_t (aka GLsizeiptrARB).
2013-02-23 08:20:36 -07:00
José Fonseca
0d760a8160 util/dump: Use static assertion to detect string table size mismatches.
Suggested by Brian Paul.

Could probably be extended to other enums.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-23 13:32:34 +00:00
Vinson Lee
2fa9e4c97c st/xvmc/tests: Ensure colorkey is initialized.
Fixes uninitialized scalar variable defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-02-22 19:32:00 -08:00
Vinson Lee
54afbce934 st/vdpau: Fix memory leak in vlVdpBitmapSurfaceCreate.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-02-22 19:30:03 -08:00
Vinson Lee
1bac4a1e6f st/vdpau: Fix memory leak in vlVdpOutputSurfaceCreate.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-02-22 19:29:56 -08:00
Tapani Pälli
b4dba5bba2 glapi: mark static_dispatch false for DiscardFramebufferEXT
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61199
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Brad King <brad.king@kitware.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-02-22 17:18:08 -08:00
Brian Paul
b804fb8714 llvmpipe: rename polygon offset fields to something more specific
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-22 16:49:05 -07:00
Brian Paul
f93c580063 llvmpipe: add missing checks for polygon offset point/line modes
The llvm pipeline handles regular filled triangle offsets, but it
doesn't handle offsets for triangles drawn in point or line mode.

Fixes failures found with new piglit polygon-mode-offset test.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-22 16:49:05 -07:00
Brian Paul
d6b8b116ee draw: fix broken polygon offset stage
There were several issues.  We weren't handling different front/back
polygon fill modes.  We weren't checking whether the offset applied to
fill mode vs. line mode vs. point mode.

Fixes problems found with the Visualization Toolkit (VTK) test suite.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-22 16:49:05 -07:00
Brian Paul
a2c105e31e st/mesa: fix polygon offset state translation logic
The old logic was kind of twisted, but seemed to work in practice.

Note: This is a candidate for the stable branches.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-02-22 16:49:05 -07:00
Brian Paul
8bb291b0f5 st/mesa: check for dummy programs in destroy_program_variants()
When we destroy an ARB vp/fp whose ID was gen'd but not otherwise used we
get a pointer to the dummy/placeholder program.  We can't destroy that one
so just skip it.  This only failed during context tear-down because
glDeleteProgramsARB() was already aware of dummy programs.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=38086

Note: This is a candidate for the stable branches.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-02-22 16:49:05 -07:00
Brian Paul
8589cc41b3 st/mesa: fix trimming of GL_QUAD_STRIP
We sometimes convert GL_QUAD_STRIP prims into GL_TRIANGLE_STRIP, but
that changes the results of the u_trim_pipe_prim() call.  We need to
pass the original primitive type to the trim function.

Note that OpenGL's GL_x prim type values match Gallium's PIPE_PRIM_x values.

Fixes a failure in the new piglit degenerate-prims test.

Note: This is a candidate for the stable branches.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-02-22 16:49:05 -07:00
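To see why 8589cc41b3 needs the original primitive type, consider a simplified trim routine. This is a sketch with hypothetical names (`enum prim`, `trim_prim`), not Mesa's `u_trim_pipe_prim()`; the minimum/increment pairs are the usual ones for these two primitive types.

```c
#include <stdbool.h>

/* Hypothetical enum; these are NOT the real PIPE_PRIM_x values. */
enum prim { PRIM_TRIANGLE_STRIP, PRIM_QUAD_STRIP };

/* Trims *nr down to a vertex count the primitive can fully consume.
 * Returns false (and sets *nr to 0) if nothing can be drawn. */
static bool trim_prim(enum prim p, unsigned *nr)
{
    unsigned min, incr;

    switch (p) {
    case PRIM_TRIANGLE_STRIP: min = 3; incr = 1; break; /* 3, 4, 5, ... */
    case PRIM_QUAD_STRIP:     min = 4; incr = 2; break; /* 4, 6, 8, ... */
    default: *nr = 0; return false;
    }

    if (*nr < min) {
        *nr = 0;
        return false;
    }
    *nr -= (*nr - min) % incr;  /* drop any incomplete trailing primitive */
    return true;
}
```

With 5 vertices, a quad strip trims to 4 while a triangle strip keeps all 5; with 3 vertices, the quad strip is degenerate but the triangle strip is not. Trimming after the conversion therefore draws geometry the application never asked for.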
Alex Deucher
8b5acad0e9 r600g: fixup PS_PARTIAL_FLUSH flag handling for cayman
So we don't emit it twice if we ever use the flag on
cayman.

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-02-22 18:43:27 -05:00
Alex Deucher
8442b67f5f r600g: r6xx deadlock workaround (v6)
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=50655
https://bugs.freedesktop.org/show_bug.cgi?id=47116

v2: flush along with workaround.
v3: just need a flush
v4: try WAIT_UNTIL
v5: switch to PS partial flush
v6: rework patch

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-02-22 18:23:46 -05:00
Alex Deucher
7ebf83f109 r600g: add PS_PARTIAL_FLUSH flag
PS_PARTIAL flushes seem to be required in certain
cases to prevent hangs, especially on r6xx.

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-02-22 18:23:31 -05:00
Ian Romanick
7ae6864f0d i965: Enable OpenGL ES 3.0 on Sandy Bridge
Regardless of what we put in the screen structure, all of the extensions
that compute_version_es2 checks are present and 3.0 will be exposed
anyway.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-22 13:57:44 -08:00
Lauri Kasanen
0a82828ad5 configure: Fix build with automake < 1.11
Commit 86d30dea3c broke building with older
automake versions with this error:

Makefile:769: *** Recursive variable am__v_YACC_ references itself (eventually).  Stop.

This patch fixes it. Fix stolen from xorg-macros.

Signed-off-by: Lauri Kasanen <cand@gmx.com>
2013-02-22 13:15:14 -08:00
Anuj Phogat
cff862f90d meta: Allocate texture before initializing texture coordinates
tex->Sright and tex->Ttop are initialized during texture allocation.
This fixes depth buffer blitting failures in khronos conformance tests
when run on desktop GL 3.0.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=59495

Note: This is a candidate for stable branches.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-02-22 12:03:59 -08:00
Eric Anholt
92a204b493 mesa: Fix setup of ctx->Point.PointSprite for GLES2.
The recent change for GL core broke the older setup, which broke
gl_PointCoord on pre-gen6 (where gl_PointCoord is undefined if point
sprites are disabled).  Fixes the new piglit GLES-2.0/glsl-fs-pointcoord
test.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32429
Note: This is a candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-22 10:55:39 -08:00
Eric Anholt
7b0731d940 i965/fs: Fix broken math on values loaded from uniform buffers on gen6.
In a debug build this led to assertion failures, but on a non-debug
build the hardware would just reference the whole vec8 instead of the
same channel 8 times.

Fixes the new piglit glsl-1.40/uniform-buffer/fs-exp2.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57121
Note: This is a candidate for the stable branches
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-22 10:50:50 -08:00
José Fonseca
cd01cc3b48 tgsi: Improve execution debugging.
- zero temps/outputs instead of copying (otherwise we won't be able to see
  the temps/outputs assignments for small shaders where nothing changes
  across big areas)

- also show the inputs (as it's often impossible to infer from the rest)

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-22 16:19:58 +00:00
José Fonseca
f8436c17e4 util/u_dump: Update texture target strings. 2013-02-22 16:19:58 +00:00
Sergey Matyukevich
21e8af0b09 util/debug: Always use __builtin_frame_address on gcc.
Should work around fdo bug 57563.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2013-02-22 16:19:58 +00:00
Michel Dänzer
f6b40ddd2d radeon/llvm: Remove stale comment about radeon_llvm_emit_prepare_cube_coords 2013-02-22 13:06:07 +01:00
Marek Olšák
aac8138744 r600g: fix random corruption with CP DMA in TF2
NOTE: This is a candidate for the 9.1 branch.
2013-02-22 12:49:15 +01:00
Michel Dänzer
3447cc4856 radeonsi: Don't pretend there is any R8G8B8 support
The hardware can't do it.
2013-02-22 11:44:24 +01:00
Andreas Boll
c1f2c3a80f llvmpipe/build: add DLOPEN_LIBS and PTHREAD_LIBS to the lp_test_* targets
Fixes undefined symbols.

NOTE: This is a candidate for the 9.1 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61052
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-22 10:21:43 +01:00
Andreas Boll
c1eb585f3d targets/xa-vmwgfx: Force c++ linker to fix undefined symbols
NOTE: This is a candidate for the 9.1 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61200
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-22 10:21:43 +01:00
Roland Scheidegger
b6f15954b4 llvmpipe: Fix rendering into PIPE_FORMAT_X8*_UNORM.
Mesa state tracker recently started using PIPE_FORMAT_X8B8G8R8_UNORM,
causing segfaults in texture-packed-formats, because swizzle[chan] was
0xff for padding channel (X).

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2013-02-22 09:00:45 +00:00
José Fonseca
8ed1279b10 trace: Never close stdout/stderr.
This could happen when a trace screen was destroyed and then recreated.
2013-02-22 08:45:07 +00:00
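The message for 8ed1279b10 does not spell out the fix, so the following is only a guess at the general shape of such a guard. `close_trace_stream` is a hypothetical name, not a function from the trace driver.

```c
#include <stdio.h>

/* Hypothetical guard (assumption, not the trace driver's code): closes a
 * dump stream unless it is one of the process-wide standard streams.
 * Returns 1 if the stream was closed, 0 if it was left open. */
static int close_trace_stream(FILE *stream)
{
    if (stream == NULL || stream == stdout || stream == stderr)
        return 0;   /* never close the standard streams */
    fclose(stream);
    return 1;
}
```

Once stdout or stderr has been closed, a later screen that tries to dump to it has nothing valid to write to, which matches the destroy-then-recreate scenario described in the commit.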
José Fonseca
59025d6e95 trace: Fix set_constant_buffer dumping.
We were dumping the trace driver pointer, instead of the pointer from the
underlying pipe driver.
2013-02-22 08:40:47 +00:00
Vinson Lee
b92984b2fa r600g: Fix memory leak in r600_shader_select.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reported-by: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-21 21:49:24 -08:00
Roland Scheidegger
66c3cd0be3 llvmpipe: simplify buffer allocation logic.
Now that buffer formats have been clarified, we don't need all that logic any
longer.
(Note that it never would have worked in any case: because blockwidth and
blockheight were swapped, any allocation with a multi-byte format would have
had zero size.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-22 04:34:07 +01:00
Roland Scheidegger
2cfee2295f gallium/docs: improve text about resources a bit.
This clarifies some things and gets rid of some old stuff.
The most significant one is probably that buffers cannot have formats
(nearly all drivers completely ignored format and used width0 as byte size
already in any case). There seems to be no use case for "structured" buffers.
(Note while d3d11 has new Structured Buffers, these still aren't associated
with a format, rather a byte stride, which we can't do yet either way.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-22 04:34:07 +01:00
Roland Scheidegger
f972567671 draw: make sure key size is calculated consistently.
Some parts calculated key size by using shader information, others by using
the pipe_vertex_element information. Since it is perfectly valid to have more
vertex_elements set than the vertex shader is using those may not be the same,
so we weren't copying over all vertex_element state - this caused the tgsi dump
to assert (iterates over all vertex elements). More importantly in this
situation it would also break vertex texturing completely (since the sampler
state derived from the key is at a different position than expected).
Fix this by deriving key->nr_vertex_elements from the shader information
instead of the pipe_vertex_element state (unlike dx10, we can't have "holes"
in pipe_vertex_element state, so this should be safe).
(Note that actual llvm shader generation does not use the pipe_vertex_element
state from the key itself in any case (although I guess it could) but uses
the one from draw.pt (which should be the same though contains all elements)
instead.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-22 04:34:07 +01:00
Tom Stellard
10bcc843f8 r300g/compiler: Fix bug in OMOD folding
The OMOD value was only being folded to one instruction in cases where
the MUL instruction was reading a value written by more than one
instruction.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-21 22:07:28 -05:00
Tom Stellard
5e1321ddf4 r300g/tests: Add helper functions for creating a full program
Now you can convert assembly strings into a full struct radeon_compiler
object and use it to test individual compiler passes.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-21 22:07:27 -05:00
Tom Stellard
bcf2e157ca r300g/tests: Exit test runner with a valid status code
This way make check can report whether or not the tests pass.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-21 22:07:27 -05:00
Tom Stellard
5355fc1e87 r300g/compiler: Make r300_vertprog_swizzle_caps visible in other files
This will be used by the test suite in later commits.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-21 22:07:27 -05:00
Tom Stellard
c3df498ff9 r300g/compiler: Fix typo in comment
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-21 22:07:27 -05:00
Tom Stellard
27d140b960 r300g/compiler: Add missing license headers
These are all files that I authored, but forgot to add the license
headers.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-21 22:07:27 -05:00
Carl Worth
f5a8084692 i965: Avoid segfault in gen6_upload_state
This fixes a bug introduced in commit 258453716f and
triggered whenever "rb" is NULL.

Fixes at least one cause of bug #59445:

	[SNB/IVB/HSW Bisected]Oglc draw-buffers2(advanced.blending.none) segfault
	https://bugs.freedesktop.org/show_bug.cgi?id=59445

(Segfaults are still possible in that test case, but they have been
present since before commit 258453716f, which is what's being fixed here.)

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-21 12:09:24 -08:00
Alex Deucher
2e4ef989a2 r600g: don't enable ReZ mode on evergreen
Can cause lockups in certain cases when
zfunc/zenable/zwrite change without a flush
in between.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=60969
and lockups on Civ4 with wine.

This is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-21 11:59:07 -05:00
Andreas Boll
f7d87332b0 docs: import release notes for 9.0.3, add news item 2013-02-21 17:31:42 +01:00
Michel Dänzer
b63b3012c9 radeonsi: Don't match TGSI_SEMANTIC_POSITION fs inputs to vs outputs 2013-02-21 10:07:18 +01:00
Michel Dänzer
954bc4ac34 radeonsi: Fix w component of TGSI_SEMANTIC_POSITION fragment shader inputs.
It's the reciprocal of the register value.

Fixes piglit fragcoord_w and glsl-fs-fragcoord-zw-perspective.

NOTE: This is a candidate for the 9.1 branch.
2013-02-21 10:06:52 +01:00
Michel Dänzer
18272c9b1b radeonsi: Fix up and enable flat shading.
Requires corresponding LLVM R600 backend fix to work correctly, but even
without that it doesn't hang anymore.

13 more little piglits.

Depends on LLVM: r175193, r175733

NOTE: This is a candidate for the 9.1 branch.
2013-02-21 09:14:36 +01:00
Vinson Lee
0d51906c07 radeonsi: Fix memory leak in si_shader_select.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-20 23:29:12 -08:00
Paul Berry
54d9c8a04a i965: Consign COORD_REPLACE VS hacks to Pre-Gen6.
Pre-Gen6, the SF thread requires exact matching between VS output
slots (aka VUE slots) and FS input slots, even when the corresponding
VS output slot is unused due to being overwritten by point coordinate
replacement (glTexEnvi(GL_POINT_SPRITE, GL_COORD_REPLACE, GL_TRUE)).
As a result, we have a special hack in the VS to ensure when any
texture coordinate is subject to point coordinate replacement, it is
always allocated space in the VUE, even if it isn't written to by the
VS.

This hack isn't needed from Gen6 onwards, since SF (Gen7: SBE)
swizzling has the ability to insert the point coordinate into
gl_TexCoord[] without needing a corresponding unused VUE slot.

Note that no modification of SF setup code is required for this
patch--get_attr_override() already does the right thing.  However, we
make a slight comment change to clarify why this works.

In addition to eliminating unnecessary VS recompiles and saving
precious URB space on Gen6+, this will save us the trouble of having
to adjust this hack when we implement geometry shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-20 13:48:45 -08:00
Ian Romanick
8b586322e7 mesa: Don't install glEvalMesh in the beginend dispatch table
NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59740
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-20 12:46:58 -08:00
Roland Scheidegger
83f7cde182 gallivm: fix indirect src register fetches requiring bitcast
For constant and temporary register fetches, the bitcasts weren't done
correctly for the indirect case, leading to crashes due to type mismatches.
Simply do the bitcasts after fetching (much simpler than fixing up the load
pointer for the various cases).

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=61036

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-20 19:37:30 +01:00
Roland Scheidegger
fbbcc1fcc4 llvmpipe: lp_resource_copy cleanup
We don't need to flush resources for each layer, and since we don't actually
care about layer at all in the flush function just drop the parameter.
Also we can use util_copy_box instead of repeated util_copy_rect.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-20 19:37:30 +01:00
Roland Scheidegger
95181ed2fd llvmpipe: fix lp_resource_copy using more than one 3d slice
These used to be illegal a very long time ago, then for some more time
nothing really emitted these so this code path wasn't hit.
Just trivially iterate over box->depth.
(Might be worth refactoring at some point since nowadays all the code
doesn't really do much except for depth textures.)

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=61093

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-20 19:37:30 +01:00
Tapani Pälli
413941e1a3 gles2: a stub implementation for GL_EXT_discard_framebuffer
This patch implements a stub for GL_EXT_discard_framebuffer with
required checks listed by the extension specification. This extension
is required by GLBenchmark 2.5 when compiled with OpenGL ES 2.0
as the rendering backend.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
2013-02-20 10:01:45 -08:00
Michel Dänzer
73bf626713 r600g/Cayman: Fix blending using destination alpha factor but non-alpha dest
Only compile tested, but should fix at least some piglit fbo-blending tests.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-20 14:43:17 +01:00
Michel Dänzer
95bced5929 radeonsi: Fix blending using destination alpha factor but non-alpha destination
11 more little piglits.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-20 12:58:52 +01:00
Marek Olšák
72f4490b55 radeonsi: implement 3D transfers
That means we can map and read multiple slices with one transfer_map call.

[ Cherry-picked from r600g commit 1aebb6911e ]

11 more little piglits on master, 1 more on the 9.1 branch (Marek's
glTex(Sub)Image improvements on master broke the other 10).

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-02-20 12:30:59 +01:00
Marek Olšák
a84c4edeed radeonsi: add assertions to prevent creation of invalid surfaces
[ Cherry-picked from r600g commit ef11ed61a0 ]

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-02-20 12:30:32 +01:00
Marek Olšák
c4faab63c4 radeonsi: use u_box_origin_2d helper function
[ Cherry-picked from r600g commit b278aba423 ]

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-02-20 12:15:22 +01:00
Vinson Lee
c403a52666 configure.ac: Do not check for clock_gettime on MinGW.
MinGW does not have clock_gettime.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-19 21:17:37 -08:00
Zack Rusin
076403c30d DRI2: Don't disable GLX_INTEL_swap_event unconditionally
GLX_INTEL_swap_event is broken on the server side, where it's
currently unconditionally enabled. This completely breaks
systems running on drivers which don't support that extension.
There's no way to test for its presence on this side, so instead
of disabling it uncondtionally, just disable it for drivers
which are known to not support it. It makes sense because
most drivers do support it right now.
We'll be able to remove this once Xserver properly advertises
GLX_INTEL_swap_event.

Note: This is a candidate for stable branches.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60052
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-19 12:50:16 -08:00
Eric Anholt
4c64f65f5d i965/fs: Enable CSE on uniform pull constant loads.
Improves on a major performance regression for the dolphin wii emulator
from its move to using UBOs.  Performance in the UBO codepath (as
replayed through apitrace) is up 21.1% +/- 2.3% (n=26/29).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-19 10:34:03 -08:00
Eric Anholt
c2a6e529c3 i965/fs: Only do CSE when the dst types match.
We could potentially do some CSE even when the dst types aren't the same
on gen6 where there is no implicit dst type conversion iirc, or in the
case of uniform pull constant loads where the dst type doesn't impact
what's stored.  But it's not worth worrying about.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.1 branch.
2013-02-19 10:33:41 -08:00
Eric Anholt
aebd3f46e3 i965/fs: Delay setup of uniform loads until after pre-regalloc scheduling.
This should fix the register allocation explosion on the GLES 3.0 test
on gen6.  It also gives us an instruction that will fit our CSE handling.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.1 branch.
2013-02-19 10:33:32 -08:00
Eric Anholt
49bdebad38 i965/fs: Fix copy propagation with smearing.
We were correctly relaying the smear from MOV's src, but if the MOV
didn't do a smear, we don't want to smash the smear value from the
instruction being propagated into.  Prevents a regression in the
upcoming UBO change.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.1 branch.
2013-02-19 10:33:15 -08:00
Eric Anholt
de7cb1cff3 i965/fs: Add a bit more instruction dumping useful for upcoming work.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-19 10:33:00 -08:00
Tom Stellard
7cd248aa79 radeon/llvm: Fix build with LLVM 3.3 2013-02-19 15:52:55 +00:00
Tom Stellard
1f006717db r600g: Add $(DEFINES) to AM_CXXFLAGS
This way llvm_wrapper.cpp is compiled with -DHAVE_LLVM=0x....
2013-02-19 15:52:55 +00:00
Paul Berry
444246c7e3 i965: Remove unused userclip flags.
brw_vs_prog_data::userclip hasn't been used since commit f0cecd4
(i965: Move VUE map computation to once at VS compile time).

brw_gs_prog_key::userclip_active hasn't been used since commit 9f3d321
(i965: Make the userclip flag for the VUE map come from VS prog data).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-19 07:35:52 -08:00
Brian Paul
dfbcb1849c llvmpipe: fix handling of 0 x 0 framebuffer size
Bump up the size to 1 x 1.  This fixes a number of potential failure
points in the code.

See also http://bugs.freedesktop.org/show_bug.cgi?id=61012
2013-02-19 07:19:19 -07:00
Brian Paul
e2091f64cb st/xlib: initialize the drawable size in create_xmesa_buffer()
Otherwise, the PBuffer's size was never set.  This also initializes
the buffer size for windows, pixmaps, etc.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=61012

Note: This is a candidate for the stable branches.
2013-02-19 07:19:19 -07:00
Stefan Brüns
5876a5dbc0 glx: fix glGetTexLevelParameteriv for indirect rendering
A single element in a GLX reply is contained in the header itself.
The number of elements is denoted in the "n" field of the reply.
If "n" is 1, the length of additional data is 0.
The XXX_data_length() function of xcb does not return the length of
the (optional, n>1) data but the number of elements.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=59876

Note: This is a candidate for the stable branches.

Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-19 07:19:19 -07:00
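The reply layout described in 5876a5dbc0 can be modelled roughly like this. The struct and function below are hypothetical stand-ins, not the xcb types or the glx code; they only illustrate that a lone value lives in the header while longer replies carry a trailing array.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a GLX reply: not the real xcb reply struct. */
struct reply {
    uint32_t n;            /* number of elements in the reply */
    int32_t datum;         /* the value itself when n == 1 */
    const int32_t *data;   /* trailing array, only present when n > 1 */
};

/* Copies the reply's values into dest and returns the element count. */
static uint32_t copy_reply_values(const struct reply *r, int32_t *dest)
{
    if (r->n == 1)
        dest[0] = r->datum;                       /* no trailing data */
    else if (r->n > 1)
        memcpy(dest, r->data, r->n * sizeof(int32_t));
    return r->n;
}
```

A caller that treats the element count as a trailing-data length would, for n == 1, read a value that was never sent instead of the one in the header.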
Brian Paul
63c30d7e4f st/mesa: implement glBitmap unpacking from a PBO, for the cache path
We weren't mapping the PBO when using the bitmap cache (but we had
the PBO code for the non-cache path.)

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=61026

Note: This is a candidate for the stable branches.
2013-02-19 07:19:19 -07:00
Brian Paul
5da967aff5 draw: fix non-perspective interpolation in interp()
This fixes a regression from ab74fee5e1.
When we use the clip coordinate to compute the screen-space interpolation
factor, we need to first apply the divide-by-W step to the clip
coordinate.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=60938

Note: This is a candidate for the 9.1 branch.
2013-02-19 07:19:18 -07:00
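The divide-by-W step mentioned in 5da967aff5 can be illustrated with a small standalone function. This is a sketch under assumptions, not the draw module's `interp()`: `nopersp_factor` is a hypothetical name, vectors are clip-space (x, y, z, w), and all w values are assumed to be non-zero.

```c
/* Hypothetical helper: given two clip-space endpoints and the clip-space
 * interpolation factor t of the new vertex, returns the equivalent
 * screen-space (non-perspective) factor.  Assumes non-zero w everywhere. */
static float nopersp_factor(const float in[4], const float out[4], float t)
{
    float dst[4];

    /* The new vertex, interpolated linearly in clip space. */
    for (int i = 0; i < 4; i++)
        dst[i] = in[i] + t * (out[i] - in[i]);

    /* Compare positions after the divide by W, on the first axis (x, then
     * y) along which the two endpoints actually differ on screen. */
    for (int k = 0; k < 2; k++) {
        float in_ndc  = in[k]  / in[3];
        float out_ndc = out[k] / out[3];
        float dst_ndc = dst[k] / dst[3];

        if (in_ndc != out_ndc)
            return (dst_ndc - in_ndc) / (out_ndc - in_ndc);
    }
    return t;   /* endpoints coincide on screen; any factor will do */
}
```

When both endpoints share the same w the result equals t; when w differs, the clip-space midpoint no longer sits at the screen-space midpoint, which is the case the regression got wrong.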
Marek Olšák
07cdfdb708 st/mesa: remove what is left from u_blit
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-18 17:57:41 +01:00
Marek Olšák
40ee93c4e8 st/mesa: simplify and improve CopyTexSubImage
It has become a bit messy.

Changes:

- finally correct checking for transfer ops depending on the base format

- making sure the base internal format and the texture format match
  (we were ignoring it, but it's important for correctness)

- the way-too-strict rule that both src and dst base formats must be the same
  was dropped; ensuring the simpler and more permissive rule mentioned above
  is enough

- stop using util_blit_pixels; pipe->blit is flexible enough, and now that we
  have RGBX and red-alpha formats, pipe->blit can be used for more cases

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-18 17:57:41 +01:00
Marek Olšák
6520a86c67 st/mesa: don't do sRGB conversion in CopyTexSubImage
Assuming I understand EXT_texture_sRGB correctly.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-18 17:57:41 +01:00
Marek Olšák
0a1479c829 st/mesa: implement blit-based TexImage and TexSubImage
A temporary texture is created such that it matches the format and type
combination and pixels are copied to it using memcpy. Then the blit is used to
copy the temporary texture to the texture image being modified by TexImage or
TexSubImage. The blit takes care of the format and type conversion and
swizzling. The result is a very fast texture upload involving as little CPU
as possible.

This improves performance in apps which upload textures during rendering.
An example is the Wine OpenGL backend for DirectDraw, which I used to test
the game StarCraft. Profiling had shown that TexSubImage was taking 50% of
CPU time without this patch, which was the main motivation for this work, and
now TexSubImage only takes 14% of CPU time. I had to underclock my CPU to see
any difference in the game and this patch does make the game a lot faster
if the CPU is slow (or using the powersave cpufreq profile).

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-18 17:57:41 +01:00
Marek Olšák
a6e0ac9571 st/mesa: fix blit-based GetTexImage for 1D array textures
This is not easy to hit, because we have 3 code paths now
(tried in this order):
- memcpy-based (skips the blit) -> _mesa_tex_getimage
- blit-based
- slow pixel packing -> _mesa_tex_getimage

The main difference later in the code is the parameters of
_mesa_image_address3d.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-18 17:57:41 +01:00
Marek Olšák
91acf6225a st/mesa: fix blit-based GetTexImage for depth/stencil formats
BTW, we have 0 tests for glGetTexImage(format=GL_DEPTH*).

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-18 17:57:41 +01:00
Marek Olšák
0181e18d0f st/mesa: factor out code for determining blit.mask from CopyTexSubImage
I'll need this later.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-18 17:57:41 +01:00
Michel Dänzer
9c1107b3e1 radeonsi: Fix PIPE_FORMAT_X32_S8X24_UINT sampler hardware format
4 more little piglits.

NOTE: This is a candidate for the 9.1 branch.
2013-02-18 15:59:02 +01:00
Michel Dänzer
8356962853 radeonsi: Use stencil surface level information for stencil texturing
7 more little dwarves^W piglits.

NOTE: This is a candidate for the 9.1 branch.
2013-02-18 15:58:37 +01:00
Michel Dänzer
f9adf79876 radeonsi: properly implement S8Z24 depth-stencil format
Based on r600g commit 2b9659c9e6 .

Fixes crashes with 4 piglit tests which are now hitting these formats.

NOTE: This is a candidate for the 9.1 branch.
2013-02-18 15:58:05 +01:00
Vincent Lejeune
0527317e1f r600g/llvm: Support for TBO
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2013-02-18 15:08:59 +01:00
Vincent Lejeune
c116598f86 r600g/llvm: Set Inputs/Outputs count to 32 (api reported value)
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2013-02-18 15:08:54 +01:00
Vincent Lejeune
90e6f47ac8 r600g/llvm: Fix alpha_to_one piglit tests
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2013-02-18 15:08:50 +01:00
Vincent Lejeune
ef8fde6acb r600g/llvm: Add support for UBO
NOTE: This is a candidate for the Mesa stable branch.

Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2013-02-18 15:08:45 +01:00
Christopher James Halse Rogers
dd599188d2 i965: Fix leak in blorp CopyTexSubImage2D
_mesa_delete_renderbuffer does not call the driver-specific
renderbuffer delete function, so the blorp code was leaking the
Intel-specific bits, including some GEM objects.

Call the renderbuffer's ->Delete() method instead, which does the
right thing.

Fixes Unity rapidly sending the machine into the arms of the OOM-killer

Note: This is a candidate for the 9.1 branch.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-16 08:11:14 -08:00
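The leak pattern fixed in dd599188d2 is easier to see in miniature. Everything below is hypothetical (struct names, the resource counter, `driver_new`); it only shows why destruction has to go through the object's own `Delete` hook rather than a base-class free.

```c
#include <stdlib.h>

/* Hypothetical base object with a per-object destructor hook. */
struct renderbuffer {
    void (*Delete)(struct renderbuffer *rb);
};

/* Hypothetical driver subclass owning an extra allocation. */
struct driver_renderbuffer {
    struct renderbuffer base;   /* must be the first member */
    int *driver_resource;       /* stands in for driver-private state */
};

static int live_driver_resources = 0;

/* Driver destructor: releases the private state, then the object. */
static void driver_delete(struct renderbuffer *rb)
{
    struct driver_renderbuffer *drb = (struct driver_renderbuffer *)rb;

    free(drb->driver_resource);
    live_driver_resources--;
    free(drb);
}

static struct renderbuffer *driver_new(void)
{
    struct driver_renderbuffer *drb = calloc(1, sizeof *drb);

    if (!drb)
        return NULL;
    drb->driver_resource = malloc(sizeof(int));
    drb->base.Delete = driver_delete;
    live_driver_resources++;
    return &drb->base;
}
```

Freeing only the base struct would leave `driver_resource` allocated; calling `rb->Delete(rb)` lets the driver release what it owns.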
Roland Scheidegger
f1ab67c13a gallivm/tgsi: fix issues with sample opcodes
We need to encode them as Texture instructions since the NumOffsets field
is encoded there. However, we don't encode the actual target in there, this
is derived from the sampler view src later.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-16 02:40:59 +01:00
Roland Scheidegger
cb2e678294 gallivm/tgsi: fix src modifier fetching with non-float types.
Need to take the type into account. Also, if we want to allow
mov's with modifiers we need to pick a type (assume float).

v2: don't allow all modifiers on all types, in particular don't allow
absolute on non-float types and don't allow negate on unsigned.
Also treat UADD as signed (despite the name) since it is used
for handling both signed and unsigned integer arguments and otherwise
modifiers don't work.
Also add tgsi docs clarifying this.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-16 02:40:51 +01:00
Roland Scheidegger
c25ae5d27b gallivm: fix issues with trunc/round/floor/ceil with no arch rounding
The emulation of these if there's no rounding instruction available
is a bit more complicated than what the code did.
In particular, doing fp-to-int/int-to-fp will not work if the exponent
is large enough (and with NaNs, Infs). Hence such values need to be filtered
out and the original value returned in this case (which fortunately should
always be exact). This comes at the expense of performance (if your cpu
doesn't support rounding instructions).
Furthermore, floor/ifloor/ceil/iceil were affected by precision issues for
values near negative (for floor) or positive (for ceil) zero, fix that as well
(fixing this issue might not actually be slower except for ceil/iceil if the
type is not signed which is probably rare - note iceil has no callers left
in any case).

Also add some new rounding test values in lp_test_arit to actually test
for that stuff (which previously would have failed without sse41).

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=59701.
2013-02-16 02:40:44 +01:00
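The filtering described in c25ae5d27b can be sketched in scalar C. This is an illustration of the idea only, not the gallivm code (which builds vector LLVM IR); `floor_emulated` is a hypothetical name and the 2^23 threshold assumes 32-bit IEEE floats. It does not attempt to preserve the sign of zero.

```c
#include <stdint.h>

/* Hypothetical scalar floor() without a hardware rounding instruction.
 * Assumes IEEE-754 single precision and a 32-bit int32_t conversion. */
static float floor_emulated(float x)
{
    /* Floats with magnitude >= 2^23 have no fractional bits, so they are
     * already exact integers.  The comparison is also false for NaN and
     * true-but-excluded for +/-Inf, so all of those return unchanged and
     * never reach the integer conversion, where they would be invalid. */
    if (!(x > -8388608.0f && x < 8388608.0f))
        return x;

    /* Safe now: the value fits in int32_t.  The cast truncates toward 0. */
    float t = (float)(int32_t)x;

    /* Truncation rounds negative non-integers up; step back down by one. */
    if (t > x)
        t -= 1.0f;
    return t;
}
```

The extra range check is the performance cost the commit message mentions for CPUs without rounding instructions.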
Roland Scheidegger
70daad6a99 gallivm: DIV shouldn't be deprecated.
(Though it looks like glsl won't emit it.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-16 02:40:36 +01:00
Matt Turner
00f6fe6c66 mesa: Use PROGRAM_ERROR_STRING_ARB instead of the _NV name
Since NV_fragment_program is now gone. No functional change, since the
values are identical.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-15 10:28:12 -08:00
Brian Paul
2ef530cf68 trace: add context pointer sanity checking
To help catch mixed up context pointer bugs in the future, add a
trace_context_check() function and some new assertions.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-15 11:11:34 -07:00
Brian Paul
82d62cf04f trace: fix incorrect trace_surface::base.context pointer
When a trace_surface object is created in trace_surf_create() we
weren't correctly setting the surface's context pointer.  Instead of
it being the trace context, it was the wrapped driver's context.
This caused things to blow up sometimes during surface deallocation.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-15 11:11:34 -07:00
Brian Paul
3b0de75c4d mesa: remove old version comment from gl.h 2013-02-15 09:25:15 -07:00
Brian Paul
70135e915a trace: whitespace, comment clean-ups 2013-02-15 09:25:15 -07:00
Brian Paul
7b836a7d25 trace: move struct tr_list to tr_texture.h
That's the only place it's used.
2013-02-15 09:25:15 -07:00
Brian Paul
4be5a06752 st/mesa: fix format query for GL_ARB_texture_rg
The GL_ARB_texture_rg spec says that we need to support both texturing
and rendering for the GL_RED and GL_RG formats.  So move the format
check up into the rendertarget_mapping[] list.  Also, add
PIPE_FORMAT_R8_UNORM to the list of formats required.

Note: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-15 09:25:14 -07:00
Eric Anholt
c37992c54d i965/fs: Do a general SEND dependency workaround for the original 965.
We'd been ad-hoc inserting instructions in some SEND messages with no
knowledge of when it was required (so extra instructions), but not all SENDs
(so not often enough).  This should do much better than that, though it's
still flow-control-ignorant.

v2: Use BRW_MAX_MRF instead of magic numbers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58960
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: Candidate for the stable branches.
2013-02-15 06:17:46 -08:00
Kristian Høgsberg
6dbe94c12c egl-wayland: Fix left-over wl_display_roundtrip() usage
We have to use the EGL wayland event queue for roundtrip, so use the
wayland_roundtrip() helper, which does just that.
2013-02-14 20:48:05 -05:00
Eric Anholt
5bb05c6e6d i965/gen7: Set up all samplers even if samplers are sparsely used.
In GLSL, sampler indices are allocated contiguously from 0.  But in the
case of ARB_fragment_program (and possibly fixed function), an app that
uses texture 0 and 2 will use sampler indices 0 and 2, so we were only
allocating space for samplers 0 and 1 and setting up sampler 0.  We
would read garbage for sampler 2, resulting in flickering textures and
an angry simulator.
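
A small sketch of the difference, assuming a hypothetical bitmask with bit i set when sampler i is used (the helper names are made up for illustration). Counting the used samplers is only right when indices are contiguous; taking the highest used index plus one also covers sparse use.

```c
#include <stdint.h>

/* Hypothetical helpers; used_mask has bit i set when sampler i is in use. */

/* Number of samplers in use: correct only for contiguous indices. */
static unsigned sampler_count_by_population(uint32_t used_mask)
{
    unsigned n = 0;
    for (; used_mask; used_mask &= used_mask - 1)
        n++;
    return n;
}

/* Highest used index plus one: also covers sparsely used samplers. */
static unsigned sampler_count_by_highest(uint32_t used_mask)
{
    unsigned n = 0;
    for (; used_mask; used_mask >>= 1)
        n++;
    return n;
}
```

With textures 0 and 2 in use (mask 0x5), the first helper yields 2 and the second yields 3, which is the amount of state that has to be set up.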

Fixes bad rendering in 0 A.D. and ETQW.  This was fixed for pre-gen7 by
28f4be9eb9

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=25201
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58680
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for stable branches.
2013-02-14 15:14:09 -08:00
Marek Olšák
34dc4d6b67 r600g: add support for red-alpha render targets 2013-02-14 14:59:36 +01:00
Marek Olšák
ec5376f5d8 r300g: add support for red-alpha render targets 2013-02-14 14:59:36 +01:00
Marek Olšák
5d3b8ad24b st/mesa: try to find exact format matching user format and type for DrawPixels
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-14 14:51:46 +01:00
Marek Olšák
2b9659c9e6 r600g: properly implement S8Z24 depth-stencil format for Evergreen
I should say "fix", but it has never been used until now.
S8Z24 is the format equivalent to the GL_UNSIGNED_INT_24_8 packing,
so we'll start to see it more often with st/mesa now making smart decisions
about formats.

The DB<->CB copy can change the channel ordering for transfers, other than
that, the internal DB format doesn't really matter.

R600-R700 support is possible except shadow mapping.
FMT_24_8 is broken if the SAMPLE_C instruction is used (no idea why).

Also the sampler swizzling was broken in theory and the fact it worked was
a lucky coincidence.

radeonsi might need to port this.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-02-14 14:51:46 +01:00
Michel Dänzer
c840270ebe radeonsi: Handle TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS
8 more little piglits.

NOTE: This is a candidate for the 9.1 branch.
2013-02-14 10:51:44 +01:00
Michel Dänzer
f34ad85765 radeonsi: Fix array indices for detecting integer vertex formats 2013-02-14 10:31:21 +01:00
Vinson Lee
0d5ce524ab glsl: Initialize ir_texture member variable.
Fixes uninitialized pointer field defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-13 23:10:48 -08:00
Eric Anholt
b8906adb66 intel: Allow blit readpixels even when the pack alignment is set.
The default alignment is 4, so this fast path was rarely hit.  Rather
than introduce logic to handle alignment, just use the Mesa core
function.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46632
Cc: neil@linux.intel.com
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-13 18:10:20 -08:00
Eric Anholt
516d8be502 i965: Remove writemask support from brw_SAMPLE().
The code was rather broken for non-XYZW on 8-wide, but all of our
callers were using XYZW anyway.  For my experiments with using writemask
on texturing, I've been using manual header setup in the compiler
backends, since we want to actually know what registers are written for
optimization and register allocation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-13 18:10:20 -08:00
Eric Anholt
bf91f0b039 i965/fs: Use a helper function for checking for flow control instructions.
In 2 of our checks, we were missing BREAK and CONTINUE.
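
The shape of such a helper might look like the sketch below. The opcode names are hypothetical stand-ins, not the i965 enum; the point is that a single switch keeps every caller in agreement about what counts as flow control.

```c
#include <stdbool.h>

/* Hypothetical opcode names for illustration; not the i965 definitions. */
enum opcode {
    OP_MOV, OP_ADD,
    OP_IF, OP_ELSE, OP_ENDIF,
    OP_DO, OP_WHILE, OP_BREAK, OP_CONTINUE
};

/* One shared predicate, so no check can forget BREAK or CONTINUE. */
static bool is_control_flow(enum opcode op)
{
    switch (op) {
    case OP_IF:
    case OP_ELSE:
    case OP_ENDIF:
    case OP_DO:
    case OP_WHILE:
    case OP_BREAK:
    case OP_CONTINUE:
        return true;
    default:
        return false;
    }
}
```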

NOTE: Candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-13 17:47:06 -08:00
bma
ce3dfa19ab shaderapi: Fix AttachShader error
Detect a duplicate Shader type as an error instead of silently allowing
it, restrict to ES2 API.
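
A rough sketch of the check, using hypothetical types rather than Mesa's. The choice of an invalid-operation result for the duplicate case is an assumption about what the ES 2.0 spec requires, and the capacity limit is purely an artifact of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types; the result names only mirror GL error names. */
enum attach_result { ATTACH_OK, ATTACH_INVALID_OPERATION };
enum shader_type { SHADER_VERTEX, SHADER_FRAGMENT };

#define MAX_ATTACHED 8

struct program {
    enum shader_type attached[MAX_ATTACHED];
    size_t num_attached;
    bool is_es2;   /* decided at run time, as in v2 of the patch */
};

static enum attach_result attach_shader(struct program *prog,
                                        enum shader_type type)
{
    if (prog->is_es2) {
        /* ES2 only: reject a second shader of an already attached type. */
        for (size_t i = 0; i < prog->num_attached; i++)
            if (prog->attached[i] == type)
                return ATTACH_INVALID_OPERATION;
    }
    if (prog->num_attached == MAX_ATTACHED)
        return ATTACH_INVALID_OPERATION;   /* sketch-only capacity limit */
    prog->attached[prog->num_attached++] = type;
    return ATTACH_OK;
}
```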

v2: Tapani Pälli <tapani.palli@intel.com>
    - make the check run time instead of compile time

v3: chadv
    - Quote spec on which error to generate.

Signed-off-by: bma <Bo.Ma@windriver.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
2013-02-13 14:09:47 -08:00
Tom Stellard
0898047e7b configure.ac: Add components to LLVM_COMPONENTS when using llvm shared libs
This is required when LLVM is built with CMake, which creates one
shared library for each component.
2013-02-13 17:01:08 -05:00
Eric Anholt
cb4616d32d i965: Re-enable the -RHW workaround for original gen4 chips.
Fixes broken clipping in supertuxkart and presumably many other applications.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51471
NOTE: Candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-13 10:19:21 -08:00
Eric Anholt
ddc2b453d0 i965/gen4: Work around missing sRGB RGB DXT1 support.
The hardware just doesn't support it.  I suspect this was a regression from
the move to fixed MESA_FORMATs for compressed textures and that previously we
were storing uncompressed for this or something.

Fixes GPU hangs in piglit "texwrap GL_EXT_texture_sRGB-s3tc bordercolor
swizzled" on my GM965.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-13 10:19:21 -08:00
Paul Berry
dfb57e7d1b glsl: Fix error checking on "flat" keyword to match GLSL ES 3.00, GLSL 1.50.
All of the GLSL specs from GLSL 1.30 (and GLSL ES 3.00) onward contain
language requiring certain integer variables to be declared with the
"flat" keyword, but they differ in exactly *when* the rule is
enforced:

(a) GLSL 1.30 and 1.40 say that vertex shader outputs having integral
type must be declared as "flat".  There is no restriction on fragment
shader inputs.

(b) GLSL 1.50 through 4.30 say that fragment shader inputs having
integral type must be declared as "flat".  There is no restriction on
vertex shader outputs.

(c) GLSL ES 3.00 says that both vertex shader outputs and fragment
shader inputs having integral type must be declared as "flat".

Previously, Mesa's behaviour was consistent with (a).  This patch
makes it consistent with (b) when compiling desktop shaders, and (c)
when compiling ES shaders.
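
The resulting rule can be sketched as a single predicate. The enum and function names here are hypothetical, not the ones used in the GLSL compiler.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only. */
enum shader_stage { STAGE_VERTEX, STAGE_FRAGMENT };
enum var_mode { MODE_IN, MODE_OUT };

/* Does an integer varying in this position need the "flat" qualifier? */
static bool integer_varying_requires_flat(bool is_es,
                                          enum shader_stage stage,
                                          enum var_mode mode)
{
    bool fs_input  = (stage == STAGE_FRAGMENT && mode == MODE_IN);
    bool vs_output = (stage == STAGE_VERTEX && mode == MODE_OUT);

    if (is_es)
        return fs_input || vs_output;   /* rule (c): GLSL ES 3.00 */
    return fs_input;                    /* rule (b): all desktop versions */
}
```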

Rationale for desktop shaders: once we add geometry shaders, (b) really
seems like the right choice, because it requires "flat" in just the
situations where it matters.  Since we may want to extend geometry
shader support back before GLSL 1.50 (via ARB_geometry_shader4), it
seems sensible to apply this rule to all GLSL versions.  Also, this
matches the behaviour of the nVidia proprietary driver for Linux, and
the expectations of Intel's oglconform test suite.

Rationale for ES shaders: since the behaviour specified in GLSL ES
3.00 matches neither pre-GLSL-1.50 nor post-GLSL-1.50 behaviour, it
seems likely that this was a deliberate choice on the part of the GLES
folks to be more restrictive.  Also, the argument in favor of (b)
doesn't apply to GLES, since it doesn't support geometry shaders at
all.

Some discussion about this has already happened on the Mesa-dev list.
See:

http://lists.freedesktop.org/archives/mesa-dev/2013-February/034199.html

Fixes piglit tests:
- glsl-1.30/compiler/interpolation-qualifiers/nonflat-*.frag
- glsl-1.30/compiler/interpolation-qualifiers/vs-flat-int-0{2,3,4,5}.vert
- glsl-es-3.00/compiler/interpolation-qualifiers/varying-struct-nonflat-{int,uint}.frag

Fixes oglconform tests:
- glsl-q-inperpol negative.fragin.{int,uint,ivec,uvec}

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-02-13 07:58:08 -08:00
Paul Berry
93c913485e glsl: don't allow non-flat integral types in varying structs/arrays.
In the GLSL 1.30 spec, section 4.3.6 ("Outputs") says:

    "If a vertex output is a signed or unsigned integer or integer
    vector, then it must be qualified with the interpolation qualifier
    flat."

The GLSL ES 3.00 spec further clarifies, in section 4.3.6 ("Output
Variables"):

    "Vertex shader outputs that are, *or contain*, signed or unsigned
    integers or integer vectors must be qualified with the
    interpolation qualifier flat."

(Emphasis mine.)

The language in the GLSL ES 3.00 spec is clearly correct and should be
applied to all shading language versions, since varyings that contain
ints can't be interpolated, regardless of which shading language
version is in use.
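
The "are, or contain" wording amounts to a recursive walk over the type. A minimal sketch with a hypothetical type representation (not Mesa's glsl_type):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type representation for illustration only. */
enum type_kind { KIND_FLOAT, KIND_INT, KIND_UINT, KIND_ARRAY, KIND_STRUCT };

struct type {
    enum type_kind kind;
    const struct type *element;         /* KIND_ARRAY: element type */
    const struct type *const *fields;   /* KIND_STRUCT: member types */
    size_t num_fields;
};

/* True if the type is an integer or has an integer anywhere inside it. */
static bool type_contains_integer(const struct type *t)
{
    switch (t->kind) {
    case KIND_INT:
    case KIND_UINT:
        return true;
    case KIND_ARRAY:
        return type_contains_integer(t->element);
    case KIND_STRUCT:
        for (size_t i = 0; i < t->num_fields; i++)
            if (type_contains_integer(t->fields[i]))
                return true;
        return false;
    default:
        return false;
    }
}
```

A struct holding a float and an array of uint would be reported as containing an integer, so it would need "flat".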

(Note that in GLSL 1.50 the restriction is changed to apply to
fragment shader inputs rather than vertex shader outputs, to
accommodate the fact that in the presence of geometry shaders, vertex
shader outputs are not necessarily interpolated.  That will be
addressed by a future patch).

NOTE: This is a candidate for stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-02-13 07:58:01 -08:00
Paul Berry
d5948f2f5e glsl: Allow default precision qualifiers to be set for sampler types.
From GLSL ES 3.00 section 4.5.4 ("Default Precision Qualifiers"):

    "The precision statement

        precision precision-qualifier type;

    can be used to establish a default precision qualifier. The type
    field can be either int or float or any of the sampler types, and
    the precision-qualifier can be lowp, mediump, or highp."

GLSL ES 1.00 has similar language.  GLSL 1.30 doesn't allow precision
qualifiers on sampler types, but this seems like an oversight (since
the intention of including these in GLSL 1.30 is to allow
compatibility with ES shaders).

Previously, Mesa followed GLSL 1.30 and only allowed default precision
qualifiers to be set for float and int.  This patch makes it follow
GLSL ES rules in all cases.

Fixes Piglit tests default-precision-sampler.{vert,frag}.

Partially addresses https://bugs.freedesktop.org/show_bug.cgi?id=60737.

NOTE: This is a candidate for stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-13 07:57:58 -08:00
Marek Olšák
60aa5f360a st/mesa: fix texture buffer objects
Broken by 624528834f.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-13 16:38:19 +01:00
Kenneth Graunke
8cabe26f5d i965: Use derived state for Haswell's 3DSTATE_VF packet.
Otherwise, we fail to correctly handle GL_PRIMITIVE_RESTART_FIXED_INDEX.

Fixes gles3conform's primitive_restart_mode test.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-12 20:24:28 -08:00
Marek Olšák
ea63491629 st/mesa: accelerate glGetTexImage for all formats using a blit
This commit allows using glGetTexImage during rendering while still
maintaining interactive framerates.

This improves performance of WarCraft 3 under Wine. The framerate is improved
from 25 fps to 39 fps in the main menu, and from 0.5 fps to 32 fps in the game.

v2: fix choosing the format for decompression
2013-02-13 02:13:10 +01:00
Marek Olšák
cd41833b44 gallium: add red-alpha texture formats and a couple of util functions
This is for glGetTexImage and it will be used for samplers only (which some
drivers already implement by reading util_format_description).

v2: incorporate Brian's suggestion

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-13 02:13:10 +01:00
Jerome Glisse
974b482aca r600g: fix lockup when hyperz & alpha test are enabled together. v3
It seems that enabling the alpha test confuses the GPU about the order
in which it should perform Z testing. So force the order programmed
through db shader control.

v2: Only force z order when alpha test is enabled
v3: Update db shader when binding new dsa + spelling fix

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-12 17:03:56 -05:00
Jordan Justen
496928a442 CopyTexImage: Don't check sRGB vs LINEAR for desktop GL
In OpenGL 4.3, new language was added that would require
this check. But, if this check results in broken applications
then perhaps it will be reversed.

For now, remove this check and re-evaluate when
desktop GL 4.3 is closer.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-02-12 11:22:49 -08:00
Christian König
8c80894fb3 radeonsi: remove constant index limitation v3
With the llvm patches, fixing 14 piglit tests in total.

v2: increase the const limit
v3: document the const limit

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-12 18:57:12 +01:00
Christian König
8514f5ac01 radeonsi: support constants as TEX coordinates
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-12 18:57:12 +01:00
Paul Berry
f8426eea35 glsl: Fix unsupported version error for GLSL ES 3.00, future proof for 3.30.
When the user specifies an unsupported GLSL version,
_mesa_glsl_parse_state::process_version_directive() nicely gives them
an error message telling them which GLSL versions are supported.
Previous to this patch, the logic for determining whether a given
language version was supported was independent from the logic to
generate this error message string; as a result, we had a bug where
GLSL ES 3.00 would never be listed in the error message as an available
language version, even if it was really available.

To make matters worse, the code for generating the error message
string assumed that desktop GL versions were always separated by 0.10,
an assumption that will be wrong as soon as we support GLSL 3.30.

This patch fixes both problems by adding a table of supported GLSL
versions to _mesa_glsl_parse_state; this table is used both to
generate the error message and to check whether a given version is
supported.
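
The single-table idea can be sketched as follows. The struct layout and function names are hypothetical and do not reflect the actual members of _mesa_glsl_parse_state; the point is that the support check and the error string read the same data.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical table entry: version 130 means 1.30, es marks GLSL ES. */
struct glsl_version {
    unsigned version;
    bool es;
};

/* The support check walks the table... */
static bool version_supported(const struct glsl_version *table, size_t count,
                              unsigned version, bool es)
{
    for (size_t i = 0; i < count; i++)
        if (table[i].version == version && table[i].es == es)
            return true;
    return false;
}

/* ...and the error text is built from the same table, so they cannot drift. */
static void format_supported_versions(const struct glsl_version *table,
                                      size_t count, char *buf, size_t size)
{
    size_t used = 0;

    if (size == 0)
        return;
    buf[0] = '\0';
    for (size_t i = 0; i < count && used < size; i++) {
        int n = snprintf(buf + used, size - used, "%s%u.%02u%s",
                         i ? ", " : "",
                         table[i].version / 100, table[i].version % 100,
                         table[i].es ? " ES" : "");
        if (n < 0)
            break;
        used += (size_t)n;   /* on truncation the loop condition ends the walk */
    }
}
```

For a table holding 1.10, 3.30 and ES 3.00, the formatted string is "1.10, 3.30, 3.00 ES", with no assumption about the spacing between versions.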

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-12 08:06:35 -08:00
Roland Scheidegger
9870459522 gallium/docs: fix typos in sample opcode descriptions 2013-02-12 16:51:11 +01:00
Roland Scheidegger
2947f00bc4 nv50: fix bogus parameters when processing sample instructions
Discovered accidentally when changing SAMPLE_L definition.
Turns out the lod arguments were already correct for the new definition
but the compare and derivs were not.

Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
2013-02-12 16:51:11 +01:00
Roland Scheidegger
427d36a227 gallium: fix tgsi SAMPLE_L opcode to use separate source for explicit lod
It looks like using coord.w as explicit lod value is a mistake, most likely
because some dx10 docs had it specified that way. Seems this was changed though:
http://msdn.microsoft.com/en-us/library/windows/desktop/hh447229%28v=vs.85%29.aspx
- let's just hope it doesn't depend on runtime build version or something.
Not only would this need translation (so go against the stated goal these
opcodes should be close to dx10 semantics) but it would prevent usage of this
opcode with cube arrays, which is apparently possible:
http://msdn.microsoft.com/en-us/library/windows/desktop/bb509699%28v=vs.85%29.aspx
(Note not only does this show cube arrays using explicit lod, but also the
confusion with this opcode: it lists an explicit lod parameter value, but then
states last component of location is used as lod).
(For "true" hw drivers, only nv50 had code to handle it, and it appears the
code was already right for the new semantics, though fix up the seemingly
wrong c/d arguments while there.)

v2: fix comment, separate out other changes.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-12 16:51:11 +01:00
Brian Paul
4bfdef87e6 util: fix incorrect Z bit masking in util_clear_depth_stencil()
For PIPE_FORMAT_Z24_UNORM_S8_UINT, the Z bits are in the 24
least significant bits.
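
As a sketch of the masking this implies, assuming the layout stated above (Z in bits 0-23, stencil in bits 24-31). The helper name is hypothetical and this is not the util_clear_depth_stencil() code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Layout assumed from the commit text: Z in bits 0-23, stencil in 24-31. */
#define Z24_MASK 0x00ffffffu
#define S8_MASK  0xff000000u

/* Hypothetical per-texel helper: replace only the requested components. */
static uint32_t clear_z24s8_texel(uint32_t old, uint32_t z24, uint8_t stencil,
                                  bool clear_z, bool clear_s)
{
    uint32_t mask = 0;
    uint32_t value = (z24 & Z24_MASK) | ((uint32_t)stencil << 24);

    if (clear_z)
        mask |= Z24_MASK;
    if (clear_s)
        mask |= S8_MASK;

    return (old & ~mask) | (value & mask);
}
```

A depth-only clear then leaves the top byte (stencil) of each texel untouched.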

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=60527
and http://bugs.freedesktop.org/show_bug.cgi?id=60524
and http://bugs.freedesktop.org/show_bug.cgi?id=60047

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-12 08:11:05 -07:00
Matt Turner
a79ce0c925 radeon: Remove dead STANDALONE_MMIO defines
These were, at some point in the past, used to request that Xorg's
compiler.h export a static inline xf86ReadMmio32 instead of a function
pointer. compiler.h only has this option for DEC Alpha.

But Xorg's compiler.h isn't being included by either of these two files
and the radeon driver still works on Alpha, so the definitions are dead
and not needed.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-11 23:18:11 -08:00
Roland Scheidegger
8b8bca06df llvmpipe: implement dual source blending
link up the fs outputs and blend inputs, and make sure the second blend source
is correctly loaded and converted (which is quite complex).
There's a slight refactoring of the monster generate_unswizzled_blend()
function where it makes sense to factor out alpha conversion (which needs
to run twice for dual source blend).
This passes piglit arb_blend_func_extended tests.

v2: remove new but ultimately not used function...

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-12 03:41:48 +01:00
Kenneth Graunke
a73181be6d docs: Mark a few things done in GL3.txt. 2013-02-11 15:55:29 -08:00
Kenneth Graunke
3d7c09e8b0 i965: Add missing dirty bits to INTEL_DEBUG=state arrays.
These are more recent additions, and no one remembered to update the
INTEL_DEBUG=state code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-11 15:54:10 -08:00
Kenneth Graunke
b9c5997bb3 i965: Reorganize brw_bits to match the order in brw_context.h.
This reorders the "brw_bits" array in brw_state_upload.c to match the
order of the #defines in brw_context.h.

Otherwise, it's really hard to see if any are missing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-11 15:54:07 -08:00
Kenneth Graunke
0ac6d5a7fb i965: Use BRW_NEW_CONTEXT for gen7_disable rather than BRW_NEW_BATCH.
These don't need to be re-disabled on every batch if we're using
hardware contexts.  (If we're not, this is equivalent.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-11 15:54:01 -08:00
Jerome Glisse
323a448825 r600g: make sure async blit is done 8 * pitch at a time v2
The blit must be aligned on 8 horizontal blocks.

v2: no need to align the remainder

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-02-11 18:44:18 -05:00
Martin Andersson
a37835c8ed winsys/radeon: fix bo with virtual address referencing mismatch
If the same context tries to flink and open the object, use the
same bo struct instead of opening a new gem handle for the object.
This way we avoid having 2 different handles pointing to the
same kernel object, which can later lead to trouble with virtual
addresses.

Fix:
https://bugs.freedesktop.org/show_bug.cgi?id=60200

Signed-off-by: Martin Andersson <g02maran@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-02-11 18:38:00 -05:00
Eric Anholt
e776b632c0 vbo: Merge GL_QUADS drawing requests in display lists.
minecraft apparently has its piles of display lists each contain 6
instances of glBegin(GL_QUADS)/verts/glEnd(), which appear in the
compiled list as 6 prims of 4 verts each in one draw call.  We can
reduce driver overhead even more by making that one prim of 24 verts.
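
A minimal sketch of the merge, with a hypothetical prim struct rather than the vbo module's own. It assumes two quad prims can be joined when the second one starts exactly where the first ends, since each quad uses four independent vertices.

```c
#include <stddef.h>

#define MODE_QUADS 7u   /* numeric value of GL_QUADS */

/* Hypothetical primitive record for illustration only. */
struct prim {
    unsigned mode;
    unsigned start;   /* index of the first vertex */
    unsigned count;   /* number of vertices */
};

/* Merge runs of back-to-back GL_QUADS prims in place; returns new length. */
static size_t merge_adjacent_quads(struct prim *prims, size_t n)
{
    size_t out = 0;

    if (n == 0)
        return 0;
    for (size_t i = 1; i < n; i++) {
        struct prim *prev = &prims[out];
        const struct prim *cur = &prims[i];

        if (prev->mode == MODE_QUADS && cur->mode == MODE_QUADS &&
            cur->start == prev->start + prev->count) {
            prev->count += cur->count;   /* extend the previous prim */
        } else {
            prims[++out] = *cur;         /* keep as a separate prim */
        }
    }
    return out + 1;
}
```

Six prims of 4 vertices each, laid out contiguously, collapse to one prim of 24 vertices.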

Improves minecraft performance by 1.6% +/- .25% (n=446)

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-02-11 13:14:52 -08:00
Eric Anholt
50202f0961 vbo: Print display list debug using printf() like dlist.c does.
Otherwise, the stderr and stdout debug end up interleaved wrong
when I pipe them to a file.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-02-11 13:14:51 -08:00
Eric Anholt
b9a66da258 i965: Remove some stale comments about the brw_constant_buffer atom.
These have been wrong since f428255bde
back in 2009!

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-11 13:14:51 -08:00
Eric Anholt
e07457d0ae i965: Simplify VS push constant upload code since removal of old path.
We used to have clip planes optionally included in the push constants,
resulting in a variable amount of data uploaded, but no more.  This also
means less wasted space in the batch for our push constants.

v2: Update _NEW_TRANSFORM state bit information.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-02-11 13:14:51 -08:00
Eric Anholt
11766b1bbb i965: Add perf debug for a corner case.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-11 13:14:51 -08:00
Eric Anholt
936a3ca6fd i965: Fix access mode of index buffer rebase.
It doesn't matter with our current implementation of MapBufferRange,
but it was wrong -- the result pointer is read by intel_upload_data().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-11 13:14:51 -08:00
Eric Anholt
016928b163 i965: Fix indentation of index buffer rebase code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-11 13:14:51 -08:00
Marek Olšák
cb6470775c mesa: fix GetTexImage if mesa format and internal format don't match
Tested with softpipe only exposing RGBA formats.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
c8379204ab mesa: don't use memcpy fast path for GetTexImage if base format is different
The Mesa format can be RGBA8888_REV, the format/type can be
GL_RGBA/GL_UNSIGNED_BYTE, but the actual texture internal format can be
LUMINANCE_ALPHA, INTENSITY, etc. Therefore we should look at the base
internal format as well.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
09a99867ab mesa: don't use _mesa_base_tex_format for format parameter of GetTexImage
_mesa_base_tex_format doesn't accept GL_BGR and GL_ABGR_EXT, etc.

v2: add a (now hopefully complete) helper function to deal with this

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
5587c8619a mesa: adjust usage of swapBytes/littleEndian in format_matches_format_and_type
- swapBytes has no effect on 8-bit single-component formats
- GL_SHORT is in host byte order, so checking for littleEndian is unnecessary,
  I decided to make the change for single-component formats only

Based on suggestions from Michel Dänzer.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
dcdffaaf43 mesa: remove per-format memcpy codepaths from texstore functions
It's obsoleted by the common function _mesa_texstore_memcpy.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
4bf27ed7ed mesa: implement common texstore memcpy function for all formats
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
967b21df6a mesa: fill in Z32_FLOAT_X24S8 in _mesa_format_matches_format_and_type
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
a0510fa773 mesa: fill in signed cases and RGBA16 in _mesa_format_matches_format_and_type
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
a0fb71888f mesa: fill in INT/UINT format cases in _mesa_format_matches_format_and_type
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
43395da55a mesa: fill in YCBCR cases in _mesa_format_matches_format_and_type
based on the texstore code

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-11 19:43:01 +01:00
Marek Olšák
87f94e6f80 mesa: fill in SRGB cases in _mesa_format_matches_format_and_type
Texstore takes the same codepath as the corresponding linear formats.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 19:43:01 +01:00
Adhemerval Zanella
1ab2c55bf4 llvmpipe: fix vertex_header mask store in big-endian
This patch fixes the vertex_header mask bitfield store in big-endian
architectures by bit-swapping the fields accordingly.

Reviewed-by: Adam Jackson <ajax@redhat.com>
2013-02-11 13:41:28 -05:00
Adhemerval Zanella
a8016b2f60 llvmpipe: remove lp_swizzled_cbuf
Unused since 75da95c5.

Reviewed-by: Adam Jackson <ajax@redhat.com>
2013-02-11 13:41:28 -05:00
Andreas Boll
44a5d7371c docs: document removal of makedepend build dependency
Build dependency removed with
424f200881

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-11 18:11:20 +01:00
Andreas Boll
d59bd61445 docs: update making a new mesa release info
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 10:58:33 +01:00
Andreas Boll
ab10d2d8a5 docs: use proper title for index.html
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 10:58:33 +01:00
Andreas Boll
bf9e19d308 docs: mention some other supported APIs
v2: add ES3

Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2013-02-11 10:58:33 +01:00
Andreas Boll
babc638c72 docs: update sourcetree
glsl directory is located in src and not in src/egl

v2: remove ppc, move glapi from src/mesa to src/mapi

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 10:58:33 +01:00
Andreas Boll
dbbe108951 docs: replace CVS with git
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-11 10:58:33 +01:00
Vinson Lee
990bd49fba configure.ac: Do not check for rt on Mac OS X.
There is no rt library on Mac OS X.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58872
Acked-by: Matt Turner <mattst88@gmail.com>
2013-02-09 15:21:08 -08:00
Ian Romanick
0e2f26d5ea intel: Do not expose OES_compressed_ETC1_RGB8_texture or ARB_texture_rgb10_a2ui pre-GEN4
Older hardware cannot do ARB_texture_rgb10_a2ui, and the translation
code for OES_compressed_ETC1_RGB8_texture was never implemented in the
i915 driver.

NOTE: This is a candidate for all stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-08 19:28:53 -08:00
Roland Scheidegger
75d99673a8 softpipe: clean up lod computation
This should handle the new lod_zero modifier more correctly.
The runtime-conditional is a bit more complex however we now also do
scalar lod computation when appropriate which should more than make up for it.
The refactoring should also fix an issue with explicit lods
(lod clamp wasn't applied to them).
Also, always pass lod as the 5th element from tgsi executor, which simplifies
things (get rid of annoying conditionals later).

v2: based on Brian's feedback, use switch in a couple of places, fix up
some function parameter names, fix up comments.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-08 18:54:40 -08:00
Roland Scheidegger
4f1d757b86 softpipe: try to beat new dx10-style sample opcodes into shape
There were several bugs in how this was handled; most opcodes wouldn't even
have fetched the right arguments.
Also, the tex "target" is coming from the sampler view, hence it cannot
have information about shadow comparisons - fortunately this is not only
sampler state but also needs to have matching instruction, so just use this
instead to identify shadow comparisons.
Still untested (compiles...).
Note that sample_i and sviewinfo are still busted (just assert).
(The problem is that the interface for doing the opengl-equivalent functions
txf and txq is tied to the specific sampler itself but these opcodes
have no sampler associated with them. Oops...)
Also, even the other sample instructions will not work correctly since
they always operate on samplers which include the texture state. Fixing
this wouldn't be that difficult but would most likely make softpipe quite a bit
slower when using the OpenGL tex opcodes (as the samplers have pre-baked
function calls in the sampler state depending on texture state and that stuff
would need to be evaluated at runtime), so leave it for now.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-08 18:54:40 -08:00
Roland Scheidegger
614982d320 gallivm: fix up size queries for dx10 sviewinfo opcode
Need to calculate the number of mip levels (if it would be worthwhile, we could
store it in dynamic state).
While here, the query code also used chan 2 for the lod value.
This worked with mesa state tracker but it seems safer to use chan 0.
Still passes piglit textureSize (with some handwaving), though the non-GL
parts are (largely) untested.

v2: clarify and expect the sviewinfo opcode to return ints, not floats,
just like the OpenGL textureSize (dx10 supports dst modifiers with resinfo).
Also simplify some code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-08 18:54:40 -08:00
Roland Scheidegger
0a8043bb76 gallivm: hook up dx10 sampling opcodes
They are similar to old-style tex opcodes but with separate sampler and
texture units (and other arguments in different places).
Also adjust the debug tgsi dump code.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-08 18:54:40 -08:00
Vinson Lee
db7612d15d intel: Ensure variable intel is used in i915 builds.
Fixes unused pointer value defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-08 18:51:27 -08:00
Vinson Lee
85a9a7f09c glsl: Ensure glsl_type constructors initialize gl_type.
Fixes uninitialized scalar field defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-08 18:50:08 -08:00
Jerome Glisse
9a47684564 winsys/radeon: improve debugging printing
Make sure one can identify virtual address failure from allocation
failure.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-02-08 20:30:09 -05:00
Roland Scheidegger
1d71106f5c softpipe: get rid of tgsi_sampler_control param in img_filter
None of the filters used it (why would they). Maybe that param
was just there because some of the lines were considered to be
too short...

Reviewed-by: Dave Airlie <airlied@redhat.com>
2013-02-08 16:32:30 -08:00
Roland Scheidegger
66b6d51214 softpipe: fix using optimized filter function
This optimized filter (when using repeat wrap modes,
linear min/mag/mip filters, pot textures) only applies to 2d textures,
but nothing prevented it from being used for other textures (likely
leading to very bogus sample results).

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-08 16:32:30 -08:00
Roland Scheidegger
49f8825c49 gallivm: fix typo in lp_build_mul_norm
The signed case didn't do what the comment indicated. Should increase rounding
precision (at the expense of performance since the former code was effectively
a no-op).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-08 16:32:30 -08:00
Roland Scheidegger
67906f91c9 llvmpipe: first steps of adding dual source blend support
This adds support of the additional blending factors to the blend function
itself, and also enables testing of it in lp_test_blend (which passes).
Still need to add the glue code of linking fs shader outputs to blend inputs
in llvmpipe, and probably need to add special handling if destination doesn't
include alpha (which lp_test_blend doesn't test).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-08 16:32:30 -08:00
Roland Scheidegger
8e44f4117a llvmpipe: refactoring of visibility counter handling
There can be other per-thread data than just vis_counter, so pass a struct
around instead (some of our non-public code uses this already and this
difference is a major cause of merge pain).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-08 16:32:30 -08:00
Jerome Glisse
3310acdf47 xorg: fix exa finish access
The exa core will already set the pointer to NULL prior to calling
the callback function. So don't bail out in the callback if it's
already NULL.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-02-08 19:01:19 -05:00
Kristian Høgsberg
1fe007399c egl-wayland: Make sure we allocate a back buffer even if nothing was rendered
At eglSwapBuffer time, we blindly assume we have a back buffer, but the
back buffer only gets allocated when somebody tries to render something.

NOTE: This is a candidate for the 9.0 and 9.1 branches.

https://bugs.freedesktop.org/show_bug.cgi?id=60086
2013-02-08 11:23:18 -05:00
Paul Berry
a4b9678a54 Consolidate some redundant definitions of ARRAY_SIZE() macro.
Previous to this patch, there were 13 identical definitions of this
macro in Mesa source.  That's ridiculous.  This patch consolidates 6
of them to a single definition in src/mesa/main/macros.h.

Unfortunately, I wasn't able to eliminate the remaining definitions,
since they occur in places that don't include src/mesa/main/macros.h:

- include/pci_ids/pci_id_driver_map.h
- src/egl/drivers/dri2/egl_dri2.h
- src/egl/main/egldefines.h
- src/gbm/main/backend.c
- src/gbm/main/gbm.c
- src/glx/glxclient.h
- src/mapi/mapi/stub.c

I'm open to suggestions as to how to deal with the remaining redundancy.
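
For reference, the macro in question is usually spelled along the lines
below.  This is only a minimal sketch: the exact definition in
src/mesa/main/macros.h may differ, and example_formats and
example_format_count are made-up names used purely for illustration.

```c
#include <stddef.h>

/* Number of elements in a true array.  Passing a pointer compiles but
 * silently yields the wrong answer, so use it only on real arrays. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical data, not taken from the Mesa tree. */
static const int example_formats[] = { 8, 16, 32 };

/* Returns the element count of the illustrative array above. */
static size_t example_format_count(void)
{
   return ARRAY_SIZE(example_formats);
}
```

Keeping one shared definition means a fix such as adding a
pointer-vs-array check only has to be made once.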

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-08 06:51:22 -08:00
Paul Berry
dc92b2d11f intel/pre-gen6: Disable EXT_framebuffer_multisample.
Previously, the i965 driver enabled EXT_framebuffer_multisample even
on pre-gen6 chipsets.  However, since we don't support multisampling
on these chips, we set GL_MAX_SAMPLES=1 (the minimum allowed by
EXT_framebuffer_multisample), and if the client ever requested a
multisample buffer, we quietly supplied them with a single-sampled
buffer instead.

After some discussion on the mailing list (see thread
"ext_framebuffer_multisample: check for num_samples<=1"), it's clear
that this was the wrong approach.  The correct approach is to only
expose EXT_framebuffer_multisample when we truly support
multisampling; that frees us to set a sensible value of
GL_MAX_SAMPLES=0 on other chipsets, so that we never have to deal with
a client requesting a multisample buffer when multisampling isn't
supported.

This change causes the following piglit tests to be skipped on
chipsets prior to Gen6:

- "ARB_framebuffer_sRGB/blit {renderbuffer,texture}
  {linear,linear_to_srgb,srgb,srgb_to_linear}
  {downsample,msaa,upsample} {disabled,enabled}"
- EXT_framebuffer_multisample/blit-mismatched-formats
- EXT_framebuffer_multisample/blit-mismatched-sizes
- EXT_framebuffer_multisample/dlist
- EXT_framebuffer_multisample/interpolation 0 *
- EXT_framebuffer_multisample/minmax
- EXT_framebuffer_multisample/negative-copypixels
- EXT_framebuffer_multisample/negative-copyteximage
- EXT_framebuffer_multisample/negative-max-samples
- EXT_framebuffer_multisample/negative-mismatched-samples
- EXT_framebuffer_multisample/negative-readpixels
- EXT_framebuffer_multisample/renderbuffer-samples
- EXT_framebuffer_multisample/renderbufferstorage-samples
- EXT_framebuffer_multisample/samples

This is expected, since the above tests exercise MSAA functionality,
and shouldn't be run on systems prior to Gen6.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-08 06:51:22 -08:00
Vinson Lee
b681ed6ac9 glsl: Initialize all tfeedback_candidate_generator member variables.
Fixes uninitialized pointer field defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-02-07 21:51:20 -08:00
Vinson Lee
7c544e55da nv30: Fix memory leak.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-07 21:45:01 -08:00
Ian Romanick
82691f1293 glsl: Change loop_analysis to not look like a resource leak
Previously the loop_state was allocated in the loop_analysis
constructor, but not freed in the (nonexistent) destructor.  Moving
the allocation of the loop_state makes this code appear less sketchy.

Either way, there is no actual leak.  The loop_state is freed by the
single caller of analyze_loop_variables.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Dave Airlie <airlied@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57753
2013-02-07 21:18:42 -08:00
Paul Berry
04f0d6cc22 mesa: Don't check (offset + size <= bufObj->Size) in BindBufferRange.
In the documentation for BindBufferRange, OpenGL specs from 3.0
through 4.1 contain this language:

    "The error INVALID_VALUE is generated if size is less than or
    equal to zero or if offset + size is greater than the value of
    BUFFER_SIZE."

This text was dropped from OpenGL 4.2, and it does not appear in the
GLES 3.0 spec.

Presumably the reason for the change is that some clients change
the size of the buffer after calling BindBufferRange.  We don't want
to generate an error at the time of the BindBufferRange call just
because the old size of the buffer was too small, when the buffer is
about to be resized.

Since this is a deliberate relaxation of error conditions in order to
allow clients to work, it seems sensible to apply it to all versions
of GL, not just GL 4.2 and above.

(Note that there is no danger of this change allowing a client to
access data beyond the end of a buffer.  We already have code to
ensure that that doesn't happen in the case where the client shrinks
the buffer after calling BindBufferRange).
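
The difference between the two rules can be sketched roughly as follows.
This is not the Mesa code: struct example_buffer and the three helper
names are hypothetical, other argument checks (such as offset alignment)
are left out, and the clamp is only one plausible way to keep accesses in
bounds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a buffer object; only the size matters here. */
struct example_buffer {
   int64_t size;
};

/* Old rule (GL 3.0 through 4.1 wording): reject a range that extends
 * past the buffer's current size at bind time. */
static bool range_valid_strict(const struct example_buffer *buf,
                               int64_t offset, int64_t size)
{
   return size > 0 && offset + size <= buf->size;
}

/* Relaxed rule: only the size itself is checked at bind time, since the
 * buffer may be resized before it is actually used. */
static bool range_valid_relaxed(int64_t offset, int64_t size)
{
   (void) offset;
   return size > 0;
}

/* At use time, clamp the bound range to what the buffer really holds,
 * so a too-small buffer is never accessed past its end. */
static int64_t usable_size(const struct example_buffer *buf,
                           int64_t offset, int64_t size)
{
   if (offset >= buf->size)
      return 0;
   int64_t avail = buf->size - offset;
   return size < avail ? size : avail;
}
```

With a 64-byte buffer, binding offset 32 and size 64 fails the strict
check, passes the relaxed one, and is clamped to 32 usable bytes.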

Eliminates a spurious error message in the gles3 conformance test
"transform_feedback_offset_size".

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-07 21:16:37 -08:00
Ian Romanick
f29ab4ece5 i965: Set UniformBufferOffsetAlignment to sizeof(vec4)
This matches the behavior of the Windows driver, but a bspec reference
would be nice.

NOTE: This is a candidate for the 9.0 and 9.1 branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-07 21:16:08 -08:00
Matt Turner
3ee602314f mesa: Allow glGet* queries of MAX_VARYING_COMPONENTS in ES 3
Should have been done in d9948e49 but I missed it because
MAX_VARYING_FLOATS doesn't appear in the ES 3 spec, but is the same
value as MAX_VARYING_COMPONENTS.

NOTE: Candidate for the 9.1 branch
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-07 17:53:13 -08:00
Daniel van Vugt
6e226ab5ac gbm: Remember to init format on gbm_dri_bo_create.
https://bugs.freedesktop.org/show_bug.cgi?id=60143
2013-02-07 20:00:52 -05:00
Eric Anholt
7242b03622 glx: Centralize the code for context flushing.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-07 13:13:02 -08:00
Eric Anholt
95080ca8d4 glx: Add a little comment about what dri2FlushFrontBuffer() does.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-07 13:13:02 -08:00
Michel Dänzer
c093f12406 radeonsi: Handle scaled and integer formats for samplers and vertex elements.
Also, add assertions to stress that render targets don't support scaled
formats.

20 more little piglits.
2013-02-07 19:07:43 +01:00
Michel Dänzer
23405ef467 radeonsi: Don't advertise PIPE_FORMAT_L8A8_SRGB support.
The hardware can't do it.
2013-02-07 19:07:43 +01:00
Michel Dänzer
a9816cc784 radeonsi: Remove incorrect (and dead) assignment in tex_fetch_args().
The proper return type is assigned at the end of the function.
2013-02-07 19:07:43 +01:00
Michel Dänzer
07eddc444c radeonsi: Use unique names for referring to texture sampling intrinsics.
Append the overloaded vector type used for passing in the addressing
parameters.

Without this, LLVM uses the same function signature for all those types,
which cannot work.

Fixes problems e.g. with FlightGear and Red Eclipse.
2013-02-07 19:07:43 +01:00
Marek Olšák
74a17a764d r300g: put textures with usage=staging in GTT and make them linear 2013-02-07 17:43:19 +01:00
Jerome Glisse
681707abf2 r600g: fix slice tile max for compressed texture and async dma
Was using the pixel size instead of the number of blocks for the slice
tile max computation, which resulted in DMA writing to the wrong address.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-02-07 10:42:22 -05:00
Marek Olšák
9ba1e23647 radeonsi: use new RGBX formats 2013-02-07 00:20:24 +01:00
Marek Olšák
4dc142d521 r300g: fix blending and alpha-test with RGBX16F and enable MSAA for it 2013-02-07 00:20:24 +01:00
Marek Olšák
27e216a075 r300g: use new RGBX formats 2013-02-07 00:20:24 +01:00
Marek Olšák
3c351b7c33 r600g: use new RGBX formats 2013-02-07 00:20:24 +01:00
Marek Olšák
dd21ecdc42 st/mesa: use new RGBX formats
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-07 00:20:24 +01:00
Marek Olšák
f9fa725690 mesa: add RGBX formats for existing GL RGB texture formats
v2: fix compilation of swrast
2013-02-07 00:20:24 +01:00
Marek Olšák
70bf7bae1d gallium: add RGBX formats for existing GL RGB texture formats
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-07 00:20:23 +01:00
Kenneth Graunke
7d467f3c15 i965/blorp: Support blits between ARGB and XRGB formats.
Now that we have support for overriding alpha to 1.0, we can handle
blitting between these formats in either direction.

For now, we only support two XRGB formats: MESA_FORMAT_XRGB8888 and
MESA_FORMAT_RGBX8888_REV.  Most places only appear to worry about the
former, so ignore the latter for now.  We can always add it later.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
2013-02-06 10:01:03 -08:00
Kenneth Graunke
c0554141a9 i965/blorp: Support overriding destination alpha to 1.0.
Currently, Blorp requires the source and destination formats to be
equal.  However, we'd really like to be able to blit between XRGB and
ARGB formats; our BLT engine paths have supported this for a long time.

For ARGB -> XRGB, nothing needs to occur: the missing alpha is already
interpreted as 1.0.  For XRGB -> ARGB, we need to smash the alpha
channel to 1.0 when writing the destination colors.  This is fairly
straightforward with blending.
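
As a CPU-side illustration of the two directions (the patch itself does
this on the GPU via blending, not per pixel in C), assuming pixels packed
as 0xAARRGGBB / 0xXXRRGGBB.  The function names are hypothetical.

```c
#include <stdint.h>

/* XRGB -> ARGB: the X bits are undefined, so force alpha to 1.0 (0xFF). */
static uint32_t example_xrgb_to_argb(uint32_t xrgb)
{
   return xrgb | 0xFF000000u;
}

/* ARGB -> XRGB: nothing to do; readers of the destination treat the
 * missing alpha as 1.0 regardless of what the top byte holds. */
static uint32_t example_argb_to_xrgb(uint32_t argb)
{
   return argb;
}

/* Alpha as seen when sampling an XRGB pixel: always fully opaque. */
static uint8_t example_xrgb_alpha(uint32_t xrgb)
{
   (void) xrgb;
   return 0xFF;
}
```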

For now, this code is never used, as the source and destination formats
still must be equal.  The next patch will relax that restriction.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
2013-02-06 10:00:53 -08:00
Kenneth Graunke
0b3bebbaac i965: Implement CopyTexSubImage2D via BLORP (and use it by default).
The BLT engine has many limitations.  Currently, it can only blit
X-tiled buffers (since we don't have a kernel API to whack the BLT
tiling mode register), which means all depth/stencil operations get
punted to meta code, which can be very CPU-intensive.

Even if we used the BLT engine, it can't blit between buffers with
different tiling modes, such as an X-tiled non-MSAA ARGB8888 texture
and a Y-tiled CMS ARGB8888 renderbuffer.  This is a fundamental
limitation, and the only way around that is to use BLORP.

Previously, BLORP only handled BlitFramebuffer.  This patch adds an
additional frontend for doing CopyTexSubImage.  It also makes it the
default.  This is partly to increase testing and avoid hiding bugs,
and partly because the BLORP path can already handle more cases.  With
trivial extensions, it should be able to handle everything the BLT can.

This helps PlaneShift massively, which tries to CopyTexSubImage2D
between depth buffers whenever a player casts a spell.  Since these
are Y-tiled, we hit meta and software ReadPixels paths, eating 99% CPU
while delivering ~1 FPS.  This is particularly bad in an MMO setting
because people cast spells all the time.

It also helps Xonotic in 4X MSAA mode.  At default power management
settings, I measured a 6.35138% +/- 0.672548% performance boost (n=5).
(This data is from v1 of the patch.)

No Piglit regressions on Ivybridge (v3) or Sandybridge (v2).

v2: Create a fake intel_renderbuffer to wrap the destination texture
    image and then reuse do_blorp_blit rather than reimplementing most
    of it.  Remove unnecessary clipping code and conditional rendering
    check.

v3: Reuse formats_match() to centralize checks; delete temporary
    renderbuffers.  Reorganize the code.

v4: Actually copy stencil when dealing with separate stencil buffers but
    packed depth/stencil formats.  Tested by a new Piglit test.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com> [v4]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v3]
Reviewed-and-tested-by: Carl Worth <cworth@cworth.org> [v2]
Tested-by: Martin Steigerwald <martin@lichtvoll.de> [v3]
2013-02-06 10:00:22 -08:00
Kenneth Graunke
29aef6cce8 mesa: Put extern "C" guards in renderbuffer.h.
I need to use this from C++ code.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-06 09:59:53 -08:00
Brian Paul
48b01e6a10 llvmpipe: remove extraneous const qualifier 2013-02-06 09:16:58 -07:00
Marek Olšák
bc2ceb97f1 gallium/util: remove duplicated function util_format_is_rgb_no_alpha
It only checks if alpha is present, so it's the same as util_format_has_alpha.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:32 +01:00
Marek Olšák
b92057a983 st/mesa: get rid of GET_CURRENT_CONTEXT in st_choose_format
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:32 +01:00
Marek Olšák
2e6f10d0b7 st/mesa: adjust texture format selection to try the closest base format first
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:32 +01:00
Marek Olšák
b89b80a91d st/mesa: put RGBX8 and RGBA8 in the default format lists
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:32 +01:00
Marek Olšák
c1856da75d st/mesa: add the rest of RGB8 format/type combos to exact_format_mapping tables
These formats were added a few months after these tables were committed.
No idea why we have the table though. AFAIK, texstore always takes the slow path
for GL_RGBn.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:32 +01:00
Marek Olšák
ebe86b8082 mesa: fixup inconsistent naming of RG16 formats
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:31 +01:00
Marek Olšák
cf37aef414 r600g: report correct control flow depth 2013-02-06 14:51:31 +01:00
Marek Olšák
fc86394882 glsl: fix incorrect comment about do_common_optimization 2013-02-06 14:51:31 +01:00
Marek Olšák
4362bdadf3 st/mesa: emit saturates in the vertex shader if Shader Model 3.0 is supported
v2: change the requirement from GLSL 1.30 to SM 3.0 (R500 can do this)
2013-02-06 14:51:31 +01:00
Marek Olšák
48689ca14a st/mesa: advertise ARB_shading_language_packing for GLSL >= 1.30
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:31 +01:00
Marek Olšák
afd4178fec st/mesa: do most of GLSL lowering outside of the optimization do-while loop
based on the intel driver

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:31 +01:00
Marek Olšák
7325f1faaa st/mesa: remove dead code depending on EmitCondCodes
EmitCondCodes is always false.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 14:51:31 +01:00
Marek Olšák
85efb2fff0 r300g: try to use color varyings for texcoords if max texcoord limit is exceeded
+35 piglits
2013-02-06 14:45:22 +01:00
Marek Olšák
1d3561d877 r300/compiler: copy-propagate saturate mode when possible
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-02-06 14:45:20 +01:00
Marek Olšák
ae8696c7ee r300/compiler: add support for saturate output modifier in r500 vertex shaders
The GLSL compiler can simplify clamp(v,0,1) to saturate. The state tracker
doesn't use it yet, but it will.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-02-06 14:45:16 +01:00
Marek Olšák
499f7de12e r300g: fix blending with RGBX formats
Change DST_ALPHA to ONE.
2013-02-06 14:31:23 +01:00
Marek Olšák
f40a7fc34a r300g: fix blending with blend color and RGBA formats
NOTE: This is a candidate for the stable branches.
2013-02-06 14:31:23 +01:00
José Fonseca
5048e69392 egl/dri: Don't invoke dri2_dpy->flush if it's NULL.
I'd like to test Mesa OpenGL ES alongside the NVIDIA libGL drivers. But
without this change, I get a NULL pointer dereference.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-06 09:22:26 +00:00
Vinson Lee
d08cee5d80 glsl: Initialize ast_parameter_declarator member variables.
Fixes uninitialized pointer field defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-05 22:11:32 -08:00
Brian Paul
ff60509157 svga: fix sRGB rendering
We weren't emitting the SVGA_RS_OUTPUTGAMMA state so sRGB rendering
didn't work properly.

Fixes piglit's framebuffer-srgb test.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-05 12:34:55 -07:00
Tom Stellard
8aaee4d64e r600g/compute: Fix segfault caused by new shader disassembler 2013-02-05 15:41:33 +00:00
Michel Dänzer
02a423b239 Require libdrm_radeon 2.4.42 for radeonsi.
It has new PCI IDs and an important tiled surface layout fix.
2013-02-05 15:12:14 +01:00
Eric Anholt
86536a321d i965: Disable write masking when setting up texturing m0.
v2/Kayden: Also disable write masking in the vec4 backend.

Fixes 78 oglconform glsl-bif-tex-* subcases.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com> [v1]
Reviewed-by: Eric Anholt <eric@anholt.net> [v2]
2013-02-04 17:29:41 -08:00
Tapani Pälli
e062a4187d intel: Fix regression in intel_create_image_from_name stride handling
Strangely, the DRIimage interface we have passes the pitch in pixels
instead of bytes, which anholt missed in the change to using bytes for
region pitch.
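
The unit mismatch boils down to a multiplication by bytes per pixel.  A
minimal sketch, with a hypothetical helper name and cpp standing for
bytes per pixel of the image format:

```c
#include <stdint.h>

/* Convert a pitch given in pixels (as the image interface passes it)
 * to the pitch in bytes that the region code expects. */
static uint32_t example_pitch_pixels_to_bytes(uint32_t pitch_pixels,
                                              uint32_t cpp)
{
   return pitch_pixels * cpp;
}
```

For a 256-pixel-wide ARGB8888 image (4 bytes per pixel) this gives a
1024-byte pitch; forgetting the conversion would treat the image as a
quarter of its real stride.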

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-02-04 13:59:02 -08:00
Eric Anholt
5751d0cb2d i965: Fix segfaults from 45a28a927a
If you look up a level that isn't in the miptree, you crash.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-02-04 13:58:55 -08:00
Alex Deucher
4161d70bba radeonsi: add Oland pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Note: this is a candidate for the 9.1 branch.
2013-02-04 15:44:38 -05:00
Alex Deucher
af0af75881 radeonsi: default PA_SC_RASTER_CONFIG to 0
That should work in all cases.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Note: this is a candidate for the 9.1 branch.
2013-02-04 15:44:07 -05:00
Alex Deucher
83e4407f44 radeonsi: add support for Oland chips
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Note: this is a candidate for the 9.1 branch
2013-02-04 15:43:21 -05:00
Paul Berry
99b78337e3 glsl: Support transform feedback of varying structs.
Since transform feedback needs to be able to access individual fields
of varying structs, we can no longer match up the arguments to
glTransformFeedbackVaryings() with variables in the vertex shader.

Instead, we build up a hashtable which records information about each
possible name that is a candidate for transform feedback, and then
match up the arguments to glTransformFeedbackVaryings() with the
contents of that hashtable.

Populating the hashtable uses the program_resource_visitor
infrastructure, so the logic is shared with how we handle uniforms.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-04 10:36:47 -08:00
Paul Berry
53febac02c glsl: Use parse_program_resource_name to parse transform feedback varyings.
Previously, transform feedback varyings were parsed in an ad-hoc
fashion that wasn't compatible with structs (or array of structs).
This patch makes it use parse_program_resource_name(), which correctly
handles both.

Note that parse_program_resource_name()'s technique for handling
malformed input strings is to simply let them through and rely on the
fact that a future name lookup will fail.  Because of this,
tfeedback_decl::init() no longer needs to return a boolean error
code--it always succeeds, and if the input was malformed the error
will be detected later.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-04 10:36:44 -08:00
Paul Berry
b4db34cc4c glsl: Rename uniform_field_visitor to program_resource_visitor.
There's actually nothing uniform-specific in uniform_field_visitor.
It is potentially useful for all kinds of program resources (in
particular, future patches will use it for transform feedback
varyings).

This patch renames it to program_resource_visitor, and clarifies
several comments, to reflect the fact that it is useful for more than
just uniforms.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-04 10:36:40 -08:00
Paul Berry
b92900d26a mesa/glsl: Separate parsing logic from _mesa_get_uniform_location.
The parsing logic is moved to a new function in the GLSL module,
parse_program_resource_name().  This name was chosen because it should
eventually be useful for handling everything that OpenGL 4.3 calls
"program resources" (e.g. uniforms, vertex inputs, fragment outputs,
and transform feedback varyings).
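
To give a rough idea of the kind of parsing involved, here is a
simplified sketch that splits a trailing array subscript off a resource
name.  It is not the actual parse_program_resource_name(): the function
name, signature, and return convention below are assumptions made for
illustration, and the real function may apply further checks.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplified parser.  For "name[N]" it returns N and sets
 * *base_len to the length of "name".  When there is no well-formed
 * trailing subscript it returns -1 and sets *base_len to the full
 * length, leaving it to a later name lookup to reject bad input. */
static long example_parse_resource_name(const char *name, size_t *base_len)
{
   size_t len = strlen(name);

   *base_len = len;
   if (len == 0 || name[len - 1] != ']')
      return -1;

   /* Walk backwards over the digits that precede the closing bracket. */
   size_t i = len - 1;
   while (i > 0 && isdigit((unsigned char) name[i - 1]))
      i--;

   /* Require at least one digit, directly preceded by '['. */
   if (i == len - 1 || i == 0 || name[i - 1] != '[')
      return -1;

   *base_len = i - 1;
   return strtol(name + i, NULL, 10);
}
```

Only the last subscript is split off, so "s.v[12]" yields base name
"s.v" and index 12, while "foo" and "foo[]" both report no subscript.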

Future patches will make use of this function for linking transform
feedback varyings.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-02-04 10:36:35 -08:00
Quentin Glidic
11bd1b0f58 gallium/egl: Fix include dirs for VPATH build
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Quentin Glidic <sardemff7+git@sardemff7.net>
2013-02-04 10:36:50 -08:00
Abdiel Janulgue
eaeb314372 intel: make sure to setup image dimension in image_from_planar setup
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=60212
Tested-by: Scott Moreau <oreaus@gmail.com>
Tested-by: Tiago Vignatti <tiago.vignatti@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-04 10:18:22 -08:00
Matt Turner
2db1f73849 builtin_compiler/build: Don't use *_FOR_BUILD when not cross compiling
Previously we were relying on CFLAGS_FOR_BUILD to be the same as CFLAGS
when not cross compiling, but this assumption didn't take into
consideration 32-bit builds on 64-bit systems. More generally, not
honoring CFLAGS is bad.

Automake is evidently too stupid to accept

if CROSS_COMPILING
CC = @CC_FOR_BUILD@
...
else
CC = @CC@
endif

without warning that CC has been already defined. The warnings are
harmless, but I'd prefer to avoid future reports about them, so define
proxy variables, which are assigned inside the conditional and then
unconditionally assigned to CC et al.

NOTE: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59737
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60038
2013-02-04 09:35:45 -08:00
Brian Paul
805cf07dc3 st/mesa: emit SQRT opcode when driver supports it 2013-02-04 09:33:44 -07:00
Brian Paul
13f3ae5b83 gallium/drivers: handle PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED query
Initially, only softpipe/llvmpipe support SQRT.
2013-02-04 09:33:44 -07:00
Brian Paul
2d367e40d9 gallivm: implement support for SQRT opcode 2013-02-04 09:33:44 -07:00
Brian Paul
ad30e4545b tgsi: add support for new SQRT opcode 2013-02-04 09:33:44 -07:00
Brian Paul
d276a40e15 gallium: add SQRT shader opcode
The glsl-to-tgsi translator will emit SQRT to implement GLSL's sqrt()
and distance() functions if the PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED
query says it's supported by the driver.

Otherwise, sqrt(x) is implemented with x*rsq(x).  The problem with
this is sqrt(0) must be handled specially because rsq(0) might be
Inf/NaN/undefined (and then 0*rsq(0) is Inf/NaN/undefined).  In the
glsl-to-tgsi code we use an extra CMP to check if x is zero and then
replace the result of x*rsq(x) with zero.

In the end, this makes sqrt() generate much more reasonable code for
drivers that can do square roots.
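
The fallback path can be mimicked on the CPU as below.  This is a sketch
under stated assumptions, not the glsl-to-tgsi code: example_rsq is a
software stand-in for the RSQ opcode that returns +Inf for zero (real
hardware may return something else), it is only meant for x >= 0, and
all function names are made up.

```c
#include <math.h>

/* Stand-in for RSQ, valid for x >= 0.  Uses Heron's iteration so the
 * example needs no math library; returns +Inf for x == 0. */
static float example_rsq(float x)
{
   if (x == 0.0f)
      return INFINITY;
   float s = x > 1.0f ? x : 1.0f;   /* initial guess >= sqrt(x) */
   for (int i = 0; i < 40; i++)
      s = 0.5f * (s + x / s);
   return 1.0f / s;
}

/* Naive lowering: x * rsq(x).  For x == 0 this is 0 * Inf, i.e. NaN. */
static float example_sqrt_naive(float x)
{
   return x * example_rsq(x);
}

/* Lowering with the extra compare: replace the result with zero when
 * the input is zero, as the CMP in the fallback path does. */
static float example_sqrt_lowered(float x)
{
   float r = x * example_rsq(x);
   return (x == 0.0f) ? 0.0f : r;
}
```

A driver with a native SQRT avoids both the multiply and the compare,
which is where the shorter generated code comes from.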

Note that many of piglit's generated shader tests use the GLSL
distance() function.
2013-02-04 09:33:44 -07:00
Michel Dänzer
6455d40b7e radeonsi: Remove spurious traces of R16G16B16 support.
The hardware can't do it, and these were causing warnings in some piglit tests.

NOTE: This is a candidate for the 9.1 branch.
2013-02-04 17:03:26 +01:00
Michel Dänzer
6bcb823844 radeonsi: Enable texture arrays.
28/30 piglit tests pass.

NOTE: This is a candidate for the 9.1 branch.
2013-02-04 17:03:25 +01:00
Michel Dänzer
120efeef8b radeonsi: Improve packing of texture address parameters.
In particular, the LOD bias and depth comparison values are packed before the
'normal' texture coordinates, and the array slice and LOD values are appended.

NOTE: This is a candidate for the 9.1 branch.
2013-02-04 17:03:25 +01:00
Michel Dänzer
e5fb7347a7 radeonsi: Adapt to sample intrinsics changes.
Fix up intrinsic names, and bitcast texture address parameters to integers.

NOTE: This is a candidate for the 9.1 branch.
2013-02-04 17:03:25 +01:00
Brian Paul
624528834f st/mesa: simplify the update_single_texture() function
In particular, rework the sRGB/linear format selection code.
There's no reason to mess with the Mesa format.
Just do everything in terms of the gallium pipe_format.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-04 08:28:17 -07:00
Brian Paul
5f81549f6c st/mesa: merge st_ChooseTextureFormat_renderable() into st_ChooseTextureFormat()
That was the only place it was being called from.
2013-02-04 08:28:17 -07:00
Brian Paul
f54a9f4ff2 st/mesa: improve the format choosing code for DrawPixels
The code before was getting a pipe format, then calling
st_pipe_format_to_mesa_format() and then converting back again with
st_mesa_format_to_pipe_format().  This removes one conversion step.
2013-02-04 08:28:17 -07:00
Andreas Boll
38d65a9769 gallium: handle unhandled PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60098

Signed-off-by: Brian Paul <brianp@vmware.com>
2013-02-04 08:28:17 -07:00
Brian Paul
4df42890c5 st/mesa: don't choose DXT formats if we can't do DXT compression
If we call gl[Copy]TexImage2D() with a generic compression format
(e.g. intFormat=GL_COMPRESSED_RGBA) we can't choose a DXT format if
we don't have the external DXT compression library.

We weren't actually enforcing this before since the
pipe_screen::is_format_supported(DXT) query has no dependency on
the DXT compression library.

Now if we're given a generic compressed format and we can't do DXT
compression we'll fall back to a non-compressed format.

v2: use util_format_is_s3tc() function and add more comments about
the allow_dxt parameter.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-02-04 07:58:21 -07:00
Brian Paul
478056b81a mesa: don't use format chooser code for glCompressedTexImage
When glCompressedTexImage is called the internalFormat is a specific
format for the incoming image and the hardware format should be
the same (since we never do format transcoding).  So use the simpler
_mesa_glenum_to_compressed_format() function.  This change is also
needed for the next patch.

Note: This is a candidate for the stable branches.
2013-02-04 07:58:21 -07:00
Kenneth Graunke
44aa2e15f6 i965: Fix the SF Vertex URB Read Length calculation for Gen7 platforms.
Ivybridge doesn't appear to have the same errata as Sandybridge; no
corruption was observed by setting it to more than the minimal correct
value.  It's possible that we were simply lucky, since the URB entries
are 1024-bit on Ivybridge vs. 512-bit Sandybridge.  Or perhaps the
underlying hardware issue is fixed.

Either way, we may as well program the minimum value since it's now
readily available, likely to be more efficient, and possibly more
correct.

v2: Use GEN7_SBE_* defines rather than GEN6_SF_*.  (A copy and paste
    mistake.)  They're the same, but using the right names is better.

NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-03 13:41:09 -08:00
Kenneth Graunke
09fbc29828 i965: Fix the SF Vertex URB Read Length calculation for Sandybridge.
(This commit message was primarily written by Paul Berry, who explained
 what's going on far better than I would have.)

Previous to this patch, we thought that the only restrictions on
3DSTATE_SF's URB read length were (a) it needs to be large enough to
read all the VUE data that the SF needs, and (b) it can't be so large
that it tries to read VUE data that doesn't exist.  Since the VUE map
already tells us how much VUE data exists, we didn't bother worrying
about restriction (a); we just did the easy thing and programmed the
read length to satisfy restriction (b).

However, we didn't notice this erratum in the hardware docs: "[errata]
Corruption/Hang possible if length programmed larger than recommended".
Judging by the context surrounding this erratum, it's pretty clear that
it means "URB read length must be exactly the size necessary to read all
the VUE data that the SF needs, and no larger".  Which means that we
can't program the read length based on restriction (b)--we have to
program it based on restriction (a).

The URB read size needs to precisely match the amount of data that the
SF consumes; it doesn't work to simply base it on the size of the VUE.

Thankfully, the PRM contains the precise formula the hardware expects.
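
As a rough illustration of restriction (a), the sketch below sizes the read length from the highest attribute the SF actually reads, not from the VUE size. It assumes the PRM formula is read_length = ceiling((max_source_attr + 1) / 2), i.e. that one unit of read length covers two attributes. Both the formula as quoted here and the function name are assumptions to check against the PRM, not the driver's code.

```python
def sf_urb_read_length(max_source_attr: int) -> int:
    """Smallest read length that still covers attribute `max_source_attr`.

    Assumed formula: ceil((max_source_attr + 1) / 2), written with
    integer arithmetic so no floating point is involved.
    """
    if max_source_attr < 0:
        raise ValueError("max_source_attr must be non-negative")
    return (max_source_attr + 2) // 2
```

With this model, a shader whose highest source attribute is 5 gets a read length of 3, regardless of how many more attributes the VUE happens to contain.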

Fixes random UI corruption in Steam's "Big Picture Mode", random terrain
corruption in PlaneShift, and Piglit's fbo-5-varyings test.

NOTE: This is a candidate for all stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56920
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60172
Tested-by: Jordan Justen <jordan.l.justen@intel.com> (v1/Piglit)
Tested-by: Martin Steigerwald <martin@lichtvoll.de> (PlaneShift)
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-03 13:40:45 -08:00
Kenneth Graunke
5e9bc7bd12 i965: Compute the maximum SF source attribute.
The maximum SF source attribute is necessary to compute the Vertex URB
read length properly, which will be done in the next commit.

NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-03 13:40:43 -08:00
Kenneth Graunke
b3efc5bea8 i965: Refactor Gen6+ SF attribute override code.
The next patch will benefit from easy access to the source attribute
number and whether or not we're swizzling.  It doesn't want the final
attr_override DWord form, however.

NOTE: This is a candidate for all stable branches.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-03 13:40:31 -08:00
Kenneth Graunke
488ddb247c glsl: Remove hash table from ir_set_program_inouts pass.
Back when ir_var_in and ir_var_out signified both function parameters
and shader input/outputs, we had trouble distinguishing the two when
looking at a dereference.  Now that we have separate ir_var_shader_in
and ir_var_shader_out modes, we can determine this easily.

Removing the hash table saves memory and CPU overhead.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-02-03 13:38:16 -08:00
Kenneth Graunke
b56d6badad i965: Remove dead field brw_wm_prog_data::error. 2013-02-03 13:38:16 -08:00
Kenneth Graunke
7eda7a455b i965: Remove dead field brw_context::constant_map.
This was used by the old VS backend, but that's long gone.
2013-02-03 13:38:16 -08:00
Vinson Lee
8a4d952d10 r600g: Fix memory leak.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-01 22:52:22 -08:00
Vinson Lee
080e91aa07 egl/dri2: Fix memory leak.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-01 22:50:34 -08:00
Vinson Lee
cea341fce8 nv30: Fix memory leak.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-01 22:50:26 -08:00
Vinson Lee
4cd4deab48 nv50: Fix memory leak.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-01 22:50:16 -08:00
Vinson Lee
0580f165ed nvc0: Fix memory leak.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-01 22:50:01 -08:00
Vinson Lee
985e710c0d swrast: Fix memory leak.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-02-01 22:49:45 -08:00
Quentin Glidic
1e857130f0 configure.ac: Fix --with-llvm-shared-libs
The third argument of AC_ARG_WITH is evaluated for any provided value,
not only on --with-, so it must not force-enable the feature.
Also, setting $with_llvm_shared_libs in the opencl check was overriding
the user switch.

https://bugs.freedesktop.org/show_bug.cgi?id=59851

Signed-off-by: Quentin Glidic <sardemff7+git@sardemff7.net>
2013-02-01 22:53:46 +00:00
Tom Stellard
257006e2a4 r600g/llvm: Select the correct GPU type for RV670
RV670 belongs in the R600 chip class

https://bugs.freedesktop.org/show_bug.cgi?id=58666

NOTE: This is a candidate for the 9.1 branch
2013-02-01 22:53:30 +00:00
Abdiel Janulgue
6c7e95cb89 intel: implement create image from texture
Save miptree level info to DRIImage:
- Appropriately-aligned base offset pointing to the image
- Additional x/y adjustment offsets from above.

v8:  -Bump intelImageExtension version
v9:  -Don't use internal _eglError but implement error reporting in new DRI interface
      instead. This fixes Android build problems based on feedback from
      Adrian M Negreanu and Chad Versace.
     -Move the non-tile-aligned check and error-reporting to intel_set_texture_image_region
v10: -Don't #include "egl/main/eglcurrent.h". [chadv]

Reviewed-by: Eric Anholt <eric@anholt.net> (v6)
Acked-by: Chad Versace <chad.versace@linux.intel.com> (v10)
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-01 11:58:13 -08:00
Abdiel Janulgue
8e2454c562 intel: Account for mt->offset in intel_miptree_map
We need to take into account the offset from the original bo when using
glTexSubImage() and other functions that manipulate a subregion of an
exported texture. Offsets are added to the mapped region address and
applied when blitting from a source region.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-01 11:58:12 -08:00
Abdiel Janulgue
11f5c82e83 intel: Create a miptree using offsets in intel_set_texture_image_region
When binding a region to a texture image, re-create the miptree base-level
considering the offset and dimension information exported by DRIImage.

v8: - Move the alignment surface address checks from the image-from-texture
      code to the texture-from-image side. This allows the error reporting to conform to
      OES_EGL_Image and to prevent mixing up EGL and GL errors. Reported by Chad Versace.
    - Addressed an existing issue in the renderbuffer case where there is a
      possibility of creating EGL images out of depthstencil textures, which isn't
      really possible. This was spotted by Eric earlier.

Reviewed-by: Eric Anholt <eric@anholt.net> (v6)
Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v8)
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-01 11:58:12 -08:00
Abdiel Janulgue
45a28a927a i965: Account for offsets when updating SURFACE_STATE.
If the offsets are present, this lets us specify a particular level and slice
in a shared region using the base level of an exported mip-map tree.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-01 11:58:12 -08:00
Abdiel Janulgue
163b35e416 intel: add pixel offset calculator for miptree levels
Add helper to calculate fine-grained x and y adjustment pixels
to an image within a miptree level for tiled regions.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-01 11:58:12 -08:00
Abdiel Janulgue
7014df0d1d intel: Expose intel_miptree_create_internal as intel_miptree_create_layout.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-01 11:58:12 -08:00
Abdiel Janulgue
f9e4e5f9f9 intel: expose dimensions and offsets of a miptree level in DRIImage
v8: - Append has_depthstencil field in DRIImage structure.

Reviewed-by: Eric Anholt <eric@anholt.net> (v6)
Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v8)
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-01 11:58:12 -08:00
Abdiel Janulgue
7b7af48e01 dri2: Create image from texture
Add create image from texture extension and bump version.

v8: - Add appropriate image errors codes in DRI interface so we don't
      have to use internal EGL functions in driver. Suggested by Chad Versace.

Reviewed-by: Eric Anholt <eric@anholt.net> (v6)
Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v8)
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2013-02-01 11:58:12 -08:00
Michel Dänzer
a8a5055f2d radeonsi: Fix draws using user index buffer.
Was broken since commit bf469f4edc
('gallium: add void *user_buffer in pipe_index_buffer').

Fixes 11 piglit tests and lots of missing geometry e.g. in TORCS.

NOTE: This is a candidate for the 9.1 branch.
2013-02-01 18:53:03 +01:00
Brian Paul
1bb52bab9e st/mesa: whitespace/indentation fix 2013-02-01 08:00:28 -07:00
Brian Paul
3cb4915344 svga: check for NaN shader immediates
The svga device doesn't handle them.  Replace with zeros.
Fixes several piglit tests, such as "glsl-const-builtin-inversesqrt".
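
The replacement described above can be modelled in a few lines. This is only a sketch of the idea (scan the immediates, substitute zero for any NaN); the function name is hypothetical and the real change operates on the driver's shader immediates, not on Python lists.

```python
import math


def sanitize_immediates(values):
    """Return a copy of `values` with every NaN replaced by 0.0."""
    return [0.0 if math.isnan(v) else v for v in values]
```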

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-02-01 08:00:28 -07:00
Brian Paul
9eff5e905f svga: add, use SVGA3D_SURFACE_HINT_VOLUME flag
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-02-01 08:00:28 -07:00
Brian Paul
9a91ce9448 trace: measure time for each gallium call
To get a rough idea of how much time is spent in each gallium driver
function.  The time is measured in microseconds.
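
A minimal sketch of per-call timing in microseconds, for illustration only. The wrapper name and the injectable clock are assumptions made so the example can be tested; the trace driver wraps gallium entry points in C and is not structured like this.

```python
import time


def timed_call(func, *args, clock=time.perf_counter_ns):
    """Call `func(*args)` and return (result, elapsed_microseconds).

    `clock` must return a monotonically increasing time in nanoseconds.
    """
    start = clock()
    result = func(*args)
    elapsed_us = (clock() - start) // 1000
    return result, elapsed_us
```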
2013-02-01 08:00:28 -07:00
Brian Paul
b516bf46ef trace: add void to function definition 2013-02-01 08:00:28 -07:00
Brian Paul
fe20e3ebb5 trace: allow GALLIUM_TRACE=stdout/stderr 2013-02-01 08:00:28 -07:00
Marek Olšák
225228a7f5 radeonsi: port some of get_shader_param changes from r600g
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-02-01 15:16:35 +01:00
Marek Olšák
cc5fdaf2dc mesa: don't expose IBM_rasterpos_clip in a core context
glRasterPos doesn't exist in the core profile.

NOTE: This is a candidate for the stable branches (9.0 and 9.1).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-02-01 15:16:35 +01:00
Marek Olšák
a06f03d795 r300g: always put MSAA resources in VRAM
This along with the latest drm-fixes branch should help with bad performance
of MSAA. Remember: Nx MSAA can't be more than N times slower (where N=2,4,6).

Anyway, I recommend at least 512 MB of VRAM for Full HD 6x MSAA.

NOTE: This is a candidate for the 9.1 branch.
2013-02-01 15:16:35 +01:00
Michel Dänzer
3b888f534c configure.ac: GLX cannot work without OpenGL
GLX uses mapi/glapi/libglapi.la, which is only built for OpenGL.

If the user specified --enable-xlib-glx --disable-opengl, error out, as these
cannot be both observed at the same time. If the user just specified
--disable-opengl but not --disable-glx, print a warning and disable GLX as
well.

NOTE: This is a candidate for the stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59364

Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-02-01 11:42:09 +01:00
Vadim Girlin
9824755dae r600g: remove broken assert from r600_isa.c
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-02-01 13:19:35 +04:00
Vadim Girlin
e42111ecba r600g: implement shader disassembler v3
The R600_DUMP_SHADERS environment variable now selects the dump method:
 0 (default) - no dump
 1 - full dump (old dump)
 2 - disassemble
 3 - both
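
The four values listed above are consistent with a two-bit mask (bit 0 for the full dump, bit 1 for the disassembly). The sketch below decodes the variable on that assumption; the bitmask reading, the function name and the fallback to 0 on unparsable input are illustrative guesses, not taken from the patch.

```python
import os

DUMP_FULL = 1    # assumed: bit 0 selects the old full dump
DUMP_DISASM = 2  # assumed: bit 1 selects the disassembler


def parse_dump_mode(env=None):
    """Return (full_dump, disassemble) decoded from R600_DUMP_SHADERS."""
    env = os.environ if env is None else env
    try:
        mode = int(env.get("R600_DUMP_SHADERS", "0"))
    except ValueError:
        mode = 0  # treat unparsable values as "no dump"
    return bool(mode & DUMP_FULL), bool(mode & DUMP_DISASM)
```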

v2: fix output for burst_count > 1
v3: use more human-readable output for kcache data in CF_ALU_xxx clauses,
    improve output for ALU_EXTENDED, other minor fixes

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-02-01 12:08:42 +04:00
Vadim Girlin
022122ee63 r600g: use tables with ISA info v3
v3: added some flags including condition codes for ALU,
    fixed issue with CF reverse lookup (overlapping ranges of CF_ALU_xxx
    and other CF instructions)
    rebased on current master

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-02-01 12:08:42 +04:00
Vinson Lee
b68a3b865b glapi: Do not use backtrace on MinGW.
execinfo.h is not available on MinGW.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-01-31 23:23:12 -08:00
Jerome Glisse
5e0c956cb2 r600g: add cs memory usage accounting and limit it v3
We are now seeing cs that can go over the vram+gtt size. To avoid
failing, flush early any cs that goes over 70% (gtt+vram) usage. 70%
is used to allow for some fragmentation.

The idea is to compute a gross estimate of memory requirement of
each draw call. After each draw call, memory will be precisely
accounted. So the uncertainty is only on the current draw call.
In practice this gave very good estimate (+/- 10% of the target
memory limit).
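
The flush decision can be sketched as follows. The names and the integer percentage are hypothetical; the only things taken from the description are the 70% limit on gtt+vram and the split between memory already accounted precisely and a gross estimate for the current draw call.

```python
def must_flush_before_draw(accounted, draw_estimate, vram_size, gtt_size,
                           percent=70):
    """Return True if adding this draw would push the cs over the limit.

    `accounted` is the memory already accounted for earlier draws,
    `draw_estimate` is the gross estimate for the draw being added.
    All sizes are in the same unit (for example bytes).
    """
    limit = (vram_size + gtt_size) * percent // 100
    return accounted + draw_estimate > limit
```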

v2: Remove left over from testing version, remove useless NULL
    checking. Improve commit message.
v3: Add comment to code on memory accounting precision

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-01-31 14:23:52 -05:00
Marek Olšák
5c86a728d4 r600g: fix htile buffer leak
NOTE: This is a candidate for the 9.1 branch.
2013-01-31 15:35:18 +01:00
Andreas Boll
6ea753b056 mesa: bump version to 9.2 (devel)
Now that branch 9.1 is created, bump the minor version in
master.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-31 09:01:15 +01:00
Matt Turner
a527b2192e Revert "mesa: Return INVALID_OPERATION when type is known but not allowed"
This reverts commit 2906e2034c.

Fixes a regression in the glean depthStencil test.

Reverting this does not affect any tests in es3conform, so a more recent
patch must have also fixed the failure this one was intended to fix.

Reported-by: lu hua <huax.lu@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59494
2013-01-30 10:56:01 -08:00
Kenneth Graunke
7cccf46ec4 mesa: Add TexBufferRange to dispatch_sanity.
Christoph implemented this, so we should expect it to be present now.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60082
2013-01-30 10:48:05 -08:00
Christoph Bumiller
4bdf5454a5 nv50,nvc0: fix/enable texture buffer objects 2013-01-30 13:10:11 +01:00
Christoph Bumiller
a901d54f67 st/mesa: add support for GL_ARB_texture_buffer_range
v2: Update to handle BufferSize being -1 and return a NULL sampler
view if the specified range would cause out of bounds access.
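
A sketch of the range check described in v2, under stated assumptions: a size of -1 is taken to mean "from the offset to the end of the buffer", and returning None stands in for returning a NULL sampler view. The function name and exact bounds rules are illustrative, not the state tracker's code.

```python
def sampler_view_range(offset, size, buffer_bytes):
    """Return (offset, length) for the view, or None if out of bounds."""
    if size == -1:
        size = buffer_bytes - offset  # whole-buffer binding
    if offset < 0 or size <= 0 or offset + size > buffer_bytes:
        return None
    return offset, size
```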

Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-30 13:10:11 +01:00
Christoph Bumiller
0fcd2c5e2f gallium: add PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-30 13:10:11 +01:00
Christoph Bumiller
785a8c3beb mesa: implement GL_ARB_texture_buffer_range
v2: Record texObj.BufferSize as -1 in TexBuffer(non-Range) instead
of the buffer's current size so we know we always have to use the
full size of the buffer object (i.e. even if it changes without the
user calling TexBuffer again) for the texture.
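
To show why the sentinel matters, here is a small model in which the texture keeps -1 and looks up the buffer's size on demand. The class and attribute names are hypothetical stand-ins for the texture object and buffer object.

```python
WHOLE_BUFFER = -1  # sentinel: always use the buffer's current size


class BufferObject:
    def __init__(self, size):
        self.size = size


class TextureBuffer:
    def __init__(self, buffer, offset=0, size=WHOLE_BUFFER):
        self.buffer = buffer
        self.offset = offset
        self.size = size

    def effective_size(self):
        """Size the texture covers right now."""
        if self.size == WHOLE_BUFFER:
            return self.buffer.size
        return self.size
```

If the buffer is reallocated to a different size, a texture bound through the non-range path reports the new size without being rebound, while a ranged binding keeps its explicit size.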

Clarify invalid offset alignment error message.

v3: Use extra GL_CORE-only section in get_hash_params.py for
TEXTURE_BUFFER_OFFSET_ALIGNMENT.

v4: Remove unnecessary check for profile in _mesa_TexBufferRange.
Add check for extension enable in get_tex_level_parameter_buffer.

v5: Fix position in gl_API.xml.
Add comment about meaning of BufferSize == -1.

v6: Add back checks for core profile and add a note about it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-30 13:10:10 +01:00
Matt Turner
02b6da1e87 build: Add missing comma in AS_IF
Reported-by: Lauri Kasanen <curaga@operamail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47248#c15
2013-01-29 13:19:18 -08:00
Brian Paul
ce6bf2d4c5 mesa: remove ctx->Driver.Error() hook
Not used by any driver anymore.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-29 12:32:13 -07:00
Stéphane Marchesin
67e7263e45 glx: Check that swap_buffers_reply is non-NULL before using it
Check that the return value from xcb_dri2_swap_buffers_reply is
non-NULL before accessing the struct members.

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-29 11:15:22 -08:00
Brian Paul
70c5297439 mesa: fix comment typo: s/formaat/format/ 2013-01-29 11:53:24 -07:00
José Fonseca
42f762dcf6 llvmpipe: Don't advertise S8_UNORM (with feeble attempt at supporting it).
S8_UNORM was inadvertently supported together with Z16_UNORM.

I tried to update the code to accommodate stencil-only -- it seemed a simple
thing to do -- but "fbo-stencil clear GL_STENCIL_INDEX8" still fails,
and it's not worth debugging.

Therefore although this change tries to update for S8_UNORM, it also
disables it completely.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-01-29 16:41:56 +00:00
José Fonseca
3b683700ef llvmpipe: Fix deferred depth writes for Z16_UNORM.
This special path hadn't been exercised by my earlier testing, and mask
values weren't being properly truncated to match the values.

This change fixes that.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-01-29 16:41:56 +00:00
Roland Scheidegger
0eb588a37c draw: fix draw_llvm_variant_key struct padding to avoid recompiles
The struct padding got broken by c789b981b2.
This caused serious performance regression because part of the key was
uninitialized and hence the shader always recompiled (at least on release
builds...).
While here also fix key size calculation when the number of samplers
and the number of sampler views are different.

v2: add comment

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-29 08:40:52 -08:00
Marek Olšák
845130951f docs/relnotes-9.1: document new features in radeon drivers 2013-01-29 17:35:17 +01:00
Brian Paul
d83336ce3e docs: more VMware guest driver info, tips 2013-01-29 08:59:53 -07:00
Brian Paul
c80bacba2e st/mesa: only enable GL_EXT_framebuffer_multisample if GL_MAX_SAMPLES >= 2
We never really have multisampling with one sample per pixel.
See also http://bugs.freedesktop.org/show_bug.cgi?id=59873

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-01-29 08:59:53 -07:00
Brian Paul
8f3c81d018 mesa: don't enable GL_EXT_framebuffer_multisample for software drivers
Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-01-29 08:59:53 -07:00
Brian Paul
2180f32972 osmesa: use _mesa_generate_mipmap() for mipmap generation, not meta
See previous commit for more info.

Note: This is a candidate for the 9.0 branch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-29 08:59:53 -07:00
Brian Paul
89551ae04f xlib: use _mesa_generate_mipmap() for mipmap generation, not meta
The swrast fragment program interpreter has trouble computing the
right texture LOD because it doesn't have easy access to input
derivatives.  This causes the GLSL-based meta generate mipmap code
to fetch texels from the wrong mipmap level.

One possible fix would be to set the GL_TEXTURE_MIN/MAX_LOD parameters
to limit sampling from the right level.  But let's just use the
_mesa_generate_mipmap() fallback since it's a lot faster than using
the fragment shader interpreter.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=54240

Note: This is a candidate for the 9.0 branch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-29 08:59:53 -07:00
Brian Paul
d60da27273 st/mesa: set ctx->Const.MaxSamples = 0, not 1
The gallium docs for pipe_screen::is_format_supported() say that
samples==0 or samples==1 both mean that multisampling is not supported.
Return GL_MAX_SAMPLES==0 instead of 1 for consistency with other drivers.
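
The normalization can be sketched like this. The function names are hypothetical; the only rule taken from the text is that a sample count of 0 or 1 means multisampling is not supported, so the reported maximum collapses to 0 in both cases.

```python
def reported_max_samples(hw_max_samples):
    """Value to report as GL_MAX_SAMPLES for a given hardware maximum."""
    return hw_max_samples if hw_max_samples >= 2 else 0


def is_multisampled(samples):
    """0 and 1 both mean 'no multisampling'."""
    return samples >= 2
```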

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-01-29 08:59:53 -07:00
Brian Paul
4e41ae5fc1 xlib: stop using _mesa_enable_extension(), just set the boolean flags
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-29 08:59:53 -07:00
Brian Paul
becec657d6 xlib: fix incorrect GL_ANGLE_texture_compression_dxt enable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-29 08:59:53 -07:00
José Fonseca
0ca384fb39 llvmpipe: Support Z16_UNORM as depth-stencil format.
Simply by adjusting the vector element width after/before
reading/writing the depth-stencil values.

Ran several GL_DEPTH_COMPONENT16 piglit tests without regressions.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-01-29 07:06:36 +00:00
Kenneth Graunke
9add4e8038 i965: Add chipset limits for Haswell GT1/GT2.
The maximum number of URB entries comes from the 3DSTATE_URB_VS and
3DSTATE_URB_GS state packet documentation; the thread count information
comes from the 3DSTATE_VS and 3DSTATE_PS state packet documentation.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
2013-01-28 17:08:28 -08:00
Kenneth Graunke
7b07808f74 intel: Un-hardcode lengths from blitter commands.
The packet length may change at some point in the future.  Specifying it
explicitly (rather than hardcoding it in the command #define) allows us
to change it much more easily in the future.
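
A sketch of the idea, assuming the command header stores the packet length as the total DWord count minus two in its low bits. That bias and the opcode value below are assumptions for illustration; neither is taken from this patch.

```python
# Hypothetical opcode bits for a blit command, with no length folded in.
XY_EXAMPLE_BLT_CMD = (2 << 29) | (0x50 << 22)


def blit_header(cmd, dword_count):
    """Combine opcode bits with an explicitly supplied packet length."""
    if dword_count < 2:
        raise ValueError("packet must be at least 2 DWords long")
    return cmd | (dword_count - 2)
```

Keeping the length out of the opcode constant means a caller that emits a longer variant of the same packet only changes the count it passes.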

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-28 16:47:52 -08:00
Matt Turner
1b3ec16cc2 Remove APIspec.dtd
Left behind by a8ab7e33.
2013-01-28 16:48:38 -08:00
Matt Turner
6324521789 docs: List new extensions added in Mesa 9.1
I did not list the *_get_program_binary extensions since they're not
useful to anyone with their current implementation (that supports 0
binary formats).
2013-01-28 16:48:38 -08:00
Eric Anholt
99fe2b36cf intel: Use a CPU map of the batch on LLC-sharing architectures.
Before, we were keeping a CPU-only buffer to accumulate the batchbuffer in,
which was an improvement over mapping the batch through the GTT directly
(since any readback or other failure to stream through write combining
correctly would hurt).  However, on LLC-sharing architectures we can do better
by mapping the batch directly, which reduces the cache footprint of the
application since we no longer have this extra copy of a batchbuffer around.

Improves performance of GLBenchmark 2.1 offscreen on IVB by 3.5% +/- 0.4%
(n=21).  Improves Lightsmark performance by 1.1% +/- 0.1% (n=76).  Improves
cairo-gl performance by 1.9% +/- 1.4% (n=57).

No statistically significant difference in GLB2.1 on SNB (n=37).  Improves
cairo-gl performance by 2.1% +/- 0.1% (n=278).
2013-01-29 11:25:14 +11:00
Jerome Glisse
e1598cb642 r600g: use uint64_t instead of unsigned long for proper 32bits cpu support
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-28 19:09:52 -05:00
Jerome Glisse
da638781f6 r600g: real fix for non 3.8 kernel
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-28 17:17:00 -05:00
Vinson Lee
1559994cba i965: Fix assignment instead of comparison in asserts.
Fixes side effect in assertion defects reported by Coverity.

Note: This is a candidate for the 9.1 branch.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-28 13:51:10 -08:00
Tapani Pälli
407029591c android: use gralloc_drm_get_gem_handle api
Currently a gralloc internal structure is exposed to Mesa.
Use a query function instead to maintain ABI compatibility.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-28 12:49:41 -08:00
Paul Berry
8e4bb4bc09 intel: Typo fix: "pitsh" -> "pitch"
Comment change only.
2013-01-28 12:31:25 -08:00
Jerome Glisse
72916698b0 r600g: fix segfault with old kernel
Old kernels do not have dma support; the patches pushed earlier were
missing some of the checks needed to avoid using dma.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-28 14:51:40 -05:00
Zack Rusin
dbb2d192de glx: only advertise GLX_INTEL_swap_event if it's supported
Only drivers supporting DRI2 version >=4 support GLX_INTEL_swap_event.
So let's mark it as such; otherwise applications which use this extension
(i.e. everything based on Clutter, e.g. gnome-shell) break horribly on
drivers supporting DRI2 versions only up to 3.
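
In outline, the gating looks like the sketch below. The function name and the list representation are hypothetical; only the version threshold (DRI2 >= 4) comes from the description.

```python
def advertised_glx_extensions(dri2_version, base=()):
    """Return the extension list, adding swap events only when supported."""
    extensions = list(base)
    if dri2_version >= 4:
        extensions.append("GLX_INTEL_swap_event")
    return extensions
```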

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-24 19:13:05 -08:00
Vadim Girlin
c9343047cf r600g: improve inputs/interpolation handling with llvm backend
Get rid of special handling for reserved regs.
Use one intrinsic for all kinds of interpolation.

v2[Vincent Lejeune]: Rebased against current master

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-01-28 18:30:38 +00:00
Tom Stellard
33dc412b89 r600g: Add ar_chan member to struct r600_bytecode
r600_bytecode::ar_chan stores the register channel for the value that
will be loaded into the AR register.

At the moment, this field is only used by the LLVM backend.  The default
backend always sets ar_chan = 0.
2013-01-28 18:30:38 +00:00
Tom Stellard
0ba0926861 r600g: More robust checks for MOVA_INT instructions 2013-01-28 18:30:37 +00:00
Vincent Lejeune
a871e01174 r600g/llvm: Add dummy export for vs output
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=59588

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-01-28 18:30:37 +00:00
Tom Stellard
91a160b19f r600g: Fix building with --enable-r600-llvm-compiler
https://bugs.freedesktop.org/show_bug.cgi?id=59877
2013-01-28 18:30:37 +00:00
Alex Deucher
e110c98cae r600g: don't emit WAIT_UNTIL on cayman/TN (v2)
It shouldn't be needed and older kernels don't support
it.

v2: Replace with PS partial flush as before.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=59945

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-01-28 12:11:27 -05:00
Jerome Glisse
325422c494 r600g: add async for staging buffer upload v2
v2: Add virtual address to dma src/dst offset for cayman

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-28 11:30:35 -05:00
Jerome Glisse
bff07638a8 r600g: add multi ring support with dma as first second ring v4
We keep track of ring emission order in a stack; whenever we need to
flush, we empty the stack in fifo order. There are a few helper functions
for bo mapping and other ring activities that make sure that
the ring stack is properly flushed and submitted.

v2: fix st flush path, and other flush path to properly flush all
    rings if necessary
v3: - improve name of ring helpers
    - make sure that each time a cs is going to be written it ends up at
      the top of the stack, to avoid issues such as:
      STACK[0] = dma (with bo A,B)
      STACK[1] = gfx (with bo C,D)
      Now if the code tries to emit a dma command relative to bo C or D,
      it will start writing the cmd stream into the cs, and once it
      reaches the point where it adds the relocation it will flush.
      At that point the cs will have cmds that don't have a proper
      relocation in the relocation buffer and the kernel will just
      refuse to run it.
v4: - Drop the stack idea, as it turns out there is no way to use it
      or benefit from it. Any time the driver starts a command on another
      ring, it always needs to flush the previous ring. So make the code
      simpler by not using a stack.
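
The v4 rule (starting commands on another ring always flushes the previous one) can be modelled without any stack. The class below is a simplified sketch with hypothetical names, not the driver's data structures.

```python
class RingSwitcher:
    """Tracks which ring currently has unflushed commands."""

    def __init__(self, flush):
        self._flush = flush   # callback that submits a ring's cs
        self._active = None   # ring with pending commands, if any

    def begin_commands(self, ring):
        # Starting work on a different ring flushes the previous one first.
        if self._active is not None and self._active != ring:
            self._flush(self._active)
        self._active = ring
```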

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-28 11:30:35 -05:00
Jerome Glisse
6c064fd749 radeon/winsys: add dma ring support to winsys v3
Add ring support; you can create a cs for each ring. The DMA ring is a
bit special regarding relocations, as you must emit as many relocations
as there are uses of the buffer.

v2: - Improved comment on relocation changes
    - Use a single thread to queue cs submissions. This simplifies driver
      code while not impacting performance. The rationale for this is that
      you have to wait for all previous submissions to have completed,
      so there was never a case where we could have 2 different threads
      submitting a command stream at the same time. This code just
      consolidates submission into one single thread per winsys.
v3: - Do not use a semaphore for empty queue signaling; instead use a
      cond var. This is because it's tricky to maintain an even number
      of calls to semaphore wait and semaphore signal (the number of
      cs in the stack would for instance make that number vary).
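
The v2/v3 design (one submission thread per winsys, with a condition variable signaling an empty queue) can be sketched as below. This is a model under assumptions: the class and method names are hypothetical, and the real code is C using the winsys threading helpers.

```python
import threading
from collections import deque


class SubmitQueue:
    """Single worker thread that submits queued command streams in order."""

    def __init__(self, submit):
        self._submit = submit
        self._pending = deque()
        self._cond = threading.Condition()
        self._busy = False
        self._stop = False
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def push(self, cs):
        with self._cond:
            self._pending.append(cs)
            self._cond.notify_all()

    def wait_idle(self):
        # Block until the queue is empty and nothing is being submitted.
        with self._cond:
            self._cond.wait_for(lambda: not self._pending and not self._busy)

    def close(self):
        with self._cond:
            self._stop = True
            self._cond.notify_all()
        self._thread.join()

    def _run(self):
        while True:
            with self._cond:
                self._cond.wait_for(lambda: self._pending or self._stop)
                if not self._pending:
                    return  # stop requested and nothing left to submit
                cs = self._pending.popleft()
                self._busy = True
            self._submit(cs)
            with self._cond:
                self._busy = False
                self._cond.notify_all()
```

Because waiters test a predicate on the queue state, there is no count of waits and signals to keep balanced, which is the difficulty the v3 note attributes to semaphores.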

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-28 11:30:35 -05:00
Roland Scheidegger
cbf0f66631 gallivm,draw,llvmpipe: mass rename of unit->texture_unit/sampler_unit
Make it obvious what "unit" this is (no change in functionality).
draw still uses "unit" in places where it changes the shader by adding
texture sampling itself - it seems like this can't work with shaders
using dx10-style sample opcodes (can't mix gl-style and dx10-style
sample instructions in a shader).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-01-28 06:58:06 -08:00
Roland Scheidegger
c789b981b2 gallivm: split sampler and texture state
Split the sampler interface to use separate sampler and texture (sampler_view)
state. This is needed to support dx10-style sampling instructions.
This is not quite complete since both draw/llvmpipe don't really track
textures/samplers independently yet, as well as the gallivm code not quite
using the right sampler or texture index respectively (but it should work
for the sampling codes used by opengl).
We are however losing some optimizations in the process, apply_max_lod will
no longer work, and we potentially could end up with more (unnecessary)
recompiles (if switching textures with/without mipmaps only so it shouldn't
be too bad).

v2: don't use different callback structs for sampler/sampler view functions
(which just complicates things), fix up sampling code to actually use the
right texture or sampler index, and similar for llvmpipe/draw actually
distinguish between samplers and sampler views.

v3: fix more of PIPE_MAX_SAMPLER / PIPE_MAX_SHADER_SAMPLER_VIEWS mismatches
(both in draw and llvmpipe), based on feedback from José get rid of unneeded
static sampler derived state (which also fixes the only 2 piglit regressions
due to a forgotten assignment), fix comments based on Brian's feedback.

v4: remove some accidental unrelated whitespace changes

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-28 06:50:36 -08:00
Marek Olšák
87592cff57 gallium/u_upload_mgr: fix a serious memory leak
It can eat all memory and crash in a matter of minutes with r600g.
2013-01-28 02:51:52 +01:00
Christoph Bumiller
e058f2ac97 nouveau: don't try to use push_data if it's not implemented 2013-01-27 13:45:06 +01:00
Matt Turner
51b64ce47b gles3: Update gl3.h
Contains a fix for Khronos bug 9557.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-26 20:42:19 -08:00
Marek Olšák
8891b2f9c9 r600g: add more cases for copying unsupported formats to resource_copy_region
just in case a new format is added to gallium
2013-01-26 14:59:04 +01:00
Marek Olšák
26c872c2a2 r600g: don't use radeon_surface_level::npix_x/y/z
npix_x/y/z is wrong with NPOT textures, since it's always aligned to POT
if the level is non-zero, so we can't use that.
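
A small model of the mismatch, assuming the surface code rounds non-zero levels up to the next power of two as described. The helper names are hypothetical, and the POT rounding rule is inferred from the sentence above, not copied from the winsys.

```python
def minify(size, level):
    """Exact size of a mip level: halve per level, never below 1."""
    return max(1, size >> level)


def next_pot(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    p = 1
    while p < n:
        p <<= 1
    return p


def pot_aligned_npix(size, level):
    """Model of a POT-aligned level size: exact at level 0, rounded after."""
    exact = minify(size, level)
    return exact if level == 0 else next_pot(exact)
```

For a 100-pixel-wide NPOT texture, level 1 is really 50 pixels wide, but the POT-aligned value is 64, so code that needs the true level size has to compute it with the minify rule.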

This fixes piglit/spec/EXT_texture_shared_exponent/fbo-generatemipmap-formats.
2013-01-26 14:58:52 +01:00
Marek Olšák
edc38330da r600g: fix compile warnings in r600_cp_dma_copy_buffer on 32-bit gcc 2013-01-26 14:50:36 +01:00
Alex Deucher
f951f2f52c r600g: fix up CP DMA for VM on cayman and TN
Need to add the virtual address.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-01-25 21:57:42 -05:00
Brian Paul
c1d35aece0 svga: use pipe_sampler_view_release() in svga_cleanup_tss_binding()
Fixes a crash when the Redway3D Turbine demo exits.  We've made this
change in other places in the past.  The root issue is texture objects
are being shared by multiple contexts and sampler views get shared too.
Sampler views have a context pointer and if that context gets deleted
we may try to reference that context when finally deleting the sampler
view.

pipe_sampler_view_release() avoids this problem because it takes
an explicit context.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-01-25 15:57:35 -07:00
Brian Paul
50c4c818aa st/mesa: handle new GLSL IR enumerants in switch statements
To silence warnings about unhandled cases.
2013-01-25 15:46:14 -07:00
Brian Paul
9227c53741 svga: add NULL pointer check in svga_create_sampler_state()
Note: This is a candidate for the 9.0 branch.
2013-01-25 15:41:41 -07:00
Brian Paul
7a89f08a22 vbo: add a null pointer check to handle OOM instead of crashing
Note: This is a candidate for the 9.0 branch.
2013-01-25 15:41:41 -07:00
Brian Paul
b13c534f14 util: add new error checking code in vbuf helper
Check the return value of calls to u_upload_alloc() and
u_upload_data() and return early if needed.

Since we don't have a way to propagate errors all the way up to
Mesa through pipe_context::draw_vbo(), call debug_warn_once() so
the user might have some clue about OOM errors.

Note: This is a candidate for the 9.0 branch.
2013-01-25 15:41:40 -07:00
Brian Paul
8c3f9ea073 st/mesa: do proper error checking for u_upload_alloc() calls
We weren't properly checking the return value of these calls (and
calls to u_upload_data()) to detect OOM errors.

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-01-25 15:41:40 -07:00
Brian Paul
68a097596e util: add some defensive coding in u_upload_alloc()
Some callers of this function were checking the 'ptr' result to see if
the function failed.  But the correct way is to check the regular
return value for PIPE_ERROR_x.  Now we initialize all the returned
values at the top of the function in case we do hit an error (like OOM).

Callers are more likely to detect OOM conditions now.  But there
are some callers which don't do any error checking...
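
The pattern described above can be sketched in C as follows. This is only an
illustration under stated assumptions: sketch_alloc, sketch_upload_zeros, the
simulate_oom parameter and the status codes are hypothetical and do not
reproduce the real u_upload_alloc() signature or the PIPE_ERROR_x values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes; these are not the real PIPE_ERROR_x values. */
enum sketch_status {
    SKETCH_OK = 0,
    SKETCH_ERROR_OUT_OF_MEMORY = -1
};

/* Hypothetical allocator: reports failure through its return value. */
static enum sketch_status
sketch_alloc(size_t size, int simulate_oom, size_t *out_offset, void **out_ptr)
{
    assert(out_offset != NULL && out_ptr != NULL);

    /* Initialize every output first, so callers see defined values
     * even when an error path is taken below. */
    *out_offset = ~(size_t)0;
    *out_ptr = NULL;

    if (simulate_oom)
        return SKETCH_ERROR_OUT_OF_MEMORY;

    void *mem = malloc(size);
    if (mem == NULL)
        return SKETCH_ERROR_OUT_OF_MEMORY;

    *out_offset = 0;
    *out_ptr = mem;
    return SKETCH_OK;
}

/* Caller: checks the status code, not the returned pointer. */
static int
sketch_upload_zeros(size_t size, int simulate_oom)
{
    size_t offset;
    void *ptr;

    if (sketch_alloc(size, simulate_oom, &offset, &ptr) != SKETCH_OK)
        return 0;                  /* return early on OOM */

    memset(ptr, 0, size);
    free(ptr);
    return 1;
}
```

Initializing the outputs up front means that even a caller which wrongly
tests only the pointer still sees NULL on failure.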

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-01-25 15:41:40 -07:00
Brian Paul
d6f8b7ef38 glsl: use glsl_strtof() instead of glsl_strtod()
Since the result of those calls is always assigned to a float.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-25 15:41:40 -07:00
Brian Paul
811b5b4b39 glsl: add new glsl_strtof() function
Note, we could alternately implement this in terms of glsl_strtod()
with a (float) cast.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-25 15:41:39 -07:00
Brian Paul
6102b9d441 softpipe: add casts to silence MSVC warnings
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-25 15:41:39 -07:00
Brian Paul
257783b939 util: silence MSVC signed/unsigned comparison warnings
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-25 15:41:39 -07:00
Brian Paul
539541f2e2 util: silence MSVC double->float conversion warnings
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-25 15:41:39 -07:00
Brian Paul
869071dfb7 util: silence MSVC signed/unsigned warnings in debug_get_flags_option()
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-25 15:41:39 -07:00
Brian Paul
1a15772b7c st/mesa: silence assorted MSVC warnings in DrawPixels code
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-25 15:41:38 -07:00
Brian Paul
eee762258e swrast: silence a bunch of MSVC warnings
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-25 15:41:38 -07:00
Brian Paul
ccbb479f40 mesa: use GLbitfield64 when copying program inputs
Note: This is a candidate for the 9.0 branch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-25 15:41:38 -07:00
Brian Paul
701a0f6a76 mesa: add some casts to silence MSVC warnings
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-25 15:41:38 -07:00
Brian Paul
ddb774ddf1 mesa: add casts in _mesa_GetTexParameterfv() to silence warnings
There are other similar int->float casts elsewhere in the function.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-25 15:41:38 -07:00
Matt Turner
9aadc3a6cc i965: Enable ARB_shading_language_packing
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-01-25 14:10:24 -08:00
Matt Turner
64dbc51b49 i965: Assert that the 4x8 pack/unpack operations have been lowered
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-01-25 14:10:23 -08:00
Matt Turner
96220111dd i965: Lower the 4x8 pack/unpack operations
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-01-25 14:10:23 -08:00
Matt Turner
321555fb41 glsl: Add support for lowering 4x8 pack/unpack operations
Lower them to arithmetic and bit manipulation expressions.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-01-25 14:10:23 -08:00
Matt Turner
1ef674f215 glsl: Evaluate constant pack/unpack 4x8 expressions
That is, evaluate constant expressions for the following functions:
  packSnorm4x8, unpackSnorm4x8
  packUnorm4x8, unpackUnorm4x8
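
As a rough illustration of what evaluating these at compile time involves,
here is a C sketch of the Unorm pair, following the GLSL definition (clamp to
[0, 1], scale by 255, round, first component in the least significant byte).
The function names are hypothetical and this is not the code added to the
compiler; ties are rounded up here, which is one permitted choice.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp to [0, 1]; NaN is mapped to 0 so the later cast stays defined. */
static float sketch_saturate(float x)
{
    if (!(x > 0.0f))
        return 0.0f;
    return x > 1.0f ? 1.0f : x;
}

/* Hypothetical stand-in for packUnorm4x8: v[0] lands in bits 0..7. */
static uint32_t sketch_pack_unorm4x8(const float v[4])
{
    assert(v != 0);
    uint32_t result = 0;
    for (int i = 0; i < 4; i++) {
        /* Adding 0.5 before truncating rounds a non-negative value. */
        uint32_t byte = (uint32_t)(sketch_saturate(v[i]) * 255.0f + 0.5f);
        result |= byte << (8 * i);
    }
    return result;
}

/* Hypothetical stand-in for unpackUnorm4x8. */
static void sketch_unpack_unorm4x8(uint32_t packed, float out[4])
{
    assert(out != 0);
    for (int i = 0; i < 4; i++)
        out[i] = (float)((packed >> (8 * i)) & 0xffu) / 255.0f;
}
```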

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-01-25 14:10:23 -08:00
Matt Turner
b64b174b0a glsl: Extend ir_expression_operation for ARB_shading_language_packing
For each function {pack,unpack}{Snorm,Unorm}4x8, add a corresponding
opcode to enum ir_expression_operation. Validate the new opcodes in
ir_validate.cpp.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-01-25 14:10:23 -08:00
Matt Turner
b0239ce960 glsl: Add IR lisp for ARB_shading_language_packing
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-01-25 14:10:23 -08:00
Matt Turner
12aa2fec5b glsl: Add infrastructure for ARB_shading_language_packing
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-01-25 14:10:23 -08:00
Tom Stellard
7a850c5851 configure.ac: Don't set LLVM_LIBS when llvm is disabled 2013-01-25 22:05:00 +00:00
Tom Stellard
264e6dad28 r600g: Don't build llvm_wrapper.cpp when we aren't using LLVM
We were using the NEED_RADEON_GALLIUM conditional to decide whether or not
to build llvm_wrapper.cpp, which is required for using the LLVM backend.
llvm_wrapper.cpp needs to be linked against the LLVM IPO library
and this library is only added to LLVM_LIBS if either opencl or the
r600-llvm-compiler is enabled.

The NEED_RADEON_GALLIUM conditional is set to true when enabling the
radeonsi driver, so if the radeonsi and r600 drivers are enabled without
also enabling opencl or r600-llvm-compiler, llvm_wrapper.cpp will be
built, but the IPO library won't be added to LLVM_LIBS.  This was
causing unresolved symbol errors when building with this configuration.

https://bugs.freedesktop.org/show_bug.cgi?id=59831

Tested-by: Alex Deucher <alexander.deucher@amd.com>
2013-01-25 22:05:00 +00:00
Eric Anholt
1a316af034 i965: Pass in the glarray to get_surface_type.
Dereffing all the values in the two callers was just pointless, and
the function isn't inlined so there was actual code impact.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:41:04 -08:00
Eric Anholt
80aeda2784 i965: Remove nonsense comment.
vb.inputs_read has never been a thing, even in the initial import.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:40:59 -08:00
Eric Anholt
23e5503348 i965: Remove NDEBUG undef that was snuck in.
If you want debug, set --enable-debug in your config flags.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:40:54 -08:00
Eric Anholt
8fe43b6dc9 i965: reuse _mesa_sizeof_type for index buffer types.
The core Mesa code has just one more case than this (GL_BITMAP), so I
don't see any cause to special-case it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:40:49 -08:00
Eric Anholt
b859a12f21 i965: Reuse precalculated ib_type_size value.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:40:44 -08:00
Eric Anholt
9aa02a205d i965: Drop debug check for knowing the size of a type.
This was added in b93684f5f3, but there's
no need for it -- get_size has to succeed, and it has an assert for us
in debug builds.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:40:39 -08:00
Eric Anholt
5ae3c20791 i965: Stop worrying about alignment of vertex data.
For our current types, the required alignment is actually just 1 byte.
When we get doubles, we have to worry (those have to be aligned to the
natural size), but we don't have doubles yet and they'll just be a
special case.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:40:33 -08:00
Eric Anholt
2a7a5062c9 i965: Use the glarray _ElementSize that Mesa tracks for us.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:40:22 -08:00
Eric Anholt
f6191e09aa mesa: Print more informative debug for _mesa_do_init_remap_table().
This is the same logic from _mesa_map_function_array().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:36:43 -08:00
Ian Romanick
22233da1ee glsl: Remove ir_variable::uniform_block
v2: A previous patch contained a spurious hunk that removed an
assignment to ir_variable::uniform_block.  That hunk was moved to this
patch.  Suggested by Carl Worth.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:36 -05:00
Ian Romanick
f09d77b2af glsl: Allow dereferencing fields of an interface instance
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-25 09:07:36 -05:00
Ian Romanick
32f3229255 glsl: Allow elimination of uniform block members
glGetActiveUniform is not supposed to report block members that are not
active even if they are included in the layout of the block.  The block
layout is determined from the GLSL_TYPE_INTERFACE that defines the
block, so eliminating the ir_variables that correspond to the individual
fields is safe.

Fixes gles3conform test
uniform_buffer_object_getuniformindices_for_for_nonexistent_or_not_active_uniform_names.

This also fixes the assertion failures (added in the previous commit) in
gles3conform uniform_buffer_object_index_of_not_active_block,
uniform_buffer_object_inherit_and_override_layouts, and
uniform_buffer_object_repeat_global_scope_layouts.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-25 09:07:36 -05:00
Ian Romanick
514f8c7ec7 glsl: Calculate UBO data at link-time
Use the function added in the previous commit.

This temporarily causes gles3conform
uniform_buffer_object_index_of_not_active_block,
uniform_buffer_object_inherit_and_override_layouts, and
uniform_buffer_object_repeat_global_scope_layouts to assertion fail.
This is fixed in the next commit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-25 09:07:36 -05:00
Ian Romanick
0ab7399822 glsl: Add link_uniform_blocks to calculate all UBO data at link-time
Calculate all of the block member offsets, the IndexNames, and
everything else to do with every UBO.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-25 09:07:35 -05:00
Ian Romanick
681df909e3 glsl: Add a visitor to determine whether a uniform block is ever used
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-25 09:07:35 -05:00
Ian Romanick
d1b4960f9b glsl: Lower UBO references using link-time data instead of compile-time data
Pretty much all of the compile-time, per-compilation unit block data is
about to get the axe.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-25 09:07:35 -05:00
Ian Romanick
90b1dd03e5 glsl: Add gl_uniform_buffer_variable::IndexName field
glGetUniformIndices requires that the block instance index not be
present in the name of queried uniforms.  However,
gl_uniform_buffer_variable::Name will include the instance index.  The
IndexName field is added to handle this difference.

Note that currently IndexName will always point to the same string as
Name.  This will change soon.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
2013-01-25 09:07:35 -05:00
Ian Romanick
11d42de681 glsl: Make the align function available elsewhere in the linker
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
2013-01-25 09:07:35 -05:00
Ian Romanick
e2c95cd674 glsl: Calculate link-time uniform block data without using compile-time block data
Pretty much all of the compile-time, per-compilation unit block data is
about to get the axe.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-25 09:07:35 -05:00
Ian Romanick
bd963e12ef glsl: Assert that interfaces, like structures, are not seen as leaf types
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:35 -05:00
Ian Romanick
99b8935ce2 glsl: Add new uniform_field_visitor::process variant
This flavor takes a type and a base name.  It will be used to handle
cases where the block name (instead of the instance name) is used for an
interface block.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:35 -05:00
Ian Romanick
007de494d2 glsl: Recurse into uniform blocks just like uniform structures
v2: In spite of the spell checker, spell recurse correctly.  Suggested by
Carl Worth.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:35 -05:00
Ian Romanick
25e75b0a13 glsl: Handle instance array declarations
v2: Add a comment and an assertion about the array size in the
non-instance name case.  Suggested by Paul Berry.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:34 -05:00
Ian Romanick
5383661092 glsl: Track blocks in the symbol table using the glsl_type instead of the gl_uniform_block
Eventually the gl_uniform_block information won't be calculated until
linking.  Block names need to be checked for name clashes during
compiling, so we have to track it differently.

v2: Update the commit message.  Suggested by Carl Worth.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:34 -05:00
Ian Romanick
9a204bb9f6 glsl: Add new uniform_field_visitor::visit_field variant
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:34 -05:00
Ian Romanick
141e9d42f5 glsl: Modify uniform_field_visitor::visit_field to take a row_major parameter
Not used yet, but the UBO layout visitor will use this.

v2: Remove a spurious hunk.  This is moved to the patch "glsl: Remove
ir_variable::uniform_block".  Suggested by Carl Worth.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:34 -05:00
Ian Romanick
6a0c1bc163 glsl: Modify uniform_field_visitor::recursion to take a row_major parameter
Not used yet, but the UBO layout visitor will use this.

v2: Add some commentary as to why row_major is always set to false in
process.  Suggested by Paul Berry.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:34 -05:00
Ian Romanick
23b7ce3a82 glsl: Add a predicate to determine whether a variable is an interface block
For the first declaration below, there will be an ir_variable named
"instance" whose type and whose instance_type will be the same
glsl_type.  For the second declaration, there will be an ir_variable
named "f" whose type is float and whose instance_type is B2.

"instance" is an interface instance variable, but "f" is not.

uniform B1 {
    float f;
} instance;

uniform B2 {
    float f;
};

v2: Copy the commit message documentation into the code.  Suggested by
Paul Berry.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:34 -05:00
Ian Romanick
3b09603dda glsl: Require that indices into uniform block arrays be constants
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:34 -05:00
Ian Romanick
7a7b44b329 glsl: Add ir_variable::interface_type field
For variables that are in an interface block or are an instance of an
interface block, this is the GLSL_TYPE_INTERFACE type for that block.

Convert the ir_variable::is_in_uniform_block method added in the
previous commit to use this field instead of ir_variable::uniform_block.

v2: Fix the place-holder comment on ir_variable::interface_type.
Suggested by Paul Berry.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:34 -05:00
Ian Romanick
13be1f4a10 glsl: Add ir_variable::is_in_uniform_block predicate
The way a variable is tested for this property is about to change, and
this makes the code easier to modify.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:34 -05:00
Ian Romanick
17e6f19044 glsl: Generate an interface type for uniform blocks
If the block has an instance name, add the instance name to the symbol
table instead of the individual fields.

Fixes the piglit test interface-name-access-without-interface-name.vert
for real.

v2: Update the comment before the assertion that interface block
definitions won't generate instructions.  Suggested by Paul Berry.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-01-25 09:07:33 -05:00
Ian Romanick
491364e1f3 glsl: Add GLSL_TYPE_INTERFACE
Interfaces are structurally identical to structures from the compiler's
point of view.  They have some additional restrictions, and generally
GPUs use different instructions to access them.  Using a different base
type should make this a bit easier.

This commit also adds the glsl_type::interface_packing fields.  For
GLSL_TYPE_INTERFACE types, this will track the specified packing mode.
It is analogous to gl_uniform_buffer::_Packing.

v2: Add several missing GLSL_TYPE_INTERFACE cases in switch-statements.

v3: Add information about glsl_type::interface_packing.  Move row_major
checking in glsl_type::record_key_compare from this patch to the
previous patch.  Both suggested by Paul Berry.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:33 -05:00
Ian Romanick
7f96a8471e glsl: Add row_major field to glsl_struct_field
For now, this will always be false.  In the near future, an "interface"
type will be added that shares a lot of infrastructure with structures.

v2: Move row_major checking in glsl_type::record_key_compare from the
next patch to this patch.  Suggested by Paul Berry.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:33 -05:00
Ian Romanick
51f740cd5a glsl: Refactor out processing of structure fields
This will soon also be used for processing interface block fields.

v2: Add a comment explaining the interface of
ast_process_structure_or_interface_block.  Suggested by Paul Berry.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:33 -05:00
Ian Romanick
a39a70c8d4 glsl: Parse interface array size
The size is parsed and stored in the AST, but it is not used yet.
Processing of the array size is added in the patch "glsl: Handle
instance array declarations"

v2: Update the commit message (suggested by Carl Worth).  Add a comment
to ast_uniform_block::array_size (suggested by Paul Berry).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:33 -05:00
Kenneth Graunke
34f966bdcb glsl: Parse non-array uniform block instance names in GLSL ES 3.00.
In GLSL ES 3.00 (and GLSL 1.50), uniform blocks can have an associated
"instance name", which essentially namespaces the variables inside.

This patch adds basic parsing for this new feature, but doesn't hook it
up to actually do anything yet.

It does not support arrays of interface blocks; a later commit will
take care of that.

This change temporarily regresses the piglit test
interface-name-access-without-interface-name.vert.  This shader failed
to compile before (the expected result), but it failed to compile for
the wrong reason.  This is not a real regression.

v2: Add some comments to ast_uniform_block::instance_name.  Suggested by
Paul Berry.

Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:33 -05:00
Kenneth Graunke
0d2e6336a2 glsl: Refactor uniform block parser rules.
The existing code has a lot of duplication; the only difference between
the two cases is whether we merge in an additional layout qualifier.

Apparently creating a layout_qualifieropt rule that can be empty causes
a lot of conflicts and confusion.  However, refactoring out the guts of
the ast_uniform_block creation works fine.

Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:33 -05:00
Ian Romanick
b226a058db linker: Refactor intra-stage block compatibility testing
Also slightly change the compatibility test.  Instead of comparing the
offsets of the block variables, compare the packing mode of the blocks.
Ideally we don't want to assign the offsets until a later stage of
linking.

This is put in a new file called link_uniform_blocks.cpp.  Some new
functions related to uniform blocks are going to live in that file as
well.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:33 -05:00
Ian Romanick
9a971ab695 mesa: Track the packing mode of a UBO in gl_uniform_buffer
This allows the next patch to verify that two uniform blocks match
without first calculating the locations of the fields.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:33 -05:00
Ian Romanick
ecfb404e8d glsl: Replace most default cases in switches on GLSL type
This makes it easier to find switch-statements that need to be updated
after a new GLSL_TYPE_* is added because the compiler will generate a
warning.
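
A small C sketch of the idea, using a hypothetical enum rather than the real
GLSL_TYPE_* list. With GCC or Clang and -Wall (which enables -Wswitch), a
switch over an enum that has no default label is reported when an enumerator
is not handled, so adding a new value points directly at the code to update.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the GLSL base type enum. */
enum sketch_base_type {
    SKETCH_TYPE_FLOAT,
    SKETCH_TYPE_INT,
    SKETCH_TYPE_STRUCT,
    SKETCH_TYPE_INTERFACE
};

static const char *sketch_type_name(enum sketch_base_type type)
{
    /* No default label: if an enumerator is added above and not listed
     * here, the compiler can warn about the missing case. */
    switch (type) {
    case SKETCH_TYPE_FLOAT:     return "float";
    case SKETCH_TYPE_INT:       return "int";
    case SKETCH_TYPE_STRUCT:    return "struct";
    case SKETCH_TYPE_INTERFACE: return "interface";
    }

    /* Only reachable if an out-of-range value was cast to the enum. */
    assert(0 && "unhandled sketch_base_type");
    return NULL;
}
```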

Switch-statements that only had a small number of cases (e.g.,
everything in ir_constant_expression.cpp) were not modified.  I may
regret that decision when we eventually add support for doubles.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-25 09:07:32 -05:00
Eric Anholt
416326e337 i965: Correct gen6+ guardband calculation.
Too much attention was paid to the first paragraphs, and not enough to
the last little note that "oh, by the way, the rendered things
themselves still have to be clipped to just 8192 wide/high".

Fixes GTF's clip.c test with 4096 or higher width on ivb, where one of
the triangles got the upper half of its pixels dropped.

Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-25 09:04:52 -05:00
Kenneth Graunke
9db2098d18 i965: Use GL_RED for DEPTH_TEXTURE_MODE in ES 3.0 for unsized formats.
Khronos has apparently decided that depth textures with sized formats
(allowed with ARB_internalformat_query or ES 3.0) should be treated as
GL_RED, while unsized formats (an existing feature) should be treated
as GL_INTENSITY for compatibility with ES 2.0.

Ian is proposing changes to ARB_internalformat_query which will make
this actually legal and consistent.

A similar problem exists with GL 4.2, but we're going to ignore that
for the time being.

Tested on Ivybridge: no Piglit regressions; fixes 4 es3conform tests:
- depth_texture_fbo
- depth_texture_fbo_clear
- depth_texture_teximage
- depth_texture_texsubimage

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-25 09:04:25 -05:00
Chad Versace
7638ede4ce i965: Bump maximum supported ES2 context version to 3.0
Since patch "i965: Validate requested GLES context version in
brwCreateContext", we have been able to create ES 3.0 contexts due to the
max version check.  So...bump the max version.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-25 08:30:40 -05:00
Paul Berry
e4f661afc8 i965/Gen6+: Enable ARB_ES3_compatibility extension
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-25 08:30:40 -05:00
Ian Romanick
1d0e8c109c mesa/es3: Enable ES 3.0 API and shading language version
v2: Add ARB_internalformat_query to the list of required extensions.

v3: Add OES_depth_texture_cube_map to the list of required extensions.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-25 08:30:40 -05:00
Vinson Lee
07e215f4ec scons: Add imports.c to builtin_compiler build.
Fixes build regression introduced by commit
eac030e38e.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59835
2013-01-24 22:36:27 -08:00
Chad Versace
0974031f88 i965/fs/gen7: Fix fatal typo in unpackHalf2x16
s/src/src_w/

That little typo, which sneaked into v4 of the previous patch, generates
incorrect fs code.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:51:06 -08:00
Chad Versace
20dfa501b3 i965/fs/gen7: Emit code for GLSL 3.00 pack/unpack operations (v4)
v2: Remove lewd comment. [for idr]
v3: - Optimize away tmp register for packHalf2x16. [for anholt, paul]
    - Improve comments. [for anholt, paul]
    - Reduce near-duplicate code by removing vec4_visitor emit_pack/unpack
      methods. [for chadv]
v4: Factor out UD/W register conversion into helper function. [for anholt]

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v2)
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:31:06 -08:00
Chad Versace
203c12b18f i965/vs/gen7: Emit code for GLSL ES 3.00 pack/unpack operations (v3)
FIXME: This patch emits VS code that violates documented hardware
restrictions and then relies on undocumented behavior that results from
that violation.  This patch passes all tests, but should be fixed ASAP to
conform to the hardware documentation.

v2: Explain undocumented hardware behavior. Improve comments.
v3: Use ALU1 helper methods F32TO16() and F16TO32(). [for anholt]

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:24:11 -08:00
Chad Versace
7093558b31 i965: Quote the PRM on a HorzStride subtlety
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:24:11 -08:00
Chad Versace
7e21910f23 i965: Add opcodes for F32TO16 and F16TO32
The GLSL ES 3.00 operations packHalf2x16 and unpackHalf2x16 will emit
these opcodes.

- Define the opcodes BRW_OPCODE_{F32TO16,F16TO32}.
- Add the opcodes to the brw_disasm table.
- Define convenience functions brw_{F32TO16,F16TO32}.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:24:10 -08:00
Chad Versace
ee0ed52d69 i965: Lower the GLSL ES 3.00 pack/unpack operations (v2)
On gen < 7, we fully lower all operations to arithmetic and bitwise
operations.

On gen >= 7, we fully lower the Snorm2x16 and Unorm2x16 operations, and
partially lower the Half2x16 operations.

v2:
  - Comment that scalarization is needed only for SOA code [for idr].
  - Replace switch-statement with if-statement [for idr].
  - Remove misplaced hunk from previous patch [found by idr].

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:24:10 -08:00
Chad Versace
b9f56ea923 glsl: Add lowering pass for GLSL ES 3.00 pack/unpack operations (v4)
Lower them to arithmetic and bit manipulation expressions.

v2: Rewrite using ir_builder [for idr].
v3: Comment typos. [for mattst88]
v4: Fix arithmetic error in comments.
    Factor out a shift instruction.
    Don't heap allocate factory.instructions.
    [for paul]

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v2)
Reviewed-by: Matt Turner <mattst88@gmail.com> (v3)
Reviewed-by: Paul Berry <stereotype441@gmail.com> (v4)
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:24:10 -08:00
Chad Versace
9d7931ddf0 glsl: Fix type-deduction for and/or/xor expressions
In ir_expression's constructor, the cases for {bit,logic}_{and,or,xor}
failed to handle the case when both operands were vectors.

Note: This is a candidate for the stable branches.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:24:10 -08:00
Chad Versace
ccf87f2199 glsl: Reformat and/or/xor cases in ir_expression ctor
Replace tabs with spaces. According to docs/devinfo.html, Mesa's
indentation style is:
  indent -br -i3 -npcs --no-tabs infile.c -o outfile.c

This patch prevents whitespace weirdness in the next patch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:24:10 -08:00
Chad Versace
f859e4fbd1 glsl/ir_builder: Add helpers for making if-statements
Add two overloaded variants of
    ir_if *if_tree()

The new functions allow one to chain together if-trees within a single C++
expression that resembles a real if-statement.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:24:10 -08:00
Chad Versace
a32bc53029 glsl/ir_builder: Add enum writemask
Using this enum improves the readability of calls to assign(), whose third
argument is a writemask.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:24:10 -08:00
Chad Versace
a6479ef968 glsl/ir_factory: Add helper method for making an ir_constant
Add method ir_factory::constant.  This little method constructs an
ir_constant using the factory's mem_ctx.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:24:10 -08:00
Chad Versace
5790174e37 glsl/ir_builder: Add more helpers for constructing expressions
Add the following functions, each of which construct the similarly named
ir expression:
    div, round_even, clamp

    equal, less, greater, lequal, gequal

    logic_not, logic_and, logic_or

    bit_not, bit_or, bit_and, lshift, rshift

    f2i, i2f, f2u, u2f, i2u, u2i

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:24:10 -08:00
Chad Versace
fafcbf52b7 glsl/ir_factory: Initialize members to NULL in constructor
This eliminates unexpected behavior due to uninitialized values.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:24:10 -08:00
Chad Versace
542c7a3022 glsl: Evaluate constant GLSL ES 3.00 pack/unpack expressions (v3)
That is, evaluate constant expressions of the following functions:
  packSnorm2x16  unpackSnorm2x16
  packUnorm2x16  unpackUnorm2x16
  packHalf2x16   unpackHalf2x16

v2: Reuse _mesa_pack_float_to_half and its inverse to evaluate
    pack/unpackHalf2x16. [for idr]
v3: Whitespace fixes. [for mattst88]
    Don't cast neg floats directly to uint16; use an intermediate cast to
    int16. [for paul]
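
The v3 note about negative floats can be sketched in C, where converting a negative float directly to an unsigned type is undefined behavior. The helper names below are hypothetical, and the rounding mode (half away from zero) is an assumption of this sketch rather than what the patch necessarily does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert one float to a 16-bit snorm bit pattern. */
static uint16_t float_to_snorm16(float x)
{
   assert(x == x);   /* NaN is outside the scope of this sketch */

   float c = x < -1.0f ? -1.0f : (x > 1.0f ? 1.0f : x);
   float scaled = c * 32767.0f;
   /* Round half away from zero, then truncate. */
   float rounded = scaled < 0.0f ? scaled - 0.5f : scaled + 0.5f;

   /* Go through int16_t first: the value fits, so this conversion is
    * well defined.  int16_t -> uint16_t is then a modular conversion
    * that preserves the two's-complement bit pattern. */
   int16_t s = (int16_t) rounded;
   return (uint16_t) s;
}

/* packSnorm2x16: x in the low 16 bits, y in the high 16 bits. */
static uint32_t pack_snorm_2x16(float x, float y)
{
   return ((uint32_t) float_to_snorm16(y) << 16) | float_to_snorm16(x);
}
```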

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v2)
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:24:10 -08:00
Chad Versace
529b6d1f3d mesa: Remove rounding bias in _mesa_float_to_half()
Not all float32 values can be exactly represented as a float16.
_mesa_float_to_half() rounded such intermediate float32 values to zero by
truncating unrepresentable bits in the mantissa.

This patch improves _mesa_float_to_half() by rounding intermediate float32
values to the nearest float16; when the float32 is exactly between two
float16 values we round to the one with an even mantissa. This behavior is
preferred over the old behavior because:
  - It has reduced bias relative to the old behavior.

  - It reproduces the behavior of real hardware: opcode F32TO16 in
    Intel's GPU ISA.

  - By reproducing the behavior of the GPU (at least on Intel hardware),
    compile-time evaluation of constant packHalf2x16 GLSL expressions will
    result in the same value as if the expression were executed on the GPU.
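
A minimal sketch of round-to-nearest-even float32 to float16 conversion, to make the difference from truncation concrete. This is not the body of _mesa_float_to_half(); the function name and structure are assumptions, and NaN payloads are collapsed to a single quiet NaN.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sketch: float32 -> float16 bit pattern, rounding to nearest, ties to even. */
static uint16_t float_to_half_rne(float f)
{
   uint32_t bits;
   assert(sizeof bits == sizeof f);
   memcpy(&bits, &f, sizeof bits);

   uint32_t sign = (bits >> 16) & 0x8000u;
   int32_t  exp  = (int32_t) ((bits >> 23) & 0xffu);
   uint32_t man  = bits & 0x007fffffu;

   if (exp == 0xff)                          /* Inf or NaN */
      return (uint16_t) (sign | 0x7c00u | (man ? 0x0200u : 0u));

   int32_t e = exp - 127 + 15;               /* rebias the exponent */
   if (e >= 0x1f)                            /* too large: Inf */
      return (uint16_t) (sign | 0x7c00u);

   uint32_t shift = 13;                      /* 23 - 10 mantissa bits */
   if (e <= 0) {                             /* float16 subnormal range */
      if (e < -10)
         return (uint16_t) sign;             /* below half the smallest subnormal */
      man |= 0x00800000u;                    /* make the implicit 1 explicit */
      shift = (uint32_t) (14 - e);
      e = 0;
   }

   uint32_t half_man = man >> shift;
   uint32_t rem      = man & ((1u << shift) - 1u);   /* discarded bits */
   uint32_t halfway  = 1u << (shift - 1);

   uint32_t h = sign | ((uint32_t) e << 10) | half_man;
   /* Round up above the midpoint, or at the midpoint when the kept
    * mantissa is odd.  A carry into the exponent field is correct. */
   if (rem > halfway || (rem == halfway && (half_man & 1u)))
      h++;
   return (uint16_t) h;
}
```

With truncation, 1 + 3/2048 would become 0x3c01; with ties-to-even it becomes 0x3c02, while 1 + 1/2048 stays at 0x3c00.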

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:24:10 -08:00
Chad Versace
eac030e38e mesa,glsl: Move round_to_even() from glsl to mesa/main (v2)
Move round_to_even's definition to mesa/main so that _mesa_float_to_half()
can use it in order to eliminate rounding bias.

In addition to moving the function definition, prefix its name with "_mesa",
just as all other functions in mesa/main are prefixed.

v2: Fix Android build.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:24:07 -08:00
Chad Versace
1fafd00839 glsl/standalone_scaffolding: Add stub for _mesa_warning()
A subsequent patch will add mesa/main/imports.c as a dependency to the
compiler, which in turn requires that _mesa_warning() be defined.

The real definition of _mesa_warning() is in mesa/main/errors.c, but to
pull that file into the standalone scaffolding would require transitively
pulling in the dispatch tables.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:11:41 -08:00
Chad Versace
ee5921ad0d glsl: Extend ir_expression_operation for GLSL 3.00 pack/unpack functions (v2)
For each function {pack,unpack}{Snorm,Unorm,Half}2x16, add a corresponding
opcode to enum ir_expression_operation.  Validate the new opcodes in
ir_validate.cpp.

Also, add opcodes for scalarized variants of the Half2x16 functions.  (The
code generator for the i965 fragment shader requires that all vector
operations be scalarized.  A lowering pass, to be added later, will
scalarize the Half2x16 functions).

v2: Fix assertion message in ir_to_mesa [for idr].

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:11:41 -08:00
Chad Versace
3a88d71d35 glsl: Add IR lisp for GLSL ES 3.00 pack/unpack functions
For each of the following functions, add a declaration to
builtins/profiles/300es.glsl and create new file
builtins/ir/${funcname}.ir:

  packSnorm2x16  unpackSnorm2x16
  packUnorm2x16  unpackUnorm2x16
  packHalf2x16   unpackHalf2x16

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:11:41 -08:00
Chad Versace
6f8f919a53 glsl: Fix typo in comment
s/num_operands()/get_num_operands()/

Discovered because Eclipse failed to resolve the false reference.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:11:41 -08:00
Chad Versace
ca7d332253 i965/disasm: Fix horizontal stride of dest registers
The bug: The printed horizontal stride was the numerical value of the
  BRW_HORIZONTAL_$N enum.
The fix: Translate the enum before printing.
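
The kind of translation meant here can be sketched as follows. The enum names are hypothetical, and the encoding (0, 1, 2, 3 standing for strides 0, 1, 2, 4) is an assumption made for illustration, not taken from the patch.

```c
#include <assert.h>

/* Assumed encoding for illustration: enum value -> actual stride. */
enum hstride_enc {
   HSTRIDE_ENC_0 = 0,   /* stride 0 */
   HSTRIDE_ENC_1 = 1,   /* stride 1 */
   HSTRIDE_ENC_2 = 2,   /* stride 2 */
   HSTRIDE_ENC_4 = 3    /* stride 4 */
};

/* Translate the encoded field into the stride that should be printed. */
static unsigned decode_horiz_stride(unsigned enc)
{
   assert(enc <= HSTRIDE_ENC_4);
   return enc ? 1u << (enc - 1) : 0u;
}
```

Under that assumption, printing the raw field would show <3> for a stride of 4; printing the decoded value shows <4>.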

Note: This is a candidate for the stable releases.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-24 21:10:46 -08:00
Paul Berry
d1f2e9699f intel: Fix glCopyTexSubImage on buffers whose width >= 32kbytes
When possible, glCopyTexSubImage calls are performed using the
hardware blitter.  However, according to the Ivy Bridge PRM, Vol1
Part4, section 1.2.1.2 (Graphics Data Size Limitations):

    The BLT engine is capable of transferring very large quantities of
    graphics data. Any graphics data read from and written to the
    destination is permitted to represent a number of pixels that
    occupies up to 65,536 scan lines and up to 32,768 bytes per scan
    line at the destination. The maximum number of pixels that may be
    represented per scan line’s worth of graphics data depends on the
    color depth.

With an RGBA32F color buffer (which has 16 bytes per pixel) this
imposes a maximum width of 2048 pixels.  Other pixel formats have
accordingly larger limits.

To make matters worse, if the pitch of the buffer is 32k or greater,
intel_copy_texsubimage's call to intelEmitCopyBlit will overflow
intelEmitCopyBlit's src_pitch and dst_pitch parameters (which are
16-bit signed integers).

We can conveniently avoid both problems by avoiding use of the blitter
when the miptree's pitch is >= 32k.
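
The arithmetic behind the limit, as a small C sketch. The names are hypothetical and this is not the check added to intel_copy_texsubimage; it only shows that a pitch below 32768 bytes satisfies both the PRM limit and the 16-bit signed pitch parameters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* From the PRM text quoted above: at most 32,768 bytes per scan line. */
#define BLT_MAX_BYTES_PER_SCANLINE 32768u

/* A pitch below 32768 also fits in an int16_t (maximum 32767). */
static bool can_use_blitter(uint32_t pitch_bytes)
{
   return pitch_bytes < BLT_MAX_BYTES_PER_SCANLINE;
}

/* Widest scan line, in pixels, that the limit allows for a pixel size. */
static uint32_t max_blit_width_pixels(uint32_t bytes_per_pixel)
{
   assert(bytes_per_pixel > 0);
   return BLT_MAX_BYTES_PER_SCANLINE / bytes_per_pixel;
}
```

For RGBA32F, max_blit_width_pixels(16) gives 2048, and an 8192-pixel-wide RGBA32F buffer has a pitch of at least 131072 bytes, so can_use_blitter() rejects it.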

Fixes gles3conform "framebuffer_blit_functionality_magnifying_blit"
tests when the buffer width is equal to 8192.

Note: this is very similar to the recent patch "intel: Fix ReadPixels
on buffers whose width >= 32kbytes" except that it applies to
glCopyTexSubImage instead of glReadPixels.  In a future patch it would
be nice to refactor the code so that (a) overflow is avoided, and (b)
intelEmitCopyBlit is responsible for checking whether the blitter can
handle the width, so that all callers of intelEmitCopyBlit work
properly, rather than just these two.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-24 18:35:08 -08:00
Paul Berry
c6a50ddfcb glsl: Allow varying structs in GLSL ES 3.00 and GLSL 1.50.
Previously I thought that varying structs had been added to GLSL ES
3.00 by mistake, because chapter 11 of the GLSL ES 3.00 spec
("Counting of Inputs and Outputs") failed to mention how structs
should be handled.  Khronos has clarified
(https://cvs.khronos.org/bugzilla/show_bug.cgi?id=9828) that varying
structs are indeed required, and that chapter 11 will be modified to
indicate that the minimal reference packing algorithm flattens varying
structs to their individual components.

Mesa doesn't flatten varying structs to their individual components,
but this is ok, since it packs varyings of all kinds with no wasted
space at all (except where this is impossible due to differing
interpolation modes), so it will outperform the minimal reference
packing algorithm in all but the most pathological cases.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-24 16:30:49 -08:00
Paul Berry
cd53457ffa glsl: Disable transform feedback of varying structs.
It is not clear from the GLSL ES 3.00 spec how transform feedback is
supposed to apply to varying structs:

- There is no specification for how the structure is to be packed when
  it is recorded into the transform feedback buffer.

- There is no reasonable value for GetTransformFeedbackVarying to
  return as the "type" of the variable.

We currently have a Khronos bug requesting clarification on how this
feature is supposed to work
(https://cvs.khronos.org/bugzilla/show_bug.cgi?id=9856).

This patch just disables transform feedback of varying structs for
now; we can implement the proper behaviour once we find out from
Khronos what it is.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-24 16:30:46 -08:00
Paul Berry
1ecd23dea9 glsl: Update lower_packed_varyings to handle varying structs.
This patch adds code to lower_packed_varyings to handle varyings of
type struct.  Varying structs are currently packed in the most naive
possible way (in declaration order, with no gaps), so there is a
potential loss of runtime efficiency.  In a later patch it would be
nice to replace this with a "flattening" approach (wherein a varying
struct is flattened to individual varyings corresponding to each of
its structure elements), so that the linker can align each structure
element independently.  However, that would require a significantly
more complex implementation.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-24 16:30:43 -08:00
Paul Berry
88e4bfde26 glsl: Generalize compute_packing_order for varying structs.
This patch paves the way for allowing varying structs by generalizing
varying_matches::compute_packing_order to handle any type of varying.
Previously, we packed in the order (vec4, vec2, float, vec3), with
matrices being packed according to the size of their columns.  Now, we
pack everything according to its number of components mod 4, in the
order (0, 2, 1, 3).

There is no behavioural change for vectors.  Matrices are now packed
slightly differently:

- mat2x2 gets assigned PACKING_ORDER_VEC4 instead of
  PACKING_ORDER_VEC2.  This is slightly better, because it guarantees
  that the matrix occupies a single varying slot.

- mat2x3 gets assigned PACKING_ORDER_VEC2 instead of
  PACKING_ORDER_VEC3.  This is kind of a wash.  Previously, mat2x3 had
  a 25% chance of having neither of its columns double parked, a 50%
  chance of having exactly one of its columns double parked, and a 25%
  chance of having both of its columns double parked.  Now it always
  has exactly one of its columns double parked.

- mat3x3 gets assigned PACKING_ORDER_SCALAR instead of
  PACKING_ORDER_VEC3.  This doesn't affect much, since in both cases
  there is no guarantee of how the matrix will be aligned.

- mat4x2 gets assigned PACKING_ORDER_VEC4 instead of
  PACKING_ORDER_VEC2.  This is slightly better for the same reason as
  in mat2x2.

- mat4x3 gets assigned PACKING_ORDER_VEC4 instead of
  PACKING_ORDER_VEC3.  This is slightly better for the same reason as
  in mat2x2.
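
The new rule can be sketched in C as below. The PACKING_ORDER_* names come from the description above; the enum's numeric ordering and the simplified signature (a bare component count instead of a GLSL type) are assumptions of this sketch.

```c
#include <assert.h>

/* Listed in packing order: components mod 4 == 0, 2, 1, 3. */
enum packing_order {
   PACKING_ORDER_VEC4,
   PACKING_ORDER_VEC2,
   PACKING_ORDER_SCALAR,
   PACKING_ORDER_VEC3
};

/* Simplified stand-in for varying_matches::compute_packing_order. */
static enum packing_order compute_packing_order(unsigned num_components)
{
   assert(num_components > 0);
   switch (num_components % 4) {
   case 0:  return PACKING_ORDER_VEC4;
   case 2:  return PACKING_ORDER_VEC2;
   case 1:  return PACKING_ORDER_SCALAR;
   default: return PACKING_ORDER_VEC3;
   }
}
```

This reproduces the cases listed: mat2x2 (4 components) and mat4x3 (12) map to VEC4, mat2x3 (6) to VEC2, and mat3x3 (9) to SCALAR.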

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-24 16:30:40 -08:00
Paul Berry
3680864c0b glsl: Disable structure splitting for shader ins/outs.
Previously, it didn't matter whether structure splitting tried to
split shader ins/outs, because structs were prohibited from being used
for shader ins/outs.  However, GLSL 3.00 ES supports varying structs.
In order for varying structs to work, we need to make sure that
structure splitting doesn't get applied to them, because if it does,
then the linker won't be able to match up varyings properly.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-24 16:30:37 -08:00
Paul Berry
42a29d89fd glsl: Eliminate ambiguity between function ins/outs and shader ins/outs
This patch replaces the three ir_variable_mode enums:

- ir_var_in
- ir_var_out
- ir_var_inout

with the following five:

- ir_var_shader_in
- ir_var_shader_out
- ir_var_function_in
- ir_var_function_out
- ir_var_function_inout

This eliminates a frustrating ambiguity: it used to be impossible to
tell whether an ir_var_{in,out} variable was a shader in/out or a
function in/out without seeing where the variable was declared in the
IR.  This complicated some optimization and lowering passes, and would
have become a problem for implementing varying structs.

In the lisp-style serialization of GLSL IR to strings performed by
ir_print_visitor.cpp and ir_reader.cpp, I've retained the names "in",
"out", and "inout" for function parameters, to avoid introducing code
churn to the src/glsl/builtins/ir/ directory.
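
A sketch of what the split buys. Only the five enumerator names come from this patch; the enum is truncated to those five, and the two helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Truncated sketch: only the five modes named in this commit message. */
enum ir_variable_mode_sketch {
   ir_var_shader_in,
   ir_var_shader_out,
   ir_var_function_in,
   ir_var_function_out,
   ir_var_function_inout
};

/* With separate enumerators, no IR context is needed to answer this. */
static bool is_shader_inout(enum ir_variable_mode_sketch mode)
{
   return mode == ir_var_shader_in || mode == ir_var_shader_out;
}

/* Serialized name of a function parameter mode, unchanged by the patch. */
static const char *function_param_name(enum ir_variable_mode_sketch mode)
{
   assert(!is_shader_inout(mode));
   switch (mode) {
   case ir_var_function_in:  return "in";
   case ir_var_function_out: return "out";
   default:                  return "inout";
   }
}
```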

Note: a couple of comments in the code seemed to indicate that we were
planning for a possible future in which geometry shaders could have
shader-scope inout variables.  Our GLSL grammar rejects shader-scope
inout variables, and I've been unable to find any evidence in the GLSL
standards documents (or extensions) that this will ever be allowed, so
I've eliminated these comments.

Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-24 16:30:30 -08:00
Paul Berry
7d51ead56e glsl: Clean up case statement in builtin_variables.cpp's add_variable.
The case statement purported to handle the addition of ir_var_const_in
and ir_var_inout builtin variables.  But no such variables exist.
This patch removes the unnecessary cases, and adds a comment
explaining why they're not needed.

Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-24 16:30:27 -08:00
Kenneth Graunke
fce9e5d41b i965/vs: Do headerless texturing for texelFetchOffset().
For texelFetchOffset(), we just add the texel offsets to the coordinate
rather than using the message header's offset fields.  So we don't
actually need a header on Gen5+.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-24 15:19:08 -08:00
Matt Turner
0412864ae8 libgl-xlib/build: Link with C++ when LLVM is used
Also link-in libX11 and libXext.

Tested-by: Brian Paul <brianp@vmware.com>
2013-01-24 14:00:27 -08:00
Paul Berry
b50c0feb2c intel: Fix ReadPixels on buffers whose width >= 32kbytes
When possible, glReadPixels calls are performed using the hardware
blitter.  However, according to the Ivy Bridge PRM, Vol1 Part4,
section 1.2.1.2 (Graphics Data Size Limitations):

    The BLT engine is capable of transferring very large quantities of
    graphics data. Any graphics data read from and written to the
    destination is permitted to represent a number of pixels that
    occupies up to 65,536 scan lines and up to 32,768 bytes per scan
    line at the destination. The maximum number of pixels that may be
    represented per scan line’s worth of graphics data depends on the
    color depth.

With an RGBA32F color buffer (which has 16 bytes per pixel) this
imposes a maximum width of 2048 pixels.

To make matters worse, if the pitch of the buffer is 32k or greater,
intel_miptree_map_blit's call to intelEmitCopyBlit will overflow
intelEmitCopyBlit's src_pitch and dst_pitch parameters (which are
16-bit signed integers).

We can conveniently avoid both problems by avoiding the readpixels
blit path when the miptree's pitch is >= 32k.

Fixes gles3conform "half_float" tests when the buffer width is greater
than 2048.

Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-24 13:17:07 -08:00
Ian Romanick
ac158f8ee7 intel: callocing a 32 byte temp is silly, so don't
I believe that the size used to vary, so the dynamic allocation was
necessary.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-01-24 13:57:46 -05:00
Marek Olšák
7a23029b2f st/mesa: implement ARB_internalformat_query v2
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-24 18:39:28 +01:00
Marek Olšák
041234ee1e st/mesa: advertise OES_depth_texture_cube_map if GLSL 1.30 is supported
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-24 18:38:49 +01:00
Marek Olšák
4f0563a658 st/dri: disallow recursion in dri_flush
ST_FLUSH_FRONT may call driThrottle, which is implemented with dri_flush.
This prevents double flush as well as fence leaks caused by a recursion
in the middle of throttling.
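
One common way to disallow this kind of recursion is a per-context "in progress" flag. The sketch below shows that pattern only; the struct, field, and function names are hypothetical and are not the st/dri code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical context: a re-entrancy flag plus a counter for the demo. */
struct flush_ctx_sketch {
   bool flushing;
   int  flush_count;
};

static void throttle_sketch(struct flush_ctx_sketch *ctx);

static void flush_sketch(struct flush_ctx_sketch *ctx, bool flush_front)
{
   assert(ctx);
   if (ctx->flushing)
      return;                 /* nested call from throttling: do nothing */

   ctx->flushing = true;
   ctx->flush_count++;
   if (flush_front)
      throttle_sketch(ctx);   /* stands in for the path that may throttle */
   ctx->flushing = false;
}

/* Throttling is implemented in terms of flush, hence the recursion risk. */
static void throttle_sketch(struct flush_ctx_sketch *ctx)
{
   flush_sketch(ctx, true);
}
```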

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58839

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-24 18:22:14 +01:00
Marek Olšák
fffe3e0908 st/dri: add null-pointer check, remove duplicated local variable
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-24 18:22:14 +01:00
Tom Stellard
0261b4ecdb Revert "Revert "targets/opencl: Link against libgallium.la instead of libgallium.a""
This reverts commit 7824ab8070.

Now that we force linking with LLVM shared libs when building clover,
we can link against libgallium.la with no problems.
2013-01-24 15:45:32 +00:00
Tom Stellard
cf69a591e1 configure.ac: Force use of LLVM shared libs with --enable-opencl v2
If we build clover with LLVM static libraries, then clover and also each
pipe_*.so driver that is built will contain their own static copy of
LLVM.  The recent automake changes have uncovered a problem where
the pipe_*.so drivers try to use clover's LLVM symbols.  This causes
LLVM's static registry objects to be initialized each time
a pipe_*.so driver is loaded by clover.  Initializing these objects
multiple times is not allowed and leads to assertion failures in the
LLVM code.

We can avoid all these problems by having clover and all the pipe_*.so
drivers link against the same LLVM shared library.

https://bugs.freedesktop.org/show_bug.cgi?id=59334
https://bugs.freedesktop.org/show_bug.cgi?id=59534

v2:
  - Fix shared library detection when LLVM is built with CMake
2013-01-24 15:45:18 +00:00
Tom Stellard
69d639ba8b configure.ac: Compute the required llvm static libraries only once
In order to determine which static LLVM libraries are needed we pass
a list of components to llvm-config and it generates the list of
library dependencies for us.  The advantage of only calling llvm-config
one time is that it can determine if two components depend on the same
library and then add it to the output list only once.  The old practice
of having each driver call llvm-config to add its own dependencies to
$(LLVM_LIBS) caused many libraries to be added to this variable multiple
times.
2013-01-24 15:44:53 +00:00
Michel Dänzer
35f0dc2cc7 radeonsi: Fall back to dummy pixel shader instead of trying indirect addressing.
Indirect addressing isn't fully handled yet.

Fixes crashes with piglit tests using indirect addressing.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-24 08:46:48 +01:00
Marek Olšák
68cebb9a8f radeonsi: make sure copying of all texture formats is accelerated
[ Cherry-picked from r600g commit 7c371f4695 ]

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-24 08:46:31 +01:00
Michel Dänzer
de4e448095 radeonsi: Handle PIPE_FORMAT_L32A32_S/UINT for rendering.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-24 08:46:31 +01:00
Michel Dänzer
d0096dfa85 radeonsi: Make sure to use float number format for packed float colour formats.
These aren't covered by UTIL_FORMAT_TYPE_FLOAT.

Fixes 15 piglit (sub)tests.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-24 08:46:31 +01:00
Ian Romanick
5bd86b26df intel: Enable S3TC extensions always
Always enable the use of pre-compressed texture data.  The ability to
perform on-line compression still requires the presence of libtxc_dxtn
or an explicit driconf over-ride.  Applications that just want to submit
precompessed data when an on-line compressor is not available can look
for the GL_EXT_texture_compression_dxt1 and
GL_ANGLE_texture_compression_dxt[35] extensions.

v2: Only enable the extensions that do not require on-line compression
by default.  The previous statement "This should not impact many (if
any) real applications." proved to be false for at least Sauerbraten.
This application mostly submits pre-compressed data, but it also can
submit uncompressed data that it asks the driver to compress.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Acked-by: Eric Anholt <eric@anholt.net> [v1]
Acked-by: Lee Salzman <lsalzman@gmail.com>
2013-01-23 23:38:04 -05:00
Ian Romanick
53f8251107 mesa: Like EXT_texture_compression_dxt1, advertise ANGLE_texture_compression_dxt in all APIs
This is technically outside the ANGLE spec, but it seems unlikely to
cause any harm.

v2: Simplify the extension checks by assuming the ANGLE extension will
always be enabled by any driver that enables the EXT.  Suggested by
Eric Anholt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Lee Salzman <lsalzman@gmail.com>
2013-01-23 23:38:04 -05:00
Ian Romanick
d45c6c817d mesa: Simplify _mesa_choose_tex_format handling of compressed formats
For non-generic compressed formats, we assert two things:

1. The format has already been validated against the set of available
   extensions.

2. The driver only enables the extension if it supports all of the
   formats that are part of that extension.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-23 23:38:04 -05:00
Ian Romanick
a021881ccd mesa: Use a single flag for the S3TC extensions that don't require on-line compression
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Lee Salzman <lsalzman@gmail.com>
2013-01-23 23:38:04 -05:00
Carl Worth
8059c2ea90 i965: Use swizzles to force R, G, and B to 0.0 for ALPHA textures.
Similar to the previous commit, we may be using a texture with actual RGBA
storage for the GL_ALPHA format, so force the color values to 0.0.

This commit fixes the following piglit (sub) tests:

	EXT_texture_snorm/fbo-blending-formats
		GL_ALPHA16_SNORM
		GL_ALPHA8_SNORM
		GL_ALPHA_SNORM

Note: Haswell bypasses this swizzle code, so may require an independent fix
for this bug.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-23 17:41:09 -08:00
Carl Worth
33599433c7 i965: Use swizzles to force alpha to 1.0 for RED, RG, or RGB textures.
We may be using a texture with actual RGBA storage for these formats, so force
the alpha value read to 1.0.

This commit fixes the following piglit (sub) tests:

	ARB_texture_float/fbo-blending-formats
		GL_RGB16F_ARB
	EXT_framebuffer_object/fbo-blending-formats
		GL_RGB10
		GL_RGB12
		GL_RGB16
	EXT_texture_snorm/fbo-blending-formats
		GL_RGB16_SNORM
		GL_RGB8_SNORM
		GL_RGB_SNORM

These test improvements depend on the previous commit as well. That commit
smashes alpha to 1.0 for the case of ReadPixels (so fixes "FBO testing" as
reported by this test), while this commit smashes alpha to 1.0 for the case of
texturing (fixes the "window testing" as reported by this test).

Note: Haswell bypasses this swizzle code, so may require an independent fix
for this bug.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-23 17:40:52 -08:00
Carl Worth
570ed2be7d ReadPixels: Force ALPHA to 1 while rebasing RGBA values for GL_RGB format
When performing a ReadPixels operation, we may be reading from a buffer that
stores alpha values, but that is actually representing a buffer with no alpha
channel. In this case, while rebasing the values, touch up all alpha values
read to 1.0.
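
The touch-up amounts to a loop like the one below. This is a sketch with hypothetical names, not the rebasing code in Mesa; it assumes pixels have already been converted to float RGBA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* If the buffer's base format has no alpha channel, whatever happens to be
 * stored in the alpha slot is meaningless, so report 1.0 instead. */
static void force_alpha_to_one(float (*rgba)[4], size_t n,
                               bool base_format_has_alpha)
{
   assert(rgba || n == 0);
   if (base_format_has_alpha)
      return;
   for (size_t i = 0; i < n; i++)
      rgba[i][3] = 1.0f;
}
```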

This commit fixes the following piglit (sub) tests:

	ARB_texture_float/fbo-colormask-formats
		GL_RGB16F_ARB
	EXT_texture_snorm/fbo-colormask-formats
		GL_RGB16_SNORM
		GL_RGB8_SNORM
		GL_RGB_SNORM

It likely improves the results of other tests as well, but a PASS remains
elusive due to additional bugs.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-01-23 17:40:52 -08:00
Carl Worth
b961ba44ed i965: Examine _BaseFormat when deciding to perform xRGB_alpha fixups
The renderbuffer's Format field may have an alpha channel even when the
underlying _BaseFormat does not. This can happen when mesa chooses to use
RGBA16 for an RGB16 format, for example.

So look at _BaseFormat when deciding whether to fixup the blend factors.

This commit improves the results of at least the following piglit tests:

	EXT_framebuffer_object/fbo-blending-formats
		{GL_RGB10, GL_RGB12, GL_RGB16}
	EXT_texture_snorm/fbo-blending-formats
		{GL_RGB16_SNORM, GL_RGB8_SNORM, GL_RGB_SNORM}

But none of these actually change from FAIL to PASS yet. The R, G, and B probe
values are fixed with this commit, but the tests still fail because the alpha
values are still wrong.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-23 17:40:24 -08:00
José Fonseca
0642437606 scons: Fix source lists parsing on Windows.
/ vs \ mismatch was causing .objs to be put in the source tree, causing
breakage when doing different build types in the same tree (eg., debug
vs release).

Fix this by normalizing everything to / slashes.

It's probably a good idea to purge all .objs from source tree to prevent
issues completely.
2013-01-23 12:11:53 +00:00
Matt Turner
60315e3eaf GL3.txt: i965 supports ARB_base_instance
Added in commit cdd3f549.
2013-01-22 21:34:25 -08:00
Brian Paul
bd8045d4c5 wmesa: include api_exec.h to fix compilation 2013-01-22 16:44:11 -07:00
Brian Paul
26a05b5005 draw: fix MSVC divide-by-zero compilation error
Kind of lame, but it works.
2013-01-22 16:44:11 -07:00
Kenneth Graunke
cdd3f5496a i965: Implement the GL_ARB_base_instance extension.
Thanks to Fredrik Höglund, all the hard work was already done.

Tested using a modified oglconform (that actually runs these tests on
our driver); it looks like there may be some bugs when using client
arrays.  All applicable non-compatibility tests passed.

For now, only enable it in core profiles.

Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Ian Romanick <idr@freedesktop.org>
2013-01-22 15:41:30 -08:00
Matt Turner
0d108116bd glsl/build: Build libglcpp and libglslcore in builtin_compiler
And reuse them if not cross compiling.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-22 14:33:45 -08:00
Matt Turner
952e6e9f3b glsl/Makefile.sources: Correct BUILTIN_COMPILER_CXX_FILES
Squashed with two reverts:

Revert "android: Update for builtin_stubs.cpp move"

This reverts commit c0def90ede.

Revert "scons: Update for builtin_stubs.cpp"

This reverts commit 8ac4b82699.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Tested-on-Android-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-22 14:33:41 -08:00
Matt Turner
2a71054396 build: Use AX_PROG_FLEX
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47248
2013-01-22 14:33:38 -08:00
Matt Turner
b68b85224d build: Use AX_PROG_BISON
No one tests yacc/byacc. Let's just request bison specifically.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46815
2013-01-22 14:33:31 -08:00
Matt Turner
3791ce05eb builtin_compiler/build: Use generated parser files
... instead of generating them again.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-22 14:33:28 -08:00
Matt Turner
efd201caa5 glsl/build: Build tests via the glsl Makefile
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-22 14:33:24 -08:00
Matt Turner
86d30dea3c glsl/build: Build glcpp via the glsl Makefile
Removing the subdirectory recursion provides a small speed up.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-22 14:33:20 -08:00
Matt Turner
cc9f609cb9 glsl/build: Don't build builtin_compiler separately if not cross compiling
Reduces the number of times that src/glsl/ is compiled when not cross
compiling.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-22 14:33:16 -08:00
Matt Turner
569f0e400a glsl/build: Don't build glsl_compiler
Use glslparsertest from piglit instead.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-22 14:33:07 -08:00
Brian Paul
ab74fee5e1 draw: fix problem in screen-space interpolation clip code
I don't see how this could have ever worked right.

The screen-space interpolation code uses the vertex->data[pos_attr]
position which contain window coords.  But window coords are only
computed for the unclipped vertices; the clipped vertices have
undefined window coords (see draw_cliptest_tmp.h).

Use the vertex clip coords instead which are always defined.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=55476
(piglit fbo-blit-stretch failure on softpipe)

Note: This is a candidate for the 9.0 branch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-22 14:53:58 -07:00
Brian Paul
ed643d6b2f draw: improve the clipper debug/printf code
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-22 14:53:58 -07:00
Brian Paul
4a938ef713 draw: add new debug code and comments in clip code template
In debug builds, set clipped vertex window coordinates to NaN values
to help debugging.  Otherwise, we're just leaving the coordinate in clip
space and it's invalid to use it later expecting it to be a window coord.
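
The idea of poisoning the unused coordinates can be sketched as below. The struct and function names are hypothetical, SKETCH_DEBUG stands in for whatever debug-build macro the real code uses, and the viewport transform is omitted.

```c
#include <assert.h>
#include <math.h>      /* NAN, isnan */
#include <stdbool.h>

#define SKETCH_DEBUG 1   /* stand-in for a debug-build macro */

struct vertex_sketch {
   float clip[4];     /* clip-space position, always defined */
   float window[4];   /* window position, only valid if not clipped */
};

static void finish_vertex(struct vertex_sketch *v, bool clipped)
{
   assert(v);
   if (clipped) {
#if SKETCH_DEBUG
      /* Poison the window coords so any later use is easy to spot. */
      for (int i = 0; i < 4; i++)
         v->window[i] = NAN;
#endif
      return;
   }

   /* Perspective divide only; a real pipeline also applies the viewport. */
   float w = v->clip[3];
   assert(w != 0.0f);
   v->window[0] = v->clip[0] / w;
   v->window[1] = v->clip[1] / w;
   v->window[2] = v->clip[2] / w;
   v->window[3] = 1.0f / w;
}
```

Because NaN propagates through arithmetic, a consumer that wrongly reads window[] for a clipped vertex produces obviously broken output instead of plausible-looking garbage.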

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-22 14:53:58 -07:00
Brian Paul
547a418888 swrast: fix blit code's nearest/linear coordinate arithmetic
Fixes piglit's fbo-blit-stretch test.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-22 14:53:58 -07:00
Brian Paul
b70b486249 swrast: fix incorrect width for direct/nearest blit
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-22 14:53:58 -07:00
Brian Paul
728bf86a23 swrast: move resampleRow setup code in blit_nearest()
The resampleRow setup depends on pixelSize.  For color buffers,
we don't know the pixelSize until we're in the buffer loop.  Move
that code inside the loop.

Fixes: http://bugs.freedesktop.org/show_bug.cgi?id=59541

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-22 14:53:58 -07:00
Andreas Boll
0a60ea4ddc docs: import release notes for 9.0.2, add news item 2013-01-22 21:28:51 +01:00
José Fonseca
9a0973044e scons: Disable frame pointer omission for all build types except release.
In particular for checked builds, where debug_backtrace_capture relies
on it.
2013-01-22 20:19:28 +00:00
José Fonseca
de0057caa6 nouveau/build: Fix build failures when drm is not in /usr/include.
Fixes failures to include libdrm/nouveau.h when drm is not installed in
/usr/include.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-01-22 19:10:47 +00:00
Michel Dänzer
a56dfd99e2 radeon/llvm: Handle LP_CHAN_ALL in emit_fetch_immediate().
Fixes piglit spec/ARB_sampler_objects/sampler-incomplete and
spec/EXT_texture_swizzle/depth_texture_mode_and_swizzle.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-22 18:50:02 +01:00
Kenneth Graunke
121d19de92 build: Fix build on systems where /usr/bin/python isn't python 2.
configure.ac sets up a PYTHON2 variable, which is what we want
AX_PYTHON_MODULE to use (since we only use Python 2 for now).

NOTE: This is a candidate for the 9.0 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31598
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-22 09:05:32 -08:00
Ian Romanick
148fc6d537 mesa/es3: Apply stricter multisample blit rules for ES3.
Fixes gles3conform
framebuffer_blit_error_blitframebuffer_multisampled_read_buffer_different_origins.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-22 03:26:24 -05:00
Ian Romanick
d7475c7966 mesa/es3: Disallow FRAMEBUFFER_ATTACHMENT_COMPONENT_TYPE query of DEPTH_STENCIL_ATTACHMENT
This error was added in the 3.0.1 update to the OpenGL ES 3.0 spec.
Fixes the updated gles3conform packed_depth_stencil_parameters test.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-22 03:26:24 -05:00
Ian Romanick
9cb64a4cb6 mesa: Don't allow blits to / from the same buffer in OpenGL ES 3.0
Fixes gles3conform test CoverageES30.  It temporarily regresses some
framebuffer_blit tests, but the failing subcases have been determined to
be invalid for OpenGL ES 3.0.

v2: Fix typo in depth (and stencil) RB checking.  Noticed by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-22 03:26:24 -05:00
Eric Anholt
85c2e99039 mesa: Remove exec thunks from the dlist.c module.
These were introduced in 2000 during a rework of the TNL module (commit
cab974cf6c), though I'm having a hard time
finding an instance there of one of these Exec functions being changed
at runtime.

Regardless, as far as I can tell now, these functions don't get changed,
by grepping for calls to SET_* to change the dispatch table (we do change
functions in GLvertexformat at runtime, but those don't overlap with
this set of functions).  Remove them and just let them be initialized to
the same functions as are in the Exec table.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-21 21:26:48 -08:00
Eric Anholt
ab4c549378 mesa: Initially populate the display list with the exec list.
This cuts out a ton of code to make functions not set to a save_ variant
match.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-21 21:26:48 -08:00
Eric Anholt
7820e2dd8d mesa: Delay display list save dispatch setup until Exec is set up.
This will let us copy from the Exec dispatch to deal with our commands that
don't get compiled into display lists.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-21 21:26:48 -08:00
Eric Anholt
be4b1664fb mesa: Make the drivers call a non-code-generated dispatch table setup.
I want to drive the Save dispatch table setup from this same function.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-21 21:26:48 -08:00
Eric Anholt
ced98f17ef mesa: Remove the size argument from _mesa_alloc_dispatch_table().
All callers are in Mesa core and all use _gloffset_COUNT, so just rely on
the already baked-in use of _gloffset_COUNT in the function.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-21 21:26:47 -08:00
Eric Anholt
cb49016622 mesa: Remove two of the now unused ASSERT_OUTSIDE_BEGIN_END macros.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-21 21:26:47 -08:00
Eric Anholt
a9754793da mesa: Drop manual checks for outside begin/end.
We now have a separate dispatch table for begin/end that prevents these
functions from being entered during that time.  The
ASSERT_OUTSIDE_BEGIN_END_WITH_RETVALs are left because I don't want to
change any return values or introduce new error-only stubs at this
point.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-21 21:26:47 -08:00
Eric Anholt
c572251417 mesa: Install a minimal dispatch table during glBegin()/glEnd().
This is a step toward getting rid of ASSERT_OUTSIDE_BEGIN_END() in Mesa.

v2: Finish create_beginend_table() comment, move loopback API init into it,
    and add a const flag. (suggestions by Brian)

Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
2013-01-21 21:26:47 -08:00
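The idea behind c572251417 and a9754793da can be illustrated with a small
sketch: rather than checking "are we inside begin/end?" in every entry point,
a second dispatch table is installed while a primitive is open, and entries
that are not legal there point at an error stub. All names below (the op enum,
sketch_begin(), sketch_call(), the two tables) are hypothetical and are not
Mesa's actual dispatch code; only the error value matches GL_INVALID_OPERATION.

```c
#include <assert.h>

/* Hypothetical two-entry API: one call legal inside begin/end, one not. */
enum { OP_VERTEX, OP_CLEAR, OP_COUNT };

/* Each entry point returns 0 on success or a GL-style error code. */
typedef int (*op_fn)(void);

#define ERR_INVALID_OPERATION 0x0502 /* numeric value of GL_INVALID_OPERATION */

static int do_vertex(void) { return 0; }
static int do_clear(void) { return 0; }
static int err_inside_begin_end(void) { return ERR_INVALID_OPERATION; }

/* Normal execution table. */
static const op_fn exec_table[OP_COUNT] = { do_vertex, do_clear };

/* Minimal table used while a primitive is open: illegal calls hit the stub. */
static const op_fn beginend_table[OP_COUNT] = { do_vertex, err_inside_begin_end };

static const op_fn *current_table = exec_table;

static void sketch_begin(void) { current_table = beginend_table; }
static void sketch_end(void) { current_table = exec_table; }

/* Dispatch through whichever table is currently installed. */
static int sketch_call(int op)
{
    assert(op >= 0 && op < OP_COUNT);
    return current_table[op]();
}
```

With this arrangement, sketch_call(OP_CLEAR) succeeds outside a primitive and
returns the error code between sketch_begin() and sketch_end(), with no
per-function state check.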
Eric Anholt
0aaf0445ba mesa: Remove the dead PrepareExecBegin() driver hook.
This was used in i965 for a while, but no more.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-21 21:26:47 -08:00
Eric Anholt
23916cae8e mesa: Use an early return to unindent most of vbo_exec_Begin/End().
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-21 21:26:46 -08:00
Eric Anholt
7b3c8b3747 mesa: Improve a glTexEnv error message by looking up the enum.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-21 21:26:46 -08:00
Eric Anholt
4d8f72f2bc mesa: Fix regression in dlist save primitive tracking.
My change 7ca4f07b5b caused errors to not
be thrown when they should, because the new if statement for ExecuteFlag
made the CurrentSavePrimitive not get set.  And on further review, we
shouldn't be validating our primitive in GL_COMPILE mode, since the
command shouldn't be executed yet.

Partially fixes piglit gl-1.0-beginend-coverage.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-21 21:26:46 -08:00
Maarten Lankhorst
3a91e7955a vl: round next_msc to integer frame, and kill skew_msc
This reduces jitter slightly in a cleaner way, without desynchronizing mplayer2 as badly
when falling behind.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-01-21 23:49:56 +01:00
José Fonseca
71c87e42e1 scons: Fix dependencies of generated headers.
It appears that scons implicit dependency scanners fail to chain
dependencies of generated headers when these are outside the build tree.

This patch ensures generated source files are _always_ put in the build
tree. I'm not 100% sure this will fix all dependency issues, but in my
experiments it does seem to fix them.

NOTE: For this to be effective it is necessary to clean the source tree
from generated header/source files.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-21 19:10:54 +00:00
Ian Romanick
75b7e1df13 intel: Don't expose XRGB8888 visuals any more
There really isn't any point.  There is no resource savings, and we have
to do gymnastics in the driver to make it work.

There are also bad interactions with multisampling and OpenGL ES 3.0.
In ES3, a multisample-to-singlesample blit must have identical source
and destination format.  This means a multisample RGBA8 to singlesample
RGB8 (window) blit will generate an error.  Also in ES3, RGB8 is not a
renderable format.  This means that the application CANNOT make an RGB8
multisample renderbuffer.

As a result, if an application gets an RGB8 window and wants to do
multisample FBO rendering, it will probably break.

"Fixes" gles3conform
framebuffer_blit_functionality_multisampled_to_singlesampled_blit test
on RGB8 visuals.

v2: Fix 'formats' array size.  Suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Eric Anholt <eric@anholt.net>
2013-01-21 13:34:34 -05:00
Ian Romanick
9bdf5bef76 i965: Enable floating-point textures always
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Eric Anholt <eric@anholt.net>
2013-01-21 11:46:21 -05:00
Marek Olšák
4a1af434e6 r300g: add a workaround for the AA colorbuffer addressing bug on R500 2013-01-21 17:00:51 +01:00
Marek Olšák
7bfbf5b287 r300g: allow resolutions up to 1280x1024 with AA optimizations on 1-pipe cards
because single-pipe cards have bigger CMASK RAM
2013-01-21 17:00:51 +01:00
Marek Olšák
b7cb655298 r300g: enable AA optimizations for the RGBA16F format 2013-01-21 17:00:51 +01:00
Marek Olšák
6f6112a2b9 radeonsi: More assorted depth/stencil changes ported from r600g.
[ Squashed port of the following r600g commits: - Michel Dänzer ]

commit 428e37c2da
Author: Marek Olšák <maraeo@gmail.com>
Date:   Tue Oct 2 22:02:54 2012 +0200

    r600g: add in-place DB decompression and texturing with DB tiling

    The decompression is done in-place and only the compressed tiles are
    decompressed. Note: R6xx-R7xx can do that only with Z16 and Z32F.

    The texture unit is programmed to use non-displayable tiling and depth
    ordering of samples, so that it can fetch the texture in the native DB format.

    The latest version of the libdrm surface allocator is required for stencil
    texturing to work. The old one didn't create the mipmap tree correctly.
    We need a separate mipmap tree for stencil, because the stencil mipmap
    offsets are not really depth offsets/4.

    There are still some known bugs, but this should save some memory and it also
    improves performance a little bit in Lightsmark (especially with low
    resolutions; tested with Radeon HD 5000).

    The DB->CB copy is still used for transfers.

commit e2f623f1d6
Author: Marek Olšák <maraeo@gmail.com>
Date:   Sat Jul 28 13:55:59 2012 +0200

    r600g: don't decompress depth or stencil if there isn't any

commit 43e226b6ef
Author: Marek Olšák <maraeo@gmail.com>
Date:   Wed Jul 18 00:32:50 2012 +0200

    r600g: optimize uploading depth textures

    Make it only copy the portion of a depth texture being uploaded and
    not the whole 2D layer.

    There is also a little code cleanup.

commit b242adbe5c
Author: Marek Olšák <maraeo@gmail.com>
Date:   Wed Jul 18 00:17:46 2012 +0200

    r600g: remove needless wrapper r600_texture_depth_flush

commit 611dd52942
Author: Marek Olšák <maraeo@gmail.com>
Date:   Wed Jul 18 00:05:14 2012 +0200

    r600g: init_flushed_depth_texture should be able to report errors

commit 80755ff563
Author: Marek Olšák <maraeo@gmail.com>
Date:   Sat Jul 14 17:06:27 2012 +0200

    r600g: properly track which textures are depth

    This fixes the issue with have_depth_texture never being set to false.

commit fe1fd67556
Author: Marek Olšák <maraeo@gmail.com>
Date:   Sun Jul 8 03:10:37 2012 +0200

    r600g: don't flush depth textures set as colorbuffers

    The only case a depth buffer can be set as a color buffer is when flushing.

    That wasn't always the case, but now this code isn't required anymore.

commit 5a17d8318e
Author: Marek Olšák <maraeo@gmail.com>
Date:   Sun Jul 8 02:14:18 2012 +0200

    r600g: flush depth textures bound to vertex shaders

    This was missing/broken. There are also minor code cleanups.

commit dee58f94af
Author: Marek Olšák <maraeo@gmail.com>
Date:   Sun Jul 8 01:54:24 2012 +0200

    r600g: do fine-grained depth texture flushing

    - maintain a mask of which mipmap levels are dirty (instead of one big flag)
    - only flush what was requested at a given point and not the whole resource
      (most often only one level and one layer has to be flushed)

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-21 15:42:28 +01:00
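The "fine-grained depth texture flushing" item in the squashed commit above
(dee58f94af) describes replacing one big dirty flag with a per-level mask. A
minimal sketch of that bookkeeping, assuming at most 32 mipmap levels; the
struct and function names are hypothetical and are not the r600g or radeonsi
ones.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical texture state: bit n set means mip level n needs a flush. */
struct depth_tex {
    uint32_t dirty_level_mask;
};

static void mark_level_dirty(struct depth_tex *t, unsigned level)
{
    assert(level < 32);
    t->dirty_level_mask |= UINT32_C(1) << level;
}

/* Flush only the dirty levels inside [first, last]; other dirty levels stay
 * dirty.  Returns how many levels were flushed. */
static unsigned flush_levels(struct depth_tex *t, unsigned first, unsigned last)
{
    unsigned flushed = 0;

    assert(first <= last && last < 32);
    for (unsigned level = first; level <= last; level++) {
        uint32_t bit = UINT32_C(1) << level;
        if (t->dirty_level_mask & bit) {
            /* A real driver would decompress or copy this level here. */
            t->dirty_level_mask &= ~bit;
            flushed++;
        }
    }
    return flushed;
}
```

Flushing levels 0..1 of a texture whose levels 0 and 3 are dirty touches one
level and leaves level 3 marked for a later request.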
Vadim Girlin
bc398f908f radeonsi: improve flushed depth texture handling
Use r600_resource_texture::flushed_depth_texture for GPU access, and
allocate it in the VRAM. For transfers we'll allocate texture in the GTT
and store it in the r600_transfer::staging.

Improves performance when flushed depth texture is frequently used by the
GPU, e.g. in Lightsmark

[ Ported from r600g commit 3770847960 ]

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-21 15:42:28 +01:00
Marek Olšák
bfb405ceee radeonsi: Assorted depth/stencil changes ported from r600g.
[ Squashed port of the following r600g commits: - Michel Dänzer ]

commit c1e8c845ea
Author: Marek Olšák <maraeo@gmail.com>
Date:   Sat Jul 7 19:10:00 2012 +0200

    r600g: inline r600_hw_copy_region

commit 4891c5dc64
Author: Marek Olšák <maraeo@gmail.com>
Date:   Mon Jun 25 22:53:21 2012 +0200

    r600g: inline r600_blit_push_depth and use resource_copy_region

    We are going to have a separate resource for depth texturing and transfers
    and this is just a transfer thing.

commit da98bb6fc1
Author: Marek Olšák <maraeo@gmail.com>
Date:   Mon Jun 25 12:45:32 2012 +0200

    r600g: split flushed depth texture creation and flushing

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-21 15:42:28 +01:00
Michel Dänzer
f0ffbbc9ff radeonsi: Enable 1D tiling for non-depth resources as well.
No piglit regressions anymore thanks to fixes in libdrm_radeon and here.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-21 14:10:52 +01:00
Michel Dänzer
90d919fcd0 radeonsi: Fix 1D tiling mode index for non-scanout resources.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-21 14:10:52 +01:00
Matt Turner
a076c272e2 build: Remove dead SHARED_GLAPI variable
The static Makefiles used it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-20 20:06:46 -08:00
Matt Turner
3f276b37b1 glsl/build: Build glsl_test only on make check
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-20 20:06:44 -08:00
Matt Turner
ecbe3118c2 glsl/build: Remove dead LIBRARY_* variables
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-20 20:06:41 -08:00
Matt Turner
37f34e53e0 xmlpool/build: generate options.h via BUILT_SOURCES
Fixes missing options.h when doing 'make check' in dri/common before
'make' has been run.

Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-20 20:06:20 -08:00
Jordan Justen
6c7fa72229 fbobject: add additional fbo completeness checks for GLES
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-20 19:54:38 -08:00
Jordan Justen
f8e7aa2827 framebuffer: update allowed implementation format/type
Allow additional format/type combinations based on the
color render buffer to fix failures with gles3-gtf.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-20 19:54:38 -08:00
Jordan Justen
ffdffd834a readpix: allow implementation format/type
For GLES2/3 allow reading of pixels with format/type based on:
 * GL_IMPLEMENTATION_COLOR_READ_FORMAT
 * GL_IMPLEMENTATION_COLOR_READ_TYPE

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-20 19:54:38 -08:00
Jordan Justen
119002a648 extensions: enable EXT_color_buffer_float for ES3
[mattst88] v2: Enable only for ES3 per spec.
[mattst88] v3: Use _mesa_is_gles3 since EXT_color_buffer_float is
	       ES3-only.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-20 19:54:38 -08:00
Matt Turner
227f58695e extensions: Add ES3-only extension support
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-20 19:54:38 -08:00
Jordan Justen
ce9118c7f0 readpix: check FBO completeness before trying to access the read-buffer
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-20 19:54:38 -08:00
Jordan Justen
8b0bc9de36 readpix: add error checking for GLES3
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-20 19:54:38 -08:00
Jordan Justen
a793ffa0b8 copyteximage: update error checking for GLES3
Changes based on GTF/gles3 conformance test suite.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-20 19:54:38 -08:00
Jordan Justen
3b51d71c85 copyteximage: check that sRGB usage is valid for GLES3 / GL
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-20 19:54:38 -08:00
Ian Romanick
285fe32bd9 intel: Enable GL_OES_depth_texture_cube_map
For now I'm just enabling this on the same subset of hardware that has
OpenGL 3.0 enabled.  This same functionality is part of OpenGL 3.0, and
there is no matching desktop extension.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-20 20:56:23 -05:00
Ian Romanick
1c29d8f4ff mesa/es3: Allow unsized depth and depth-stencil formats in ES3
They're part of GL_OES_depth_texture_cube_map, and we'll always enable
that extension in ES3 contexts.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-20 20:56:20 -05:00
Ian Romanick
b3eed73c3b mesa/es2: Allow depth component cube maps in ES2 if the extension is enabled
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-20 20:56:18 -05:00
Ian Romanick
0f899c2da8 mesa: Add extension bit tracking for GL_OES_depth_texture_cube_map
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-20 20:56:01 -05:00
Adam Jackson
30530ee9ac gallium: Remove ppc asm backend
The vs part hasn't been wired up since tgsi_sse2 was disabled in:

    commit 4eb3225b38
    Author: José Fonseca <jose.r.fonseca@gmail.com>
    Date:   Tue Nov 8 00:10:47 2011 +0000

	Remove tgsi_sse2.

And it would certainly not work correctly in its current state:

draw/draw_vs_ppc.c: In function ‘draw_create_vs_ppc’:
draw/draw_vs_ppc.c:190:24: warning: assignment from incompatible pointer
type [enabled by default]

As with the sse2 backend, this should be done in llvm anyway.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-01-20 17:12:47 -05:00
Andreas Boll
410b58c7bf build: require python module libxml2
configure should warn if libxml2 is not found.
libxml2 is needed by glapi/gen.

Fixes error during build in src/mapi/glapi/gen:
ImportError: No module named libxml2

NOTE: This is a candidate for the 9.0 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31598
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-19 23:50:39 +01:00
Vincent Lejeune
f9f5c92f73 r600g/llvm: Fixes addressspace of basevectors for clipvertex 2013-01-19 22:28:13 +01:00
Christoph Bumiller
e264b8ef41 nv50/ir: add definitions of Target and CodeEmitter dtors
I really did build test, my compiler just doesn't seem to care.
2013-01-19 22:13:45 +01:00
Christoph Bumiller
7d2d450ea6 nouveau: fix undefined behaviour when testing sample_count
NOTE: This is a candidate for the 9.0 branch.
2013-01-19 20:54:39 +01:00
Christoph Bumiller
b0863c26d4 nv50/ir: fix a couple of warnings 2013-01-19 20:54:39 +01:00
Ian Romanick
f59a3a0fe2 mesa: Array uniform name length includes length of [0]
This is required by OpenGL ES 3.0 and desktop OpenGL 4.2.  Previous
versions were ambiguous.  This also matches the behavior of NVIDIA's
closed-source driver (version 304.64).

Fixed gles3conformance test uniform_buffer_object_getactiveuniformsiv
and uniform_buffer_object_structure_and_array_element_names (on my
in-progress branch that fixes a bunch of other stuff...YMMV).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-18 17:35:33 -08:00
Ian Romanick
8ef3c83ffe mesa: Array uniform names are supposed to have [0] appended
This is required by OpenGL ES 3.0 and desktop OpenGL 4.2.  Previous
versions were ambiguous.  This also matches the behavior of NVIDIA's
closed-source driver (version 304.64).

Fixed gles3conformance test uniform_buffer_object_getactiveuniform.

Several piglit tests expect glGetActiveUniform to *not* include the [0]
on the end.  These tests were already failing on NVIDIA, and this change
regresses them on Mesa.  Patches have been sent to the piglit mailing
list to fix the tests.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-18 17:35:33 -08:00
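The effect of the two array-uniform commits above (f59a3a0fe2 and 8ef3c83ffe)
can be sketched as follows: the reported name of an array uniform gains a
"[0]" suffix, and the reported name length grows by the same three
characters. The helper names are hypothetical, and the sketch assumes the
reported length counts the terminating NUL, as the GL name-length queries do.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the name an application would see into buf and return its length
 * excluding the NUL terminator (clamped if buf is too small). */
static size_t reported_uniform_name(char *buf, size_t buf_size,
                                    const char *name, int is_array)
{
    int n;

    assert(buf != NULL && buf_size > 0);
    n = snprintf(buf, buf_size, "%s%s", name, is_array ? "[0]" : "");
    assert(n >= 0);
    return (size_t)n < buf_size ? (size_t)n : buf_size - 1;
}

/* Length reported by a name-length query: base name, optional "[0]",
 * plus one for the NUL terminator (assumption stated in the lead-in). */
static size_t reported_uniform_name_length(const char *name, int is_array)
{
    return strlen(name) + (is_array ? 3 : 0) + 1;
}
```

For a uniform declared as an array named colors, the reported name is
"colors[0]" and the reported length is 10; a non-array uniform is unchanged.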
Ian Romanick
5938c7774f mesa: Refactor getting a uniform's name to a helper function
We currently have a bug in this code, and I don't want to fix it in two
places.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-18 17:35:33 -08:00
Ian Romanick
f26520146b glsl: Eliminate link_update_uniform_buffer_variables return value
It always returns true, so there's no point in having a return value.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-18 17:35:33 -08:00
Ian Romanick
bd85c75922 glsl: Remove unused loc parameter from generate_call
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-18 17:35:33 -08:00
Ian Romanick
56053b0a2d mesa: Remove unused field gl_uniform_buffer_variable::Buffer
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-18 17:35:33 -08:00
Ian Romanick
feea85da06 linker: Use helper variable sh
This looks like a copy-and-paste left over.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-18 17:35:32 -08:00
Ian Romanick
db718e2472 glsl: Remove stale comment
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-18 17:35:32 -08:00
Kenneth Graunke
4f29169913 glsl: Track UBO block names in the symbol table.
The GLSL 1.40 spec says:

    "Uniform block names and variable names declared within uniform
    blocks are scoped at the program level."

Track the block name in the symbol table and emit errors when conflicts
exist.

Fixes es3conform's uniform_buffer_object_block_name_conflict test, and
fixes the piglit block-name-clashes-with-{variable,function,struct}.vert
tests.

NOTE: This is a candidate for the 9.0 branch.

v2: Fix bad constructor initialization.  Noticed by Topi Pohjolainen.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-18 17:35:32 -08:00
Ian Romanick
bb47a4d081 glsl: Reject row_major and column_major on non-matrix types
About both row_major and column_major layout qualifiers, the GLSL spec
says:

    "It only affects the layout of matrices."

However, the OpenGL ES 3.0 conformance tests have taken this to mean it
is an error to use it elsewhere.  This seems logical given that
'layout(row_major) vec4 foo' is probably not what the programmer meant.

The only catch is dealing with structures that contain matrices.  Layout
qualifiers cannot be applied directly to fields of structures, so the
only way to affect the layout of the fields is to apply a qualifier to
the structure declaration itself.  There is ongoing debate about this
within Khronos, and it seems to be settling in favor of allowing the
qualifiers on structures.  In light of this, I have chosen to allow the
qualifiers on structures but emit a warning since the usage may not be
portable.

Fixes gles3conform test
uniform_buffer_object_layouts_not_for_matrix_type and causes no
regressions.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-18 17:35:32 -08:00
Eric Anholt
1ec1b577f7 mesa: Skip updating texgen when not doing fixed function.
Between the previous commit and this one, improves GLBenchmark 2.1
offscreen performance by 0.48% +/- 0.24% (n=22, throttling outliers
removed).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-18 13:03:46 -08:00
Eric Anholt
078727d41c mesa: Don't bother updating ff texture state if we have a fragment shader.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-18 13:03:41 -08:00
Eric Anholt
b5788146ba mesa: Drop a comment about ff vertex shading and texturing.
It's never going to have texture fetches.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-18 13:03:27 -08:00
Eric Anholt
4533a38fa8 mesa: Fix out of bounds writes when uncompressing non-block-aligned ETC1.
Fixes a crash in GLB2.1 offscreen on the glthread branch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-18 12:48:27 -08:00
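The class of bug fixed by 4533a38fa8 can be sketched like this: ETC1 encodes
4x4 texel blocks, so when the image width or height is not a multiple of 4,
the last block in a row or column must be clipped before it is stored. The
function below is a hypothetical illustration, not Mesa's decoder; it assumes
a tightly packed RGBA8 destination and an already-decoded block.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_DIM 4 /* ETC1 encodes 4x4 texel blocks */

/* Store one decoded 4x4 RGBA8 block at texel (x, y) of a width x height
 * image, writing only the texels that fall inside the image.
 * Returns the number of texels written. */
static unsigned store_block_clipped(uint8_t *dst, unsigned width,
                                    unsigned height, unsigned x, unsigned y,
                                    const uint8_t block[BLOCK_DIM * BLOCK_DIM * 4])
{
    unsigned bw, bh;

    assert(x < width && y < height);
    /* Clamp the block extent to the image edge instead of assuming 4x4. */
    bw = width - x < BLOCK_DIM ? width - x : BLOCK_DIM;
    bh = height - y < BLOCK_DIM ? height - y : BLOCK_DIM;

    for (unsigned j = 0; j < bh; j++) {
        memcpy(dst + ((size_t)(y + j) * width + x) * 4,
               block + (size_t)j * BLOCK_DIM * 4,
               (size_t)bw * 4);
    }
    return bw * bh;
}
```

For a 6x5 image, the block at (4, 4) covers only 2x1 texels; writing the full
4x4 block there would run past the end of the destination buffer.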
Eric Anholt
5e529d708a i965: Add support for GL_ARB_texture_buffer_object_rgb32.
Tested with piglit ARB_texture_buffer_object/formats.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-18 12:48:22 -08:00
Eric Anholt
582b06c2c6 i965: Add support for MESA_FORMAT_RGB_FLOAT32 surfaces.
This is for GL_ARB_texture_buffer_object_rgb32 support, but it also
causes the format to get used for float32 rgb textures as well on
Ironlake and later.  Since that came with some surprises, separate
the change from the enable commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-18 12:48:18 -08:00
Eric Anholt
60894edeef intel: Make intel_region's pitch be bytes instead of pixels.
We almost never want a stride in pixels -- if you're doing anything with
a stride, you're specifying an offset or incrementing a pointer, and in
both cases you had to multiply by cpp to get the bytes value you wanted.
But worse, on the way to creating a region from a new tiled BO, we
divided by cpp to get pitch in pixels, and for an RGB32 buffer (an
upcoming change) the pitch wouldn't divide exactly, and we'd end up with
a wrong stride in our region.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-18 12:48:13 -08:00
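The rounding problem described in 60894edeef is plain integer arithmetic: if
a byte pitch is stored in pixels and multiplied back by cpp on use, any pitch
that is not a multiple of cpp comes back smaller. A minimal sketch with
hypothetical helper names; the 512-byte pitch is an illustrative value, not
taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* The lossy round trip: convert a byte pitch to pixels, then back to bytes.
 * The division truncates whenever cpp does not divide pitch_bytes. */
static uint32_t pitch_via_pixels(uint32_t pitch_bytes, uint32_t cpp)
{
    uint32_t pitch_pixels;

    assert(cpp != 0);
    pitch_pixels = pitch_bytes / cpp;
    return pitch_pixels * cpp;
}

/* Bytes of stride lost by the round trip; 0 when cpp divides the pitch. */
static uint32_t pitch_error_bytes(uint32_t pitch_bytes, uint32_t cpp)
{
    return pitch_bytes - pitch_via_pixels(pitch_bytes, cpp);
}
```

With a 512-byte pitch, cpp = 4 round-trips exactly, while cpp = 12 (RGB32)
yields 504 bytes, 8 short of the real stride. Keeping the pitch in bytes
avoids the division altogether.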
Eric Anholt
8fd62e80ae intel: Make intel_blit.c take pitches in bytes.
As we gain support for NPOT cpp, a pitch may not divide by cpp cleanly.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-18 12:48:07 -08:00
Vincent Lejeune
3b14ce2caf r600g/llvm: tgsi to llvm emits store.swizzle intrinsic for vs/fs output
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-01-18 20:34:26 +00:00
Vincent Lejeune
7b20526466 r600g/llvm: tgsi to llvm emits stream output intrinsics.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-01-18 20:34:21 +00:00
Vincent Lejeune
ce34ff1ad7 r600g/llvm: translate ARL opcode to a simple cast
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-01-18 20:08:10 +00:00
Vadim Girlin
7d532800d8 r600g/llvm: rework handling of the constants
Vincent Lejeune:
  - tgsi to llvm now emits pointers for constants

Tom Stellard:
  - Only use texture cache for vtx fetch with compute shaders
  - Change address space used for constant loads to match LLVM
    backend.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-01-18 20:08:10 +00:00
Kenneth Graunke
1ee2880e86 mesa: Only mark textures as mipmap incomplete on MAX_LEVEL issues.
According to the OpenGL 3.2 Core Profile specification, section 3.8.12:

"For one-, two-, and three-dimensional and one- and two-dimensional array
 textures, a texture is mipmap complete if all of the following
 conditions hold true:
 - [...]
 - levelbase <= levelmax [...]

 Using the preceding definitions, a texture is complete unless any of
 the following conditions hold true:
 - [...]
 - The minification filter requires a mipmap (is neither NEAREST nor
   LINEAR), and the texture is not mipmap complete."

(This text also appears in all GL >= 3.2 specs and the ES 3.0 spec.)

From this, we see that levelbase <= levelmax should only affect mipmap
completeness, not base-level completeness.

Prior versions of GL did not have the notion of mipmap completeness,
simply calling the texture incomplete in this case.  But I don't think
we really care.

Fixes es3conform's sgis_texture_lod_basic_completeness test.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2013-01-18 11:31:27 -08:00
Kenneth Graunke
f0dbd9255b i965/vs: Store texturing results into a vec4 temporary.
The sampler appears to ignore writemasks (even when correcting the
WRITEMASK_XYZW in brw_vec4_emit.cpp to the proper writemask) and just
always writes all four values.

To cope with this, just texture into a temporary, then MOV out into a
register that has the proper number of components.

NOTE: This is a candidate for stable branches.

Fixes es3conform's shadow_execution_vert.test.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2013-01-18 11:31:27 -08:00
Kenneth Graunke
aeff9a0d98 i965/vs: Set LOD to 0 for ordinary texture() calls.
Previously it was left undefined, causing us to select a random LOD.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2013-01-18 11:31:26 -08:00
Kenneth Graunke
56ce55d198 i965/vs: Create a 'lod_type' temporary for ir->lod_info.lod->type.
This is purely a refactor.  However, in a moment, we'll want to set
lod_type to float for ir_tex, where ir->lod_info.lod is NULL.

NOTE: This is a candidate for stable branches (for the next patch).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2013-01-18 11:31:26 -08:00
Kenneth Graunke
613e64060c i965: Lower textureGrad() with samplerCubeShadow on pre-Haswell.
Fixes regressions since commit 899017fc54
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Fri Jan 4 07:53:09 2013 -0800

    i965: Use Haswell's sample_d_c for textureGrad with shadow samplers.

That patch assumed that all instances were lowered.  However, we weren't
lowering textureGrad() with samplerCubeShadow because I couldn't figure
out the LOD calculations.  It turns out they're easy: you just have to
use 1 for the depth.  This causes it to pass oglconform's four tests.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Ian Romanick <idr@freedesktop.org>
2013-01-18 10:30:54 -08:00
Roland Scheidegger
d03d9b657e llvmpipe: turn on integer texture support
Now that things mostly seem to work enable those formats.
Some formats cause crashes (notably RGB8 variants) so switch these off
(these crashes are not specific to INT/UINT variants but the state tracker
doesn't use them for UNORM etc. formats so it went unnoticed so far).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-18 09:14:52 -08:00
Roland Scheidegger
f2a87a1f5b llvmpipe: more fixes for integer color buffers
Cast back the fake floats to ints, and make sure we don't try to do scaling
in format conversion (which only makes sense with normalized values).
Also need to disable blending and alpha test (as per spec) for such buffers.
This makes fbo-blending from the piglit ext_texture_integer tests work for most
formats (some crash, and the luminance and intensity variants have the GB or
GBA channels respectively wrong).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-18 09:14:52 -08:00
Roland Scheidegger
dc6bc3b642 llvmpipe: trivial code and comment cleanup.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-18 09:14:52 -08:00
Roland Scheidegger
8c84a82383 llvmpipe: fix using wrong format with MRT in blend code
We were passing in the rt index however this was always 0 for non-independent
blend case. (The format was only actually used to decide if the color mask
covered all channels so this went unnoticed and was discovered by accident.)
Additionally, there was a second problem: because we do fixups in the key based
on color buffer format, we cannot use non-independent blend anyway, as the
fixed-up values would never get used.
So always turn non-independent blending into independent.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-18 09:14:52 -08:00
Ian Romanick
ca39c0f94a mesa/es3: Don't check dimensions in _mesa_es3_error_check_format_and_type
Filtering of DEPTH_COMPONENT and DEPTH_STENCIL for TEXTURE_3D is already
done in texture_error_check because these combinations aren't allowed on
desktop GL either.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-01-17 10:47:46 -08:00
Ian Romanick
311cc5d973 mesa: Don't allow DEPTH_STENCIL for 3D textures
Just like DEPTH_COMPONENT.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-01-17 10:47:42 -08:00
Brian Paul
57ddf1227f swrast: fix assorted bugs in software blit code
1. The loop over dest buffers in blit_linear() needed a null pointer
check.  Fixes https://bugs.freedesktop.org/show_bug.cgi?id=59499

2. The code to grab the drawRb's format needs to be inside the drawing loop.

3. An equality test was using = instead of == thus messing up a
renderbuffer attachment texture pointer.  This led to memory
corruption and a crash at exit.

Finally, fix a capitalization error NumDrawBuffers -> numDrawBuffers
and change type to unsigned to fix signed/unsigned comparison warnings.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-01-17 09:38:54 -07:00
Michel Dänzer
51efb081f7 radeonsi: Actually keep track if we are using depth textures for samplers.
20-odd more piglits.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-17 16:57:21 +01:00
Michel Dänzer
3c92bfe2d2 radeonsi: Fix Z24 texture formats.
About half a dozen more piglits.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-17 16:57:21 +01:00
Michel Dänzer
1ace200b2b radeonsi: Set SPI_SHADER_COL_FORMAT to what the pixel shader actually exports.
Instead of deriving it from the colour buffer formats only.

Fixes a number of piglit tests which export depth from the pixel shader.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-17 16:57:21 +01:00
Michel Dänzer
bc5e65096d radeonsi: Use proper hardware format for stencil texturing.
Fixes piglit 'spec/ARB_depth_buffer_float/fbo-clear-formats stencil' crash.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-17 16:57:20 +01:00
Michel Dänzer
c486e3ef34 radeonsi: Enable tiling for depth/stencil resources.
Enabling it for all resources still seems to cause problems, but depth/stencil
buffers are always accessed with tiling by the DB block.

Also, stick to 1D tiling for now. Getting 2D tiling to work properly will
require substantial changes in libdrm_radeon and possibly the kernel as well.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-17 16:57:20 +01:00
Michel Dänzer
c408f0c5c4 radeonsi: Consolidate calculation of tile mode index.
Apart from the obvious cleanup, this makes sure all blocks use the same tiling
mode for accessing the resource.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-17 16:57:20 +01:00
Maarten Lankhorst
9ba7eac535 nvc0: add support for accelerated video decoding through the dedicated engines
Currently the use of external firmware is required, with kernel and
userspace firmware needed for all Fermi cards except nvd9. Kepler and nvd9
should only require kernel firmware.
2013-01-17 16:28:57 +01:00
Michel Dänzer
6eb0d3d863 radeonsi: Pass texture type to sampling intrinsics.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2013-01-17 15:47:38 +01:00
Maarten Lankhorst
edc8e8cbef nvc0: add space checks to clear functions
Thanks to calim for helping me find and fix the issue.
2013-01-17 12:37:25 +01:00
Maarten Lankhorst
5dc76c7670 nv50: add space checks to clear functions, and respect depth
Thanks to calim for helping me find and fix the issue.
2013-01-17 12:37:15 +01:00
Brian Paul
56c01d8109 st/mesa: a couple fixes for st_BlitFramebuffer()
1. Loop over multiple destination color buffers.  If we set
glDrawBuffers(GL_FRONT_AND_BACK) we need to loop over multiple color
buffers, blitting to each.

2. Add checks for null src/dst surface pointers.  This fixes a crash
in the piglit fbo-missing-attachment-blit test.
See bug http://bugs.freedesktop.org/show_bug.cgi?id=59450
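
The control flow of the two fixes above can be modeled roughly as follows.
This is a Python sketch rather than the C code from the patch, and every name
in it (blit_color, blit_one, the surface values) is hypothetical:

```python
def blit_color(src_surface, dst_surfaces, blit_one):
    """Blit src_surface to every non-null destination; return the blit count.

    All names are hypothetical stand-ins, not Mesa or Gallium API.
    """
    # Fix 2: a missing source attachment means there is nothing to read.
    if src_surface is None:
        return 0

    count = 0
    # Fix 1: visit each enabled color draw buffer, not just the first one.
    for dst in dst_surfaces:
        # Fix 2 again: skip draw buffers that have no surface attached.
        if dst is None:
            continue
        blit_one(src_surface, dst)
        count += 1
    return count
```

With GL_FRONT_AND_BACK selected, dst_surfaces would hold two entries and
blit_one would run once per entry.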

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-01-16 17:06:17 -07:00
Brian Paul
af7b4b01f1 st/mesa: simplify some src/dst surface setup in BlitFramebuffer
Use the renderbuffer attachment pointers that we grabbed earlier.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-01-16 17:06:17 -07:00
Brian Paul
09154c274c meta: add 'f' suffix to floats to silence some MSVC warnings 2013-01-16 17:06:17 -07:00
Brian Paul
6064810e53 mesa: add missing ASSERT_OUTSIDE_BEGIN_END() in _mesa_GetInternalformativ()
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-16 17:06:17 -07:00
Matt Turner
99629735e7 build: Make src/gtest before src/mesa
Fixes a make check problem where libgtest.la wasn't built before tests
that want to link with it.
2013-01-16 13:31:36 -08:00
Jon TURNEY
e6e73089e5 Fix mapi code generator for out-of-tree build
Use os.path.join() rather than hand-rolling it, so path is correct if
sys.argv[0] returns an absolute path.

(According to the python documentation, it's platform dependent whether
sys.argv[0] is a full pathname or not.  It probably also depends on how
the process was started...)
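
The difference can be seen in a small sketch.  The exact expression used in
the generator is not quoted in this message, so the two helpers below are
assumed shapes with hypothetical names; posixpath is used instead of os.path
only so the result does not depend on the host platform:

```python
import posixpath  # what os.path resolves to on POSIX systems


def hand_rolled(cwd, script_dir):
    # Naive concatenation: wrong when script_dir is already absolute.
    return cwd + "/" + script_dir


def joined(cwd, script_dir):
    # join() discards earlier components when a later one is absolute,
    # so an absolute script_dir is returned unchanged.
    return posixpath.join(cwd, script_dir)
```

For a relative script_dir both helpers agree; for an absolute one only
joined() yields a usable path.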

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-16 19:53:01 +00:00
Maarten Lankhorst
4fad211502 nvc0: Add support for video buffer 2013-01-16 17:44:09 +01:00
Maarten Lankhorst
4b8af72f96 vl/video_buffer: fix up surface ordering for the interlaced case
It seems the other code expects surface[0..1] to be the luma field in the interlaced case.

See for example vdpau/surface.c vlVdpVideoSurfaceClear and vlVdpVideoSurfacePutBitsYCbCr.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-01-16 17:22:55 +01:00
Maarten Lankhorst
892c1fa8d8 vl/compositor: fix weave shader bugs
Writemask was XY instead of YZ (thanks to calim for spotting it).

The pixel calculation resulted in the pixel always being off by one.
If y was .5:

y' = round(y) + 0.5 = 1.5

Fixing this also means the LRP function has to swap the pixels, since
it's now the other way around for top/bottom.

With these fixes, only the chroma for the top and bottom pixel rows is wrongly
interpolated in my test program:

--- nvidia
+++ nouveau
@@ -1,4 +1,4 @@
-YCbCr[0] = 00c080
+YCbCr[0] = 00b070
 YCbCr[1] = 00b070
 YCbCr[2] = 029050
 YCbCr[3] = 207050
@@ -61,4 +61,4 @@
 YCbCr[60] = 0c5070
 YCbCr[61] = c05090
 YCbCr[62] = 0e70b0
-YCbCr[63] = e080c0
+YCbCr[63] = e070b0

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-01-16 17:22:45 +01:00
Brian Paul
dfcd7658c5 mesa: add new formatquery.c file to SConscript file to fix build 2013-01-16 08:18:33 -07:00
Christian König
f449948812 radeonsi/vdpau: remove nonsense state tracker dep
Added with automake conversion, but makes no sense at all.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-16 15:28:43 +01:00
Ian Romanick
1cedf7819b glapi: Remove duplicate ARB_base_instance from gl_API.xml
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 23:20:18 -08:00
Ian Romanick
3c00a52f7e intel: Enable GL_ARB_internalformat_query
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-15 21:34:45 -08:00
Ian Romanick
f5e7f12e4a mesa: Add driver method to determine the possible sample counts
Use this method in _mesa_GetInternalformativ for both GL_SAMPLES and
GL_NUM_SAMPLE_COUNTS.

v2: internalFormat may not be color renderable by the driver, so zero
can be returned as a sample count.  Require that drivers supporting the
extension provide a QuerySamplesForFormat function.  The latter was
suggested by Eric Anholt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-15 21:34:45 -08:00
Ian Romanick
bda540d235 mesa: Add dispatch and extension XML for GL_ARB_internalformat_query
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-15 21:34:45 -08:00
Ian Romanick
5e4bb063f0 mesa: Add extension tracking bit for GL_ARB_internalformat_query
Though, I'm tempted to always expose this extension when
GL_ARB_framebuffer_object is exposed.  In that case, it would share the same
enable bit.

v2: Correctly sort extension names.  Suggested by Eric Anholt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-15 21:34:45 -08:00
Ian Romanick
1b468d043e mesa: Add skeleton implementation of glGetInternalformativ
This is for the GL_ARB_internalformat_query extension and GLES 3.0.

v2: Generate GL_INVALID_OPERATION if the extension is not supported.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-15 21:34:45 -08:00
Vinson Lee
780c2cb42b meta: Move loop variable declaration outside for loop.
Fixes build with MSVC.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-01-15 18:03:25 -08:00
Brian Paul
7ecbbc3386 mesa: move declarations before code to fix MSVC build 2013-01-15 17:02:30 -07:00
Anuj Phogat
d0ce8d6ceb mesa: Round float param in glTexParameterf() to nearest integer
OpenGL 4.2 specification suggests rounding the float data to nearest
integer when the type of internal state is integer. Out of range floats
should be clamped to {INT_MIN, INT_MAX}. This is not specified anywhere
in the GL/GLES specs, but the test below expects this behavior.  This patch makes
gles3 conformance sgis_texture_lod_basic_getter.test pass.

A GL spec bug will be raised to include clamping of out of range floats.
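
A rough model of the conversion described above, written in Python for
brevity.  The helper name is hypothetical, and rounding halves away from zero
is an assumption; the message only says "nearest integer":

```python
import math

INT_MIN = -2**31      # 32-bit int range, as in C's <limits.h>
INT_MAX = 2**31 - 1


def float_param_to_int(value):
    """Clamp an out-of-range float, otherwise round it to the nearest int."""
    # Clamp first so the result always fits in a 32-bit int.
    if value >= INT_MAX:
        return INT_MAX
    if value <= INT_MIN:
        return INT_MIN
    # Assumed tie rule: halves round away from zero.
    if value >= 0:
        return int(math.floor(value + 0.5))
    return int(math.ceil(value - 0.5))
```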

V2: Round float to nearest integer for all cases where
_mesa_TexParameterf() converts float param to int. Use the same block of
float to int conversion code for GL_TEXTURE_SWIZZLE_{R,G,B,A}_EXT cases
as well.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 15:09:37 -08:00
Anuj Phogat
bed997daba mesa: Add support to allow blitting to multiple color draw buffers
This patch fixes a blitting case when drawAttachment->Texture ==
readAttachment->Texture. It was causing an assertion failure in
intel_miptree_attach_map() with gles3 conformance test case:
framebuffer_blit_functionality_minifying_blit

The number of changes in this file looks scary. But most of them are caused
by introducing a big for loop to support rendering to multiple color
draw buffers.

V2: Fixed the case where the number of draw buffer attachments is zero.
V3: Put a for loop in blit_nearest() and blit_linear() functions to
    support blitting to multiple color draw buffers.
V4: Remove variable declaration in for loop to avoid MSVC compilation
    issues.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 15:09:23 -08:00
Anuj Phogat
ab36ca0614 mesa: Add error checking in _mesa_BlitFramebuffer() for MRTs
This patch adds required error checking in _mesa_BlitFramebuffer() when
blitting to multiple color render targets. It also fixes a case when
blitting to a framebuffer with renderbuffer/texture attached to
GL_COLOR_ATTACHMENT{i} (where i!=0). Previously it skipped color blitting if
nothing was found attached to GL_COLOR_ATTACHMENT0.

V2: Fixed the case where the number of draw buffer attachments is zero.
V3: Do compatible_color_datatypes() and compatible_resolve_formats()
    check for all the draw renderbuffers in fbobject.c. Fix debug code
    at bottom of _mesa_BlitFramebuffer() to handle MRTs. Combine error
    checking code for linear blits with other color blit error checking.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 15:09:12 -08:00
Anuj Phogat
2f2801f876 mesa: Fix GL error generation in _mesa_GetFramebufferAttachmentParameteriv()
This allows query on default framebuffer in
glGetFramebufferAttachmentParameteriv() for gles3. Fixes unexpected GL
errors in gles3 conformance test case:
framebuffer_blit_functionality_multisampled_to_singlesampled_blit

V2: Use _mesa_is_gles3() check to restrict allowed attachment types to
specific APIs.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 15:09:03 -08:00
Anuj Phogat
b77243b9c2 intel: Support blitting to multiple color draw buffers
This patch enables blitting to multiple color attachments of a
framebuffer.  It also fixes a case when blitting to a framebuffer with
renderbuffer/texture attached to non-zero attachment point
i.e. GL_COLOR_ATTACHMENT{1, 2, ...}.  Earlier we were incorrectly
blitting to GL_COLOR_ATTACHMENT0 by default.

V2: Use intel_copy_texsubimage() for blitting only if all the color
attachments can blit using it.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 15:08:55 -08:00
Anuj Phogat
0c535ae7fc meta: Add functionality to do _mesa_meta_BlitFrameBuffer() using glsl
This patch rewrites _mesa_meta_BlitFrameBuffer() function to add support
for blitting with GLSL/GLSL ES shaders. These changes were required to
support glBlitFramebuffer() in gles3. This patch, along with other
patches in this series, makes 16 failing framebuffer_blit test cases in
gles3 conformance pass.

V2: Properly handle flipped blits for source and destination
    renderbuffer / textures. Add support for GL_TEXTURE_RECTANGLE in
    _mesa_meta_BlitFrameBuffer. Create a temp depth texture to support
    depth buffer blitting.
V3: Remove unsupported / redundant shader code. Add an assertion to make
    sure that we don't use rectangle texture in ES. Put API guard on
    glTexEnvi().
V4: For gles3: Don't use ReadPixels or CopyTexImage2D to blit depth
    buffer.  gles3 spec says for CopyTexImage2D that "color buffer
    components can be dropped during the conversion to internalformat,
    but new components cannot be added." So, use the internal format of
    read renderbuffer to create texture for color buffer blitting.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2013-01-15 15:08:38 -08:00
Anuj Phogat
252573ae0f mesa: Fix GL error generation in glBlitFramebuffer()
V2:
If mask has GL_STENCIL_BUFFER_BIT set, the depth formats for
readRenderBuffer and drawRenderBuffer must match unless one of the two
buffers doesn't have depth, in which case it's not blitted, so the
format check should be ignored.  Same comment goes for stencil formats
in depth renderbuffers if mask has GL_DEPTH_BUFFER_BIT set.
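
The rule can be sketched as below (Python, for brevity).  Comparing bit
counts stands in for comparing formats, and the helper name is hypothetical;
the actual check in fbobject.c is not reproduced here:

```python
def other_component_compatible(read_bits, draw_bits):
    """Check the 'other' component (depth for a stencil blit, or vice versa).

    read_bits / draw_bits are the component sizes of the read and draw
    renderbuffers; 0 means the buffer lacks that component.
    """
    # If either buffer lacks the component it is not blitted at all,
    # so the format check is ignored.
    if read_bits == 0 or draw_bits == 0:
        return True
    # Both buffers carry the component: the formats must match.
    return read_bits == draw_bits
```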

v3 (Kayden): Refactor code to be a bit more readable.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 15:08:12 -08:00
Kenneth Graunke
f727fc6304 mesa: Make ES3 glDrawBuffers() only accept BACK/NONE for the winsys fbo.
Nothing was explicitly checking this.

v2: Update GL3 spec reference.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> [v2]
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
2013-01-15 15:04:50 -08:00
Kenneth Graunke
fd3891cbbe mesa: Handle GL_BACK correctly for ES 3.0 in glDrawBuffers().
In ES 3.0, when calling glDrawBuffers() on the window system
framebuffer, the only valid targets are GL_NONE or GL_BACK.  Since there
is no stereo rendering in ES 3.0, this is a single buffer, unlike
desktop where it may be two (and thus isn't allowed).

For single-buffered configs, GL_BACK ironically means the front (and
only) buffer.  I'm not sure that it matters, however, as ES shouldn't
have front buffer rendering in the first place.

Fixes es3conform framebuffer_blit_coverage_default_draw_buffer_binding.

v2: Update GLES3 spec reference.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> [v2]
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
2013-01-15 14:59:40 -08:00
Ian Romanick
d786bf2c2a egl/dri2: Fix typo in the previous commit
I didn't notice this due to a noobed piglit run.  It wasn't previously
noticed because the patch was only run on a driver that supported GLES3.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 14:19:04 -08:00
Rob Schneider
45575ff388 libgl-gdi: Avoid hangs on DLL_PROCESS_DETACH.
At process exit DLL_PROCESS_DETACH is signaled to DllMain(), where then
a final cleanup is triggered.  In stw_cleanup() code is triggered that
tries to communicate a shutdown to the spawned threads -- however at
that time those threads have already been terminated by the OS and so
the process hangs.

v2: skip stw_cleanup_thread() too

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2013-01-15 14:16:09 -08:00
Chad Versace
eb09940e55 egl/dri2: Add plumbing for EGL_OPENGL_ES3_BIT_KHR
Fixes error EGL_BAD_ATTRIBUTE in the tests below on Intel Sandybridge:
    * piglit egl-create-context-verify-gl-flavor, testcase OpenGL ES 3.0
    * gles3conform, revision 19700, when running GL3Tests with -fbo

This plumbing is added in order to comply with the EGL_KHR_create_context
spec. According to the EGL_KHR_create_context spec, it is illegal to call
eglCreateContext(EGL_CONTEXT_MAJOR_VERSION_KHR=3) with a config whose
EGL_RENDERABLE_TYPE does not contain the EGL_OPENGL_ES3_BIT_KHR. The pertinent
portion of the spec is quoted below; the key word is "respectively".

  * If <config> is not a valid EGLConfig, or does not support the
    requested client API, then an EGL_BAD_CONFIG error is generated
    (this includes requesting creation of an OpenGL ES 1.x, 2.0, or
    3.0 context when the EGL_RENDERABLE_TYPE attribute of <config>
    does not contain EGL_OPENGL_ES_BIT, EGL_OPENGL_ES2_BIT, or
    EGL_OPENGL_ES3_BIT_KHR respectively).
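
The "respectively" pairing amounts to a per-version lookup.  A minimal Python
sketch of that check follows; the enum values are the ones from the EGL
headers, while check_es_config is a hypothetical helper and not part of the
EGL or Mesa API:

```python
EGL_SUCCESS = 0x3000
EGL_BAD_CONFIG = 0x3005
EGL_OPENGL_ES_BIT = 0x0001
EGL_OPENGL_ES2_BIT = 0x0004
EGL_OPENGL_ES3_BIT_KHR = 0x0040

# Each ES major version needs its own bit; ES2_BIT does not cover ES 3.0.
REQUIRED_BIT = {
    1: EGL_OPENGL_ES_BIT,
    2: EGL_OPENGL_ES2_BIT,
    3: EGL_OPENGL_ES3_BIT_KHR,
}


def check_es_config(major_version, renderable_type):
    """Return the EGL error for creating an ES context on this config."""
    required = REQUIRED_BIT[major_version]
    if (renderable_type & required) == 0:
        return EGL_BAD_CONFIG
    return EGL_SUCCESS
```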

To create this patch, I searched for all the ES2 bit plumbing by calling
`git grep "ES2_BIT\|DRI_API_GLES2" src/egl`, and then at each location
added a case for ES3.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 13:45:54 -08:00
Chad Versace
26f9faa04b intel: Expose support for DRI_API_GLES3
If the hardware/driver combo supports GLES3, then set the GLES3 bit in
intel_screen's bitmask of supported DRI API's.  Neither the EGL nor GLX
layer uses the bit yet.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 13:45:54 -08:00
Chad Versace
e90c08e667 dri: Define enum __DRI_API_GLES3
This enum corresponds to EGL_OPENGL_ES3_BIT_KHR.
Neither the GLX nor EGL layer use the enum yet.

I don't like the GLES bits. I'd prefer that all GLES APIs be exposed
through a single API bit, as is done in GLX_EXT_create_context_es_profile.
But, we need this GLES3 enum in order to do the plumbing necessary to
correctly support EGL_OPENGL_ES3_BIT_KHR as required by the
EGL_KHR_create_context spec.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 13:45:53 -08:00
Chad Versace
a11fe62058 intel: Move validation of context version into intelInitContext
Each driver (i830, i915, i965) used independent but similar code to
validate the requested context version. With the recent arrival of GLES3,
that logic has needed an update. Rather than apply identical updates to
each driver's validation code, let's just move the validation into the
shared routine intelInitContext.

This refactor required some incidental changes to functions
i830CreateContext and intelInitContext. For each function, this patch:
    - Adds context version parameters to the signature.
    - Adds a DRI_CTX_ERROR out param to the signature.
    - Sets the DRI_CTX_ERROR at each early return.

Tested against gen6 with piglit egl-create-context-verify-gl-flavor.
Verified that this patch does not change the set of exposed EGL context
flavors.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 13:45:51 -08:00
Chad Versace
4945086f36 intel: Set screen's api mask according to hw capabilities (v3)
Before this patch, intelInitScreen2 set DRIScreen::api_mask with the hacky
heuristic below:

    if (gen >= 3)
        api_mask = GL | GLES1 | GLES2;
    else
        api_mask = 0;

This hack was likely broken on gen2 (i830), but I don't care enough to
properly investigate. It appears that every EGLConfig on i830 has
EGL_RENDERABLE_TYPE=0, and thus eglCreateContext will never succeed.
Anyway, moving on to living drivers...

With the arrival of EGL_OPENGL_ES3_BIT_KHR, this heuristic is now
insufficient. We must enable the GLES3 bit if and only if the driver is
capable of creating a GLES3 context. This requires us to determine the
maximum context version supported by the hardware/driver for
each api *during initialization of intel_screen*.

Therefore, this patch adds four new fields to intel_screen which indicate
the maximum supported context version for each api:
  max_gl_core_version
  max_gl_compat_version
  max_gl_es1_version
  max_gl_es2_version

The api mask is now correctly set as:

    api_mask = GL;
    if (max_gl_es1_version > 0)
        api_mask |= GLES1;
    if (max_gl_es2_version > 0)
        api_mask |= GLES2;
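
The same rule as a runnable Python sketch.  The bit values are illustrative
placeholders, not the real __DRI_API_* numbers.  The GLES3 branch is an
assumption about how a follow-up could use max_gl_es2_version (with versions
encoded as 10 * major + minor); this patch itself sets only the first three
bits:

```python
# Illustrative bit values only.
API_GL, API_GLES1, API_GLES2, API_GLES3 = 1, 2, 4, 8


def compute_api_mask(max_gl_es1_version, max_gl_es2_version):
    """Build the API mask from the per-API maximum context versions."""
    mask = API_GL  # desktop GL is enabled unconditionally (see v2 below)
    if max_gl_es1_version > 0:
        mask |= API_GLES1
    if max_gl_es2_version > 0:
        mask |= API_GLES2
    # Assumption: ES 3.0 contexts come from the ES2 family at version 30.
    if max_gl_es2_version >= 30:
        mask |= API_GLES3
    return mask
```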

Tested against gen6 with piglit egl-create-context-verify-gl-flavor.
Verified that this patch does not change the set of exposed EGL context
flavors.

v2:
  - Replace the if-tree on gen with a switch, for Ian.
  - Unconditionally enable the DRI_API_OPENGL bit, for Ian.

v3:
  - Drop max gl version to 1.4 on gen3 if !has_occlusion_query,
    because occlusion queries entered core in 1.5. For Ian.

v4:
  - Drop ES2 version back to 2.0 due to rebase (Ian).

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 13:44:29 -08:00
Matt Turner
112e302481 mesa: Return INVALID_ENUM for glReadPixels(..., GL_DEPTH_*, ...) on ES 3
I'm not sure if this is the correct fix. The
_mesa_es_error_check_format_and_type function (used above in the ES 1
and 2 cases) was originally added for glTexImage checking and allows
GL_DEPTH_STENCIL/GL_UNSIGNED_INT_24_8 combinations. Using it in ES 3
causes other tests to regress.

Fixes es3conform's packed_depth_stencil_error test.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 13:34:00 -08:00
Matt Turner
2906e2034c mesa: Return INVALID_OPERATION when type is known but not allowed
INVALID_ENUM is for when the type is simply not known.

Fixes part of es3conform's packed_depth_stencil_error test.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 13:34:00 -08:00
Matt Turner
c8901133a4 mesa: Allow HALF_FLOAT in glVertexAttribPointer for GLES3
Fixes es3conform's half_float_max_vertex_dimensions and
half_float_textures tests.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 13:34:00 -08:00
Matt Turner
cbef5371f6 mesa: Reject texture-only formats as renderbuffer formats in ES 3
ES 3 specifies some formats as texture-only (i.e., not available for
renderbuffers).

See the "Required Texture Formats" section (pg 126) of the ES 3 spec.

v2: Allow RED and RG float rendering in core profiles.  The check used to
be (version > 30) || (compat profile w/extensions).  Just deleting
(version > 30) broke 3.0+ core profiles.

Fixes es3conform's color_buffer_unsupported_format test.

Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 13:34:00 -08:00
Kenneth Graunke
8907b6a8e4 mesa: Fix default value of BUFFER_ACCESS_FLAGS.
According to both the GL 3.0 and ES 3.0 specifications (table 2.7 for GL
and table 2.8 for ES), the default value of BUFFER_ACCESS_FLAGS is
supposed to be zero.

Note that there are two related quantities: the obsolete BUFFER_ACCESS
enum and the new BUFFER_ACCESS_FLAGS bitfield.

BUFFER_ACCESS can only be GL_READ_ONLY, GL_WRITE_ONLY, or GL_READ_WRITE;
BUFFER_ACCESS_FLAGS can easily represent all three via GL_MAP_WRITE_BIT,
GL_MAP_READ_BIT, and their logical or.  It also supports more flags.

Thus, Mesa only stores the bitfield, and simply computes the old enum
when queried, via simplified_access_mode(bufObj->AccessFlags).

The tricky part is that, while BUFFER_ACCESS_FLAGS defaults to 0,
BUFFER_ACCESS defaults to GL_READ_WRITE for desktop [GL 3.0, table 2.8]
and GL_WRITE_ONLY_OES for ES [the GL_EXT_map_buffer_range extension].

Mesa tried to implement this by setting the default AccessFlags to
GL_MAP_READ_BIT | GL_MAP_WRITE_BIT on desktop, and GL_MAP_WRITE_BIT on
ES.  But in all specifications, it needs to be 0.

This patch moves that logic into simplified_access_mode(): when
AccessFlags == 0, it now returns GL_READ_WRITE for desktop and
GL_WRITE_ONLY for ES 1/2.  (BUFFER_ACCESS doesn't exist on ES 3.0,
so it's irrelevant there.)

With that in place, it changes the AccessFlags default to 0.
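
A Python model of the described behavior (the real function is C and is not
quoted here, so the body below is a sketch; the is_desktop_gl parameter is a
hypothetical stand-in for the API check).  The enum values are the standard
GL ones:

```python
GL_MAP_READ_BIT = 0x0001
GL_MAP_WRITE_BIT = 0x0002
GL_READ_ONLY = 0x88B8
GL_WRITE_ONLY = 0x88B9
GL_READ_WRITE = 0x88BA


def simplified_access_mode(access_flags, is_desktop_gl):
    """Derive the legacy BUFFER_ACCESS enum from the AccessFlags bitfield."""
    both = GL_MAP_READ_BIT | GL_MAP_WRITE_BIT
    if (access_flags & both) == both:
        return GL_READ_WRITE
    if access_flags & GL_MAP_READ_BIT:
        return GL_READ_ONLY
    if access_flags & GL_MAP_WRITE_BIT:
        return GL_WRITE_ONLY
    # Neither bit set (the new default of 0): the answer is API-dependent.
    return GL_READ_WRITE if is_desktop_gl else GL_WRITE_ONLY
```

This keeps the stored default at 0 while still reporting the per-API
BUFFER_ACCESS default when queried.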

Fixes three es3conform tests:
- copy_buffer_defaults
- map_buffer_range_modify_indices
- pixel_buffer_object_default_parameters

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 13:34:00 -08:00
Kenneth Graunke
f3db20da1a mesa: Rework crazy error code rules in glDrawBuffers().
Perhaps most importantly, this patch adds comments quoting the relevant
spec paragraphs above each error condition.

It also makes three changes:
- For FBOs, GL_COLOR_ATTACHMENTm where m >= MaxDrawBuffers is supposed
  to generate INVALID_OPERATION (not INVALID_ENUM).
- Constants that refer to multiple buffers (such as FRONT, BACK, LEFT,
  RIGHT, and FRONT_AND_BACK) are supposed to generate INVALID_OPERATION,
  not INVALID_ENUM.
- In ES 3.0, for FBOs, buffers[i] must be NONE or GL_COLOR_ATTACHMENTi
  or else INVALID_OPERATION occurs.  (This is a new restriction.)
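
The three rules, restricted to the user-FBO case, can be sketched in Python
as follows.  Buffers are modeled as plain strings or ("COLOR_ATTACHMENT", m)
tuples, the function name is hypothetical, and returning INVALID_ENUM for
unrecognized values is an assumption not stated in the list above:

```python
MULTI_BUFFER_NAMES = {"FRONT", "BACK", "LEFT", "RIGHT", "FRONT_AND_BACK"}


def draw_buffers_error_for_fbo(buffers, max_draw_buffers, is_es3):
    """Return the GL error name glDrawBuffers would raise for a user FBO."""
    for i, buf in enumerate(buffers):
        if buf == "NONE":
            continue
        # Constants naming multiple buffers: INVALID_OPERATION.
        if buf in MULTI_BUFFER_NAMES:
            return "INVALID_OPERATION"
        if isinstance(buf, tuple) and buf[0] == "COLOR_ATTACHMENT":
            m = buf[1]
            # Attachment index beyond MaxDrawBuffers: INVALID_OPERATION.
            if m >= max_draw_buffers:
                return "INVALID_OPERATION"
            # ES 3.0 only: buffers[i] must be NONE or COLOR_ATTACHMENTi.
            if is_es3 and m != i:
                return "INVALID_OPERATION"
            continue
        # Assumption: anything else is an unknown enum.
        return "INVALID_ENUM"
    return "NO_ERROR"
```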

Fixes es3conform's draw-buffers-api test.

v2: The error path was missing a "return" like all the other error
paths.  Also, we may as well call it glDrawBuffers in the error message
since the ARB suffix doesn't exist in ES 3.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 13:34:00 -08:00
Carl Worth
d9d857e24f i965: Force even an empty query to flush all previous queries.
The specification requires that query results are processed in order, (when
one query result is returned, all previous queries of the same type must also be
available). The implementation was failing this requirement in the case of
BeginQuery and EndQuery with no intervening drawing, (the result would be made
available immediately without flushing previous queries).

This fixes the following es3conform test:

	occlusion_query_query_order

as well as the following piglit test:

	occlusion_query_order

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 13:34:18 -08:00
Carl Worth
c0b768ffee meta: Allow meta operations to pause/resume an active occlusion query
This allows for avoiding the occlusion query erroneously accumulating results
during the meta operation. This functionality is made conditional on a new
MESA_META_OCCLUSION_QUERY bit so that meta-operations which should generate
fragments can continue to get the current behavior.

The implementation of glClear is specifically augmented to request the flag
since glClear is specified to not generate fragments.

This fixes the following es3conform tests:

	occlusion_query_draw_occluded.test
	occlusion_query_clear
	occlusion_query_custom_framebuffer
	occlusion_query_stencil_test
	occlusion_query_discarded_fragments

As well as the following piglit test:

	occlusion_query_meta_no_fragments

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 13:34:18 -08:00
Carl Worth
3dd76f7168 queryobj: Add EverBound flag, making IsQuery() return false before BeginQuery()
This flag allows for the specified behavior that GenQueries reserves a name,
but does not associate an object with it until BeginQuery. We allocate the
object immediately with the new EverBound flag set to false, and then set the
flag to true at the time of BeginQuery.

This allows us to implement a conformant IsQuery function by checking the
state of the new EverBound flag.
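
The lifetime rule can be modeled in a few lines of Python.  The class and
method names are hypothetical and only mirror the GenQueries / BeginQuery /
IsQuery entry points:

```python
class QueryObject:
    def __init__(self, name):
        self.name = name
        self.ever_bound = False  # flips to True on the first BeginQuery


class QueryTable:
    def __init__(self):
        self._objects = {}
        self._next_name = 1

    def gen_query(self):
        # Reserve a name; the object is allocated right away but unbound.
        name = self._next_name
        self._next_name += 1
        self._objects[name] = QueryObject(name)
        return name

    def begin_query(self, name):
        self._objects[name].ever_bound = True

    def is_query(self, name):
        # A reserved name only counts as a query once it has been bound.
        obj = self._objects.get(name)
        return obj is not None and obj.ever_bound
```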

This fixes the following es3conform tests:

	occlusion_query_genqueries
	occlusion_query_is_query_nonzero

and the following piglit test:

	occlusion_query_lifetime

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 13:34:01 -08:00
Carl Worth
c7df9c0e12 Update comment to specify actual text being referenced from the specification.
The reference to "correct, see spec" was a bit too vague to be useful,
(particularly since the language being referenced here changes between OpenGL
3.1 and OpenGL 4.3).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-15 13:10:58 -08:00
Brian Paul
133383f77a docs: minor updates to VMware guest driver docs
The DRM's --enable-vmwgfx-experimental-api flag isn't needed anymore.
2013-01-15 13:55:24 -07:00
Marek Olšák
7660529c44 r300g: fix and cleanup flushing before clearing CMASK, ZMASK, and HIZ 2013-01-15 21:50:34 +01:00
Marek Olšák
ca2c28859e r300g: implement MSAA compression and fast MSAA color clear
These are optimizations which make MSAA a lot faster.

The MSAA work is complete with this commit.  (except for enablement of AA
optimizations for RGBA16F, for which a patch is ready and waiting until
the kernel CS checker fix lands)

MSAA can't be made any faster as far as hw programming is concerned.

The catch is only one process and one colorbuffer can use the optimizations
at a time.  There usually is only one MSAA colorbuffer, so it shouldn't be
an issue.

Also, there is a limit on the size of MSAA colorbuffer resolution in terms
of megapixels.  If the limit is surpassed, the AA optimizations are disabled.
The limit is:
- 1 Mpix on low-end and some mid-level chipsets (1024x768 and 1280x720)
- 2 Mpix on some mid-level chipsets (1600x1200 and 1920x1080)
- 3 or 4 Mpix on high-end chipsets (2048x1536 or 2560x1600, respectively)
It corresponds to the number of raster pipes (= GB pipes) available; each pipe
can hold 1 Mpix of AA compression data.

If it's enabled, the driver prints to stdout:
  radeon: Acquired access to AA optimizations.
2013-01-15 21:48:58 +01:00
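The resolution limit in the commit above amounts to a simple capacity check. The sketch below is illustrative only: the function name is hypothetical, and it assumes that "1 Mpix" means 1024 x 1024 pixels, which is consistent with the resolutions listed in the message (1920x1080 fits in two pipes and 2048x1536 fits exactly in three) but is not stated there.

```c
#include <stdbool.h>

/* Assumption: one pipe holds AA compression data for 1024 * 1024 pixels. */
#define PIXELS_PER_PIPE (1024ULL * 1024ULL)

/* Hypothetical helper: true if a colorbuffer of the given size fits within
 * the compression capacity of num_gb_pipes raster pipes. */
static bool aa_optimizations_fit(unsigned width, unsigned height,
                                 unsigned num_gb_pipes)
{
    unsigned long long pixels = (unsigned long long)width * height;
    return pixels <= (unsigned long long)num_gb_pipes * PIXELS_PER_PIPE;
}
```

Under that assumption, a 1600x1200 buffer does not fit on a one-pipe chipset, and 2560x1600 needs four pipes.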
Marek Olšák
1dfe8eead9 gallium/util: add a half float array to util_color
For convenient packing into half floats.
2013-01-15 21:48:49 +01:00
Tom Stellard
7824ab8070 Revert "targets/opencl: Link against libgallium.la instead of libgallium.a"
This reverts commit 4148a29ed8.

This is a work-around for bug:
https://bugs.freedesktop.org/show_bug.cgi?id=59334

We really should be linking against libgallium.la instead of
libgallium.a, but until we can figure why linking against libgallium.la
causes runtime failures in clover we will continue to link against
libgallium.a

Acked-by: Andreas Boll <andreas.boll.dev@gmail.com>
Tested-by: Aaron Watry <awatry@gmail.com>
2013-01-15 18:04:51 +00:00
Marek Olšák
f26eb36e8b st/mesa: use a generic varying to pass the clear color to the FS
The color varying may have reduced precision or be even clamped.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-15 16:47:18 +01:00
Marek Olšák
355d463f73 gallium/util: fix glClear with MRT by making the FS write to all cbufs
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-15 16:47:18 +01:00
Marek Olšák
2cd1407d2d st/mesa: fix InternalFormat for Z24X8 window-system buffers
This probably doesn't fix anything, but it's good to be consistent.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-15 16:47:18 +01:00
Marek Olšák
d489c90a68 st/mesa: remove dead conditional in Clear
I think the conditional always evaluates to false.

If I understand the code in core Mesa correctly, depthBits or stencilBits
is 0 if the depth or stencil renderbuffer is NULL, respectively.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-15 16:47:18 +01:00
Marek Olšák
f94ea25a4a st/mesa: simplify conditionals in Clear
just check depth and stencil separately, the outcome is the same

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-15 16:47:18 +01:00
Marek Olšák
01b7124788 st/mesa: fix glClear with different colormask for each colorbuffer
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-15 16:47:18 +01:00
Marek Olšák
f04dd3d003 gallium: remove PIPE_CAP_DEPTHSTENCIL_CLEAR_SEPARATE
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-15 16:47:18 +01:00
Marek Olšák
cabe4fbb85 st/mesa: always assume separate depth and stencil clear is supported
All drivers implement it now.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-15 16:47:18 +01:00
Marek Olšák
16a30e201e softpipe: implement separate depth-stencil clear
The CAP is going away.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-15 16:47:18 +01:00
Jon TURNEY
77dd50d020 libgl-xlib: softpipe and llvmpipe aren't mutually exclusive at link time
Since automake changes, softpipe and llvmpipe are mutually exclusive at link
time.  This doesn't make much sense to me as we can choose between them at
run-time using GALLIUM_DRIVER.

Creating library file: .libs/libGL.dll.a
.libs/xlib.o: In function `sw_screen_create_named':
/jhbuild/checkout/mesa/mesa/src/gallium/targets/libgl-xlib/../../../../src/gallium/auxiliary/target-helpers/inline_sw_helper.h:35:
undefined reference to `_softpipe_create_screen'

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-15 10:43:33 +00:00
Jordan Justen
8443b59a5b pack: handle GL_RGB+GL_UNSIGNED_INT_2_10_10_10_REV case
For floats, if GL_RGB is the source, then alpha should be set to
1.0F.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-14 20:52:19 -08:00
Jordan Justen
80784066cc glformats: allow GL_RGB+GL_UNSIGNED_INT_2_10_10_10_REV for GLES2/3
This format is allowed by the GL_EXT_texture_type_2_10_10_10_REV
extension.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-14 20:52:09 -08:00
Jordan Justen
ba34c1d570 copyteximage: Use Driver's AllocTextureImageBuffer instead of TexImage
Call Driver.AllocTextureImageBuffer rather than calling
Driver.TexImage with NULL data, format=GL_NONE and type=GL_NONE.

This avoids setting ctx->Unpack, which can lead to incorrectly
trying to upload data.

The GLES3 GTF program's packed_pixels_pbo test was triggering
an error for i965 with the previous code.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-14 20:50:31 -08:00
Jordan Justen
91ec623bd2 copyteximage: update signed vs. unsigned format matching
Fixes issues with gles3-gtf

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-14 20:48:14 -08:00
Jordan Justen
161a3cd9fc framebuffer: add _mesa_get_read_renderbuffer
This returns the current read renderbuffer for the specified
format type.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-14 20:48:14 -08:00
Matt Turner
f5a3d151b0 teximage: use _mesa_es3_error_check_format_and_type for GLES3
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-14 20:48:13 -08:00
Matt Turner
9cfcac4528 glformats: add _mesa_es3_error_check_format_and_type
This function checks for ES3 compatible
format/type/internalFormat/dimension combinations.

[jordan.l.justen@intel.com: additional tweaks for gles3-gtf]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-14 20:47:59 -08:00
Jordan Justen
cf300eaab6 fbobject: don't allow LUMINANCE/INTENSITY/ALPHA fbo on ES/Core
v2:
 * Only allow on GL Legacy contexts

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-14 20:47:02 -08:00
Jordan Justen
275620c4b2 fbobject: add VERBOSE=api message for renderbuffer storage
Add API debug trace message for:
 * glRenderbufferStorage
 * glRenderbufferStorageMultisample

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-14 20:47:02 -08:00
Jordan Justen
7f867851f5 fbobject: add VERBOSE=api message for check framebuffer status
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-14 20:47:02 -08:00
Brian Paul
1c9833ba70 util: add new primitive types to pipe_prim_names[] array
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-01-14 18:15:41 -07:00
Brian Paul
f5eb1b123f st/mesa: add some simple buffer/draw debug code
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-14 18:15:41 -07:00
Brian Paul
cb6ef3d112 libgl-xlib: link with -lrt
Fixes a runtime error:

glxgears: symbol lookup error: /home/brian/mesa/lib/gallium/libGL.so.1: undefined symbol: clock_gettime

v2: use $(CLOCK_LIB) and $(PTHREAD_LIBS) per Andreas Boll.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-14 18:15:41 -07:00
Carl Worth
258453716f i965: Avoid blending with destination alpha when RB format has no alpha bits
The hardware does not support a render target without an alpha channel.
So when the user creates a render buffer with no alpha channel, there actually
is storage available for alpha internally. It requires special care to
avoid these unwanted alpha bits from causing any problems.

Specifically, when blending, and when the blend factors would read the
destination alpha values, this commit coerces the blend factors to instead be
either 0 or 1 as appropriate.

A similar fix was made for pre-gen6 hardware in commit eadd9b8e and this
commit shares the fixup function written by Ian then.

This commit fixes the following es3conform test:

	rgb8_rgba8_rgb

As well as the following piglit (sub) tests:

	EXT_framebuffer_object/fbo-blending-formats/3
	EXT_framebuffer_object/fbo-blending-formats/GL_RGB
	EXT_framebuffer_object/fbo-blending-formats/GL_RGB8

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-14 15:35:37 -08:00
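The blend-factor coercion described in the commit above can be sketched as a small mapping. The enum and function names here are hypothetical stand-ins rather than the driver's own, and the handling of the saturate factor is an assumption derived from substituting a destination alpha of 1 into its definition.

```c
/* Hypothetical stand-ins for the GL blend-factor enums. */
enum blend_factor {
    FACTOR_ZERO,
    FACTOR_ONE,
    FACTOR_SRC_ALPHA,
    FACTOR_DST_ALPHA,
    FACTOR_ONE_MINUS_DST_ALPHA,
    FACTOR_SRC_ALPHA_SATURATE
};

/* For a render buffer without alpha bits, destination alpha must read as
 * 1.0, so every factor that depends on it is replaced by a constant. */
static enum blend_factor fix_factor_for_missing_alpha(enum blend_factor f)
{
    switch (f) {
    case FACTOR_DST_ALPHA:
        return FACTOR_ONE;   /* Ad = 1 */
    case FACTOR_ONE_MINUS_DST_ALPHA:
        return FACTOR_ZERO;  /* 1 - Ad = 0 */
    case FACTOR_SRC_ALPHA_SATURATE:
        return FACTOR_ZERO;  /* assumption: min(As, 1 - Ad) = 0 when Ad = 1 */
    default:
        return f;            /* factors that never read destination alpha */
    }
}
```

Factors that do not involve destination alpha pass through unchanged, so the fixup can be applied unconditionally whenever the buffer format has no alpha bits.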
Kristian Høgsberg
6d4d4b00dd egl/wayland: Implement EGL_EXT_buffer_age
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
2013-01-14 16:39:15 -05:00
Kristian Høgsberg
90804e886d egl/wayland: Pull color buffers from dri2_surf->color_buffers pool
We used to keep the color buffers in the dri_buffers array and
swap __DRI_BUFFER_BACK_LEFT and __DRI_BUFFER_FRONT_LEFT around there
and swap third_buffer in in case we needed to triple buffer.  That
gets a little fidgety with all the swaps, so let's use the
color_buffers pool like the gbm platform does.  We track the color buffers,
their corresponding wl_buffer and locked status here and just plug
a free one into dri2_surf->buffers when we need to.

This is a nice clean-up in itself, but it also sets us up to track
buffer age in the color_buffers structs.

Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
2013-01-14 16:39:15 -05:00
Johannes Obermayr
dc473c5f0a gallium/svga: Make sure -std=gnu99 is set.
This is a work-around until configure.ac stops touching CFLAGS.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-01-14 13:32:13 -08:00
Damien Lespiau
164a04ed1b build: Fix the documented default value of --with-gallium-drivers
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-14 09:11:44 -08:00
Marek Olšák
e3e1ffb252 r300g: set a dummy vertex buffer in context_create
so that the driver doesn't crash if an app doesn't set any vertex buffers.
2013-01-14 05:58:06 +01:00
Marek Olšák
5330c5a248 r300g: fix MSAA resolve to an untiled texture
RB3D_DEBUG_CTL doesn't help, so I resolve to a tiled temporary texture and
then blit it to the destination one, which we also do in other situations.
2013-01-14 03:12:01 +01:00
Marek Olšák
e102b665e6 r300g: advertise MSAA support for the RGB10_A2 format on r500
It seems to be working just fine.
2013-01-14 03:12:01 +01:00
Marek Olšák
5fc83101fb r300g: allow separate depth and stencil clear
The handling of the CAP is broken in st/mesa anyway. Let's just kill it.

This commit pretty much enables fast Z clear for FBOs with Z24S8.
The driver falls back to clearing with a quad if the fast clear cannot be
used. It can still do fast color clear, for example.
2013-01-14 03:11:43 +01:00
Marek Olšák
e93a5c2b86 r300g: if both Z and stencil are present, they must be fast-cleared together 2013-01-14 03:11:42 +01:00
Marek Olšák
631c631cbf r300g: allow HiZ with a 16-bit zbuffer 2013-01-14 03:11:42 +01:00
Marek Olšák
3f584c211a r300g: random hyperz cleanups 2013-01-14 03:11:42 +01:00
Marek Olšák
4d6faf5175 r300g: kill the X.Org state tracker target
This won't ever be made default and we don't need it anyway.

We should also consider doing this for other drivers.
2013-01-14 03:11:41 +01:00
Johannes Obermayr
6acef6c5f7 xmlpool: Fix out-of-tree builds.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-13 12:38:50 +01:00
Johannes Obermayr
40a9b0f5d2 gtest: Build it only for 'make check'.
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-13 12:38:44 +01:00
Johannes Obermayr
ebcabb88cf tests: AM_CPPFLAGS must include $(top_srcdir) instead of $(top_builddir).
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-13 12:38:38 +01:00
Adam Jackson
06f3a1f792 r200: Fix probable thinko in r200EmitArrays
Effectively this path would always assert.  Move the break statement to
the (probable) intended place.

Note: This is a candidate for the stable branches.

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-01-13 12:38:31 +01:00
Nathan Schulte
1b8adabe2e target/dri-swrast: fix for nonstandard LLVM prefix
Include LLVM_LDFLAGS when building with LLVM.  Fixes the following build
errors:
  CXXLD  swrast_dri.la
  /usr/bin/ld: cannot find -lLLVMR600CodeGen
  /usr/bin/ld: cannot find -lLLVMR600Desc
  /usr/bin/ld: cannot find -lLLVMR600Info
  /usr/bin/ld: cannot find -lLLVMR600AsmPrinter

Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-13 12:22:15 +01:00
Andreas Boll
9da454f295 targets/dri-r600: Force c++ linker in all cases
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59282
2013-01-13 12:19:29 +01:00
Andreas Boll
e09a5846cd glapi/gen: remove an obsolete comment from Makefile.am
Glapi gets generated at build time.

See commit:
0ce0f7c0c8
mesa: Remove the generated glapi from source control, and just build it.
2013-01-13 00:55:37 +01:00
Matt Turner
92ce9c38fd Remove hacks for static Makefiles
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - don't remove compatibility with scripts for the old build system

v3: Andreas Boll <andreas.boll.dev@gmail.com>
    - remove more obsolete hacks

v4: Andreas Boll <andreas.boll.dev@gmail.com>
    - add a previously removed TOP variable to fix vgapi build
2013-01-13 00:55:37 +01:00
Kenneth Graunke
8c80bdc4a8 i965: Move program_id to intel_screen instead of brw_context.
According to bug #54524, I regressed oglconform's multicontext test
when I reenabled the fragment shader precompile.

However, these test cases only passed by miraculous coincidence.  We
assign each fragment program a unique ID (brw_fragment_program::id which
becomes brw_wm_prog_key::program_string_id) which we obtain by storing a
per-context counter.

The test case uses GLX context sharing to access the same fragment
program from two different contexts.  This means that we share a program
cache.  Before the precompile, if both contexts happened to use the same
shaders in the same order, we'd obtain the same program_string_ids (by
virtue of doing the same computation twice).  However, the more likely
scenario is that they completely disagree on program_string_id.

This meant that we'd have two completely different fragment shaders in
the cache with the same ID, tricking us into thinking they were the same
(aside from NOS), so we'd render using the wrong program.

This patch implements a simple fix suggested by Eric: it moves the
global counter out of brw_context and into intel_screen, which is shared
across all contexts.  A mutex protects it from concurrent access.

This is also the first direct usage of pthreads in the i965 driver.

Fixes 10 subcases of oglconform's multicontext test.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54524
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-12 15:36:21 -08:00
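The fix described in the commit above, a counter shared by all contexts and guarded by a mutex, might look roughly like this. The struct layouts and function names are hypothetical simplifications; only the idea of moving the counter from the context to the screen and protecting it with a pthread mutex comes from the commit message.

```c
#include <pthread.h>

/* Hypothetical, simplified screen: one instance is shared by every
 * context, so IDs drawn from it are unique across contexts. */
struct screen {
    pthread_mutex_t program_id_mutex;
    unsigned program_id;
};

/* Hypothetical context: it no longer owns a counter, only a pointer
 * to the shared screen. */
struct context {
    struct screen *screen;
};

static void screen_init(struct screen *s)
{
    pthread_mutex_init(&s->program_id_mutex, NULL);
    s->program_id = 0;
}

/* Hand out the next ID; the mutex keeps concurrent callers from
 * receiving the same value. */
static unsigned get_new_program_id(struct screen *s)
{
    pthread_mutex_lock(&s->program_id_mutex);
    unsigned id = s->program_id++;
    pthread_mutex_unlock(&s->program_id_mutex);
    return id;
}
```

Two contexts that share a program cache now draw their IDs from the same sequence, so two different programs can no longer end up with the same ID.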
Kenneth Graunke
2c4ad502ce i965: Fix build error with clang.
Technically, variable-length arrays are a required feature of C99,
made optional in C11, and not actually part of C++ whatsoever.

Gcc allows using them in C++ unless you specify -pedantic, and Clang
appears to allow them for simple/POD types.

exec_list is arguably POD, since it doesn't have virtual methods, but I
can see why Clang would be like "meh, it's a C++ struct, say no", seeing as
it's meant to support C99.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58970
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-01-12 15:35:40 -08:00
Kenneth Graunke
fea648db08 i965/fs: Don't mix integer/float immediates in i2b handling.
The simulator gets very angry about our i2b code:

cmp.ne(16)      g3<1>D          g2<0,1,0>D      0F

We can't mix integer DWord and float types.  The only reason to use 0F
here was to share code with f2b.  Split it and use 0D instead.

While we don't believe anything bad will actually happen because of
this, it's nice to fix the warnings and easy enough to do.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-12 15:35:38 -08:00
Kenneth Graunke
4a6753926f i965: Add an INTEL_DEBUG=no16 option.
Often when debugging, I don't want to see SIMD16 shaders.  It makes
INTEL_DEBUG=vs/fs output much easier to read, especially when a program
dumps many shaders.  Plus, I also want to verify that SIMD8 works before
even considering SIMD16.

v2: Fix the likeliness check (caught by Chris and Eric).

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-12 15:35:38 -08:00
Alexandre Demers
67ef755908 configure.ac: Fixing common dri dependency when using dri state tracker
Fixes a regression caused by b587a7595e

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59261
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-13 00:13:16 +01:00
Fredrik Höglund
ac1c2b8238 st/mesa: set ctx->Const.UniformBufferOffsetAlignment
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-01-12 22:08:09 +01:00
José Fonseca
a3dd1ff45f scons: Update for xmlpool/options.h generation. 2013-01-12 19:00:04 +00:00
Johannes Obermayr
6bca283ad5 nv50/nvc0: Build codegen in nv50.
This is required to make libnv50 independent of libnvc0.
2013-01-12 17:14:04 +01:00
Pekka Vuorela
09a00a141f winsys/sw/wayland: Fix build to properly use wayland cflags
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59281
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-12 16:02:30 +01:00
Jordan Justen
3c3a2b51b8 texformat: use MESA_FORMAT_ARGB2101010 with GL_UNSIGNED_INT_2_10_10_10_REV
Choose MESA_FORMAT_ARGB2101010 when storing
GL_RGBA + GL_UNSIGNED_INT_2_10_10_10_REV or
GL_RGB + GL_UNSIGNED_INT_2_10_10_10_REV.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-12 01:46:12 -08:00
Jordan Justen
53e0f32efe texstore argb2101010: merge GL_RGBA and GL_RGB cases
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-12 01:46:02 -08:00
Jordan Justen
f1c5b5d15e glformats: support _mesa_bytes_per_pixel for 2101010+GL_RGB
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-12 01:45:01 -08:00
Jordan Justen
89e07ccf61 glformats: add _mesa_base_format_component_count
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-12 01:44:25 -08:00
Jordan Justen
6d63b6e503 glformats: add functions to detect signed/unsigned integer types
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-12 01:40:47 -08:00
Jordan Justen
2ace406b1f unpack: support unpacking MESA_FORMAT_ARGB2101010
Note: This is a candidate for the stable branches.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-12 01:40:01 -08:00
Ian Romanick
8af7d3ce9f mesa: Add extension tracking for {ARB,OES}_get_program_binary
The ARB_get_program_binary spec says "OpenGL 3.0 is required."  The
nearly identical OES_get_program_binary extension is available for
OpenGL ES 2.0, so I don't see how / why OpenGL 3.0 is a requirement for
the ARB version.  Let's just enable whenever GL_ARB_shader_objects is
available.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 18:13:53 -08:00
Ian Romanick
31ca0c8be3 mesa: Add GetProgramiv support for GL_PROGRAM_BINARY_LENGTH
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 18:13:53 -08:00
Ian Romanick
50c5fac4e2 mesa: Add Get support for PROGRAM_BINARY_FORMATS and NUM_PROGRAM_BINARY_FORMATS
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 18:13:53 -08:00
Ian Romanick
fefd03e16c mesa: Add tracking for GL_PROGRAM_BINARY_RETRIEVABLE_HINT state
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 18:13:53 -08:00
Ian Romanick
8e2e670007 mesa: Emit errors for geometry shader enums when ARB_gs4 is not supported
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 18:13:53 -08:00
Ian Romanick
e3f273e2f4 glapi: Emit dispatch for {ARB,OES}_get_program_binary
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 18:13:53 -08:00
Ian Romanick
11b49dbd05 glapi: Remove spurious space from end of extension name
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 18:13:52 -08:00
Ian Romanick
3fe747a0fe mesa: Add stub implementations of glGetProgramBinary and glProgramBinary
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 18:13:52 -08:00
Ian Romanick
ec41349a78 mesa: Fix the naming of _mesa_ProgramParameteriARB
After recent changes in the XML, the dispatch generators will expect
this function to be named _mesa_ProgramParameteri.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 18:13:52 -08:00
Ian Romanick
bb7f1a9ae8 glapi: Reorder and clean up some of the includes and comments
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 18:13:52 -08:00
Ian Romanick
a002902168 mesa: Fix GL_SHADER_BINARY_FORMATS query
There were two bugs here.  First, this and several other queries were
not available in a desktop GL context with GL_ARB_ES2_compatibility.
Second, GL_NUM_SHADER_BINARY_FORMATS returns zero, but
GL_SHADER_BINARY_FORMATS writes one element of data to the buffer.  If
NUM is zero, no data should be written.

Fixes piglit test 'arb_get_program_binary-overrun shader'.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 18:13:52 -08:00
Dave Airlie
4f1e037acf docs/GL3.txt: update GL3 status for r600g.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-01-12 00:19:18 +00:00
Dave Airlie
5039ad6bc5 r600g: fix warnings for htile va
This fixes a warning about mismatched types.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-01-11 23:19:11 +00:00
Dave Airlie
d23aa65001 r600g: texture buffer object + glsl 1.40 enable support (v2)
This adds TBO support to r600g, and with GLSL 1.40 enabled,
we now get 3.1 core profiles advertised for r600g.

The r600/700 implementation is a bit different from the evergreen one,
as r6/7 hw lacks vertex fetch swizzles. So we implement it by passing 5
constants per sampler to the shader, the shader uses the first 4 as masks
for each component and the 5th as the alpha value to OR in.

Now TXQ is also broken so we have to pass a constant for the buffer size,
on evergreen we just pass this, on r6/7 we pass it as the 6th element
in the const info buffer.

v1.1: drop return as DDX doesn't use a texture type
v2: add r600/700 support.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-01-11 22:31:54 +00:00
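The r600/700 workaround described in the commit above, four per-component masks plus an alpha value to OR in, can be illustrated on the CPU as below. This is only a sketch of the arithmetic: the function name, the array layout, and the example constants are hypothetical, and the real work happens in generated shader code.

```c
#include <stdint.h>

/* Hypothetical CPU model of the shader-side fixup: consts[0..3] are
 * per-component masks, consts[4] is the alpha value to OR in. */
static void apply_buffer_constants(uint32_t texel[4], const uint32_t consts[5])
{
    for (int i = 0; i < 4; i++)
        texel[i] &= consts[i];  /* drop components the format lacks */
    texel[3] |= consts[4];      /* supply the default alpha */
}
```

For a three-component float format, for example, one could pass all-ones masks for the first three components, a zero mask for alpha, and 0x3f800000 (the bit pattern of 1.0f) as the fifth constant.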
Dave Airlie
77c10225ee r600g: uniform buffer object support
This adds 12 more constant buffers for use as UBOs,
along with adding relative constant fetching for 2D indices.

This with GLSL 1.40 enabled passes all the same tests as softpipe
on my evergreen system.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-01-11 22:31:54 +00:00
Dave Airlie
199eea4a4b r600: always export a position from vertex shader
This fixes piglit glsl-1.40-tf-no-position from gpu hanging on my rv635
at least.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-01-11 22:31:54 +00:00
Carl Worth
cc5fc8bf2f glcpp: Add tests for line continuation
First we test that line continuations are honored within a comment, (as
recently changed in glcpp), then we test that line continuations can be
disabled via an option within the context. This is tested via the new support
for a test-specific command-line option passed to glcpp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 13:55:41 -08:00
Carl Worth
2483039aca glcpp: Rewrite line-continuation support to act globally.
Previously, we were only supporting line-continuation backslash characters
within lines of pre-processor directives, (as per the specification). With
OpenGL 4.2 and GLES3, line continuations are now supported anywhere within a
shader.

While changing this, also fix a bug where the preprocessor was ignoring
line continuation characters when a line ended in multiple backslash
characters.

The new code is also more efficient than the old. Previously, we would
perform a ralloc copy at each newline. We now perform copies only at each
occurrence of a line-continuation.

This commit fixes the line-continuation.vert test in piglit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 13:55:41 -08:00
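A global line-continuation pass of the kind described in the commit above can be sketched as an in-place filter. This is not the glcpp implementation: the function name is hypothetical, the sketch handles only a backslash immediately followed by '\n', and it makes no attempt to preserve line numbers for later diagnostics.

```c
#include <stddef.h>

/* Remove every backslash-newline pair from s in place and return the
 * new length.  Only a backslash directly before the newline counts, so
 * in a run of several backslashes only the last one is consumed. */
static size_t remove_line_continuations(char *s)
{
    char *out = s;
    const char *in = s;

    while (*in != '\0') {
        if (in[0] == '\\' && in[1] == '\n') {
            in += 2;         /* skip the continuation entirely */
        } else {
            *out++ = *in++;  /* copy everything else unchanged */
        }
    }
    *out = '\0';
    return (size_t)(out - s);
}
```

Because the pass runs over the whole shader text rather than only over directive lines, a continuation inside a comment or an ordinary declaration is joined as well.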
Carl Worth
63d156900f glcpp: Add --disable-line-continuations argument to standalone glcpp
This will allow testing of disabled line-continuation on a case-by-case basis,
(with the option communicated to the preprocessor via the GL context).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 13:55:41 -08:00
Carl Worth
409dee5eac glcpp: Allow test-specific arguments for standalone glcpp tests
This will allow the test exercising disabled line continuations to arrange
for the --disable-line-continuations argument to be passed to the standalone
glcpp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 13:55:41 -08:00
Carl Worth
0206ea3751 glcpp: Honor the GL context's DisableGLSLLineContinuations option
And simply don't call into the function that removes line continuations.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 13:55:41 -08:00
Carl Worth
f8987f9972 glcpp: Accept pointer to GL context rather than just the API version
As the preprocessor becomes more sophisticated and gains more optional
behavior, it's easiest to just pass the GL context pointer to it so that
it can examine any fields there that it needs to (such as API version,
or the state of any driconf options, etc.).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 13:55:41 -08:00
Carl Worth
4b00ecebd0 drirc: Add quirk to disable GLSL line continuations for Savage2
This application is known to contain shaders that:

1. Have a stray backslash as the last character of comment lines
2. Have a declaration immediately following that line

Hence, interpreting that backslash as a line continuation causes the
declaration to be hidden and the shader fails to compile.  Fortunately, the
shaders also:

3. Do not have any other intentional line-continuation characters

So disabling line continuations entirely for the application fixes this
problem without causing any other breakage.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 13:55:41 -08:00
Carl Worth
c0c9c9966f driconf: Add a new option: disable_glsl_line_continuations
This is to enable a quirk for Savage2 which includes a shader with a stray '\'
at the end of a comment line. Interpreting that backslash as a line
continuation will break the compilation of the shader, so we need a way to
disable this.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 13:55:41 -08:00
Carl Worth
c6c575c69a driconf: Add proper dependency for compiling .mo files from .po files.
Previously this was happening unconditionally, leading to some excessive
rebuilding/relinking during builds.

Note that the .po files are not automatically updated due to changes to the
t_options.h file. Instead, translators should continue to use "make po"
manually. This is because after new strings are merged into the existing .po
file, manual work is still required by translators to ensure that the
translations are correct.
2013-01-11 13:54:54 -08:00
Carl Worth
b587a7595e driconf: Add translation-generation to build system, don't track generated files
Previously, the xmlpool directory had a lone Makefile to assist people in
manually invoking a deep make in order to update the translations in
options.h. We can observe that this wasn't happening in fact, (new
translations had been added to de.po without being generated into options.h,
and new options had been manually added directly to options.h rather than to
t_options.h).

Prevent both of these problems from occurring in the future by automatically
generating options.h as part of the standard build of mesa.

For this, the generated options.h is now removed from version control, (along
with Makefile in favor of Makefile.am).

[chadv: Port the Autotools changes to Android.]
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 13:54:54 -08:00
Carl Worth
8888c6f8e5 driconf: Fix German translations by removing a couple of bogus backslashes
As can be seen, many other translation strings already include a single
apostrophe just fine without any escaping. This strangely-escaped apostrophe
was causing a build failure ("invalid escape sequence") resulting in no "de"
translations in the final options.h file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 13:54:54 -08:00
Chad Versace
ec04617fb3 driconf: Fix gen_xmlpool.py script to allow running from any directory
The gen_xmlpool.py script would work correctly only when executed from the
directory that contained the script. This shortcoming was due to some
hard-coded paths in the script.

In order to easily invoke the script from the Android build system, we
must be able to execute the script from an arbitrary directory. To enable
that, this patch replaces the two hard-coded paths with new command line
arguments.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
2013-01-11 13:54:54 -08:00
Carl Worth
11c3013610 driconf: Add some translations which have been available, but were not compiled
These translations have existed in the de.po file, but were not in the
generated options.h file. This was fixed by simply running "make options.h".

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 13:54:54 -08:00
Carl Worth
bc50f02bc7 driconf: Add option definitions to source file, not generated target
For the last two most-recently-added driconf options, their definition was
manually added to options.h, a file which is intended to be automatically
generated, (as part of support for translated driconf option
descriptions). This means that these options would be eliminated if the
generation step were performed again.

Fix this by correctly adding the definitions of these options to t_options.h,
(the file used as input to the generator), and not the options.h file, which
is generated.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 13:54:54 -08:00
Tom Stellard
4148a29ed8 targets/opencl: Link against libgallium.la instead of libgallium.a 2013-01-11 21:40:42 +00:00
Tom Stellard
4fc11fa3c8 drivers/radeon: Don't link against libgallium.la
This fixes several duplicate symbol errors.

libllvmradeon is a simple helper library.  If it requires symbols in
other libraries, this should be taken care of by the gallium target that
uses it (e.g. libr600.la)
2013-01-11 21:40:42 +00:00
Matt Turner
93d5fe1478 mesa: Use _mesa_lookup_enum_by_nr in tex*_error_check
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-11 11:45:10 -08:00
Ian Romanick
42ed81a7c3 mesa/es3: Add support for GL_PRIMITIVE_RESTART_FIXED_INDEX
This requires some derived state.  The cut vertex used is either the
value specified by glPrimitiveRestartIndex or it's hard-coded to ~0.
The derived state gl_array_attrib::_RestartIndex captures this value.
In addition, the derived state gl_array_attrib::_PrimitiveRestart is set
whenever either gl_array_attrib::PrimitiveRestart or
gl_array_attrib::PrimitiveRestartFixedIndex is set.

v2: Use _mesa_is_gles3.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-11 10:57:25 -08:00
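
The derived state described in the commit above can be sketched roughly as follows. This is a model of the commit message, not the driver code: the function name `derive_restart_state`, the 32-bit width chosen for `~0`, and the rule that the fixed index takes precedence when both enables are set are assumptions made for illustration.

```python
# Assumption: ~0 is modelled as an all-ones 32-bit unsigned value.
FIXED_RESTART_INDEX = 0xFFFFFFFF


def derive_restart_state(primitive_restart, primitive_restart_fixed_index,
                         restart_index):
    """Return (_PrimitiveRestart, _RestartIndex) from the API-visible state."""
    # Restart is active whenever either enable is set.
    enabled = primitive_restart or primitive_restart_fixed_index
    # Assumption: the fixed index is used when the fixed-index enable is set,
    # otherwise the value given to glPrimitiveRestartIndex is used.
    index = FIXED_RESTART_INDEX if primitive_restart_fixed_index else restart_index
    return enabled, index


print(derive_restart_state(True, False, 7))
print(derive_restart_state(False, True, 7))
```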
Ian Romanick
00d8ad81ff i965: Add support for GL_ANY_SAMPLES_PASSED_CONSERVATIVE
We just treat this as an alias for GL_ANY_SAMPLES_PASSED.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-11 10:57:25 -08:00
Ian Romanick
886979a097 mesa/es3: Add support for GL_ANY_SAMPLES_PASSED_CONSERVATIVE query target
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-11 10:57:25 -08:00
Ian Romanick
8d47fe2960 mesa/es3: Allow transpose matrix uniforms in GLES3
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-11 10:57:25 -08:00
Matt Turner
5e918a3825 glcpp: Reject token pasting operator in GLES
The GLSL ES 3.0 spec (Section 12.17) says:
"GLSL ES 1.00 removed token pasting and other functionality."

NOTE: This is a candidate for the stable branches.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
2013-01-11 10:57:25 -08:00
Carl Worth
93e719ba4d glcpp: Make undefined macros illegal in #if and #elif for GLES3
Simply emitting a nicely-formatted error message if any undefined macro is
encountered in a parser context expecting an expression.

With this commit, the following piglit test now passes:

	spec/glsl-es-3.00/compiler/undefined-macro.vert

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-11 10:57:25 -08:00
Carl Worth
77e1bc9f1d glcpp: Add a flag to the parser state to indicate GLES.
This can be triggered either by creation of a GLES context (with
api == API_OPENGLES2) or else by a #version directive with version
value 100 or with a string of "es" following the version value.

There's no behavioral change with this commit—just preparation for ES-specific
behavior in the preprocessor in the future.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-11 10:57:25 -08:00
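
The two triggers named in the commit above can be sketched like this. The function `is_gles`, its parameters, and the regular expression are hypothetical; the real preprocessor sets the flag inside its parser rather than by scanning text this way.

```python
import re

# Matches "#version <number> [<profile>]" on a line of its own.
VERSION_RE = re.compile(r"^\s*#\s*version\s+(\d+)(?:\s+(\w+))?\s*$")


def is_gles(source, api_is_gles2=False):
    """Return True if the shader should be preprocessed with ES rules."""
    # Trigger 1: the context itself was created as a GLES context.
    if api_is_gles2:
        return True
    # Trigger 2: a #version directive with value 100 or an "es" suffix.
    for line in source.splitlines():
        match = VERSION_RE.match(line)
        if match:
            version, profile = int(match.group(1)), match.group(2)
            return version == 100 or profile == "es"
    return False


print(is_gles("#version 300 es\nvoid main() {}"))
print(is_gles("#version 130\nvoid main() {}"))
```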
Andreas Boll
100440d1b1 glcpp: Add back tests/*.out to .gitignore
Accidentally removed in ac2793cf3e
2013-01-11 11:49:33 +01:00
Knut Andre Tidemann
8da2dab31d targets/egl-static: fix link failure to libwayland-drm
Fixes the following build error:
  CXXLD    egl_gallium.la
g++: error: ../../../../src/egl/wayland/wayland-drm/.libs/.libs/libwayland-drm.a: No
such file or directory

Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-01-11 10:56:36 +01:00
Johannes Obermayr
d98716233e targets/dri-swrast: Force c++ linker in all cases.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=59226

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2013-01-11 10:20:42 +01:00
Roland Scheidegger
babab28760 llvmpipe: fix clearing integer color buffers
We get int/uint clear color value in this case, and util_pack_color can't
handle these formats at all (even if it could, float input color isn't what
we want).
Pass through the color union appropriately and handle the packing ourselves
(as I couldn't think of a good generic util solution).
This gets piglit fbo_integer_precision_clear and
fbo_integer_readpixels_sint_uint from the ext_texture_integer test group from
segfault to pass (which only leaves fbo-blending from that group not working).

v2: fix up comments
2013-01-10 18:10:20 -08:00
Roland Scheidegger
5785f22d23 gallivm: fix border color for integer textures
Need to bitcast the float border color (luckily we already get
the color as int just disguised as float).
Fixes piglit texwrap GL_EXT_texture_integer bordercolor.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-10 18:02:01 -08:00
Roland Scheidegger
31884946b5 gallivm: more integer texture format fetch fixes
Change the texel type to int/uint instead of float throughout the sampling
code which makes it easier to catch errors (as llvm will complain about wrong
types if we mistakenly treat these values as real floats somewhere).
This should also get things like e.g. sampler swizzles (for unused channels)
right.
This fixes piglit texture_integer_glsl130 test.
Border color not working (crashing) yet.
(These formats are not exposed yet in llvmpipe.)

v2: couple cleanups according to José's comments

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-01-10 18:02:01 -08:00
Matt Turner
5eeedb852b build: mapi/glapi/gen: Use BUILT_SOURCES 2013-01-10 22:01:31 +01:00
Matt Turner
ac2793cf3e Clean up .gitignore files 2013-01-10 22:01:31 +01:00
Matt Turner
3ed95dc073 Remove MESA_PIC_FLAGS macro 2013-01-10 22:01:31 +01:00
Matt Turner
f1d229ee94 Remove installmesa 2013-01-10 22:01:31 +01:00
Matt Turner
b585c0059c Remove minstall 2013-01-10 22:01:31 +01:00
Matt Turner
424f200881 Remove checking for makedepend 2013-01-10 22:01:31 +01:00
Matt Turner
c977e61fe2 Remove gallium's unused Makefile.template 2013-01-10 22:01:31 +01:00
Matt Turner
74d105174b Remove gbm's unused Makefile.template 2013-01-10 22:01:31 +01:00
Matt Turner
ae352ccb90 Remove gallium targets' Makefile.{dri,vdpau,xorg,xvmc} 2013-01-10 22:01:31 +01:00
Matt Turner
8f8e85e703 Remove mklib 2013-01-10 22:01:31 +01:00
Matt Turner
41349a4253 Remove unused glsl Makefile.template 2013-01-10 22:01:31 +01:00
Matt Turner
c87474089d Remove configs/{current,default} 2013-01-10 22:01:30 +01:00
Andreas Boll
cb4d5021c6 gallium/tests/unit: Convert to automake 2013-01-10 22:01:30 +01:00
Andreas Boll
59088a2c2c gallium/tests/trivial: Convert to automake 2013-01-10 22:01:30 +01:00
Matt Turner
45270fb0fd targets/pipe-loader: Convert to automake
C++ linking (controlled by the nodist_EXTRA idiom) is needed
unconditionally for:
	nouveau (uses C++ in the driver)
	r300 (since LLVM is always required)
	radeonsi (since LLVM is always required)
	swrast (if building llvmpipe)

and conditionally (depending on whether LLVM is enabled) for
	i915
	r600
	vmwgfx

and never needed for swrast (softpipe).

Unfortunately, automake seems to *always* link with C++ if nodist_EXTRA
is specified, even inside a false conditional. Not sure if this is a
bug, but it does seem to be weird behavior.

v2: Johannes Obermayr <johannesobermayr@gmx.de>
    - Fix some undefined symbols.

v3: Johannes Obermayr <johannesobermayr@gmx.de>
    - Install pipe_* to $(libdir)/gallium-pipe.

v4: Johannes Obermayr <johannesobermayr@gmx.de>
    - Build it only once on --enable-gallium-gbm / --enable-opencl.
2013-01-10 22:01:30 +01:00
Matt Turner
53c62d3fb0 targets/gbm: Convert to automake 2013-01-10 22:01:30 +01:00
Matt Turner
cdee0e8084 targets/egl-static: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
    - Add missing Automake.inc

v3: Johannes Obermayr <johannesobermayr@gmx.de>
    - Fix linking.

v4: Andreas Boll <andreas.boll.dev@gmail.com>
    - Port changes from ff574d653b
	  gallium/egl-static: Fix unresolved symbol 'clock_gettime'
2013-01-10 22:01:28 +01:00
Matt Turner
d53901c67c targets/xa-vmwgfx: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:11 +01:00
Matt Turner
af6a2e4f82 targets/xvmc-softpipe: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - add missing xvmc state tracker to _LIBADD variable

v3: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:11 +01:00
Matt Turner
45bf6aa617 targets/xvmc-r600: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Add missing xvmc state tracker to _LIBADD variable

v3: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:11 +01:00
Matt Turner
c2371ccdac targets/xvmc-r300: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Add missing xvmc state tracker to _LIBADD variable

v3: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:11 +01:00
Matt Turner
b173b16cba targets/xvmc-nouveau: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Add missing xvmc state tracker to _LIBADD variable

v3: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:11 +01:00
Matt Turner
0b132df3ad build: AC_SUBST XVMC_MAJOR/MINOR 2013-01-10 22:01:11 +01:00
Matt Turner
f2bf0cdf72 targets/xorg-radeonsi: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:11 +01:00
Matt Turner
ff5ab73d53 targets/xorg-r600: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:10 +01:00
Matt Turner
7d451ba83a targets/xorg-r300: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:10 +01:00
Matt Turner
f984d128c5 targets/xorg-nouveau: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:10 +01:00
Matt Turner
1a4349125b targets/xorg-i915: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:10 +01:00
Matt Turner
7f24483e3d targets/vdpau-softpipe: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Add missing vdpau state tracker to _LIBADD variable

v3: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:10 +01:00
Matt Turner
e3b2160a1f targets/vdpau-radeonsi: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Add missing vdpau state tracker to _LIBADD variable

v3: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:10 +01:00
Matt Turner
98c051355f targets/vdpau-r600: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Add missing vdpau state tracker to _LIBADD variable

v3: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:10 +01:00
Matt Turner
7e0d6ff6d7 targets/vdpau-r300: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Add missing vdpau state tracker to _LIBADD variable

v3: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:10 +01:00
Matt Turner
d0df9e82c7 targets/vdpau-nouveau: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Add missing vdpau state tracker to _LIBADD variable

v3: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:10 +01:00
Matt Turner
184b2f0f68 build: AC_SUBST VDPAU_MAJOR/MINOR 2013-01-10 22:01:10 +01:00
Matt Turner
0470fb4efe targets/libgl-xlib: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:10 +01:00
Matt Turner
c14c801a03 targets/dri-vmwgfx: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:10 +01:00
Matt Turner
b3068d87cb targets/dri-swrast: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:10 +01:00
Matt Turner
dd65729057 targets/dri-radeonsi: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:10 +01:00
Matt Turner
ab07ae05a3 targets/dri-r600: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:10 +01:00
Matt Turner
b570f1fc31 targets/dri-r300: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:09 +01:00
Matt Turner
6ed9f9f232 targets/dri-nouveau: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:09 +01:00
Matt Turner
2cd5bf7536 targets/dri-i915: Convert to automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Provide compatibility with scripts for the old Mesa build system
2013-01-10 22:01:09 +01:00
Matt Turner
880063f5bc build: Update drivers/Makefile.am to use LTLIBRARIES 2013-01-10 22:01:09 +01:00
Matt Turner
c236fa82c2 state_trackers/xvmc/test: Convert to automake 2013-01-10 22:01:09 +01:00
Matt Turner
984562d630 state_trackers/xvmc: Convert to automake 2013-01-10 22:01:09 +01:00
Matt Turner
69089ef5b0 Remove xvmc hack 2013-01-10 22:01:09 +01:00
Matt Turner
405a9dabe2 state_trackers/xorg: Convert to automake 2013-01-10 22:01:09 +01:00
Matt Turner
2ad2603467 state_trackers/xa: Convert to automake 2013-01-10 22:01:09 +01:00
Matt Turner
68c0311996 state_trackers/vega: Convert to automake 2013-01-10 22:01:09 +01:00
Matt Turner
d2ca32e332 state_trackers/vdpau: Convert to automake 2013-01-10 22:01:09 +01:00
Matt Turner
1ba5d8ac40 Remove vdpau hack 2013-01-10 22:01:09 +01:00
Matt Turner
083dcdf809 state_trackers/glx: Convert to automake 2013-01-10 22:01:09 +01:00
Matt Turner
c0b9081dc5 state_trackers/gbm: Convert to automake 2013-01-10 22:01:09 +01:00
Matt Turner
8443efdf2c state_trackers/egl: Convert to automake 2013-01-10 22:01:09 +01:00
Matt Turner
9b35758926 state_trackers: Convert to automake 2013-01-10 22:01:08 +01:00
Matt Turner
5089072419 Remove state_tracker/Makefile
Unneeded and unnecessary.
2013-01-10 22:01:08 +01:00
Matt Turner
9f38a1c871 build: Don't build pipebuffer
It's already built by src/gallium/auxiliary.
2013-01-10 22:01:08 +01:00
Tom Stellard
0dcb9ae0d9 radeon/llvm: Convert to Automake
v2: Johannes Obermayr <johannesobermayr@gmx.de>
    Fix some undefined symbols.

v3: Johannes Obermayr <johannesobermayr@gmx.de>
    Build it -shared to fix egl_gallium.so on r600/radeonsi builds.
2013-01-10 22:01:08 +01:00
Matt Turner
2cbb94b3ce build: Add automake conditionals for gallium drivers 2013-01-10 22:01:08 +01:00
Matt Turner
f4b1f2807f state_trackers/dri/sw: Convert to automake 2013-01-10 22:01:08 +01:00
Matt Turner
d988481d58 state_trackers/dri/drm: Convert to automake 2013-01-10 22:01:08 +01:00
Matt Turner
2ff51cd639 state_trackers/dri: Convert to automake 2013-01-10 22:01:08 +01:00
Matt Turner
44653c0a0e winsys/sw/xlib: Convert to automake 2013-01-10 22:01:08 +01:00
Matt Turner
24c2fe94a2 winsys/sw/wrapper: Convert to automake 2013-01-10 22:01:08 +01:00
Matt Turner
1d0ef53e7b winsys/sw/wayland: Convert to automake 2013-01-10 22:01:08 +01:00
Matt Turner
1c9fb3c5b5 winsys/sw/null: Convert to automake 2013-01-10 22:01:08 +01:00
Matt Turner
5c4ade53a4 winsys/sw/fbdev: Convert to automake 2013-01-10 22:01:08 +01:00
Matt Turner
a6b3cd1349 winsys/sw/dri: Convert to automake 2013-01-10 22:01:08 +01:00
Matt Turner
b4beea6418 winsys/sw: Convert to automake 2013-01-10 22:01:07 +01:00
Matt Turner
2b5a1c0299 svga/winsys/drm: Convert to automake 2013-01-10 22:01:07 +01:00
Matt Turner
77fc30b57d nouveau/winsys/drm: Convert to automake 2013-01-10 22:01:07 +01:00
Matt Turner
da2d98fac7 radeonsi: Convert to automake
Can't use LTLIBRARIES here yet, since libradeon isn't converted.
2013-01-10 22:01:07 +01:00
Matt Turner
c35cddd134 nvc0: Convert to automake 2013-01-10 22:01:07 +01:00
Matt Turner
2a28353ca0 nv50: Convert to automake 2013-01-10 22:01:07 +01:00
Matt Turner
36066770bf nv30: Convert to automake 2013-01-10 22:01:07 +01:00
Matt Turner
1cf66321f9 nouveau: Convert to automake 2013-01-10 22:01:07 +01:00
Matt Turner
0a42131f3b svga: Convert to automake 2013-01-10 22:01:07 +01:00
Matt Turner
f781d4c60d softpipe: Convert to automake 2013-01-10 22:01:07 +01:00
Matt Turner
960cbd8b78 llvmpipe: Convert to automake 2013-01-10 22:01:07 +01:00
Matt Turner
b51cdfa64b rbug: Convert to automake 2013-01-10 22:01:07 +01:00
Matt Turner
3bfe7c2111 i915/winsys/sw: Convert to automake 2013-01-10 22:01:07 +01:00
Matt Turner
266d639b91 i915/winsys/drm: Convert to automake 2013-01-10 22:01:07 +01:00
Matt Turner
7d5496ab3b i915g: Convert to automake 2013-01-10 22:01:07 +01:00
Matt Turner
533130a5bb r600g: Use gallium automake include file 2013-01-10 22:01:06 +01:00
Tom Stellard
80d290d47a libgallium: Convert to automake 2013-01-10 22:01:06 +01:00
Tom Stellard
047fe04750 trace: Convert to automake 2013-01-10 22:01:06 +01:00
Tom Stellard
34a6150188 radeon/winsys: Convert to automake 2013-01-10 22:01:06 +01:00
Matt Turner
8dc4048b3b r300g: Link ralloc.c and register_allocate.c into separate library 2013-01-10 22:01:06 +01:00
Tom Stellard
e04413cbb0 r300g: Build a libtool archive 2013-01-10 22:01:06 +01:00
Tom Stellard
c07c2696c7 r300g: Use gallium automake include file
[mattst88] v2: Remove ARCH_FLAGS/OPT_FLAGS
2013-01-10 22:01:06 +01:00
Tom Stellard
c040fe102c gallium: Add common automake include file
v2: Matt Turner <mattst88@gmail.com>
    Remove ARCH_FLAGS/OPT_FLAGS

v3: Johannes Obermayr <johannesobermayr@gmx.de>
    Add -I$(top_srcdir)/include to GALLIUM_CFLAGS
2013-01-10 22:01:06 +01:00
Matt Turner
9bf0d49abe automake: Convert Gallium target and winsys 2013-01-10 22:01:06 +01:00
Kristian Høgsberg
4e42e569dd egl/gbm: Implement EGL_EXT_buffer_age
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-10 15:58:05 -05:00
Matt Turner
0ae81b8422 mesa: Rename and wire-up GetInteger64i_v
The function was named badly and wasn't in the dispatch table,
making it hard to find.

Fixes transform_feedback2_states and gets a few other transform
feedback tests closer to working in es3conform.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:51 -08:00
Matt Turner
1a3ffbf378 mesa: Correct glGet{Boolean,Integer}i_v names
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:51 -08:00
Matt Turner
ec8ee91923 mesa: Allow GL_DEPTH_STENCIL_ATTACHMENT in ES 3
Fixes framebuffer_srgb_default_encoding_fbo and 5 packed_depth_stencil
tests from es3conform.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-10 10:57:51 -08:00
Chad Versace
75b963c095 mesa: Support more glGet enums for ES3
For glGetIntegerv, add support for the following in an OpenGL ES 3.0
context:
    GL_MAJOR_VERSION
    GL_MINOR_VERSION
    GL_NUM_EXTENSIONS

See Table 6.29 of the OpenGL ES 3.0 spec.

Fixes error GL_INVALID_ENUM in piglit egl-create-context-verify-gl-flavor,
testcase for OpenGL ES 3.0.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-10 10:57:51 -08:00
Matt Turner
532e05a9d0 mesa: Support querying GL_MAX_ELEMENT_INDEX in ES 3
The ES 3 spec says that the minimum allowable value is 2^24-1, but the
GL 4.3 and ARB_ES3_compatibility specs require 2^32-1, so return 2^32-1.

Fixes es3conform's element_index_uint_constants test.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:51 -08:00
Matt Turner
92855727f1 mesa: De-duplicate ES2 queries
From GL/GLES/GL_CORE and GLES2 -> GL/GL_CORE/GLES2.

Yes, we really were exposing ES2_compatibility queries on ES 1.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:51 -08:00
Matt Turner
5bb1827d95 mesa: Allow glGet* queries on EXT_texture_lod_bias data in ES 3
Fixes the remaining 4 texture_lod_bias failures in es3conform.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:51 -08:00
Matt Turner
e895d368e1 mesa: Allow glGet* queries on EXT_framebuffer_blit data in ES 3
Fixes 2 framebuffer_blit es3conform tests.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:51 -08:00
Matt Turner
d9948e49d3 mesa: Allow glGet* queries on ARB_fragment/vertex_shader data in ES 3
Fixes uniform_buffer_object_implementation_dependent_limits in
es3conform.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:51 -08:00
Matt Turner
d93c1b62f8 mesa: Allow glGet* queries on ARB_framebuffer_object data in ES 3
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:51 -08:00
Matt Turner
914415a63f mesa: Allow glGet* queries on ARB_transform_feedback2 data in ES 3
Fixes the transform_feedback2_init_defaults test from es3conform.

The ES 3 spec lists these as TRANSFORM_FEEDBACK_PAUSED and
TRANSFORM_FEEDBACK_ACTIVE.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:51 -08:00
Matt Turner
3d0e4eb134 mesa: Allow glGet* queries on EXT_transform_feedback data in ES 3
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:51 -08:00
Matt Turner
3f1217607a mesa: Allow glGet* queries on ARB_sync data in ES 3
Fixes the sync_coverage_max_server_wait_timeout test in es3conform.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:51 -08:00
Matt Turner
0a8866e751 mesa: Allow glGet* queries of EXT_pbo data in ES 3
Fixes pixel_buffer_object_default_binding and gets other tests in
es3conform closer to passing.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:51 -08:00
Matt Turner
be68dae374 mesa: Allow glGet* queries of select ARB_ubo data in ES 3
Fixes 5 uniform_buffer_object tests in es3conform.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:51 -08:00
Matt Turner
0cc018526f Add ES 3 handling to get.c and get_hash_generator.py
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:50 -08:00
Matt Turner
57616159aa glapi: Move ARB_base_instance to the correct location
It's #107, it shouldn't be added after the #116 comment.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:50 -08:00
Matt Turner
a5ed966069 mesa/tests: Add ARB_ES3_compatibility enums
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:50 -08:00
Matt Turner
910a0bfe5b glapi: Add enums for ARB_ES3_compatibility
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-10 10:57:50 -08:00
Quentin Glidic
c5e9396424 mesa/program: Fix both Classic and Gallium build
Follow-up for 9078441072 and
3a5ad21cd3

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57044
Tested-by: Fabio Pedretti <fabio.ped@libero.it>
Tested-by: Brad King <brad.king@kitware.com>
2013-01-10 10:34:56 -08:00
Andreas Boll
f416b382d6 configure.ac: fix typo in error message 2013-01-10 18:41:53 +01:00
Marek Olšák
2f89949b66 r300g: don't set sample positions to the pixel center if MSAA is disabled
but an MSAA resource is bound. This effectively makes the MSAA disable switch
not affect rasterization, but it still affects the alpha-to-one and
alpha-to-coverage states. This hardware just lacks a proper MSAA disable
switch.

This fixes graphics corruption in sauerbraten.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59194
2013-01-10 15:37:10 +01:00
Paul Berry
9a07b6bd74 intel: Clean up confusion between logical and physical surface dimensions.
In most cases, the width, height, and depth of the physical surface
used by the driver to implement a texture or renderbuffer is equal to
the logical width, height, and depth exposed to the client through
functions such as glTexImage3D().  However, there are two exceptions:
cube maps (which have a physical depth of 6 but a logical depth of 1)
and multisampled renderbuffers (which have larger physical dimensions
than logical dimensions to allow multiple samples per pixel).

Previous to this patch, we accounted for the difference between
physical and logical surface dimensions at inconsistent places in the
call graph (multisampling was accounted for in
intel_miptree_create_for_renderbuffer(), and cubemaps were accounted
for in intel_miptree_create_internal()).  As a result, it wasn't
always clear, when calling a miptree creation function, whether
physical or logical dimensions were needed.  Also, we weren't
consistent about storing logical dimensions in the intel_mipmap_tree
structure (we only did so in the
intel_miptree_create_for_renderbuffer() code path, and we did not
store depth).

This patch refactors things so that intel_miptree_create_internal() is
responsible for converting logical to physical dimensions and for
storing both the physical and logical dimensions in the
intel_mipmap_tree structure.  As a result, all miptree creation
functions interpret their arguments as logical dimensions, and both
physical and logical dimensions are always available to functions that
work with intel_mipmap_trees.

In addition, it renames the fields in intel_mipmap_tree used to store
the dimensions, so that it is clear from the name whether physical or
logical dimensions are being referred to.

This should fix the following bugs:

- When creating a separate stencil surface for a depthstencil cubemap,
  we would erroneously try to convert the depth from 1 to 6 twice,
  resulting in an assertion failure.

- When creating an MCS buffer for compressed multisampling, we used
  physical dimensions instead of logical dimensions, resulting in
  wasted memory.

In addition, this should considerably simplify the implementation of
ARB_texture_multisample, because it moves the code to compute the
physical size of multisampled surfaces out of renderbuffer-only code.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-09 13:10:47 -08:00
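
A rough sketch of the logical-to-physical conversion described in the commit above, with a single function owning the conversion so callers only ever pass logical dimensions. The name `physical_dimensions` and the `sample_scale` parameter are hypothetical; the actual growth factors and alignment for multisampled surfaces depend on the hardware layout and are not given in the commit message.

```python
def physical_dimensions(width, height, depth, is_cube_map=False,
                        sample_scale=(1, 1)):
    """Convert logical dimensions to physical ones (illustrative rules only)."""
    # Hypothetical per-axis growth used to make room for multiple samples.
    scale_x, scale_y = sample_scale
    if is_cube_map:
        # Cube maps: logical depth of 1, physical depth of 6 (one per face).
        if depth != 1:
            raise ValueError("cube maps have a logical depth of 1")
        depth = 6
    return width * scale_x, height * scale_y, depth


print(physical_dimensions(64, 64, 1, is_cube_map=True))
print(physical_dimensions(100, 50, 1, sample_scale=(2, 2)))
```

Because the conversion happens in exactly one place, the cube-map depth cannot be multiplied twice, which is the class of bug the commit message mentions.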
Paul Berry
a5f87e8843 intel: Add a force_y_tiling parameter to intel_miptree_create().
This allows intel_miptree_alloc_mcs() to force Y tiling for the MCS
buffer.  Previously we accomplished this by the hack of passing
INTEL_MSAA_LAYOUT_CMS as the msaa_layout parameter, but that parameter
is going away soon.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-09 13:10:30 -08:00
Paul Berry
8f15f19696 intel: Move compute_msaa_layout earlier in file.
No functional change.  This patch moves the compute_msaa_layout()
function earlier in intel_mipmap_tree.c so that it can be used by
other functions in that file.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-09 13:10:14 -08:00
Vinson Lee
b37930f309 r600g: Fix memory leak in r600_bytecode_add_vtx.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-01-09 11:11:46 -05:00
Marek Olšák
f8651dea4e r300g: optionally log MSAA resources to stderr
Set: RADEON_DEBUG=msaa
2013-01-09 16:47:10 +01:00
Marek Olšák
1385c353cf r300g: fix the GPU name in the renderer string
Broken by ca474f98f2.
2013-01-09 16:40:37 +01:00
Marek Olšák
4f2d9a8f52 r300g: fix CS checker errors caused by emit_dsa_state
size is 10 on r500 and 8 on r300
2013-01-09 16:40:37 +01:00
Johannes Obermayr
959e83d650 clover: Adapt libclc's INCLUDEDIR and LIBEXECDIR to make use of the newly introduced libclc.pc.
Tom Stellard:
  -Keep --with-libclc-path and mark it deprecated.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-01-08 20:32:47 -05:00
Ian Romanick
ed3f237e09 glsl: Don't add structure fields to the symbol table
I erroneously added this back in January 2011 in commit 88421589.
Looking at the commit message, I have no idea why I added it.  It only
added non-array structure fields to the symbol table, so array structure
fields are treated correctly.

Fixes piglit tests structure-and-field-have-same-name.vert and
structure-and-field-have-same-name-nested.vert.  It should also fix
WebGL conformance tests shader-with-non-reserved-words.

NOTE: This is a candidate for the stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57622
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-08 13:09:31 -08:00
Kenneth Graunke
a5265f7536 i965/fs: Fix struct vs. class in acp_entry definitions. 2013-01-08 13:09:31 -08:00
Marek Olšák
a70e5e2b94 r600g: implement buffer copying using CP DMA for R7xx, Evergreen, Cayman
R6xx doesn't work - the issue seems to be with flushing (sometimes
the destination buffer contains garbage). There are no hangs, so we're good.

R7xx doesn't seem to have any alignment restriction despite our initial
thinking. Everything just works.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-01-08 21:58:28 +01:00
Marek Olšák
2d3d0d3a5a st/mesa: fix possible MSVC build error v2
https://bugs.freedesktop.org/show_bug.cgi?id=59143

Using GLubyte as per Brian's suggestion.
2013-01-08 21:53:13 +01:00
Paul Berry
c35abcd1b0 glsl: Pack flat "varyings" of mixed types together.
This patch enhances the varying packing code so that flat varyings of
uint, int, and float types can be packed together.

We accomplish this in lower_packed_varyings.cpp by making the type of
all flat varyings ivec4, and then using information-preserving type
conversions (e.g. ir_unop_bitcast_f2i) to convert all other types to
ints.

The varying_matches::compute_packing_class() function is updated to
reflect the fact that varying packing no longer needs to segregate
varyings of different base types.

Fixes piglit test varying-packing-mixed-types.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: Split lower_packed_varyings_visitor::bitwise_assign into
pack/unpack variants.
2013-01-08 09:18:14 -08:00
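
The information-preserving conversion mentioned in the commit above can be illustrated with a small model: every component is reinterpreted as a 32-bit integer bit pattern, packed into 4-wide integer slots, and reinterpreted back on the other side. The helper names and the zero padding are assumptions for illustration; the real lowering pass operates on compiler IR, not on Python values.

```python
import struct

# struct format codes for the three flat base types handled here.
FORMATS = {"float": "<f", "uint": "<I", "int": "<i"}


def to_int_bits(kind, value):
    """Reinterpret a value's 32 bits as a signed int (no numeric conversion)."""
    return struct.unpack("<i", struct.pack(FORMATS[kind], value))[0]


def from_int_bits(kind, bits):
    """Reinterpret a signed 32-bit pattern as the original type."""
    return struct.unpack(FORMATS[kind], struct.pack("<i", bits))[0]


def pack_flat_varyings(components):
    """Pack (kind, value) pairs into 4-wide integer slots, padding with 0."""
    bits = [to_int_bits(kind, value) for kind, value in components]
    bits += [0] * (-len(bits) % 4)
    return [tuple(bits[i:i + 4]) for i in range(0, len(bits), 4)]


components = [("float", 1.5), ("uint", 4000000000), ("int", -3)]
slots = pack_flat_varyings(components)
unpacked = [from_int_bits(kind, slots[0][i])
            for i, (kind, _) in enumerate(components)]
print(slots, unpacked)
```

Since the bits are reinterpreted rather than converted, values of different base types can share one slot and still round-trip exactly.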
Paul Berry
18720555dd glsl: Prohibit structs and bools from being used as "varyings".
The GLSL 1.30 spec only allows vertex shader outputs and fragment
shader inputs ("varyings" in pre-GLSL-1.30 parlance) to be of type
int, uint, float, or vectors, matrices, or arrays thereof.  Bools,
bvec's, and structs are prohibited.  (Integral varyings were
prohibited prior to GLSL 1.30).

Previously, Mesa only performed this check on variables declared with
the "varying" keyword, and it always performed the check according to
the pre-GLSL-1.30 rules.  As a result, bools and structs were allowed
to slip through, provided they were declared using the new in/out
syntax.

This patch modifies the error check so that it occurs after "varying"
is converted to "in/out", and corrects it to properly account for GLSL
version.

Fixes piglit tests:
  in-bool-prohibited.frag
  in-bvec2-prohibited.frag
  in-bvec3-prohibited.frag
  in-bvec4-prohibited.frag
  in-struct-prohibited.frag
  out-bool-prohibited.vert
  out-bvec2-prohibited.vert
  out-bvec3-prohibited.vert
  out-bvec4-prohibited.vert
  out-struct-prohibited.vert

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-08 09:09:21 -08:00
Paul Berry
c33be485c5 glsl: Plumb through is_parameter to apply_type_qualifier_to_variable()
This patch adds logic to allow the ast_to_hir function
apply_type_qualifier_to_variable() to tell whether it is acting on a
variable declaration or a function parameter.  This will allow it to
correctly interpret the meaning of "out" and "in" keywords (which have
different meanings in those two contexts).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-08 09:09:18 -08:00
Paul Berry
4b11b57ab4 glsl: Separate varying linking code to its own file.
linker.cpp is getting pretty big, and we're about to add even more
varying packing code, so split out the linker code that concerns
varyings to its own file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-08 09:09:09 -08:00
Paul Berry
8706395f25 mesa: Add ALIGN() macro to main/macros.h.
Previously this macro existed in 3 separate places, some inside the
intel driver and some outside of it.  It makes more sense to have it
in main/macros.h

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-08 09:08:57 -08:00
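For readers unfamiliar with this kind of macro, here is a minimal sketch of a round-up-to-alignment helper. The name ALIGN_POT and the wrapper align_u32 are chosen for illustration and are not taken from main/macros.h; the sketch assumes the alignment is a nonzero power of two.

```c
#include <stdint.h>

/* Round value up to the next multiple of alignment.
 * Only valid when alignment is a nonzero power of two, because it relies
 * on (alignment - 1) being a mask of the low bits. */
#define ALIGN_POT(value, alignment) \
    (((value) + (alignment) - 1) & ~((alignment) - 1))

/* Typed wrapper so the arithmetic is done on a known unsigned width. */
uint32_t align_u32(uint32_t value, uint32_t alignment)
{
    return ALIGN_POT(value, alignment);
}
```

For example, align_u32(5, 4) rounds up to 8, while a value that is already aligned, such as align_u32(8, 4), is returned unchanged.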
Paul Berry
09df6bb96d glsl: Fix loop bounds detection.
When analyzing a loop where the loop condition is expressed in the
non-standard order (e.g. "4 > i" instead of "i < 4"), we were
reversing the condition incorrectly, leading to a loop bound that was
off by 1.

Fixes piglit tests {vs,fs}-loop-bounds-unrolled.shader_test.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-08 09:08:53 -08:00
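The loop-bound fix above concerns swapping the operands of a comparison. A minimal sketch of the rule, using a hypothetical cmp_op enum rather than Mesa's IR opcodes: when the operands trade places, the operator must be mirrored (> becomes <), not negated (> becomes <=), otherwise the derived bound is off by one.

```c
#include <stdbool.h>

/* Hypothetical comparison operators; not Mesa's ir_expression opcodes. */
enum cmp_op { CMP_LT, CMP_LE, CMP_GT, CMP_GE };

/* Operator to use after swapping the operands, so that
 * (a OP b) is equivalent to (b mirror(OP) a). */
enum cmp_op mirror_comparison(enum cmp_op op)
{
    switch (op) {
    case CMP_LT: return CMP_GT;
    case CMP_LE: return CMP_GE;
    case CMP_GT: return CMP_LT;
    case CMP_GE: return CMP_LE;
    }
    return op;
}

/* Evaluate "a OP b" for the illustration and the checks below. */
bool eval_cmp(enum cmp_op op, int a, int b)
{
    switch (op) {
    case CMP_LT: return a < b;
    case CMP_LE: return a <= b;
    case CMP_GT: return a > b;
    case CMP_GE: return a >= b;
    }
    return false;
}
```

With this rule, "4 > i" becomes "i < 4". Negating instead would produce "i <= 4", which changes the result exactly at i == 4.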
Marek Olšák
844d14ebee winsys/radeon: bump the size of relocation hashlist
This should reduce the number of hash collisions in ETQW.
2013-01-08 16:41:57 +01:00
Christoph Bumiller
18f3f7b958 nvc0: catch too high GENERIC indices to prevent GRAPH traps 2013-01-08 16:13:52 +01:00
Christoph Bumiller
b9c8a98e21 nvc0: use correct resource target to select blit shader 2013-01-08 16:13:52 +01:00
Christoph Bumiller
41e105d5be nvc0: add missing call to map edge flag in push_vbo
Note: this is a candidate for the 9.0 stable branch.
2013-01-08 16:13:52 +01:00
Christoph Bumiller
be75a9373a nv50/ir: wrap assertion using typeid in #ifndef NDEBUG
Note: this is a candidate for the 9.0 stable branch.
2013-01-08 16:13:52 +01:00
Christoph Bumiller
076f4ced8b nvc0: fix out of bounds writes for unaligned sizes in push_data 2013-01-08 16:13:51 +01:00
Christoph Bumiller
39fe03e2de nouveau: increase max order of suballocated buffers by 1
This is really a hack to make TF2 (considerably, up to 20 -> 70 fps
at low res) faster.
2013-01-08 16:13:51 +01:00
Christoph Bumiller
48a45ec24a nouveau: improve buffer transfers
Save double memcpy on uploads to VRAM in most cases.
Properly handle FLUSH_EXPLICIT.
Reallocate on DISCARD_WHOLE_RESOURCE to avoid sync.
2013-01-08 16:13:51 +01:00
Marek Olšák
a75ddfd55d r300g: fix assertion failure in emit_dsa_state
Broken by 8ed6b1400b.
2013-01-08 14:33:18 +01:00
Kenneth Graunke
a60c567fcf i965: Support GL_FIXED and packed vertex formats natively on Haswell+.
Haswell and later support the GL_FIXED and 2_10_10_10_rev vertex formats
natively, and don't need shader workarounds.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-07 16:48:02 -08:00
Kenneth Graunke
e219764fde i965: Add #defines for GL_FIXED vertex formats.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-07 16:48:02 -08:00
Kenneth Graunke
f3840b1632 i965: Add remaining #defines for packed vertex formats.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-07 16:48:02 -08:00
Kenneth Graunke
899017fc54 i965: Use Haswell's sample_d_c for textureGrad with shadow samplers.
The new hardware actually just supports this now.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-07 16:48:02 -08:00
Kenneth Graunke
30f8f58c20 i965/fs: Remove dead code from generate_uniform_pull_constant_load_gen7.
generate_uniform_pull_constant_load_gen7() is only called on Gen7+, so
the gen < 6 code is dead.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-07 16:48:02 -08:00
Alexander von Gluck IV
23595aa427 mesa: Drop mmx optimizations on Haiku
* Prevents compatibility problems. As Haiku
  doesn't use rtasm anymore, it's kind of
  pointless.
2013-01-07 17:39:49 -06:00
Alexander von Gluck IV
b9227b3e15 mesa: Don't use rtasm for Haiku swrast
* We have a symbol conflict as rtasm in
  Mesa collides with rtasm in gallium.
* As linking gallium and mesa together
  is an edge case for us, let's just omit
  the rtasm code from Mesa, as we should be
  moving to llvmpipe soon :)
2013-01-07 17:39:49 -06:00
Alex Deucher
4332f6fc18 r600g: set the virtual address for the htile buffer
Fixes cayman and TN with htile enabled.  Should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=59089
https://bugs.freedesktop.org/show_bug.cgi?id=58667
Possibly others.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-01-07 15:21:46 -05:00
Jerome Glisse
ca474f98f2 radeon/winsys: move radeon family/class identification to winsys
Upcoming async DMA support relies on the winsys knowing about GPU families.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-01-07 11:06:07 -05:00
Jerome Glisse
d499ff98cd r600g/radeon/winsys: indentation cleanup
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-01-07 11:06:02 -05:00
Marek Olšák
afec10df37 r600g: flush FMASK and CMASK at the end of CS 2013-01-06 22:06:34 +01:00
Marek Olšák
8ed6b1400b r300g: implement MSAA
This is not as optimized as r600g - the MSAA compression is missing,
so r300g needs a lot of bandwidth (more than r600g to do the same thing).
However, if the bandwidth is not an issue for you, you can enjoy this
unoptimized MSAA support.
The only other missing optimization for MSAA is the fast color clear.

MSAA is enabled on r500 only, because that's the only GPU family I tested.
That said, MSAA should work on r300 and r400 as well (but you must set
RADEON_MSAA=1 to allow it, then turn MSAA on in your app or set GALLIUM_MSAA=n,
n >= 2, n <= 6)
I will enable the support by default on r300-r400 once someone (other than me)
tests those chipsets with piglit.

The supported modes are 2x, 4x, 6x.

The supported MSAA formats are RGBA8, BGRA8, and RGBA16F (r500 only).
Those 3 formats are used for all GL internal formats.

Tested with piglit. (I have ported all MSAA tests to GL2.1)
2013-01-06 14:44:12 +01:00
Marek Olšák
cc030da428 r300g: simplify DSA state, add ability to patch FG_ALPHA_FUNC while emitting
Preparation for MSAA and alpha-to-coverage.
2013-01-06 14:44:12 +01:00
Marek Olšák
25b3c0a52c r300g/compiler: add shader emulation for the alpha_to_one state 2013-01-06 14:44:12 +01:00
Vinson Lee
2f358feda3 configure.ac: Remove space after indent -T flag.
Fixes this build error on platforms not using GNU indent.

indent: Command line: ``-T'' requires a parameter

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-01-04 19:10:48 -08:00
Ian Romanick
d299ef3ad0 intel: Fix copy-and-paste bug setting gl_constants::MaxSamples
gl_constants::MaxSamples is an integer, so setting it to 1.0 is just
silly.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-04 17:39:05 -08:00
Ian Romanick
a86d629799 mesa: Disallow R, RG, or RGB integer and unsigned formats in OpenGL ES 3.0
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-04 17:39:05 -08:00
Ian Romanick
2aae3abd77 mesa: Disallow SNORM formats for renderbuffers in OpenGL ES
v2: Move {RED,RG,RGB,RGBA}_SNORM changes from the previous commit to
this commit.  Based on suggestions from Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-04 17:39:05 -08:00
Ian Romanick
4b92379da2 mesa: Disallow deprecated SNORM formats for renderbuffers
The OpenGL 3.2 core profile spec says:

    "The following base internal formats from table 3.11 are
    color-renderable: RED, RG, RGB, and RGBA. The sized internal formats
    from table 3.12 that have a color-renderable base internal format
    are also color-renderable. No other formats, including compressed
    internal formats, are color-renderable."

The OpenGL 3.2 compatibility profile spec says (only ALPHA is added):

    "The following base internal formats from table 3.16 are
    color-renderable: ALPHA, RED, RG, RGB, and RGBA. The sized internal formats
    from table 3.17 that have a color-renderable base internal format
    are also color-renderable. No other formats, including compressed
    internal formats, are color-renderable."

Table 3.12 in the core profile spec and table 3.17 in the compatibility
profile spec list SNORM formats as having a base internal format of RED,
RG, RGB, or RGBA.  From this we infer that they should also be color
renderable.

The OpenGL ES 3.0 spec says:

    "An internal format is color-renderable if it is one of the formats
    from table 3.12 noted as color-renderable or if it is unsized format
    RGBA or RGB. No other formats, including compressed internal
    formats, are color-renderable."

In the OpenGL ES 3.0 spec, none of the SNORM formats have "color-
renderable" marked in table 3.12.  The RGB I and UI formats also are not
color-renderable in ES3, but we'll save that change for another patch.

Both NVIDIA's closed-source driver (version 304.64) and AMD's
closed-source driver (Catalyst 12.6 on HD 3650) reject *all* SNORM
formats for renderbuffers in OpenGL 3.3 compatibility profiles.

v2: Move {RED,RG,RGB,RGBA}_SNORM changes from this commit to the
next commit.  Based on suggestions from Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-04 17:39:05 -08:00
Brian Paul
69c2528b83 util: fix addressing bug in pipe_put_tile_z() for PIPE_FORMAT_Z32_FLOAT
The Z32 pixel is 4 bytes so multiply x by 4, not 2.

Note: This is a candidate for the stable branches.
2013-01-04 15:30:46 -07:00
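The addressing rule in the commit above (4 bytes per Z32_FLOAT pixel) can be sketched as follows. The functions put_z32f and get_z32f are hypothetical and are not the pipe_put_tile_z() implementation; they assume a linear mapping with a row stride given in bytes.

```c
#include <stdint.h>
#include <string.h>

/* Store one 32-bit float depth value at pixel (x, y) of a linear mapping.
 * stride is the distance between rows in bytes. */
void put_z32f(uint8_t *map, unsigned stride, unsigned x, unsigned y, float z)
{
    /* Each Z32_FLOAT pixel occupies 4 bytes, so the column offset is x * 4.
     * Using x * 2 here would address the wrong pixel for every x > 0. */
    uint8_t *dst = map + (size_t)y * stride + (size_t)x * 4;
    memcpy(dst, &z, sizeof z);
}

/* Read the value back using the same addressing. */
float get_z32f(const uint8_t *map, unsigned stride, unsigned x, unsigned y)
{
    float z;
    memcpy(&z, map + (size_t)y * stride + (size_t)x * 4, sizeof z);
    return z;
}
```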
Brian Paul
073a53fe2f util: add get/put_tile_z() support for PIPE_FORMAT_Z32_FLOAT_S8X24_UINT
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=58972

Note: This is a candidate for the stable branches.
2013-01-04 15:30:46 -07:00
Brian Paul
1b6ba9c4c8 gallivm: support more immediates in lp_build_tgsi_info()
Bump limit from 32 to 128.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=58545
2013-01-04 15:30:45 -07:00
Brian Paul
46bad058eb xlib: allow GLX_DONT_CARE for glXChooseFBConfig() attribute values
Fixes piglit glx-dont-care-mask test.

Note: This is a candidate for the stable branches.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-04 15:18:53 -07:00
Brian Paul
fe90762414 st/glx: allow GLX_DONT_CARE for glXChooseFBConfig() attribute values
Fixes piglit glx-dont-care-mask test.

Note: This is a candidate for the stable branches.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-04 15:18:53 -07:00
Tom Stellard
aed37cbee8 radeon/llvm: Remove backend code from Mesa
This code now lives in an external tree.

For the next Mesa release fetch the code from the master branch
of this LLVM repo:
http://cgit.freedesktop.org/~tstellar/llvm/

For all subsequent Mesa releases, fetch the code from the official LLVM
project:
www.llvm.org
2013-01-04 21:05:09 +00:00
Johannes Obermayr
05c143cc04 Support LLVM >= 3.2 on radeonsi and opencl.
Tom Stellard:
 - Backend now has same name for all LLVM versions
 - Add missing LLVM_VERSION_INT definition
2013-01-04 21:05:09 +00:00
Tom Stellard
54f3a3e88d clover: Fix build after the addition of enum pipe_flush_flags
Broken since commit 598cc1f74d
2013-01-04 21:05:09 +00:00
Marek Olšák
bce36d1556 r300g: don't check for vertex and index buffer bind flags 2013-01-04 21:08:28 +01:00
Marek Olšák
beb358809e r300g/swtcl: use memcpy to emit indices 2013-01-04 21:08:28 +01:00
Marek Olšák
ad1d1a4d9e r300g/swtcl: simplify vertex uploading
- skip the vertex buffer reallocation in flush and just use
  the unsynchronized flag to get new memory.
- remove the cruft needed to get around the issues with the vertex buffer
  reallocation in flush
- use pb_buffer instead of pipe_resource
2013-01-04 21:08:28 +01:00
Marek Olšák
37fd455b21 r300g/swtcl: fix crash when setting vertex buffers
Broken by e73bf3b805.
2013-01-04 21:08:28 +01:00
Marek Olšák
d4ff72b944 r300g: don't set PIPE_BIND flags for internal textures 2013-01-04 21:08:28 +01:00
Paul Berry
06f67e75ee i965: Fix glCompressedTexSubImage2D offsets for ETC textures.
This patch fixes intel_miptree_unmap_etc() (which decompresses ETC
textures to linear) to pay attention to map->x and map->y when writing
to the destination image.  Previously these values were ignored,
causing the xoffset and yoffset parameters passed to
glCompressedTexSubImage2D() to be ignored.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-01-04 11:16:43 -08:00
Kristian Høgsberg
48ac6d7e97 egl/wayland: Remove kooky flush code
We used to have to jump through hoops to call glFlush at swap buffer time,
but the flush extension made that unnecessary a long time ago.
2013-01-04 11:20:12 -05:00
Kristian Høgsberg
b433e319b3 egl/wayland: Remove confusing comment about front buffer rendering 2013-01-04 11:20:12 -05:00
Kristian Høgsberg
b5160a10c0 egl_dri2: Remove unused struct dri2_egl_buffer from header file 2013-01-04 11:20:12 -05:00
Kristian Høgsberg
0725f2d654 egl: Add extension infrastructure for EGL_EXT_buffer_age 2013-01-04 11:20:12 -05:00
Kristian Høgsberg
f79739ebdd egl: Update to revision 19987 of eglext.h
This pulls in EGL_EXT_buffer_age.
2013-01-04 11:20:12 -05:00
Brian Paul
35fe71d97e util: move var declaration before loop to fix MSVC error 2013-01-04 08:22:02 -07:00
Marek Olšák
1aebb6911e r600g: implement 3D transfers
That means we can map and read multiple slices with one transfer_map call.
2013-01-04 14:06:54 +01:00
Marek Olšák
ee351ea178 st/mesa: fix assertion failures with 2101010 vertex formats
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-04 14:06:39 +01:00
Marek Olšák
d1818d6f68 st/mesa: accelerate CopyTexSubImage for 1D array textures
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-04 14:06:36 +01:00
Marek Olšák
ed86809ac9 st/mesa: fix CopyTexSubImage fallback for 1D array textures
- We should use a 3D transfer of size Width x 1 x NumLayers.
- We should use layer_stride instead of stride.
  (even though they are likely to be equal with 1D array textures)

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-04 14:06:28 +01:00
Marek Olšák
85cb4f299d st/mesa: fix GetTexImage for compressed 2D array textures
This uses a 3D blit to decompress the texture and then a 3D transfer
to read it.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-04 14:06:17 +01:00
Marek Olšák
538d3a2d46 gallium/util: remove unused helper util_create_rgba_texture
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-04 14:06:14 +01:00
Marek Olšák
5daba187c9 st/mesa: try to find the format matching format+type in decompressed_with_blit
There was the fast path based on _mesa_format_matches_format_and_type
for GetTexImage, but it never worked, because the Mesa format we were testing
there was always compressed. Further testing showed that the fast path
had been completely broken.

In this commit, the somewhat limited helper util_create_rgba_texture is
no longer used and instead, custom code for the texture creation is added,
which tries to find the best matching RGBA8 format, so that we can hit
the fast path *always* if the read format is a variant of RGBA8 and supported
by the driver.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-04 14:06:09 +01:00
Marek Olšák
0aecb174ce st/mesa: fix GetTexImage for compressed cubemaps
I'll deal with 2D arrays later.

NOTE: This is a candidate for the stable branches.
2013-01-04 14:05:52 +01:00
Marek Olšák
afec42a648 gallium/u_blitter: implement 3D blitting
Scaling and flipping in the Z direction isn't allowed yet.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-04 14:05:49 +01:00
Marek Olšák
5665deeaea gallium/u_blitter: fix blitting TEXTURE_CUBE_ARRAY with a non-zero cube index
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-04 14:05:47 +01:00
Marek Olšák
53d232d223 gallium/u_blitter: minor simplification
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-04 14:05:45 +01:00
Marek Olšák
ccfcf32873 gallium/u_blitter: unify some parameters into a dstbox parameter in blit_generic
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-04 14:05:43 +01:00
Marek Olšák
23f76f558e gallium/u_blitter: remove useless parameter from blitter_default_dst_texture
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-04 14:05:40 +01:00
Marek Olšák
8fdece2896 gallium/util: complete implementation of util_dump_transfer
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-04 14:05:32 +01:00
Marek Olšák
8bd134f31b mesa: allow TEXTURE_CUBE_MAP_ARRAY in GetTexImage
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-01-04 14:05:21 +01:00
Marek Olšák
12aeb47b6a gallium/radeon: send the END_OF_FRAME flag to the DRM 2013-01-04 13:18:50 +01:00
Marek Olšák
598cc1f74d gallium: extend pipe_context::flush for it to accept an END_OF_FRAME flag
Usage with pipe_context:
  pipe->flush(pipe, NULL, PIPE_FLUSH_END_OF_FRAME);

Usage with st_context_iface:
  st->flush(st, ST_FLUSH_END_OF_FRAME, NULL);

The flag is only a hint for drivers. Radeon will use it for buffer eviction
heuristics in the kernel (e.g. for queries like how many frames have passed
since a buffer was used).

The flag is currently only generated by st/dri on SwapBuffers.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
2013-01-04 13:18:33 +01:00
Marek Olšák
4ad5ebaefa radeonsi: fix int->bool conversion in fence_signalled 2013-01-04 12:42:03 +01:00
Marek Olšák
9f0ddbc9e4 r600g: fix int->bool conversion in fence_signalled
NOTE: This is a candidate for the stable branches.
2013-01-04 12:42:03 +01:00
Paul Berry
b8b1d61e76 Add new .gitignore entries for Automake 1.13 tests
Automake 1.13 creates a bunch of new build artefacts:
- bin/test-driver, a script for running tests.
- *.trs files for every "make check" test result.
- *.log files containing the output of every test run by "make check".

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-03 15:24:45 -08:00
Kenneth Graunke
82f8e8ebd5 i965: Replace structs with bit-shifting for Gen7 SURFACE_STATE entries.
Every generation except Gen7 creates SURFACE_STATE entries via a
uint32_t array.  Only Gen7 uses the older bitfield structure, which we
moved away from because it was less efficient.  Convert it for
consistency.

This reduces the compiled size of gen7_wm_surface_state.o by 2.86% in a
release build.

v2: Fix accidental use of BRW_SURFACE_WIDTH/HEIGHT in brw_state_dump.c;
    switch back to gen7_set_surface_mcs_info setting surf[6] directly
    (both per Eric's review comments).

Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-01-03 13:36:04 -08:00
smoki
5bf357db89 radeon/r200: Fix tcl culling
Should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=57842
2013-01-03 13:22:22 -05:00
Jonas Ådahl
800ed958c3 wayland: Don't cancel a roundtrip when any event is received
Since wl_display_dispatch_queue() returns the number of processed events
or -1 on error, only cancel the roundtrip if -1 is returned.

This also fixes a potential memory corruption bug happening when the
roundtrip does an early return and the callback later writes to the then
out of scope stack allocated `done' parameter.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2013-01-03 11:44:55 -05:00
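The control flow described in the commit above can be sketched without the Wayland library. Everything here is a hypothetical stand-in: dispatch_fn plays the role of a dispatch call that returns the number of processed events or -1 on error, and fake_display simulates a server whose "done" callback fires after a few dispatches. The point is only that the loop keeps going on any non-negative return and stops early solely on -1.

```c
/* Stand-in for a dispatch call: returns events processed, or -1 on error. */
typedef int (*dispatch_fn)(void *state);

/* Keep dispatching until the done flag is set or an error occurs.
 * A positive return value (events were processed) must not end the loop. */
int roundtrip(dispatch_fn dispatch, void *state, int *done)
{
    int ret = 0;

    while (ret != -1 && !*done)
        ret = dispatch(state);

    return ret == -1 ? -1 : 0;
}

/* Simulated display used only for this illustration. */
struct fake_display {
    int calls_left;  /* dispatches remaining until the done callback fires */
    int fail;        /* nonzero: every dispatch reports an error */
    int done;        /* set by the simulated callback */
};

int fake_dispatch(void *data)
{
    struct fake_display *d = data;

    if (d->fail)
        return -1;
    if (--d->calls_left == 0)
        d->done = 1;
    return 1;  /* one event processed */
}
```

Because the loop only returns once done is set or an error is reported, the flag cannot be written by a late callback after the caller's stack frame is gone, except in the error path.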
Vinson Lee
622d96aae4 i965: Add break statement at end of BRW_OPCODE_CONTINUE case.
Fixes missing break in switch defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-02 22:30:13 -08:00
Chad Versace
bfe28b8d93 egl/android: Fix build for Jelly Bean (v2)
In Jelly Bean, the interface to ANativeWindow changed. The change included
adding a new parameter to the queueBuffer and dequeueBuffer methods,
removing the lockBuffer method, and requiring libsync.

v2:
  - s/fence_fd == -1/fence_fd != -1/
  - Fix leak. Close the fence_fd.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-02 14:55:36 -08:00
Chad Versace
56c6cdc9e7 android: Define Make variables for Android version
Define the following Make variables:
    MESA_ANDROID_MAJOR_VERSION
    MESA_ANDROID_MINOR_VERSION
    MESA_ANDROID_VERSION

These variables will allow us to make version-dependent decisions on
library dependencies. In particular, building Mesa against JellyBean will
require libsync.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-01-02 14:51:18 -08:00
Matt Turner
7f962c5ef3 mesa: Add missing ASSERT_OUTSIDE_BEGIN_END to GetSamplerParameter*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-02 12:33:49 -08:00
Matt Turner
f10b54fd79 mesa: Add missing ASSERT_OUTSIDE_BEGIN_END to SamplerParameter*
Commit f22d49de added the SamplerParameter* functions but only used
ASSERT_OUTSIDE_BEGIN_END inside the -f and -fv versions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-02 12:33:44 -08:00
Matt Turner
1b06a0478f mesa: Mark _mesa_{init,delete}_sampler_object as static
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-01-02 12:33:35 -08:00
Adam Jackson
86b6964ef9 glcpp: Typo fix.
Note: this is a candidate for the 9.0 stable branch.

Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-01-02 14:09:22 -05:00
Adam Jackson
c8d3fd4a12 r300g: Fix visibility CFLAGS in automake
Note: this is a candidate for the 9.0 stable branch.

Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-01-02 14:08:21 -05:00
Adam Jackson
443954d161 galahad, noop: Fix visibility CFLAGS in automake
Note: this is a candidate for the 9.0 stable branch.

Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-01-02 14:08:15 -05:00
Adam Jackson
0daabd5239 glcpp: Fix visibility CFLAGS in automake
Note: this is a candidate for the 9.0 stable branch.

Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-01-02 14:07:58 -05:00
Paul Berry
7c0323296e mesa: Implement compressed 2D array textures.
This patch adds functionality to Mesa to upload compressed
2-dimensional array textures, using the glCompressedTexImage3D and
glCompressedTexSubImage3D calls.

Fixes piglit tests "EXT_texture_array/compressed *" and "!OpenGL ES
3.0/ext_texture_array-compressed_gles3 *".  Also partially fixes GLES3
conformance test "CoverageES30.test".

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-02 10:28:39 -08:00
Paul Berry
261ee4d907 mesa: Fix error reporting in _mesa_invalidate_pbo_{compressed_,}teximage.
The old error reporting was completely bogus, passing _mesa_error() a
format string that didn't even match the remaining arguments.  Also,
in many cases the number of dimensions in the TexImage call was not
preserved in the error message (e.g. an error in glTexImage2D was
reported simply as an error in glTexImage).

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-01-02 10:28:23 -08:00
Brian Paul
c7d3254b8e mesa: fix signed/unsigned mix-up in fetch_signed_l_latc1()
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=58844
2013-01-02 09:02:04 -07:00
Brian Paul
955babf2d9 glsl: add cast to silence signed/unsigned comparison warning 2013-01-01 08:47:04 -07:00
Brian Paul
05cd6cfd5f xlib: handle _mesa_initialize_visual()'s return value
If the call fails, we should return NULL from XMesaCreateVisual().
This was found when Waffle tried to create a visual with depth/stencil
bits = -1.  That's an illegal value for glXChooseFBConfig() and we should
return NULL in that situation.

Note: This is a candidate for the stable branches.
2012-12-31 18:17:58 -07:00
Kenneth Graunke
66ea6e8ec3 i965: Fail to blit rather than assert on invalid pitch requirements.
Dungeon Defenders hits TexImage()'s try_pbo_upload() path where
image->Width == 2, which doesn't meet intelEmitCopyBlit's requirement
that the pitch needs to be a multiple of 4.

Since intelEmitCopyBlit can already fail for a myriad of other reasons,
and it's not clear that other callers are immune to this failure mode,
simply make it return false rather than assert.

Fixes Dungeon Defenders on i965/Ivybridge.  Now playable (aside from
having to work around the EXT_bindable_uniform issue).

NOTE: This is probably a candidate for the 9.0 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-12-29 01:04:30 -08:00
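The "fail rather than assert" pattern from the commit above, as a minimal sketch. The function names are hypothetical and this is not intelEmitCopyBlit; only the multiple-of-4 pitch requirement is taken from the commit message. The helper reports failure through its return value so the caller can choose a fallback instead of aborting.

```c
#include <stdbool.h>

/* Hypothetical blit helper. Returns false when the request cannot be
 * expressed as a blit, instead of asserting. */
bool try_copy_blit(unsigned src_pitch, unsigned dst_pitch)
{
    /* The blitter needs pitches that are multiples of 4 bytes. */
    if (src_pitch % 4 != 0 || dst_pitch % 4 != 0)
        return false;

    /* ... a real driver would emit the blit commands here ... */
    return true;
}

/* Caller side: returns 1 if the blit path was taken, 0 for the fallback. */
int upload(unsigned src_pitch, unsigned dst_pitch)
{
    if (try_copy_blit(src_pitch, dst_pitch))
        return 1;

    /* Fallback, e.g. mapping the destination and copying on the CPU. */
    return 0;
}
```

A 2-byte-wide source, like the image->Width == 2 case in the commit message, simply takes the fallback path.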
Eric Anholt
2f225f6145 intel: Skip texture validation logic when nothing has changed.
Improves GLBenchmark 2.1 offscreen performance by 3.2% +/- 1.5% (n=52).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-28 11:05:23 -08:00
Eric Anholt
73c376bbde intel: Turn a test in miptree_match_image into an assert.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-28 11:05:20 -08:00
Eric Anholt
12751ef2a7 i965: Stop making a copy of non-builtin uniforms in ParameterValues[].
We don't need them now that our set of parameter pointers points at the
GL core storage for them.  This should save memory/bandwidth/overhead in
uniform updates.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-28 10:53:54 -08:00
Eric Anholt
7e28d6c1ab i965: Consistently use nr_pull_params instead of NumParameters.
NumParameters used to be an upper bound on the number of vec4s to be
uploaded, which was basically safe (unless your buffer was bound near
the top of address space *and* you array indexed outside the buffer, in
which case I think you might GPU hang).  As I migrate the driver away
from ParameterValues[], this is no longer true.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-28 10:53:50 -08:00
Eric Anholt
aa6e35e80d i965/vs: Reference the core GL uniform storage for non-builtin uniforms.
Like in the FS, there's no reason to use an external copy if the
ParameterValues[] relayout of it isn't the layout we need.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-28 10:53:48 -08:00
Eric Anholt
c0d1f508d6 i965/fs: Reference the core GL uniform storage for non-builtin uniforms.
There's no reason to use an external copy if the relayout in the
external copy isn't serving us.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-28 10:53:45 -08:00
Eric Anholt
bd326623ef glsl: Add a note about a surprising feature of gl_uniform_storage->type.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-28 10:53:42 -08:00
Eric Anholt
f189570ccf i965/fs: Remove the param_index/param_offset indirection.
Now that ParameterValues doesn't change across the visitor, we don't
need to go through this.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-28 10:53:39 -08:00
Eric Anholt
d5efc14635 i965: Add asserts to check that we don't realloc ParameterValues.
Things are even more restrictive than they used to be, so I've made
mistakes in this area.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-28 10:53:36 -08:00
Eric Anholt
ffdfafb06c i965: Add texrect scale parameters before pointers to ParameterValues.
If adding scale parameters during program compile caused a realloc of
ParameterValues, then the driver uniform storage set up by
_mesa_associate_uniform_storage() would point to potentially freed
memory.

Note that this uses TexturesUsed, which may change at runtime for GLSL
when sampler uniforms change.  This is a flaw in our handling of texrect
in general, and not one I'm fixing currently.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-28 10:53:33 -08:00
Eric Anholt
6ccc505fc0 i965: Fix a typo in a comment.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-28 10:53:30 -08:00
Eric Anholt
50a88e2f44 i965: Add a note about a bug from the no-recompile-on-sampler-updates change.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-28 10:53:25 -08:00
Brian Paul
7c35521295 mesa: add missing texel fetch code for sRGB DXT formats
Fixes http://bugs.freedesktop.org/show_bug.cgi?id=58548
2012-12-26 15:23:05 -07:00
Eric Anholt
5791c56811 i965: Fix border color handling for deprecated SNORM formats.
We don't have native hardware support for these, so they get promoted to
RGBA, in which case we don't have hardware dealing with the channel
swizzling for us.

Fixes piglit EXT_texture_snorm/texwrap formats bordercolor (-swizzled).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-26 12:08:33 -08:00
Eric Anholt
5628501e7b i965: Start using HIZ for Z16 textures.
I had left this out for a long time because it regressed some
depthstencil-render-miplevels cases when it was enabled.  Now that the
bugs causing those are fixed, there's nothing stopping us.

Improves glbenchmark 2.1 offscreen performance by 7.3% +/- 2.8% (n=10).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-26 12:03:04 -08:00
Eric Anholt
3e1d8e62e7 intel: Use the parent miptree's format for setting up HiZ miptrees.
This worked out before because the parent was always 4 bytes so it
didn't affect the layout, but now we want to support Z16 too.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-26 12:02:47 -08:00
Eric Anholt
cb3b172d19 intel: Remove a couple of dead function prototypes.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-12-22 13:46:12 -08:00
Eric Anholt
0d6a722ec4 i965: Add perf debug for depth/stencil alignment workaround.
Fixing these rendering bugs has been implicated in performance
regressions (which may be unfixable), but at least knowing that it's
happening should help diagnose those regressions.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-12-22 13:46:04 -08:00
Eric Anholt
e454b2d480 i965: Assert that relayout laid out something that won't need it again.
The ETC1 changes failed at this, so let's make sure it will be caught in
testing next time.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-12-22 13:46:04 -08:00
Eric Anholt
3b458416e3 i965: Also fix validation of Z32F_S8 textures.
This was caught by the assertion in the next commit.  It fixes the
remaining piglit depthstencil-render-miplevels cases, probably by
avoiding broken stencil copies in the validation path.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-12-22 13:46:04 -08:00
Eric Anholt
46386816a7 i965: Fix validation of ETC miptrees.
When comparing to the teximage's format, we have to look at the
format-the-mt-was-created-for not the format-actually-stored-in-the-mt.

Improves glbenchmark 2.1 offscreen test performance 159% +/- 17% (n=3).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54582
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-12-22 13:46:04 -08:00
Eric Anholt
3b99d094c9 i965: Add perf debug for texture relayout.
Relayout is expensive, so it's something developers (both us and others)
should know about when it happens.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-22 13:45:56 -08:00
Eric Anholt
c417d261dd i965: Fix hiz resolves getting stomped by depth offset validation.
Fixes all the remaining non-Z32F_S8 depthstencil-render-miplevels tests
in piglit.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-12-22 13:41:22 -08:00
Marek Olšák
a58bf9d8f9 r600g: rename GPU_FLUSH -> INVAL_READ_CACHES
because that's what it does.
2012-12-22 19:39:29 +01:00
Marek Olšák
9ef26fc667 r600g: remove redundant parameter alloc_bo from r600_texture_create_object
alloc_bo == !buf
2012-12-22 19:39:29 +01:00
Matt Turner
a585b8f3a6 Make IsVertexArray() return false before BindVertexArray()
Rename existing _Used flag to EverBound.

The GL 4.3 and ES 3.0 specs say

   These names are marked as used, for the purposes of GenVertexArrays
   only, but they do not acquire array state until they are first bound.

This also affects Apple VAOs, which is fine since the
APPLE_vertex_array_object spec says

   A vertex array object is created by binding an unused name. This
   binding is accomplished by calling BindVertexArrayAPPLE with id set
   to the name of the new vertex array object.

Fixes arb_vertex_array_object_isvertexarray.
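
The rule quoted above can be sketched as follows. This is only an
illustration: the struct and function names are hypothetical stand-ins,
not Mesa's actual types, and only the EverBound idea comes from this
commit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a vertex array object. */
struct vao_sketch {
   unsigned name;     /* name handed out by GenVertexArrays */
   bool ever_bound;   /* set the first time the name is bound */
};

/* Binding is what turns a reserved name into a real object. */
static void
bind_vertex_array_sketch(struct vao_sketch *obj)
{
   obj->ever_bound = true;
}

/* A name that was generated but never bound is not yet a vertex array. */
static bool
is_vertex_array_sketch(const struct vao_sketch *obj)
{
   return obj != NULL && obj->ever_bound;
}
```

The same pattern would apply to any object type whose spec says the
object is created on first bind rather than on name generation.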

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-21 20:03:30 -08:00
Matt Turner
fd93d55141 Make IsTransformFeedback() return false before BindTransformFeedback()
The GL 4.3 and ES 3.0 specs say

   A transform feedback object is created by binding a name returned by
   GenTransformFeedbacks with the command
      void BindTransformFeedback( enum target, uint id );

Fixes arb_transform_feedback2-istransformfeedback and part of
es3conform's CoverageES30.test.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-21 20:03:07 -08:00
Dave Airlie
54203ef5ac nouveau: deal with tbo cap for now.
This fixes the printk running apps against master.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-22 13:12:30 +10:00
Marek Olšák
9c6410e5c3 r600g: always use a tiled resource as the destination of MSAA resolve
i.e. we have to allocate a temporary tiled resource if dst isn't tiled.

This fixes hardlocks on r6xx-r7xx, though using a linear resource is forbidden
on later asics as well.

NOTE: This is a candidate for the stable branches.
2012-12-21 23:43:34 +01:00
Marek Olšák
9b0b4cf058 winsys/radeon: the env var RADEON_NOOP can be used to skip CS ioctls 2012-12-21 23:42:23 +01:00
Marek Olšák
eccc74f5d3 r600g: remove a false comment 2012-12-21 23:42:09 +01:00
Marek Olšák
fb45a816eb r600g: don't suspend TIME_ELAPSED queries during flushing
According to the GL spec, the result should be equivalent to comparing
two timestamps.
2012-12-21 23:42:04 +01:00
Marek Olšák
6d49ffde11 gallium/tests: fix build breakage after pipe_surface::usage removal 2012-12-21 23:41:41 +01:00
Frank Henigman
46e3aeb077 mesa: add bounds checking for uniform array access
No piglit regressions and now passes glsl-uniform-out-of-bounds-2.

validate_uniform_parameters now checks that the array index is
valid.  This means if an index is out of bounds, glGetUniform* now
fails with GL_INVALID_OPERATION, as it should.
_mesa_uniform and _mesa_uniform_matrix also call
validate_uniform_parameters so the bounds checks there became
redundant and were removed.

The test in glGetUniformLocation is modified to check array bounds
so it now returns GL_INVALID_INDEX (-1) if you ask for the location
of a non-existent array element, as it should.
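
A minimal sketch of the kind of check described above, assuming a
simplified uniform record. The struct, the helper names, and the
"base + index" location encoding are hypothetical and only illustrate
the bounds test, not the real validate_uniform_parameters code.

```c
#include <stdbool.h>

/* Hypothetical description of one uniform; not Mesa's storage struct. */
struct uniform_sketch {
   unsigned array_elements;   /* 0 for a non-array uniform */
};

/* Returns true if 'index' names an existing element of the uniform. */
static bool
uniform_index_in_bounds(const struct uniform_sketch *uni, unsigned index)
{
   if (uni->array_elements == 0)
      return index == 0;
   return index < uni->array_elements;
}

/* Location lookup: -1 for a non-existent element.  The base + index
 * encoding is an assumption made for this sketch only. */
static int
uniform_location_sketch(const struct uniform_sketch *uni, int base,
                        unsigned index)
{
   return uniform_index_in_bounds(uni, index) ? base + (int)index : -1;
}
```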

Signed-off-by: Frank Henigman <fjhenigman@google.com>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
2012-12-21 11:23:48 -08:00
José Fonseca
74f0731953 util/u_format: Round when converting depth values from float to z16_unorm.
This makes the z16_unorm -> float -> z16_unorm conversion lossless.
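
To illustrate why rounding matters, here is a small sketch of the two
conversions. The function names are made up for this example and the
code is not the generated u_format code; it only shows that adding 0.5
before truncating recovers every 16-bit value after a trip through
float.

```c
#include <stdbool.h>
#include <stdint.h>

/* z16_unorm -> float: map 0..65535 onto 0.0..1.0. */
static float
z16_unorm_to_float(uint16_t z)
{
   return (float)z / 65535.0f;
}

/* float -> z16_unorm with round-to-nearest instead of truncation. */
static uint16_t
float_to_z16_unorm_round(float f)
{
   if (!(f > 0.0f))
      return 0;               /* also catches NaN */
   if (f >= 1.0f)
      return 65535;
   return (uint16_t)(f * 65535.0f + 0.5f);
}

/* Checks the round trip for every representable z16 value. */
static bool
z16_roundtrip_is_lossless(void)
{
   uint32_t z;

   for (z = 0; z <= 65535; z++) {
      if (float_to_z16_unorm_round(z16_unorm_to_float((uint16_t)z)) != z)
         return false;
   }
   return true;
}
```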

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-12-21 10:04:51 +00:00
Jerome Glisse
e8ca1a53a6 r600g: add cs tracing infrastructure for lockup pinpointing
This is a build-time option: set R600_TRACE_CS to 1 and every CS is
printed to stderr along with the CS trace point value, which gives the
last offset into the CS processed by the GPU.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-12-20 18:23:54 -05:00
Jerome Glisse
6532eb17ba r600g: add htile support v16
htile is used for HiZ and HiS support and fast Z/S clears.
This commit just adds the htile setup and Fast Z clear.
We don't take full advantage of HiS with that patch.

v2 really use fast clear, still random issue with some tiles
   need to try more flush combination, fix depth/stencil
   texture decompression
v3 fix random issue on r6xx/r7xx
v4 rebase on top of latest mesa, disable CB export when clearing
   htile surface to avoid wasting bandwidth
v5 resummarize htile surface when uploading z value. Fix z/stencil
   decompression, the custom blitter with custom dsa is no longer
   needed.
v6 Reorganize render control/override update mechanism, fixing more
   issues in the process.
v7 Add nop after depth surface base update to work around some htile
   flushing issue. Force htile to 8x8 on r6xx/r7xx as other combinations
   have issues. Do not enable hyperz when flushing/uncompressing
   depth buffer.
v8 Fix htile surface, preload and prefetch setup. Only set preload
   and prefetch on htile surface clear like fglrx. Record depth
   clear value per level. Support several level for the htile
   surface. First depth clear can't be a fast clear.
v9 Fix comments, properly account new register in emit function,
   disable fast zclear if clearing different layer of texture
   array to different value
v10 Disable hyperz for texture array making test simpler. Force
    db_misc_state update when no depth buffer is bound. Remove
    unused variable, rename depth_clearstencil to depth_clear.
    Don't allocate htile surface for flushed depth. Something
    broke the cliprect change; this needs to be investigated.
v11 Rebase on top of newer mesa
v12 Rebase on top of newer mesa
v13 Rebase on top of newer mesa. The htile surface needs to be
    initialized to zero; somehow, special-casing the first clear to not
    use fast clear, and thus initialize the htile surface with proper
    values, does not work in all cases.
v14 Use resource not texture for htile buffer, which makes the htile
    buffer size computation easier and simpler. Disable preload on
    evergreen as it's still troublesome in some cases
v15 Cleanup some comment and remove some left over
v16 Define name for bit 20 of CP_COHER_CNTL

Signed-off-by: Pierre-Eric Pelloux-Prayer <pelloux@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-12-20 18:23:51 -05:00
Jerome Glisse
24b1206ab2 r600g: rework flushing and synchronization pattern v7
This brings r600g almost in line with the closed source driver when
it comes to flushing and synchronization pattern.

v2-v4: history lost somewhere in outer space
v5: Fix compute size of flushing, use define for flags, update
    worst case cs size requirement for flush, treat rs780 and
    newer as r7xx when it comes to streamout.
v6: Fix num dw computation for framebuffer state, remove dead
    code, use define instead of hardcoded value.
v7: Remove dead code

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-12-20 18:23:31 -05:00
Paul Berry
cf5632094b mesa: Allow glReadBuffer(GL_NONE) for winsys framebuffers.
Previously, Mesa code assumed that glReadBuffer(GL_NONE) was only
valid for user-created framebuffer objects.  However, the spec is
quite clear that it should also be valid for the default framebuffer.
From section 18.2.1 ("Obtaining Pixels from the Framebuffer") of the
GL 4.3 spec:

    "When READ_FRAMEBUFFER_BINDING is zero, i.e. the default
    framebuffer, src must be one of the values listed in table 17.4,
    including NONE."

Similar language exists in the GLES 3.0 spec, and in desktop GL all
the way back to ARB_framebuffer_object.

Partially fixes GLES3 conformance test "CoverageES30.test".

NOTE: This is a candidate for stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-20 10:03:30 -08:00
José Fonseca
ab2f573634 llvmpipe: Drop PIPE_QUERY_TIME_ELAPSED support.
It was slightly wrong: we were computing the longest duration of
the query among all the rasterizer tasks.

Regardless, for tile-based implementations such as llvmpipe, time differences
will never be very useful, because rendering before/during/after the query
is all interleaved.  And this is expected, see ARB_timer_query spec, issue 10.

In particular, piglit ext_timer_query-time-elapsed still fails, because
it makes assumptions that don't hold true in tiled architectures. Not
sure how to fix that though.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-20 16:33:49 +00:00
José Fonseca
3160b0b9fc mesa/st: Implement GL_TIME_ELAPSED w/ PIPE_QUERY_TIMESTAMP.
ARB/EXT_timer_query's definition of GL_TIME_ELAPSED matches precisely the
subtraction of two GL_TIMESTAMP queries.

And for a lot of drivers, that's precisely how they have to implement
internally -- by emitting two hardware timestamp queries.

So, to simplify driver implementation, simply allow doing so in the state
tracker.

Eventually if no driver implements PIPE_QUERY_TIME_ELAPSED then we could
retire it.
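
As a rough sketch of the idea, assuming a query record that simply
stores the two timestamp results in nanoseconds (the struct and
function names are hypothetical, not the state tracker's):

```c
#include <stdint.h>

/* Hypothetical query record holding two timestamp results (ns). */
struct elapsed_query_sketch {
   uint64_t begin_ns;   /* timestamp taken at BeginQuery */
   uint64_t end_ns;     /* timestamp taken at EndQuery */
};

/* GL_TIME_ELAPSED expressed as the difference of two timestamps. */
static uint64_t
time_elapsed_ns(const struct elapsed_query_sketch *q)
{
   /* Unsigned subtraction stays correct if the counter wrapped once. */
   return q->end_ns - q->begin_ns;
}
```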

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-20 16:33:49 +00:00
José Fonseca
9976216bf6 gallium: s/PIPE_CAP_TIMER_QUERY/PIPE_CAP_QUERY_TIME_ELAPSED/
To better reflect what it is being advertised.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-20 16:33:49 +00:00
Marek Olšák
ef11ed61a0 r600g: add assertions to prevent creation of invalid surfaces 2012-12-20 17:13:18 +01:00
Marek Olšák
fefa2112bf r600g: refactor and make streamout dumping more informative
Reviewed-by: Dave Airlie <airlied@redhat.com>
2012-12-20 17:13:15 +01:00
Marek Olšák
6a2ec765bd r600g: try to fix streamout for the cases where BURST_COUNT > 0
The burst was incorrectly used, because ELEM_SIZE was always 0.
I don't know if the burst works, because I don't know of any test
which uses it.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2012-12-20 17:13:12 +01:00
Marek Olšák
72362ebefb r600g: lower stream outputs with dst_offset < start_component
This fixes streamout breakage caused by the varying packing.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2012-12-20 17:13:09 +01:00
Marek Olšák
d0e40bd3ed r600g: use r600_get_temp to get temporaries for CLIPDIST shader outputs
I need this to be able to use r600_get_temp in the function later.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2012-12-20 17:13:06 +01:00
Brian Paul
fddcc67f5c softpipe: fix up FS variant unbinding / deletion
The old call to tgsi_exec_machine_bind_shader() in
softpipe_delete_fs_state() was never called since the shader's original
tokens are never passed to the tgsi interpreter (only shader _variant_
tokens are).  Now, unbind the variant's tokens from the tgsi interpreter
when we free the variant.

This doesn't fix any known bugs but it's the right thing to do.

Note: This is a candidate for the stable branches.
2012-12-19 09:02:08 -07:00
Brian Paul
18ef8f83b2 softpipe: fix unreliable FS variant binding bug
In exec_prepare() we were comparing pointers to see if the fragment
shader variant had changed before calling tgsi_exec_machine_bind_shader().
This didn't work reliably when there was a lot of shader token malloc/
freeing going on because the memory might get reused.
Instead, bind the shader variant during regular state validation.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=40404
(fixes a couple of piglit's glsl-max-varyings test)

Note: This is a candidate for the stable branches.
2012-12-19 09:02:08 -07:00
Jerome Glisse
50880314e3 Revert "r600g: work around ddx over alignment"
This reverts commit d8287bac1f.

Causes more issues than it fixes. Need to think of a proper solution.
2012-12-19 09:56:17 -05:00
Jerome Glisse
d8287bac1f r600g: work around ddx over alignment
This forces surfaces allocated from the ddx to be considered as height
aligned on 8 and fixes the 1D->2D tiling transition that results from
this.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-12-18 16:10:54 -05:00
Paul Berry
1b37fc40fc i965: Fix gl_VertexID when there are no other vertex inputs.
brw_emit_vertices contains special case logic to handle the case where
a vertex shader doesn't read any inputs.  This special case logic was
incorrectly activating in the case where the only vertex input is
gl_VertexID.  As a result, if a shader used gl_VertexID but used no
other inputs, then all vertices got a gl_VertexID of zero.

Fixes oglconform test "ubo-usage advanced.transform_feedback".

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-18 09:02:53 -08:00
Paul Berry
5b7099c74d mesa: Make a function is_transform_feedback_active_and_unpaused.
The rather unwieldy logic for determining this condition was repeated
in a large number of places.  This patch consolidates it to a single
inline function.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-18 09:02:53 -08:00
Paul Berry
1ad516207d mesa: Fix corner cases of BindBufferBase with transform feedback.
This patch implements the following behaviours, which are mandated by
the GL 4.3 and GLES3 specs.

1. Regarding the GL_TRANSFORM_FEEDBACK_BUFFER_SIZE query: "If the
   ... size was not specified when the buffer object was bound
   (e.g. if it was bound with BindBufferBase), ... zero is returned."
   (GL 4.3 section 6.7.1 "Indexed Buffer Object Limits and Binding
   Queries").

2. "BindBufferBase binds the entire buffer, even when the size of the
   buffer is changed after the binding is established. It is
   equivalent to calling BindBufferRange with offset zero, while size
   is determined by the size of the bound buffer at the time the
   binding is used."  (GL 4.3 section 6.1.1 "Binding Buffer Objects to
   Indexed Targets").  I interpret "at the time the binding is used"
   to mean "at the time of the call to glBeginTransformFeedback".

3. "Regardless of the size specified with BindBufferRange, or
   indirectly with BindBufferBase, the GL will never read or write
   beyond the end of a bound buffer. In some cases this constraint may
   result in visibly different behavior when a buffer overflow would
   otherwise result, such as described for transform feedback
   operations in section 13.2.2."  (GL 4.3 section 6.1.1 "Binding
   Buffer Objects to Indexed Targets").

Item 1 has been part of the spec all the way back to the inception of
the EXT_transform_feedback extension.  Items 2 and 3 were added in GL
4.2 and GLES 3.

Prior to GL 4.2, in place of items 2 and 3, the spec simply said
"BindBufferBase is equivalent to calling BindBufferRange with offset
zero and size equal to the size of buffer."  For transform feedback,
Mesa behaved as though this meant "...equal to the size of buffer at
the time of the call to BindBufferBase".  However, this was
problematic because it left it ambiguous what to do if the buffer is
shrunk between the call to BindBuffer{Base,Range} and the call to
BeginTransformFeedback.  Prior to this patch, Mesa's behaviour was to
try to write beyond the end of the buffer, likely resulting in memory
corruption.  In light of this, I'm interpreting the spec change as a
clarification, not an intended behavioural change, so I'm making the
change apply regardless of API version.
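
The resulting rule can be sketched like this, under the assumption
that a binding records an offset and a requested size, with zero
meaning "bound with BindBufferBase". The struct and function names are
hypothetical; the point is only that the usable size is computed from
the buffer's size at the time of BeginTransformFeedback and never
extends past the end of the buffer.

```c
#include <stddef.h>

/* Hypothetical binding record; requested_size == 0 means the buffer
 * was bound with BindBufferBase. */
struct xfb_binding_sketch {
   size_t offset;
   size_t requested_size;
};

/* Size usable for transform feedback, evaluated when the binding is
 * used rather than when it was established. */
static size_t
xfb_effective_size(const struct xfb_binding_sketch *b,
                   size_t buffer_size_now)
{
   size_t available;

   if (b->offset >= buffer_size_now)
      return 0;

   available = buffer_size_now - b->offset;
   if (b->requested_size == 0 || b->requested_size > available)
      return available;
   return b->requested_size;
}
```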

Fixes GLES3 conformance test transform_feedback2_pause_resume.test.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-12-18 09:02:49 -08:00
Paul Berry
b87e65c3b6 mesa/gles3: Generate error on draw call if transform feedback would overflow.
In desktop GL, if a draw call would cause transform feedback buffers
to overflow, the draw call should succeed, and the extra primitives
should simply not be recorded in the transform feedback buffers.

In GLES3, however, if a draw call would cause transform feedback
buffers to overflow, the draw call is supposed to produce an
INVALID_OPERATION error and no drawing should occur.

This patch implements the GLES3-required behaviour.
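
The difference between the two APIs can be sketched as below. The
state struct, the function name, and the idea of tracking a count of
remaining primitives are assumptions made for this illustration; only
the error value 0x0502 (GL_INVALID_OPERATION) is taken from the GL
headers.

```c
#include <stdbool.h>

#define SKETCH_NO_ERROR           0
#define SKETCH_INVALID_OPERATION  0x0502   /* GL_INVALID_OPERATION */

/* Hypothetical per-context transform feedback bookkeeping. */
struct xfb_state_sketch {
   bool is_gles3;
   unsigned remaining_prims;   /* primitives that still fit */
};

/* Returns the GL error the draw call would raise, if any. */
static unsigned
validate_draw_sketch(struct xfb_state_sketch *xfb, unsigned prims_in_draw)
{
   if (prims_in_draw > xfb->remaining_prims) {
      if (xfb->is_gles3)
         return SKETCH_INVALID_OPERATION;   /* nothing is drawn */

      /* Desktop GL: the draw succeeds, extra primitives are dropped. */
      xfb->remaining_prims = 0;
      return SKETCH_NO_ERROR;
   }

   xfb->remaining_prims -= prims_in_draw;
   return SKETCH_NO_ERROR;
}
```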

Fixes GLES3 conformance test "transform_feedback_overflow.test".

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-12-18 08:31:38 -08:00
Paul Berry
febc237141 mesa/gles3: Generate error on DrawElements* calls if transform feedback active.
In GLES3, only glDrawArrays() and glDrawArraysInstanced() calls are
allowed when transform feedback is active.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-12-18 08:31:34 -08:00
Paul Berry
3870f2903f mesa: refactor _mesa_compute_max_transform_feedback_vertices from i965.
Previously, the i965 driver contained code to compute the maximum
number of vertices that could be written without overflowing any
transform feedback buffers.  This code wasn't driver-specific, and for
GLES3 support we're going to need to use it in core mesa.  So this
patch moves the code into a core mesa function,
_mesa_compute_max_transform_feedback_vertices().
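
A simplified sketch of that computation, assuming each bound buffer is
described by its usable size in bytes and its per-vertex stride in
dwords. The struct and function names here are hypothetical and the
real function works on Mesa's own transform feedback structures.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical description of one bound transform feedback buffer. */
struct xfb_buffer_sketch {
   size_t size_bytes;        /* usable size of the binding */
   unsigned stride_dwords;   /* dwords written per vertex, 0 if unused */
};

/* Largest vertex count that fits in every buffer that is written. */
static unsigned
max_xfb_vertices_sketch(const struct xfb_buffer_sketch *bufs,
                        unsigned count)
{
   unsigned max_vertices = UINT_MAX;
   unsigned i;

   for (i = 0; i < count; i++) {
      size_t stride_bytes = 4 * (size_t)bufs[i].stride_dwords;
      size_t fit;

      if (stride_bytes == 0)
         continue;   /* nothing is written to this buffer */

      fit = bufs[i].size_bytes / stride_bytes;
      if (fit < max_vertices)
         max_vertices = (unsigned)fit;
   }
   return max_vertices;
}
```

The most constrained buffer decides the result, so a single small
buffer limits the whole draw.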

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

v2: Eliminate C++-style variable declarations, since these won't work
with MSVC.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-18 08:31:26 -08:00
Paul Berry
61c1b065fb mesa: Change args to vbo_count_tessellated_primitives.
No functional change--this simply paves the way to allow future
patches to call vbo_count_tessellated_primitives() during error
checking, before the _mesa_prim struct has been constructed.

This will be needed for GLES3, which requires draw calls to fail if
there is not enough space available in transform feedback buffers to
accommodate the primitives to be drawn.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-18 08:31:03 -08:00
Vadim Girlin
8cf552b182 radeon/llvm: improve cube map handling
Add support for TEX2, TXB2, TXL2, fix SHADOWCUBE

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2012-12-18 17:40:57 +04:00
Vadim Girlin
3b89fcbe54 radeon/llvm: fix TXQ_LZ handling for cube maps
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2012-12-18 17:40:57 +04:00
Vadim Girlin
63cabf0abb r600g: initialize inst_mod in r600_tex_from_byte_stream
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2012-12-18 17:40:57 +04:00
Roland Scheidegger
dc613f11dd gallivm: fix conversion for pure integer formats
Since the idea is to just expand or shrink the bit width but not otherwise do
conversion we also need to adjust the sign bit according to src, otherwise
the conversion code will incorrectly clamp the values. (Since this only works
for casting to ordinary floats the norm and fixed bits should always be fine.)

This fixes the remaining piglit attribs GL3 failures.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-12-18 01:57:35 +01:00
Kenneth Graunke
12f3b3d437 glsl: Fix gl_context vs. ralloc context in check_version again, again.
Dave found some, but there were more.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58039
2012-12-17 11:20:53 -08:00
Andreas Pokorny
fd65fb5aa8 vega: fix for object handle leak
Frees the object handle when an OpenVG object
is destroyed.

Signed-off-by: Andreas Pokorny <andreas.pokorny@elektrobit.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-12-17 10:22:26 -07:00
Brian Paul
9b13e731fa wmesa: include version.h to silence warning 2012-12-17 10:22:22 -07:00
Brian Paul
a9048aa6e6 xlib: include headers to fix errors/warnings 2012-12-17 10:22:10 -07:00
Jordan Justen
6cf3034ba7 mesa osmesa/x11: fix build error introduced in 4bea4cb9
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=58380

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-17 08:55:41 -08:00
Roland Scheidegger
3d14b25030 gallivm: fix texel fetch for array textures (2)
a460aea3f1 wasn't entirely correct,
since all coords are already ints hence need to skip the iround.
Passes piglit texelFetch with sampler1DArray/sampler2DArray.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2012-12-17 11:50:27 +01:00
Jordan Justen
1358f3a905 mesa: assert if driver did not compute the version
Make sure drivers initialize the version before:
 * _mesa_initialize_exec_table is called
 * _mesa_initialize_exec_table_vbo is called
 * A context is made current

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-16 15:30:28 -08:00
Jordan Justen
075f8722ab mesa: don't initialize VBO vtxfmt in _vbo_CreateContext
The driver should call _mesa_initialize_vbo_vtxfmt after
computing the context version.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-16 15:30:28 -08:00
Jordan Justen
53ee3959f2 mesa: don't initialize exec dispatch tables in _mesa_initialize_context
Drivers must compute the context version, and then call
_mesa_initialize_exec_table themselves.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-16 15:30:27 -08:00
Jordan Justen
d5d1f10955 mesa dispatch_sanity: call new functions to initialize exec table
In a future patch the exec functions will no longer be set up
by _mesa_initialize_context and _vbo_CreateContext.

Therefore we must call _mesa_initialize_exec_table and
_mesa_initialize_exec_table_vbo.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-16 15:30:27 -08:00
Jordan Justen
4bea4cb9fd drivers: compute version and then initialize exec table
This change forces the context version to be computed before
initializing the exec dispatch tables.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-16 15:30:27 -08:00
Jordan Justen
0924f4e90c vbo: add _mesa_initialize_vbo_vtxfmt
This function initializes the exec/save dispatch tables
for VBO vtxfmt.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-16 15:30:27 -08:00
Jordan Justen
d440149538 mesa: separate exec allocation from initialization
In glapi/gl_genexec.py:
* Remove _mesa_alloc_dispatch_table call

In glapi/gl_genexec.py and api_exec.h:
* Rename _mesa_create_exec_table to _mesa_initialize_exec_table

In context.c:
* Call _mesa_alloc_dispatch_table instead of _mesa_create_exec_table
* Call _mesa_initialize_exec_table (this is temporary)

Once all drivers have been modified to call
_mesa_initialize_exec_table, then the temporary call to
_mesa_initialize_exec_table can be removed from context.c.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-16 15:30:27 -08:00
Dave Airlie
fa5078c255 r600g: fixup offset types for printing
This allows the debug code to at least show the sign properly.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-16 10:36:42 +00:00
Henri Verbeet
cf358a2b42 gallium/u_blitter: Remove the overlapped blit assert from util_blitter_blit_generic().
This is used by st_BlitFramebuffer() / r600_blit(), and ARB_fbo allows
overlapped blits, even though the result is undefined. No piglit regressions
on r600g / CYPRESS.

Signed-off-by: Henri Verbeet <hverbeet@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-12-16 11:13:20 +01:00
Dave Airlie
a9abaaafd8 glsl_parser_extras.cpp: fixup gl vs mem contexts again.
This should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=58039

Tested-by: Darxus on bug 58039
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-16 17:30:08 +10:00
Kenneth Graunke
4f91f8dd60 i965: Move BRW_MAX_GRF and similar defines to brw_reg.h.
These don't really belong in brw_structs.h.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-15 13:40:16 -08:00
Kenneth Graunke
1db1283563 i965: Split struct brw_reg out from brw_eu.h into its own header.
struct brw_instruction and the related instruction emitting code won't
be useful on Gen8+, as the instruction encoding changed.  However, the
struct brw_reg code is still extremely valuable.

While we're at it, fix up some style points:
- s/GLuint/unsigned/g
- s/GLint/int/g
- s/GLshort/int16_t/g
- s/GLushort/uint16_t/g
- s/INLINE/inline/g
- Replace tabs with spaces
- Put return types on a separate line from the function name/parameters
- Remove trailing whitespace
- Remove extraneous whitespace around function parameters

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-15 13:40:09 -08:00
Dave Airlie
e1ca88f098 docs: add ARB_texture_buffer_object_rgb32
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-16 07:07:43 +10:00
Dave Airlie
39fa4c0a58 st/mesa: add texture buffer object rgb32 support.
This checks if the pipe driver can support RGB32 formats.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-16 06:55:39 +10:00
Dave Airlie
1b62c326ea mesa: add support for ARB_texture_buffer_object_rgb32
This adds the extensions + the tex buffer support for checking
the formats.

There is a piglit test enhancement sent to that list.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-16 06:55:33 +10:00
Dave Airlie
7d7a549fa0 glsl: avoid using gl context as a memory context
Not sure what was going on here, but running piglit with debug builds
might be a good plan :-)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-15 15:29:49 +10:00
Ian Romanick
b23e92dbe7 i965: Add missing autoconf bits so test_vec4_register_coalesce will build
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Eric Anholt <eric@anholt.net>
2012-12-14 18:44:18 -08:00
Eric Anholt
c9e48e5b08 i965: Generalize VS compute-to-MRF for compute-to-another-GRF, too.
No statistically significant performance difference on glbenchmark 2.7
(n=60).  It reduces cycles spent in the vertex shader by 3.3% +/- 0.8%
(n=5), but that's only about .3% of all cycles spent according to the
fixed shader_time.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 16:06:35 -08:00
Eric Anholt
471af25fc5 i965/vs: Extend opt_compute_to_mrf to handle limited "reswizzling"
The way our visitor works, scalar expression/swizzle results that get
stored in channels other than .x will have an intermediate MOV from
their result in the .x channel to the real .y (or whatever) channel, and
similarly for vec2/vec3 results.

By knowing how to adjust DP4-type instructions for optimizing out a
swizzled MOV, we can reduce instructions in common matrix multiplication
cases.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 16:06:30 -08:00
Eric Anholt
a76a03f437 i965/vs: Add a unit test for opt_compute_to_mrf().
The compute-to-mrf code is really twitchy, and it's hard to construct
GLSL testcases for it.  This unit test is also really hard to work with
(for example, if your instruction is removed by dead code elimination,
you end up inspecting something irrelevant), but I did use it for
debugging some of the commits to follow.

I called it test_vec4_register_coalesce because the compute-to-mrf code
is about to morph into that.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 16:06:01 -08:00
Eric Anholt
7171c45d3a i965/fs: Drop an unnecessary _safe on a list walk.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 16:05:57 -08:00
Eric Anholt
78ce522932 i965/fs: Add a note explaining a detail of register_coalesce_2().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 16:05:48 -08:00
Eric Anholt
7baf9198b2 i965: Also consider HALTs a potential block end.
The final halt of the fragment shader turns off the remaining channels,
then jumps such that everything is turned back on.  So, we can have our
last ENDIF of the shader point at that directly.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 15:45:26 -08:00
Kenneth Graunke
2702202290 i965: Jump to the end of the next outer conditional block on ENDIFs.
From the Ivybridge PRM, Volume 4, Part 3, section 6.24 (page 172):

"The endif instruction is also used to hop out of nested conditionals by
 jumping to the end of the next outer conditional block when all
 channels are disabled."

Also:
"Pseudocode:
 Evaluate(WrEn);
 if ( WrEn == 0 ) {  // all channels false
   Jump(IP + JIP);
 }"

First, ENDIF re-enables any channels that were disabled because they
didn't match the conditional.  If any channels are active, it proceeds
to the next instruction (IP + 16).  However, if they're all disabled,
there's no point in walking through all of the instructions that have no
effect---it can jump to the next instruction that might re-enable some
channels (an ELSE, ENDIF, or WHILE).

Previously, we always set JIP on ENDIF instructions to 2 (which is
measured in 8-byte units).  This made it do Jump(IP + 16), which just
meant it would go to the next instruction even if all channels were off.

It turns out that walking over instructions while all the channels are
disabled like this is worse than just instruction dispatch overhead: if
there are texturing messages, it still costs a couple hundred cycles to
not-actually-read from the texture results.

This patch finds the next instruction that could re-enable channels and
sets JIP accordingly.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-14 15:42:34 -08:00
Chris Forbes
2f7f095a80 i965: expose ARB_texture_cube_map_array
V3: Put enable in an existing block rather than making a new
one for no good reason.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 15:26:51 -08:00
Eric Anholt
380fc562b3 i965/fs: Fix setup for textureGrad(samplerCubeArray, coord, dPdx, dPdy)
Caught by tex_grad-01.frag.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 15:26:48 -08:00
Eric Anholt
3c56063354 i965/fs: Move the failure for gen7 16-wide intdiv to emit_math().
The cube map array code adds another caller of emit_math(), which
needs this check.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 15:26:43 -08:00
Chris Forbes
d2dbba8755 i965: fs: Add fixup for textureSize on Gen6/7
V2: Moved up into emit(ir_texture *) to avoid duplication and fix
ordering for Gen7; Gen6 math quirks moved into previous patches.

Tested on Gen6 only; passes all the cube_map_array piglits.

V3: Fixed weird whitespace
V4: Use sampler->type; otherwise broken on arrays of samplers.
v5: Minor style fixes (by anholt)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 15:26:39 -08:00
Chris Forbes
6e34723ac9 i965: fs: fix gen6+ math operands in one place
V4: Fix various style nits as pointed out by Eric, and expand IMM
    operands on both Gen6 and Gen7.
v5: minor style nits (by anholt)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 15:26:35 -08:00
Chris Forbes
f6a3fda25d i965: vs: Add fixup for textureSize with cube array samplers
V3: Fixed weird whitespace
V4: Use sampler's type rather than variable's type; otherwise broken
    with arrays of samplers. (Thanks Eric)
v5: Fix a couple more style nits (by anholt)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 15:26:31 -08:00
Chris Forbes
1cb57ea493 i965/vs: Fix gen6+ math operand quirks in one place
This causes immediate values to get moved to a temp on gen7, which is needed
for an upcoming change but hadn't happened in the visitor until then.

v2: Drop gen > 7 checks (doesn't exist), and style-fix comments (changes by
    anholt).

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 15:26:28 -08:00
Chris Forbes
0cda3382a6 i965: Add various plumbing for cubemap arrays
V4: Fixed style nits

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 15:26:12 -08:00
Eric Anholt
2cae9f2d4a i965/fs: Add empirically-determined instruction latencies for gen7.
v2: Actually switch on the other math instructions mentioned in the
    comment.
v3: Add timing data for textureSize(), and clean up some long comment
    lines.

Testing shader_time of fs16 shaders on a few frames of various apps:
nexuiz improved by 2.9% +/- 1.5% (n=10)
no difference on GLB2.5 (n=36, outliers removed)
no difference on GLB2.7 (n=25)
etqw improved by 2.6% +/- 2.2% (n=25)
no difference on lightsmark (n=25)

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 15:18:22 -08:00
Eric Anholt
4df1e18864 i965/fs: Fix the clock increment in scheduling.
I've tested this to be true with various ALU ops on gen7 (with the
exception of MADs, which go at either 3 or 4 cycles per dispatch).

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 15:18:14 -08:00
Eric Anholt
6255fc7426 i965/fs: Move the old gen4 bspec-based scheduling info to a helper func.
For gen7 everything changes, and we have actual information on latency.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 15:18:10 -08:00
Eric Anholt
461a29783a i965/fs: Set up gen7 UBO loads as sends from GRFs.
This gives the instruction scheduler a chance to schedule between the
loads, whereas before it was restricted due to the dependencies between
the MRFs for setting them up.

For one shader in gles3conform, it goes from getting stuck in register
allocation for as long as anybody's bothered to leave it running down
to 23 seconds, thanks to the LIFO scheduling.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 15:18:05 -08:00
Eric Anholt
456dbcc337 i965/fs: Before reg alloc, schedule instructions to reduce live ranges.
This came from an idea by Ben Segovia.  16-wide pixel shaders are very
important for latency hiding on i965, so we want to try really hard to
get them.  If scheduling an instruction makes some set of instructions
available, those are probably the ones that make the instruction's
result dead.  By choosing those first, we'll have a tendency to reduce
the amount of live data as opposed to creating more.

Previously, we were sometimes getting this behavior out of the
scheduler, which was what produced the scheduler's original performance
wins on lightsmark.  Unfortunately, that was mostly an accident of the
lame instruction latency information that I had, which made it
impossible to fix the actual scheduling for performance.  Now that we've
fixed the scheduling for setup for register allocation, we can safely
update the latency parameters for the final schedule.

In shader-db, we lose 37 16-wide shaders, but gain 90 new ones.  4
shaders that were spilling change how many registers spill, for a
reduction of 70/3899 instructions.

v2: Simplify the new loop.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 15:17:59 -08:00
Eric Anholt
ba864bfcfa i965/fs: Add some optional debug printfs to scheduling.
Seeing when instructions become available to schedule is really useful.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 15:17:55 -08:00
Eric Anholt
7a9f940cab i965/fs: Schedule instructions both before and after register allocation.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-14 15:17:41 -08:00
Eric Anholt
1315f3b4b3 i965: Make sure that the shader_time report at context destroy happens.
Otherwise, you end up with some report from within a second of context
destroy, which is not what you really want for testing the impact of
changes.
2012-12-14 15:05:10 -08:00
Eric Anholt
81c247404a i965: Print a total time for the different shader stages.
Sometimes I've got a patch for a performance optimization that's not
showing a statistically significant performance difference on reported
FPS, but still seems like a good idea because it ought to reduce time
spent in the shader.  If I can see the total number of cycles spent in
the shader stage being optimized, it may show that the patch is still
worthwhile (or point out that it's actually broken in some way).
2012-12-14 15:05:10 -08:00
Eric Anholt
f74560f3fb i965: Scale shader_time to compensate for resets.
Some shaders experience resets more than others, which skews the numbers
reported.  Attempt to correct for this by linearly scaling according to
the number of resets that happen.

Note that this will not be accurate if invocations of shaders have varying
times and longer invocations are more likely to reset.  However, this
should at least be better than the previous situation.
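
One plausible form of the linear scaling, as a sketch only: the function
name and the exact formula are assumptions, not taken from the patch.
It treats each reset as a lost invocation that would have cost the same
as the average recorded one:

```c
#include <assert.h>
#include <stdint.h>

/* time:    cycles accumulated from invocations that were recorded
 * written: number of invocations that recorded a time delta
 * reset:   number of invocations lost to a timestamp reset
 *
 * Callers are assumed to skip shaders with no recorded invocations. */
static double scale_shader_time(uint64_t time, uint64_t written, uint64_t reset)
{
    assert(written > 0);
    return (double)time * (double)(written + reset) / (double)written;
}
```

For example, 1000 cycles over 8 recorded invocations with 2 resets
scales to 1250 cycles.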
2012-12-14 15:05:10 -08:00
Eric Anholt
338b5f887d i965: Adjust the split between shader_time_end() and shader_time_write().
I'm about to emit other kinds of writes besides time deltas, and it
turns out with the frequency of resets, we couldn't really use the old
time delta write() function more than once in a shader.
2012-12-14 15:05:10 -08:00
Paul Berry
ca7e891e8a glsl/linker: Pack between varyings.
This patch implements varying packing between varyings.

Previously, each varying occupied components 0 through N-1 of its
assigned varying slot, so there was no way to pack two varyings into
the same slot.  For example, if the varyings were a float, a vec2, a
vec3, and another vec2, they would be stored as follows:

 <----slot1----> <----slot2----> <----slot3----> <----slot4---->  slots
  *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *
 flt  x   x   x  <vec2->  x   x  <--vec3--->  x  <vec2->  x   x   varyings

(Each * represents a varying component, and the "x"s represent wasted
space).

This change packs the varyings together to eliminate wasted space
between varyings, like so:

 <----slot1----> <----slot2----> <----slot3----> <----slot4---->  slots
  *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *
 <vec2-> <vec2-> flt <--vec3--->  x   x   x   x   x   x   x   x   varyings

Note that we take advantage of the sort order introduced in previous
patches (vec4's first, then vec2's, then scalars, then vec3's) to
minimize how often a varying is "double parked" (split across varying
slots).
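
A sketch of the packed assignment, assuming the varyings are already in
the sorted order described above.  The struct and function names are
hypothetical; location and location_frac mirror the ir_variable fields
from the related patches but this is not the linker's code:

```c
#include <assert.h>
#include <stddef.h>

struct varying {
    unsigned components;    /* 1 for float, 2 for vec2, ... */
    unsigned location;      /* varying slot index */
    unsigned location_frac; /* component offset within the slot */
};

/* Assign back-to-back component offsets; returns the slots used. */
static unsigned pack_varyings(struct varying *v, size_t count)
{
    unsigned offset = 0; /* running offset, in components */

    for (size_t i = 0; i < count; i++) {
        assert(v[i].components >= 1 && v[i].components <= 4);
        v[i].location = offset / 4;
        v[i].location_frac = offset % 4;
        offset += v[i].components;
    }
    return (offset + 3) / 4;
}
```

For the vec2, vec2, float, vec3 example this uses 2 slots rather than 4,
matching the second diagram.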

Reviewed-by: Eric Anholt <eric@anholt.net>

v2: Skip varying packing if ctx->Const.DisableVaryingPacking is true.
2012-12-14 10:51:21 -08:00
Paul Berry
df87722bec glsl/linker: Pack within compound varyings.
This patch implements varying packing within varyings that are
composed of multiple vectors of size less than 4 (e.g. arrays of
vec2's, or matrices with height less than 4).

Previously, such varyings used up a full 4-wide varying slot for each
constituent vector, meaning that some of the components of each
varying slot went unused.  For example, a mat4x3 would be stored as
follows:

 <----slot1----> <----slot2----> <----slot3----> <----slot4---->  slots
  *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *
 <-column1->  x  <-column2->  x  <-column3->  x  <-column4->  x   matrix

(Each * represents a varying component, and the "x"s represent wasted
space).  In addition to wasting precious varying components, this
layout complicated transform feedback, since the constituents of the
varying are expected to be output to the transform feedback buffer
contiguously (e.g. without gaps between the columns, in the case of a
matrix).

This change packs the constituents of each varying together so that
all wasted space is at the end.  For the mat4x3 example, this looks
like so:

 <----slot1----> <----slot2----> <----slot3----> <----slot4---->  slots
  *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *
 <-column1-> <-column2-> <-column3-> <-column4->  x   x   x   x   matrix

Note that matrix columns 2 and 3 now cross a boundary between varying
slots (a characteristic I call "double parking" of a varying).
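
The double-parking condition can be written as a small check.  This is
an illustrative sketch with a hypothetical helper name, using component
offsets counted from the start of the varying:

```c
#include <assert.h>
#include <stdbool.h>

/* True if a vector of `size` components starting at component `offset`
 * straddles a boundary between two 4-component varying slots. */
static bool is_double_parked(unsigned offset, unsigned size)
{
    assert(size >= 1 && size <= 4);
    return offset / 4 != (offset + size - 1) / 4;
}
```

For the packed mat4x3 the columns start at offsets 0, 3, 6, and 9, so
only the second and third columns are double parked.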

We don't bother trying to eliminate the wasted space at the end of the
varying, since the patch that follows will take care of that.

Since compiler back-ends don't (yet) support this packed layout, the
lower_packed_varyings function is used to rewrite the shader into a
form where each varying occupies a full varying slot.  Later, if we
add native back-end support for varying packing, we can make this
lowering pass optional.

Reviewed-by: Eric Anholt <eric@anholt.net>

v2: Skip varying packing if ctx->Const.DisableVaryingPacking is true.
2012-12-14 10:51:18 -08:00
Paul Berry
4bb8661b1b gallium: Disable varying packing on hardware with <=8 texture indirections.
In practice this will disable varying packing on R300, R400, i915g,
and nv30.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-12-14 10:51:10 -08:00
Paul Berry
6ee500cfd2 mesa: Add an option so driver can opt out of varying packing.
On hardware that supports a limited number of texture indirections,
varying packing will consume an extra texture indirection, since ALU
operations are needed in the fragment shader to unpack the varyings
before any texturing can be done.

This patch introduces a new driver option,
ctx->Const.DisableVaryingPacking, which can be used by a driver to opt
out of varying packing if the extra texture indirection is costly
enough to outweigh the advantages of packing varyings.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-12-14 10:49:32 -08:00
Paul Berry
1745a4d751 glsl: Add a lowering pass for packing varyings.
This lowering pass generates GLSL code that manually packs varyings
into vec4 slots, for the benefit of back-ends that don't support
packed varyings natively.

No functional change--the lowering pass is not yet used.

Reviewed-by: Eric Anholt <eric@anholt.net>

v2: Don't use ir_hierarchical_visitor--just loop over instructions
directly.  Also, make the names of the packed varyings include the
names of the original varyings that were packed into them.
2012-12-14 10:49:21 -08:00
Paul Berry
f3993107f0 glsl/linker: Sort varyings by packing class, then vector size.
This patch paves the way for varying packing by adding a sorting step
before varying assignment, which sorts the varyings into an order that
increases the likelihood of being able to find an efficient packing.

First, varyings are sorted into "packing classes" by considering
attributes that can't be mixed during varying packing--at the moment
this includes base type (float/int/uint/bool) and interpolation mode
(smooth/noperspective/flat/centroid), though later we will hopefully
be able to relax some of these restrictions.  The number of packing
classes places an upper limit on the amount of space that must be
wasted by varying packing, since in theory a shader might have 4n+1
components worth of varyings in each of m packing classes, resulting
in 3m components worth of wasted space.

Then, within each packing class, varyings are sorted by vector size,
with vec4's coming first, then vec2's, then scalars, and then finally
vec3's.  The motivation for this order is that it ensures that the
only vectors that might be "double parked" (with part of the vector in
one varying slot and the remainder in another) are vec3's.
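
The ordering can be sketched as a qsort() comparator.  The struct and
its integer packing_class field are hypothetical stand-ins for whatever
the linker actually stores:

```c
#include <assert.h>
#include <stdlib.h>

struct varying_info {
    int packing_class; /* opaque id: base type + interpolation mode */
    int vector_size;   /* 1..4 components */
};

/* vec4 first, then vec2, then scalars, then vec3. */
static int size_rank(int vector_size)
{
    switch (vector_size) {
    case 4: return 0;
    case 2: return 1;
    case 1: return 2;
    case 3: return 3;
    default:
        assert(!"vector size must be 1..4");
        return 4;
    }
}

static int compare_varyings(const void *a, const void *b)
{
    const struct varying_info *x = a;
    const struct varying_info *y = b;

    if (x->packing_class != y->packing_class)
        return x->packing_class < y->packing_class ? -1 : 1;
    return size_rank(x->vector_size) - size_rank(y->vector_size);
}
```

Within a class, vec4's and vec2's always leave the running offset even,
so a vec2 never crosses a slot boundary and a scalar cannot; that leaves
vec3's as the only candidates for double parking.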

Note that the varyings aren't actually packed yet, merely placed in an
order that will facilitate packing.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-14 10:49:12 -08:00
Paul Berry
eb989e37cb glsl/linker: Subdivide the first phase of varying assignment.
This patch further subdivides the loop that assigns varying locations
into two phases: one phase to match up the varyings between shader
stages, and one phase to assign them varying locations.

In between the two phases the matched varyings are stored in a new
data structure called varying_matches.  This will free us to be able
to assign varying locations in any order, which will pave the way for
packing varyings.

Note that the new varying_matches::assign_locations() function returns
the number of varying slots that were used; this return value will be
used in a future patch.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-14 10:49:08 -08:00
Paul Berry
25ed3bef9b glsl/linker: Defer recording transform feedback locations.
This patch subdivides the loop that assigns varying locations into two
phases: one phase to match up varyings between shader stages (and
assign them varying locations), and a second phase to record the
varying assignments for use by transform feedback.

This paves the way for varying packing, which will require us to
further subdivide the first phase.

In addition, it lets us avoid a clumsy O(n^2) algorithm, since we can
now record the locations of all transform feedback varyings in a
single pass through the tfeedback_decls array, rather than have to
iterate through the array after assigning each varying.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-14 10:49:05 -08:00
Paul Berry
3e81c666db glsl: Create a field to store fractional varying locations.
Currently, the location of each varying is recorded in ir_variable as
a multiple of the size of a vec4.  In order to pack varyings, we need
to be able to record, e.g. that a vec2 is stored in the second half of
a varying slot rather than the first half.

This patch introduces a field ir_variable::location_frac, which
represents the offset within a vec4 where a varying's value is stored.
Varyings that are not subject to packing will always have a
location_frac value of zero.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-14 10:48:52 -08:00
Paul Berry
3c9c17db4a glsl/linker: Make separate ir_variable field to mean "unmatched".
Previously, the linker used a value of -1 in ir_variable::location to
denote a generic input or output of the shader that had not yet been
matched up to a variable in another pipeline stage.

This patch introduces a new ir_variable field,
is_unmatched_generic_inout, for that purpose.

In future patches, this will allow us to separate the process of
matching varyings between shader stages from the processes of
assigning locations to those varyings.  That will in turn pave the way
for packing varyings.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-14 10:48:38 -08:00
Paul Berry
50895d443a glsl/linker: Always invalidate shader ins/outs, even in corner cases.
Previously, link_invalidate_variable_locations() was only called
during assign_attribute_or_color_locations() and
assign_varying_locations().  This meant that in the corner case when
there was only a vertex shader, and varyings were being captured by
transform feedback, link_invalidate_variable_locations() wasn't being
called for the varyings.

This patch migrates the calls to link_invalidate_variable_locations()
to link_shaders(), so that they will be called in all circumstances.
In addition, it modifies the call semantics so that
link_invalidate_variable_locations() need only be called once per
shader stage (rather than once for inputs and once for outputs).

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-14 10:48:35 -08:00
Paul Berry
18392443d4 glsl/lower_clip_distance: Update symbol table.
This patch modifies the clip distance lowering pass so that the new
symbol it generates (glClipDistanceMESA) is added to the shader's
symbol table.

This will allow a later patch to modify the linker so that it finds
transform feedback varyings using the symbol table rather than having
to iterate through all the declarations in the shader.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-14 10:48:28 -08:00
Tapani Pälli
d249159fe6 android: build fix for libmesa_glsl_utils
hash_table.c compilation requires ralloc.h include path

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-12-14 10:01:45 -08:00
Brian Paul
a12a8c910f mesa: minor indentation fixes in texcompress_etc.c 2012-12-14 06:33:08 -07:00
Brian Paul
b29f2d5ff5 mesa: remove old swrast-based compressed texel fetch code 2012-12-14 06:33:08 -07:00
Brian Paul
7dc36a50de swrast: use new core Mesa compressed texel fetch functions 2012-12-14 06:33:08 -07:00
Brian Paul
faa95fd7fa mesa: reimplement _mesa_decompress_image() using new tex fetch code 2012-12-14 06:33:08 -07:00
Brian Paul
ccbe7db1e6 mesa: added _mesa_get_compressed_fetch_func() 2012-12-14 06:33:08 -07:00
Brian Paul
ad3e39bb6d mesa: add new texel fetch code for etc formats 2012-12-14 06:33:07 -07:00
Brian Paul
cd7baf5bf4 mesa: add new texel fetch code for rgtc formats 2012-12-14 06:33:07 -07:00
Brian Paul
141d299965 mesa: add new texel fetch code for fxt formats 2012-12-14 06:33:07 -07:00
Brian Paul
a774eaa57e mesa: add new texel fetch code for dxt formats 2012-12-14 06:33:07 -07:00
Brian Paul
2037a06da9 mesa: add compressed_fetch_func typedef
This is a first step in removing the swrast-related code in core
Mesa's texture compression files.
2012-12-14 06:33:07 -07:00
Brian Paul
90b7797a1d swrast: merge get_texel_fetch_func() and set_fetch_functions()
No real need for separate functions anymore.
2012-12-14 06:33:07 -07:00
Brian Paul
f4896cea04 swrast: make _mesa_get_texel_fetch_func() static
Not called from any other file.
2012-12-14 06:33:07 -07:00
Dave Airlie
9e41b0badb draw/llvmpipe: fix transform feedback position + enable other extensions
This builds on the previous draw/softpipe patch.

llvmpipe makes its streamout calls after the clip/viewport stages, but
the pre-clip position is stored for later use.  So when we are doing
transform feedback and the output is the position, grab the vertex from
the stored pre-clip position.

The perfect fix is probably to add a codegen transform feedback
stage in between shader and clip stages, but this is good enough
for now.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-14 11:34:40 +10:00
Dave Airlie
55d37eb40e draw: add support for later transform feedback extensions
This adds support to draw for the new features of transform feedback.

a) fix count_from_stream_output, using max_index+1 for now, but it looks
like it should be valid as it's derived from the vertex elements/vbo.

b) fix striding and dst offsets in output buffers - was just wrong before.

c) fix crash if tfb is suspended (so.num_targets == 0)

This also enables the new features on softpipe. It should be possible
to enable them on llvmpipe as well after this commit, but would need
to schedule piglit runs.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-14 11:34:15 +10:00
Tom Stellard
4330cfec8b clover: Fix build since removal of pipe_surface::usage
by commit 25409c6da8
2012-12-13 20:04:34 +00:00
Maxence Le Dore
6d7d821e3d r600g/radeonsi: Silence warnings
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-12-13 19:40:28 +00:00
Tom Stellard
c68babfc3c clover: Add support for compiler flags
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2012-12-13 19:22:44 +00:00
Tom Stellard
7f71efcf7a clover: Don't erase build info of devices not being built
Every call to _cl_program::build() was erasing the binaries and logs for
every device associated with the program.  This is incorrect because
it is possible to build a program for only a subset of devices and so
any device not being built should not have this information erased.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2012-12-13 19:22:35 +00:00
Vincent Lejeune
c7f9fb37ea r600g: use load_ar checks with llvm output.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-12-13 19:22:10 +00:00
Thierry Reding
60e05d7388 build: Fix AX_PROG_{CC,CXX}_FOR_BUILD macros
Override the cross_compiling and ac_tool_prefix variables by reassigning
to them instead of redefining the macros. Redefining them will actually
cause the variable names to be replaced instead of their content.

Furthermore push the definition of CPPFLAGS before running the checks
for the build tools to avoid the host CPPFLAGS from leaking into the
build CPPFLAGS.

While at it drop the redefinition of AC_TRY_COMPILER which hasn't been
used since autoconf 2.50 and make sure that all definitions are properly
popped when done (LDFLAGS, ac_cv_prog_CPP, ac_cv_prog_CXXCPP).

Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
2012-12-13 10:58:11 -08:00
Roland Scheidegger
a460aea3f1 gallivm: fix texel fetch for array textures
Since we don't call lp_build_sample_common() in the texel fetch path we missed
the layer fixup code. If someone had tried to do texelFetch with array
textures it would have crashed for sure.
Not really tested (the piglit test that uses texelFetch with array
samplers can't be run on llvmpipe for now).

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-12-13 19:17:09 +01:00
Paul Berry
6267853055 mesa: Fix computation of default vertex attrib stride for 2_10_10_10 formats.
Previously, if the client program didn't specify a stride when setting
up a vertex attribute, we used _mesa_sizeof_type() to compute the size
of the type, and multiplied it by the number of components.

This didn't work for the 2_10_10_10 formats, since _mesa_sizeof_type()
returns -1 for those types, resulting in all kinds of havoc, since it
was causing the hardware to be programmed with a negative stride
value.

This patch adds a new function _mesa_bytes_per_vertex_attrib(), which
is similar to the existing function _mesa_bytes_per_pixel(), but which
computes the size of a vertex attribute based on the type and the
number of components.  For packed formats (currently only the 2_10_10_10
formats), it verifies that the number of components is correct and
returns the size of the packed format.  For unpacked formats, it
returns the size of the type times the number of components.
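
A sketch of that logic with a hypothetical local enum in place of the GL
type tokens; the real _mesa_bytes_per_vertex_attrib() covers more types
and its signature may differ:

```c
#include <assert.h>

/* Stand-ins for the GL type enums, for illustration only. */
enum attrib_type {
    ATTRIB_BYTE,
    ATTRIB_SHORT,
    ATTRIB_FLOAT,
    ATTRIB_INT_2_10_10_10_REV,
    ATTRIB_UNSIGNED_INT_2_10_10_10_REV
};

/* Size in bytes of one vertex attribute, or -1 if the component count
 * is not valid for a packed type. */
static int bytes_per_vertex_attrib(int comps, enum attrib_type type)
{
    assert(comps >= 1 && comps <= 4);

    switch (type) {
    case ATTRIB_BYTE:
        return comps * 1;
    case ATTRIB_SHORT:
        return comps * 2;
    case ATTRIB_FLOAT:
        return comps * 4;
    case ATTRIB_INT_2_10_10_10_REV:
    case ATTRIB_UNSIGNED_INT_2_10_10_10_REV:
        /* All four components share one 32-bit word. */
        return comps == 4 ? 4 : -1;
    }
    return -1;
}
```

The default stride is then this value directly, so a 2_10_10_10
attribute gets a stride of 4 bytes rather than a negative one.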

In addition, this patch adds an assertion so that if we ever forget to
update _mesa_bytes_per_vertex_attrib() when adding a new vertex
format, we'll see the problem quickly rather than having to debug a
subtle conformance test failure.

Fixes GLES3 conformance tests
vertex_type_2_10_10_10_rev_{conversion,divisor,stride_pointer}.test.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-13 10:09:03 -08:00
Matt Turner
11cea47246 mesa/uniform_query: Don't write to *params if there is an error
The GL 3.1 and ES 3.0 specs say of glGetActiveUniformsiv:
   "If an error occurs, nothing will be written to params."

So, make a pass through the indices and check that they're valid before
the pass that actually writes to params. Checking pname happens on the
first iteration of the second loop.
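
The validate-then-write pattern can be sketched as follows.  The
function and its arguments are hypothetical and the pname check is left
out; only the two-pass structure is the point:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Copy one property per requested uniform into params.  Returns false,
 * leaving params untouched, if any index is out of range. */
static bool get_uniform_sizes(const int *uniform_sizes, size_t num_uniforms,
                              const unsigned *indices, size_t count,
                              int *params)
{
    assert(params != NULL);

    /* Pass 1: validate every index before writing anything. */
    for (size_t i = 0; i < count; i++) {
        if (indices[i] >= num_uniforms)
            return false;
    }

    /* Pass 2: all indices are known good, so writing is safe. */
    for (size_t i = 0; i < count; i++)
        params[i] = uniform_sizes[indices[i]];

    return true;
}
```

With a single loop, a bad index in the second position would leave the
first element of params already overwritten.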

Fixes es3conform's getactiveuniformsiv_for_nonexistent_uniform_indices
test.

NOTE: This is a candidate for the 9.0 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-13 09:53:28 -08:00
Matt Turner
6acabe33a3 mesa: print unsigned values with %u
Otherwise messages say silly things like
   glGetActiveUniformBlockiv(block index -1 >= 0)
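
A small illustration of the difference, using a hypothetical helper and
assuming a 32-bit unsigned int.  An invalid index such as 0xFFFFFFFF
prints as a large positive number with %u, where a signed conversion
would show -1:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format an unsigned block index with the matching conversion. */
static void format_block_index(char *buf, size_t size, unsigned index)
{
    assert(buf != NULL && size > 0);
    snprintf(buf, size, "block index %u", index);
}
```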

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-13 09:53:11 -08:00
Kenneth Graunke
200bb36778 i965: Fix disassembly of jump targets on Gen7.
Gen7 stores the JIP/UIP bits in different places.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-12 22:19:08 -08:00
Kenneth Graunke
c2eb9d3a0a i965: Make try_rewrite_rhs_to_dst compare VGRF size to regs written.
try_rewrite_rhs_to_dst is a quick optimization to avoid generating new
temporaries (and MOVs from those temporaries to the dest) for every
expression tree we visit.  By generating better code in simple cases, we
reduce the burden on later optimization passes like register coalescing.

Previously, we compared inst->regs_written() to lhs->vector_elements
to make sure the instruction generating our value wrote the same number
of components as our destination register.

However, this fails in some cases.  One example is texturing (which
produces a vec4) into gl_FragData[i].  Technically, gl_FragData[i] is
also a vec4.  However, the destination VGRF actually has size 4n (where
n is the size of the array).

split_virtual_grfs() can't split VGRFs that are used by SEND messages
which require contiguous destination registers (like texturing), and
register allocation needs all VGRFs to have sizes between 1 and 4.

Amnesia: The Dark Descent hits this case: a texturing instruction
(4 components) gets rewritten to the gl_FragData output register
(which was 4*3 = 12 components), causing the register allocator to
hit the "we rely on split_virtual_grfs" assertion.
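
The new condition amounts to the check below.  This is a sketch with a
hypothetical helper name; in the driver the two values would come from
the instruction and from the destination's virtual GRF size:

```c
#include <assert.h>
#include <stdbool.h>

/* Only rewrite the instruction's destination when it fills the whole
 * destination VGRF; comparing against the GLSL type's element count is
 * not enough when the VGRF backs an entire array. */
static bool can_rewrite_rhs_to_dst(unsigned regs_written,
                                   unsigned dst_vgrf_size)
{
    assert(regs_written > 0 && dst_vgrf_size > 0);
    return regs_written == dst_vgrf_size;
}
```

In the Amnesia case the texturing instruction writes 4 registers but the
gl_FragData VGRF has size 12, so the rewrite is skipped.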

This makes it possible to play Amnesia.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-12 14:44:37 -08:00
Emil Velikov
1223458764 configure.ac: Disable compiler optimizations when --enable-debug is set
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Dan Nicholson <dbn.lists@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-12-12 14:48:06 -06:00
Brian Paul
e721a76e68 softpipe: remove unused corner0 variable 2012-12-12 08:51:19 -07:00
Brian Paul
8ef27e8fa9 llvmpipe: remove unneeded draw_flush() call
This is redundant since we're calling draw_bind_fragment_shader()
which already does a flush.

v2: the redundant flush in llvmpipe_set_constant_buffer() has
already been removed by commit 3427466e6d

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-12-12 08:45:45 -07:00
Marek Olšák
d225d076a9 r600g: suballocate memory for fetch shaders from a large buffer
Fetch shaders are usually destroyed at the context destruction by the state
tracker, so we can put them all in a large buffer without wasting memory.

This reduces the number of relocations sent to the kernel a little bit.

Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-12 13:12:31 +01:00
Marek Olšák
8df3855eed r600g: suballocate memory for the STRMOUT_BUFFER_FILLED_SIZE register
Instead of having a 4-byte buffer for each streamout target, we suballocate
each dword from a 4K buffer.
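
The idea can be sketched as a bump allocator over one buffer.  The names
here are hypothetical and this is not the gallium/util suballocator API,
which also has to create and reference the backing resource:

```c
#include <assert.h>
#include <stdbool.h>

struct suballocator {
    unsigned size;   /* size of the backing buffer in bytes */
    unsigned offset; /* first free byte */
};

/* Carve `bytes` out of the buffer at the given power-of-two alignment.
 * Returns false when the buffer is exhausted; the caller would then
 * allocate a fresh buffer and reset offset to 0. */
static bool suballoc(struct suballocator *a, unsigned bytes,
                     unsigned alignment, unsigned *out_offset)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    unsigned start = (a->offset + alignment - 1) & ~(alignment - 1);
    if (start + bytes > a->size)
        return false;

    *out_offset = start;
    a->offset = start + bytes;
    return true;
}
```

A 4K buffer holds 1024 dwords, so up to 1024 streamout targets can share
one buffer and therefore one relocation.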

This further reduces the overall number of relocations.

Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-12 13:12:28 +01:00
Marek Olšák
cc2d908572 gallium/util: add a simple allocator for suballocating from a large buffer
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-12 13:12:24 +01:00
Marek Olšák
2478fcd87c r600g: use u_upload_mgr for allocating staging transfer buffers
u_upload_mgr suballocates memory from a large buffer and maps the allocated
range (unsynchronized), which is perfect for short-lived staging buffers.

This reduces the number of relocations sent to the kernel.

Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-12 13:11:52 +01:00
Marek Olšák
448cd5ea60 winsys/radeon: don't use BIND flags, add a flag for the cache bufmgr instead 2012-12-12 13:09:54 +01:00
Marek Olšák
1d0bf69f83 st/dri: add a way to force MSAA on with an environment variable
There are 2 ways. I prefer the former:
  GALLIUM_MSAA=n
  __GL_FSAA_MODE=n

Tested with ETQW, which doesn't support MSAA on Linux. This is
the only way to get MSAA there.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:54 +01:00
Marek Olšák
afa902a705 mesa: don't advertise ARB_texture_buffer_object in legacy contexts
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-12 13:09:54 +01:00
Marek Olšák
0ac83a2001 mesa: disallow creation of GL 3.1 compatibility contexts
Death to driver-specific hacks!

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-12 13:09:54 +01:00
Marek Olšák
25409c6da8 gallium: remove pipe_surface::usage
Not really used by anybody now.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:54 +01:00
Marek Olšák
c1f704073b svga: stop using pipe_surface::usage
There are only 2 possible usages: render target and depth stencil.
Both can be derived from the surface format, so the flag is redundant.

And it's going away...

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:53 +01:00
Marek Olšák
21b1ec69fc gallium/util: move util_try_blit_via_copy_region to u_surface.c
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:53 +01:00
Marek Olšák
3a555637b2 gallium/cso: don't use the pipe_error return type where it's not needed
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:53 +01:00
Marek Olšák
eae9674f18 gallium: manage render condition in cso_context and fix postprocessing w/ it
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:53 +01:00
Marek Olšák
9ec6ffd85d st/mesa: remove a weird msaa hack
It doesn't work and it's not clear how it's supposed to work.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:53 +01:00
Dave Airlie
621259b3de softpipe: implement seamless cubemap support. (v1.1)
This adds seamless sampling for cubemap boundaries if requested.

The corner case averaging is messy but seems like it should be spec
compliant.

The face direction stuff is also a bit messy, I've no idea if that could
or should be simpler, or even if all my directions are fully correct!

v1.1: update comments, drop unneeded seamless calls for nearest, fix
if statement layout.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-12 10:35:05 +10:00
Dave Airlie
3392f2fbcf gallium: fix cap warnings for tbo cap.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-12 07:16:02 +10:00
Dave Airlie
5cdcd7251a glsl_to_tgsi: emit multi-level structs and arrays properly.
This follows the code from the i965 driver, and emits the structs
and arrays recursively.

This fixes an assert in the two UBO tests
fs-struct-copy-complicated and
vs-struct-copy-complicated

These tests now pass on softpipe, with no regressions.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-12 06:57:38 +10:00
Brian Paul
2ee0b44252 llvmpipe: don't use user constant buffers
This fixes some use-after-free issues.  I haven't measured any real
performance difference with a handful of Mesa demos.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-12-11 12:48:07 -07:00
Brian Paul
3427466e6d llvmpipe: support pipe_resource-based constant buffers
Before this we only supported user-based constant buffers.

First, we basically plumb pipe_constant_buffer objects through llvmpipe
rather than pipe_resource objects.

Second, update llvmpipe_set_constant_buffer() and try_update_scene_state()
so they understand both resource- and user-based constant buffers.

The problem with user constant buffers is the potential for use-after-free,
as seen in some WebGL tests.  The next patch will flip the switch for
resource-based const buffers.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-12-11 12:48:06 -07:00
Brian Paul
4c6053dc51 util: add util_copy_constant_buffer() helper function
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-12-11 12:48:06 -07:00
Eric Anholt
beafced21c i965/fs: Improve performance of shaders that start out with a discard.
I had tried this in the past, but ran into trouble with applications
that sample from undiscarded pixels in the same subspan.  To fix that
issue, only jump to the end for an entire subspan at a time.
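
The per-subspan rule can be sketched as below. This is only an illustration, not the generated i965 code: it assumes a 16-pixel dispatch where pixels 4k..4k+3 form subspan k, and the helper name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper for illustration only.
 * live_mask: bit i is set while pixel i has NOT been discarded.
 * Assumed layout: pixels 4k..4k+3 form 2x2 subspan k.
 * Returns the mask of pixels that may jump to the end of the shader:
 * a subspan exits only when all four of its pixels are discarded, so a
 * live pixel can still sample from its neighbours in the subspan. */
static uint16_t
pixels_allowed_to_exit(uint16_t live_mask)
{
   uint16_t exit_mask = 0;

   for (int s = 0; s < 4; s++) {
      uint16_t subspan = (uint16_t)(0xFu << (4 * s));

      if ((live_mask & subspan) == 0)
         exit_mask |= subspan;
   }
   return exit_mask;
}
```

With one live pixel left in a subspan, none of that subspan's pixels are allowed to exit early.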

Improves GLbenchmark 2.7 (1024x768) performance by 7.9 +/- 1.5% (n=8).

v2: Drop the br variable in the jump instruction -- if I ever do jumps
    pre-gen6, it'll be a different code block anyway since we don't have
    HALT until gen6.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-11 10:13:15 -08:00
Eric Anholt
d5016495cc i965/fs: Rewrite discards to use a flag subreg to track discarded pixels.
This makes much more sense on gen6+, and will also prove useful for
early exit of shaders on discard.

v2: fix up a stale comment from before converting gen4-5.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-11 10:13:08 -08:00
Eric Anholt
b278f65e1c i965/fs: Add an instruction flag for choosing the flag subregister.
We're going to redo discard handling to track discards in the other flag
subregister, saving instructions in the discard and allowing predicated
jumps out to the end of the shader.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-11 10:12:58 -08:00
Eric Anholt
2c69a9fb60 i965: Let brw_flag_reg() choose the flag reg and subreg.
We're about to start using the f0.1 subregister.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-11 10:12:54 -08:00
Eric Anholt
6a1490bc8f i965: Print the flag reg updated by conditional modifiers.
This makes our output more consistent with other disasm tools, and
will be necessary when we start using f0.1.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-11 10:12:49 -08:00
Eric Anholt
b7fd4b3f94 i965: Add the new flag_reg_nr instruction field from IVB.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-11 10:12:47 -08:00
Eric Anholt
f606a42a3c i965: Correct the name and usage of the flag subregister number field.
We've been calling it a register number, it's actually the subregister,
and things will get confusing once we start using it if it isn't fixed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-11 10:12:41 -08:00
Eric Anholt
7d404a4bd8 i965: Remove bogus flag_reg_nr field from bits3.
There's a flag subreg nr field in bits2 next to src0.vertstride, but
there shouldn't be anything in bits3 next to src1.vertstride.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-11 10:11:44 -08:00
Tobias Droste
cb8300f5a9 st/egl/drm: only unref the udev device if needed
Fixes compiler warning:

drm/native_drm.c: In function ‘native_create_display’:
drm/native_drm.c:180:21: warning: ‘device’ may be used uninitialized in this function [-Wmaybe-uninitialized]
drm/native_drm.c:157:24: note: ‘device’ was declared here

Signed-off-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-11 12:53:58 -05:00
José Fonseca
bc4bf3c840 softpipe: Use os_time_get_nano() everywhere. 2012-12-11 16:45:01 +00:00
Johannes Obermayr
b361bb3de4 clover: Install CL headers.
Note: This is a candidate for the stable branches.
2012-12-10 19:22:37 -05:00
Tom Stellard
ffe1794e0c gallivm: Lower TGSI_OPCODE_MUL to fmul by default
This fixes a number of crashes on r600g due to the fact that
lp_build_mul assumes vector types when optimizing mul to bit shifts.

This bug was uncovered by 0ad1fefd69
2012-12-10 19:22:37 -05:00
Dave Airlie
8000e7b4b6 llvmpipe: fix txq for 1d/2d arrays. (v3)
Noticed this would fail; we were doing two things wrong:

a) 1d arrays require the layers in height
b) minifying the layers field.

v2: don't change height code, fixup completely inside txq
as suggested by Roland.

v3: just add minify before texture array size
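
A rough sketch of the intended size-query behaviour for a 1D array texture, assuming the usual mip rule max(1, size >> level). The struct and function names are hypothetical and this is not the gallivm code: the width is minified per level, while the layer count is reported unchanged.

```c
#include <assert.h>

/* Size of one dimension at a given mip level, never smaller than 1. */
static unsigned
minify(unsigned value, unsigned level)
{
   assert(level < 32);
   unsigned v = value >> level;
   return v ? v : 1;
}

/* Hypothetical result type for a 1D array size query. */
struct array_1d_size {
   unsigned width;
   unsigned layers;
};

static struct array_1d_size
query_1d_array_size(unsigned width0, unsigned layers, unsigned level)
{
   struct array_1d_size s;

   s.width = minify(width0, level);   /* minified per mip level */
   s.layers = layers;                 /* array size is not minified */
   return s;
}
```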

v1: Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-11 09:38:01 +10:00
Dave Airlie
41f4f094c4 llvmpipe: increase texture target width to reflect increase
Now that we've gone over 7.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-11 09:37:55 +10:00
Jordan Justen
0151237457 mesa syncobj: don't store a pointer to the set_entry
The set_entry pointer can become invalid if the set table
is re-hashed.

This likely will fix
https://bugs.freedesktop.org/show_bug.cgi?id=58012
(Regression since 56e95d3c)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-10 10:58:45 -08:00
Fabio Pedretti
8b6e782eb9 vega: remove unused variables
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-12-10 09:43:20 -07:00
Fabio Pedretti
eefd373876 nvc0: comment unused nvc0_validate_zcull function
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-12-10 09:43:18 -07:00
Fabio Pedretti
9b4926b64b nv50: remove unused OpClassStr array
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-12-10 09:43:17 -07:00
smoki
320d531373 r200: fix broken tcl lighting
The command mistakenly used vector instead of scalar emit (the more or less
identical code in radeon is already correct).
This has probably been broken ever since KMS.
Should fix bugs 22576, 26809.
2012-12-10 17:30:26 +01:00
Dave Airlie
17f5dc5730 st_glsl_to_tgsi: fix ubo bools.
This should fix the ubo boolean tests, along with the previous
ubo loading fix.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-10 14:25:49 +10:00
Dave Airlie
7a66c8acd3 st_glsl_to_tgsi: call ubo load pass earlier
This calls it in around the same place as the i965 driver.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-10 14:22:34 +10:00
Dave Airlie
af2d9affb1 glsl_to_tgsi: fix texture offset translation
I noticed the texelFetch offset test failed on 2D rect samplers
with GLSL 1.40. This is because I wrote the immediate->offset
translation wrong.

Fixed the translation to actually use the ureg info to set the
offsets up.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-10 12:23:47 +10:00
Dave Airlie
157f5d043a drisw: fix up context and apis for software context
This ports over from the dri2 code to the drisw bits. It means 3.1
core contexts now work for softpipe.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-09 20:28:56 +10:00
Kenneth Graunke
bd87441ac0 i965: Add missing _NEW_BUFFERS dirty bit in Gen7 SBE state.
This is needed to compute render_to_fbo.  It even has the comment.

NOTE: This is a candidate for stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-08 18:12:21 -08:00
Christoph Bumiller
5e98cefb5a st/mesa: set PIPE_BIND_SAMPLER_VIEW for TBOs in st_bufferobj_data 2012-12-08 22:47:00 +01:00
Christoph Bumiller
1f079f9e58 nvc0/ir: allow neg,abs modifiers on OP_SET with integer result 2012-12-08 22:47:00 +01:00
Christoph Bumiller
7c6584b996 nvc0/ir/emit: fix check for flags register use in logic ops 2012-12-08 22:46:37 +01:00
Brian Paul
4b73cdb864 draw: fix/improve dirty state validation
This patch does two things:

1. Constant buffer state changes were broken (but happened to work by
   dumb luck).  The problem is we weren't calling draw_do_flush() in
   draw_set_mapped_constant_buffer() when we changed that state.  All the
   other draw_set_foo() functions were calling draw_do_flush() already.

2. Use a simpler state validation step when we're changing light-weight
   parameter state such as constant buffers, viewport dims or clip planes.
   There's no need to revalidate the whole pipeline when changing state
   like that.  The new validation method is called bind_parameters()
   and is called instead of the prepare() method.  A new
   DRAW_FLUSH_PARAMETER_CHANGE flag is used to signal these light-weight
   state changes.  This results in a modest but measurable increase in
   FPS for many Mesa demos.
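
The choice between the two validation paths in point 2 might look roughly like this. Only DRAW_FLUSH_PARAMETER_CHANGE, bind_parameters() and prepare() are named in the message above; the enum values and function below are hypothetical stand-ins, not the draw module's API.

```c
/* Hypothetical flag values for illustration. */
enum flush_flags {
   FLUSH_STATE_CHANGE     = 1 << 0,
   FLUSH_PARAMETER_CHANGE = 1 << 1
};

enum validation {
   VALIDATE_NONE,        /* nothing changed */
   VALIDATE_PARAMETERS,  /* light-weight path, like bind_parameters() */
   VALIDATE_FULL         /* whole pipeline, like prepare() */
};

static enum validation
choose_validation(unsigned flags)
{
   /* Any flag other than the parameter-change bit needs the full path. */
   if (flags & ~(unsigned)FLUSH_PARAMETER_CHANGE)
      return VALIDATE_FULL;
   if (flags & FLUSH_PARAMETER_CHANGE)
      return VALIDATE_PARAMETERS;
   return VALIDATE_NONE;
}
```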

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-12-08 06:58:10 -07:00
Brian Paul
c5f544e690 draw: add reminder comments about similar code in different files
When one function is changed, also look at the other.
Presently, there are some differences with respect to geometry
shaders and instanced drawing...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-12-08 06:58:10 -07:00
Brian Paul
a506ccd89f draw: rearrange code in llvm_middle_end_prepare()
To clean it up and make it look more like the non-LLVM
fetch_pipeline_prepare() function.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-12-08 06:58:10 -07:00
Brian Paul
3e0fa487fb draw: fix comment typo
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-12-08 06:58:10 -07:00
Brian Paul
9b11344b25 draw: add comment on draw->pt.opt field
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-12-08 06:58:10 -07:00
Brian Paul
b46b44b0a9 draw: update a comment about index buffers
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-12-08 06:58:10 -07:00
José Fonseca
122dfc5ee2 gallium/os: Fix nano->micro second conversion.
copy'n'paste: best friend, worst enemy..

Trivial.
2012-12-08 11:15:46 +00:00
Dave Airlie
1f688327e6 llvmpipe: fix missing tbo cap warning.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-08 03:46:56 +00:00
Dave Airlie
73ae865af8 mesa/st: add ARB_uniform_buffer_object support (v2)
This adds UBO support to the state tracker; it works with softpipe
as-is.

It uses UARL + CONST[x][ADDR[0].x] type constructs.

v2: don't disable UBOs if geom shaders don't exist (me)
rename upload to bind (calim)
fix 12 -> 13 comparison as comment (calim + brianp)
fix signed->unsigned (Brian)
remove assert (Brian)

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-08 13:32:38 +10:00
Dave Airlie
535e248c5f softpipe: enable GLSL 1.40
This enables GLSL 1.40 advertising by softpipe.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-08 13:32:38 +10:00
Dave Airlie
a6256f1e67 softpipe: add texture buffer object support
This adds TBO support to softpipe.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-08 13:32:38 +10:00
Dave Airlie
22439f24a2 st/mesa: add option to enable GLSL 1.40
Allow GLSL 1.40 to be enabled if the driver advertises it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-08 13:32:34 +10:00
Dave Airlie
915efe7f07 st/mesa: add texture buffer object support to state tracker (v1.1)
This adds the necessary changes to the st to allow texture buffer object
support if the driver advertises it.

v1.1: remove extra blank line and whitespace

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-08 13:32:33 +10:00
Dave Airlie
a0281c4a8c gallium: add new texture buffer object capability
this just adds the define to the header.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-08 13:32:33 +10:00
José Fonseca
0c2492ea4a mesa/meta: Move declaration before statements. 2012-12-08 01:05:52 +00:00
José Fonseca
eeff87cee3 mesa: Move declaration before statement.
For MSVC's sake.
2012-12-08 01:02:30 +00:00
Anuj Phogat
4e9d19717c intel: Enable ETC2 support on intel hardware
This patch enables support for ETC2 compressed textures on
all intel hardware. At present, ETC2 texture decoding is not
available on intel hardware. So, compressed ETC2 texture data
is decoded in software and stored in a suitable uncompressed
MESA_FORMAT at the time of glCompressedTexImage2D. Currently,
ETC2 formats are only exposed in OpenGL ES 3.0.

V2: Use single etc_wraps variable for both etc1 and etc2.
V3: Remove redundant code and use just one intel_miptree_map_etc()
    and intel_miptree_unmap_etc() function.
    Choose MESA_FORMAT_SIGNED_{R16, GR1616} for ETC2 signed-{r11, rg11}
    formats

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-07 16:29:49 -08:00
Anuj Phogat
e06dcbfdc2 mesa: Add decoding functions for GL_COMPRESSED_SRGB8_PUNCHTHROUGH_ALPHA1_ETC2
Data in GL_COMPRESSED_SRGB8_PUNCHTHROUGH_ALPHA1_ETC2 format is decoded and stored
in MESA_FORMAT_SARGB8.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 16:29:49 -08:00
Anuj Phogat
883efbf6da mesa: Add decoding functions for GL_COMPRESSED_RGB8_PUNCHTHROUGH_ALPHA1_ETC2
Data in GL_COMPRESSED_RGB8_PUNCHTHROUGH_ALPHA1_ETC2 format is decoded and stored
in MESA_FORMAT_RGBA8888_REV.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 16:29:49 -08:00
Anuj Phogat
75211f4367 mesa: Add decoding functions for GL_COMPRESSED_SIGNED_RG11_EAC
Data in GL_COMPRESSED_SIGNED_RG11_EAC format is decoded and stored in
MESA_FORMAT_SIGNED_GR1616.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 16:29:48 -08:00
Anuj Phogat
7697f25667 mesa: Add decoding functions for GL_COMPRESSED_SIGNED_R11_EAC
Data in GL_COMPRESSED_SIGNED_R11_EAC format is decoded and stored in
MESA_FORMAT_SIGNED_R16.

v2:
16 bit signed data is converted to 16 bit unsigned data by
adding 2 ^ 15 and stored in an unsigned texture format.

v3:
1. Handle a corner case when base code word value is -128. As per
OpenGL ES 3.0 specification -128 is not an allowed value and should
be truncated to -127.
2. Converting a decoded 16 bit signed data to 16 bit unsigned data by
adding 2 ^ 15 gives us an output which matches the decompressed image
(.ppm) generated by Ericsson's etcpack tool. Ericsson is also doing this
conversion in their tool because .ppm image files don't support signed
data. But gles 3.0 specification doesn't suggest this conversion. We
need to keep the decoded data in signed format. Both signed format
tests in gles3 conformance pass with these changes.
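
The two points in v3 can be illustrated with a small sketch. Both helper names are hypothetical and this is not the decoder from the patch: the first shows the -128 corner case, the second shows the 2^15 bias from v2 that v3 stops applying.

```c
#include <stdint.h>

/* Hypothetical helper: the signed base codeword is an 8-bit
 * two's-complement value; -128 is not allowed and is treated as -127. */
static int
clamp_signed_base_codeword(int8_t base)
{
   return base == -128 ? -127 : base;
}

/* The v2 conversion that v3 drops: bias a signed 16-bit value by 2^15
 * so it fits an unsigned 16-bit texel (as needed for .ppm output). */
static uint16_t
bias_signed_to_unsigned16(int16_t value)
{
   return (uint16_t)((int32_t)value + 32768);
}
```

Keeping the data signed, as v3 does, simply means not calling the second helper.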

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 16:29:48 -08:00
Anuj Phogat
aa217090f5 mesa: Add decoding functions for GL_COMPRESSED_RG11_EAC
Data in GL_COMPRESSED_RG11_EAC format is decoded and stored in
MESA_FORMAT_RG1616.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 16:29:48 -08:00
Anuj Phogat
dc86cb3705 mesa: Add decoding functions for GL_COMPRESSED_R11_EAC
Data in GL_COMPRESSED_R11_EAC format is decoded and stored in
MESA_FORMAT_R16.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 16:29:48 -08:00
Anuj Phogat
5ea8cd0084 mesa: Add decoding functions for GL_COMPRESSED_SRGB8_ALPHA8_ETC2_EAC
Data in GL_COMPRESSED_SRGB8_ALPHA8_ETC2_EAC format is decoded and stored
in MESA_FORMAT_SARGB8.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 16:29:48 -08:00
Anuj Phogat
62fc4b4ae1 mesa: Add decoding functions for GL_COMPRESSED_RGBA8_ETC2_EAC
Data in GL_COMPRESSED_RGBA8_ETC2_EAC format is decoded and stored
in MESA_FORMAT_RGBA8888_REV.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 16:29:48 -08:00
Anuj Phogat
965a24995d mesa: Add decoding functions for GL_COMPRESSED_SRGB8_ETC2
Data in GL_COMPRESSED_SRGB8_ETC2 format is decoded and stored
in MESA_FORMAT_SARGB8.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 16:29:48 -08:00
Anuj Phogat
81911101ee mesa: Add decoding functions for GL_COMPRESSED_RGB8_ETC2
Data in GL_COMPRESSED_RGB8_ETC2 format is decoded and stored in
MESA_FORMAT_RGBX8888_REV.

v2: Use CLAMP macro and stdbool.h
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 16:29:48 -08:00
Anuj Phogat
38d523584c mesa: Make nonlinear_to_linear() function available outside file
This patch changes nonlinear_to_linear() function to non static inline
and makes it available outside format_unpack.c. Also, removes the
duplicate copies in other files.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 16:29:48 -08:00
Anuj Phogat
e519b8a9af mesa: Add new MESA_FORMATs for ETC2 compressed textures
It is required by OpenGL ES 3.0 to support ETC2 textures.
This patch adds new MESA_FORMATs for following etc2 texture
formats:
 GL_COMPRESSED_RGB8_ETC2
 GL_COMPRESSED_SRGB8_ETC2
 GL_COMPRESSED_RGBA8_ETC2_EAC
 GL_COMPRESSED_SRGB8_ALPHA8_ETC2_EAC
 GL_COMPRESSED_R11_EAC
 GL_COMPRESSED_RG11_EAC
 GL_COMPRESSED_SIGNED_R11_EAC
 GL_COMPRESSED_SIGNED_RG11_EAC
 MESA_FORMAT_ETC2_RGB8_PUNCHTHROUGH_ALPHA1
 MESA_FORMAT_ETC2_SRGB8_PUNCHTHROUGH_ALPHA1

Above formats are currently available in only gles 3.0.

v2: Add entries in texfetch_funcs[] array.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

v3 (Paul Berry <stereotype441@gmail.com>): comment out symbols that
are not implemented yet, so that this commit compiles on its own;
future commits will uncomment the symbols as they become available.
2012-12-07 16:29:47 -08:00
Kenneth Graunke
23b7103cee meta: Use #version 300 es for _mesa_glsl_Clear's integer shaders on ES3.
Fixes es3conform's color_buffer_float_clamp_(fixed|on|off) tests.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-07 16:29:45 -08:00
Kenneth Graunke
50e4a1df94 meta: Use #version 300 es in GenerateMipmap shaders on ES3.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-07 16:29:31 -08:00
Paul Berry
6cffdb1ca0 Set es_version to false when using FF fragment shading in meta ops
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-07 16:28:40 -08:00
Eric Anholt
1ddc021b2a mesa: Use the new hash table for the variable refcount visitor.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
[jordan.l.justen@intel.com: open_hash_table => hash_table]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-07 14:46:18 -08:00
Jordan Justen
59284bc44a program/hash_table.c: rename to program/prog_hash_table.c
Removes a collision of the object file name for main/hash_table
and program/hash_table.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-07 14:46:18 -08:00
Matt Turner
970ec8dbc3 mesa: Ignore size and offset parameters for BindBufferRange when buffer is 0
The ES 3 conformance suite unbinds buffers (by binding buffer 0) and
passes zero for the size and offset, which the spec explicitly
disallows. Otherwise, this seems like a reasonable thing to do.

Khronos will be changing the spec to allow this (bug 9765). Fixes
es3conform's transform_feedback_init_defaults test.
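
A minimal sketch of the relaxed check, assuming a hypothetical helper name and a simplified set of range checks (the real function validates more than this):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper for illustration only.
 * Returns true when the arguments should be accepted. */
static bool
bind_buffer_range_args_ok(unsigned buffer, ptrdiff_t offset, ptrdiff_t size)
{
   /* Binding buffer 0 unbinds, so offset and size are ignored. */
   if (buffer == 0)
      return true;

   /* Simplified: a real binding needs a non-negative offset and a
    * positive size. */
   return offset >= 0 && size > 0;
}
```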

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-07 14:11:13 -08:00
Christoph Bumiller
cfa752cd33 nv50,nvc0: fix shader eviction 2012-12-07 22:48:54 +01:00
Christoph Bumiller
f7599b2c32 nv50,nvc0: add support for cube map arrays
NOTE: nv50 support not enabled, someone with nva3/8 please fix.
2012-12-07 22:48:54 +01:00
Stefan Dösinger
ff5a9868c8 r300: Don't disable destination read if the src blend factor needs it
The read can remain disabled if the src alpha factor needs it because
the result would still be zero.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57984

NOTE: This is a candidate for stable release branches.

Signed-off-by: Marek Olšák <maraeo@gmail.com>
2012-12-07 17:48:16 +01:00
Michel Dänzer
ff574d653b gallium/egl-static: Fix unresolved symbol 'clock_gettime'.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-12-07 16:10:02 +01:00
José Fonseca
e7bbd9c243 gallivm: Rudimentary native integer support.
Just enough for draw module to work ok.

This improves "piglit attribs GL3", though something fishy is still
happening with certain unsigned integer values.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 15:03:07 +00:00
José Fonseca
6e27e2e90e draw: Dump LLVM shader key.
Just like we do in llvmpipe for the fragment shader compilation key.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 15:03:07 +00:00
José Fonseca
3b7ce72625 gallivm: Allow indirection from TEMP registers too.
The ADDR file is cumbersome for native integer capable drivers.  We
should consider deprecating it eventually, but this just adds support
for indirection from TEMP registers.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 15:03:07 +00:00
José Fonseca
1d35f77228 gallivm,llvmpipe,draw: Support multiple constant buffers.
Support 16 (defined in LP_MAX_TGSI_CONST_BUFFERS) rather than 32 (as
defined by PIPE_MAX_CONSTANT_BUFFERS), because supporting all 32 would
make the jit context unnecessarily large.

v2: Bump limit from 4 to 16 to cover ARB_uniform_buffer_object needs,
per Dave Airlie.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 15:03:07 +00:00
Marek Olšák
35840ab189 st/dri: implement MSAA for GLX/DRI2 framebuffers
All MSAA buffers are allocated privately and resolved into the DRI-provided
back and front buffers.

If an MSAA visual is chosen, the buffers st/mesa receives are all
multi-sample. st/mesa doesn't have access to the single-sample buffers
in that case.

This makes MSAA work in games like Nexuiz.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 14:19:29 +01:00
Marek Olšák
919f788b92 gallium: pass the current context to the flush_front state tracker function
I will later use the context to resolve an MSAA front buffer.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 14:19:29 +01:00
Marek Olšák
888714feb6 st/dri: don't expose MSAA configs with accumulation buffer
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 14:19:28 +01:00
Marek Olšák
985649b596 st/dri: refactor dri_fill_in_modes
- We can use a single loop for adding new configs.
- The useless parameter depth_bits is removed.
- The maximum number of samples is bumped to 32.
- We can support Z16_UNORM and Z32_UNORM unconditionally since the zbuffers
  are private.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 14:19:28 +01:00
Marek Olšák
39737e17e7 st/dri: always allocate private depth-stencil buffers
This disables DRI2 sharing of zbuffers. The window zbuffer is allocated just
like any other texture - through resource_create.

The idea of allocating a zbuffer through DRI2 isn't very useful with MSAA,
where a single-sample zbuffer is useless.

IIRC, the Intel driver does the same thing.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 14:19:28 +01:00
Marek Olšák
976b832e9a st/mesa: implement CopyTexSubImage for MSAA framebuffers
Reviewed-by: Brian Paul <brianp@vmware.com>

Just use pipe->blit, which can do resolve, flipping, and format conversions.
The util_blit_pixels codepath is still there for the cases where we have to
force alpha to 1.

This also turns on acceleration for copying GL_DEPTH_STENCIL.
2012-12-07 14:19:28 +01:00
Marek Olšák
9f06966a7b gallium/u_blitter: fix conflict with u_memory.h
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 14:19:28 +01:00
Marek Olšák
49f1104c44 r600g: transfers of MSAA color textures should do the resolve
so that ReadPixels and various fallbacks work.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 14:19:28 +01:00
Marek Olšák
cbddb8f365 trace: dump pipe_resource::nr_samples
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 14:19:28 +01:00
Marek Olšák
5fb2b1f4d4 glx/dri2: set the __DRI2_FLUSH_DRAWABLE flag where it should be set
Sorry, I accidentally omitted this.

It only broke MLAA.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-07 14:19:28 +01:00
Andreas Boll
520892688a build: Fix GLES linkage without libglapi
fixes a regression introduced with
fc9ea7c74d

NOTE: This is a candidate for the 9.0 branch.

Reported-by: Brian Paul <brianp@vmware.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2012-12-07 09:21:44 +01:00
Dave Airlie
5b2a3443fa llvmpipe: fix regression in gears speed.
This fixes the gears regression since transform feedback.

Reported-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-07 08:35:08 +10:00
Kenneth Graunke
76f13f80e6 glsl: Add missing semicolon in the grammar
This may not be strictly necessary, but every other rule in the grammar ends
with a semicolon.  It also appears that this was supposed to be committed with
the original patch that changed this rule, but the wrong version of the patch
was accidentally pushed.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-06 12:13:22 -08:00
Ian Romanick
62c0938639 glsl: Allow layout qualifiers in GLSL 3.00 ES
Note that while 'packed' is a reserved word in GLSL ES, row_major is not.
This means that we have to use the string-based matching for that.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:22 -08:00
Paul Berry
486f955654 glsl: Create builtin function profiles for GLSL 3.00 ES.
Nearly all of the builtin functions in GLSL 3.00 ES are already
implemented in Mesa; this patch enables them.

A few functions are not implemented yet; those have been commented
out, with a FIXME comment to act as a reminder of what still needs to
be implemented.  Here is the complete list: packSnorm2x16,
unpackSnorm2x16, packUnorm2x16, unpackUnorm2x16, packHalf2x16,
unpackHalf2x16.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:22 -08:00
Paul Berry
9a69f66353 glsl: add determinant() functions.
These functions are defined in GLSL 1.50 and GLSL 3.00 ES.

The formulas have been extracted from the existing implementation of
inverse().
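
For reference, the 2x2 and 3x3 cases written as plain C. This is the textbook cofactor expansion, assuming column-major storage, and is not the builtin IR added by the patch; the function names are hypothetical.

```c
/* Column-major storage assumed: element (row r, column c) of an NxN
 * matrix lives at m[c * N + r]. */
static float
det2(const float m[4])
{
   return m[0] * m[3] - m[2] * m[1];
}

static float
det3(const float m[9])
{
   /* Cofactor expansion along the first column. */
   return m[0] * (m[4] * m[8] - m[7] * m[5])
        - m[1] * (m[3] * m[8] - m[6] * m[5])
        + m[2] * (m[3] * m[7] - m[6] * m[4]);
}
```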

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:22 -08:00
Paul Berry
4d6d22100a glsl: Make builtin function profiles for GLSL ES use "es" in the filename.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:22 -08:00
Paul Berry
8dec1bfedd glsl: Add builtin variables for GLSL 3.00 ES.
This patch also adds assertions so that when we add new GLSL versions,
we'll notice that we need to update the builtin variables.

[v2, idr]: s/Frab/Frag/  Noticed by Eric.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:22 -08:00
Paul Berry
d7949eefcb glsl: Populate built-in types correctly for GLSL 3.00 ES.
This patch implements all of the built-in types for GLSL 3.00 ES.
This is almost exactly the same as the set of built-in types for GLSL
1.30, except that 1D samplers are skipped, and samplerCubeShadow is
added.

This patch also adds an assertion so that when we add new GLSL
versions, we'll notice that we need to update the types.

In review, Eric noted:

    "This change looks correct.  The overall interaction of profiles is
    getting ugly, though.  I'm imagining a restructure of the symbol
    table population so that there's a big list of types, and each
    #version has a nice list of strings of type names copy and pasted
    out of its spec."

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:22 -08:00
Paul Berry
5e10a5c5e4 glsl: Make {Min,Max}ProgramTexelOffset available to compiler.
These constants need to be made available to shaders in GLSL 3.00 ES.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:22 -08:00
Paul Berry
15ba2a5825 glsl: Fix linker checks for GLSL ES 3.00.
This patch updates the following linker checks to do the right thing
in GLSL 3.00 ES:

- Failing to write to gl_Position is allowed in GLSL 1.40+ as well as
  GLSL 3.00 ES.

- It is an error to write to both gl_ClipVertex and gl_ClipDistance in
  GLSL 1.30+.  This does not apply to GLSL 3.00 ES.

- GLSL 3.00 ES uses the same varying counting rules as GLSL 1.00 ES.

- In GLSL 1.30 and GLSL 3.00 ES, "discard" terminates the shader.

- In GLSL 1.00 ES and GLSL 3.00 ES, both a fragment and a vertex
  shader must be present.

[v2, idr]: Fix minor typo in a comment.  Noticed by Ken.

[v3, idr]: s/IsEs(Shader|Prog)/IsES/  Suggested by Ken and Eric.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:22 -08:00
Paul Berry
91c92bb6fb glsl: Record in gl_shader_program whether the program uses GLSL ES.
Previously we recorded just the GLSL version (or the max version, if
GLSL 1.10 and GLSL 1.20 programs were linked together).

[v2, idr]: s/IsEs(Shader|Prog)/IsES/  Suggested by Ken and Eric.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:22 -08:00
Paul Berry
a9f34dc304 glsl: Clean up shading language mixing check for GLSL 3.00 ES.
Previously, we prohibited mixing of shading language versions if
min_version == 100 or max_version >= 130.  This was technically
correct (since desktop GLSL 1.30 and beyond prohibit mixing of shading
language versions, as does GLSL 1.00 ES), but it was confusing.  Also,
we asserted that all shading language versions were between 1.00 and
1.40, which was unnecessary (since the parser already checks shading
language versions) and doesn't work for GLSL 3.00 ES.

This patch changes the code to explicitly check that (a) ES shaders
aren't mixed with desktop shaders, (b) shaders aren't mixed between ES
versions, and (c) shaders aren't mixed between desktop GLSL versions
when at least one shader is GLSL 1.30 or greater.  Also, it removes
the unnecessary assertion.
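
Checks (a) through (c) could be sketched as follows. The struct and function names are hypothetical and the linker's actual data structures differ; the ES-ness of the program is taken from the first shader outside the loop, in the spirit of the v2 note below.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-shader data the linker looks at. */
struct shader_info {
   unsigned version;   /* e.g. 110, 120, 130, 300 */
   bool is_es;
};

/* Returns true if the shaders may be linked into one program. */
static bool
versions_can_be_linked(const struct shader_info *shaders, size_t count)
{
   if (count == 0)
      return true;

   bool is_es_prog = shaders[0].is_es;
   unsigned min_version = shaders[0].version;
   unsigned max_version = shaders[0].version;

   for (size_t i = 1; i < count; i++) {
      if (shaders[i].is_es != is_es_prog)
         return false;                  /* (a) ES mixed with desktop */
      if (shaders[i].version < min_version)
         min_version = shaders[i].version;
      if (shaders[i].version > max_version)
         max_version = shaders[i].version;
   }

   if (min_version == max_version)
      return true;                      /* a single version throughout */
   if (is_es_prog)
      return false;                     /* (b) mixed ES versions */
   return max_version < 130;            /* (c) desktop mix needs < 1.30 */
}
```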

[v2, idr]: Slightly tweak the is_es_prog detection to occur outside the loop
instead of doing something special on the first loop iteration.  Suggested by
Ken.

[v3, idr]: s/IsEs(Shader|Prog)/IsES/  Suggested by Ken and Eric.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:22 -08:00
Paul Berry
c150e876b4 glsl: Record in gl_shader whether the shader uses GLSL ES.
Previously we recorded just the GLSL version, with the knowledge that
100 means GLSL 1.00 ES.  With the advent of GLSL 3.00 ES, this is
going to get more complex, and eventually will probably become
ambiguous (GLSL 4.00 already exists, and GLSL 4.00 ES is likely to be
created some day).

To reduce confusion, this patch simply records whether the shader is
GLSL ES as an explicit boolean.

[v2, idr]: s/IsEs(Shader|Prog)/IsES/  Suggested by Ken and Eric.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:22 -08:00
Paul Berry
2b4aeddfb3 glsl/parser: Handle "#version 300 es" directive.
Note that GLSL 1.00 is selected using "#version 100", so "#version 100
es" is prohibited.

v2: Check for GLES3 before allowing '#version 300 es'

v3: Make sure a correct language_version is set in
_mesa_glsl_parse_state::process_version_directive.
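
The directive rules described above can be sketched as follows. This is an assumption-laden illustration rather than the parser's real code: struct version_result and parse_version_directive() are hypothetical, and the check for which version numbers a given context actually supports is deliberately left out.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
struct version_result {
   bool valid;
   bool es_shader;
   unsigned language_version;
};

/* ident is the optional token after the number, or NULL if absent. */
struct version_result
parse_version_directive(unsigned version, const char *ident)
{
   struct version_result r = { false, false, version };

   if (ident != NULL) {
      if (strcmp(ident, "es") != 0)
         return r;           /* unknown trailing identifier */
      if (version == 100)
         return r;           /* "#version 100 es" is prohibited */
      r.es_shader = true;
   } else if (version == 100) {
      r.es_shader = true;    /* bare "#version 100" selects GLSL ES 1.00 */
   }

   r.valid = true;
   return r;
}
```

Recording es_shader explicitly is what lets later code stop treating the number 100 as a special case.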

Signed-off-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:22 -08:00
Paul Berry
629b9edc99 glsl/parser: Extract version directive processing into a function.
Version directive handling is going to have to be used within two
parser rules, one for desktop-style version directives (e.g. "#version
130") and one for the new ES-style version directive (e.g. "#version
300 es"), so this patch moves it to a function that can be called from
both rules.

No functional change.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:21 -08:00
Paul Berry
a03c2c7ab9 glsl/preprocessor: Handle "#version 300 es" directive.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:21 -08:00
Paul Berry
2152df51c0 glsl/preprocessor: Extract version directive processing into a function.
Version directive handling is going to have to be used within two
parser rules, one for desktop-style version directives (e.g. "#version
130") and one for the new ES-style version directive (e.g. "#version
300 es"), so this patch moves it to a function that can be called from
both rules.

No functional change.

[mattst88] v2: Use intmax_t instead of int for version argument. Would
otherwise write garbage after #version since PRIiMAX was reading 64-bits
instead of 32.
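
The v2 fix comes down to keeping the argument type in step with the conversion specifier. A minimal sketch, assuming a hypothetical helper named format_version_line() (not the preprocessor's own function):

```c
#include <inttypes.h>
#include <stdio.h>

/* Writes "#version N" into buf. Because the argument is intmax_t, it
 * has exactly the width that the PRIiMAX conversion reads. Passing a
 * plain int here instead would be undefined behavior. */
int
format_version_line(char *buf, size_t size, intmax_t version)
{
   return snprintf(buf, size, "#version %" PRIiMAX, version);
}
```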

[idr] v3: A later commit fixes the caller of
_glcpp_parser_handle_version_declaration to pass the correct number of
parameters.  Fix it in the patch that changes the interface instead.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:21 -08:00
Paul Berry
d4a24745b8 glsl: Enable GLSL ES 3.00 features inherited from desktop GLSL.
This patch turns on the following features for GLSL ES 3.00:

- Array constructors, whole array assignment, and array comparisons.
- Second and third operands of ?: may be arrays.
- Use of "in" and "out" qualifiers on globals.
- Bitwise and modulus operators.
- Integral vertex shader inputs.
- Range-checking of literal integers.
- array.length method.
- Function calls may be constant expressions.
- Integral varyings must be qualified with "flat".
- Interpolation and centroid qualifiers may not be applied to vertex
  shader inputs.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:21 -08:00
Paul Berry
534ec62152 glsl: parse GLSL ES 3.00 keywords correctly.
GLSL ES 3.00 adds the following keywords over GLSL 1.00: uint,
uvec[2-4], matNxM, centroid, flat, smooth, various samplers, layout,
switch, default, and case.

Additionally, it reserves a large number of keywords, some of which
were already reserved in versions of desktop GL that Mesa supports,
some of which are new to Mesa.

A few of the reserved keywords in GLSL ES 3.00 are keywords that are
supported in all other versions of GLSL: attribute, varying,
sampler1D, sampler1DShadow, sampler2DRect, and sampler2DRectShadow.

This patch updates the lexer to handle all of the new keywords
correctly when the language being parsed is GLSL 3.00 ES.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:21 -08:00
Paul Berry
948e5dda67 glsl: Rework lexer keyword handling in preparation for GLSL 3.00 ES.
This patch expands the lexer KEYWORD macro to take two additional
arguments: the GLSL ES versions in which the given keyword was first
reserved, and supported, respectively.  This will allow us to
trivially add support for GLSL 3.00 ES keywords, even though the set
of GLSL 3.00 ES keywords is neither a subset nor a superset of the
keywords corresponding to any desktop GLSL version.

The new KEYWORD macro makes use of the
_mesa_glsl_parse_state::is_version() function, so it accepts 0 as
meaning "unsupported" (rather than 999, which we used previously).

Note that a few keywords ("packed" and "row_major") are supported
*either* when GLSL 1.40 is in use or when ARB_uniform_buffer_obj
support is enabled.  Previously, we handled these by cleverly taking
advantage of the fact that the KEYWORD macro didn't parenthesize its
arguments in the usual way.  Now they are handled more
straightforwardly, with a new macro, KEYWORD_WITH_ALT.
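
The decision the reworked macros make can be sketched as a plain function. This is an illustration only: the real KEYWORD and KEYWORD_WITH_ALT are lexer macros, and enum token_kind, struct parse_state, and classify_keyword() are hypothetical names. The version numbers used when calling it are up to the caller and are not taken from Mesa's keyword table.

```c
#include <stdbool.h>

enum token_kind { TOK_KEYWORD, TOK_RESERVED_ERROR, TOK_IDENTIFIER };

/* Hypothetical stand-in for the parser state. */
struct parse_state {
   unsigned language_version;
   bool es_shader;
};

/* A required version of 0 means "unsupported", per the commit message. */
static bool
is_version(const struct parse_state *s, unsigned glsl, unsigned glsl_es)
{
   unsigned required = s->es_shader ? glsl_es : glsl;
   return required != 0 && s->language_version >= required;
}

/* alt_condition plays the role of the extra KEYWORD_WITH_ALT argument,
 * for example "the relevant extension is enabled". */
enum token_kind
classify_keyword(const struct parse_state *s,
                 unsigned reserved_glsl, unsigned reserved_glsl_es,
                 unsigned allowed_glsl, unsigned allowed_glsl_es,
                 bool alt_condition)
{
   if (is_version(s, allowed_glsl, allowed_glsl_es) || alt_condition)
      return TOK_KEYWORD;          /* usable in this language version */
   if (is_version(s, reserved_glsl, reserved_glsl_es))
      return TOK_RESERVED_ERROR;   /* reserved word, illegal to use */
   return TOK_IDENTIFIER;          /* just an ordinary name here */
}
```

Keeping the alternative condition as its own argument avoids depending on how the macro parenthesizes its version arguments.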

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:21 -08:00
Paul Berry
0d9bba6e43 glsl: Make use of new _mesa_glsl_parse_state::check_version() function.
Previous to this patch, we were not very consistent about the errors
we generate when a shader tried to use a feature that is prohibited in
the current GLSL version.  Some error messages failed to mention the
GLSL version currently in use (or did so inaccurately), and some error
messages failed to mention the first GLSL version in which the given
feature is allowed.

This patch reworks all of the error checks to use the check_version()
function, which produces error messages in a standard form
(approximately "$FEATURE forbidden in $CURRENT_GLSL_VERSION
($REQUIRED_GLSL_VERSION required).").

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:21 -08:00
Paul Berry
e3ded7fe62 glsl: Make use of new _mesa_glsl_parse_state::is_version() function.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:21 -08:00
Paul Berry
5d0fd3270f glsl: Add GLSL version query functions.
With the advent of GLSL 3.00 ES, the version checks we perform in the
GLSL compiler (to determine which language features are present) will
become more complicated.  To reduce the complexity, this patch adds
functions check_version() and is_version() to _mesa_glsl_parse_state.
These functions take two version numbers: a desktop GLSL version and a
GLSL ES version, and return a boolean indicating whether the GLSL
version being compiled is at least the required version.  So, for
example, is_version(130, 300) returns true if the GLSL version being
compiled is at least desktop GLSL 1.30 or GLSL ES 3.00.

The check_version() function additionally produces an error message if
the version check fails, informing the user of which GLSL version(s)
support the given feature.
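
A sketch of the two queries, written as free functions over a hypothetical struct parse_state rather than as methods of _mesa_glsl_parse_state. The struct layout, the fixed-size info_log, and the exact message wording are assumptions made for this example; the treatment of 0 as "unsupported" follows the lexer commit in this series.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the parser state. */
struct parse_state {
   unsigned language_version;   /* e.g. 120, 130, 300 */
   bool es_shader;              /* true when compiling GLSL ES */
   bool error;
   char info_log[160];
};

/* Pick the requirement that matches the shader's dialect; 0 means the
 * feature is never available in that dialect. */
bool
is_version(const struct parse_state *s, unsigned glsl, unsigned glsl_es)
{
   unsigned required = s->es_shader ? glsl_es : glsl;
   return required != 0 && s->language_version >= required;
}

/* Same test, but record an error naming the versions that would work. */
bool
check_version(struct parse_state *s, unsigned glsl, unsigned glsl_es,
              const char *feature)
{
   if (is_version(s, glsl, glsl_es))
      return true;

   char required[64] = "";
   if (glsl != 0 && glsl_es != 0)
      snprintf(required, sizeof required,
               " (GLSL %u.%02u or GLSL ES %u.%02u required)",
               glsl / 100, glsl % 100, glsl_es / 100, glsl_es % 100);
   else if (glsl != 0)
      snprintf(required, sizeof required, " (GLSL %u.%02u required)",
               glsl / 100, glsl % 100);
   else if (glsl_es != 0)
      snprintf(required, sizeof required, " (GLSL ES %u.%02u required)",
               glsl_es / 100, glsl_es % 100);

   s->error = true;
   snprintf(s->info_log, sizeof s->info_log,
            "%s forbidden in GLSL%s %u.%02u%s.",
            feature, s->es_shader ? " ES" : "",
            s->language_version / 100, s->language_version % 100,
            required);
   return false;
}
```

Callers that only need a yes/no answer use is_version(); callers that are rejecting a construct use check_version() so the diagnostic is produced in one place.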

[v2, idr]: Add PRINTFLIKE annotation to the new method.  The numbering of the
parameters is correct because GCC is silly.

[v3, idr]: Fix copy-and-paste error in the comment before
_mesa_glsl_parse_state::is_version.  Noticed by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:21 -08:00
Paul Berry
dc9f9d8e66 glsl: Compute version_string on the fly.
Fixes a bug where version_string would be left uninitialized if no
GLSL "#version" directive was used.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:21 -08:00
Paul Berry
d9bfaa104e glsl: Make a function to express a GLSL version in human-readable form.
This will be useful in generating more helpful error messages,
especially with the addition of GLSL 3.00 ES support.
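
One plausible shape for such a helper is sketched below. The name glsl_version_string() and the caller-supplied buffer are assumptions for this example; the commit's function takes a mem_ctx parameter (see the v2 note) and presumably allocates its result instead.

```c
#include <stdbool.h>
#include <stdio.h>

/* Formats a numeric version such as 130 as "GLSL 1.30", or 300 with
 * is_es set as "GLSL ES 3.00". Returns the snprintf result. */
int
glsl_version_string(char *buf, size_t size, bool is_es, unsigned version)
{
   return snprintf(buf, size, "GLSL%s %u.%02u",
                   is_es ? " ES" : "", version / 100, version % 100);
}
```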

[v2, idr]: Rename ctx parameter to mem_ctx

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:21 -08:00
Paul Berry
53e572f15c glsl: Simplify symbol table version checking.
Previously, we stored the GLSL language version in the
glsl_symbol_table struct.  But this was unnecessary--all
glsl_symbol_table needs to know is whether functions and variables
have separate namespaces (they do in GLSL 1.10 only).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:21 -08:00
Paul Berry
9a93ba3068 mesa: Add ARB_ES3_compatibility flag.
Adding this now makes it easier to develop and test GLES3 features, since we
can do initial development and testing using desktop GL.  Later GLSL compiler
patches check for either ctx->Extensions.ARB_ES3_compatibility or
_mesa_is_gles3 to allow certain features (i.e., "#version 300 es").

[v2, idr]: Just edits to the commit message.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
2012-12-06 12:13:21 -08:00
Michel Dänzer
e0f2ffc3d9 radeonsi: Fix cube texture coordinates.
8 more piglits.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-12-06 20:35:18 +01:00
Michel Dänzer
aac2154729 radeon/llvm: Export prepare_cube_coords helper to driver.
To be used by radeonsi.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-12-06 20:18:40 +01:00
Brian Paul
7745596ceb mesa: use rand() instead of random()
As Vinson Lee did in commit bb284669f8
in hash_table.c

Signed-off-by: Brian Paul <brianp@vmware.com>
2012-12-06 11:55:02 -07:00
Jordan Justen
56e95d3ca2 mesa: validate that sync objects were created by mesa
Previously, the user could send in a pointer that was not created
by mesa. When we dereferenced that pointer, there would be an
exception.

Now we keep a set of pointers and verify that the pointer
exists in that set before dereferencing it.
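
The validate-before-dereference idea can be sketched like this. It is not Mesa's code: struct sync_registry and its functions are hypothetical, and a small linear array stands in for the pointer set that the commit uses.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SYNC_OBJECTS 64   /* arbitrary capacity for this sketch */

struct sync_object { int status; };

/* Every pointer handed out to the application is remembered here. */
struct sync_registry {
   const struct sync_object *known[MAX_SYNC_OBJECTS];
   size_t count;
};

bool
registry_add(struct sync_registry *reg, const struct sync_object *obj)
{
   if (reg->count == MAX_SYNC_OBJECTS)
      return false;
   reg->known[reg->count++] = obj;
   return true;
}

/* Compares pointer values only; never dereferences the candidate. */
bool
registry_contains(const struct sync_registry *reg, const void *candidate)
{
   for (size_t i = 0; i < reg->count; i++) {
      if ((const void *)reg->known[i] == candidate)
         return true;
   }
   return false;
}

/* Returns false for unknown pointers, where a GL entry point would
 * raise an error instead of crashing. */
bool
get_sync_status(const struct sync_registry *reg, const void *candidate,
                int *status)
{
   if (!registry_contains(reg, candidate))
      return false;
   *status = ((const struct sync_object *)candidate)->status;
   return true;
}
```

A hash-based set gives the same guarantee with constant-time lookups, which matters once many sync objects are alive at once.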

Note: This fixes several crashing gles3conform tests.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-06 09:43:07 -08:00
Jordan Justen
e12d9f0c6d main/syncobj: return GL_INVALID_VALUE for invalid sync objects
Note: The GL/GLES3 web man pages don't seem to properly
document glWaitSync's error when the sync object is invalid.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-06 09:43:07 -08:00
Eric Anholt
82c9d98ab9 mesa: add set support (stores a set of pointers)
From: git://people.freedesktop.org/~anholt/hash_table

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
[jordan.l.justen@intel.com: minor rework for mesa]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-06 09:43:07 -08:00
José Fonseca
db9a1052d1 llvmpipe: Fix statement before declaration. 2012-12-06 17:23:11 +00:00
José Fonseca
b79194401a util: Add util_copy_box helper.
Most users of util_copy_rect() need to, or should, deal with volumes.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-06 17:12:31 +00:00
José Fonseca
4da4e8ee2a gallium/util: Move the util_copy/fill_rect into u_surface.
u_rect.h said these should move to a different file, and u_surface seems
a better home.

Leave #include "util/u_surface.h" to avoid having to touch thousands of
files.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-06 17:12:31 +00:00
José Fonseca
d296326e06 gallium/os: Clean up os_time_get/os_time_get_nano.
- Re-implement os_time_get in terms of os_time_get_nano() for consistency
- Use CLOCK_MONOTONIC as recommended
- Only use clock_gettime on Linux for now.
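
The first two bullets can be sketched as below for a POSIX system. The function names come from the commit subject, but the bodies are an assumption about the general approach, not a copy of the Gallium implementation, and the microsecond unit for os_time_get() is assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds; unaffected by wall-clock changes. */
int64_t
os_time_get_nano(void)
{
   struct timespec ts;
   clock_gettime(CLOCK_MONOTONIC, &ts);
   return (int64_t)ts.tv_sec * INT64_C(1000000000) + ts.tv_nsec;
}

/* Microsecond variant derived from the nanosecond one, so that both
 * always agree on the underlying clock. */
int64_t
os_time_get(void)
{
   return os_time_get_nano() / 1000;
}
```

On older C libraries clock_gettime lives in librt, so the program may need to be linked with -lrt.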

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-06 17:12:31 +00:00
José Fonseca
7e14293556 gallium/os: Fix os_time_sleep() on Windows for small durations.
Prevents undetermined sleeps.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-06 17:12:31 +00:00
Stefan Dösinger
d8069b7603 meta: Disable GL_FRAGMENT_SHADER_ATI in MESA_META_SHADER
Fixes clears in Wine on r200.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-06 11:37:24 -05:00
Stefan Dösinger
f6a4e1bc1e radeon: Initialize swrast before setting limits
NOTE: This is a candidate for stable release branches.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-06 11:37:24 -05:00
Stefan Dösinger
654a945f4d r200: Initialize swrast before setting limits
Otherwise the driver announces 4096 vertex shader constants and other
way too high limits.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-06 11:37:24 -05:00
Matthew Waters
ac24d17258 mesa: fix compiler warnings when including GL/gl.h with other gl headers
GL/gl.h provides some definitions (GL_FALSE, GL_ONE, etc) that have
the same value as other gl headers but are represented differently
(0 vs 0x0 and 1 vs 0x1).
This causes compiler warnings about redefining such definitions when
including GL/gl.h with other gl headers.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=57802

Signed-off-by: Brian Paul <brianp@vmware.com>
2012-12-06 09:08:52 -07:00
José Fonseca
5e99cd9159 gallivm: Fix lerping of (un)signed normalized numbers.
Several issues actually:

- Fix a regression in unsigned normalized in the rescaling
  [0, 255] to [0, 256]

- Ensure we use signed shifts where appropriate (instead of
  unsigned shifts)

- Refactor the code slightly -- move all the logic inside
  lp_build_lerp_simple().

This change, together with an adjustment to the tolerance for signed
normalized results in piglit fbo-blending-formats, fixes bug 57903.
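
For the 8-bit unsigned normalized case, the rescaling and the signed shift can be illustrated in scalar C. This is a sketch of the arithmetic only: lerp_unorm8() is a hypothetical name, and the real code emits LLVM IR for vectors rather than computing on scalars.

```c
#include <stdint.h>

/* Linear interpolation between two 8-bit unsigned normalized values.
 * weight is also 8-bit unorm: 0 selects a, 255 selects b. */
uint8_t
lerp_unorm8(uint8_t weight, uint8_t a, uint8_t b)
{
   /* Rescale the weight from [0, 255] to [0, 256] so that a weight of
    * 255 reaches b exactly after the division by 256 below. */
   int32_t x = weight + (weight >> 7);

   /* delta can be negative, so the shift must be an arithmetic one.
    * Right-shifting a negative value is implementation-defined in C;
    * GCC and Clang perform an arithmetic shift. */
   int32_t delta = (int32_t)b - (int32_t)a;

   return (uint8_t)(a + ((x * delta) >> 8));
}
```

Without the rescale, a weight of 255 would multiply by 255/256 and the result could never quite reach b.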

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-06 15:58:40 +00:00
José Fonseca
33ffca713a gallivm: Fix lp_build_print_value of smaller integer types.
They need to be converted to the native integer type to prevent garbage
in higher order bits from being printed.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-06 15:58:40 +00:00
Brian Paul
5396582f5e llvmpipe: remove unused variable 2012-12-06 08:34:08 -07:00
Brian Paul
52b02cc676 draw: remove some dead constant buffer code
Remove the draw_vs_set_constants() and draw_gs_set_constants()
functions and the draw->vs.aligned_constants,
draw->vs.aligned_constant_storage and draw->vs.const_storage_size
fields.  None of it was used.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-12-06 07:48:53 -07:00
Chad Versace
45a01cba90 android: Fix build of libmesa_program
Commit 4097308 fixed the build in a questionable way. It worked at the
time, but, as Ian pointed out, the fix would likely fail at a future
commit due to the indeterminism of parallel builds. And that's exactly
what happened; the fix no longer works. `mm -j4` on Fedora 17 fails for
me.

The problem is that there is no rule for program_parse.tab.h. To fix that,
this patch adds a rule that makes program_parse.tab.c depend on
program_parse.tab.h. Technically, the c file does not depend on the
h file. However, because the two files are generated together by a single
invocation of Bison, any rule that forces execution of Bison is
sufficient.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-12-05 23:06:16 -08:00
Dave Airlie
77b26564c3 llvmpipe: EXT_transform_feedback support (v1.1)
I'd written most of this ages ago, but never finished it off.

This passes 115/130 piglit tests so far. I'll look into the
others as time permits.

v1.1: fix calloc return check as suggested by Jose.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-06 14:48:10 +10:00
Eric Anholt
71f06344a0 i965: Add a debug flag for counting cycles spent in each compiled shader.
This can be used for two purposes: Using hand-coded shaders to determine
per-instruction timings, or figuring out which shader to optimize in a
whole application.

Note that this doesn't cover the instructions that set up the message to
the URB/FB write -- we'd need to convert the MRF usage in these
instructions to GRFs so that our offsets/times don't overwrite our
shader outputs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)

v2: Check the timestamp reset flag in the VS, which is apparently
    getting set fairly regularly in the range we watch, resulting in
    negative numbers getting added to our 32-bit counter, and thus large
    values added to our uint64_t.
v3: Rebase on reladdr changes, removing a new safety check that proved
    impossible to satisfy.  Add a comment to the AOP defs from Ken's
    review, and put them in a slightly more sensible spot.
v4: Check timestamp reset in the FS as well.
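
The problem described in v2 can be reproduced on the CPU side. The sketch below is hypothetical (the real check happens in the generated shader code, and accumulate_cycles() is not a driver function); it only shows why a sample taken across a counter reset has to be dropped.

```c
#include <stdbool.h>
#include <stdint.h>

/* Adds one invocation's elapsed cycles to a 64-bit running total. */
void
accumulate_cycles(uint64_t *total, uint32_t start, uint32_t end, bool reset)
{
   /* If the counter was reset between the two reads, end is smaller
    * than start and the 32-bit difference wraps to a value close to
    * 2^32, which would swamp the total. Skip such samples. */
   if (reset)
      return;

   *total += (uint32_t)(end - start);
}
```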
2012-12-05 14:29:44 -08:00
Eric Anholt
ef2fbf67d4 i965: Add a flag for instructions with normal writemasking disabled.
For getting values from the new timestamp register, the channels we
load have nothing to do with the pixels dispatched.
2012-12-05 14:29:44 -08:00
Vincent Lejeune
00d77e9fe4 r600g: use default action for min/max opcode in tgsi to llvm
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-12-05 18:31:55 +01:00
Vincent Lejeune
2d97f77b9f gallivm: Have a default emit function for min/max opcode
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-12-05 18:31:18 +01:00
Vincent Lejeune
2a03f28e54 r600g: use default action for fdiv/rcp opcode
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-12-05 18:31:02 +01:00
Vincent Lejeune
0a2f58f6ed gallivm: have a default emit function for fdiv/rcp
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-12-05 18:30:39 +01:00
Vincent Lejeune
0ad1fefd69 r600g: Use default mul/mad function for tgsi-to-llvm
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-12-05 18:30:16 +01:00
Vincent Lejeune
e9f090e8b2 glsl: add new variable declaration in function body in lower_output_read
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
2012-12-05 18:23:42 +01:00
Brian Paul
d2c7fe5389 draw: set precalc_flat flag for AA lines too
Fixes flat shading for AA lines.  demos/src/trivial/line-smooth is a
test case which hits this.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-12-05 08:05:00 -07:00
Chris Forbes
484a8dcfa8 mesa: expose ARB_texture_cube_map_array in core contexts as well
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@gmail.com>
2012-12-05 16:52:37 +10:00
Vinson Lee
129a580062 scons: Require drm to build gallium/state_trackers/egl/x11/x11_screen.c.
x11_screen.c includes xf86drm.h, which comes from libdrm-dev.

This patch fixes this build error.

  Compiling src/gallium/state_trackers/egl/x11/x11_screen.c ...
src/gallium/state_trackers/egl/x11/x11_screen.c:30:21: fatal error: xf86drm.h: No such file or directory

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-12-04 22:21:04 -08:00
Eric Anholt
a64c1eb9b1 i965/fs: Add support for uniform array access with a variable index.
Serious Sam 3 had a shader hitting this path, but it's used rarely so it
didn't show a significant performance difference (n=7).  It does reduce
compile time massively, though -- one shader goes from 14s compile time
and 11723 instructions generated to .44s and 499 instructions.

Note that some shaders lose 16-wide mode because we don't support
16-wide and pull constants at the moment (generally, things looping over
a few-element array where the loop isn't getting unrolled).  Given that
those shaders are being generated with 15-20% fewer instructions, it
probably outweighs the loss of 16-wide.
2012-12-04 17:11:11 -08:00
Eric Anholt
67d9e7b581 i965/fs: Conditionalize constant-index UBO load code and add comments.
I wanted to separate this step for easier reviewing when I add the
variable-index case next.
2012-12-04 16:59:59 -08:00
Eric Anholt
f22a909a08 i965/fs: Restrict optimization that would fail for gen7's SENDs from GRFs
v2: Fix SNB math bug in register_coalesce() where I was looking at the
    instruction to be removed, not the instruction to be copy propagated
    into.
2012-12-04 16:58:46 -08:00
Eric Anholt
9156d0cba1 i965/fs: Allow source mods on gen7+ math.
This gen6 restriction was removed in gen7 as the mathbox merge to act
more like a normal instruction was finished in the hardware.
2012-12-04 16:27:54 -08:00
Eric Anholt
d8214e4384 i965/fs: Add instruction emit for varying-index reads of uniforms.
The gen7 send-from-GRF path is sufficiently different from the perspective of
IR generation and optimization that I just made it a separate opcode.

v2: fix whitespace, rebase on Ken's recent refactor.
2012-12-04 16:27:53 -08:00
Eric Anholt
29340d02dc i965/fs: Rename the existing pull constant load opcode.
We're going to use another send message for handling loads with a varying
per-fragment array index.
2012-12-04 16:27:53 -08:00
Eric Anholt
78e9c57a3e i965: Add a header_present flag for setting up dp read messages.
As of gen7, we can skip the header on some messages, and this can make
optimization on those messages much nicer when you've got GRFs instead of MRFs
as the source.
2012-12-04 16:27:53 -08:00
Eric Anholt
8f05b2f2b0 i965/gen7: Add some safety checks for send messages from GRFs. 2012-12-04 16:27:53 -08:00
José Fonseca
fb6d901ad2 gallivm: Re-add the kludge for lp_build_lerp of fixed point types.
I removed it in commit 7d44d354bd but
texture sample code still relies on it.

Not sure how to do this cleanly, so put it back for now.
2012-12-04 21:18:18 +00:00
José Fonseca
ed4dfaa164 scons: Link against librt
Fixes missing clock_gettime symbol.
2012-12-04 19:37:21 +00:00
José Fonseca
de76101672 util/u_debug: Cleanup/fix debug_dump_image.
- Handle other formats.
- Prevent CRLF on Windows.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-04 19:35:19 +00:00
José Fonseca
a416a4a91d translate: Fix the fetch function assertions.
fetch_rgba_float is NULL for integer formats, and vice-versa.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-04 19:35:19 +00:00
José Fonseca
4da0cb83ab util/u_draw: Skip rendering instead of aborting when excessive number of instances is found.
This is a temporary hack. I believe the only way of properly fixing this
is to check buffer overflow just before fetching based on addresses,
instead of number of vertices/instances. This change simply allows tests
that stress buffer overflows to complete without asserting, and should
not affect valid rendering.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-04 19:35:19 +00:00
José Fonseca
7da3a947c7 draw: Properly limit vertex buffer fetches on draw arrays.
We need to clamp vertex buffer fetch based on its size, not based on the
user specified max index hint.

This matches draw_pt_fetch_run() above.
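
Clamping by buffer size rather than by a caller-supplied hint might look like the sketch below. The helper names and parameters are hypothetical and all sizes are in bytes; the draw module's actual bookkeeping is not shown in this log.

```c
#include <stdint.h>

/* Number of vertices that can be fetched without reading past the end
 * of the buffer. */
uint32_t
max_fetchable_vertices(uint32_t buffer_size, uint32_t offset,
                       uint32_t stride, uint32_t attrib_size)
{
   if (offset > buffer_size || attrib_size > buffer_size - offset)
      return 0;             /* not even one vertex fits */
   if (stride == 0)
      return UINT32_MAX;    /* every vertex reads the same bytes */
   return (buffer_size - offset - attrib_size) / stride + 1;
}

/* The draw request is limited by what the buffer really holds, not by
 * the max index hint the application supplied. */
uint32_t
clamp_fetch_count(uint32_t requested, uint32_t buffer_size,
                  uint32_t offset, uint32_t stride, uint32_t attrib_size)
{
   uint32_t limit = max_fetchable_vertices(buffer_size, offset,
                                           stride, attrib_size);
   return requested < limit ? requested : limit;
}
```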

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-04 19:35:19 +00:00
José Fonseca
d1864273f2 draw: Use symbolic primitive names in debug output.
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-04 19:35:18 +00:00
José Fonseca
32e899ab8b draw: Consider the geometry shader when choosing the vertex size.
A single vertex size is chosen for the whole pipeline. So the number of
geometry shader outputs must also be taken into consideration.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-04 19:35:18 +00:00
José Fonseca
b636204ae8 tgsi: Allow TXF from buffers.
There is more work necessary to properly support buffers in shaders, but
this gets things a bit further along.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-04 19:35:18 +00:00
José Fonseca
c0e4ee9b27 util/surface: Always use the surface format when clearing.
Not the texture format, as they might differ.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-04 19:35:18 +00:00
José Fonseca
64f9916314 tgsi: Increase maximum number of temps to 4096.
To match Shader Model 4 limits, as specified in
http://msdn.microsoft.com/en-us/library/windows/desktop/ff471378.aspx

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-04 19:35:18 +00:00
José Fonseca
294d8a71ef llvmpipe: Fix alignment.
My understanding and the actual implementation of how the pixels are
being fetched differed.

This fixes bug 57863.

Trivial.
2012-12-04 19:33:04 +00:00
José Fonseca
7d44d354bd gallivm: Generalize lp_build_mul and lp_build_lerp for signed normalized types.
This fixes fdo bug 57755 and most of the failures of piglit fbo-blending-formats
GL_EXT_texture_snorm.

GL_INTENSITY_SNORM is still failing, but problem is probably elsewhere,
as GL_R8_SNORM works fine.
2012-12-04 19:32:50 +00:00
Dave Airlie
ec83535c83 automake/gallium: attempt to fix -lrt
Fix non-automake bits in pipe-load too.

Should fix:
http://bugs.freedesktop.org/57852

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-04 18:02:07 +10:00
Dave Airlie
a853301fb7 u_cache: fix dereference before NULL check 2012-12-04 17:55:52 +10:00
Ian Romanick
bdba4b30de intel: Always enable GL_ARB_framebuffer_object
Now that _mesa_BindFramebuffer does the right thing in ES contexts when the
gl_extensions::ARB_framebuffer_object bit is set, the Intel driver doesn't
need this hack.

No piglit or GLES2 conformance regressions observed on IVB, and this
patch (and the previous) fix es3conform's framebuffer_srgb_draw and
transform_feedback_misc tests.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-03 21:26:48 -08:00
Ian Romanick
a13f9dfbb8 mesa: Only require Gen'ed name for glBind{Framebuffer,Renderbuffer} on desktop
Desktop OpenGL implementations that support either
GL_ARB_framebuffer_object or OpenGL 3.0 must require names from
glGenFramebuffers for glBindFramebuffer.  We have enforced this rule for
quite some time.  However, OpenGL ES 1.0, 2.0, and 3.0 implementations
are required to allow user-defined names (e.g., not from
glGenFramebuffers{OES,}).

The Intel drivers have hacked around this by not enabling
GL_ARB_framebuffer_object in an ES context.  Instead, just pick the
correct behavior in _mesa_BindFramebuffer based on the context API.
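
The API-dependent rule amounts to a small predicate. The sketch below uses a hypothetical enum and function name, and it covers only the non-EXT entry points discussed here.

```c
#include <stdbool.h>

/* Hypothetical context API tags for this sketch. */
enum context_api {
   API_DESKTOP_COMPAT,
   API_DESKTOP_CORE,
   API_GLES1,
   API_GLES2_OR_3
};

/* Desktop GL requires names returned by glGenFramebuffers (or
 * glGenRenderbuffers); ES contexts must also accept names the
 * application made up itself. */
bool
bind_name_allowed(enum context_api api, bool name_was_genned)
{
   bool is_desktop = api == API_DESKTOP_COMPAT || api == API_DESKTOP_CORE;
   return name_was_genned || !is_desktop;
}
```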

Chad pointed out in a review e-mail:

    "I'd like to point out, though, that glBindFramebufferEXT and
    glBindRenderbufferEXT are still broken on desktop GL because they
    don't accept user-genned names. But that fix belongs to a different
    series."

Currently glBindFramebufferEXT is an alias for glBindFramebuffer.
Unaliasing the two functions presents some difficulty, so we'll have to
revisit this eventually.

v2: Perform same check in _mesa_BindRenderbuffer too.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
2012-12-03 21:24:54 -08:00
Brian Paul
4d2f04cd6c mesa: fix uint64 printing in syncobj.c
To silence printf format warnings.

v2: insert "0x" prefix

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-12-03 20:55:17 -07:00
Kenneth Graunke
32c6db3978 mesa: Disable GL_NV_primitive_restart extension in core contexts.
The NV formulation of primitive restart is turned on/off with
glEnableClientState/glDisableClientState.  These two functions don't
exist in core contexts, which means that GL_NV_primitive_restart is
essentially useless...even broken.

However, leaving it on causes oglconform's primitive-restart-nv tests to
run in OpenGL 3.1 contexts, which results in them all failing.  This
patch causes 29 subtests to go from "fail" to "not run".

NOTE: This is a candidate for stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-03 17:59:28 -08:00
Kenneth Graunke
3ac97c6ccc i965: Allow INTEL_DEBUG=fs as a synonym for INTEL_DEBUG=wm.
I keep accidentally trying to use it.  "fs" is a sensible name for
fragment shader debugging, and "wm" is...not.  It's also more symmetric
with "vs".

Leave INTEL_DEBUG=wm because old habits die hard.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-03 17:57:43 -08:00
Johannes Obermayr
21694b8eac gallium/auxiliary: Add -fno-rtti to CXXFLAGS on LLVM >= 3.2.
Also remove the recently added and overloaded LLVM_CXXFLAGS from CXXFLAGS.

Note: This is a candidate for the stable branches.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-12-03 19:07:43 -05:00
Stefan Dösinger
e866bd1ade r300g: Give CLIP_DISABLE another try
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2012-12-04 00:07:13 +01:00
Eric Anholt
b126228f12 i965: Include codegen time in the INTEL_DEBUG=perf stall detection.
In the VS case, we were missing the entire compile time in the stall
detection!

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-03 13:19:50 -08:00
Eric Anholt
0f06864ba5 i965: Don't leak the IR annotation into later instructions.
After walking our IR instructions (Mesa or GLSL), we don't want to also
mark the start of the FB/URB writes or whatever as being that IR.  This
can end up being misleading when the end of the IR visit got copy
propagated out to a later instruction in the URB writes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-03 13:19:46 -08:00
Eric Anholt
1db9a72351 i965/vp: Fix crashes with INTEL_DEBUG=vs.
The VP generation doesn't set up the output reg strings, so if you
didn't happen to get these values as 0 on the stack, you'd lose.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-03 13:19:43 -08:00
Eric Anholt
0e5f94a552 i965/vs: Fix uninitialized shader pointer used in debug output.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-03 13:19:34 -08:00
Adrian Marius Negreanu
409730843f android: fix bison warning of conflicting outputs to file
Bison -o parameter expects a .c file.
The corresponding .h filename is obtained
by removing the extension of the initial .c.

This was breaking compilation on Ubuntu 12.04

libmesa_dricore_intermediates/libmesa_dricore.a(program_parse.tab.o): In
function `_mesa_parse_arb_program':
external/mesa/src/mesa/program/program_parse.y:2682: multiple definition
of `_mesa_parse_arb_program'
libmesa_dricore_intermediates/libmesa_dricore.a(lex.yy.o):external/mesa/src/mesa/program/program_parse.y:2682:
first defined here

Signed-off-by: Adrian Marius Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
2012-12-03 12:12:28 -08:00
Brian Paul
a4311054c7 st/mesa: add null pointer check in st_renderbuffer_delete()
In my testing I haven't found any cases where we get a null context
pointer, but it might still be possible.  Check for null just to be safe.

Note: This is a candidate for the stable branches.
2012-12-03 11:30:42 -07:00
Brian Paul
c6d74bfaf6 st/glx: accept GLX_SAMPLE_BUFFERS/SAMPLES_ARB == 0
Only fail if GLX_SAMPLE_BUFFERS_ARB or GLX_SAMPLES_ARB are non-zero.
We were already doing this in the older swrast/glx code.

This fixes a piglit/waffle problem where we'd always fail to get a
visual/config and report the test as "skip".

Note: This is a candidate for the stable branches.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-03 11:10:09 -07:00
Brian Paul
006918c0db mesa: remove warning message in _mesa_reference_renderbuffer_()
We were warning when there was no current context and we're about
to delete a renderbuffer, but that happens fairly often and isn't
really a problem.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=57754

Note: This is a candidate for the stable branches.

Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-03 11:10:09 -07:00
James Benton
16f0d70ffe llvmpipe: Implement PIPE_QUERY_TIMESTAMP and PIPE_QUERY_TIME_ELAPSED.
This required an update for the query storage in llvmpipe, there
can now be an active query per query type, so an occlusion query
can run at the same time as a time elapsed query.

Based on PIPE_QUERY_TIME_ELAPSED patch from Dave Airlie.

v2: fix up piglits for timers (also from Dave Airlie)

a) if we don't render anything the result is 0, so just
return the current time

b) add missing screen get_timestamp callback.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-12-03 17:21:57 +00:00
Roland Scheidegger
041966801e gallivm: fix srgb format fetch
we need to rely on util code for fetching those, just like before
9f06061d50.
Fixes bugs 57699 and 57756.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-12-03 14:10:36 +00:00
José Fonseca
6a2f2300a8 llvmpipe: Refactor convert_to/from_blend_type to convert in place.
This fixes the "Source and destination overlap in memcpy" valgrind
warnings.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-12-03 14:02:43 +00:00
José Fonseca
03aa3fd54b llvmpipe: Improve color buffer loads/stores alignment.
Tell LLVM the exact alignment we can guarantee, based on the fs block
dimensions, pixel format, and the alignment of the resource base pointer
and stride.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-12-03 14:02:43 +00:00
José Fonseca
0bc6ec238b llvmpipe: Recompute the fs shader key when framebuffer varies.
The fs shader now depends on the color buffer formats. The shader key was
extended to accommodate this, but llvmpipe_update_derived needs to be
updated to check the framebuffer dirty flag.

This fixes bug 57674.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2012-12-03 14:02:43 +00:00
Marek Olšák
54ff536823 r300g: increment num_z_clears only if we have Hyper-Z 2012-12-02 22:22:39 +01:00
Marek Olšák
838b19609f r300g: add blacklist for apps that shouldn't steal hyperz access 2012-12-02 22:18:11 +01:00
Marek Olšák
12dcbd5954 r300g: enable Hyper-Z by default on r500
I fixed the only known bugs on r500 with 0222b2bd41.
Now there are no piglit regressions with Hyper-Z and all apps I tested seem
to work.

To summarize how it works:
- Only one process can use it at a time. This is a hardware limitation.
- The first process to clear a zbuffer gets the exclusive access to use
  Hyper-Z.
- Compositors don't use any zbuffer, so they won't steal it, but some web
  browsers do, so make sure there's no web browser running if you want your
  game to use Hyper-Z.
- There's no need to restart an app which couldn't get the access to Hyper-Z.
  Just quit the app which took it, the driver can turn it on for the other app
  in the middle of rendering.
- If an app gets the access to Hyper-Z, it prints "radeon: Acquired Hyper-Z"
  to stdout.

r300-r400:
  Hyper-Z will be enabled by default on r300-r400 once sufficient testing is
  done with piglit and Lightsmark at least.
  Be sure to set the env var RADEON_HYPERZ and run piglit with parameters: -c 0
2012-12-02 18:07:26 +01:00
Marek Olšák
0222b2bd41 r300g: clear the ZB cache before clearing ZMASK or HIZ
This fixes wrong rendering in Lightsmark and
the piglit/depthstencil-render-miplevels.

I think I fixed Hyper-Z. So far every app seems to work like a charm.
2012-12-02 07:07:33 +01:00
Marek Olšák
62cba629c0 Revert "r300g: fix occlusion queries when depth test is disabled or zbuffer is missing"
It broke Hyper-Z terribly.
2012-12-02 07:07:33 +01:00
Chad Versace
e5f1f8d52e dri: Fix i965 build
The following commit broke the i965 build:

    commit 4a486f8bf2
    Author: Marek Olšák <maraeo@gmail.com>
    Date:   Fri Nov 23 18:31:42 2012 +0100

    glx/dri2: add and use new driver hook flush_with_flags

That commit added a forward declaration of enum __DRI2throttleReason to
dri_interface.h. C++ 98 does not allow forward declarations of enums.

The fix: Move the enum's definition to earlier in the file.
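
As a rough illustration of that fix, the sketch below defines an enum before
its first use instead of forward-declaring it. All names here are hypothetical
stand-ins; this is not the content of dri_interface.h.

```cpp
#include <cassert>

// Hypothetical names throughout. C++98 rejects a bare "enum ThrottleReason;"
// forward declaration, because the enum's underlying type is not yet known.
// Defining the enum before its first use avoids the problem.
enum ThrottleReason {
    THROTTLE_REASON_A,
    THROTTLE_REASON_B
};

// A driver-hook-style function pointer that takes the enum by value.
typedef int (*flush_hook)(unsigned flags, enum ThrottleReason reason);

// Example hook: returns the reason as an int so the call is observable.
inline int example_flush(unsigned flags, enum ThrottleReason reason) {
    (void)flags;
    return static_cast<int>(reason);
}
```

Assigning example_flush to a flush_hook variable and calling it compiles
because the enum is complete at the point where the typedef uses it.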

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-12-01 17:08:41 -08:00
Marek Olšák
3039addf93 st/dri: implement new driver hook flush_with_flags
v2: added documentation for dri_flush as per Brian's request
2012-12-02 00:19:02 +01:00
Marek Olšák
4003961fbf st/mesa: make st_flush do what glFlush does 2012-12-02 00:19:02 +01:00
Marek Olšák
4a486f8bf2 glx/dri2: add and use new driver hook flush_with_flags 2012-12-02 00:19:00 +01:00
Marek Olšák
5b7e9b7360 glx: move the glFlush call one layer down 2012-12-02 00:15:00 +01:00
Marek Olšák
8ad9d42b33 r300g: refuse to create too large textures 2012-12-01 22:41:39 +01:00
Marek Olšák
e694ea09f5 r300g: fix memory leaks in texture_create error paths 2012-12-01 22:38:36 +01:00
Marek Olšák
3e3a586236 r300g: fix revoking hyperz access
The bug was uncovered by 67c8e96f5a.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57763
2012-12-01 21:43:17 +01:00
Roland Scheidegger
587bd16d0d gallivm: drop border wrap clamping code
The border clamping code is unnecessary, since we don't care if a wrapped
coord value is -1 or <-1 (same for length vs. >length), in either case the
border handling code will mask out the offset and replace the texel value with
the border color.
Note that technically this is not entirely correct. Omitting clamping on the
float coords means that flt->int conversion may result in undefined values for
values of very large magnitude.
However there's no reason we should honor this here since:
a) we don't care for that for ordinary wrap modes in the aos code when
   converting coords and the problem is worse there (as we've got only
   effectively 24 instead of 32bits)
b) at least in some cases the clamping was done already in int space hence
   doing nothing to fix that problem.
c) with sse2 flt->int conversion with such values results in 0x80000000 which
   is just perfect (for clamp to border - not so much for the ordinary clamp to
   edge).

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-01 17:05:48 +01:00
Marek Olšák
224d0e4a3f r300g: handle map flag DISCARD_WHOLE_RESOURCE
This should improve performance in apps which trigger this codepath.
(e.g. Wine does)
2012-12-01 14:33:11 +01:00
Vinson Lee
da7029dcb4 radeon: Fix memory leak in radeonCreateScreen2.
Fixes a memory leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-30 19:34:31 -08:00
Brian Paul
a17750b688 nouveau: Fix build.
Fixes nouveau build failure introduced at
c73245882c.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57746
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2012-11-30 19:11:21 -08:00
Dave Airlie
f3476ec8fa glsl: fix uninitialised variable from constructor
Coverity pointed out this uninitialised class member.

Note: This is a candidate for stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-01 11:27:16 +10:00
Dave Airlie
906670a790 glsl: initialise killed_all field.
coverity pointed out this field was being used uninitialised.

Note: This is a candidate for stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-01 11:27:16 +10:00
Dave Airlie
d128ae347a svga: remove pointless assert on unsigned >= 0
all unsigneds are >= 0 :-)

There may be an argument for leaving this in, in case someone
changes min_lod to an integer, so feel free to apply or drop.
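
A minimal sketch of the point being made, using hypothetical helper names
rather than the svga code: the unsigned check can never fail, while the same
check on a signed int can.

```cpp
#include <cassert>
#include <climits>

// True for every possible unsigned value, so an assert on this condition
// can never fire. Compilers may warn that the comparison is always true.
inline bool non_negative(unsigned v) { return v >= 0; }

// If the field were changed to a signed int, the check becomes meaningful.
inline bool non_negative(int v) { return v >= 0; }
```

Note that even -1 converted to unsigned passes the first overload, which is
why the assert guards nothing while the field stays unsigned.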

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-01 11:25:15 +10:00
Dave Airlie
e85c9a4d28 glsl: fix cut-n-paste error in error handling. (v2)
Reported by coverity scan.

v2: fix second case

Note: This is a candidate for stable branches.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-01 11:24:16 +10:00
Dave Airlie
67c8e96f5a r300g: fix comparison of hyperz flush time.
I haven't confirmed this is doing the correct thing, but at
least this might make someone review it!

Reported by internal RH coverity scan.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-12-01 11:23:48 +10:00
Dave Airlie
a0ec9185eb dri_glx: fix use after free report
the critical error would use driverName.

Found by internal RH coverity scan.

Note: This is a candidate for stable branches.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-01 11:20:36 +10:00
Carl Worth
a47a0200a7 Revert "glcpp: Rewrite line-continuation support to act globally."
This reverts commit 962a1c07b4.

Further testing revealed that this commit can cause the pre-processor to enter
infinite loops. For now, simply revert this code until a cleaner,
better-tested version is available.
2012-11-30 17:17:56 -08:00
Carl Worth
962a1c07b4 glcpp: Rewrite line-continuation support to act globally.
Previously, we were only supporting line-continuation backslash characters
within lines of pre-processor directives, (as per the specification). With
OpenGL 4.2 and GLES3, line continuations are now supported anywhere within a
shader.

While changing this, also fix a bug where the preprocessor was ignoring
line continuation characters when a line ended in multiple backslash
characters.

The new code is also more efficient than the old. Previously, we would
perform a ralloc copy at each newline. We now perform copies only at each
occurrence of a line-continuation.
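
To make "acting globally" concrete, here is a minimal sketch that removes
every backslash-newline pair from a source string. It is an illustration
only, not the glcpp implementation; the function name is hypothetical, and
treating only the last backslash of a run as the continuation is an
assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: drop each backslash that is immediately followed by
// a newline, together with that newline, anywhere in the input.
inline std::string splice_continuations(const std::string &src) {
    std::string out;
    out.reserve(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (src[i] == '\\' && i + 1 < src.size() && src[i + 1] == '\n') {
            ++i;  // skip the newline as well
            continue;
        }
        out += src[i];
    }
    return out;
}
```

With several backslashes before the newline, this sketch consumes only the
final one and keeps the rest as ordinary characters.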

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-30 15:08:38 -08:00
Ander Conselvan de Oliveira
60a11e295b egl/wayland: Dispatch the event queue before get_buffers
When a client frame callback is executed and the client starts rendering
again, the egl event queue might not have been dispatched so that the
buffer release event for the previous frame hasn't been processed. In
that case a third buffer is allocated, even though it would be possible
to reuse the buffer that was just released.

The wl_display_dispatch_queue_pending() entry point is available from
wayland-client 1.0.2, so require that in configure.ac.  Also, just
let the pkg-config macro throw its own error, which will show what version
we were looking for and failed to find.

Note: This is a candidate for stable branches.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
2012-11-30 17:05:50 -05:00
Kristian Høgsberg
89ba4368fd egl/wayland: Add invalidate back in eglSwapBuffers()
Commit ca3ed3e024 fixed the problem where
eglMakeCurrent would trigger a getbuffer callback that then breaks the
following wl_egl_window_resize() call.  However, we still need to
invalidate buffers in eglSwapBuffers, since in wayland we always swap
buffers, so the dri driver needs to come out and ask us for the next buffer
after each swapbuffer.

Note: this is a candidate for stable branches.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2012-11-30 17:04:22 -05:00
Kenneth Graunke
8d0bb74a11 i965/fs: Add fs_reg::is_zero() and is_one(); use for opt_algebraic().
These helper functions save you from writing nasty expressions like:

   if ((inst->src[1].type == BRW_REGISTER_TYPE_F &&
         inst->src[1].imm.f == 1.0) ||
        ((inst->src[1].type == BRW_REGISTER_TYPE_D ||
          inst->src[1].type == BRW_REGISTER_TYPE_UD) &&
         inst->src[1].imm.u == 1)) {

Instead, you simply get to write inst->src[1].is_one().  Simple.
Also, this makes the FS backend match the VS backend (which has these).

This patch also converts opt_algebraic to use the new helper functions.
As a consequence, it will now also optimize integer-typed expressions.
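
A simplified sketch of what such helpers can look like. The Reg struct, its
fields, and the imm_* constructors are hypothetical stand-ins, not the real
fs_reg; in particular, the is_imm flag is an assumption about how immediates
are distinguished.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified register type; not the real fs_reg.
enum RegType { REG_F, REG_D, REG_UD };

struct Reg {
    bool is_imm;   // assumption: only immediates can equal a constant
    RegType type;
    union {
        float f;
        std::int32_t d;
        std::uint32_t u;
    } imm;

    bool is_zero() const { return is_imm && equals(0); }
    bool is_one() const { return is_imm && equals(1); }

private:
    // Compare the immediate against a small constant in its own type.
    bool equals(int k) const {
        switch (type) {
        case REG_F:  return imm.f == static_cast<float>(k);
        case REG_D:  return imm.d == k;
        case REG_UD: return imm.u == static_cast<std::uint32_t>(k);
        }
        return false;
    }
};

// Hypothetical constructors for float, signed, and unsigned immediates.
inline Reg imm_f(float v) { Reg r; r.is_imm = true; r.type = REG_F; r.imm.f = v; return r; }
inline Reg imm_d(std::int32_t v) { Reg r; r.is_imm = true; r.type = REG_D; r.imm.d = v; return r; }
inline Reg imm_ud(std::uint32_t v) { Reg r; r.is_imm = true; r.type = REG_UD; r.imm.u = v; return r; }
```

Because the comparison dispatches on the register type, the same is_one()
call covers float and integer immediates, which is how an algebraic pass
would pick up integer-typed expressions too.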

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-30 13:15:39 -08:00
Brian Paul
4cedb65a43 st/mesa: fix context use-after-free problem in st_renderbuffer_delete()
The use-after-free happened when the renderbuffer was shared by multiple
contexts and we tried to delete the renderbuffer using a context which
was previously deleted.

Note: this is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-30 12:08:07 -07:00
Brian Paul
51223784d6 util: added pipe_surface_release() function
To fix a pipe_context::surface_destroy() use-after-free problem.
We previously added pipe_sampler_view_release() for similar reasons.

Note: this is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-30 12:08:07 -07:00
Brian Paul
c73245882c mesa: pass context parameter to gl_renderbuffer::Delete()
We sometimes need a rendering context when deleting renderbuffers.
Pass it explicitly instead of trying to grab a current context
(which might be NULL).  The next patch will make use of this.

Note: this is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-30 12:08:06 -07:00
Ander Conselvan de Oliveira
ca3ed3e024 egl/wayland: Don't invalidate drawable on swap buffers
We used to invalidate the drawable after a call to eglSwapBuffers(),
so that a wl_egl_window_resize() would take effect for the next frame.
However, that leads to calling dri2_get_buffers() when eglMakeCurrent()
is called with the current context and surface, and a later call to
wl_egl_window_resize() would not take effect until the next buffer
swap.

Instead, add a callback from wl_egl_window_resize() back to the wayland
egl platform, and invalidate the drawable only when it is resized.

This solves a bug on wayland clients when going back to windowed mode
from fullscreen when clicking a pop up menu, where the window size
after this would be the fullscreen size.

Note: this is a candidate for stable branches.
CC: wayland-devel@lists.freedesktop.org
2012-11-30 11:08:04 -05:00
Kristian Høgsberg
b5c53245af egl: Only enable GLX backend if X11 EGL platform is enabled
We don't want to compile in a bunch of X11 dependencies in libEGL if
we can't run EGL on X11.
2012-11-30 11:08:03 -05:00
José Fonseca
e7177e362e llvmpipe: Remove remnants of lp_tile_soa from Makefile.
Completely forgot about updating Makefile when removing it. Stephane
already fixed the make build, but there were a few mentions of
lp_tile_soa left in the tree.
2012-11-30 07:07:38 +00:00
Eric Anholt
2f7915bdb9 i965/fp: Fix segfault on gen4 TXB instructions.
The gen4 simd16 workaround looks at ir->type to determine how much
storage to allocate for the simd16 value.  In fragment programs,
texturing only ever returns float vec4s (unlike GLSL, which can also
have scalar floats or vector integers), so this is the right type.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56962
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-29 22:34:28 -08:00
Vinson Lee
f126f34c1d llvmpipe: Fix incorrect sizeof.
Fixes sizeof not portable defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-29 21:08:48 -08:00
Stéphane Marchesin
4430d44eac llvmpipe: Fix build break from 75da95c50
The Makefile looks for a file which is gone (lp_tile_soa.c)

http://bugs.freedesktop.org/show_bug.cgi?id=57713
2012-11-29 19:54:34 -08:00
Anuj Phogat
9ab896243c mesa: Fix GL_LUMINANCE handling for textures in glGetTexImage
We need to rebase colors (ex: set G=B=0) when getting GL_LUMINANCE
textures in following cases:
1. If the luminance texture is actually stored as rgba
2. If getting a luminance texture, but returning rgba
3. If getting an rgba texture, but returning luminance
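
The rebase step mentioned above (set G=B=0) can be sketched as follows. The
function name and the packed float RGBA layout are assumptions made for
illustration; this is not the Mesa helper.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: rgba holds count pixels of 4 floats each (R, G, B, A).
// Luminance lives in R, so G and B are cleared; R and A are left untouched.
inline void rebase_luminance(float *rgba, std::size_t count) {
    assert(rgba != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        rgba[i * 4 + 1] = 0.0f;  // G
        rgba[i * 4 + 2] = 0.0f;  // B
    }
}
```

Running it over a buffer leaves only the red channel (and alpha) carrying
data, so a later conversion cannot pick up stale green or blue values.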

A similar fix was pushed by Brian Paul for uncompressed textures
in commit: f5d0ced.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=47220

Observed no regressions in piglit and ogles2conform due to this fix.
This patch will cause failures in intel oglconform pxconv-gettex,
pxstore-gettex and pxtrans-gettex test cases. The cause of failures
is a bug in the test cases. The expected luminance value is calculated
incorrectly in the test cases: L = R+G+B.

V2: Set G = 0 when getting a RG texture but returning luminance.

Note: This is a candidate for stable branches.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
2012-11-29 14:05:25 -08:00
Kenneth Graunke
53ba40c156 Revert "meta: Don't try to glOrtho when the draw buffer isn't initialized."
This reverts commit 9947470655.
Apparently it caused a lot of Piglit regressions.
2012-11-29 13:49:07 -08:00
Vincent Lejeune
3fcb3fbf22 r600g: mirror simplification of if/break opcodes
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-11-29 22:15:18 +01:00
Vincent Lejeune
5fda2990aa r600g: separate resource_id and sampler_id tex info in tgsi-to-llvm
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-11-29 22:15:18 +01:00
Carl Worth
9ff6b52886 glcpp: Update README for new support of __LINE__ and __FILE__.
Drop these from the known limitations list since support was recently added
for these.

Also, fix a typo while in the area, (and the oddly missing final newline).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-29 13:03:02 -08:00
Carl Worth
89cd6df034 glcpp: Add test involving token pasting of INTEGER tokens.
This test file is very similar to test 113-line-and-file-macros but uses token
pasting for cleaner quiz answers (without spaces between the digits). This
test passes thanks to the recent addition of support for pasting INTEGER
tokens, (but would have failed without that).

(Note that this test is distinct from test 059-token-pasting-integer which
pastes integers parsed from the source. Those are parsed to INTEGER_STRING
tokens and are already pasted correctly as verified by that test. The only way
to generate the INTEGER tokens which currently fail to paste is with an
internal define such as __LINE__ that results in an integer.)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-29 13:03:02 -08:00
Carl Worth
522d1ccd77 glcpp: Add support for pasting of INTEGER tokens.
By generalizing the current code designed to paste string tokens of various
types.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-29 13:03:02 -08:00
Carl Worth
e1597f0a81 glcpp: Flag invalid pastes for integer followed by non-digits
As recently tested in the additions to the invalid paste test, it is illegal
to paste a non-digit sequence onto the end of an integer.

The 082-invalid-paste test should now pass again.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-29 13:03:01 -08:00
Carl Worth
c86eb0cd65 glcpp: Extend the invalid-paste test
The current code lets a few invalid pastes through, such as a string pasted
onto the end of an integer. Extend the invalid-paste test to catch some of
these.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-29 13:03:01 -08:00
Carl Worth
01b83171c9 glcpp: More factoring-out of common code to simplify things.
This time creating a new _token_list_create_with_one_integer function
modeled after the existing _token_list_create_with_one_space function
(both implemented with new _token_list_create_with_one_ival).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-29 13:03:01 -08:00
Carl Worth
ea34ac499d glcpp: Factor out a tiny bit of repeated code.
This function is getting a little long to read. Simplify it by pulling
up one assignment from every condition.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-29 13:03:01 -08:00
Carl Worth
907a10378e glcpp: Add support for __LINE__ and __FILE__ macros
These tokens are easy to expand by just looking at the current, tracked
location values, (and no need to look anything up in the hash table).

Add a test which verifies __LINE__ with several values, (and verifies __FILE__
for the single value of 0). Our testing framework isn't sophisticated enough
here to have a test with multiple file inputs.

This commit fixes part of es3conform's preprocess16_frag test.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-29 13:03:01 -08:00
Paul Berry
dbd6135bc1 mesa: Rename API_OPENGL to API_OPENGL_COMPAT.
This should help avoid confusion now that we're using the gl_api enum
to distinguish between core and compatibility APIs.  The
corresponding enum value for core APIs is API_OPENGL_CORE.

Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-29 11:33:15 -08:00
Marek Olšák
3e163a137b gallium/postprocess: share pipe_context and cso_context with the state tracker
Using one context instead of two is more efficient and
we can skip another context flush.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-29 20:31:41 +01:00
Marek Olšák
135fe907a0 mesa: move some helper functions from fboobject.c to glformats.c
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-29 20:31:41 +01:00
Tapani Pälli
0fda2e9147 android: include api_exec.c in generated files list
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-29 09:14:27 -08:00
José Fonseca
9c9c18a395 gallivm: Fix lp_build_float_to_half.
The current implementation was close but not fully correct: several
operations that should be done in floating point were being done in
integer.

Fixes piglit fbo-clear-formats GL_ARB_texture_float

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-29 16:52:42 +00:00
Roland Scheidegger
b5918d8f1d gallivm: fix a trivial txq issue for 2d shadow and cube shadow samplers
Untested (I couldn't get the piglit test to run even with version overrides),
but the old code seemed blatantly wrong.
In any case, this only affects an error case, and if that happens
all hope is probably lost anyway.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-29 15:31:46 +01:00
Roland Scheidegger
6d50148742 llvmpipe: support array textures
This adds array (1d,2d) texture support to llvmpipe.
We should probably do something about 1d array textures requiring gobs
of memory, though (this issue is not strictly limited to arrays, but it is
probably worse there).
Initial code by Jakob Bornecrantz <jakob@vmware.com>

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-29 15:30:19 +01:00
Roland Scheidegger
95e03914d8 gallivm: support array textures
Support 1d and 2d array textures (including shadow samplers),
and (as a side effect mostly) also shadow cube samplers.
Seems to pass the relevant piglit tests both for sampling and rendering
to (though some require version overrides).
Since we don't support render target indices, rendering to array textures
is still restricted to a single layer at a time.
Also, the min/max layer in the sampler view (which is unnecessary for GL)
is ignored (always use all layers).

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-29 15:28:25 +01:00
José Fonseca
88e92f5bcd llvmpipe: Remove lp_build_blend_soa()
No longer used/necessary, as we always blend in AoS now.

Trivial.
2012-11-29 14:08:43 +00:00
José Fonseca
75da95c50a llvmpipe: Eliminate color buffer swizzling.
Now dead code.

Also had to remove the show_tiles/show_subtiles because now the color
buffers are always stored in their native format, so there is no longer
an easy way to paint the tile sizes.

Depth-stencil buffers are still swizzled.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-29 14:08:43 +00:00
José Fonseca
6916387e53 llvmpipe: Only advertise unswizzled formats.
Update llvmpipe_is_format_supported and llvmpipe_is_format_unswizzled
so that only the formats that we can render without swizzling are
advertised.

We can still render all D3D10 required formats except
PIPE_FORMAT_R11G11B10_FLOAT, which needs to be implemented in a future
opportunity.

Removal of rendertarget swizzling will be done in a subsequent change.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-29 14:08:42 +00:00
José Fonseca
9f06061d50 util/u_format: Kill util_format_is_array().
It is buggy (it was giving wrong results for some of the formats with
padding), and util_format_description::is_array already does precisely
what's intended.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-29 14:08:42 +00:00
José Fonseca
a47674ee89 util/u_format: Tighten the meaning of is_array bit to exclude mixed type formats.
This is what we want in practice.

The only change is in PIPE_FORMAT_R8SG8SB8UX8U_NORM, which no longer is
considered an array format.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-29 14:08:42 +00:00
Adhemerval Zanella
64e9ec634b util/u_format: Fix format manipulation for big-endian
This patch fixes various format manipulation for big-endian
architectures.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:54:23 +00:00
Adhemerval Zanella
e25abacc18 gallivm: Fix format manipulation for big-endian
This patch fixes various format manipulation for big-endian
architectures.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:54:18 +00:00
Adhemerval Zanella
b772d784b2 gallivm: Add byte-swap construct calls
This patch adds two more functions in type conversions header:
* lp_build_bswap: construct a call to llvm.bswap intrinsic for an
  element
* lp_build_bswap_vec: byte swap every element in a vector based on the
  input and output types.
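
For reference, a plain scalar sketch of what a 32-bit byte swap computes,
applied per element. These helper names are hypothetical; the functions in
the patch build LLVM IR rather than swapping values directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reverse the byte order of one 32-bit value.
inline std::uint32_t bswap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Byte swap every element of an array in place.
inline void bswap32_vec(std::uint32_t *elems, std::size_t count) {
    assert(elems != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i)
        elems[i] = bswap32(elems[i]);
}
```

Applying bswap32 twice returns the original value, which is a quick sanity
check for any byte-swap routine.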

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:54:14 +00:00
Adhemerval Zanella
86902b5134 gallivm: Fix vector constant for shuffle
This patch fixes the vector constant generation used for vector shuffle
for big-endian machines.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:54:10 +00:00
Adhemerval Zanella
29ba79b2c9 gallivm: clear Altivec NJ bit
This patch enforces clearing of the NJ bit in the VSCR Altivec register so
that denormal numbers are handled as expected by IEEE standards.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:52:05 +00:00
Adhemerval Zanella
43ce9efdbf gallivm: Altivec floating-point rounding
This patch adds Altivec intrinsics for float vector types. It changes
the SSE-specific definitions to platform-neutral ones and adds the calls
to the Altivec intrinsic builder.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:52:00 +00:00
Adhemerval Zanella
dd5c580816 gallivm: Altivec vector add/sub intrisics
This patch adds correct vector addition and subtraction intrinsics when
using Altivec with PPC. The current code uses the default path and the LLVM
backend ends up issuing carry-out arithmetic instructions where saturated
ones are expected.
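
To show what "saturated" means here, the sketch below contrasts a plain
modulo add with a saturating add on unsigned 8-bit values. The helper names
are hypothetical, and the mapping to specific Altivec instructions is not
shown.

```cpp
#include <cassert>
#include <cstdint>

// Non-saturating add: the result wraps around modulo 256.
inline std::uint8_t add_wrap_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}

// Saturating add: the result is clamped to the largest 8-bit value.
inline std::uint8_t add_sat_u8(std::uint8_t a, std::uint8_t b) {
    unsigned sum = static_cast<unsigned>(a) + b;
    return static_cast<std::uint8_t>(sum > 255u ? 255u : sum);
}
```

For inputs 200 and 100 the wrapping version yields 44 while the saturating
version yields 255, which is the kind of difference that shows up as visibly
wrong pixel values.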

It also includes a fix for PowerPC where char are unsigned by default,
resulting in bogus values for vector shifting.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:51:53 +00:00
Adhemerval Zanella
2ea7d3dabd gallivm: Altivec vector max/min intrisics
This patch adds the PPC Altivec max/min intrinsics for
supported Altivec vector types (16xi8, 8xi16, 4xi32, 4xf32).

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:51:46 +00:00
Adhemerval Zanella
31c63b058e gallivm: Altivec pack/unpack intrisics
This patch adds PPC Altivec support for pack/unpack operations using Altivec
supported vector types (16xi8, 8xi16, 4xi32, 4xf32).

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:51:41 +00:00
Michel Dänzer
8b6aec6533 radeonsi: Bitcast result of packf16 intrinsic to float for export intrinsic.
Fixes 7 piglit tests, and prevents many more from crashing.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-and-Tested-by: Christian König <christian.koenig@amd.com>
2012-11-29 10:08:53 +01:00
Kenneth Graunke
c102360800 i965/vs: Move struct brw_compile (p) entirely inside vec4_generator.
The brw_compile structure contains the brw_instruction store and the
brw_eu_emit.c state tracking fields.  These are only useful for the
final assembly generation pass; the earlier compilation stages doesn't
need them.

This also means that the code generator for future hardware won't have
access to the brw_compile structure, which is extremely desirable
because it prevents accidental generation of Gen4-7 code.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-11-28 18:16:01 -08:00
Kenneth Graunke
eda9726ef5 i965/vs: Split final assembly code generation out of vec4_visitor.
Compiling shaders requires several main steps:

   1. Generating VS IR from either GLSL IR or Mesa IR
   2. Optimizing the IR
   3. Register allocation
   4. Generating assembly code

This patch splits out step 4 into a separate class named "vec4_generator."

There are several reasons for doing so:

   1. Future hardware has a different instruction encoding.  Splitting
      this out will allow us to replace vec4_generator (which relies
      heavily on the brw_eu_emit.c code and struct brw_instruction) with
      a new code generator that writes the new format.

   2. It reduces the size of the vec4_visitor monolith.  (Arguably, a lot
      more should be split out, but that's left for "future work.")

   3. Separate namespaces allow us to make helper functions for
      generating instructions in both classes: ADD() can exist in
      vec4_visitor and create IR, while ADD() in vec4_generator() can
      create brw_instructions.  (Patches for this upcoming.)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-11-28 18:15:58 -08:00
Kenneth Graunke
db6231fece i965/vs: Abort on unsupported opcodes rather than failing.
Final code generation should never fail.  This is a bug, and there
should be no user-triggerable cases where this could occur.

Also, we're not going to have a fail() method after the split.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-11-28 18:15:57 -08:00
Kenneth Graunke
8af8a26480 i965/vs: Move uses of brw_compile from do_vs_prog to brw_vs_emit.
The brw_compile structure is closely tied to the Gen4-7 hardware
encoding.  However, do_vs_prog is very generic: it just calls out to
get a compiled program and then uploads it.

This isn't ultimately where we want it, but it's a step in the right
direction: it's now closer to the code generator.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-11-28 18:15:55 -08:00
Kenneth Graunke
746fc346ea i965/vs: Rework memory contexts for shader compilation data.
During compilation, we allocate a bunch of things: the IR needs to last
at least until code generation...and then the program store needs to
last until after we upload the program.

For simplicity's sake, just keep it all around until we upload the
program.  After that, it can all be freed.

This will also save a lot of headaches during the upcoming refactoring.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-11-28 18:15:53 -08:00
Kenneth Graunke
031146736c i965/vs: Pass the brw_context pointer into brw_compute_vue_map().
We used to steal it out of the brw_compile struct, but that won't be
initialized in time soon (and is eventually going away).

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-11-28 18:15:51 -08:00
Kenneth Graunke
403bb1d306 i965/vs: Pass the brw_context pointer into vec4_visitor and do_vs_prog.
We used to steal it out of the brw_compile struct...but vec4_visitor
isn't going to have one of those in the future.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-11-28 18:15:50 -08:00
Kenneth Graunke
dd50c88386 i965/vs: Move some functions from brw_vec4_emit.cpp to brw_vec4.cpp.
This leaves only the final code generation stage in brw_vec4_emit.cpp,
moving the payload setup, run(), and brw_vs_emit functions to brw_vec4.cpp.

The fragment shader backend puts these functions in brw_fs.cpp, so this
patch also helps with consistency.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-11-28 18:15:26 -08:00
Kenneth Graunke
9947470655 meta: Don't try to glOrtho when the draw buffer isn't initialized.
I ran across this while running a glGenerateMipmap() test.

_meta_GenerateMipmap sets MESA_META_TRANSFORM, which causes
_mesa_meta_begin to try and set a default orthographic projection.

Unfortunately, if the drawbuffer isn't set up, ctx->DrawBuffer->Width
and Height are 0, which just causes a GL_INVALID_VALUE error.

Fixes oglconform's fbo/mipmap.automatic, mipmap.manual, and
mipmap.manualIterateTexTargets.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-28 18:12:07 -08:00
Jason Wood
8d1ee38a4c docs: Mark some features in GL3.txt as done for r600
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2012-11-29 01:07:26 +01:00
Marek Olšák
aa46cc2879 st/mesa: allow forward-compatible contexts and set Const.ContextFlags
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-11-29 01:07:26 +01:00
Marek Olšák
249f86e3f8 st/mesa: add support for GL core profiles
The rest of the plumbing was in place already.

I have tested this by turning on all GL 3.1 features.
The drivers not supporting GL 3.1 will fail to create a core profile
as they should.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-11-29 01:07:26 +01:00
Marek Olšák
f9429e30aa configure.ac: remove -fomit-frame-pointer from LLVM flags
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-11-29 00:07:27 +01:00
Marek Olšák
3d59cde92e configure.ac: look for whole words in LLVM flags, not prefixes
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-11-29 00:07:27 +01:00
Marek Olšák
9b67a347f6 configure.ac: consolidate stripping unwanted LLVM flags
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-11-29 00:07:27 +01:00
Marek Olšák
a84a8da4f8 configure.ac: print LLVM flags
to see what we're mixing with ours

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-11-29 00:07:27 +01:00
Brian Paul
0904973e39 util: add more memory debugging features
Add a DEBUG_FREED_MEMORY option to help catch use-after-free errors.
Add debug_memory_check() function which can be periodically called to
check that all known blocks are good.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-28 15:03:29 -07:00
José Fonseca
1cead8845b llvmpipe: Implement logic ops for the AoS path.
It was forgotten in the previous patch series, but it is trivial to
implement, based on the SoA path.

This fixes glean logicOp failures.
2012-11-28 20:45:18 +00:00
José Fonseca
547efc76df llvmpipe: Don't use dynamically sized arrays.
Unfortunately, MSVC still treats arrays whose size is a const variable as
dynamically sized.
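
A minimal sketch of the usual workaround, assuming the affected code is
compiled as C (where a const-qualified variable is not an integer constant
expression). NUM_CHANNELS and channel_buffer_bytes are hypothetical names
for illustration, not identifiers from llvmpipe:

```c
#include <stddef.h>

/* In C, `const unsigned n = 4; float v[n];` declares a variable-length
 * array, which compilers without VLA support reject.  An enumerator is a
 * true integer constant expression, so the array below has a fixed size. */
enum { NUM_CHANNELS = 4 };   /* hypothetical name */

static size_t channel_buffer_bytes(void)
{
   float channels[NUM_CHANNELS];   /* fixed-size array, not a VLA */
   return sizeof channels;
}
```

A #define would work equally well; the enum keeps the name visible to
debuggers.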
2012-11-28 19:58:47 +00:00
Eric Anholt
c8ed9f6262 i965/gen4-5: Fix segfaults with stencil-only depth/stencil setups.
Fixes a ton of piglit regressions since the depthstencil fixes for gen6+.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57309
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-28 11:26:41 -08:00
Eric Anholt
b9b033d8e4 i965/fs: Don't generate saturates over existing variable values.
Fixes a crash in http://workshop.chromeexperiments.com/stars/ on i965,
and the new piglit test glsl-fs-clamp-5.
We were trying to emit a saturating move into a uniform, which the code
generator appropriately choked on.  This was broken in the change in
32ae8d3b32.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57166
NOTE: This is a candidate for the 9.0 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-28 11:26:34 -08:00
Eric Anholt
154ef07aa7 i965/fs: Add some minimal backend-IR dumping.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-28 11:26:33 -08:00
James Benton
960ab06da0 llvmpipe: Update llvmpipe_is_format_unswizzled to reflect latest changes.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton
66fdf626bb llvmpipe: Enable vertex color clamping.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton
fa1b481c09 llvmpipe: Unswizzled rendering.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton
1d3789bccb gallivm: Updated lp_build_const_mask_aos to input number of channels.
Also updated lp_build_const_mask_aos_swizzled to reflect this.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton
d03d29a044 util: Updated util_format_is_array to be more accurate.
Will allow formats with padding, e.g. RGBX.
Will now allow swizzled formats as long as the alpha is channel 3.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton
e66ec7c46b gallivm: Added support for float to half-float conversion in lp_build_conv.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton
d7a8390a82 gallivm: Changed lp_build_pad_vector to correctly handle scalar argument.
Removed the lp_type argument as it was unnecessary.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton
71c6fe76c0 gallivm: Add a function to generate lp_type for a format.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton
cd548836a1 gallivm: Add support for unorm16 in lp_build_mul.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:20 +00:00
Matt Turner
c3a465ae98 glcpp: Support #elif(expression) with no intervening space.
And add test cases to ensure that this works
	- 110 verifies that glcpp rejects #elif<digits> which glcpp
	  previously accepted.
	- 111 verifies that glcpp accepts #if followed immediately by
	  (, +, -, !, or ~.
	- 112 does the same as 111 but for #elif.

See 17f9beb6 for #if change.
Reviewed-by: Carl Worth <cworth@cworth.org>
2012-11-28 10:27:02 -08:00
Matt Turner
aed466192a glcpp: Reject #version and #line not followed by whitespace
Fixes part of es3conform's preprocess16_frag test.
Reviewed-by: Carl Worth <cworth@cworth.org>
2012-11-28 10:26:53 -08:00
Marek Olšák
91ca053714 mesa: fix BlitFramebuffer between linear and sRGB formats
NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-28 18:48:22 +01:00
Roland Scheidegger
406b76ca32 gallivm: fix multiple lods with different min/mag filter and wide vectors
Broken since 529fe420ba: I forgot some code and only added the comment.
Fixes bug 57644.
2012-11-28 18:07:27 +01:00
Michel Dänzer
6e33b55ee1 radeonsi: Reinstate assertions against invalid colour/depth formats.
radeonsi now supports Z16 and doesn't fail these assertions anymore.

This partially reverts commit 7bba4879bb, but
leaves the error messages in place to allow diagnosing such problems even with
non-debugging builds.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2012-11-28 15:48:50 +01:00
Michel Dänzer
a8d46d0173 radeonsi: Re-enable Z16 depth buffers.
8 more piglits.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2012-11-28 13:53:54 +01:00
Marek Olšák
726fe54cbc radeonsi: remove redundant parameter in r600_init_surface
[ Cherry-picked from r600g commit f5ac60152b ]
2012-11-28 13:35:17 +01:00
Michel Dänzer
fa83d52961 radeonsi: Use explicit stencil mipmap level offsets.
Extracted from r600g commit 428e37c2da.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-28 13:35:17 +01:00
Marek Olšák
39b56afaa2 radeonsi: correct texture memory size for Z32F_S8X24
[ Cherry-picked from r600g commit ea72351a91 ]
2012-11-28 13:35:17 +01:00
Michel Dänzer
20f651d003 radeonsi: Depth/stencil fixes.
Adapted from r600g commit 018e3f75d6.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-28 13:35:17 +01:00
Michel Dänzer
1a616c1009 radeonsi: Flesh out support for depth/stencil exports from the pixel shader.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-28 13:35:16 +01:00
Michel Dänzer
49003a5cb6 radeonsi: Fix sampler views for depth textures.
Consistently reference the flushed depth texture in the sampler view, not the
original one.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-28 13:35:16 +01:00
Jerome Glisse
3c024624fd radeonsi: Fix z/stencil texture creation.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>

[ Cherry-picked from r600g commit b4f0ab0b22 ]
2012-11-28 13:35:16 +01:00
Vinson Lee
ffc318a97a scons: Build ws_xlib on Mac OS X.
Fixes this SCons build error on Mac OS X if X11 is found.

NameError: name 'ws_xlib' is not defined:
  File "SConstruct", line 144:
    duplicate = 0 # http://www.scons.org/doc/0.97/HTML/scons-user/x2261.html
  File "scons-2.2.0/SCons/Script/SConscript.py", line 614:
    return method(*args, **kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 551:
    return _SConscript(self.fs, *files, **subst_kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 260:
    exec _file_ in call_stack[-1].globals
  File "src/SConscript", line 34:
    SConscript('gallium/SConscript')
  File "scons-2.2.0/SCons/Script/SConscript.py", line 614:
    return method(*args, **kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 551:
    return _SConscript(self.fs, *files, **subst_kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 260:
    exec _file_ in call_stack[-1].globals
  File "src/gallium/SConscript", line 135:
    'targets/libgl-xlib/SConscript',
  File "scons-2.2.0/SCons/Script/SConscript.py", line 614:
    return method(*args, **kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 551:
    return _SConscript(self.fs, *files, **subst_kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 260:
    exec _file_ in call_stack[-1].globals
  File "src/gallium/targets/graw-xlib/SConscript", line 9:
    ws_xlib,

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-27 23:13:57 -08:00
Johannes Obermayr
53636fdf93 configure.ac: Remove -O., -g and -Wall from LLVM_C{PP,XX}FLAGS.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2012-11-28 00:19:17 +01:00
Brian Paul
f75acabb96 vbo: move another line of code after declarations
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-11-27 15:34:56 -07:00
Brian Paul
8765c0d20f vbo: move code after declarations to fix MSVC errors
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-11-27 14:28:56 -07:00
Brian Paul
f94e672b47 vbo: minor whitespace fix
2012-11-27 13:56:52 -07:00
Brian Paul
a547e532fc mesa: remove '(void) k' lines
Serves no purpose as the k parameter is used later in the code.
2012-11-27 13:56:52 -07:00
Kenneth Graunke
7a414fea87 mesa/vbo: Check for invalid types in various packed vertex functions.
According to the ARB_vertex_type_2_10_10_10_rev specification:
"The error INVALID_ENUM is generated by VertexP*, NormalP*,
 TexCoordP*, MultiTexCoordP*, ColorP*, or SecondaryColorP if <type>
 is not UNSIGNED_INT_2_10_10_10_REV or INT_2_10_10_10_REV."

Fixes 7 subcases of oglconform's packed-vertex test.

v2: Add "gl" prefix to error messages (pointed out by Brian).
    Also rebase atop the ctx plumbing.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-27 12:36:33 -08:00
Kenneth Graunke
6a529e2b48 mesa/vbo: Support the ES 3.0 signed normalized scaling rules.
Traditionally, OpenGL has had two separate equations for converting from
signed normalized fixed-point data to floating point data.  One was used
primarily for vertex data, while the other was primarily for texturing
and framebuffer data.

However, ES 3.0 and GL 4.2 change this, declaring there's only one
equation to be used in all cases.  Unfortunately, it's the other one.

v2: Correctly convert 0b10 to -1.0, as pointed out by Chris Forbes.
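
For illustration only, a sketch of the single-equation rule as it reads in
the specifications, f = max(c / (2^(b-1) - 1), -1.0). The function name is
hypothetical and this is not the Mesa implementation; it assumes c has
already been sign-extended and that 2 <= bits <= 31:

```c
/* Hypothetical helper: signed normalized conversion with clamping.
 * c is the sign-extended integer, bits is the field width (2..31). */
static float snorm_eq_clamped(int c, unsigned bits)
{
   float max_val = (float)((1u << (bits - 1)) - 1u);   /* 2^(b-1) - 1 */
   float f = (float)c / max_val;

   /* The most negative code (0b10 for a 2-bit field, i.e. -2) would land
    * below -1.0 without the clamp. */
   return f < -1.0f ? -1.0f : f;
}
```

With bits = 2 the divisor is 1, so -2 divides to -2.0 and only the clamp
brings it back to -1.0.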

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2012-11-27 12:36:33 -08:00
Kenneth Graunke
c8d8d5db72 mesa/vbo: Plumb ctx through to the conv_i(10|2)_to_norm_float functions.
The rules for converting these values actually depend on the current
context API and version.  The next patch will implement those changes.

v2: Mark ctx as const, as suggested by Brian.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2012-11-27 12:36:33 -08:00
Matt Turner
13f9012ad3 mesa: Set transform feedback's default buffer mode to INTERLEAVED_ATTRIBS
Fixes part of es3conform's transform_feedback_init_defaults test.
NOTE: This is a candidate for the stable branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-27 10:40:50 -08:00
Matt Turner
7c2060f0f0 mesa: Return 0 for XFB_VARYING_MAX_LENGTH if no varyings
v2: Perform this count the same way as elsewhere in this file, per
    Brian Paul's review.

Fixes part of es3conform's transform_feedback_init_defaults test.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-27 10:40:49 -08:00
Andreas Boll
f65741721b gallium/tests/trivial: updates for transfer functions changes
Fixes build error with configure option --enable-gallium-tests
introduced in 369e468889

Compile tested only.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-11-27 19:00:48 +01:00
Andreas Boll
cba639f2a1 gallium/tests/trivial: updates for CSO interface changes
Fixes build error with configure option --enable-gallium-tests
introduced in ea6f035ae9

Cc: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-11-27 19:00:48 +01:00
Andreas Boll
1553f5ce83 gallium/tests/trivial: updates for util_draw_vertex_buffer changes
Fixes build error with configure option --enable-gallium-tests
introduced in e73bf3b805

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-11-27 19:00:48 +01:00
James Benton
9bd4856b5c util: Modified u_rect to default to memcpy.
Previously, this function would assert if the format didn't fit an expected 4-channel format size.

It now works with any format type and any number of channels.
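
A rough sketch of what a format-agnostic fallback can look like, assuming
the caller already knows the number of bytes per row. copy_rect_bytes is a
hypothetical name, not the u_rect entry point:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy a rectangle one row at a time.  Only the byte
 * count per row matters, so the channel layout of the format is irrelevant. */
static void copy_rect_bytes(unsigned char *dst, size_t dst_stride,
                            const unsigned char *src, size_t src_stride,
                            size_t row_bytes, size_t height)
{
   for (size_t y = 0; y < height; y++) {
      memcpy(dst + y * dst_stride, src + y * src_stride, row_bytes);
   }
}
```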

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-11-27 16:24:42 +00:00
James Benton
65016646e3 util/format: Fix bug in float to non-float conversion in u_format_pack.py.
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-11-27 16:24:02 +00:00
James Benton
978df710f2 gallivm: Fix bug in lp_build_one which would incorrectly return a vector for length 1.
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-11-27 16:23:04 +00:00
Kenneth Graunke
9bc9895c4a glsl: Support unsigned integer constants in layout qualifiers.
Fixes es3conform's explicit_attrib_location_integer_constants.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-and-tested-by: Matt Turner <mattst88@gmail.com>
2012-11-26 21:02:45 -08:00
Kenneth Graunke
9136723214 i965/fs: Move struct brw_compile (p) entirely inside fs_generator.
The brw_compile structure contains the brw_instruction store and the
brw_eu_emit.c state tracking fields.  These are only useful for the
final assembly generation pass; the earlier compilation stages don't
need them.

This also means that the code generator for future hardware won't have
access to the brw_compile structure, which is extremely desirable
because it prevents accidental generation of Gen4-7 code.

v2: rzalloc p, as suggested by Eric.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:34 -08:00
Kenneth Graunke
ea681a0d64 i965/fs: Split final assembly code generation out of fs_visitor.
Compiling shaders requires several main steps:

   1. Generating FS IR from either GLSL IR or Mesa IR
   2. Optimizing the IR
   3. Register allocation
   4. Generating assembly code

This patch splits out step 4 into a separate class named "fs_generator."

There are several reasons for doing so:

   1. Future hardware has a different instruction encoding.  Splitting
      this out will allow us to replace fs_generator (which relies
      heavily on the brw_eu_emit.c code and struct brw_instruction) with
      a new code generator that writes the new format.

   2. It reduces the size of the fs_visitor monolith.  (Arguably, a lot
      more should be split out, but that's left for "future work.")

   3. Separate namespaces allow us to make helper functions for
      generating instructions in both classes: ADD() can exist in
      fs_visitor and create IR, while ADD() in fs_generator() can
      create brw_instructions.  (Patches for this upcoming.)

Furthermore, this patch changes the order of operations slightly.
Rather than doing steps 1-4 for SIMD8, then 1-4 for SIMD16, we now:

   - Do steps 1-3 for SIMD8, then repeat 1-3 for SIMD16
   - Generate final assembly code for both modes together

This is because the frontend work can be done independently, but final
assembly generation needs to pack both into a single program store to
feed the GPU.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:34 -08:00
Kenneth Graunke
dd1fd30047 i965/fs: Abort on unsupported opcodes rather than failing.
Final code generation should never fail.  This is a bug, and there
should be no user-triggerable cases where this could occur.

Also, we're not going to have a fail() method in a moment.

v2: Just abort() rather than assert, to cover the NDEBUG case
    (suggested by Eric).

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:34 -08:00
Kenneth Graunke
cd0acb1abe i965: Make it possible to create a cfg_t without a backend_visitor.
All we really need is a memory context and the instruction list; passing
a backend_visitor is just convenient at times.

This will be necessary two patches from now.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:34 -08:00
Kenneth Graunke
4d09fe938e i965/fs: Move uses of brw_compile from do_wm_prog to brw_wm_fs_emit.
The brw_compile structure is closely tied to the Gen4-7 hardware
encoding.  However, do_wm_prog is very generic: it just calls out to
get a compiled program and then uploads it.

This isn't ultimately where we want it, but it's a step in the right
direction: it's now closer to the code generator.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:34 -08:00
Kenneth Graunke
3417b2f2b2 i965/fs: Pass the brw_context pointer into fs_visitor explicitly.
We used to steal it out of the brw_compile struct...but fs_visitor
isn't going to have one of those in the future.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:33 -08:00
Kenneth Graunke
1f74002a98 i965/fs: Move brw_wm_compile::fp to fs_visitor.
Also change it from a brw_fragment_program to a gl_fragment_program,
since that seems to be what everything wants anyway.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:33 -08:00
Kenneth Graunke
7b0d30eb87 i965/fs: Remove struct brw_shader * parameter to fs_visitor constructor.
We can easily recover it from prog, and this makes it clear that we
aren't passing additional information in.

v2: Use an if-statement rather than the ?: operator (suggested by Eric).

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:33 -08:00
Kenneth Graunke
a303df86de i965/fs: Move brw_wm_compile::dispatch_width into fs_visitor.
Also, rather than having brw_wm_fs_emit poke at it directly, make it a
parameter to the fs_visitor constructor.

All other changes generated by search and replace (with occasional
whitespace fixup).

v2: Make dispatch_width const (as suggested by Paul); fix doxygen
    mistake (pointed out by Eric); update for rebase.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:33 -08:00
Kenneth Graunke
47a6a7b51b i965/fs: Move brw_wm_lookup_iz() to fs_visitor::setup_payload_gen4().
This necessitates compiling brw_wm_iz.c as C++.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:33 -08:00
Kenneth Graunke
2429c9d347 i965/fs: Move brw_wm_payload_setup() to fs_visitor::setup_payload_gen6()
Now that we only have the one backend, there's no real point in keeping
this separate.  Moving it should allow some future simplifications.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:33 -08:00
Kenneth Graunke
ce96f6db90 i965/fs: Remove brw_wm_compile::computes_depth field.
Everybody determines this by checking if fp's OutputsWritten field
contains the FRAG_RESULT_DEPTH bit.  Rather than having payload setup
check this and set the computes_depth flag, we can just do the check in
the only place that actually used it: emit_fb_writes().

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:33 -08:00
Roland Scheidegger
529fe420ba gallivm: use the new mip per quad handling in texture fetch path
No longer have to split fetching into quads dynamically if mip levels
are not the same for all quads (aos sampling still always splits due
to performance reasons).
Instead handle multiple mip levels further down, minification etc. takes
this into account.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-27 03:30:55 +01:00
Roland Scheidegger
0b6554ba6f gallivm,llvmpipe: handle TXF (texelFetch) instruction, including offsets
This also adds some code to handle per-quad lods for more than 4-wide fetches,
because otherwise I'd have to integrate the texelFetch function into
the splitting stuff... (but it is not used yet outside texelFetch).
Passes piglit fs-texelFetch-2D; fails fs-texelFetchOffset-2D due to what I
believe is a test error (results are undefined for out-of-bounds fetches; we
return whatever is at offset 0, whereas the test expects [0,0,0,1]).
Texel offsets are only handled by texelFetch for now, though the interface
can handle it for everything.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-27 03:26:49 +01:00
Chris Forbes
93c689a2df i965: Enable ARB_vertex_type_2_10_10_10_rev on Gen4+.
v2 (Kayden): Move the enable into an existing intel->gen >= 4 block
(as suggested by Ian).

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-26 16:48:29 -08:00
Chris Forbes
4a64efc01b i965: emit w/a for packed attribute formats in VS
Implements BGRA swizzle, sign recovery, and normalization
as required by ARB_vertex_type_2_10_10_10_rev.

V2: Ported to the new VS backend, since that's all that's left;
	fixed normalization.

V3: Moved fixups out of the GLSL-only path, so it works for FF/VP too.

V4 (Kayden): Rework ES3 normalization, don't heap allocate registers;
	tidy comments.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-26 16:35:10 -08:00
Chris Forbes
352ae51efd i965: set attribute w/a bits for packed formats
Flag the need for various workarounds to be applied by
the vertex shader.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-26 16:35:00 -08:00
Chris Forbes
c3c680950d i965: Generalize GL_FIXED VS w/a support
Next few patches build on this to add other workarounds
for packed formats.

V2: rename BRW_ATTRIB_WA_COMPONENTS to BRW_ATTRIB_WA_COMPONENT_MASK;
V3 (Kayden): remove separate bit for ES3 signed normalization

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-26 16:34:28 -08:00
Chris Forbes
23f4411c41 i965: support 2_10_10_10 formats in get_surface_type.
Always use R10G10B10A2_UINT; Most of the other formats we'd like
don't actually work on the hardware. Will emit w/a for scaling,
sign recovery and BGRA swizzle in the VS.
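
To make "sign recovery" concrete, here is a CPU-side sketch, assuming the
INT_2_10_10_10_REV layout (x in bits 0-9, y in 10-19, z in 20-29, w in
30-31) and fields that arrive as unsigned integers. The struct and function
names are hypothetical, and this is not the code the VS backend emits:

```c
#include <stdint.h>

struct ivec4 { int x, y, z, w; };   /* hypothetical container */

/* Reinterpret an unsigned `bits`-wide field as two's complement. */
static int sign_extend(uint32_t field, unsigned bits)
{
   uint32_t sign = 1u << (bits - 1);
   return (int)(field ^ sign) - (int)sign;
}

/* Split one packed 32-bit word into four signed components. */
static struct ivec4 unpack_int_2_10_10_10_rev(uint32_t packed)
{
   struct ivec4 v;
   v.x = sign_extend(packed & 0x3ff, 10);
   v.y = sign_extend((packed >> 10) & 0x3ff, 10);
   v.z = sign_extend((packed >> 20) & 0x3ff, 10);
   v.w = sign_extend(packed >> 30, 2);
   return v;
}
```

On hardware the same effect is typically had with a left shift followed by
an arithmetic right shift; the XOR/subtract form is used here because it
avoids implementation-defined behavior in C.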

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-26 16:34:23 -08:00
Chris Forbes
f9a08f7f0f i965: implement get_size for 2_10_10_10 formats
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-26 16:34:20 -08:00
Chris Forbes
894fe54ec9 i965/vs: add support for emitting SHL, SHR, ASR
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-26 14:02:30 -08:00
Matt Turner
8f3570efc7 mesa: Use correct glGetTransformFeedbackVarying name in error msg
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-26 10:08:05 -08:00
Andreas Boll
0f5e2ce854 build: use git ls-files for adding all Makefile.in into the release tarball
Until we have a proper 'make dist', this improves on the current
situation: each time some old Makefiles were converted to automake,
we had to update the tarballs target.

NOTE: This is a candidate for the 9.0 branch.

Cc: Eric Anholt <eric@anholt.net>
Acked-by: Matt Turner <mattst88@gmail.com>
2012-11-26 19:03:21 +01:00
Eric Anholt
97747ac88f i965: Fix hangs with FP KIL instructions pre-gen6.
We can't support IF statements in 16-wide on these.  To get back to 16-wide
for these shaders, we need to support predicate on discard instructions in the
backend IR, which is something we've sort of got on the list to do anyway.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55828
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-25 20:22:02 -08:00
Eric Anholt
59bfd66a61 i965/gen4: Fix memory leak each time compile_gs_prog() is called.
Commit 774fb90db3 introduced a ralloc context to
each user of struct brw_compile, but for this one a NULL context was used,
causing the later ralloc_free(mem_ctx) to not do anything.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55175
NOTE: This is a candidate for the stable branches.
2012-11-25 18:25:26 -08:00
Eric Anholt
244db0855c i965/gen4: Fix LOD bias texturing since my fixed reg classes change.
We have a special case where non-shadow comparison with LOD requires using a
SIMD16 vec4 in an 8-wide shader, which appears in the register allocator as a
size 8 vgrf.

Fixes assertions in various piglit tests and webgl conformance.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56521
2012-11-25 18:25:26 -08:00
Marek Olšák
cff4c948ed r600g: fix broken streamout if streamout_begin caused a context flush
This fixes graphics corruption in the case where the DISCARD_RANGE flag
is used to map a buffer.

NOTE: This is a candidate for the stable branches.
2012-11-23 00:42:02 +01:00
Marek Olšák
d172fa825b r600g: fix ARB_map_buffer_alignment with unaligned offsets and staging buffers
2012-11-22 22:40:06 +01:00
Vinson Lee
f884005771 scons: Append x11 library path if linking x11 library.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2012-11-21 22:34:20 -08:00
Kenneth Graunke
bf75a1f092 mesa/vbo: Fix scaling issue in 2-bit signed normalized packing.
Since a signed 2-bit integer can only represent -1, 0, or 1, it is
tempting to simply convert it directly to a float.  This maps it
onto the correct range of [-1.0, 1.0].  However, it gives different
values compared to the usual equation:

(2.0 *  1.0 + 1.0) * (1.0 / 3.0) = +1.0           (same)
(2.0 *  0.0 + 1.0) * (1.0 / 3.0) = +0.33333333... (different)
(2.0 * -1.0 + 1.0) * (1.0 / 3.0) = -0.33333333... (different)

According to the GL_ARB_vertex_type_2_10_10_10_rev extension, signed
normalization is performed using equation 2.2 from the GL 3.2
specification, which is:

   f = (2c + 1)/(2^b - 1).                                (2.2)

Comments below that equation state: "In general, this representation is
used for signed normalized fixed-point parameters in GL commands, such
as vertex attribute values."  Which is what we're doing here.

The 3.2 specification goes on to declare an alternate formula:

   f = max{c/(2^(b-1) - 1), -1.0}                         (2.3)

which is closer to the existing code, and maps the end points to exactly
-1.0 and 1.0.  Comments below the equation state: "In general, this
representation is used for signed normalized fixed-point texture or
framebuffer values."  Which is *not* what we're doing here.

It then states: "Everywhere that signed normalized fixed-point
values are converted, the equation used is specified."  This is the real
clincher: the extension explicitly specifies that we must use equation
2.2, not 2.3.  So we need to do (2x + 1) / 3.

This matches the behavior expected by oglconform's packed-vertex test,
and is correct for desktop GL (pre-4.2).  It's not correct for ES 3.0,
but a future patch will correct that.
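
The difference between the two conversions can be sketched as follows.
Both function names are hypothetical and take an already sign-extended
value; this is an illustration of equation 2.2 with b = 2, not the patched
Mesa code:

```c
/* Direct conversion: tempting, but not what the extension specifies. */
static float snorm2_direct(int c)
{
   return (float)c;
}

/* Equation 2.2 with b = 2: f = (2c + 1) / (2^2 - 1) = (2c + 1) / 3. */
static float snorm2_eq_2_2(int c)
{
   return (2.0f * (float)c + 1.0f) * (1.0f / 3.0f);
}
```

For c = 0 the direct form yields 0.0 while equation 2.2 yields roughly
0.333, which is the discrepancy the conformance test catches.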

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marek Olšák <maraeo@gmail.com>
2012-11-21 20:32:54 -08:00
Kenneth Graunke
e9967aba61 mesa/vbo: Fix scaling issue in 10-bit signed normalized packing.
For the 10-bit components, the divisor was incorrect.  A 10-bit signed
integer can represent -2^9 through 2^9 - 1, which leads to the following
ranges:

       (float)value.x          -> [ -512,  511]
2.0F * (float)value.x          -> [-1024, 1022]
2.0F * (float)value.x + 1.0F   -> [-1023, 1023]

So dividing by 511 would incorrectly scale it to approximately:
[-2.001956947, 2.001956947].  To correctly scale to [-1.0, 1.0], we need
to divide by 1023.

This correctly implements the desktop GL rules.  ES 3.0 has different
rules, but those will be implemented in a separate patch.
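
A small sketch of the divisor problem, using hypothetical function names
and an already sign-extended 10-bit value in [-512, 511]; it is not the
patched Mesa code:

```c
/* Old divisor: the numerator spans [-1023, 1023], so dividing by 511
 * overshoots to roughly [-2.002, 2.002]. */
static float snorm10_div_511(int c)
{
   return (2.0f * (float)c + 1.0f) / 511.0f;
}

/* Corrected divisor: 2^10 - 1 = 1023 maps the end points to -1.0 and 1.0. */
static float snorm10_div_1023(int c)
{
   return (2.0f * (float)c + 1.0f) / 1023.0f;
}
```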

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marek Olšák <maraeo@gmail.com>
2012-11-21 20:29:38 -08:00
Alex Deucher
e2df37f69a radeonsi: add a new SI pci id
Note: this is a candidate for the stable branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-11-21 18:49:00 -05:00
Vinson Lee
10f214e5b2 i915: Fix wrong sizeof argument in i915_update_tex_unit.
The bug was found by Coverity.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-11-21 15:02:35 -08:00
Andreas Boll
59b3d3ad6e Add .dirstamp to toplevel .gitignore
2012-11-21 18:25:10 +01:00
Andreas Boll
f7e2e864c8 gallium/tests: update .gitignore files
2012-11-21 18:24:30 +01:00
Eric Anholt
d82b873a50 i965/fs: Add helper functions for IF and CMP and use them.
v2: Rebase on gen6-if fix.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2012-11-20 13:38:38 -08:00
Eric Anholt
32d6809bb5 i965/fs: Add helper functions for generating ALU ops, like in the VS.
This gives us checking of our arguments (no more passing 1 operand to
BRW_OPCODE_MUL!), at the cost of a couple of extra parens.

v2: Rebase on gen6-if fix.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2012-11-20 12:55:08 -08:00
Eric Anholt
1665af3066 i965/gen4: Fix crash with fragment programs and texture rectangle.
This was a regression in the brw_fs_fp.cpp change.  We just need to return
something good enough to get the IR generation to the end without crashing,
but ir->type isn't initialized and we wanted something of the coordinate's
type anyway.

Fixes around 30 piglit cases on my ilk system in drawpixels and framebuffer
blit.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56962
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-19 22:33:44 -08:00
Eric Anholt
d411bbd5bd i965: Disable the GB clip test when a limited viewport is set.
The theory of the guardband is that you extend the clip volume to avoid
expensive clipping computation, and just let fragments outside the viewport
get clipped by the drawable's bounds.  But if a smaller-than-window-size
viewport is set, and we don't also happen to have a scissor set, then
rendering could incorrectly extend outside of the viewport when it should have
been clipped to the viewport.
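
A minimal sketch of the kind of check this implies, assuming the guardband
test is only safe when the viewport covers the whole drawable. The struct
and function names are hypothetical, not the i965 code.

```c
#include <stdbool.h>

/* Hypothetical viewport rectangle in window coordinates. */
struct sketch_rect {
   int x, y, width, height;
};

/* Returns true only when the viewport matches the drawable exactly.
 * In that case fragments outside the viewport are also outside the
 * drawable, so letting the drawable bounds clip them is harmless.
 * A smaller viewport needs the real viewport clip instead. */
static bool sketch_viewport_covers_drawable(struct sketch_rect vp,
                                            int fb_width, int fb_height)
{
   return vp.x == 0 && vp.y == 0 &&
          vp.width == fb_width && vp.height == fb_height;
}
```

A caller would enable the guardband clip test only when this returns true.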

Fixes the new piglit triangle-guardband-viewport test.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.0 branch.
2012-11-19 22:33:44 -08:00
Eric Anholt
23e7b81f2d i965: Use fewer temporary variables in clip setup.
When you're comparing to the spec, you're trying to immediately see what
numbered dword of the packet your bit ends up in.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.0 branch.
2012-11-19 22:33:43 -08:00
Eric Anholt
afc5a26b5c Revert "i965/fs: Fix conversions float->bool, int->bool"
This reverts commit cf0bbb30f6.  It
was just papering over the bug fixed in the previous commit.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-19 22:33:43 -08:00
Eric Anholt
0482998ccc i965/fs: Fix the gen6-specific if handling for 80ecb8f15b
Fixes oglconform shad-compiler advanced.TestLessThani.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48629
NOTE: This is a candidate for the 9.0 branch.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-19 22:33:43 -08:00
Chad Versace
c9f5126b15 intel: Use designated initializers for DRI extension structs
All Intel code is compiled with -std=c99. There is no excuse to not use
designated initializers.

As a nice benefit, the code is now more friendly to grep. Without
designated initializers, psychic prowess is required to find the
initialization of DRI extension function pointers with grep.  I have
observed several people, when they first encounter the DRI code, fail at
statically chasing the DRI function pointers due to this problem.
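
For illustration, here is the difference on a made-up extension struct. All
sketch_* names are hypothetical and are not the real DRI types; the point is
only that grepping for ".flush" finds the designated form and not the
positional one.

```c
#include <stddef.h>

/* Hypothetical extension header and function table. */
struct sketch_extension {
   const char *name;
   int version;
};

struct sketch_flush_extension {
   struct sketch_extension base;
   void (*flush)(void *drawable);
   void (*invalidate)(void *drawable);
};

static int sketch_flush_calls;

static void sketch_flush(void *drawable)
{
   (void)drawable;
   sketch_flush_calls++;
}

static void sketch_invalidate(void *drawable)
{
   (void)drawable;
}

/* Positional form: the reader must know the member order by heart. */
static const struct sketch_flush_extension sketch_positional = {
   { "SKETCH_flush", 1 },
   sketch_flush,
   sketch_invalidate,
};

/* C99 designated form: each member name appears next to its value. */
static const struct sketch_flush_extension sketch_designated = {
   .base = { .name = "SKETCH_flush", .version = 1 },
   .flush = sketch_flush,
   .invalidate = sketch_invalidate,
};
```

Both objects are initialized identically; only the second one is easy to
find with grep.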

Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-19 15:09:55 -08:00
Chad Versace
62332f4125 dri: Use designated initializers for DRI extension structs
The dri directory is compiled with -std=c99. There is no excuse to not use
designated initializers.

As a nice benefit, the code is now more friendly to grep. Without
designated initializers, psychic prowess is required to find the
initialization of DRI extension function pointers with grep.  I have
observed several people, when they first encounter the DRI code, fail at
statically chasing the DRI function pointers due to this problem.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-19 15:09:55 -08:00
Eric Anholt
fdd6d146d9 i965: Use the separate stencil buffer's offsets for stencil setup.
For a packed depth/stencil buffer on separate stencil hardware, the
separate depth miptree is set up with alignment of 4,4 and the separate
stencil miptree is set up with alignment of 8,8.  We can't just use the
irb->draw_{x,y} offsets for stencil, since that is the offset in the
depth miptree.

Fixes 12 piglit depthstencil testcases on ivb.

Acked-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-19 15:07:22 -08:00
Eric Anholt
52ee1a7269 i965: Move all the depth/stencil/hiz offset logic into the workaround.
Given that we have the mask information here (assuming the rebase is to
the same tiling, which is safe), we can just save a set of miptrees and
offsets and the global intra-tile offset in the context and cut out a
bunch of logic.  This will also save emitting the next fix I need to do
twice.

Acked-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-19 15:07:22 -08:00
Eric Anholt
9ec6a54ba9 i965: When rebasing depth or stencil, update x/y before deciding the other.
Fixes a theoretical problem where we had an aligned depth buffer and a
misaligned stencil buffer with a matching tile offset, so we would fail
to rebase depth even after the needed tile offset changed due to the
rebase of stencil.

It should also fix double-rebase of a misaligned packed depth/stencil
renderbuffer, which may have been a performance issue.

Acked-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-19 15:07:22 -08:00
Eric Anholt
be9e664307 intel: Push face/level -> slice handling to the caller of get_image_offset().
We were always passing 0 for one of the two fields, and the code just used
whichever one wasn't 0.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-19 15:07:22 -08:00
Eric Anholt
c1fabea1c5 i965: Add some checks for array textures in unsupported paths.
I noticed these in the next patch where these paths were using the Face
of a teximage but didn't have array handling.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-19 15:07:22 -08:00
Eric Anholt
923c4b3f4a i965: Add a little bit more debug info for validate blits.
The kind of data you're copying is definitely an interesting variable.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-19 15:07:22 -08:00
Eric Anholt
e5671040c5 intel: Remove dead function prototype.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-19 15:07:22 -08:00
Eric Anholt
1f35ec585f i965: Remove stale comment about wrapped_depth.
I removed that code almost a year ago.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-19 15:07:22 -08:00
Kenneth Graunke
1f74a5b3cc mesa: Mark GetBufferParameteri64v as implemented.
Apparently this was accidentally marked as unimplemented, and thus not
put in the dispatch table.

Fixes 7 es3conform tests:
- copy_buffer_parameters
- copy_buffer_data
- copy_buffer_usage
- pixel_buffer_object_bind
- pixel_buffer_object_parameteriv
- pixel_buffer_object_texture_read
- pixel_buffer_object_usage

v2: Also update the DispatchSanity test for this change.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-19 11:49:04 -08:00
Kenneth Graunke
bbda7d65a9 mesa: Require gen'd names in glBeginQuery on ES 3.0.
Only legacy OpenGL allows the use of non-gen'd names.  Core profiles
and ES 3 both require the use of glGenQueries().

Note that BeginQuery doesn't exist in ES 1 or ES 2.

Fixes es3conform's occlusion_query_invalid_beginquery test.

Reviewed-and-tested-by: Matt Turner <mattst88@gmail.com>
2012-11-19 11:49:00 -08:00
Kenneth Graunke
c6ed42a89e mesa: Support EXT_framebuffer_blit targets in ES 3.0 as well.
GL_READ_FRAMEBUFFER and GL_DRAW_FRAMEBUFFER are valid targets in ES 3.

Fixes 23 es3conform framebuffer_blit tests.  Two more go from fail to
crash, but that appears to be because they actually run now.

Reviewed-and-tested-by: Matt Turner <mattst88@gmail.com>
2012-11-19 11:48:56 -08:00
Kenneth Graunke
f399a707c8 mesa: Fix error code for glTexParameteri of TEXTURE_MAX_LEVEL.
Calling glTexParameteri() with pname GL_TEXTURE_MAX_LEVEL and either a
target of GL_TEXTURE_RECTANGLE or a negative value previously generated
GL_INVALID_OPERATION.  However, GL_INVALID_VALUE seems more appropriate.

Fixes oglconform's api-error/negative.glTexParameter and es3conform's
sgis_texture_lod_basic_error.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-and-tested-by: Matt Turner <mattst88@gmail.com>
2012-11-19 11:48:52 -08:00
Kenneth Graunke
4e907018b2 i965/vs: Don't lose attribute type when converting ATTR to FIXED_HW_REG.
The new brw_reg always had type BRW_REGISTER_TYPE_F, rather than
inheriting the original type of the ATTR file register.

In the past, this hasn't been a problem since we only execute this code
when fixing up GL_FIXED attributes, which always have float types.
However, we'll soon be using it for ARB_vertex_type_10_10_10_2 support,
which uses D and UD types.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-19 11:38:58 -08:00
Chad Versace
5cf8536690 egl/dri2: Set error code when dri2CreateContextAttribs fails
When dri2CreateContextAttribs failed, eglCreateContext returned
NULL yet set the error code to EGL_SUCCESS! The problem was that
eglCreateContext ignored the error code returned by
dri2CreateContextAttribs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56706
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-19 08:18:22 -08:00
Chad Versace
243cf7a924 i965: Validate requested GLES context version in brwCreateContext
For GLES1 and GLES2, brwCreateContext neglected to validate the requested
context version received from the DRI layer. If DRI requested an OpenGL
ES2 context with version 3.9, we provided it one.

Before this fix, the switch statement that validated the requested GL
context flavor was an ugly #ifdef copy-paste mess. Instead of reproducing
the copy-paste mess for GLES1 and GLES2, I first refactored it.  Now the
switch statement is readable.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-19 08:17:32 -08:00
Maarten Lankhorst
ddb901fbf4 automake: strip LLVM_CXXFLAGS and LLVM_CPPFLAGS too
It seems that -DNDEBUG and other flags might still be leaked through
those variables, so strip those off there as well.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2012-11-19 09:43:29 +01:00
Kenneth Graunke
5cea027341 i965/fs: Properly patch special values during VGRF compaction.
In addition to registers used by instructions, fs_visitor maintains
direct references to certain "special" values used for inputs/outputs.

When I added VGRF compaction, I overlooked these, believing that these
direct references weren't used once instructions were generated.  That
was wrong.  For example, pixel_x/y are used in virtual_grf_interferes(),
which is called by optimization passes and register allocation.

This patch treats all of them as used and patches them after compacting.
While it's not strictly necessary to patch all of them (as some aren't
used after emitting code), it seems safer to simply fix them all.

Fixes oglconform's textureswizzle/advanced.shader.targets, piglit's
glsl-fs-lots-of-tex, and glean's texCombine on pre-Gen6 hardware.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56790
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-17 14:37:16 -08:00
Eric Anholt
3c368bb307 i965/gen4: Respect the VERTEX_PROGRAM_TWO_SIDE vertex program/shader flag.
Fixes piglit "vertex-program-two-side enabled front back" and 4 others.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-17 12:55:54 -08:00
Eric Anholt
94e82b2e6c mesa: Fix linker-assigned varying component counting since 8fb1e4a462
The goal of that change was to skip counting things that aren't actually
outputs from the VS to the FS.  However, explicit_location isn't set in
the case of linker-assigned locations (the common case), so basically
varying component counting got disabled.  At this stage of the linker,
we've already ensured that var->location is set, so we can just look at
it without worrying.

Fixes i965 assertion failure with the new
piglit glsl-max-varyings --exceed-limits.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51545
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-17 12:55:54 -08:00
Eric Anholt
5c99697f74 mesa: Fix segfault on reading from a missing color read buffer.
The diff looks funny, but it's moving the integer vs non-integer check
below the _mesa_source_buffer_exists() check that ensures
_ColorReadBuffer is non-null, so we get a GL_INVALID_OPERATION instead
of a segfault.  This looks like it had regressed in the
_mesa_error_check_format_and_type() changes, which removed the first of
the two duplicated checks for the source buffer.  Fixes segfault in the
new piglit ARB_framebuffer_object/negative-readpixels-no-rb.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45877
NOTE: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-17 12:55:54 -08:00
Eric Anholt
df3361df01 intel: Use core mesa support for determining lastLevel.
We had similar issues with using depth in determining the lastLevel of array
textures.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-17 12:55:54 -08:00
Eric Anholt
02652eaa25 mesa: Also handle GL_TEXTURE_EXTERNAL_OES in max num levels.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-17 12:55:54 -08:00
Eric Anholt
a43b107403 i965/fs: Unify the param pointer allocation for FP/non-FP.
Now that we're using the new backend, we may actually put things into push
constants if you have too many uniform values uploaded.  Also, correctly
account for texture rectangle params and drop the old special case for the
0.0/1.0 params from the old backend.
2012-11-17 12:39:27 -08:00
Maarten Lankhorst
c64adedc5f st/vdpau: Fix vlVdpVideoSurfaceSize for interlaced buffers
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2012-11-17 13:25:08 +01:00
Andreas Boll
a204e26495 docs: import release notes for 9.0.1, add news item 2012-11-17 09:02:03 +01:00
Vinson Lee
acc1e59013 util: Only use open coded snprintf for MSVC.
MinGW has snprintf.

The patch fixes these warnings with the MinGW SCons build.

src/gallium/auxiliary/util/u_snprintf.c:459:1: warning: no previous prototype for ‘util_vsnprintf’ [-Wmissing-prototypes]
src/gallium/auxiliary/util/u_snprintf.c:1436:1: warning: no previous prototype for ‘util_snprintf’ [-Wmissing-prototypes]

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Brian Paul <brianp@vmware.com>
2012-11-16 23:18:23 -08:00
Tom Stellard
b36b6fdb32 clover: Fix build with clang 3.2 2012-11-16 17:07:56 -05:00
Tom Stellard
71877143b6 r300/compiler: Avoid generating MOV instructions for invalid IMM swizzles v2
If an instruction reads from a constant register that contains
immediates using an invalid swizzle, we can avoid generating MOV
instructions to fix up the swizzle by loading the immediates into a
different constant register that can be read using a valid swizzle.

This only affects r300 and r400 cards.

For example:

CONST[1] = {    -3.5000     3.5000     2.5000     1.5000 }

MAD temp[4].xy, const[0].xy__, const[1].xz__, input[0].xy__;

========== Before this change would be lowered to: =========

CONST[1] = {    -3.5000     3.5000     2.5000     1.5000 }

MOV temp[0].x, const[1].x___;
MOV temp[0].y, const[1]._z__;
MAD temp[4].xy, const[0].xy__, temp[0].xy__, input[0].xy__;

========== After this change is lowered to:  ===============

CONST[1] = {    -3.5000     3.5000     2.5000     1.5000 }
CONST[2] = {     0.0000    -3.5000     2.5000     0.0000 }

MAD temp[4].xy, const[0].xy__, const[2].yz__, input[0].xy__;

============================================================

This change reduces one of the Lightsmark shaders from 133 to 91
instructions.

v2:
  - Fix crash caused by swizzles with only inline constants.
2012-11-16 17:07:11 -05:00
Alex Deucher
26463b8996 radeonsi: clean up some magic numbers
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-16 13:02:42 -05:00
Alex Deucher
ce17964fe5 radeonsi: emit PA_SC_RASTER_CONFIG
Use per asic golden values.

Programming this register doesn't seem to be strictly
necessary on SI, but programming it wrong leads to
rendering issues or reduced performance so just
go ahead and program the golden values explicitly
to avoid any potential problems down the road.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-16 13:02:42 -05:00
Maarten Lankhorst
4f0537e645 [PATCH] makefiles: use configured name for -ldrm* where possible
For precise lts support I had to do some magic with the library names, which works fine
as long as the libraries from pkg-config are used.

The parts with src/gallium/targets/va-*/Makefile will not apply on the master branch,
but do apply to the 9.0 branch.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2012-11-16 18:50:57 +01:00
Andreas Boll
6346214f05 docs: add note about removal of OpenVMS support 2012-11-16 10:01:47 +01:00
Matt Turner
1f82bf12ed Remove OpenVMS support
Not maintained since 2008. Doubtful that it's worked in quite a while.

Also see commit 32ac8cb05 which removed VMS stuff from Makefile in 2009.

Cc: Jouk Jansen <j.jansen@tudelft.nl>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2012-11-16 10:01:46 +01:00
Andreas Boll
900f5eb7a8 build: add missing Makefile.in files to tarballs target
Those are recently introduced on master.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-16 10:01:26 +01:00
Andreas Boll
4a38926601 build: fix make tarballs target
fixes regression introduced in 9078441072

Targets for making lex.yy.c, program_parse.tab.c, and program_parse.tab.h
got moved into their own Makefile.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-16 10:01:06 +01:00
Matt Turner
5c78ad84f4 gles2: Update gl2ext.h to revision 19436
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-15 15:21:28 -08:00
Matt Turner
88ec004381 gles2: Update gl2.h to revision 16803
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-15 15:21:28 -08:00
Matt Turner
e565260b30 gles: Update glext.h to revision 19260
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-15 15:21:28 -08:00
Matt Turner
aec36a10dd egl: Update eglext.h to revision 19571
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-15 15:21:28 -08:00
Matt Turner
47d862517e mesa: return INVALID_VALUE from WaitSync if timeout != GL_TIMEOUT_IGNORED
This was added in version 22 of the GL_ARB_sync spec.
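
A sketch of the resulting validation, combined with the flags check that
appears elsewhere in this log. The SKETCH_* constants mirror the values of
GL_NO_ERROR, GL_INVALID_VALUE and GL_TIMEOUT_IGNORED, and the function name
is hypothetical rather than Mesa's.

```c
#include <stdint.h>

#define SKETCH_NO_ERROR        0x0000u
#define SKETCH_INVALID_VALUE   0x0501u
#define SKETCH_TIMEOUT_IGNORED 0xFFFFFFFFFFFFFFFFull

/* Returns the GL error a WaitSync-like entry point would raise for the
 * given arguments, or SKETCH_NO_ERROR when both are acceptable. */
static unsigned sketch_validate_wait_sync(unsigned flags, uint64_t timeout)
{
   if (flags != 0)
      return SKETCH_INVALID_VALUE;

   if (timeout != SKETCH_TIMEOUT_IGNORED)
      return SKETCH_INVALID_VALUE;

   return SKETCH_NO_ERROR;
}
```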

Fixes gles3conform's sync_error_waitsync_timeout test.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-15 15:21:28 -08:00
Matt Turner
32cc20d9f5 mesa: return INVALID_VALUE from WaitSync if flags != 0
Fixes gles3conform's sync_error_waitsync_flags test.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-15 15:21:28 -08:00
Matt Turner
5b0012f5c2 mesa: return INVALID_VALUE from ClientWaitSync if flags contains an unsupported flag
Fixes gles3conform's sync_error_clientwaitsync_flags test.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-15 15:21:28 -08:00
Matt Turner
ae1f09babb mesa: return INVALID_VALUE from VertexAttribDivisor if index out of range
All the other range checks on index already return the proper error,
INVALID_VALUE.

Fixes gles3conform's instanced_arrays_invalid test.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-15 15:21:28 -08:00
Matt Turner
e21debbf75 glcpp: Don't define macros for extensions that aren't in ES
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-15 15:21:28 -08:00
Alex Deucher
7bba4879bb radeonsi: remove new asserts and replace with warnings
Fixes piglit regressions.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-11-15 15:46:02 -05:00
Kenneth Graunke
d010e70a07 i965/fs: Don't calculate_live_intervals() in opt_algebraic().
There's no point: opt_algebraic() doesn't use any liveness information.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-15 11:14:50 -08:00
Kenneth Graunke
b02492fd33 i965: Remove duplicate brw_opcodes table in favor of opcode_descs.
brw_optimize.c's brw_opcodes table was a copy of brw_disasm.c's
opcode_descs table, but with an additional field: is_arith.  Now that
I've deleted that, the two are identical.  Keep the one in brw_disasm.c.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-15 11:14:47 -08:00
Kenneth Graunke
a405717b88 i965/vs: Remove dead vec4_visitor::src_reg_for_float prototype.
No such function exists.  src_reg's constructor does that.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-15 11:14:45 -08:00
Kenneth Graunke
eec5669bc9 i965/fs: Remove bblock field of fs_visitor.
All users of basic block analysis simply create their own local
variables.  Nobody uses the visitor-wide field.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-15 11:14:43 -08:00
Kenneth Graunke
e7668609a7 i965: Remove brw_instruction_info::is_arith().
Nobody uses it.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-15 11:14:40 -08:00
Kenneth Graunke
c4b99c1857 i965: Remove some dead code optimization passes.
The old brw_remove_grf_to_mrf_moves() pass is obsolete and replaced by
fs_visitor::compute_to_mrf().

The old brw_remove_duplicate_mrf_moves() pass is obsolete and replaced
by fs_visitor::remove_duplicate_mrf_writes().

The remaining pass, brw_set_dp4_dependency_control(), is currently
unused, but could be, so I'm leaving it for now.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-15 11:14:38 -08:00
Kenneth Graunke
1484faa0f4 i965: Remove unused BRW_PACKCOLOR8888 macro.
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-15 11:14:36 -08:00
Kenneth Graunke
80b3af5b6d i965: Remove brw_shader_program wrapper struct.
At this point, it's just gl_shader_program.  Nobody even uses it; even
the program that creates them only returns gl_shader_program pointers.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-15 11:14:34 -08:00
Kenneth Graunke
eb18e3d32a i965: Remove unused struct brw_vs_ouput_sizes.
With a name like that, it can't be used.  Sure enough, it's not.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-15 11:14:24 -08:00
José Fonseca
35e28b4583 util/u_debug: Fix DEBUG_NAMED_VALUE.
"#__symbol" doesn't work with nested macro expansions, at least not on gcc.
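
The underlying preprocessor behaviour can be shown in isolation. This is a
general illustration with hypothetical SKETCH_* macros, not the actual
u_debug.h change: an argument passed through an extra macro layer is
expanded before it reaches the "#" operator, so the symbol name is lost.

```c
/* Stringifies its argument exactly as written. */
#define SKETCH_STR_DIRECT(x)   #x

/* Forwards to the macro above; x is fully expanded before forwarding. */
#define SKETCH_STR_EXPANDED(x) SKETCH_STR_DIRECT(x)

#define SKETCH_FLAG 0x4

/* Yields the symbol name, "SKETCH_FLAG". */
static const char *sketch_direct(void)
{
   return SKETCH_STR_DIRECT(SKETCH_FLAG);
}

/* Yields the expansion, "0x4", because of the extra macro layer. */
static const char *sketch_expanded(void)
{
   return SKETCH_STR_EXPANDED(SKETCH_FLAG);
}
```

A named-value table that wants the symbol name therefore has to apply "#"
in the outermost macro that receives the symbol.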
2012-11-15 17:38:03 +00:00
Roland Scheidegger
94f9ea03a1 draw: fix crashes with out-of-bounds indices
The passthrough pipeline needs to check index values (which might be passed
through) as they can be invalid (which causes crashes and various assertion
failures if the clip code runs). Obviously, rendering won't be well-defined,
but those bogus indices might come directly from apps.
There were already debug printfs which reported the out-of-bounds indices but
we really ought to not crash.
While checking at that point doesn't seem like the most efficient solution,
it seems there isn't really another appropriate function to do it.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-15 17:37:06 +00:00
Alex Deucher
3893593732 radeonsi: cleanup si_db()
Clean up a few magic numbers and rework the code a bit.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-15 12:11:28 -05:00
Alex Deucher
565c29f221 radeonsi: assert the CB format is valid (v2)
Assert that the CB format is valid and default to
the INVALID hw format rather than ~0U when the format
doesn't match for non-debug builds.

v2: use INVALID hw format rather than ~0U

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-15 12:10:48 -05:00
Alex Deucher
34d487b64d radeonsi: assert that the DB format is valid (v2)
Assert that the DB format is valid and default to
the INVALID hw format rather than ~0U when the format
doesn't match for non-debug builds.

v2: use INVALID hw format rather than ~0U

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-15 12:08:17 -05:00
Dmitry Cherkassov
fd1196c412 gallium: fix some function comments in p_context.h
Signed-off-by: Dmitry Cherkassov <dcherkassov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-15 07:54:38 -07:00
Andreas Boll
8a9f0fdeab build: add missing files to tarballs target
fixes errors ./configure and make were complaining about

NOTE: This is a candidate for the 9.0 branch.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-14 23:13:34 +01:00
Andreas Boll
bc08f26485 build: add missing Makefile.in files to tarballs target
fixes errors ./configure was complaining about

NOTE: This is a candidate for the 9.0 branch.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-14 23:13:14 +01:00
Andreas Boll
a0a90ea920 build: add config.sub and config.guess to tarballs target
fixes errors ./configure was complaining about

NOTE: This is a candidate for the 9.0 branch.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-14 23:12:58 +01:00
Andreas Boll
ca8988673b mesa: use .cherry-ignore in the get-pick-list.sh script
NOTE: This is a candidate for the stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-11-14 20:45:33 +01:00
Paul Berry
b85a8cd208 mesa: Add .gitignore for hashtable collision unit test.
This test was introduced in commit
35fd61bd99.
2012-11-14 11:23:51 -08:00
Michel Dänzer
73d9703a93 radeonsi: Set STENCILOPVAL fields to 1.
This is necessary for backwards compatibility with pre-SI for stencil.

Fixes a number of stencil related piglit tests, and real apps using stencil.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-14 16:08:34 +01:00
Michel Dänzer
91c1d4472f radeonsi: Bump SI_PM4_MAX_DW.
Fixes assertion failure with Mesa demo glsl/samplers.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-11-14 12:08:25 +01:00
Michel Dänzer
56ae9be957 radeonsi: Handle TGSI TXL opcode.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-11-14 12:08:19 +01:00
Michel Dänzer
3e20513b8f radeonsi: Handle TGSI TXB opcode.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-11-14 12:08:10 +01:00
Vinson Lee
ca5840afb0 mesa: Include compiler.h in hash_table.h.
Include the header for the inline symbol. MSVC does not have the inline
keyword for C.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2012-11-13 21:19:50 -08:00
Marek Olšák
186579e724 r600g: use LINEAR_ALIGNED tiling for 1D array textures and if height0 <= 3
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-11-13 17:17:05 +01:00
Tom Stellard
2e6b81ff7a auxillary: Append LLVM_CXXFLAGS to CXXFLAGS 2012-11-13 15:13:07 +00:00
Marek Olšák
e3813ecfa3 r300g: don't call buffer_unmap in draw functions
It's been a no-op anyway.
2012-11-13 15:53:17 +01:00
Marek Olšák
7a8affb6a1 r300g: fix crash since the set_vertex_buffers(start_slot) change 2012-11-13 15:53:16 +01:00
Marek Olšák
d4780fddb1 r600g: untiled window-system buffers should be LINEAR_ALIGNED
though I guess the DDX allocates them as LINEAR_GENERAL

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-11-13 15:00:37 +01:00
Marek Olšák
c9e5309223 r600g: use LINEAR_ALIGNED tiling for 1D textures
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-11-13 15:00:37 +01:00
Marek Olšák
ac4f61b232 r600g: use LINEAR_ALIGNED tiling for staging textures, reorder the code
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-11-13 15:00:37 +01:00
Kenneth Graunke
fe2ef4b810 i965/vs: Fix user clip plane setup on Gen4-5.
On Gen6-7, we don't compact clip planes, and nr_userclip_plane_consts
is the last bit set, so iterating from i = 0..nr_userclip_plane_consts
covers all active clip planes and is the right thing to do.

However, that doesn't work at all on Gen4-5.  Since we do compact
clip planes there, we skip over ones which aren't active (via the continue
statement).  We also set nr_userclip_plane_consts to the number of
active clip planes, which means that we end the loop after checking that
many bits.  If the set of clip planes wasn't contiguous, this means we'd
fail to find the last few.

By changing the iteration to MAX_CLIP_PLANES, we correctly find all of
the active clip planes.
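
The loop-bound problem can be reproduced with a standalone sketch. The names
and the plane limit of 8 are assumptions for illustration only; this is not
the setup_uniform_clipplane_values() code.

```c
#include <stdint.h>

#define SKETCH_MAX_CLIP_PLANES 8

/* Compacts the indices of enabled planes into `out` and returns how many
 * were found.  `loop_bound` is the number of bits examined and must not
 * exceed SKETCH_MAX_CLIP_PLANES. */
static int sketch_compact_planes(uint32_t enabled_mask, int loop_bound,
                                 int out[SKETCH_MAX_CLIP_PLANES])
{
   int compacted = 0;

   for (int i = 0; i < loop_bound; i++) {
      if (!(enabled_mask & (1u << i)))
         continue; /* inactive plane: skipped, takes no compacted slot */

      out[compacted++] = i;
   }

   return compacted;
}
```

With planes 0, 2 and 5 enabled, bounding the loop by the active count (3)
only examines bits 0..2 and finds two planes; bounding it by
SKETCH_MAX_CLIP_PLANES finds all three.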

Fixes regressions since 66c8473e02 (replacing the old VS backend) in
Piglit's spec/glsl-1.20/execution/clipping/fixed-clip-enables and
oglconform's mustpass(basic.clip) and userclip(basic.allCases).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56791
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-13 01:27:33 -08:00
Kenneth Graunke
3262857843 i965/vs: Simplify the Gen6-7 part of setup_uniform_clipplane_values().
There's no compaction, so we can drop that code and simply use 'i'.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-13 01:27:31 -08:00
Kenneth Graunke
0ad4360ca1 i965/vs: Split setup_uniform_clipplane_values() into Gen4-5/6-7 parts.
Since Gen4-5 compacts clip planes and Gen6-7 doesn't, it makes sense to
split them into separate code paths.  This patch simply copies the code
to both halves; the next commits will simplify it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-13 01:27:15 -08:00
Vinson Lee
bb284669f8 mesa: Replace random with standard C rand.
BSD random is not available on some compilers.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2012-11-12 22:15:42 -08:00
Brian Paul
9b67460223 automake: Remove empty file variable.
Fixes SCons build regression introduced with commit
a665cf1226.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2012-11-12 21:29:34 -08:00
Eric Anholt
3a5ad21cd3 mesa: Fix gallium build since 6991c2922f
Looks like I screwed up and didn't test gallium again after tweaking the
Makefile.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57044
2012-11-12 19:35:31 -08:00
Eric Anholt
6991c2922f mesa: Convert the hash table for GL object ids to the open-addressing hash.
The previous 1023-entry chaining hash table never resized, so it was very
inefficient when there were many objects live.  While one could have an even
more efficient implementation than this (keep an array for genned names with
packed IDs, or take advantage of the fact that key == hash or key ==
*(uint32_t *)data to store less data), this is fairly fast, and I want a nice
replacement hash table for other parts of Mesa, too.
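
Open addressing in general can be sketched as follows. This is a minimal
fixed-size, linear-probing illustration with hypothetical sketch_* names,
not the table imported into Mesa; it omits resizing and deletion, and the
multiplicative hash is just one convenient choice for 32-bit keys.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CAPACITY 64u /* must be a power of two */

struct sketch_entry {
   uint32_t key;
   void *data;
   bool used;
};

struct sketch_table {
   struct sketch_entry slots[SKETCH_CAPACITY];
   uint32_t count;
};

static uint32_t sketch_hash(uint32_t key)
{
   return key * 2654435761u;
}

/* Inserts or overwrites `key`.  Returns false only if the table is full. */
static bool sketch_insert(struct sketch_table *t, uint32_t key, void *data)
{
   uint32_t i = sketch_hash(key) & (SKETCH_CAPACITY - 1);

   for (uint32_t n = 0; n < SKETCH_CAPACITY; n++) {
      struct sketch_entry *e = &t->slots[i];

      if (!e->used) {
         e->used = true;
         e->key = key;
         e->data = data;
         t->count++;
         return true;
      }
      if (e->key == key) {
         e->data = data;
         return true;
      }
      i = (i + 1) & (SKETCH_CAPACITY - 1); /* probe the next slot */
   }
   return false;
}

/* Returns the stored pointer, or NULL if `key` is absent. */
static void *sketch_lookup(const struct sketch_table *t, uint32_t key)
{
   uint32_t i = sketch_hash(key) & (SKETCH_CAPACITY - 1);

   for (uint32_t n = 0; n < SKETCH_CAPACITY; n++) {
      const struct sketch_entry *e = &t->slots[i];

      if (!e->used)
         return NULL; /* an empty slot ends the probe sequence */
      if (e->key == key)
         return e->data;
      i = (i + 1) & (SKETCH_CAPACITY - 1);
   }
   return NULL;
}
```

Entries live directly in the slot array, so a lookup touches contiguous
memory instead of following per-bucket chains.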

It improves Minecraft performance 12.3% +/- 1.4% (n=9), dropping hash lookups
from 8% of the profile to 0.5%.

I also tested cairo-gl, which should be a pessimal workload for this hash
table: around 247000 FBOs created and destroyed, only around 65 live at any
time, and few lookups of them between creation and destruction.  No
statistically significant performance difference at n=76 (mean 20.3/20.4
seconds, sd 2.8/3.2 seconds).  If I remove the >20 seconds outliers that
appear to be due to thermal throttling, there's possibly a .97% +/- 0.31%
performance win (n=61/59).  The choice of cutoff for outliers feels a lot like
cooking the data, but I've gone through this process 3 times for minor
iterations of the code with the same conclusion each time.

Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2012-11-12 15:52:43 -08:00
Eric Anholt
35fd61bd99 mesa: Import a copy of the open-addressing hash table code I wrote.
Mesa's chaining hash table for object names is slow, and this should be much
faster.  I namespaced the functions under _mesa_*, to avoid visibility
troubles that we may have had before with hash_table_* functions.

v2: Move .c file to main/, const a few things, clean up loop conditions,
    add/extend some comments.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2012-11-12 15:52:42 -08:00
Eric Anholt
1e8dd15311 automake: Remove libdricore clip.c workaround lib.
sparc/clip.c got moved to sparc/sparc-clip.c to avoid doing this workaround in
the parent directory.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2012-11-12 15:52:42 -08:00
Eric Anholt
9078441072 automake,android: Build program/ into a helper lib (v2)
This commit simplifies mesa/Makefile.am, but its more important feature is
allowing a file with the same name to appear in both main/ and program/.

v2: [chadv] Add changes to Android makefiles.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Signed-off-by: Chad Versace <chad.versace@linux.intel.com> (v2)
2012-11-12 15:52:42 -08:00
Chad Versace
0ef8535747 android: Moves rules for libmesa_st_mesa to separate makefile
The pair of files src/mesa/Android.mk and src/mesa/Android.gen.mk are too
long and complex to be easily understood. This patch belongs to a series
that decomposes them into several easily digestible makefiles.

This patch moves the rules for libmesa_st_mesa.a from Android.mk to
Android.libmesa_st_mesa.mk.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-12 15:52:42 -08:00
Chad Versace
7071ffb464 android: Moves rules for libmesa_dricore to separate makefile
The pair of files src/mesa/Android.mk and src/mesa/Android.gen.mk are too
long and complex to be easily understood. This patch belongs to a series
that decomposes them into several easily digestible makefiles.

This patch moves the rules for libmesa_dricore.a from Android.mk to
Android.libmesa_dricore.mk.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-12 15:52:42 -08:00
Chad Versace
5f935af675 android: Moves rules for mesa_gen_matypes to separate makefile
The pair of files src/mesa/Android.mk and src/mesa/Android.gen.mk are too
long and complex to be easily understood. This patch belongs to a series
that decomposes them into several easily digestible makefiles.

This patch moves the rules for host executable mesa_gen_matypes from
Android.mk to Android.mesa_gen_matypes.mk.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-12 15:52:42 -08:00
Chad Versace
f2b638a997 android: Moves rules for libmesa_glsl_utils to separate makefile
The pair of files src/mesa/Android.mk and src/mesa/Android.gen.mk are too
long and complex to be easily understood. This patch belongs to a series
that decomposes them into several easily digestible makefiles.

This patch moves the rules for the host and target libmesa_glsl_utils.a
from Android.mk to Android.libmesa_glsl_utils.mk.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-12 15:52:42 -08:00
Eric Anholt
a665cf1226 automake: Merge *_CXX_FILES variables in the glsl build.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-12 15:52:42 -08:00
Eric Anholt
34d4216e64 automake: Merge per-type *_FILES variables in intel drivers.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-12 15:52:42 -08:00
Eric Anholt
e9e8e194e2 automake: Merge separated *_CXX_FILES variables to *_FILES in core mesa.
They were always used with the corresponding *_FILES variables now that
automake handles rule generation.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-12 15:52:41 -08:00
Eric Anholt
be655ec617 automake: Remove dead *_OBJECTS variables from the old build system.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-12 15:52:41 -08:00
Eric Anholt
906d832db5 automake: Fix a comment typo.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-12 15:52:41 -08:00
Marek Olšák
f5ac60152b r600g: remove redundant parameter in r600_init_surface 2012-11-13 00:34:35 +01:00
Marek Olšák
e7dde5c8fb st/mesa: fix computation of last_level in GenerateMipmap
Array textures were broken.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2012-11-12 21:37:31 +01:00
Marek Olšák
6dd839f23a st/mesa: fix computation of last_level during texture creation
Array textures were broken.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2012-11-12 21:37:31 +01:00
Marek Olšák
c06258dd02 st/mesa: fix guessing the base level size
It was pretty broken with array textures, where the array size (height or
depth depending on the target) shouldn't be magnified.

The guessing also no longer fails with 1D and cube textures.
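
A rough sketch of the idea, not the st/mesa implementation: when scaling a given level's size back up to level 0, only true image dimensions are magnified, while the array size is passed through unchanged. The function name and the string target names are hypothetical, and the sketch ignores the rounding ambiguity that real code has to handle (a dimension of 1 at some level can come from several base sizes).

```python
def guess_base_level_size(target, width, height, depth, level):
    """Scale a mip level's size back up to level 0, leaving array sizes alone."""
    # Width is an image dimension for every target handled here.
    width <<= level
    # Height is an image dimension except for 1D and 1D array textures.
    if target in ("2D", "2D_ARRAY", "3D", "CUBE"):
        height <<= level
    # Depth is an image dimension only for 3D textures.
    if target == "3D":
        depth <<= level
    # For "1D_ARRAY" the height is the layer count, and for "2D_ARRAY" the
    # depth is, so neither is magnified.
    return width, height, depth
```

For example, a 1D array texture with 6 layers whose level 2 is 16 texels wide yields a base level that is 64 texels wide and still has 6 layers.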

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2012-11-12 21:37:31 +01:00
Marek Olšák
985f2aec4a mesa: fix error checking of TexStorage(levels) for array and rect textures
NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2012-11-12 21:37:30 +01:00
Marek Olšák
12a4fd7e45 mesa: use MaxNumlevels in _mesa_test_texobj_completeness
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2012-11-12 21:37:30 +01:00
Marek Olšák
8111342e81 mesa: add MaxNumLevels to gl_texture_image, remove MaxLog2
MaxLog2 led to bugs, because it didn't work well with 1D and 3D textures.

NOTE: This is a candidate for the stable branches.

v2: correct the comment at MaxNumLevels

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2012-11-12 21:36:56 +01:00
Roland Scheidegger
26097c4855 gallivm,draw,llvmpipe: use base ptr + mip offsets instead of mip pointers
This might add a slight overhead, but handling mip offsets more like
the width (and image) strides should make some things easier later (the mip
level becomes just part of the offset calculation).
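
To make the "base pointer plus per-level offset" idea concrete, here is a small sketch. It is not the gallivm or llvmpipe code: the function names are hypothetical, and it assumes a tightly packed 2D texture with no alignment padding, where each level halves both dimensions down to a minimum of 1.

```python
def mip_offsets(width, height, bytes_per_pixel):
    """Return the byte offset of each mip level from the base pointer."""
    offsets = []
    offset = 0
    while True:
        offsets.append(offset)
        # The next level starts right after this one (no padding assumed).
        offset += width * height * bytes_per_pixel
        if width == 1 and height == 1:
            break
        width = max(1, width // 2)
        height = max(1, height // 2)
    return offsets


def texel_offset(offsets, level, x, y, level_width, bytes_per_pixel):
    """Byte offset of texel (x, y) in a level, relative to the base pointer."""
    # The mip level contributes one more term to the offset sum, next to
    # the row and column terms.
    return offsets[level] + (y * level_width + x) * bytes_per_pixel
```

With one base pointer, choosing a level means adding `offsets[level]` rather than loading a separate per-level pointer.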

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-12 21:02:59 +01:00
Roland Scheidegger
8257bb963f llvmpipe: always allocate whole miptrees not individual levels
This is preparation work for using mip level offsets + base_ptr for texture
sampling instead of per-mip pointers.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-12 21:02:59 +01:00
Kenneth Graunke
df3cecab7d i965: Add comments for each of the surface state table's columns.
I can never remember what "AB" means, and having to constantly consult
the docs is annoying.  Just add comments to the top which explain each
of the abbreviations.
2012-11-12 11:24:38 -08:00
Paul Berry
21e23fbe21 glapi: Stop handling XML annotation exec="{es,check,loopback}".
Previously, we used these XML annotations to make the code generation
scripts aware of any instances where the Mesa implementation of a
function had a prefix other than "_mesa_".  Now that all of the Mesa
implementation functions have been renamed to match the XML, we only
need to handle exec="skip", exec="dynamic", and the default case of
exec="mesa".

Acked-by: Brian Paul <brianp@vmware.com>
2012-11-12 10:53:58 -08:00
Paul Berry
55b81ff56b glapi: Remove handling of mesa_name XML attribute.
Previously, we used the mesa_name XML attribute to make the code
generation scripts aware of any instances where the Mesa
implementation of a function had a different function name suffix than
the primary name in the XML.  Now that all of the Mesa implementation
functions have been renamed to match the XML, this attribute is no
longer necessary.

Acked-by: Brian Paul <brianp@vmware.com>
2012-11-12 10:53:57 -08:00
Paul Berry
bb3db388d8 mesa: Fix const correctness of API implementation functions.
This patch changes the use of const in the type signatures of
_mesa_ShaderSource() and _mesa_TransformFeedbackVaryings(), to match
the type signatures in the GL spec.  This avoids warnings when
building the code-generated api_exec.c file.

Note: previously we avoided the build warnings because these functions
were being type-checked against ShaderSourceARB and
TransformFeedbackVaryingsEXT; those functions are semantically
equivalent, but have fewer const qualifiers in their type signatures.

Acked-by: Brian Paul <brianp@vmware.com>
2012-11-12 10:53:57 -08:00
Paul Berry
1a1db1746d mesa: Standardize names of OpenGL functions.
This patch adjusts the aliasing pattern in the GL API description XML,
and the functions that implement the GL API within Mesa, to
consistently follow these naming conventions:

- When several function names are aliases of each other, the primary
  name is the one with no extension suffix (or the name with the
  suffix "ARB", if no unsuffixed name is available).  (By "primary
  name", I mean the name that all the other functions point to using
  the XML "alias" attribute).

- The name of the Mesa implementation of each function is the same as
  the primary name, with the prefix "_mesa_".
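
The two conventions above can be sketched as follows. The helper names and the suffix list are assumptions for illustration (the list covers only suffixes that appear in this commit's rename lists), and the last fallback is a guess, since the message states no rule for a function that has neither an unsuffixed nor an "ARB" name.

```python
# Extension suffixes seen in this commit's rename lists (not exhaustive).
SUFFIXES = ("ARB", "EXT", "NV", "OES", "MESA", "APPLE")


def has_suffix(name):
    """True if the function name ends in one of the known extension suffixes."""
    return name.endswith(SUFFIXES)


def primary_name(aliases):
    """Pick the primary name from a group of names that alias each other."""
    # Rule 1: prefer the name with no extension suffix.
    unsuffixed = [n for n in aliases if not has_suffix(n)]
    if unsuffixed:
        return unsuffixed[0]
    # Rule 2: otherwise prefer the name with the "ARB" suffix.
    arb = [n for n in aliases if n.endswith("ARB")]
    if arb:
        return arb[0]
    # Assumption: with neither available, keep the first listed name.
    return aliases[0]


def mesa_impl_name(aliases):
    """Name of the Mesa implementation: "_mesa_" plus the primary name."""
    return "_mesa_" + primary_name(aliases)
```

For instance, `mesa_impl_name(["ActiveTextureARB", "ActiveTexture"])` gives `_mesa_ActiveTexture`, and `mesa_impl_name(["BindProgramNV", "BindProgramARB"])` gives `_mesa_BindProgramARB`, matching the renames listed in this commit.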

This patch renames the following Mesa functions:
  _check_GetTexGenxvOES => _mesa_GetTexGenxvOES
  _check_TexGenxOES => _mesa_TexGenxOES
  _check_TexGenxvOES => _mesa_TexGenxvOES
  _es_AlphaFuncx => _mesa_AlphaFuncx
  _es_ClearColorx => _mesa_ClearColorx
  _es_ClearDepthx => _mesa_ClearDepthx
  _es_ClipPlanef => _mesa_ClipPlanef
  _es_ClipPlanex => _mesa_ClipPlanex
  _es_Color4x => _mesa_Color4x
  _es_DepthRangex => _mesa_DepthRangex
  _es_DrawTexxOES => _mesa_DrawTexxOES
  _es_DrawTexxvOES => _mesa_DrawTexxvOES
  _es_Fogx => _mesa_Fogx
  _es_Fogxv => _mesa_Fogxv
  _es_Frustumf => _mesa_Frustumf
  _es_Frustumx => _mesa_Frustumx
  _es_GetClipPlanef => _mesa_GetClipPlanef
  _es_GetClipPlanex => _mesa_GetClipPlanex
  _es_GetLightxv => _mesa_GetLightxv
  _es_GetMaterialxv => _mesa_GetMaterialxv
  _es_GetTexEnvxv => _mesa_GetTexEnvxv
  _es_GetTexParameterxv => _mesa_GetTexParameterxv
  _es_LightModelx => _mesa_LightModelx
  _es_LightModelxv => _mesa_LightModelxv
  _es_Lightx => _mesa_Lightx
  _es_Lightxv => _mesa_Lightxv
  _es_LineWidthx => _mesa_LineWidthx
  _es_LoadMatrixx => _mesa_LoadMatrixx
  _es_Materialx => _mesa_Materialx
  _es_Materialxv => _mesa_Materialxv
  _es_MultMatrixx => _mesa_MultMatrixx
  _es_MultiTexCoord4x => _mesa_MultiTexCoord4x
  _es_Normal3x => _mesa_Normal3x
  _es_Orthof => _mesa_Orthof
  _es_Orthox => _mesa_Orthox
  _es_PointParameterx => _mesa_PointParameterx
  _es_PointParameterxv => _mesa_PointParameterxv
  _es_PointSizex => _mesa_PointSizex
  _es_PolygonOffsetx => _mesa_PolygonOffsetx
  _es_QueryMatrixxOES => _mesa_QueryMatrixxOES
  _es_Rotatex => _mesa_Rotatex
  _es_SampleCoveragex => _mesa_SampleCoveragex
  _es_Scalex => _mesa_Scalex
  _es_TexEnvx => _mesa_TexEnvx
  _es_TexEnvxv => _mesa_TexEnvxv
  _es_TexParameterx => _mesa_TexParameterx
  _es_TexParameterxv => _mesa_TexParameterxv
  _es_Translatex => _mesa_Translatex
  _mesa_ActiveTextureARB => _mesa_ActiveTexture
  _mesa_BeginQueryARB => _mesa_BeginQuery
  _mesa_BindAttribLocationARB => _mesa_BindAttribLocation
  _mesa_BindBufferARB => _mesa_BindBuffer
  _mesa_BindFramebufferEXT => _mesa_BindFramebuffer
  _mesa_BindProgram => _mesa_BindProgramARB
  _mesa_BindRenderbufferEXT => _mesa_BindRenderbuffer
  _mesa_BlendEquationSeparateEXT => _mesa_BlendEquationSeparate
  _mesa_BlendEquationSeparatei => _mesa_BlendEquationSeparateiARB
  _mesa_BlendEquationi => _mesa_BlendEquationiARB
  _mesa_BlendFuncSeparateEXT => _mesa_BlendFuncSeparate
  _mesa_BlendFuncSeparatei => _mesa_BlendFuncSeparateiARB
  _mesa_BlendFunci => _mesa_BlendFunciARB
  _mesa_BlitFramebufferEXT => _mesa_BlitFramebuffer
  _mesa_BufferDataARB => _mesa_BufferData
  _mesa_BufferSubDataARB => _mesa_BufferSubData
  _mesa_CheckFramebufferStatusEXT => _mesa_CheckFramebufferStatus
  _mesa_ClampColorARB => _mesa_ClampColor
  _mesa_ClientActiveTextureARB => _mesa_ClientActiveTexture
  _mesa_ColorMaskIndexed => _mesa_ColorMaski
  _mesa_CompileShaderARB => _mesa_CompileShader
  _mesa_CompressedTexImage1DARB => _mesa_CompressedTexImage1D
  _mesa_CompressedTexImage2DARB => _mesa_CompressedTexImage2D
  _mesa_CompressedTexImage3DARB => _mesa_CompressedTexImage3D
  _mesa_CompressedTexSubImage1DARB => _mesa_CompressedTexSubImage1D
  _mesa_CompressedTexSubImage2DARB => _mesa_CompressedTexSubImage2D
  _mesa_CompressedTexSubImage3DARB => _mesa_CompressedTexSubImage3D
  _mesa_DeleteBuffersARB => _mesa_DeleteBuffers
  _mesa_DeleteFramebuffersEXT => _mesa_DeleteFramebuffers
  _mesa_DeletePrograms => _mesa_DeleteProgramsARB
  _mesa_DeleteQueriesARB => _mesa_DeleteQueries
  _mesa_DeleteRenderbuffersEXT => _mesa_DeleteRenderbuffers
  _mesa_DeleteVertexArraysAPPLE => _mesa_DeleteVertexArrays
  _mesa_DisableIndexed => _mesa_Disablei
  _mesa_DisableVertexAttribArrayARB => _mesa_DisableVertexAttribArray
  _mesa_DrawBuffersARB => _mesa_DrawBuffers
  _mesa_DrawTexf => _mesa_DrawTexfOES
  _mesa_DrawTexfv => _mesa_DrawTexfvOES
  _mesa_DrawTexi => _mesa_DrawTexiOES
  _mesa_DrawTexiv => _mesa_DrawTexivOES
  _mesa_DrawTexs => _mesa_DrawTexsOES
  _mesa_DrawTexsv => _mesa_DrawTexsvOES
  _mesa_EnableIndexed => _mesa_Enablei
  _mesa_EnableVertexAttribArrayARB => _mesa_EnableVertexAttribArray
  _mesa_EndQueryARB => _mesa_EndQuery
  _mesa_FogCoordPointerEXT => _mesa_FogCoordPointer
  _mesa_FramebufferRenderbufferEXT => _mesa_FramebufferRenderbuffer
  _mesa_FramebufferTexture1DEXT => _mesa_FramebufferTexture1D
  _mesa_FramebufferTexture2DEXT => _mesa_FramebufferTexture2D
  _mesa_FramebufferTexture3DEXT => _mesa_FramebufferTexture3D
  _mesa_FramebufferTextureLayerEXT => _mesa_FramebufferTextureLayer
  _mesa_GenBuffersARB => _mesa_GenBuffers
  _mesa_GenFramebuffersEXT => _mesa_GenFramebuffers
  _mesa_GenPrograms => _mesa_GenProgramsARB
  _mesa_GenQueriesARB => _mesa_GenQueries
  _mesa_GenRenderbuffersEXT => _mesa_GenRenderbuffers
  _mesa_GenerateMipmapEXT => _mesa_GenerateMipmap
  _mesa_GetActiveAttribARB => _mesa_GetActiveAttrib
  _mesa_GetActiveUniformARB => _mesa_GetActiveUniform
  _mesa_GetAttribLocationARB => _mesa_GetAttribLocation
  _mesa_GetBooleanIndexedv => _mesa_GetBooleani_v
  _mesa_GetBufferParameterivARB => _mesa_GetBufferParameteriv
  _mesa_GetBufferPointervARB => _mesa_GetBufferPointerv
  _mesa_GetBufferSubDataARB => _mesa_GetBufferSubData
  _mesa_GetCompressedTexImageARB => _mesa_GetCompressedTexImage
  _mesa_GetFramebufferAttachmentParameterivEXT => _mesa_GetFramebufferAttachmentParameteriv
  _mesa_GetIntegerIndexedv => _mesa_GetIntegeri_v
  _mesa_GetQueryObjecti64vEXT => _mesa_GetQueryObjecti64v
  _mesa_GetQueryObjectivARB => _mesa_GetQueryObjectiv
  _mesa_GetQueryObjectui64vEXT => _mesa_GetQueryObjectui64v
  _mesa_GetQueryObjectuivARB => _mesa_GetQueryObjectuiv
  _mesa_GetQueryivARB => _mesa_GetQueryiv
  _mesa_GetRenderbufferParameterivEXT => _mesa_GetRenderbufferParameteriv
  _mesa_GetShaderSourceARB => _mesa_GetShaderSource
  _mesa_GetUniformLocationARB => _mesa_GetUniformLocation
  _mesa_GetUniformfvARB => _mesa_GetUniformfv
  _mesa_GetUniformivARB => _mesa_GetUniformiv
  _mesa_GetVertexAttribPointervARB => _mesa_GetVertexAttribPointerv
  _mesa_GetVertexAttribdvARB => _mesa_GetVertexAttribdv
  _mesa_GetVertexAttribfvARB => _mesa_GetVertexAttribfv
  _mesa_GetVertexAttribivARB => _mesa_GetVertexAttribiv
  _mesa_IsBufferARB => _mesa_IsBuffer
  _mesa_IsEnabledIndexed => _mesa_IsEnabledi
  _mesa_IsFramebufferEXT => _mesa_IsFramebuffer
  _mesa_IsQueryARB => _mesa_IsQuery
  _mesa_IsRenderbufferEXT => _mesa_IsRenderbuffer
  _mesa_IsVertexArrayAPPLE => _mesa_IsVertexArray
  _mesa_LinkProgramARB => _mesa_LinkProgram
  _mesa_LoadTransposeMatrixdARB => _mesa_LoadTransposeMatrixd
  _mesa_LoadTransposeMatrixfARB => _mesa_LoadTransposeMatrixf
  _mesa_MapBufferARB => _mesa_MapBuffer
  _mesa_MultTransposeMatrixdARB => _mesa_MultTransposeMatrixd
  _mesa_MultTransposeMatrixfARB => _mesa_MultTransposeMatrixf
  _mesa_MultiDrawArraysEXT => _mesa_MultiDrawArrays
  _mesa_PointSizePointer => _mesa_PointSizePointerOES
  _mesa_ProvokingVertexEXT => _mesa_ProvokingVertex
  _mesa_RenderbufferStorageEXT => _mesa_RenderbufferStorage
  _mesa_SampleCoverageARB => _mesa_SampleCoverage
  _mesa_SecondaryColorPointerEXT => _mesa_SecondaryColorPointer
  _mesa_ShaderSourceARB => _mesa_ShaderSource
  _mesa_Uniform1fARB => _mesa_Uniform1f
  _mesa_Uniform1fvARB => _mesa_Uniform1fv
  _mesa_Uniform1iARB => _mesa_Uniform1i
  _mesa_Uniform1ivARB => _mesa_Uniform1iv
  _mesa_Uniform2fARB => _mesa_Uniform2f
  _mesa_Uniform2fvARB => _mesa_Uniform2fv
  _mesa_Uniform2iARB => _mesa_Uniform2i
  _mesa_Uniform2ivARB => _mesa_Uniform2iv
  _mesa_Uniform3fARB => _mesa_Uniform3f
  _mesa_Uniform3fvARB => _mesa_Uniform3fv
  _mesa_Uniform3iARB => _mesa_Uniform3i
  _mesa_Uniform3ivARB => _mesa_Uniform3iv
  _mesa_Uniform4fARB => _mesa_Uniform4f
  _mesa_Uniform4fvARB => _mesa_Uniform4fv
  _mesa_Uniform4iARB => _mesa_Uniform4i
  _mesa_Uniform4ivARB => _mesa_Uniform4iv
  _mesa_UniformMatrix2fvARB => _mesa_UniformMatrix2fv
  _mesa_UniformMatrix3fvARB => _mesa_UniformMatrix3fv
  _mesa_UniformMatrix4fvARB => _mesa_UniformMatrix4fv
  _mesa_UnmapBufferARB => _mesa_UnmapBuffer
  _mesa_UseProgramObjectARB => _mesa_UseProgram
  _mesa_ValidateProgramARB => _mesa_ValidateProgram
  _mesa_VertexAttribPointerARB => _mesa_VertexAttribPointer
  _mesa_WindowPos2dMESA => _mesa_WindowPos2d
  _mesa_WindowPos2dvMESA => _mesa_WindowPos2dv
  _mesa_WindowPos2fMESA => _mesa_WindowPos2f
  _mesa_WindowPos2fvMESA => _mesa_WindowPos2fv
  _mesa_WindowPos2iMESA => _mesa_WindowPos2i
  _mesa_WindowPos2ivMESA => _mesa_WindowPos2iv
  _mesa_WindowPos2sMESA => _mesa_WindowPos2s
  _mesa_WindowPos2svMESA => _mesa_WindowPos2sv
  _mesa_WindowPos3dMESA => _mesa_WindowPos3d
  _mesa_WindowPos3dvMESA => _mesa_WindowPos3dv
  _mesa_WindowPos3fMESA => _mesa_WindowPos3f
  _mesa_WindowPos3fvMESA => _mesa_WindowPos3fv
  _mesa_WindowPos3iMESA => _mesa_WindowPos3i
  _mesa_WindowPos3ivMESA => _mesa_WindowPos3iv
  _mesa_WindowPos3sMESA => _mesa_WindowPos3s
  _mesa_WindowPos3svMESA => _mesa_WindowPos3sv
  loopback_Color3b_f => _mesa_Color3b
  loopback_Color3bv_f => _mesa_Color3bv
  loopback_Color3d_f => _mesa_Color3d
  loopback_Color3dv_f => _mesa_Color3dv
  loopback_Color3i_f => _mesa_Color3i
  loopback_Color3iv_f => _mesa_Color3iv
  loopback_Color3s_f => _mesa_Color3s
  loopback_Color3sv_f => _mesa_Color3sv
  loopback_Color3ub_f => _mesa_Color3ub
  loopback_Color3ubv_f => _mesa_Color3ubv
  loopback_Color3ui_f => _mesa_Color3ui
  loopback_Color3uiv_f => _mesa_Color3uiv
  loopback_Color3us_f => _mesa_Color3us
  loopback_Color3usv_f => _mesa_Color3usv
  loopback_Color4b_f => _mesa_Color4b
  loopback_Color4bv_f => _mesa_Color4bv
  loopback_Color4d_f => _mesa_Color4d
  loopback_Color4dv_f => _mesa_Color4dv
  loopback_Color4i_f => _mesa_Color4i
  loopback_Color4iv_f => _mesa_Color4iv
  loopback_Color4s_f => _mesa_Color4s
  loopback_Color4sv_f => _mesa_Color4sv
  loopback_Color4ub_f => _mesa_Color4ub
  loopback_Color4ubv_f => _mesa_Color4ubv
  loopback_Color4ui_f => _mesa_Color4ui
  loopback_Color4uiv_f => _mesa_Color4uiv
  loopback_Color4us_f => _mesa_Color4us
  loopback_Color4usv_f => _mesa_Color4usv
  loopback_EdgeFlagv => _mesa_EdgeFlagv
  loopback_EvalCoord1d => _mesa_EvalCoord1d
  loopback_EvalCoord1dv => _mesa_EvalCoord1dv
  loopback_EvalCoord1fv => _mesa_EvalCoord1fv
  loopback_EvalCoord2d => _mesa_EvalCoord2d
  loopback_EvalCoord2dv => _mesa_EvalCoord2dv
  loopback_EvalCoord2fv => _mesa_EvalCoord2fv
  loopback_FogCoorddEXT => _mesa_FogCoordd
  loopback_FogCoorddvEXT => _mesa_FogCoorddv
  loopback_Indexd => _mesa_Indexd
  loopback_Indexdv => _mesa_Indexdv
  loopback_Indexi => _mesa_Indexi
  loopback_Indexiv => _mesa_Indexiv
  loopback_Indexs => _mesa_Indexs
  loopback_Indexsv => _mesa_Indexsv
  loopback_Indexub => _mesa_Indexub
  loopback_Indexubv => _mesa_Indexubv
  loopback_Materialf => _mesa_Materialf
  loopback_Materiali => _mesa_Materiali
  loopback_Materialiv => _mesa_Materialiv
  loopback_MultiTexCoord1dARB => _mesa_MultiTexCoord1d
  loopback_MultiTexCoord1dvARB => _mesa_MultiTexCoord1dv
  loopback_MultiTexCoord1iARB => _mesa_MultiTexCoord1i
  loopback_MultiTexCoord1ivARB => _mesa_MultiTexCoord1iv
  loopback_MultiTexCoord1sARB => _mesa_MultiTexCoord1s
  loopback_MultiTexCoord1svARB => _mesa_MultiTexCoord1sv
  loopback_MultiTexCoord2dARB => _mesa_MultiTexCoord2d
  loopback_MultiTexCoord2dvARB => _mesa_MultiTexCoord2dv
  loopback_MultiTexCoord2iARB => _mesa_MultiTexCoord2i
  loopback_MultiTexCoord2ivARB => _mesa_MultiTexCoord2iv
  loopback_MultiTexCoord2sARB => _mesa_MultiTexCoord2s
  loopback_MultiTexCoord2svARB => _mesa_MultiTexCoord2sv
  loopback_MultiTexCoord3dARB => _mesa_MultiTexCoord3d
  loopback_MultiTexCoord3dvARB => _mesa_MultiTexCoord3dv
  loopback_MultiTexCoord3iARB => _mesa_MultiTexCoord3i
  loopback_MultiTexCoord3ivARB => _mesa_MultiTexCoord3iv
  loopback_MultiTexCoord3sARB => _mesa_MultiTexCoord3s
  loopback_MultiTexCoord3svARB => _mesa_MultiTexCoord3sv
  loopback_MultiTexCoord4dARB => _mesa_MultiTexCoord4d
  loopback_MultiTexCoord4dvARB => _mesa_MultiTexCoord4dv
  loopback_MultiTexCoord4iARB => _mesa_MultiTexCoord4i
  loopback_MultiTexCoord4ivARB => _mesa_MultiTexCoord4iv
  loopback_MultiTexCoord4sARB => _mesa_MultiTexCoord4s
  loopback_MultiTexCoord4svARB => _mesa_MultiTexCoord4sv
  loopback_Normal3b => _mesa_Normal3b
  loopback_Normal3bv => _mesa_Normal3bv
  loopback_Normal3d => _mesa_Normal3d
  loopback_Normal3dv => _mesa_Normal3dv
  loopback_Normal3i => _mesa_Normal3i
  loopback_Normal3iv => _mesa_Normal3iv
  loopback_Normal3s => _mesa_Normal3s
  loopback_Normal3sv => _mesa_Normal3sv
  loopback_Rectd => _mesa_Rectd
  loopback_Rectdv => _mesa_Rectdv
  loopback_Rectfv => _mesa_Rectfv
  loopback_Recti => _mesa_Recti
  loopback_Rectiv => _mesa_Rectiv
  loopback_Rects => _mesa_Rects
  loopback_Rectsv => _mesa_Rectsv
  loopback_SecondaryColor3bEXT_f => _mesa_SecondaryColor3b
  loopback_SecondaryColor3bvEXT_f => _mesa_SecondaryColor3bv
  loopback_SecondaryColor3dEXT_f => _mesa_SecondaryColor3d
  loopback_SecondaryColor3dvEXT_f => _mesa_SecondaryColor3dv
  loopback_SecondaryColor3iEXT_f => _mesa_SecondaryColor3i
  loopback_SecondaryColor3ivEXT_f => _mesa_SecondaryColor3iv
  loopback_SecondaryColor3sEXT_f => _mesa_SecondaryColor3s
  loopback_SecondaryColor3svEXT_f => _mesa_SecondaryColor3sv
  loopback_SecondaryColor3ubEXT_f => _mesa_SecondaryColor3ub
  loopback_SecondaryColor3ubvEXT_f => _mesa_SecondaryColor3ubv
  loopback_SecondaryColor3uiEXT_f => _mesa_SecondaryColor3ui
  loopback_SecondaryColor3uivEXT_f => _mesa_SecondaryColor3uiv
  loopback_SecondaryColor3usEXT_f => _mesa_SecondaryColor3us
  loopback_SecondaryColor3usvEXT_f => _mesa_SecondaryColor3usv
  loopback_TexCoord1d => _mesa_TexCoord1d
  loopback_TexCoord1dv => _mesa_TexCoord1dv
  loopback_TexCoord1i => _mesa_TexCoord1i
  loopback_TexCoord1iv => _mesa_TexCoord1iv
  loopback_TexCoord1s => _mesa_TexCoord1s
  loopback_TexCoord1sv => _mesa_TexCoord1sv
  loopback_TexCoord2d => _mesa_TexCoord2d
  loopback_TexCoord2dv => _mesa_TexCoord2dv
  loopback_TexCoord2i => _mesa_TexCoord2i
  loopback_TexCoord2iv => _mesa_TexCoord2iv
  loopback_TexCoord2s => _mesa_TexCoord2s
  loopback_TexCoord2sv => _mesa_TexCoord2sv
  loopback_TexCoord3d => _mesa_TexCoord3d
  loopback_TexCoord3dv => _mesa_TexCoord3dv
  loopback_TexCoord3i => _mesa_TexCoord3i
  loopback_TexCoord3iv => _mesa_TexCoord3iv
  loopback_TexCoord3s => _mesa_TexCoord3s
  loopback_TexCoord3sv => _mesa_TexCoord3sv
  loopback_TexCoord4d => _mesa_TexCoord4d
  loopback_TexCoord4dv => _mesa_TexCoord4dv
  loopback_TexCoord4i => _mesa_TexCoord4i
  loopback_TexCoord4iv => _mesa_TexCoord4iv
  loopback_TexCoord4s => _mesa_TexCoord4s
  loopback_TexCoord4sv => _mesa_TexCoord4sv
  loopback_Vertex2d => _mesa_Vertex2d
  loopback_Vertex2dv => _mesa_Vertex2dv
  loopback_Vertex2i => _mesa_Vertex2i
  loopback_Vertex2iv => _mesa_Vertex2iv
  loopback_Vertex2s => _mesa_Vertex2s
  loopback_Vertex2sv => _mesa_Vertex2sv
  loopback_Vertex3d => _mesa_Vertex3d
  loopback_Vertex3dv => _mesa_Vertex3dv
  loopback_Vertex3i => _mesa_Vertex3i
  loopback_Vertex3iv => _mesa_Vertex3iv
  loopback_Vertex3s => _mesa_Vertex3s
  loopback_Vertex3sv => _mesa_Vertex3sv
  loopback_Vertex4d => _mesa_Vertex4d
  loopback_Vertex4dv => _mesa_Vertex4dv
  loopback_Vertex4i => _mesa_Vertex4i
  loopback_Vertex4iv => _mesa_Vertex4iv
  loopback_Vertex4s => _mesa_Vertex4s
  loopback_Vertex4sv => _mesa_Vertex4sv
  loopback_VertexAttrib1dARB => _mesa_VertexAttrib1d
  loopback_VertexAttrib1dNV => _mesa_VertexAttrib1dNV
  loopback_VertexAttrib1dvARB => _mesa_VertexAttrib1dv
  loopback_VertexAttrib1dvNV => _mesa_VertexAttrib1dvNV
  loopback_VertexAttrib1sARB => _mesa_VertexAttrib1s
  loopback_VertexAttrib1sNV => _mesa_VertexAttrib1sNV
  loopback_VertexAttrib1svARB => _mesa_VertexAttrib1sv
  loopback_VertexAttrib1svNV => _mesa_VertexAttrib1svNV
  loopback_VertexAttrib2dARB => _mesa_VertexAttrib2d
  loopback_VertexAttrib2dNV => _mesa_VertexAttrib2dNV
  loopback_VertexAttrib2dvARB => _mesa_VertexAttrib2dv
  loopback_VertexAttrib2dvNV => _mesa_VertexAttrib2dvNV
  loopback_VertexAttrib2sARB => _mesa_VertexAttrib2s
  loopback_VertexAttrib2sNV => _mesa_VertexAttrib2sNV
  loopback_VertexAttrib2svARB => _mesa_VertexAttrib2sv
  loopback_VertexAttrib2svNV => _mesa_VertexAttrib2svNV
  loopback_VertexAttrib3dARB => _mesa_VertexAttrib3d
  loopback_VertexAttrib3dNV => _mesa_VertexAttrib3dNV
  loopback_VertexAttrib3dvARB => _mesa_VertexAttrib3dv
  loopback_VertexAttrib3dvNV => _mesa_VertexAttrib3dvNV
  loopback_VertexAttrib3sARB => _mesa_VertexAttrib3s
  loopback_VertexAttrib3sNV => _mesa_VertexAttrib3sNV
  loopback_VertexAttrib3svARB => _mesa_VertexAttrib3sv
  loopback_VertexAttrib3svNV => _mesa_VertexAttrib3svNV
  loopback_VertexAttrib4NbvARB => _mesa_VertexAttrib4Nbv
  loopback_VertexAttrib4NivARB => _mesa_VertexAttrib4Niv
  loopback_VertexAttrib4NsvARB => _mesa_VertexAttrib4Nsv
  loopback_VertexAttrib4NubARB => _mesa_VertexAttrib4Nub
  loopback_VertexAttrib4NubvARB => _mesa_VertexAttrib4Nubv
  loopback_VertexAttrib4NuivARB => _mesa_VertexAttrib4Nuiv
  loopback_VertexAttrib4NusvARB => _mesa_VertexAttrib4Nusv
  loopback_VertexAttrib4bvARB => _mesa_VertexAttrib4bv
  loopback_VertexAttrib4dARB => _mesa_VertexAttrib4d
  loopback_VertexAttrib4dNV => _mesa_VertexAttrib4dNV
  loopback_VertexAttrib4dvARB => _mesa_VertexAttrib4dv
  loopback_VertexAttrib4dvNV => _mesa_VertexAttrib4dvNV
  loopback_VertexAttrib4ivARB => _mesa_VertexAttrib4iv
  loopback_VertexAttrib4sARB => _mesa_VertexAttrib4s
  loopback_VertexAttrib4sNV => _mesa_VertexAttrib4sNV
  loopback_VertexAttrib4svARB => _mesa_VertexAttrib4sv
  loopback_VertexAttrib4svNV => _mesa_VertexAttrib4svNV
  loopback_VertexAttrib4ubNV => _mesa_VertexAttrib4ubNV
  loopback_VertexAttrib4ubvARB => _mesa_VertexAttrib4ubv
  loopback_VertexAttrib4ubvNV => _mesa_VertexAttrib4ubvNV
  loopback_VertexAttrib4uivARB => _mesa_VertexAttrib4uiv
  loopback_VertexAttrib4usvARB => _mesa_VertexAttrib4usv
  loopback_VertexAttribI1iv => _mesa_VertexAttribI1iv
  loopback_VertexAttribI1uiv => _mesa_VertexAttribI1uiv
  loopback_VertexAttribI4bv => _mesa_VertexAttribI4bv
  loopback_VertexAttribI4sv => _mesa_VertexAttribI4sv
  loopback_VertexAttribI4ubv => _mesa_VertexAttribI4ubv
  loopback_VertexAttribI4usv => _mesa_VertexAttribI4usv
  loopback_VertexAttribs1dvNV => _mesa_VertexAttribs1dvNV
  loopback_VertexAttribs1fvNV => _mesa_VertexAttribs1fvNV
  loopback_VertexAttribs1svNV => _mesa_VertexAttribs1svNV
  loopback_VertexAttribs2dvNV => _mesa_VertexAttribs2dvNV
  loopback_VertexAttribs2fvNV => _mesa_VertexAttribs2fvNV
  loopback_VertexAttribs2svNV => _mesa_VertexAttribs2svNV
  loopback_VertexAttribs3dvNV => _mesa_VertexAttribs3dvNV
  loopback_VertexAttribs3fvNV => _mesa_VertexAttribs3fvNV
  loopback_VertexAttribs3svNV => _mesa_VertexAttribs3svNV
  loopback_VertexAttribs4dvNV => _mesa_VertexAttribs4dvNV
  loopback_VertexAttribs4fvNV => _mesa_VertexAttribs4fvNV
  loopback_VertexAttribs4svNV => _mesa_VertexAttribs4svNV
  loopback_VertexAttribs4ubvNV => _mesa_VertexAttribs4ubvNV

And changes the primary name assignment in the XML as follows:
  ActiveTextureARB => ActiveTexture
  AlphaFuncxOES => AlphaFuncx
  BeginConditionalRenderNV => BeginConditionalRender
  BeginQueryARB => BeginQuery
  BeginTransformFeedbackEXT => BeginTransformFeedback
  BindAttribLocationARB => BindAttribLocation
  BindBufferARB => BindBuffer
  BindBufferBaseEXT => BindBufferBase
  BindBufferRangeEXT => BindBufferRange
  BindFragDataLocationEXT => BindFragDataLocation
  BindFramebufferEXT => BindFramebuffer
  BindProgramNV => BindProgramARB
  BindRenderbufferEXT => BindRenderbuffer
  BlendEquationSeparateEXT => BlendEquationSeparate
  BlendFuncSeparateEXT => BlendFuncSeparate
  BlitFramebufferEXT => BlitFramebuffer
  BufferDataARB => BufferData
  BufferSubDataARB => BufferSubData
  CheckFramebufferStatusEXT => CheckFramebufferStatus
  ClampColorARB => ClampColor
  ClearColorxOES => ClearColorx
  ClearDepthxOES => ClearDepthx
  ClientActiveTextureARB => ClientActiveTexture
  ClipPlanefOES => ClipPlanef
  ClipPlanexOES => ClipPlanex
  Color4xOES => Color4x
  ColorMaskIndexedEXT => ColorMaski
  CompileShaderARB => CompileShader
  CompressedTexImage1DARB => CompressedTexImage1D
  CompressedTexImage2DARB => CompressedTexImage2D
  CompressedTexImage3DARB => CompressedTexImage3D
  CompressedTexSubImage1DARB => CompressedTexSubImage1D
  CompressedTexSubImage2DARB => CompressedTexSubImage2D
  CompressedTexSubImage3DARB => CompressedTexSubImage3D
  DeleteBuffersARB => DeleteBuffers
  DeleteFramebuffersEXT => DeleteFramebuffers
  DeleteProgramsNV => DeleteProgramsARB
  DeleteQueriesARB => DeleteQueries
  DeleteRenderbuffersEXT => DeleteRenderbuffers
  DeleteVertexArraysAPPLE => DeleteVertexArrays
  DepthRangexOES => DepthRangex
  DisableIndexedEXT => Disablei
  DisableVertexAttribArrayARB => DisableVertexAttribArray
  DrawBuffersARB => DrawBuffers
  EnableIndexedEXT => Enablei
  EnableVertexAttribArrayARB => EnableVertexAttribArray
  EndConditionalRenderNV => EndConditionalRender
  EndQueryARB => EndQuery
  EndTransformFeedbackEXT => EndTransformFeedback
  FogCoordPointerEXT => FogCoordPointer
  FogCoorddEXT => FogCoordd
  FogCoorddvEXT => FogCoorddv
  FogxOES => Fogx
  FogxvOES => Fogxv
  FramebufferRenderbufferEXT => FramebufferRenderbuffer
  FramebufferTexture1DEXT => FramebufferTexture1D
  FramebufferTexture2DEXT => FramebufferTexture2D
  FramebufferTexture3DEXT => FramebufferTexture3D
  FramebufferTextureLayerEXT => FramebufferTextureLayer
  FrustumfOES => Frustumf
  FrustumxOES => Frustumx
  GenBuffersARB => GenBuffers
  GenFramebuffersEXT => GenFramebuffers
  GenProgramsNV => GenProgramsARB
  GenQueriesARB => GenQueries
  GenRenderbuffersEXT => GenRenderbuffers
  GenerateMipmapEXT => GenerateMipmap
  GetActiveAttribARB => GetActiveAttrib
  GetActiveUniformARB => GetActiveUniform
  GetAttribLocationARB => GetAttribLocation
  GetBooleanIndexedvEXT => GetBooleani_v
  GetBufferParameterivARB => GetBufferParameteriv
  GetBufferPointervARB => GetBufferPointerv
  GetBufferSubDataARB => GetBufferSubData
  GetClipPlanefOES => GetClipPlanef
  GetClipPlanexOES => GetClipPlanex
  GetCompressedTexImageARB => GetCompressedTexImage
  GetFixedvOES => GetFixedv
  GetFragDataLocationEXT => GetFragDataLocation
  GetFramebufferAttachmentParameterivEXT => GetFramebufferAttachmentParameteriv
  GetIntegerIndexedvEXT => GetIntegeri_v
  GetLightxvOES => GetLightxv
  GetMaterialxvOES => GetMaterialxv
  GetQueryObjecti64vEXT => GetQueryObjecti64v
  GetQueryObjectivARB => GetQueryObjectiv
  GetQueryObjectui64vEXT => GetQueryObjectui64v
  GetQueryObjectuivARB => GetQueryObjectuiv
  GetQueryivARB => GetQueryiv
  GetRenderbufferParameterivEXT => GetRenderbufferParameteriv
  GetShaderSourceARB => GetShaderSource
  GetTexEnvxvOES => GetTexEnvxv
  GetTexParameterIivEXT => GetTexParameterIiv
  GetTexParameterIuivEXT => GetTexParameterIuiv
  GetTexParameterxvOES => GetTexParameterxv
  GetTransformFeedbackVaryingEXT => GetTransformFeedbackVarying
  GetUniformLocationARB => GetUniformLocation
  GetUniformfvARB => GetUniformfv
  GetUniformivARB => GetUniformiv
  GetUniformuivEXT => GetUniformuiv
  GetVertexAttribIivEXT => GetVertexAttribIiv
  GetVertexAttribIuivEXT => GetVertexAttribIuiv
  GetVertexAttribPointervNV => GetVertexAttribPointerv
  GetVertexAttribdvARB => GetVertexAttribdv
  GetVertexAttribfvARB => GetVertexAttribfv
  GetVertexAttribivARB => GetVertexAttribiv
  IsBufferARB => IsBuffer
  IsEnabledIndexedEXT => IsEnabledi
  IsFramebufferEXT => IsFramebuffer
  IsProgramNV => IsProgramARB
  IsQueryARB => IsQuery
  IsRenderbufferEXT => IsRenderbuffer
  IsVertexArrayAPPLE => IsVertexArray
  LightModelxOES => LightModelx
  LightModelxvOES => LightModelxv
  LightxOES => Lightx
  LightxvOES => Lightxv
  LineWidthxOES => LineWidthx
  LinkProgramARB => LinkProgram
  LoadMatrixxOES => LoadMatrixx
  LoadTransposeMatrixdARB => LoadTransposeMatrixd
  LoadTransposeMatrixfARB => LoadTransposeMatrixf
  MapBufferARB => MapBuffer
  MaterialxOES => Materialx
  MaterialxvOES => Materialxv
  MultMatrixxOES => MultMatrixx
  MultTransposeMatrixdARB => MultTransposeMatrixd
  MultTransposeMatrixfARB => MultTransposeMatrixf
  MultiDrawArraysEXT => MultiDrawArrays
  MultiTexCoord1dARB => MultiTexCoord1d
  MultiTexCoord1dvARB => MultiTexCoord1dv
  MultiTexCoord1iARB => MultiTexCoord1i
  MultiTexCoord1ivARB => MultiTexCoord1iv
  MultiTexCoord1sARB => MultiTexCoord1s
  MultiTexCoord1svARB => MultiTexCoord1sv
  MultiTexCoord2dARB => MultiTexCoord2d
  MultiTexCoord2dvARB => MultiTexCoord2dv
  MultiTexCoord2iARB => MultiTexCoord2i
  MultiTexCoord2ivARB => MultiTexCoord2iv
  MultiTexCoord2sARB => MultiTexCoord2s
  MultiTexCoord2svARB => MultiTexCoord2sv
  MultiTexCoord3dARB => MultiTexCoord3d
  MultiTexCoord3dvARB => MultiTexCoord3dv
  MultiTexCoord3iARB => MultiTexCoord3i
  MultiTexCoord3ivARB => MultiTexCoord3iv
  MultiTexCoord3sARB => MultiTexCoord3s
  MultiTexCoord3svARB => MultiTexCoord3sv
  MultiTexCoord4dARB => MultiTexCoord4d
  MultiTexCoord4dvARB => MultiTexCoord4dv
  MultiTexCoord4iARB => MultiTexCoord4i
  MultiTexCoord4ivARB => MultiTexCoord4iv
  MultiTexCoord4sARB => MultiTexCoord4s
  MultiTexCoord4svARB => MultiTexCoord4sv
  MultiTexCoord4xOES => MultiTexCoord4x
  Normal3xOES => Normal3x
  OrthofOES => Orthof
  OrthoxOES => Orthox
  PointParameterfEXT => PointParameterf
  PointParameterfvEXT => PointParameterfv
  PointParameteriNV => PointParameteri
  PointParameterivNV => PointParameteriv
  PointParameterxOES => PointParameterx
  PointParameterxvOES => PointParameterxv
  PointSizexOES => PointSizex
  PolygonOffsetxOES => PolygonOffsetx
  PrimitiveRestartIndexNV => PrimitiveRestartIndex
  ProvokingVertexEXT => ProvokingVertex
  RenderbufferStorageEXT => RenderbufferStorage
  RotatexOES => Rotatex
  SampleCoverageARB => SampleCoverage
  SampleCoveragexOES => SampleCoveragex
  ScalexOES => Scalex
  SecondaryColor3bEXT => SecondaryColor3b
  SecondaryColor3bvEXT => SecondaryColor3bv
  SecondaryColor3dEXT => SecondaryColor3d
  SecondaryColor3dvEXT => SecondaryColor3dv
  SecondaryColor3iEXT => SecondaryColor3i
  SecondaryColor3ivEXT => SecondaryColor3iv
  SecondaryColor3sEXT => SecondaryColor3s
  SecondaryColor3svEXT => SecondaryColor3sv
  SecondaryColor3ubEXT => SecondaryColor3ub
  SecondaryColor3ubvEXT => SecondaryColor3ubv
  SecondaryColor3uiEXT => SecondaryColor3ui
  SecondaryColor3uivEXT => SecondaryColor3uiv
  SecondaryColor3usEXT => SecondaryColor3us
  SecondaryColor3usvEXT => SecondaryColor3usv
  SecondaryColorPointerEXT => SecondaryColorPointer
  ShaderSourceARB => ShaderSource
  TexBufferARB => TexBuffer
  TexEnvxOES => TexEnvx
  TexEnvxvOES => TexEnvxv
  TexParameterIivEXT => TexParameterIiv
  TexParameterIuivEXT => TexParameterIuiv
  TexParameterxOES => TexParameterx
  TexParameterxvOES => TexParameterxv
  TransformFeedbackVaryingsEXT => TransformFeedbackVaryings
  TranslatexOES => Translatex
  Uniform1fARB => Uniform1f
  Uniform1fvARB => Uniform1fv
  Uniform1iARB => Uniform1i
  Uniform1ivARB => Uniform1iv
  Uniform1uiEXT => Uniform1ui
  Uniform1uivEXT => Uniform1uiv
  Uniform2fARB => Uniform2f
  Uniform2fvARB => Uniform2fv
  Uniform2iARB => Uniform2i
  Uniform2ivARB => Uniform2iv
  Uniform2uiEXT => Uniform2ui
  Uniform2uivEXT => Uniform2uiv
  Uniform3fARB => Uniform3f
  Uniform3fvARB => Uniform3fv
  Uniform3iARB => Uniform3i
  Uniform3ivARB => Uniform3iv
  Uniform3uiEXT => Uniform3ui
  Uniform3uivEXT => Uniform3uiv
  Uniform4fARB => Uniform4f
  Uniform4fvARB => Uniform4fv
  Uniform4iARB => Uniform4i
  Uniform4ivARB => Uniform4iv
  Uniform4uiEXT => Uniform4ui
  Uniform4uivEXT => Uniform4uiv
  UniformMatrix2fvARB => UniformMatrix2fv
  UniformMatrix3fvARB => UniformMatrix3fv
  UniformMatrix4fvARB => UniformMatrix4fv
  UnmapBufferARB => UnmapBuffer
  UseProgramObjectARB => UseProgram
  ValidateProgramARB => ValidateProgram
  VertexAttrib1dARB => VertexAttrib1d
  VertexAttrib1dvARB => VertexAttrib1dv
  VertexAttrib1sARB => VertexAttrib1s
  VertexAttrib1svARB => VertexAttrib1sv
  VertexAttrib2dARB => VertexAttrib2d
  VertexAttrib2dvARB => VertexAttrib2dv
  VertexAttrib2sARB => VertexAttrib2s
  VertexAttrib2svARB => VertexAttrib2sv
  VertexAttrib3dARB => VertexAttrib3d
  VertexAttrib3dvARB => VertexAttrib3dv
  VertexAttrib3sARB => VertexAttrib3s
  VertexAttrib3svARB => VertexAttrib3sv
  VertexAttrib4NbvARB => VertexAttrib4Nbv
  VertexAttrib4NivARB => VertexAttrib4Niv
  VertexAttrib4NsvARB => VertexAttrib4Nsv
  VertexAttrib4NubARB => VertexAttrib4Nub
  VertexAttrib4NubvARB => VertexAttrib4Nubv
  VertexAttrib4NuivARB => VertexAttrib4Nuiv
  VertexAttrib4NusvARB => VertexAttrib4Nusv
  VertexAttrib4bvARB => VertexAttrib4bv
  VertexAttrib4dARB => VertexAttrib4d
  VertexAttrib4dvARB => VertexAttrib4dv
  VertexAttrib4ivARB => VertexAttrib4iv
  VertexAttrib4sARB => VertexAttrib4s
  VertexAttrib4svARB => VertexAttrib4sv
  VertexAttrib4ubvARB => VertexAttrib4ubv
  VertexAttrib4uivARB => VertexAttrib4uiv
  VertexAttrib4usvARB => VertexAttrib4usv
  VertexAttribDivisorARB => VertexAttribDivisor
  VertexAttribI1ivEXT => VertexAttribI1iv
  VertexAttribI1uivEXT => VertexAttribI1uiv
  VertexAttribI4bvEXT => VertexAttribI4bv
  VertexAttribI4svEXT => VertexAttribI4sv
  VertexAttribI4ubvEXT => VertexAttribI4ubv
  VertexAttribI4usvEXT => VertexAttribI4usv
  VertexAttribIPointerEXT => VertexAttribIPointer
  VertexAttribPointerARB => VertexAttribPointer
  WindowPos2dMESA => WindowPos2d
  WindowPos2dvMESA => WindowPos2dv
  WindowPos2fMESA => WindowPos2f
  WindowPos2fvMESA => WindowPos2fv
  WindowPos2iMESA => WindowPos2i
  WindowPos2ivMESA => WindowPos2iv
  WindowPos2sMESA => WindowPos2s
  WindowPos2svMESA => WindowPos2sv
  WindowPos3dMESA => WindowPos3d
  WindowPos3dvMESA => WindowPos3dv
  WindowPos3fMESA => WindowPos3f
  WindowPos3fvMESA => WindowPos3fv
  WindowPos3iMESA => WindowPos3i
  WindowPos3ivMESA => WindowPos3iv
  WindowPos3sMESA => WindowPos3s
  WindowPos3svMESA => WindowPos3sv

Acked-by: Brian Paul <brianp@vmware.com>
2012-11-12 10:53:57 -08:00
Michel Dänzer
7708a86464 radeonsi: Implement alpha testing in pixel shader.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-12 15:45:42 +01:00
Michel Dänzer
e44dfd4b3c radeonsi: Initialize uses_kill boolean from TGSI info.
Fixes discarded pixels incorrectly updating the depth buffer.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-12 15:45:42 +01:00
Vincent Lejeune
557d4918ad glsl: store read vector in a temp in vec_index_to_cond
Vector indexing on matrices generates several copies of the
constant matrix; for instance, vec=mat4[i][j] generates:
vec=mat4[i].x;
vec=(j==1)?mat4[i].y;
vec=(j==2)?mat4[i].z;
vec=(j==3)?mat4[i].w;
In the case of constant matrices, the mat4[i] expression generates
a copy of the 16 elements of the matrix 4 times; indirect addressing
also prevents some conservative CSE algorithms (like the one in LLVM)
from factoring out the mat4[i] expression.
This patch makes the vec_index_to_cond pass generate:
temp = mat4[i];
vec=temp.x;
vec=(j==1)?temp.y;
vec=(j==2)?temp.z;
vec=(j==3)?temp.w;
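
The lowering described above can be emulated in a few lines of Python. This is
only an illustrative sketch, not the GLSL IR pass itself: the function name
index_via_conditions and the list-of-lists matrix are assumptions made for the
example. It reads mat[i] into a temporary once and then selects the component
with a chain of comparisons against j instead of a dynamic index.

```python
def index_via_conditions(mat, i, j):
    """Emulate vec = mat[i][j] without dynamically indexing the vector."""
    temp = mat[i]      # read the indexed vector once into a temporary
    vec = temp[0]      # unconditional default: the .x component
    if j == 1:         # each later component is a conditional overwrite
        vec = temp[1]
    if j == 2:
        vec = temp[2]
    if j == 3:
        vec = temp[3]
    return vec


if __name__ == "__main__":
    m = [[10 * r + c for c in range(4)] for r in range(4)]
    print(index_via_conditions(m, 2, 3))
```

Because mat[i] is evaluated exactly once, a constant matrix only has to be
materialized once rather than once per component.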

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-11 22:20:15 +01:00
Marek Olšák
05a2f66cde gallium/u_blitter: handle PIPE_TEXTURE_CUBE_ARRAY in is_box_inside_resource 2012-11-11 13:33:01 +01:00
Andreas Boll
5ecbc3a9e8 build: fix enable/disable language in ./configure --help
Based on patch from Brian Paul.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32317
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-10 21:32:09 +01:00
Kenneth Graunke
e639385064 i965: Fix AA Line Distance Mode in 3DSTATE_SF on Ivybridge.
We were accidentally setting bit 14 in DWord 2 (which is Reserved/MBZ)
rather than bit 14 in DWord 3 (which is AA Line Distance Mode).

There's also no reason to ever set it to legacy mode; the bit is only
used when drawing antialiased lines anyway.  Set it unconditionally.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-10 12:14:23 -08:00
Ian Romanick
5581954c3a dri_util: Fix prologue comment for driCreateConfigs
The parameters and operation of this function changed, but I didn't
bother to change the prologue comment.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-09 18:38:19 -08:00
Ian Romanick
3ec3201f31 swrast: swrastFillInModes doesn't do 8-bit modes, so don't try
Support for 8-bit modes was removed in commits 0398a26 and bda208a4.
However, I didn't notice code in dri_init_screen that explicitly tries
to create these modes.  This is structurally different from other drivers
(that only create modes that match the display color depth).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56555
Cc: Vinson Lee <vlee@freedesktop.org>
2012-11-09 18:38:19 -08:00
Darren Salt
d2a6dd9a95 Fix use of glsl_parser.{cc,h} where source dir != build dir.
Fixes a regression caused by commit 9948a3365.

https://bugs.freedesktop.org/show_bug.cgi?id=56787
https://bugs.freedesktop.org/show_bug.cgi?id=56685
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-09 16:56:35 -08:00
Brian Paul
2951a9dd51 Revert "mesa: assert that key->fragprog_inputs_read value isn't too large"
This reverts commit 0d61f879a1.

Assigning the FS inputs to the 12 bit field is fine since we don't care
about the higher FS inputs.  Maybe I'll revisit silencing the compiler
warning another day.
2012-11-09 16:31:22 -07:00
Matt Turner
c6f426c02d glcpp: wire up glcpp-test to make check
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-09 14:33:08 -08:00
Matt Turner
68414bc868 glcpp/tests: Add tests for multiline #elif
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-09 14:33:08 -08:00
Matt Turner
28e397660c glcpp/tests: Add test for multiline #if
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-09 14:33:08 -08:00
Matt Turner
b44423cf75 glcpp/tests: Add test for multiline #line
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-09 14:33:08 -08:00
Matt Turner
c3a15d9a35 glcpp/tests: Add test to check #line followed by code
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51802
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51506
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41152
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-09 14:33:08 -08:00
Fabian Bieler
9ad71c44fa glcpp: don't push #line directives into next line
By moving the HASH_LINE rule out of control_line: and into line:, we avoid
adding control_line's additional \n (as seen in the first hunk).

mattst88: Carl and I determined independently of Fabian that the 091
test needed to be modified identically to this, and our patch to fix the
test was more complicated.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51506
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-09 14:33:08 -08:00
Matt Turner
060e696799 glcpp: Reject garbage after #else and #endif tokens
Previously we were accepting garbage after #else and #endif tokens when
the previous preprocessor conditional evaluated to false (eg, #if 0).

When the preprocessor hits a false conditional, it switches the lexer
into the SKIP state, in which it ignores non-control tokens. The parser
pops the SKIP state off the stack when it reaches the associated #elif,
#else, or #endif. Unfortunately, that meant that it only left the SKIP
state after lexing the entire line containing the #token and thus
would accept garbage after the #token.

To fix this we use a mid-rule, which is executed immediately after the
#token is parsed.

NOTE: This is a candidate for the stable branch
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56442
Fixes: preprocess17_frag.test from oglconform
Reviewed-by: Carl Worth <cworth@cworth.org> (glcpp-parse.y)
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-09 14:33:08 -08:00
Dave Airlie
afcaa03f7e r600g: fix printk warnings
Brian reported seeing:
r600_texture.c: In function ‘r600_texture_create_object’:
r600_texture.c:468:12: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 3 has type ‘uint64_t’
r600_texture.c:468:12: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’
r600_texture.c:485:12: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 3 has type ‘uint64_t’
r600_texture.c:485:12: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’

this should wrap over them fine.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-10 06:39:38 +10:00
Dave Airlie
aafdc5bda4 softpipe: fix unused variable warning.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-10 06:39:38 +10:00
Dave Airlie
add3a0709f gallium: fix unused cap warnings in drivers for cube map array cap.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-10 06:39:38 +10:00
Dave Airlie
eb44c36df8 r600g: add initial cube map array support (v2)
This contains the evergreen support.

Support is possible on rv670 upwards and the code in here
should work, but it doesn't and I haven't debugged it to
figure out why.

Beyond just adding support for the cube map array sampling,
r600 resinfo isn't conformant with the GL specification,
which states the number of layers should be returned for
the textureSize, so we have to track in an external
constant buffer the layers for each sampler if we need
them in the shader.

v2: only update the sampler constants if the sampler views have changed,
as suggested by Marek.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-10 06:20:46 +10:00
Dave Airlie
e9cf40142d u_blitter: fix cube array check
Pointed out by Marek on irc,

no committing after beer!

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-10 06:20:45 +10:00
José Fonseca
5dbc84ecb0 util/u_surface: Support 3D/array textures in util_resource_copy_region().
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-09 13:43:19 +00:00
José Fonseca
c84dd7a940 draw: Remove redundant draw_geometry_shader_delete().
draw_delete_geometry_shader() seems to be the real one.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-09 13:43:19 +00:00
José Fonseca
dc53e1b410 trace: Support geometry shaders.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-09 13:43:19 +00:00
José Fonseca
bbb48a4a55 util/u_surface: Fix util_clear_depth_stencil for Z32_FLOAT_S8X24_UINT.
util_pack_z_stencil was being unconditionally invoked for all formats,
causing an assertion failure for Z32_FLOAT_S8X24_UINT.

NOTE: Candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-09 13:43:19 +00:00
José Fonseca
49dff2cb05 galahad: Support geometry shader / stream-output methods.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-09 13:43:19 +00:00
José Fonseca
658b73a246 softpipe,util: Fix blending of R and RG formats.
Alpha is also 1 for formats like R32G32_FLOAT.

NOTE: Candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-09 13:43:19 +00:00
José Fonseca
48ce928900 softpipe: Fix rgb_dst_factor == PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE.
We must multiply the factor against the destination, not the source.
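
As a rough illustration of the arithmetic (a sketch only, not softpipe's code;
the function name dst_term_src_alpha_saturate is made up for this example):
the saturate factor for RGB is f = min(As, 1 - Ad), and when it is chosen as
the destination factor, it scales the destination colour.

```python
def dst_term_src_alpha_saturate(src_alpha, dst_rgb, dst_alpha):
    """Destination contribution when rgb_dst_factor is SRC_ALPHA_SATURATE."""
    f = min(src_alpha, 1.0 - dst_alpha)   # f = min(As, 1 - Ad)
    # The factor multiplies the destination colour, not the source colour.
    return tuple(c * f for c in dst_rgb)


if __name__ == "__main__":
    print(dst_term_src_alpha_saturate(0.5, (1.0, 0.5, 0.0), 0.75))
```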

NOTE: Candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-09 13:43:19 +00:00
José Fonseca
e5f0ae0bd8 tgsi: Lift the requirement of indirection being done by ADDR register.
For drivers with native integer / SM4 support this is just a hindrance.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-09 13:43:19 +00:00
José Fonseca
7e112c604e util: Fix reduction of line adjacency primitives.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-09 13:43:19 +00:00
José Fonseca
b7283834dc softpipe: Handle adjacency primitives.
Not fully tested.

Based on diagrams from
http://msdn.microsoft.com/en-us/library/windows/desktop/bb205124.aspx#Primitive_Adjacency

v2: Fix based on Brian's feedback.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-09 13:43:19 +00:00
José Fonseca
5d12c7b755 util/u_rect: Make it C++ safe.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-09 13:43:18 +00:00
Dave Airlie
1d9738dab3 u_blitter: don't create fragment program for cube maps unless supported.
should fix http://bugs.freedesktop.org/56906

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-09 20:34:23 +10:00
Mario Kleiner
eabbe5c45f mesa: Don't glPopAttrib() GL_POINT_SPRITE_COORD_ORIGIN on < OpenGL-2.0
The GL_POINT_BIT state attribute GL_POINT_SPRITE_COORD_ORIGIN
is only supported on OpenGL-2.0 or later. Prevent glPopAttrib()
from trying to restore it on OpenGL-1.4 implementations which
support GL_ARB_POINT_SPRITE, as otherwise the sequence...

glPushAttrib(GL_POINT_BIT);
glPopAttrib();

throws a GL_INVALID_ENUM error in glPopAttrib().

See also commit f778174ea1

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-11-08 22:53:54 -08:00
Kenneth Graunke
c299f44782 mesa: Fix glGetVertexAttribI[u]iv now that we have real integer attribs.
Since cf438f5375e242, we store actual integers for the attribute data.
We just need to reinterpret the GLfloat array as a GLint/GLuint array
so we can read the proper data.
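
The difference between reinterpreting storage and converting values can be
shown with Python's struct module. This is a sketch under stated assumptions,
not Mesa code: the helper names are invented, and the 16-byte buffer stands in
for the four-component attribute storage. Reading the bytes back as integers
recovers the data, while reading them as floats and converting by value does
not.

```python
import struct


def write_int_attrib(values):
    """Store four 32-bit integers in 16 bytes of raw attribute storage."""
    return struct.pack("=4i", *values)


def read_as_ints(storage):
    """Reinterpret the same 16 bytes as four signed 32-bit integers."""
    return struct.unpack("=4i", storage)


def read_as_floats_then_convert(storage):
    """View the bytes as floats, then convert each float to int by value."""
    return tuple(int(f) for f in struct.unpack("=4f", storage))


if __name__ == "__main__":
    raw = write_int_attrib((1, 2, 3, 4))
    print(read_as_ints(raw))
    print(read_as_floats_then_convert(raw))
```

Small integer bit patterns look like tiny denormal floats, so the value
conversion truncates them to zero; only the reinterpreting read returns what
was stored.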

Fixes oglconform's glsl-vertex-attrib/basic.VertexAttribI[1234][u]i
subtests (after fixing an unrelated bug in those test cases).

v2: Use the COPY_4V macro to be concise.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <maraeo@gmail.com> [v1]
2012-11-08 22:53:54 -08:00
Kenneth Graunke
6ccfa1c543 mesa: Fix typo in glDeleteQueriesARB debug message.
"Deleete" all the extra letters!
2012-11-08 22:53:39 -08:00
Vinson Lee
2aa783318d svga: Fix memory leak in svga_buffer_transfer_map.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-08 21:28:15 -08:00
Dave Airlie
2c8f088132 docs: update with ARB_texture_cube_map_array support
just mention softpipe is done, r600g will come soon.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-09 10:58:49 +10:00
Dave Airlie
308a03f1ab u_blitter: add cube map array support.
This adds cube array support to the blitter.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-09 10:29:26 +10:00
Dave Airlie
309fda2fb2 softpipe: add ARB_texture_cube_map_array support (v1.1)
This adds support to the softpipe texture sampler and tgsi exec.

In order to handle the extra input to the texture sampling,
I've had to expand the interfaces to take a c1 value for storing
the texture compare value for the TEX2 case.

v1.1: add comments (Brian)

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-09 10:29:26 +10:00
Dave Airlie
8c0ccce300 st/mesa: add support for ARB_texture_cube_map_array (v2)
This adds mesa state tracker support for the new extension,
along with glsl->tgsi conversion to use the new opcodes
where appropriate.

v2: fix assert found running textureSize tests.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-09 10:29:13 +10:00
Dave Airlie
c4427ceff7 gallium: add defines/shader opcode for texture cube map array
This just adds the texture target and capability along
with 3 new opcodes required to support this extension.

As this extension requires some texture opcodes with samp + 5 args,
we need to use another src register, this is only required
for TEX, TXL and TXB opcodes to implement this spec.

TEX2 is required for shadow cube map arrays
TXL2 is required for cube map array sampler + explicit lod
TXB2 is required for cube map array sampler + lod bias

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-09 10:26:37 +10:00
Dave Airlie
5b115864d2 mesa: arb_texture_cube_map_array: fix attrib push/pop
fdo9833 piglit test caught this.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-09 10:26:37 +10:00
Dave Airlie
4c8750015b glsl: add ARB_texture_cube_map_array support (v2)
This adds all the new builtins + the new sampler types,
and hooks them up if the extension is supported.

v2: fix missing signatures for grad/lod
fix missing textureSize clarifications
fix compare vs starts with usage

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-09 10:26:33 +10:00
Dave Airlie
2c52c0e1ce mesa: add get support for TEXTURE_BINDING_CUBE_MAP_ARRAY_ARB
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-09 09:24:41 +10:00
Dave Airlie
e0e7e29554 mesa: add fbo/texture support for ARB_texture_cube_map_array (v2)
This adds the mesa core + texture + fbo support for the
texture cube map array extension.

v2:
add comment to _mesa_num_tex_faces related to cube map arrays (Brian)
drop wrong comment cut-n-paste (Brian)
fix / 6 maximum check issue (Kenneth)
coalesce some array case statements (Kenneth)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-09 09:24:41 +10:00
Dave Airlie
5a5a80e021 mesa: add ARB_texture_cube_map_array extension bits
This just adds the bit + extension name.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-09 09:24:41 +10:00
Dave Airlie
d078c4fb92 glapi: add ARB_texture_cube_map_array.
This adds the ARB_texture_cube_map_array enums.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-09 09:24:41 +10:00
Dave Airlie
037b4f8038 r600g: fix lod bias/explicit lod with cube maps.
While developing cube map array support I found that we didn't
support this properly, also piglit didn't test for it at all.

I've submitted a test to piglit to check for this, and this
fixes explicit lod and lod bias with cube maps.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-09 09:24:40 +10:00
Dave Airlie
7356579540 r600g: clarify const buffer numbering and handling
For cube map arrays I'll need another driver-private constant
buffer, and UBOs are also on the way. So clean up with some
defines that can be modified when adding cube map arrays and UBOs
later.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-09 09:24:40 +10:00
Eric Anholt
2fcaf4eae8 i965: Fix slow leak of brw->wm.compile_data->store
We were successfully freeing our compile data at context destroy, but until
then we were allocating a new store every compile without freeing it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56019
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-08 14:50:32 -08:00
Eric Anholt
177c82555b i965/fs: Add support for global copy propagation.
It is common for complicated shaders, particularly code-generated ones, to
have a big array of uniforms or attributes, and a prologue in the shader that
dereferences from the big array to more informatively-named local variables.
Then there will be some small control flow operation (like a ? : statement),
and then use of those informatively-named variables.  We were emitting extra
MOVs in these cases, because copy propagation couldn't reach across control
flow.

Instead, implement dataflow analysis on the output of the first copy
propagation pass and re-run it to propagate those extra MOVs out.

On one future Steam release, reduces VS+FS instruction count from 42837 to
41437.  No statistically significant performance difference (n=48), though, at
least at the low resolution I'm running it at.

shader-db results:

total instructions in shared programs: 722170 -> 702545 (-2.72%)
instructions in affected programs:     260618 -> 240993 (-7.53%)

Some shaders do get hurt by up to 2 instructions, because a choice to copy
propagate instead of coalesce or something like that results in a dead write
sticking around.  Given that we already have instances of those instructions
in the affected programs (particularly unigine), we should just improve dead
code elimination to fix the problem.
2012-11-08 14:50:32 -08:00
Dave Airlie
9785ae0973 glsl_to_tgsi: fix dst register for texturing fetches.
I've no idea why there isn't a piglit that triggers this behaviour,
but while enabling TBOs for softpipe and r600g, I noticed all the
integer tests failed. I tracked it back to the TXF returning a float
when it should be returning an int. This fixed it and I haven't
seen any regressions in a full piglit run on softpipe.

http://bugs.freedesktop.org/55010

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-09 06:05:54 +10:00
Vincent Lejeune
e6b3858c89 r600g: fix pre eg export with llvm
Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-11-08 13:21:57 +01:00
Vinson Lee
4cb8b946d9 i965: Fix assertion in brw_alu3.
Fixes side effect in assertion defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-07 22:54:26 -08:00
Jonas Ådahl
a3b6b2d305 wayland: Destroy frame callback when destroying surface
If a frame callback is not destroyed when destroying a surface, its
handler function will be invoked if the surface was destroyed after the
callback was requested but before it was invoked, causing a write to
freed memory.

This can happen if eglDestroySurface() is called shortly after
eglSwapBuffers().

Note: This is a candidate for stable branches.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2012-11-07 16:13:03 -05:00
Alex Deucher
0b61f0b148 r600g/compute: fix call to r600_bytecode_init
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-11-07 11:51:16 -05:00
Kenneth Graunke
65faedb0d9 mesa: Remove PROG_EMIT_VERTEX and PROG_END_PRIMITIVE opcodes.
These were only used for geometry shader support back in the days before
the new GLSL compiler.  Future geometry shader support will not use
these.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-11-07 00:00:46 -08:00
Vinson Lee
57049219f5 svga: Ensure vb_transfer in svga_swtnl_draw_vbo is initialized.
Fixes an uninitialized pointer read defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-06 23:33:00 -08:00
Vinson Lee
5cbc0f0036 scons: Build src/mesa/main/es1_conversion.c for all builds.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2012-11-06 23:16:29 -08:00
Fredrik Höglund
f42518962a egl_dri2/x11: Fix eglPostSubBufferNV()
This got broken in commit 0a523a8820.

NOTE: This is a candidate for the 9.0 branch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55856
2012-11-07 00:51:09 +01:00
Paul Berry
91b828ea74 dispatch: Delete unused init_dispatch functions.
The new code-generated version of _mesa_create_exec_table() populates
the entire dispatch table (except for dynamic functions) by itself; it
no longer calls separate functions to initialize parts of the dispatch
table.  This patch removes those no-longer-needed functions.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:46 -08:00
Paul Berry
98874ec30b dispatch: Code generate api_exec.c.
This patch adjusts makefiles to cause src/mesa/main/api_exec.c to be
generated using src/mapi/glapi/gen/gl_genexec.py.  There should be no
functional change.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:45 -08:00
Paul Berry
38a1039a42 glapi/gen: Add code generation script for _mesa_create_exec_table().
This script generates the file api_exec.c, which contains just the
function _mesa_create_exec_table(), based on the XML files in
src/mapi/glapi/gen.

The following XML attributes, in particular, are used:
- "es1" indicates functions that should be available in ES1 contexts.
- "es2" indicates functions that should be available in ES2/ES3
  contexts.
- "exec" indicates which Mesa function should be dispatched to.  E.g.
  if the GL function is glFoo(), then:
  - exec="mesa" (the default) dispatches to _mesa_Foo().
  - exec="check" dispatches to _check_Foo().
  - exec="es" dispatches to _es_Foo().
  - exec="loopback" dispatches to loopback_Foo().
  - exec="skip" or exec="dynamic" causes this function to be skipped;
    either it is not yet supported ("skip"), or its dispatch table
    entry will be dynamically populated based on GL state ("dynamic").
- "desktop" indicates functions that should be available in desktop GL
  (non-ES) contexts.
- "deprecated" indicates functions that should not be available in
  core contexts.
- "mesa_name" indicates functions whose implementation in Mesa has a
  different suffix than the corresponding GL function name.

The generated code looks roughly like this (showing just a single
statement in each block for brevity):

    struct _glapi_table *
    _mesa_create_exec_table(struct gl_context *ctx)
    {
       struct _glapi_table *exec;

       exec = _mesa_alloc_dispatch_table(_gloffset_COUNT);
       if (exec == NULL)
          return NULL;

       if (_mesa_is_desktop_gl(ctx)) {
          SET_ActiveProgramEXT(exec, _mesa_ActiveProgramEXT);
          /* other functions not shown */
       }
       if (_mesa_is_desktop_gl(ctx) || _mesa_is_gles3(ctx)) {
          SET_BeginQueryARB(exec, _mesa_BeginQueryARB);
          /* other functions not shown */
       }
       if (_mesa_is_desktop_gl(ctx) || ctx->API == API_OPENGLES) {
          SET_GetPointerv(exec, _mesa_GetPointerv);
          /* other functions not shown */
       }
       if (_mesa_is_desktop_gl(ctx) || ctx->API == API_OPENGLES || ctx->API == API_OPENGLES2) {
          SET_ActiveTextureARB(exec, _mesa_ActiveTextureARB);
          /* other functions not shown */
       }
       if (_mesa_is_desktop_gl(ctx) || ctx->API == API_OPENGLES2) {
          SET_AttachShader(exec, _mesa_AttachShader);
          /* other functions not shown */
       }
       if (ctx->API == API_OPENGL) {
          SET_Accum(exec, _mesa_Accum);
          /* other functions not shown */
       }
       if (ctx->API == API_OPENGL || ctx->API == API_OPENGLES) {
          SET_AlphaFunc(exec, _mesa_AlphaFunc);
          /* other functions not shown */
       }
       if (ctx->API == API_OPENGLES) {
          SET_AlphaFuncxOES(exec, _es_AlphaFuncx);
          /* other functions not shown */
       }

       return exec;
    }
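
The "if" conditions in the generated code above can be derived from the
XML attributes roughly as sketched below.  The function api_condition()
and its parameters are assumptions made for illustration; the real
script may organize this differently.

```python
# Hypothetical sketch of building the condition that guards a block of
# SET_*() calls.  Parameter names are illustrative only.
def api_condition(desktop=True, deprecated=False, es1=False, es2_version=None):
    parts = []
    if desktop:
        # Deprecated functions are limited to non-core desktop contexts.
        if deprecated:
            parts.append("ctx->API == API_OPENGL")
        else:
            parts.append("_mesa_is_desktop_gl(ctx)")
    if es1:
        parts.append("ctx->API == API_OPENGLES")
    if es2_version is not None:
        # Functions that first appear after ES 2.0 are treated as ES3-only.
        if es2_version > 2.0:
            parts.append("_mesa_is_gles3(ctx)")
        else:
            parts.append("ctx->API == API_OPENGLES2")
    return " || ".join(parts)


# Same shape as the condition guarding SET_AlphaFunc above.
print(api_condition(desktop=True, deprecated=True, es1=True))
```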

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:45 -08:00
Paul Berry
679df028e7 glapi/gen: handle new XML attributes.
This patch updates gl_XML.py to parse the new XML attributes "exec",
"desktop", "deprecated", and "mesa_name", which will be needed to code
generate _mesa_create_exec_table().

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:44 -08:00
Paul Berry
91b5a741f6 glapi/gen: Gather API version info across aliased functions.
gl_XML.py's gl_function class keeps track of an entry_point_api_map
property that tracks, for each set of aliased functions, which ES1 or
ES2 version the given function name first appeared in.

This patch aggregates that information together across aliased
functions, into an easier-to-use api_map property.

Future patches will use this information when code generating
_mesa_create_exec_table(), to determine which set of dispatch table
entries should be populated based on the API.
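
A minimal sketch of the aggregation, assuming entry_point_api_map is a
dict keyed by function name whose values map an API ("es1", "es2") to
the version in which that name first appeared, and assuming the
earliest version wins.  The function names in the sample data are
hypothetical.

```python
# Hypothetical sketch; the data layout and the "earliest version wins"
# rule are assumptions, not taken from gl_XML.py.
def aggregate_api_map(entry_point_api_map):
    api_map = {}
    for per_name in entry_point_api_map.values():
        for api, version in per_name.items():
            if api not in api_map or version < api_map[api]:
                api_map[api] = version
    return api_map


# Two aliased names, each known to a different ES API.
sample = {
    "Foo": {"es2": 3.0},
    "FooEXT": {"es1": 1.1, "es2": 2.0},
}
print(aggregate_api_map(sample))
```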

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:44 -08:00
Paul Berry
ccd872824b glapi/gen: Comment fix.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:44 -08:00
Paul Berry
f7fa946d1d dispatch: Make all API functions non-static.
Some of the functions that we store in the dispatch table are declared
as non-static in their .c files and are inserted into the dispatch
table directly by _mesa_create_exec_table().  Other functions are
declared as static, and are inserted into the dispatch table by a
dedicated function that lives in the same .c file
(e.g. _mesa_loopback_init_api_table() in api_loopback.c).

This patch makes all of these functions non-static, and creates
appropriate prototypes for them, so that in future patches we can
populate the entire dispatch table using a single code-generated
function.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:43 -08:00
Paul Berry
e41d1a4e74 glapi: Annotate XML with function name suffix anomalies.
When the XML lists one or more GL api functions as aliases for another
GL function, the mesa function that implements the functionality is
usually named after the canonical version of the function (the one
that is the target of the aliases).  For example, FogCoordd is listed
as an alias of FogCoorddEXT, and the Mesa function implementing the
functionality is called loopback_FogCoorddEXT.

However, there are exceptions.  For example, Enablei is listed as an
alias of EnableIndexedEXT, but the Mesa function implementing the
functionality is called _mesa_EnableIndexed.

To account for these anomalies, this patch annotates the XML with
"mesa_name" attributes, which describe how to adjust the function name
to find the corresponding Mesa function.

For example:

  <function name="EnableIndexedEXT" mesa_name="-EXT">...</function>
  <function name="IsProgramNV" mesa_name="-NV+ARB">...</function>

means that EnableIndexedEXT is implemented by a Mesa function called
_mesa_EnableIndexed, and IsProgramNV is implemented by a Mesa function
called _mesa_IsProgramARB.
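
The two annotations above can be interpreted with a few lines of
Python.  This is only a sketch: apply_mesa_name() is a made-up helper,
and it assumes a "mesa_name" value is a sequence of "-SUFFIX" (strip)
and "+SUFFIX" (append) steps.

```python
import re


# Hypothetical helper, not the code from the generator scripts.
def apply_mesa_name(gl_name, mesa_name="", prefix="_mesa_"):
    name = gl_name
    for op, suffix in re.findall(r"([-+])([A-Za-z0-9_]+)", mesa_name):
        if op == "-":
            if not name.endswith(suffix):
                raise ValueError("%s does not end with %s" % (name, suffix))
            name = name[: -len(suffix)]
        else:
            name += suffix
    return prefix + name


print(apply_mesa_name("EnableIndexedEXT", "-EXT"))
print(apply_mesa_name("IsProgramNV", "-NV+ARB"))
```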

Future patches will use this annotation when code generating
_mesa_create_exec_table(), to determine the name of the Mesa function
that should be stored in each dispatch table entry.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:43 -08:00
Paul Berry
4b37fa8581 glapi: Annotate XML with desktop="false" for GLES-only functions.
Future patches will use this annotation when code generating
_mesa_create_exec_table(), to determine which functions should be
skipped when the API is desktop GL.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:42 -08:00
Paul Berry
3c474657f7 glapi: Annotate XML with exec="{es,check}" for special GLES1 functions.
Future patches will use this annotation when code generating
_mesa_create_exec_table(), to determine which functions should be
dispatched to ES-specific implementations.  exec="es" indicates that
the ES-specific implementation has a name beginning with "_es_"
(e.g. _es_QueryMatrixxOES), and exec="check" indicates that the
ES-specific implementation has a name beginning with "_check_"
(e.g. _check_GetTexGenxvOES).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:42 -08:00
Paul Berry
d1b2bd5191 glapi: Annotate XML with exec="loopback" for loopback functions.
Future patches will use this annotation when code generating
_mesa_create_exec_table(), to determine which functions should be
dispatched to functions in api_loopback.c.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:42 -08:00
Paul Berry
784d2f303c glapi: Annotate XML with exec="dynamic" for dynamic functions.
Future patches will use this annotation when code generating
_mesa_create_exec_table(), to determine which functions should be
skipped because Mesa dispatches them differently depending on GL
state.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:41 -08:00
Paul Berry
3464bce419 glapi: Annotate XML with exec="skip" for unimplemented functions.
Future patches will use this annotation when code generating
_mesa_create_exec_table(), to determine which functions should be
skipped because they aren't implemented by Mesa.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:41 -08:00
Paul Berry
89a577080f glapi: Annotate XML with deprecated="3.1" for deprecated functions.
Future patches will use this annotation when code generating
_mesa_create_exec_table(), to determine which functions should be
skipped in core contexts.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:41 -08:00
Paul Berry
11e9d8dd05 glapi: Mark GLX extensions as window_system="glX".
We were already doing this for some GLX extensions, but not others.
This patch makes our use of window_system="glX" consistent.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:40 -08:00
Paul Berry
e70b1a1379 glapi: Use GL_ or GLX_ prefix for all category names.
This patch standardizes the category names used in the glapi XML files
to begin each extension name with the prefix "GL_" or "GLX_".  There
is no functional change, because these category names are not used in
the generated code.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:40 -08:00
Paul Berry
5708e27113 dispatch: Remove a few FEATURE_ES1 conditionals.
This allows the GLES1.1 dispatch sanity test to be run on all builds,
even builds that do not include GLES1 support.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-11-06 12:57:39 -08:00
Brian Paul
0d61f879a1 mesa: assert that key->fragprog_inputs_read value isn't too large
fragprog_inputs_read is a 12-bit bitfield so check the assigned value.
MSVC warns on the assignment.  Not easy to fix but let's do a sanity check.
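
The check amounts to verifying that the value fits in 12 bits before it
is stored.  A small sketch of that range test (fits_in_bitfield() is a
made-up name; the actual assertion in Mesa is written in C):

```python
# Hypothetical sketch of the range test behind the assertion.
def fits_in_bitfield(value, bits):
    """True if value can be stored in an unsigned bitfield of this width."""
    return 0 <= value < (1 << bits)


print(fits_in_bitfield(0xFFF, 12))   # largest 12-bit value
print(fits_in_bitfield(0x1000, 12))  # needs a 13th bit
```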

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
406df38a66 mesa: fix MSVC signed/unsigned warnings in context.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
27d70b7266 mesa: fix MSVC signed/unsigned warnings in transformfeedback.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
085d81c370 swrast: fix MSVC signed/unsigned warnings
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
cb5fb15578 tnl: fix MSVC signed/unsigned warnings
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
5c05d28a43 mesa: silence MSVC signed/unsigned warning in texgetimage.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
0dddf592ed mesa: silence MSVC signed/unsigned warning in texstorage.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
15cb1a9029 vbo: use GLuint for numInstances to silence MSVC warnings
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
859c387603 mesa: fix signed/unsigned MSVC warnings in fbobject.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
d4e18764c6 mesa: s/GLint/GLuint/ in matrix.c to silence MSVC warnings
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
49cea4d40c mesa: s/int/GLuint/ in get.c to silence MSVC warnings
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
cc6c887cca mesa: fix assorted MSVC conversion warnings in format_pack.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
75f2ccf3a2 st/mesa: change glsl_to_tgsi_visitor from class to struct
To match the declaration in the .h file and silence an MSVC warning.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
80b3dfa704 st/mesa: add int cast to silence warning
MSVC warns that negating an unsigned value yields an unsigned value.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
ab8c5347f1 glsl: fix signed/unsigned comparison warnings on MSVC
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
78d3cfb5b4 glsl: remove incorrect 'struct' keyword
ir_variable is a class, not a struct.  Fixes an MSVC warning.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
e9dd5895dd glsl: add 'f' suffix to floats to silence MSVC warnings
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Brian Paul
c3466315c0 glsl: change int->unsigned to silence MSVC warnings
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-06 07:42:37 -07:00
Vinson Lee
e87a57843c scons: Require libdrm_radeon 2.4.40.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2012-11-05 22:00:01 -08:00
Marek Olšák
428e37c2da r600g: add in-place DB decompression and texturing with DB tiling
The decompression is done in-place and only the compressed tiles are
decompressed. Note: R6xx-R7xx can do that only with Z16 and Z32F.

The texture unit is programmed to use non-displayable tiling and depth
ordering of samples, so that it can fetch the texture in the native DB format.

The latest version of the libdrm surface allocator is required for stencil
texturing to work. The old one didn't create the mipmap tree correctly.
We need a separate mipmap tree for stencil, because the stencil mipmap
offsets are not really depth offsets/4.

There are still some known bugs, but this should save some memory and it also
improves performance a little bit in Lightsmark (especially with low
resolutions; tested with Radeon HD 5000).

The DB->CB copy is still used for transfers.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-11-06 02:54:16 +01:00
Marek Olšák
c80ceded6f configure.ac: require libdrm_radeon 2.4.40 2012-11-06 02:36:12 +01:00
Marek Olšák
acf438f537 vbo: fix glVertexAttribI* functions
The functions were broken, because they converted ints to floats.
Now we can finally advertise OpenGL 3.0. ;)

In this commit, the vbo module also tracks the type for each attrib
in addition to the size. It can be one of FLOAT, INT, UNSIGNED_INT.

One small ugliness is that the vertex attribs are declared as floats even
though they may hold integer values. The code just copies integer values into
them without any conversion.
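
To see why converting ints to floats breaks glVertexAttribI*, and why
copying the bits unchanged does not, here is a small sketch using 32-bit
slots.  The helper names are made up, and the slot is modelled as four
raw bytes rather than Mesa's actual attribute storage.

```python
import struct


# Hypothetical helpers; each "slot" stands in for one float-sized attrib.
def store_converted(value):
    """Convert the integer to a 32-bit float before storing it."""
    return struct.pack("<f", float(value))


def store_raw(value):
    """Copy the integer's bits into the float-sized slot unchanged."""
    return struct.pack("<i", value)


value = 2**24 + 1  # not exactly representable as a 32-bit float

converted = int(struct.unpack("<f", store_converted(value))[0])
raw = struct.unpack("<i", store_raw(value))[0]

print(value, converted, raw)
```

The converted path returns a different integer than the one supplied,
while the raw copy round-trips exactly.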

This implementation passes the glVertexAttribI piglit test which I am going
to commit in piglit soon. The test covers vertex arrays, immediate mode and
display lists.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>

v2: cosmetic changes as suggested by Brian
2012-11-06 01:13:48 +01:00
Anuj Phogat
a196f43596 meta: Remove redundant code in _mesa_meta_GenerateMipmap
Integer textures generate invalid operation in glGenerateMipmap.
So, the code related to integer textures is now redundant.

Note: This is a candidate for stable branches.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-05 10:17:48 -08:00
Anuj Phogat
c0a78d7d7b mesa: Generate invalid operation in glGenerateMipMap for integer textures
Khronos has reached a conclusion and disallowed the following texture formats
in glGenerateMipMap():
 (a) ASTC textures
 (b) integer internal formats (e.g., RGBA8UI, RG16I)
 (c) textures with stencil formats (e.g., STENCIL_INDEX8)
 (d) textures with packed depth/stencil formats (e.g., DEPTH24_STENCIL8)

https://cvs.khronos.org/bugzilla/show_bug.cgi?id=9471

Note: This is a candidate for stable branches.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-05 10:17:48 -08:00
José Fonseca
3700bd1158 trace: Prevent segfault when passing NULL to set_vertex_buffers.
State tracker now passes NULL buffer array to unbind buffers.
2012-11-05 11:18:07 +00:00
José Fonseca
99c45c5aa4 galahad: Prevent segfault when passing NULL to set_vertex_buffers.
State tracker now passes NULL buffer array to unbind buffers.
2012-11-05 11:05:34 +00:00
José Fonseca
f1034e944b util: Make u_framebuffer.h C++ safe. 2012-11-05 10:39:42 +00:00
Eric Anholt
ccbfe3dde9 mesa: Use "non-gen name" more consistently as an error message in GL core.
I used this to help verify that my test was actually testing the paths I
wanted to.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-11-04 12:28:46 -08:00
Eric Anholt
4fce0230fc mesa: Fix core GL genned-name handling for glBeginQuery().
Fixes piglit gl-3.1/genned-names.

NOTE: This is a candidate for the 9.0 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-11-04 12:28:36 -08:00
Eric Anholt
947d8ff4a7 mesa: Fix the core GL genned-name handling for glBindBufferBase()/Range().
This is part of fixing gl-3.1/genned-names.

v2: Fix a missing return value.

NOTE: This is a candidate for the 9.0 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-11-04 12:28:03 -08:00
Vandrus Zoltán
5ac46da588 i965: Fix oversized initial allocation of the state cache table pointers.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55030
2012-11-04 12:24:13 -08:00
Eric Anholt
3a937daf3f i965: Force border color A to 1 when it's not present in the GL format.
It's usually forced to 1 by the surface format, but sometimes we actually have
alpha present because it's the only format available.

Fixes piglit texwrap bordercolor tests for OpenGL 1.1, GL_EXT_texture_sRGB and
GL_ARB_texture_float.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-04 12:23:59 -08:00
Eric Anholt
1e08d5452e i965: Fix uploading user vertex arrays with basevertex set.
If the index buffer is full of values like "0 1 2 3", but basevertex is 4, we
need to upload at least vertex data for elements 4 5 6 7.  Whether we also
upload 0 1 2 3 is a question of whether there are VBOs present or not -- see
the code setting start_vertex_bias in brw_draw_upload.c.
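
A worked version of the example above, as a sketch: the range of
vertices that must be uploaded is the index range shifted by
basevertex.  vertex_upload_range() is a made-up name and ignores the
VBO case mentioned above.

```python
# Hypothetical sketch; the driver's real logic lives in brw_draw_upload.c.
def vertex_upload_range(indices, basevertex=0):
    """Return (first, last) vertex element referenced by the draw."""
    return min(indices) + basevertex, max(indices) + basevertex


print(vertex_upload_range([0, 1, 2, 3], basevertex=4))
```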

Fixes piglit draw-elements*base-vertex user_varrays

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-04 11:15:44 -08:00
Eric Anholt
29a6307e12 i965: Set dirty state for brw_draw_upload.c when num_instances changes.
Otherwise, if we had a set of prims passed in with a num_instances varying
between them, we wouldn't upload enough (or too much!) from user vertex
arrays.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-04 11:15:44 -08:00
Eric Anholt
13170321f6 i965: Remove the vbo_rebase_prims() path.
The brw_draw_upload.c start_vertex_bias code has support for doing the rebase
without rewriting the index buffer by applying a basevertex.  It looks like
vbo_rebase_prims() is not equipped to handle basevertex.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-04 11:15:44 -08:00
Eric Anholt
9864a5b098 i965/fs: Fix a comment in copy propagation.
We haven't been tracking only raw GRF-GRF moves since the constant propagation
merge and the later extension for source modifiers and uniforms.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-04 11:15:44 -08:00
Eric Anholt
545b59b62a i965/fs: Allow copy-propagation on pull constant load values.
Given that we handle similarly-regioned GRF registers for our copy
propagation from our UNIFORM file, there's no reason not to allow it.

The shader-db impact is negligible -- +90 instructions total, 2 shaders helped
and 7 hurt (slightly increased register pressure increased spilling), but this
is to prevent regression in other shaders when fixing copy_propagation to
reduce register pressure in the shaders that are hurt here.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-04 11:15:44 -08:00
Eric Anholt
cf26b4569a i965/fs: Do dead code elimination just after copy propagation.
If we put the register coalescing in between the two, then we end up with code
sequences involving dead writes that the dead code elimination doesn't know
how to remove.  In place of making dead code elimination smart (which we
should do, too), make it less important for the moment.

shader-db results:

total instructions in shared programs: 722240 -> 721275 (-0.13%)
instructions in affected programs:     50573 -> 49608 (-1.91%)

(no shaders regressed).
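
The percentages follow directly from the instruction counts; a quick
check (percent_change() is just a helper for this example):

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


print(round(percent_change(722240, 721275), 2))  # all shared programs
print(round(percent_change(50573, 49608), 2))    # affected programs only
```

Both lines remove the same 965 instructions; only the denominator
differs.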

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-04 11:15:44 -08:00
Lucas Stach
d8988f048f nv50,nvc0: expose ARB_map_buffer_alignment
All HW buffers (including suballocated ones) are already aligned.
Just make sure that the initial sysram buffers are also properly
aligned.
2012-11-04 12:33:38 +01:00
Kenneth Graunke
05882b0d3b i965/fs: Compact the virtual GRF arrays.
During code generation, we create tons of temporary variables, many of
which get immediately killed and are never used.  Later optimization and
analysis passes, such as compute_live_intervals, loop over all the
virtual GRFs.  By compacting them, we can save a lot of overhead.
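
The idea can be sketched as building a dense remap table for the
registers that are still referenced and renumbering the instructions
with it.  The data layout here (instructions as tuples of register
numbers) and the function names are assumptions made for the example,
not the driver's representation.

```python
# Hypothetical sketch of compacting a sparse register numbering.
def compact_virtual_grfs(num_grfs, used):
    """Map each used register number to a dense index, keeping order."""
    remap = {}
    for old in range(num_grfs):
        if old in used:
            remap[old] = len(remap)
    return remap


def renumber(instructions, remap):
    """Rewrite every register reference through the remap table."""
    return [tuple(remap[reg] for reg in inst) for inst in instructions]


instructions = [(0, 3), (3, 7)]
used = {reg for inst in instructions for reg in inst}
remap = compact_virtual_grfs(8, used)
print(remap, renumber(instructions, remap))
```

Later passes then loop over len(remap) registers instead of num_grfs.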

Reduces compilation time in L4D2's largest fragment shader from 10.2
seconds to 5.2 seconds (50%).  Drops compute_live_variables() from
10-12% of another game's startup time to 8%.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-03 20:28:57 -07:00
Jordan Justen
e3542ea51b dispatch_sanity test: add GL CORE 3.1 test
The function list was generated from glcorearb.h for GL 4.3.

Note that many GL 4.X functions are commented out, and indicate
that they need to be added to Mesa's XML.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-03 10:54:41 -07:00
Jordan Justen
3b64f4b4fb dispatch_sanity test: create common context creation function
We also no longer call _swrast_CreateContext, _tnl_CreateContext
or _swsetup_CreateContext when creating the context.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-03 10:54:41 -07:00
Jordan Justen
83b6a7cdaa dispatch_sanity test: allow newer functions to be set to NOP
If a GL function was introduced in a later GL version than the
context we are testing, then it is okay if it is set to the
_mesa_generic_nop function.
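
The rule can be sketched as follows, assuming versions are encoded as
major * 10 + minor (31 for GL 3.1) and that the test knows, for each
function, the version that introduced it.  entry_is_valid() is a
made-up name for illustration.

```python
# Hypothetical sketch of the acceptance rule described above.
def entry_is_valid(is_nop, introduced_in, context_version):
    if introduced_in > context_version:
        # Newer than the context: a real function or the nop are both fine.
        return True
    # Required by this context version: must not be the nop.
    return not is_nop


print(entry_is_valid(is_nop=True, introduced_in=43, context_version=31))
print(entry_is_valid(is_nop=True, introduced_in=30, context_version=31))
```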

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-03 10:54:41 -07:00
Jordan Justen
7e64fe583f dispatch_sanity test: pass ctx to validate_functions/nops
This will allow validate_functions to access ctx->Version.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-03 10:54:41 -07:00
Jordan Justen
2ad1b13508 dispatch_sanity test: add version to function list
This will be used by GL CORE contexts to differentiate functions that
can be set to nop from functions that are required for a particular
context version.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-03 10:54:41 -07:00
Jordan Justen
4d62cb64a5 mesa: remove unimplemented FramebufferTextureFaceARB
This function can be re-added with an actual implementation
when ARB_geometry_shader4 is supported.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-03 10:54:41 -07:00
Jordan Justen
f625cb580a mesa: remove unimplemented FramebufferTextureARB
This function can be re-added with an actual implementation
when ARB_geometry_shader4 is supported.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-03 10:54:41 -07:00
Jordan Justen
f862be0d7c mesa: disable ProgramParameteri until it is needed
ProgramParameteri will be required for ARB_geometry_shader4
or GLES3. Don't enable this function until either of those
is supported.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-03 10:54:41 -07:00
Jordan Justen
31c03f2f8c glapi: alias ProgramParameteriARB to ProgramParameteri
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-03 10:54:41 -07:00
Jordan Justen
1c3a64793a glapi: move include for ARB_get_program_binary.xml to gl_API.xml
These functions are part of GL 4.3. Moving this will allow
ProgramParameteriARB to alias ProgramParameteri.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-03 10:54:41 -07:00
Jordan Justen
dd6660038e glapi: alias FramebufferTextureARB to FramebufferTexture
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-03 10:54:40 -07:00
Jordan Justen
9e036966bb mesa shaderapi: don't enable various functions for GL CORE
These EXT_separate_shader_objects functions will no longer be
enabled for CORE profiles:
* UseShaderProgramEXT
* ActiveProgramEXT
* CreateShaderProgramEXT

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-03 10:54:40 -07:00
Jordan Justen
5ae8c9c0ca mesa api_exec: disable StencilFuncSeparateATI for API_OPENGL_CORE
This was mistakenly enabled in a21116f.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-03 10:54:40 -07:00
Jordan Justen
86d5c28580 mesa api_exec: add comment regarding GetPointerv & CORE profiles
GetPointerv was de-deprecated in 893ddb.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-03 10:54:40 -07:00
Vincent Lejeune
84b4372132 r600g: make tgsi-to-llvm generate store.pixel* intrinsic for fs
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-11-02 23:19:11 +01:00
Vincent Lejeune
1feb6b79ab configure.ac: Prevent build of radeon llvm backend with llvm < 3.2
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-11-02 23:18:16 +01:00
Thierry Reding
c0def90ede android: Update for builtin_stubs.cpp move
This fixes the Android build after the move of builtin_stubs.cpp into
the builtin_compiler subdirectory. This patch is untested.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-02 10:37:03 -07:00
Michel Dänzer
c5c3d2f933 radeonsi: Implement support for vertex shader samplers.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-02 18:27:18 +01:00
Johannes Obermayr
ebf0a96250 glsl: Fix builtin_compiler build by -I $(top_srcdir)/include.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56664
2012-11-02 08:53:31 -07:00
José Fonseca
8ac4b82699 scons: Update for builtin_stubs.cpp
Note that this by itself is not enough to fix the scons build -- it will fail
until you remove:

   rm -rf build/*/glsl/builtin_compiler

because that node was a file before, but it is now a directory.

This also means that bisecting across this change will require wiping
the build directory.
2012-11-02 09:43:42 +00:00
Thierry Reding
9948a33653 build: Don't cross-compile GLSL builtin compiler
The builtin_compiler binary is used during the build process to generate
code for the builtin GLSL functions. Since this binary needs to be run
on the build host, it must not be cross-compiled.

This patch fixes the build system to compile a second version of the
source files and the builtin_compiler binary itself for the build
system. It does so by defining the CC_FOR_BUILD and CXX_FOR_BUILD
variables, which are searched for by the configure script and point to
the location of native C and C++ compilers.

In order for this to work properly, builtin_function.cpp is removed
from BUILT_SOURCES, otherwise the build system would try to generate it
before having had a chance to descend into the builtin_compiler
subdirectory. With the builtin_compiler and glsl_compiler now being
generated at different stages, the build instructions for glsl_compiler
can be simplified a bit.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-01 18:18:21 -07:00
Brian Paul
8d3fb1be6b libgl-xlib: include glheader.h instead of GL/gl.h to fix build
GL/gl.h doesn't define GLfixed but glapitable.h uses it.
2012-11-01 17:38:42 -06:00
Kenneth Graunke
df8a4001f5 i965: Remove unused variables after removing the old VS backend.
Fixes compiler warnings about unused variables.
2012-11-01 16:13:16 -07:00
Kenneth Graunke
60c008dde6 i965: Remove unnecessary walk through Mesa IR in ProgramStringNotify().
Variable indexing of non-uniform arrays only exists in GLSL.  Likewise,
OPCODE_CAL/OPCODE_RET only existed to try and support GLSL's function
calls.  We don't use Mesa IR for GLSL, and these features are explicitly
disallowed by ARB_vertex_program/ARB_fragment_program and never
generated by ffvertex_prog.c.

Since they'll never happen, there's no need to check for them, which
saves us from walking through all the Mesa IR instructions.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-01 14:29:36 -07:00
Kenneth Graunke
109a97dbd2 i965: Remove VS constant buffer read support from brw_eu_emit.c.
brw_vec4_emit.cpp implements this directly; only the old backend used
the brw_eu_emit.c code.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-01 14:29:34 -07:00
Kenneth Graunke
31c1ea5ed4 i965: Update comment about clipper constants.
The old VS backend doesn't exist, but I believe these still need to be
delivered to the clipper thread.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-01 14:29:32 -07:00
Kenneth Graunke
b68e662e61 i965/vs: Remove brw_vs_compile::constant_map.
It was only used for the old backend.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-01 14:29:30 -07:00
Kenneth Graunke
ab973403e4 i965/vs: Remove support for the old parameter layout.
Only the old backend used it.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-01 14:29:29 -07:00
Kenneth Graunke
4b2457b548 i965/vs: Delete the old vertex shader backend.
It's no longer used for anything.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-01 14:29:26 -07:00
Kenneth Graunke
66c8473e02 i965/vs: Replace brw_vs_emit.c with dumping code into the vec4_visitor.
Rather than having two separate backends, just create a small layer that
translates the subset of Mesa IR used for ARB_vertex_program and fixed
function programs to the Vec4 IR.  This allows us to use the same
optimization passes, code generator, register allocator as for GLSL.

v2: Incorporate Eric's review comments.
- Fix use of uninitialized src_swiz[] values in the SWIZZLE_ZERO/ONE
  case: just initialize it to 0 (.x) since the value doesn't matter
  (those channels get writemasked out anyway).
- Properly reswizzle source register's swizzles, rather than overwriting
  the swizzle.
- Port the old brw_vs_emit code for computing .x of the EXP2 opcode.
- Update comments, removing mention of NV_vertex_program, etc.
- Delete remaining #warning lines and debug comments.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-01 14:29:23 -07:00
Kenneth Graunke
1f0093720d i965/vs: Refactor min/max handling to share code.
v2: Properly use "conditionalmod" pre-Gen6, rather than the incorrectly
copy-and-pasted "BRW_CONDITIONAL_G".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-01 14:29:21 -07:00
Kenneth Graunke
fd8655aa7a i965/vs: Add support for emitting DPH opcodes.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-01 14:29:19 -07:00
Kenneth Graunke
6bc021bc78 i965/vs: Only do INTEL_DEBUG=perf when there's a GLSL shader.
This will become necessary once we start supporting ARB programs and
fixed function in this backend.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-11-01 14:29:12 -07:00
Paul Berry
a8ab7e335d dispatch: stop generating separate GLES1 API code.
This patch removes the generated files api_exec_es1.c,
api_exec_es1_dispatch.h, and api_exec_es1_remap_helper.h (and the
source files and build rules used to generate them), since they are no
longer used.  GLES1 now uses the same dispatch table layout as all the
other APIs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-01 11:26:30 -07:00
Paul Berry
8386088e3d dispatch: stop using _mesa_create_exec_table_es1() for GLES1.
This patch modifies context creation code for GLES1 to use
_mesa_create_exec_table() (which is used for all other APIs) instead
of the GLES1-specific _mesa_create_exec_table_es1().

There is a slight change in functionality.  As a result of a mistake
in the code generation of _mesa_create_exec_table_es1(), it does not
include glFlushMappedBufferRangeEXT or glMapBufferRangeEXT (this is
because when support for those two functions was added in commit
762d9ac, src/mesa/main/APIspec.xml wasn't updated).  With this patch,
glFlushMappedBufferRangeEXT and glMapBufferRangeEXT are properly
included in the dispatch table.  Accordingly, dispatch_sanity.cpp is
modified to expect these two functions to be present.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: Leave GLES1.1 dispatch sanity test disabled when not building
GLES1 support.
2012-11-01 11:26:07 -07:00
Paul Berry
a21116f87e dispatch: GLES1 fixes for _mesa_create_exec_table().
Currently, _mesa_create_exec_table() (in api_exec.c) is used for all
APIs except GLES1.  In GLES1, _mesa_create_exec_table_es1() (a code
generated function) is used instead.

In principle, this shouldn't be necessary.  It should be possible for
api_exec.c to contain the logic for populating the dispatch table for
all APIs.

This patch paves the way for using _mesa_create_exec_table() instead
of _mesa_create_exec_table_es1(), by making _mesa_create_exec_table()
(and the functions it calls) expose the correct subset of desktop GL
functions for GLES1.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-01 11:25:27 -07:00
Paul Berry
5a1b40acf5 dispatch: Make a header to go along with querymatrix.c.
This patch creates a header querymatrix.h, to allow functions defined
in querymatrix.c to be used from other .c files.  It also switches
from the nonstandard GL_APIENTRY to GLAPIENTRY.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: Don't declare _mesa_Get{Integer,Float}v in querymatrix.c.
Instead, just include main/get.h.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-01 11:25:14 -07:00
Paul Berry
b60244cfb9 dispatch: Add standard boilerplate and GL_APIENTRY to es1_conversion.h.
This patch adds the usual boilerplate (copyright notice and guards
against redundant inclusion) to es1_conversion.h.  It also moves the
definition of GL_APIENTRY from es1_conversion.c.

This allows es1_conversion.h to be safely included from other .c files.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: Use copyright notice from src/mesa/main/es_generator.py (the
script that used to generate this file).
2012-11-01 11:24:57 -07:00
Paul Berry
dd3218d73b dispatch: Include GLES1-only functions in dispatch table.
Previously dispatch table-related code was generated from gl_API.xml,
so it did not include slots for GLES1-only functions (such as those
taking fixed-point arguments).

This patch generates dispatch table-related code from
gl_and_es_API.xml, so that GLES1-only functions are included.  This
paves the way for future patches that will unify the GLES1 dispatch
table with the dispatch tables for the other APIs.

The following generated files are affected:
- glapi_x86.S
- glapi_x86-64.S
- glapi_sparc.S
- glprocs.h
- glapitemp.h
- glapitable.h
- glapi_gentable.c
- dispatch.h
- remap_helper.h

Since this change affects makefiles, a full rebuild is required.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: Adjust dependencies to ensure that generated files will be rebuilt
whenever any ES-related XML source files are changed.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-01 11:23:22 -07:00
Paul Berry
571d5c353a dispatch: properly handle parameter name mismatches in glapitemp.h.
Previously, when code-generating aliased functions in glapitemp.h, we
weren't consistent about which function alias we used to obtain the
parameter names, with the risk that we would generate incorrect code
like this:

  KEYWORD1 void KEYWORD2 NAME(Foo)(GLint x)
  {
    (void) x;
    DISPATCH(Foo, (x), (F, "glFoo(%d);\n", x));
  }
  KEYWORD1 void KEYWORD2 NAME(FooEXT)(GLint y)
  {
    (void) x;
    DISPATCH(Foo, (x), (F, "glFooEXT(%d);\n", x));
  }

At the moment there are no aliased functions with mismatched parameter
names, so this isn't a problem yet.  But when we introduce GLES1
functions into the dispatch table, there will be
(MapBufferRange/MapBufferRangeEXT).  This patch paves the way for that
by fixing the code generation script to handle the mismatch correctly.
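
For comparison, here is a minimal sketch of what consistent output looks like, with each alias using its own parameter name throughout. The macro definitions, impl_Foo, and record_trace are hypothetical stand-ins so the snippet compiles on its own; they are not the real glapitemp.h machinery.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef int GLint;

/* Hypothetical state used only to observe what the wrappers did. */
static GLint last_value;
static char last_trace[64];

/* Hypothetical implementation that both aliases dispatch to. */
static void impl_Foo(GLint v) { last_value = v; }

/* Hypothetical trace helper: formats the call into the trace buffer. */
static void record_trace(char *buf, const char *fmt, GLint v)
{
   snprintf(buf, sizeof(last_trace), fmt, v);
}

/* Stand-ins for the macros that glapitemp.h expects its includer to define. */
#define KEYWORD1 static
#define KEYWORD2
#define NAME(func) gl##func
#define F last_trace
#define DISPATCH(func, args, trace) \
   do { record_trace trace; impl_##func args; } while (0)

KEYWORD1 void KEYWORD2 NAME(Foo)(GLint x)
{
   (void) x;
   DISPATCH(Foo, (x), (F, "glFoo(%d);\n", x));
}

/* The alias declares "y", so every use inside the body must also be "y". */
KEYWORD1 void KEYWORD2 NAME(FooEXT)(GLint y)
{
   (void) y;
   DISPATCH(Foo, (y), (F, "glFooEXT(%d);\n", y));
}
```

Both wrappers reach the same implementation; only the parameter name and the trace string differ.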

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-01 11:22:49 -07:00
Paul Berry
33e0004720 dispatch: Include glheader.h in dispatch-related files.
This ensures that GLES1-only typedefs are available in these files.
In a future patch, this will allow us to expand the dispatch table to
include GLES1-only functions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-01 11:22:24 -07:00
Paul Berry
47deaf6175 dispatch: Update check_table.cpp to reflect recent aliasing changes.
In commits bad96f6 and e7dd2e5 I added the following aliases:
- ClampColor -> ClampColorARB
- VertexAttribDivisor -> VertexAttribDivisorARB

But I neglected to update check_table.cpp, causing "make check" to
fail for non-shared-glapi builds.

This patch removes the functions that are now aliased from
check_table.cpp, so that "make check" works correctly again.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-11-01 11:22:09 -07:00
Eric Anholt
56f8ed4c35 i965/gen4: Fix assertion failures in depthstencil piglit tests.
Don't forget to set depth_mt even if !hiz_mt.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-01 09:22:09 -07:00
Kenneth Graunke
b57d2dfbf6 i965: Add "alpha to coverage" to performance debug recompile messages.
This was missing and got labeled "Something else".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-31 19:54:36 -07:00
Kenneth Graunke
369419e847 i965: Don't replicate data for zero-stride arrays when copying to VBOs.
When copy_array_to_vbo_array encountered an array with src_stride == 0
and dst_stride != 0, we would replicate out the single element to the
whole size (max - min + 1).  This is unnecessary: we can simply upload
one copy and set the buffer's stride to 0.

Decreases vertex upload overhead in an upcoming Steam for Linux title.
Prior to this patch, copy_array_to_vbo_array appeared very high in the
profile (Eric quoted 20%).  After the patch, it disappeared completely.
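
The idea can be sketched roughly as below. This is a simplified illustration, not the driver's copy_array_to_vbo_array: the struct upload type and the copy_array helper are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result: how many bytes were written and which stride
 * the buffer should be programmed with afterwards. */
struct upload {
   size_t bytes;
   size_t stride;
};

static struct upload
copy_array(const unsigned char *src, size_t src_stride,
           unsigned char *dst, size_t dst_stride,
           size_t element_size, size_t count)
{
   struct upload out;

   if (src_stride == 0) {
      /* Constant attribute: upload a single element and keep stride 0
       * instead of replicating it count times. */
      memcpy(dst, src, element_size);
      out.bytes = element_size;
      out.stride = 0;
      return out;
   }

   /* General case: copy each element to its strided destination. */
   for (size_t i = 0; i < count; i++)
      memcpy(dst + i * dst_stride, src + i * src_stride, element_size);

   out.bytes = count ? (count - 1) * dst_stride + element_size : 0;
   out.stride = dst_stride;
   return out;
}
```

With a zero source stride, the upload size no longer depends on the vertex count.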

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-31 19:54:36 -07:00
Kenneth Graunke
3d2b4291c2 i965: Don't bother trying to extend the current vertex buffers.
This essentially reverts the following:

  commit c625aa19cb
  Author: Chris Wilson <chris@chris-wilson.co.uk>
  Date:   Fri Feb 18 10:37:43 2011 +0000

      intel: extend current vertex buffers

While working on optimizing an upcoming Steam title, I broke this code.
Eric expressed his doubts about this optimization, and noted that the
original commit offered no performance data.

I ran before and after benchmarks on Xonotic and Citybench, and found
that this code made no difference.  So, remove it to reduce complexity
and make future work simpler.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-31 19:54:36 -07:00
Marek Olšák
1eedebc65b r600g: re-enable handling of DISCARD_RANGE, improving performance
It seems to work for me now. Even the graphics corruption is gone.

This also boosts performance in Reaction Quake.
2012-11-01 03:17:58 +01:00
Marek Olšák
fa58644855 r600g: fix abysmal performance in Reaction Quake
The problem was we set VRAM|GTT for relocations of STATIC resources.
Setting just VRAM increases the framerate 4 times on my machine.

I rewrote the switch statement and adjusted the domains for window
framebuffers too.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-11-01 03:17:58 +01:00
Marek Olšák
4cf6acc3d0 gallium/u_vbuf: document how it works 2012-11-01 03:17:58 +01:00
Marek Olšák
46b0893fb9 gallium/u_vbuf: optimize looping over the list of buffers to upload 2012-11-01 03:17:58 +01:00
Marek Olšák
a97b053fdd gallium/u_vbuf: skip processing of buffers unused by the vertex element state 2012-11-01 03:17:58 +01:00
Brian Paul
fc2cf14038 swrast: remove explicit size from texfetch_funcs array
By removing the array size, the static assertion to check for missing
elements can do its job properly.  This will catch cases where a new
Mesa format is added but the swrast texfetch code isn't updated.
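
A small sketch of why the explicit size defeats the check. The enum, the tables, and the ARRAY_LEN macro are hypothetical and only illustrate the C rule that an explicitly sized array keeps its declared length even when initializers are missing.

```c
#include <assert.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical format list; FORMAT_COUNT plays the role of the
 * "number of formats" constant. */
enum { FORMAT_A, FORMAT_B, FORMAT_C, FORMAT_COUNT };

/* Explicit size: the entry for FORMAT_C is missing, but the array still
 * has FORMAT_COUNT elements (the last one is NULL), so a length check
 * cannot notice the omission. */
static const char *const sized_table[FORMAT_COUNT] = { "a", "b" };

/* No explicit size: the length follows the initializer list, so a
 * missing entry changes the length and a length check catches it. */
static const char *const unsized_table[] = { "a", "b" };

/* A complete table passes the compile-time check. */
static const char *const complete_table[] = { "a", "b", "c" };
_Static_assert(ARRAY_LEN(complete_table) == FORMAT_COUNT,
               "complete_table must have one entry per format");
```

Applying the same _Static_assert to unsized_table would stop the build, which is the behavior the change is after.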

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-31 13:37:11 -06:00
José Fonseca
f69fc36127 llvmpipe: Obey back writemask.
Tested with a modified glean tstencil2 test.

NOTE: This is a candidate for stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-31 16:50:40 +00:00
Jerome Glisse
470952f751 r600g: avoid shader needing too many gpr to lockup the gpu v2
On r6xx/r7xx, shader resource management needs to make sure that a
shader does not go over the GPR limit.  Each ASIC has a maximum
number of registers that can be split between shader stages.
For each stage, the shader must not use more registers than the
limit programmed.

v2: Print an error message when discarding draw. Don't add another
    boolean to context structure, but rather propagate the discard
    boolean through the call chain.
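
The check described above might look roughly like this. It is a sketch under assumptions: struct stage_gprs and should_discard_draw are hypothetical names, and the real driver derives the per-stage limits from hardware state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-stage bookkeeping: GPRs the shader needs versus
 * the limit programmed for that stage. */
struct stage_gprs {
   const char *name;
   unsigned used;
   unsigned limit;
};

/* Returns true when any stage exceeds its programmed limit; the caller
 * propagates this result up the call chain and skips the draw. */
static bool
should_discard_draw(const struct stage_gprs *stages, unsigned count)
{
   for (unsigned i = 0; i < count; i++) {
      if (stages[i].used > stages[i].limit) {
         fprintf(stderr, "%s shader uses %u GPRs, limit is %u; discarding draw\n",
                 stages[i].name, stages[i].used, stages[i].limit);
         return true;
      }
   }
   return false;
}
```

Returning the flag rather than storing it keeps the decision local to the draw that triggered it.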

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-10-31 10:49:15 -04:00
Marek Olšák
183e122bdf draw: fix assertion failure in draw_emit_vertex_attr
This is a regression since b3921e1f53.

The array stores VS outputs, not FS inputs.
Now llvmpipe can do 32 varyings too.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-31 02:06:11 +01:00
Marek Olšák
91107a3522 r600g: use SQ_VTX_SEMANTIC_CLEAR to clear the semantic registers
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-10-31 02:06:11 +01:00
Marek Olšák
d6600f9d39 mesa: remove NV_read_buffer extension enable flag
It's been enabled by default, so the flag isn't really useful.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-31 02:05:12 +01:00
Marek Olšák
b8380e54b8 mesa: remove SGIS_texture_lod extension enable flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-31 02:05:11 +01:00
Marek Olšák
01f0bedc2d mesa: remove NV_texgen_reflection extension enable flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-31 02:04:54 +01:00
Marek Olšák
7857dbeb17 mesa: remove NV_light_max_exponent extension enable flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-31 02:04:54 +01:00
Marek Olšák
cc07149276 mesa: remove IBM_rasterpos_clip extension enable flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-31 02:04:54 +01:00
Marek Olšák
f5543d6eb2 mesa: remove IBM_multimode_draw_arrays extension enable flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-31 02:04:54 +01:00
Marek Olšák
271b6aeccd mesa: remove APPLE_packed_pixels extension enable flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-31 02:04:54 +01:00
Marek Olšák
55bf57dbb4 mesa: don't always enable OES_standard_derivatives
For Intel, expose it only if gen >= 4.
For Gallium, expose it only if PIPE_CAP_SM3 is advertised.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-31 02:04:54 +01:00
Marek Olšák
b6f5c37ac3 mesa: move EXT_texture3D enabling to _mesa_init_extensions 2012-10-31 02:04:16 +01:00
Marek Olšák
2266b1df23 mesa: remove EXT_separate_specular_color extension enable flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-31 01:58:26 +01:00
Marek Olšák
39a0223a87 mesa: remove EXT_rescale_normal extension enable flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-31 01:58:23 +01:00
Marek Olšák
6f5fc612f3 mesa: remove EXT_packed_pixels extension enable flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-31 01:58:21 +01:00
Marek Olšák
57b00c85b1 mesa: remove EXT_draw_range_elements extension enable flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-31 01:58:19 +01:00
Marek Olšák
cf9acc3833 mesa: remove EXT_compiled_vertex_array extension enable flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-31 01:58:17 +01:00
Marek Olšák
1301f91b31 mesa: remove ARB_window_pos extension enable flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-31 01:58:15 +01:00
Marek Olšák
d012e6d8fe mesa: remove ARB_transpose_matrix extension enable flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-31 01:58:12 +01:00
Marek Olšák
3bba7c5ab4 mesa: remove ARB_copy_buffer extension enable flag
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-31 01:57:25 +01:00
Marek Olšák
c9f2af3df7 gallium: expose ARB_map_buffer_alignment on Radeon
Reviewed-by: Brian Paul <brianp@vmware.com>

v2: update relnotes-9.1
v3: use align_malloc and align_free for malloced buffers in r300g
v4: document the new CAP in the docs
2012-10-31 01:53:50 +01:00
Marek Olšák
f2f782d50f mesa: implement ARB_map_buffer_alignment
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-31 01:48:40 +01:00
Marek Olšák
0ebd0b78c6 st/mesa: don't use _NEW_PROGRAM where ST_NEW_xxx_PROGRAM is sufficient
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-31 01:44:58 +01:00
Marek Olšák
c0c26ddaef r600g: use better sample positions for 8x MSAA
Taken from the intel driver. The sample positions are actually a solution
to the 8 queens puzzle.  It gives more accurate and smoother AA.
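
To illustrate the property being referred to: in an 8 queens layout, no two samples share a row, a column, or a diagonal of the 8x8 grid. The checker below is a sketch; example_layout is a well-known puzzle solution chosen for illustration, not the sample positions used by either driver.

```c
#include <assert.h>
#include <stdbool.h>

/* col[i] is the column of the sample placed in row i, so rows are
 * distinct by construction.  Columns and diagonals are checked here. */
static bool
is_queens_layout(const int col[], int n)
{
   for (int i = 0; i < n; i++) {
      for (int j = i + 1; j < n; j++) {
         int dc = col[i] - col[j];
         if (dc == 0)
            return false;          /* same column */
         if (dc == i - j || dc == j - i)
            return false;          /* same diagonal */
      }
   }
   return true;
}

/* Illustrative solution to the puzzle (hypothetical sample grid). */
static const int example_layout[8] = { 0, 4, 7, 5, 2, 6, 1, 3 };

/* Counter-example: every sample on the main diagonal. */
static const int diagonal_layout[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
```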
2012-10-31 00:55:23 +01:00
Marek Olšák
e73bf3b805 gallium: add start_slot parameter to set_vertex_buffers
This allows updating only a subrange of buffer bindings.

set_vertex_buffers(pipe, start_slot, count, NULL) unbinds buffers in that
range. Binding NULL resources unbinds buffers too (both buffer and user_buffer
must be NULL).
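
The subrange semantics can be sketched as follows. The types and the set_vertex_buffers_sketch function are hypothetical simplifications; the real hook takes a pipe context and Gallium's own vertex buffer structure.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 16

/* Hypothetical, trimmed-down vertex buffer binding. */
struct vertex_buffer {
   const void *buffer;
   const void *user_buffer;
};

static struct vertex_buffer bound[MAX_SLOTS];

/* Update only slots [start_slot, start_slot + count).  Passing NULL for
 * buffers unbinds the whole range; other slots are left untouched. */
static void
set_vertex_buffers_sketch(unsigned start_slot, unsigned count,
                          const struct vertex_buffer *buffers)
{
   assert(start_slot + count <= MAX_SLOTS);
   for (unsigned i = 0; i < count; i++) {
      struct vertex_buffer *dst = &bound[start_slot + i];
      if (buffers) {
         *dst = buffers[i];
      } else {
         dst->buffer = NULL;
         dst->user_buffer = NULL;
      }
   }
}

/* A slot counts as unbound when both pointers are NULL. */
static int
is_bound(unsigned slot)
{
   return bound[slot].buffer != NULL || bound[slot].user_buffer != NULL;
}
```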

The meta ops are adapted to only save, change, and restore the single slot
they use. The cso_context can save and restore only one vertex buffer slot.
The clients can query which one it is using cso_get_aux_vertex_buffer_slot.
It's currently set to 0. (the Draw module breaks if it's set to non-zero)

It should decrease the CPU overhead when using a lot of meta ops, but
the drivers must be able to treat each vertex buffer slot as a separate
state (only r600g does so at the moment).

I can imagine this also being useful for optimizing some OpenGL use cases.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-31 00:55:13 +01:00
Marvin Schmidt
a7c5be098a st/xorg: Remove superfluous miInitializeBackingStore() call
It was defined as an empty function since Nov 2010 and was ultimately
removed completely.

See xserver commit 1cb0261

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-10-30 11:36:31 +01:00
Vinson Lee
0a66ced8f8 xlib: Do not undefine _R, _G, and _B.
Fixes build error on Cygwin and Solaris. _R, _G, and _B are used in
ctype.h on those platforms.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-29 22:25:56 -07:00
Brian Paul
aab0ea9352 mesa: remove array size so the static assert can work
With the explicit NUM_TEXTURE_TARGETS array size, the assertion that
Elements(targets) == NUM_TEXTURE_TARGETS would pass even if elements
were missing.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-29 17:56:05 -06:00
Brian Paul
1e46d810c8 mesa: use GLuint for more gl_constants fields
To silence assorted MSVC warnings.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-29 17:56:05 -06:00
Brian Paul
ec5341800b vbo: silence MSVC double/float conversion warnings
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-29 17:56:05 -06:00
Brian Paul
f6c83e1661 mesa: silence some MSVC conversion warnings in get.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-29 17:56:05 -06:00
Brian Paul
06bb81f01d mesa: silence MSVC signed/unsigned comparison warnings in hash_table.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-29 17:56:04 -06:00
Brian Paul
8e45e38512 mesa: silence MSVC signed/unsigned comparison warnings in transformfeedback.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-29 17:56:04 -06:00
Brian Paul
03503daa21 mesa: silence MSVC signed/unsigned comparison warnings in accum.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-29 17:56:04 -06:00
Brian Paul
db0136ae3e mesa: silence MSVC signed/unsigned comparison warning in texstorage.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-29 17:56:04 -06:00
Brian Paul
298d7a20e1 mesa: silence MSVC double/float assignment warnings in pixel unpack code
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-29 17:56:04 -06:00
Vincent Lejeune
5ab82e0ccf r600g: tgsi-to-llvm emits right input intrinsics
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-10-30 00:28:42 +01:00
Tapani Pälli
e4e3b07181 intel: support for 16 bit config with 24 depth and 8 stencil
Patch adds additional singlesample config with 565 color buffer,
24 bit depth and 8 bit stencil buffer. This makes Quadrant benchmark
work on Android. Tested with Sandybridge and Ivybridge machines.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-29 11:58:47 -07:00
Ian Romanick
e8f2bec25e dri: Support MESA_FORMAT_SARGB8 in driCreateConfigs
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-29 09:55:56 -07:00
Ian Romanick
749ac8b73a intel: If the visual is sRGB, use an sRGB internal format
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-29 09:55:53 -07:00
Ian Romanick
1f6e10f67b dri: Convert driCreateConfigs to use a gl_format enum
This is instead of the pair of GLenums for format and type that were
previously used.  This is necessary for the Intel drivers to expose sRGB
framebuffer formats.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-29 09:55:42 -07:00
Ian Romanick
43d6fe156b dri_util: Eliminate the bytes_per_pixel table
With fewer formats to support, it's kind of useless.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-29 09:55:38 -07:00
Ian Romanick
bda208a4d4 dri_util: Remove support for RGB332 framebuffers
None of the remaining DRI drivers in Mesa use this.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-29 09:55:32 -07:00
Ian Romanick
0398a26097 swrast: Remove the 2_3_3_REV framebuffer format
There is no gl_format in Mesa that corresponds to this arrangement, so I
have a very hard time believing that this works.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-29 09:55:27 -07:00
Ian Romanick
386282b5c2 glx: Add the extension string for GLX_ARB_framebuffer_sRGB
From the GLX perspective, the ARB and EXT extensions are identical.  Use
a single bit for both.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Maciej Wieczorek <maciej.t.wieczorek@intel.com>
2012-10-29 09:55:23 -07:00
Ian Romanick
7b0f912e70 glx: Set sRGBCapable to a default value
Previously, if the server didn't send a GLX_FRAMEBUFFER_SRGB_CAPABLE_EXT
tag, it would still be set to GLX_DONT_CARE (which is -1).  Set it to
GL_FALSE instead.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Maciej Wieczorek <maciej.t.wieczorek@intel.com>
2012-10-29 09:55:15 -07:00
Bryan Cain
170f0459a2 glsl_to_tgsi: set correct register type for array and structure elements
This fixes an issue where glsl_to_tgsi_visitor::get_opcode() would emit the
wrong opcode because the register type was GLSL_TYPE_ARRAY/STRUCT instead of
GLSL_TYPE_FLOAT/INT/UINT/BOOL, so the function would use the float opcodes for
operations on integer or boolean values dereferenced from an array or
structure.  Assertions have been added to get_opcode() to prevent this bug
from reappearing in the future.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2012-10-29 15:49:00 +01:00
Marek Olšák
96ed6c90ef r600g: implement texturing with 8x MSAA compressed surfaces for Evergreen
The 2x and 4x MSAA cases are completely broken. The lfdptr instruction returns
garbage there.

The 8x MSAA case is broken on Cayman, though at least the result looks somewhat
correct.

Only the 8x MSAA case works on Evergreen and is enabled.
2012-10-29 12:51:41 +01:00
Marek Olšák
b3921e1f53 mesa: bump MAX_VARYING to 32
We're starting to get apps utilizing more than 16 varyings and
most current hardware supports 32 anyway.

Tested with r600g.
swrast, softpipe and llvmpipe still advertise 16 varyings.

This fixes a WebGL crash after launching this demo:
https://developer.mozilla.org/en-US/demos/detail/falling-cubes

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54402

NOTE: This is a candidate for the stable branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-29 12:51:41 +01:00
Andreas Boll
00e6819e99 Revert "glsl_to_tgsi: set correct register type for array and structure elements"
This reverts commit ebd8df7a31.

accidentally pushed.
2012-10-29 12:21:07 +01:00
Vinson Lee
d37ae64203 scons: Add -fno-rtti to CXXFLAGS with llvm-3.2.
llvm-3.2svn r166772 no longer requires RTTI for lib/Support.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-10-28 11:31:25 -07:00
Christoph Bumiller
9ae7d8bb79 nv50/ir: restore use of long immediate encodings
NOTE: This is a candidate for the 9.0 branch.
2012-10-28 14:57:20 +01:00
Christoph Bumiller
351d3c59f2 nv50,nvc0: fix 2d engine stencil-only copies 2012-10-28 14:25:56 +01:00
Alexander V. Nikolaev
eaa8e56108 gallium/gallivm: code generation options for LLVM 3.1+
LLVM 3.1+ no longer provides "extern unsigned llvm::StackAlignmentOverride"
and friends for configuring code generation options such as stack
alignment.

So I restrict the assignment of llvm::StackAlignmentOverride and other
variables to LLVM 3.0 only, and wrote similar code using
TargetOptions.

This patch fixes segfaults in WINE when using llvmpipe built with LLVM 3.1.

Signed-off-by: Alexander V. Nikolaev <avn@daemon.hole.ru>
Signed-off-by: José Fonseca <jose.r.fonseca@gmail.com>
2012-10-28 10:34:26 +00:00
Eric Anholt
459b28aba7 i965: Merge brw_prepare_query_begin() and brw_emit_query_begin().
This is a leftover from when we had to split those two functions due to
the separate BO validation step.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-26 12:41:54 -07:00
Eric Anholt
99dc870613 i965: Rename misleading "active" field of brw->query.
"Active" is an already-used term for the query being between
glBeginQuery() and glEndQuery(), while this is tracking whether the
start of the packet pair for emitting state has been inserted into the
current batchbuffer.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-26 12:41:01 -07:00
Marek Olšák
b78b62497f r600g: advertise 32 streamout vec4 outputs
to match the varying limit.
2012-10-26 19:01:16 +02:00
Brian Paul
80bc3206aa softpipe: remove extraneous whitespace 2012-10-26 10:59:29 -06:00
Brian Paul
369b5a311c gallivm/llvmpipe: fix 64-bit %ll format compiler warnings for mingw32
Use the PRIx64 and PRIu64 format macros from inttypes.h.  We made a
similar change in prog_print.c in df2d81ea59.
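
A minimal sketch of the portable form: the PRIx64 and PRIu64 macros expand to the correct length modifier for the platform, so the format string no longer hard-codes %ll. The format_u64 helper is a hypothetical name used only for this illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Formats a 64-bit value as hex and decimal without assuming that
 * "long long" is the 64-bit type on this compiler. */
static int
format_u64(char *buf, size_t size, uint64_t value)
{
   return snprintf(buf, size, "0x%" PRIx64 " (%" PRIu64 ")", value, value);
}
```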
2012-10-26 10:59:29 -06:00
Marek Olšák
8b63512be0 r600g: advertise 32 fragment shaders inputs, not 34 2012-10-26 18:01:14 +02:00
José Fonseca
8eb2b331ef graw/fs-test: Use user constant buffers.
Much simpler. More interesting.
2012-10-26 16:02:59 +01:00
José Fonseca
ce10624e9e trace: Flush before drawing. 2012-10-26 16:02:59 +01:00
José Fonseca
91332e455a graw: Ensure new members are zeroed.
Several new state members were added, and they were not being zeroed,
causing random crashes.
2012-10-26 16:02:59 +01:00
José Fonseca
2532f0d063 tests/graw: Update occlusion query example. 2012-10-26 16:02:58 +01:00
Michel Dänzer
97078b198d radeonsi: Handle TGSI_SEMANTIC_FACE.
Fixes two piglit tests using gl_FrontFacing.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-26 15:51:18 +02:00
Michel Dänzer
691f08dbea radeonsi: Handle TGSI_SEMANTIC_BCOLOR.
Put the back face colour right after the front face colour in the LDS parameter
space.

Fixes 18 piglit tests related to two sided lighting.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-26 15:51:17 +02:00
Michel Dänzer
44ef033c25 radeonsi: Don't snoop context state while building shaders.
Let's use the shader key describing the state.

Ported from r600g commit b652180107.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-26 15:51:17 +02:00
Michel Dänzer
f3257d80b0 radeon/llvm: Add intrinsic for reading SI FRONT_FACE VGPR in the pixel shader.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-26 15:51:17 +02:00
Alex Deucher
bd274eb8f4 r600g: split cayman common state out into a shared function
And use it for compute.  This should improve compute support
on cayman.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-26 09:33:32 -04:00
Alex Deucher
67c875117c r600g: emit some additional regs on cayman
These are common to both evergreen and cayman, but were
not emitted on cayman.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-26 09:33:06 -04:00
Alex Deucher
d781f0c73c r600g: there are 16 const buffer size regs for each shader stage
we were previously only setting 8 of them.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-26 09:32:55 -04:00
Alex Deucher
20d268b350 r600g: rework evergreen_init_common_regs()
Move gfx specific bits out as the code is shared with
compute.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-26 09:32:40 -04:00
Alex Deucher
480e146305 r600g/compute: always CONTEXT_CONTROL packet at start of CS
It's required.  The CP uses this to properly allocate new
contexts.  Also do a CS partial flush since we are updating
CONFIG regs which are single state.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-26 09:32:33 -04:00
José Fonseca
4a93414985 tools/trace: More helpful message when no args are provided. 2012-10-26 10:50:48 +01:00
José Fonseca
54536686b2 scons: Build xlib swrast too.
Helpful for debugging.
2012-10-26 10:50:48 +01:00
Christian König
59d4bc8c48 vl: fix the dri winsys helper screen init
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-10-26 09:53:04 +02:00
Vinson Lee
8cb2a4a7f5 tests: Use printf instead of debug_printf in u_format_compatible_test.
Use printf instead of debug_printf to be consistent with print
statements in rest of unit tests.

This also fixes the lack of print output with the MinGW build of
u_format_compatible_test.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-25 23:09:59 -07:00
Marek Olšák
8fb4b1dce1 r300g: fix texture border color for sRGB formats
NOTE: This is a candidate for the stable branches.
2012-10-26 01:27:05 +02:00
Kenneth Graunke
b45a68eebf glsl: Allow ir_if in the linker's move_non_declarations function.
Global initializers using the ?: operator with at least one non-constant
operand generate ir_if statements.  For example,

   float foo = some_boolean ? 0.0 : 1.0;

becomes:

   (declare (temporary) float conditional_tmp)
   (if (var_ref some_boolean)
       ((assign (x) (var_ref conditional_tmp) (constant float (0.0))))
       ((assign (x) (var_ref conditional_tmp) (constant float (1.0)))))

This pattern is necessary because the second or third arguments could be
function calls, which create statements (not expressions).

The linker moves these global initializers into the main() function.
However, it incorrectly had an assertion that global initializer
statements were only assignments, calls, or temporary variable
declarations.  As demonstrated above, they can be if statements too.

Other than the assertion, everything works fine.  So remove it.

Fixes new Piglit test condition-08.vert, as well as an upcoming
game that will be released on Steam.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-25 14:53:39 -07:00
Kenneth Graunke
03ea156f1b i965/vs: Preserve the type when copy propagating into an instruction.
Consider the following code, which reinterprets a register as a
different type:

mov(8)          g6<1>F          g1.4<0,4,1>.xF
and(8)          g5<1>.xUD       g6<4,4,1>.xUD   0x7fffffffUD

Copy propagation would notice that we can replace the use of g6 with
g1.4 and eliminate the MOV.  Unfortunately, it failed to preserve the UD
type, incorrectly generating:

and(8)          g5<1>.xUD       g6<4,4,1>.xF    0x7fffffffUD

Found while debugging Ian's uncommitted ARB_vertex_program LOG opcode
test with my new Mesa IR -> Vec4 IR translator.
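
The fix can be pictured with a small sketch: when a use is rewritten to read the copy's source, the register location comes from the copy but the type stays that of the original use. The struct and function names are hypothetical and much simpler than the backend's real register classes.

```c
#include <assert.h>

enum reg_type { TYPE_F, TYPE_UD };

/* Hypothetical, trimmed-down source operand. */
struct src_reg {
   int reg;
   int subreg;
   enum reg_type type;
};

/* Replace "use" with the source of the MOV that produced it, keeping
 * the type the using instruction asked for. */
static struct src_reg
propagate_copy(struct src_reg use, struct src_reg copy_src)
{
   struct src_reg result = copy_src;   /* take location from the copy */
   result.type = use.type;             /* preserve the reinterpreted type */
   return result;
}
```

In the example above, the AND reads g6 as UD, so after propagation it should read g1.4 as UD rather than F.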

NOTE: This is a candidate for stable release branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-25 14:52:54 -07:00
Kenneth Graunke
10ff6772c8 i965/vs: Don't lose the MRF writemask when doing compute-to-MRF.
Consider the following code sequence:

   mul(8)          g4<1>F          g1<0,4,1>.wzwwF g3<4,4,1>.wzwwF
   mov.sat(8)      m1<1>.xyF       g4<4,4,1>F
   mul(8)          g4<1>F          g1<0,4,1>.xxyxF g3<4,4,1>.xxyxF
   mov.sat(8)      m1<1>.zwF       g4<4,4,1>F

The compute-to-MRF pass will discover the first mov.sat and attempt to
replace it by rewriting earlier instructions.  Everything works out,
so it replaces scan_inst's destination file, reg, and reg_offset,
resulting in:

   mul(8)          m1<1>F          g1<0,4,1>.wzwwF g3<4,4,1>.wzwwF
   mul(8)          g4<1>F          g1<0,4,1>.xxyxF g3<4,4,1>.xxyxF
   mov.sat(8)      m1<1>.zwF       g4<4,4,1>F

Unfortunately, it loses the .xy writemask on the mov.sat's MRF
destination.  While this doesn't pose an immediate problem, it then
proceeds to transform the second mov.sat, resulting in:

   mul(8)          m1<1>F          g1<0,4,1>.wzwwF g3<4,4,1>.wzwwF
   mul(8)          m1<1>F          g1<0,4,1>.xxyxF g3<4,4,1>.xxyxF

Instead of writing both halves of the vector (like the original code),
it overwrites the full vector both times, clobbering the desired .xy
values.

When encountering a MOV, the compute-to-MRF code scans for instructions
which generate channels of the MOV source.  It ensures that all
necessary channels are available (possibly written by several
instructions).  In this case, *more* channels are available than
necessary, so we want to take the subset that's actually used.
Taking the bitwise AND of both writemasks should accomplish that.
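
A small sketch of that mask computation, assuming one bit per channel. The macro names are defined locally for the example and retargeted_writemask() is a hypothetical helper, not a function in the pass.

```c
#include <assert.h>

/* One bit per vec4 channel; defined locally for this example. */
#define WRITEMASK_X    0x1u
#define WRITEMASK_Y    0x2u
#define WRITEMASK_Z    0x4u
#define WRITEMASK_W    0x8u
#define WRITEMASK_XYZW 0xfu

/* When the producing instruction is retargeted to the MRF, keep only the
 * channels that the producer writes AND that the MOV actually stores. */
static unsigned
retargeted_writemask(unsigned producer_mask, unsigned mov_mask)
{
    assert((producer_mask & ~WRITEMASK_XYZW) == 0);
    assert((mov_mask & ~WRITEMASK_XYZW) == 0);
    return producer_mask & mov_mask;
}
```

For the sequence above, the first mul would keep .xy and the second .zw, so the two writes no longer overlap.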

This was discovered by analyzing an ARB_vertex_program test
(glean/vertProg1/MUL test (with swizzle and masking)) with my new
Mesa IR -> Vec4 IR translator code.  However, it should be possible
with GLSL programs as well.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-25 14:52:54 -07:00
Kenneth Graunke
9142ade154 glcpp: Don't use infinite lookahead for #define differentiation.
Previously, we used lookahead patterns to differentiate:

   #define FOO(x)  function macro
   #define FOO (x) object macro

Unfortunately, our rule for function macros:

   {HASH}define{HSPACE}+/{IDENTIFIER}"("

relies on infinite lookahead, and apparently triggers a Flex bug where
the generated code overflows a state buffer (see YY_STATE_BUF_SIZE).

There's no need to use infinite lookahead.  We can simply change state,
match the identifier, and use a single character lookahead for the '('.
This apparently makes Flex not generate the giant state array, which
avoids the buffer overflow, and should be more efficient anyway.
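
The approach can be illustrated outside of Flex with a hand-written classifier. This is only a sketch of the idea (match the identifier, then peek at one character); classify_define() and the enum are hypothetical and do not appear in glcpp.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum macro_kind { NOT_A_DEFINE, OBJECT_MACRO, FUNCTION_MACRO };

/* Classify a "#define ..." line by matching the identifier first and then
 * looking at exactly one character: '(' directly after the name means a
 * function-like macro; anything else means an object-like macro. */
static enum macro_kind
classify_define(const char *line)
{
    const char *p = line;

    if (*p != '#')
        return NOT_A_DEFINE;
    p++;
    while (*p == ' ' || *p == '\t')
        p++;
    if (strncmp(p, "define", 6) != 0)
        return NOT_A_DEFINE;
    p += 6;

    /* At least one horizontal space must separate "define" from the name. */
    if (*p != ' ' && *p != '\t')
        return NOT_A_DEFINE;
    while (*p == ' ' || *p == '\t')
        p++;

    /* Match the identifier. */
    if (!(isalpha((unsigned char)*p) || *p == '_'))
        return NOT_A_DEFINE;
    while (isalnum((unsigned char)*p) || *p == '_')
        p++;

    /* Single-character lookahead. */
    return *p == '(' ? FUNCTION_MACRO : OBJECT_MACRO;
}
```

The cost of the decision no longer depends on the identifier's length, since nothing past the first character after the name is examined.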

Fixes piglit test 17000-consecutive-chars-identifier.frag.

NOTE: This is a candidate for every release branch ever.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Carl Worth <cworth@cworth.org>
2012-10-25 14:52:53 -07:00
Kenneth Graunke
eeb2fb72eb i965/vs: Fix debug dumping of VS push constants.
While copying the values into the batch space, we advance the param
pointer.  The debug code then tries to iterate over all the uploaded
values, starting at param...which is now the end of the uploaded data,
rather than the start.

This patch saves a pointer to the start of push constant space before
it gets altered and switches the debug code to use that.
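
A minimal sketch of the pattern, assuming a plain float array stands in for batch space; upload_push_constants() is a hypothetical name, not the driver function.

```c
#include <assert.h>
#include <stddef.h>

/* Copy `count` floats into `batch` and return the start of the uploaded
 * data.  `param` is advanced during the copy, so debug code must iterate
 * from the saved `start` pointer rather than from `param`. */
static const float *
upload_push_constants(float *batch, const float *values, size_t count)
{
    float *param = batch;
    const float *start = param;     /* saved before the loop alters param */

    for (size_t i = 0; i < count; i++)
        *param++ = values[i];

    assert(param == batch + count); /* param now points past the data */
    return start;
}
```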

Tested by uncommenting the code and examining the output of
glsl-vs-clamp-1.shader_test.  Previously all values appeared to be zero.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-25 14:52:53 -07:00
Matt Turner
df924d82e2 mesa/tests: Add ES3.0 dispatch table sanity test
Since ES3.0 is backward compatible with 2.0, we check that all the 2.0
functions and additional 3.0 functions exist.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-10-25 14:29:05 -07:00
Matt Turner
355f507f2a Split dispatch sanity's validate_function test into two
Will be useful for the next patch, adding GLES 3 testing.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-10-25 14:29:05 -07:00
Paul Berry
1cf6360f89 dispatch_sanity: print names of functions that shouldn't be in dispatch table.
Previously we just printed the dispatch table index and the user had
to convert it to a function name.  That was a pain because when
FEATURE_remap_table is defined, the assignment of functions to
dispatch table entries is done at run time.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-10-25 14:23:01 -07:00
Paul Berry
03984b26c4 shared-glapi: implement _glapi_get_proc_name().
Previously this function was only implemented for non-shared-glapi
builds.  Since the function is only intended for debugging purposes we
use a simple O(n) algorithm.
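
A linear lookup of this kind might look like the sketch below. The table layout, its contents, and get_proc_name() are illustrative assumptions; the real glapi tables are generated and differ in shape.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry mapping a function name to a dispatch slot. */
struct proc_entry {
    const char *name;
    int offset;
};

/* Illustrative contents only. */
static const struct proc_entry proc_table[] = {
    { "glNewList",  0 },
    { "glEndList",  1 },
    { "glCallList", 2 },
};

/* O(n) scan: acceptable for a debugging aid that is rarely called. */
static const char *
get_proc_name(int offset)
{
    const size_t n = sizeof(proc_table) / sizeof(proc_table[0]);

    for (size_t i = 0; i < n; i++) {
        if (proc_table[i].offset == offset)
            return proc_table[i].name;
    }
    return NULL; /* no function occupies this slot */
}
```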

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-10-25 14:23:01 -07:00
Matt Turner
67f1e7bf5f src/glsl/tests/Makefile.am: Specify -I... in AM_CPPFLAGS
When specifying per-target CFLAGS (e.g., ralloc_test_CFLAGS) AM_CFLAGS
are not used. AM_CPPFLAGS should be used for includes anyway.

Fixes a build problem since 41b14d125:

CC       ralloc_test-ralloc.o
In file included from ../../../src/glsl/ralloc.c:42:0:
../../../src/glsl/ralloc.h:57:27: fatal error: main/compiler.h: No such file or directory

Acked-by: Paul Berry <stereotype441@gmail.com>
2012-10-25 13:31:24 -07:00
Matt Turner
d654afd892 egl: Import eglext.h revision 19332
The version number (14) wasn't updated.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-25 10:49:09 -07:00
Matt Turner
41b14d1251 ralloc: Annotate printf functions with PRINTFLIKE(...)
Catches problems such as (in the gles3 branch)

glcpp-parse.y: In function '_glcpp_parser_handle_version_declaration':
glcpp-parse.y:1990:39: warning: format '%lli' expects argument of type
	'long long int', but argument 4 has type 'int' [-Wformat]

As a side-effect, remove ralloc.c's likely/unlikely macros and just use
the ones from main/compiler.h.

NOTE: This is a candidate for the release branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-25 10:47:43 -07:00
Matt Turner
ec57fbbc72 build: Ship install-sh in the tarball
Fixes the problem where configure from the tarball would report missing
files:

$ ./configure
configure: error: cannot find install-sh, install.sh, or shtool in bin

NOTE: This is a candidate for the 9.0 branch.
2012-10-25 10:47:43 -07:00
José Fonseca
0cb0c38cce mesa/st: Don't use 4bits for GL_UNSIGNED_BYTE_3_3_2(_REV)
4-bit and 3-bit quantization values differ significantly for
values other than 0 and 1.
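
To illustrate the difference, the sketch below quantizes an 8-bit channel to a given bit depth and expands it back. requantize_unorm8() is a hypothetical helper with round-to-nearest behavior assumed; it is not the state tracker's conversion code.

```c
#include <assert.h>

/* Quantize an 8-bit channel value to `bits` bits and expand it back to
 * 8 bits, rounding to nearest at both steps. */
static unsigned
requantize_unorm8(unsigned value, unsigned bits)
{
    assert(value <= 255 && bits >= 1 && bits <= 8);

    const unsigned max = (1u << bits) - 1;        /* 7 for 3 bits, 15 for 4 */
    const unsigned q = (value * max + 127) / 255; /* quantized code */
    return (q * 255 + max / 2) / max;             /* back to 0..255 */
}
```

Under these assumptions the endpoints 0 and 255 survive either bit depth unchanged, while a mid-range input such as 128 lands on different values for 3 and 4 bits.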

Fixes piglit draw-pixels for softpipe/llvmpipe.

NOTE: Probably a candidate for stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-10-25 17:57:57 +01:00
José Fonseca
4efcdd1e7a trace: Fix dumping of set_constant_buffer method. 2012-10-25 15:30:19 +01:00
Andreas Boll
86cd77d0a9 docs: add another fixed bug to mesa 8.0.5 release notes
Reported-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2012-10-24 23:59:51 +02:00
Andreas Boll
2574d10398 docs: Add 8.0.5 release notes
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2012-10-24 23:51:09 +02:00
Bryan Cain
ebd8df7a31 glsl_to_tgsi: set correct register type for array and structure elements
This fixes an issue where glsl_to_tgsi_visitor::get_opcode() would emit the
wrong opcode because the register type was GLSL_TYPE_ARRAY/STRUCT instead of
GLSL_TYPE_FLOAT/INT/UINT/BOOL, so the function would use the float opcodes for
operations on integer or boolean values dereferenced from an array or
structure.  Assertions have been added to get_opcode() to prevent this bug
from reappearing in the future.
2012-10-24 23:51:08 +02:00
Vincent Lejeune
0f35702d79 r600g: force bank_swizzle if already set
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-10-24 23:37:02 +02:00
Vincent Lejeune
d1eaa9ea70 r600g: rewrite tgsi-to-llvm load-input to handle fragcoord
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-10-24 23:31:41 +02:00
Christoph Bumiller
d310e29302 nv50/ir/tgsi: fix srcMask for TXP with SHADOW1D 2012-10-24 20:47:38 +02:00
Ian Romanick
be1c5f4498 mesa: Use MIN instead of CLAMP for unsigned source data
This silences a zillion GCC warnings like:

../../../src/mesa/main/pack.c: In function '_mesa_pack_rgba_span_from_uints':
../../../src/mesa/main/pack.c:560:13: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
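
The reason the warning fires, and why a minimum is enough, can be shown with a short sketch. EX_MIN and saturate_to_u8() are names made up for this example and are not the macros used in pack.c.

```c
#include <assert.h>

/* Local stand-in for a minimum helper macro. */
#define EX_MIN(a, b) ((a) < (b) ? (a) : (b))

/* For unsigned input the lower bound of 0 can never be violated, so the
 * "x < 0" half of a clamp is dead code that GCC flags with -Wtype-limits.
 * Taking the minimum with the upper bound yields the same result. */
static unsigned
saturate_to_u8(unsigned value)
{
    return EX_MIN(value, 255u);
}
```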

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-24 11:03:16 -07:00
Michel Dänzer
eee1ff423c st/mesa: Fix assertions for copying texture image to finalized miptree.
The layer dimension of array textures is not subject to mipmap minification.
OTOH we were missing an assertion for the depth dimension.
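
As a sketch of the rule being asserted: per-level sizes shrink for spatial dimensions but not for the layer count. The helper names and the struct below are hypothetical and only model a 1D array texture.

```c
#include <assert.h>

/* Size of one spatial dimension at a given mipmap level, never below 1. */
static unsigned
minify(unsigned size, unsigned level)
{
    assert(level < 32);
    unsigned s = size >> level;
    return s > 0 ? s : 1;
}

struct level_size {
    unsigned width;
    unsigned layers;
};

/* Expected size of one level of a 1D array texture: the width is
 * minified, the number of layers stays the same at every level. */
static struct level_size
array_1d_level_size(unsigned width0, unsigned layers, unsigned level)
{
    struct level_size s = { minify(width0, level), layers };
    return s;
}
```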

Fixes assertion failures with piglit {f,v}s-textureSize-sampler1DArrayShadow.
For some reason, they only resulted in piglit 'warn' results for me, not
failures.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56211

NOTE: This is a candidate for the stable branches.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2012-10-24 17:54:25 +02:00
Andreas Boll
ecb02c27fc gallium/docs: fix sphinx warning
src/gallium/docs/source/context.rst:495: WARNING:
malformed hyperlink target.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-24 14:14:03 +02:00
Vinson Lee
016897cc66 scons: Do not use -fvisibility=hidden on Cygwin.
This is a follow-up to commit db78643182.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-23 23:22:45 -07:00
Andreas Boll
3e3ff4cd73 mesa: fix indentation in get-pick-list.sh script
NOTE: This is a candidate for the stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-23 19:26:56 +02:00
Andreas Boll
135ec3a1db mesa: grep for commits with cherry picked in commit message only once
and save them temporarily in already_picked

NOTE: This is a candidate for the stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-23 19:26:56 +02:00
Andreas Boll
b2991526ed mesa: optimize get-pick-list.sh script
cuts down the while loop iterations from 4600 to 380 commits at the
moment

NOTE: This is a candidate for the stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-23 19:26:56 +02:00
Andreas Boll
fa27a0db43 mesa: simplify get-pick-list.sh script
and add a description for the script

NOTE: This is a candidate for the stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-23 19:26:56 +02:00
Ian Romanick
2d95db660e mesa: add get-pick-list.sh script into bin/
NOTE: This is a candidate for the stable branches.
2012-10-23 19:26:56 +02:00
Paul Berry
2e0de80751 _mesa_create_exec_table: GLES3 fixes.
This patch sets up the dispatch table for the following GLES3
functions when a GLES3 context is in use:

- BeginQuery
- BeginTransformFeedback
- BindSampler
- BindTransformFeedback
- BlitFramebuffer
- ClearBufferfi
- ClearBufferfv
- ClearBufferiv
- ClearBufferuiv
- ClientWaitSync
- CopyBufferSubData
- DeleteQueries
- DeleteSamplers
- DeleteSync
- DeleteTransformFeedbacks
- EndQuery
- EndTransformFeedback
- FenceSync
- FramebufferTextureLayer
- GenQueries
- GenSamplers
- GenTransformFeedbacks
- GetInteger64v
- GetQueryObjectuiv
- GetQueryiv
- GetSamplerParameterfv
- GetSamplerParameteriv
- GetStringi
- GetSynciv
- GetTransformFeedbackVarying
- GetVertexAttribIiv
- GetVertexAttribIuiv
- IsQuery
- IsSampler
- IsSync
- IsTransformFeedback
- PauseTransformFeedback
- RenderbufferStorageMultisample
- ResumeTransformFeedback
- SamplerParameterf
- SamplerParameterfv
- SamplerParameteri
- SamplerParameteriv
- TransformFeedbackVaryings
- VertexAttribDivisor
- VertexAttribIPointer
- WaitSync

And it avoids setting up the dispatch table for these non-GLES3
functions:

- ColorMaski
- GetBooleani_v
- Enablei
- Disablei
- IsEnabledi
- ClearColorIiEXT
- ClearColorIuiEXT
- TextureStorage2DEXT
- TextureStorage3DEXT
- GetActiveUniformName
- GetnUniformdv
- GetnUniformfv
- GetnUniformiv
- GetnUniformuiv

Reviewed-by: Brian Paul <brianp@vmware.com>

v2: Make the ctx argument to _mesa_init_transform_feedback_dispatch()
a const pointer.  Add a comment to remind us to add
GetBufferParameteri64v once tests exist for it.  Also add
VertexAttribDivisor for GLES3, and remove GetActiveUniformName and
GetnUniform{dv,fv,iv,uiv} for GLES3.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-23 10:24:40 -07:00
Paul Berry
5863e3d16e _mesa_create_exec_table(): deprecate ProgramStringARB.
This function is only useful for the ARB_{vertex,fragment}_program
extensions, which we don't expose in core contexts.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-23 10:24:40 -07:00
Paul Berry
893ddb068f _mesa_create_exec_table: de-deprecate GetPointerv.
glGetPointerv was de-deprecated in GL 4.3, because GL 4.3 adds
functionality from KHR_debug and ARB_debug_output, which require
glGetPointerv.

This patch modifies _mesa_create_exec_table() to populate
glGetPointerv in the dispatch table for core contexts.

Technically this is not in compliance with the spec--what we really
ought to do for core contexts is expose glGetPointerv only when a GL
4.3 context is in use or one of the two extensions is present.
However, it seems silly to go to that extra work, since the only
client-visible effect would be for glGetPointerv to raise an
INVALID_OPERATION error instead of an INVALID_ENUM error.  Besides,
the other functions set up by _mesa_create_exec_table() only depend on
the API in use, not on the GL version or extensions supported.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-23 10:24:39 -07:00
Paul Berry
e7dd2e5213 glapi: Alias VertexAttribDivisor and VertexAttribDivisorARB.
There's no reason to have separate slots in the dispatch table for
these two functions, since they are synonymous.

Note: previous to this patch, we never populated the dispatch table
slot for VertexAttribDivisor, which was ok, since it is not required
until 3.3.  After this patch, both functions will be usable provided
that the ARB_instanced_arrays extension is present.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-23 10:24:39 -07:00
Paul Berry
bad96f6ada glapi: Alias ClampColor and ClampColorARB.
There's no reason to have separate slots in the dispatch table for
these two functions, since they are synonymous.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-23 10:24:39 -07:00
Paul Berry
992ed68ed6 main: Fix warning ('struct gl_context' declared inside parameter list).
This eliminates a warning in GCC 4.7.1.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-23 10:24:39 -07:00
Eric Anholt
ab7188e199 mesa: Return 0 for GL_CURRENT_QUERY with a mismatched query target.
With the previous two commits, this fixes piglit
GL_ARB_occlusion_query2/api.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-22 17:36:53 -07:00
Eric Anholt
8f1131fcc0 mesa: Refuse to EndQuery with a mismatched query target.
v2: Add a comment about what we're checking for.

Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-22 17:36:53 -07:00
Eric Anholt
ce086ebd89 mesa: Throw an error for a new query on an already-active query target.
There's a similar test below, but it's not the same: that one checks whether
this query object is already active (potentially on another target).
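
The two checks can be contrasted in a small model. Everything here (the types, the two-target enum, begin_query()) is a hypothetical simplification; a false return stands for the case where GL would raise INVALID_OPERATION.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum query_target { TARGET_SAMPLES_PASSED, TARGET_TIME_ELAPSED, TARGET_COUNT };

struct query_object {
    bool active;
};

struct query_state {
    struct query_object *current[TARGET_COUNT]; /* active query per target */
};

/* Returns true on success, false where an error would be raised. */
static bool
begin_query(struct query_state *st, enum query_target target,
            struct query_object *q)
{
    /* Target check: another query is already active on this target. */
    if (st->current[target] != NULL)
        return false;

    /* Object check: this query object is already active, possibly on a
     * different target. */
    if (q->active)
        return false;

    q->active = true;
    st->current[target] = q;
    return true;
}
```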

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-22 17:36:52 -07:00
Eric Anholt
e755c1a36b i965: Actually add support for GL_ANY_SAMPLES_PASSED from GL_ARB_oq2.
v2: Fix mangled sentence in the comment, and make the loop exit early.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
2012-10-22 17:35:59 -07:00
Brian Paul
11070105f0 st/mesa: free TGSI tokens with ureg_free_tokens()
since they're allocated by ureg_get_tokens().

NOTE: This is a candidate for the 8.0 and 9.0 branches.
2012-10-22 15:49:31 -06:00
Brian Paul
bb93439873 st/mesa: replace REALLOC() with realloc()
We should use the latter since we're freeing the memory with free(),
not the gallium FREE() macro.

This fixes a mismatch when using the gallium debug memory functions.

NOTE: This is a candidate for the 9.0 branch.
2012-10-22 15:49:31 -06:00
Brian Paul
140f1d9207 docs: GL_ARB_texture_storage is supported for all gallium drivers 2012-10-22 15:49:31 -06:00
Matt Turner
9a51edfb5a Re-add HAVE_PTHREADS preprocessor macro
Broken in commit 814345f54b.

NOTE: This is a candidate for the 9.0 branch.
2012-10-22 10:52:47 -07:00
Kristian Høgsberg
259fc154f1 gbm: Use the kms dumb ioctls for cursor instead of libkms
We need to create bos suitable for cursor usage that we can map and
write data into.  The kms dumb ioctls are all we need for this, so drop
the dependency on libkms.
2012-10-21 13:00:49 -04:00
Tom Stellard
d2b0338e33 r600g: Remove special handling of PRED_SET* instructions for LLVM 3.2
The 3.2 version of the backend now sets all the correct fields for
PRED_SET* instructions.
2012-10-19 21:25:01 +00:00
Tom Stellard
8030cb0ed4 radeon/llvm: Sort tgsi opcode action initialization
This was done in order to identify and remove duplicate entries.
2012-10-19 21:25:01 +00:00
Tom Stellard
bd8af8a3dc radeon/llvm: Fix lowering TGSI_OPCODE_SSG 2012-10-19 21:25:00 +00:00
Eric Anholt
cae077cd0f i965: Stop flushing the batch on timestamp queries, too.
Given the usecase we have of trying to measure timestamps across individual
draw calls, flushing will totally mess up what people are trying to measure.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-19 11:47:01 -07:00
Eric Anholt
1973845fbd i965: Don't flush the batch immediately on EndQuery.
The theory I had when I wrote the code was that you wanted to minimize latency
on your queries because the app was going to ask soon.  Only, it turns out
that everybody batches up their queries and asks for the results later (often
after the next SwapBuffers!), so this was a pessimization.

Until now, I had no workload where it mattered enough to benchmark.  Recently
I started playing some Minecraft, which uses tons of queries to decide whether
to render chunks of the terrain.  For that app, avoiding the flush in the
query-generation loop improves performance 22.7% +/- 4.7% (n=3) on an apitrace
capture of it (confirmed in game by watching the fps meter found by pressing
F3, 15/16 -> 20/21 fps).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-19 11:47:01 -07:00
Eric Anholt
804469c58d i965/fs: Fix typo in refactor of brw_fs_reg_allocate.cpp.
I'm amazed that my usual warnings check didn't catch this, and that this
passed piglit.
2012-10-19 11:47:01 -07:00
Tapani Pälli
f593acd577 i965/vs: include format argument in debug printf
otherwise some compilers will throw the error
"error: format not a string literal and no format arguments"

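The general shape of such a fix is to pass an explicit "%s" format rather than the message itself. The helper below is a hypothetical illustration using snprintf(), not the code touched by this commit.

```c
#include <assert.h>
#include <stdio.h>

/* Writing snprintf(buf, size, msg) treats msg as a format string, which
 * triggers "format not a string literal and no format arguments" and
 * misbehaves if msg contains '%'.  Passing "%s" explicitly avoids both. */
static int
format_debug_message(char *buf, size_t size, const char *msg)
{
    return snprintf(buf, size, "%s", msg);
}
```

With the explicit format, a message containing a '%' sequence is copied through literally.
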
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-19 10:00:19 -07:00
Michel Dänzer
c2e37b1d2e st/mesa: Fix source miptree level for copying data to finalized miptree.
Fixes WebGL texture mips conformance test, no piglit regressions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44912

NOTE: This is a candidate for the stable branches.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2012-10-19 16:01:14 +02:00
Francisco Jerez
26fc30ef83 clover: No need for clover::is_zero() to be a functor.
Simplify is_zero() somewhat, and as a side effect work around a gcc compiler
bug that causes build failure.

https://bugs.freedesktop.org/show_bug.cgi?id=56140

Reported-by: Dmitry Cherkassov <dcherkassov@gmail.com>
2012-10-19 12:38:44 +02:00
Brian Paul
6551c4ea3c st/mesa: improve the guess_and_alloc_texture() heuristic
If GL_BASE_LEVEL==0 and GL_MAX_LEVEL==0 that's a pretty good hint that
there'll be a single mipmap level in the texture.
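
A rough sketch of such a heuristic follows. guess_num_levels() is a hypothetical function, and the fallback of assuming a full mipmap chain is an assumption made for this example rather than a description of guess_and_alloc_texture().

```c
#include <assert.h>

/* Guess how many mipmap levels to allocate before all images are known. */
static unsigned
guess_num_levels(unsigned width, unsigned height,
                 unsigned base_level, unsigned max_level)
{
    assert(width > 0 && height > 0);

    /* Strong hint that the application wants exactly one level. */
    if (base_level == 0 && max_level == 0)
        return 1;

    /* Assumed fallback: a full chain down to 1x1. */
    unsigned size = width > height ? width : height;
    unsigned levels = 1;
    while (size > 1) {
        size >>= 1;
        levels++;
    }
    return levels;
}
```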

Google Earth sets the texture's state this way before the first glTexImage
call.  This saves a bit of texture memory.
2012-10-18 18:00:50 -06:00
Marek Olšák
e5a9bf5523 gallium: remove unused data pointer from pipe_transfer
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-18 22:54:53 +02:00
Chad Versace
0da72d3502 intel: Skip texsubimage fastpath for more pixel unpack state (v2)
Fixes piglit tests "unpack-teximage2d --pbo=* --format=GL_BGRA" on
Sandybridge+.

The fastpath was checking an incomplete set of pixel unpack state. This
patch adds checks for all the fields of gl_pixelstore_attrib that affect
2D texture uploads.  Also, it begins permitting the case where
GL_UNPACK_ROW_LENGTH is 0.
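
A sketch of what such a conservative check might look like. The struct is a hypothetical subset of the unpack state with illustrative field names, and the exact set of conditions is an assumption, not a copy of the driver's test.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of pixel unpack state. */
struct unpack_state {
    int row_length;          /* 0 means "use the image width" */
    int skip_pixels;
    int skip_rows;
    bool swap_bytes;
    bool lsb_first;
    bool uses_buffer_object; /* source data comes from a PBO */
};

/* Take the fastpath only when no unpack state alters how the source rows
 * are laid out or interpreted. */
static bool
can_use_fastpath(const struct unpack_state *u, int width)
{
    return !u->uses_buffer_object &&
           !u->swap_bytes && !u->lsb_first &&
           u->skip_pixels == 0 && u->skip_rows == 0 &&
           (u->row_length == 0 || u->row_length == width);
}
```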

Ideally, we would just ask a unicorn to JIT this fastpath for us in
a way that safely handles the unpacking state. Until then, it's safer if
only a small set of situations activate the fastpath.

v2: Use _mesa_is_bufferobj(), per Anholt.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-18 08:53:59 -07:00
Matt Turner
6c28174969 Finish _HAVE_FULL_GL removal
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-17 19:30:34 -07:00
Dmitry Cherkasov
b21455f27d configure.ac: Fix LLVM 3.2 r600/radeonsi error message
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Dmitry Cherkasov <Dmitrii.Cherkasov@amd.com>
2012-10-17 17:43:35 -04:00
Brian Paul
0d1ee26489 svga: add svga_screen_cache_dump() debug helper 2012-10-17 15:30:33 -06:00
Kristian Høgsberg
e20a0f14b5 wayland: Drop support for ill-defined, unused wl_egl_pixmap
It doesn't provide the cross-process buffer sharing that a window system
pixmap could otherwise support and we don't have anything left that uses
this type of surface.
2012-10-17 16:32:13 -04:00
Kristian Høgsberg
2b8e90a338 wayland: Remove 0.85 compatibility #ifdefs 2012-10-17 16:32:13 -04:00
Kristian Høgsberg
0229e3ae41 egl/wayland: Update to Wayland 0.99 API
The 0.99.0 Wayland release changes the event API to provide a thread-safe
mechanism for receiving events specific to a subsystem (such as EGL) and
we need to use it in the EGL platform.

The Wayland protocol now also requires a commit request to make changes
take effect; issue that from eglSwapBuffers.
2012-10-17 16:32:13 -04:00
Eric Anholt
be4c0a243e i965/fs: Statically allocate the reg_sets at context initialization.
Now that we've replaced all the variable settings other than reg_width, it's
easy to hang on to this (the expensive part of setting up the allocator).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-17 13:02:16 -07:00
Eric Anholt
8757fa65b8 i965/fs: Allocate registers in the unused parts of the gen7 MRF hack range.
This should also reduce register pressure on gen7+, like the previous commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-17 13:02:06 -07:00
Eric Anholt
a087e9f27f i965/fs: Reduce the interference between payload regs and virtual GRFs.
Improves performance of the Lightsmark penumbra shadows scene by 15.7% +/-
1.0% (n=15), by eliminating register spilling. (tested by smashing the list of
scenes to have all other scenes have 0 duration -- includes additional
rendering of scene description text that normally doesn't appear in that
scene)

v2: Allow allocation of all but g0/g1 of the payload.
v3: Pull count_to_loop_end() out to a helper function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2, recommended v3)
2012-10-17 13:01:57 -07:00
Eric Anholt
551e1cd44f i965/fs: Expose the payload registers to the register allocator.
For now, nothing else can get allocated over them, but that will change.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-17 13:01:54 -07:00
Eric Anholt
6c69df1e0f i965/fs: Remove extra allocation for classes[].
This was to slot in the magic aligned pairs class, but it got moved to a
descriptive name later.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-17 13:01:51 -07:00
Eric Anholt
5d90b98879 i965/fs: Make the register allocation class_sizes[] choice static.
Based on split_virtual_grfs(), we choose the same set every time, so set it in
stone.  This will help us avoid regenerating the somewhat expensive
class/register set setup every compile.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-17 13:01:37 -07:00
Eric Anholt
20ebebac51 i965/vs: Improve live interval calculation.
This is derived from the FS visitor code for the same, but tracks each channel
separately (otherwise, some typical fill-a-channel-at-a-time patterns would
produce excessive live intervals across loops and cause spilling).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48375
          (crash -> failure, can turn into pass by forcing unrolling still)
2012-10-17 12:24:01 -07:00
Eric Anholt
e1a518e2b1 i965/vs: Fix the mlen of scratch read/write messages.
These messages always have m0 = g0 and m1 = offset, and write has m2 = data.
Avoids regression in opt_compute_to_mrf() with a change to scratch writes to
set up the data as an MRF write in the IR.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-17 12:24:00 -07:00
Eric Anholt
c226b7a4d3 i965: Make the cfg reusable from the VS.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-17 12:24:00 -07:00
Eric Anholt
54679fcbca i965: Share the predicate field between FS and VS.
Note that BRW_PREDICATE_NONE is 0 and BRW_PREDICATE_NORMAL is 1, so that's a
lot like the true/false we had in the FS before.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-17 12:24:00 -07:00
Eric Anholt
7abfb67dc4 i965: Rename fs_cfg types to not mention fs.
fs_bblock_link -> bblock_link
fs_bblock -> bblock_t (to avoid conflicting with all the fs_bblock *bblock)
fs_cfg -> cfg_t (to avoid conflicting with all the fs_cfg *cfg)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-17 12:24:00 -07:00
Eric Anholt
5ed57d9543 i965: Move brw_fs_cfg.* to brw_cfg.*.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-17 12:24:00 -07:00
Eric Anholt
24aeeb2fdc i965: Make the FS and VS share a few visitor/instruction fields.
This will let us reuse brw_fs_cfg.cpp from brw_vec4_*.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-17 12:24:00 -07:00
Eric Anholt
338fd85e62 i965/vs: Trim the swizzle of the scratch write temporary.
This fixes confusion by the upcoming live variable analysis which saw e.g. use
of temp.w when only temp.xyz were initialized in the basic block, and
concluded that temp.w must have come from outside of the block (even though it
was never initialized anywhere).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-17 12:24:00 -07:00
Eric Anholt
af911b2819 i965/vs: Do the temporary allocation in emit_scratch_write().
Both callers were doing basically the same thing, just written differently.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-17 12:23:59 -07:00
Eric Anholt
9499f7984e i965/vs: Simplify emit_scratch_write() prototype.
Both callers used (effectively) inst->dst as the argument, so just reference
it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-17 12:23:59 -07:00
Eric Anholt
914d8f9f84 i965/vs: Add a little bit of IR-level debug ability.
This is super basic, but it let me visualize a problem I had with
opt_compute_to_mrf().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-17 12:23:59 -07:00
Adam Jackson
a30d14635d glx: Add GLXBadProfileARB to the error string list
Note: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2012-10-17 15:12:14 -04:00
Owen W. Taylor
1d0c621121 glx: Fix listing of INTEL_swap_event in glXQueryExtensionsString()
Due to a string mismatch, INTEL_swap_event wasn't listed among GLX
extensions for the connection, even when present on both client and
server. That is, glXQueryServerString and glXGetClientString reported the
extension, but glXQueryExtensionsString did not.

Note: This is a candidate for the stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56057
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-17 10:16:23 -07:00
José Fonseca
aa2067c757 gallivm: Hide AVX support when requested by LP_NATIVE_VECTOR_WIDTH or unsupported by LLVM.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-10-17 18:07:43 +01:00
Will Schmidt
54821c0e99 gallivm: Use mcjit for ppc_64 architecture
Per commentary and direction in the LLVM community, support for ppc64 is
going into MCJIT rather than the old JIT.  There is no existing support
in prior llvm versions, so no need to specify LLVM version numbers.

Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-10-17 18:07:43 +01:00
Brian Paul
32638737c5 st/mesa: silence MSVC signed/unsigned comparison warning
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-17 10:13:30 -06:00
Brian Paul
ead664e506 st/mesa: silence MSVC double/unsigned assignment warning
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-17 10:13:30 -06:00
Brian Paul
198d1bdb5f tgsi: silence MSVC signed/unsigned comparison warnings
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-17 10:13:30 -06:00
Brian Paul
34a5fd2a39 util: fix MSVC signed/unsigned comparison warning in u_upload_mgr.c code
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-17 10:13:30 -06:00
Brian Paul
ba7bfdeff2 util: fix MSVC signed/unsigned comparison warning in u_vbuf.c code
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-17 10:13:30 -06:00
Brian Paul
a0785544e3 util: fix MSVC double/float conversion warning in u_format_r11g11b10f.h
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-17 10:13:30 -06:00
Brian Paul
f031910486 draw: silence MSVC signed/unsigned comparison warnings
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-17 10:13:30 -06:00
Brian Paul
a115a29153 util/blitter: silence assorted MSVC warnings
Fix signed/unsigned comparison warnings and float/int assignment warnings.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-17 10:13:30 -06:00
Brian Paul
7abd136e91 wmesa: remove old, unused span code 2012-10-17 10:13:30 -06:00
José Fonseca
879894552b scons: Fix graw-xlib lib order.
Avoids "undefined symbol: XShmCreateImage" error.
2012-10-17 15:28:26 +01:00
José Fonseca
ea2978b11c tgsi: Add support to parse IMM[x] too.
Thanks to Brian for pointing this out.
2012-10-17 15:27:26 +01:00
José Fonseca
2ab6e67d90 Revert "gallivm: Don't use llvm.x86.avx.max/min.ps.256 inadvertently."
This reverts commit bf2edc776b.
2012-10-17 15:04:20 +01:00
Vinson Lee
53e36d333c build: Build on Cygwin with gnu99 instead of c99.
The GCC c99 standard on Cygwin sets __STRICT_ANSI__ and symbols such as
strdup are not available.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-16 23:28:22 -07:00
Matt Turner
0199ff7fe3 es2api: Add GL ES 3 headers 2012-10-16 19:31:22 -07:00
Matt Turner
c9155c9317 glapi: Add es2="3.0" attributes to XML.
Note that we are missing the ARB_internalformat_query extension, which
provides the glGetInternalformativ function needed by GL ES 3.0.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-10-16 19:31:22 -07:00
Brian Paul
1284543a44 svga: whitespace fixes, remove useless comments 2012-10-16 18:11:58 -06:00
Brian Paul
0087f5ce51 svga: silence MSVC warning about negating an unsigned value 2012-10-16 17:55:39 -06:00
Brian Paul
ffbac58746 svga: silence MSVC double/float assignment warnings 2012-10-16 17:55:39 -06:00
Brian Paul
ce3faa993c svga: fix MSVC double/float parameter warning 2012-10-16 17:55:39 -06:00
Brian Paul
d21e6c87c0 svga: silence MSVC float/int assignment warnings 2012-10-16 17:55:39 -06:00
Brian Paul
200291e087 svga: silence MSVC double/float assignment warnings 2012-10-16 17:55:39 -06:00
Brian Paul
25cd2c2a8a svga: silence some MSVC signed/unsigned comparison warnings 2012-10-16 17:55:39 -06:00
Ian Romanick
4d0458dc6e mesa/tests: Add ES1.1 dispatch table sanity test
This test actually depends on FEATURE_ES1 because
_mesa_create_exec_table_es1 doesn't exist without it.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-10-16 14:57:20 -07:00
Ian Romanick
95b76eab71 mesa/tests: Compile ES2 test regardless of FEATURE_ES2 setting
The relevant ES2 code is always in Mesa.  Always building the tests
ensures that things aren't accidentally broken when people don't build
with --enable-es2.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-10-16 14:57:20 -07:00
Brian Paul
c50d6a2abc mesa: remove FEATURE_ES1 tests in enable.c code
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 14:57:20 -07:00
Brian Paul
1633fa1627 mesa: remove FEATURE_ES test in _mesa_get_compressed_formats()
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 14:57:20 -07:00
Brian Paul
4936aadcd1 mesa: remove FEATURE_ES test in _mesa_is_compressed_format()
The code already has a runtime ES1 test.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 14:57:20 -07:00
Brian Paul
920f331cf1 mesa: remove FEATURE_GL test from updated_drawbuffers()
There's already a runtime test for full OpenGL.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 14:57:20 -07:00
Brian Paul
99940eef48 mesa: remove #if _HAVE_FULL_GL checks
This is basically more of the "remove FEATURE_x" clean-up.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 14:57:20 -07:00
Brian Paul
198fa6452b mesa: remove ASSERT_NO_FEATURE macro
Was only used in one place.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 14:57:20 -07:00
Eric Anholt
7139ab80ca i965: Fix rendering to small mipmaps of depth/stencil buffers using a temp mt.
Fixes 51 piglit tests (fbo-clear-formats, and most of the remaining failures
in depthstencil).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-16 13:13:44 -07:00
Eric Anholt
5c8dd6cf79 i965: Share the draw x/y offset masking code between main/blorp and all gens.
This code is twisty, and the comment before most of the blocks was actually
giving me the opposite impression from its intention: We want to apply as much
of our offset as possible through coarse tile-aligned adjustment, since we can
do so independently per buffer, and apply the minimum we can through
fine-grained drawing offset x/y, since it has to agree between all buffers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-16 13:13:44 -07:00
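
The split described in the commit above can be illustrated with a small Python sketch. This is not the driver's C code: the function name `split_offset` and the tile size of 128 are assumptions made for this example, and the real code handles x and y with per-buffer tile masks.

```python
def split_offset(offset, tile_size):
    """Split an offset into (tile-aligned part, fine-grained remainder).

    tile_size is a hypothetical tile dimension and must be a power of two.
    """
    if tile_size <= 0 or tile_size & (tile_size - 1):
        raise ValueError("tile_size must be a positive power of two")
    mask = tile_size - 1
    coarse = offset & ~mask   # applied per buffer via tile-aligned adjustment
    fine = offset & mask      # must agree between all buffers
    return coarse, fine


coarse, fine = split_offset(300, 128)
print(coarse, fine)
```

The coarse part takes as much of the offset as alignment allows, so only the small remainder has to go through the shared drawing offset.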
Eric Anholt
ddfa346e4a i965: Make a helper function for the renderbuffer temporary mt workaround.
We now have a case of wanting to do that on gen6+ as well, so make this logic
usable elsewhere.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-16 13:13:44 -07:00
Eric Anholt
4bec2e31bf i965: Warn on a couple of workarounds in blending.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-16 13:13:44 -07:00
Eric Anholt
1fe71848b6 intel: Add a macro for printing a debug warning once.
There are a number of places where some obscure piece of the code is not
currently worth fixing, and we have some workaround behavior available.  It's
nicer for users to do some lame workaround than to just assert, but without
asserts we never knew when the workaround was at fault.

This should give us a nice compromise: Execute the workaround, but mention
that the obscure workaround was hit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-16 13:13:44 -07:00
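
The warn-once idea from the commit above can be sketched in Python as follows. The actual change is a C macro in the intel driver; `warn_once`, the key string, and `blend_with_workaround` are hypothetical names used only for illustration.

```python
import sys

_already_warned = set()


def warn_once(key, message, stream=None):
    """Print message the first time key is seen; return True if it printed."""
    if stream is None:
        stream = sys.stderr
    if key in _already_warned:
        return False
    _already_warned.add(key)
    print("WARNING: " + message, file=stream)
    return True


def blend_with_workaround(value):
    # Execute the workaround, but mention (once) that it was hit.
    warn_once("blend-workaround", "obscure blending workaround hit")
    return value  # stand-in for whatever the workaround computes
```

Keying on a per-call-site string is one possible design; it keeps a hot path from flooding the log while still leaving a trace that the workaround ran.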
Andreas Boll
85067d4bab docs: add note about removal of GL_NV_fragment_program 2012-10-16 21:24:04 +02:00
Paul Berry
381186dbf8 glapi: Delete gles_api.py, since it is no longer used.
Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 12:03:56 -07:00
Paul Berry
c8ad6ef1c6 mapi_abi: Use GLES information from XML rather than gles_api.py.
Note: mapi_abi can consume API information from either XML or a .csv
file.  A side effect of this change is that the ES1 and ES2 API
printers can only be used with XML input now.  That's ok, since the
.csv input format is only used for the OpenVG API.

Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 12:03:56 -07:00
Paul Berry
137f8ef225 mapi_abi: Override 'hidden' and 'handcode' attributes using polymorphism.
Previously, the ES1, ES2, and shared GLAPI printers passed a list of
function names to the base class constructor, which was used by the
_override_for_api() function to loop over all the API functions and
adjust their 'hidden' and 'handcode' attributes as appropriate for the
API flavour being code-generated.

This patch lifts the loop from _override_for_api() into its caller,
and makes it into a polymorphic function, so that the derived classes
can customize its behaviour directly.  In a future patch, this will
allow us to override the 'hidden' and 'handcode' attributes based on
information from the XML rather than a list of functions.

Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 12:03:56 -07:00
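
A minimal sketch of the refactoring pattern described above, assuming hypothetical class and method names (`Entry`, `ABIPrinter`, `ES2Printer`, `override_entry`); it is not the code in mapi_abi.py. The loop lives in the base class and each derived printer overrides a per-entry hook. Treating "not in this API" as "hidden" is also an assumption of the example.

```python
class Entry(object):
    def __init__(self, name):
        self.name = name
        self.hidden = False
        self.handcode = False


class ABIPrinter(object):
    def override_for_api(self, entries):
        # The loop is shared; the per-entry decision is polymorphic.
        for ent in entries:
            self.override_entry(ent)

    def override_entry(self, ent):
        """Hook for derived classes; the base class changes nothing."""


class ES2Printer(ABIPrinter):
    def __init__(self, es2_functions):
        self.es2_functions = set(es2_functions)

    def override_entry(self, ent):
        # Assumption for this sketch: anything outside the API is hidden.
        ent.hidden = ent.name not in self.es2_functions
```

With this shape, a later change can replace the function-name list with a lookup in the XML without touching the loop.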
Paul Berry
4f6fc905c6 mapi_abi: Get rid of unnecessary copy.
Previously, _get_api_entries() would make a deep copy of each element
in the entries table before modifying the 'hidden' and 'handcode'
attributes.  This was unnecessary, since the entries aren't used again
after this function.  Removing the copy simplifies the code, because
it is no longer necessary to adjust the alias pointers to point to the
copied entries.

Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 12:03:56 -07:00
Paul Berry
77ed171f27 mapi_abi: Remove sanity check that all GLES functions are present.
Currently mapi_abi.py uses hardcoded lists of function names (in
gles_api.py) to determine which functions need to be included in the
GLES 1 or GLES 2 API.  This patch removes a sanity check which
verified that all GLES functions listed in the hardcoded lists were
actually present in the XML.

Later patches in this series will modify mapi_abi.py to determine
which functions need to be included in the GLES 1 or GLES 2 API based
directly on the XML.  Once that is done, the sanity check will be
redundant.  Removing the sanity check now will simplify the patches to
come.

Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 12:03:56 -07:00
Paul Berry
155eff56b1 mapi_abi: Collect all imports at top of file.
Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 12:03:55 -07:00
Paul Berry
e378cd77bc glapi: Use GLES information from XML rather than gles_api.py.
Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 12:03:55 -07:00
Paul Berry
cd4ce16c45 glapi: Read GLES information from XML.
Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 12:03:55 -07:00
Paul Berry
81a7f50781 glapi: Add es1 and es2 attributes to XML.
Currently, the set of functions which exist in GLES1 or GLES2 is
determined by hardcoded lists of function names in gles_api.py.  This
patch encodes that information into the XML files using new
attributes, es1 and es2.

The es1 attribute denotes the first version of GLES 1 in which the
function exists (e.g. es1="1.1" means the function exists in GLES 1.1
but not GLES 1.0).  "none" (the default) means the function is not
available in any version of GLES 1.

The es2 attribute denotes the first version of GLES 2/3 in which the
function exists (e.g. es2="2.0" means the function exists in both GLES
2.0 and GLES 3.0).  "none" (the default) means the function is not
available in any version of GLES 2 or GLES 3.

Note that since GLES 3 is a strict superset of GLES 2, there is no
need for a separate attribute for it; instead, 'es2="3.0"' should be
used to denote functions that are present in GLES 3 but not GLES 2.

This patch only adds information about GLES versions 1.0, 1.1, and
2.0.

Later patches will modify the python code generation scripts to use
this information rather than the hardcoded lists in gles_api.py.

Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 12:03:55 -07:00
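
To illustrate the "first version in which the function exists" semantics described above, here is a hedged Python sketch. The sample XML, the function names in it, and the helper `functions_for` are made up for this example, and placing the attributes on a `function` element is an assumption; only the es1/es2 attribute meaning comes from the commit message.

```python
import xml.etree.ElementTree as ET

# Invented sample data, not taken from Mesa's XML files.
SAMPLE = """
<OpenGLAPI>
  <function name="Foo" es1="1.0" es2="2.0"/>
  <function name="Bar" es1="1.1"/>
  <function name="Baz" es2="3.0"/>
  <function name="Qux"/>
</OpenGLAPI>
"""


def parse_version(text):
    return tuple(int(part) for part in text.split("."))


def functions_for(root, attr, version):
    """Names of functions whose first version under attr is <= version."""
    wanted = parse_version(version)
    names = []
    for fn in root.iter("function"):
        first = fn.get(attr, "none")  # "none" is the documented default
        if first != "none" and parse_version(first) <= wanted:
            names.append(fn.get("name"))
    return names


root = ET.fromstring(SAMPLE)
print(functions_for(root, "es2", "3.0"))
```

Because GLES 3 is a superset of GLES 2, asking for es2 version "3.0" returns both the "2.0" and the "3.0" entries, which is why no separate es3 attribute is needed.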
Paul Berry
7dc052b12b glapi: use new-style Python classes.
An unfortunate quirk of Python 2 is that there are two types of
classes: "classic" classes (which are backward compatible with some
unfortunate design decisions made early in Python's history), and
"new-style" classes.  Classic classes have a number of limitations
(for example they don't support super()) and are unavailable in Python
3.  There's really no reason to use classic classes, except in
unmaintained legacy code.  For more information see
http://www.python.org/download/releases/2.2.3/descrintro/.

This patch upgrades the Python code in src/mapi/glapi/gen to use
exclusively new-style classes.

Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 12:03:55 -07:00
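
The difference the commit above refers to comes down to inheriting from `object`. A small sketch with hypothetical class names; written this way it should behave the same under Python 2 and Python 3.

```python
class Base(object):  # inheriting from object makes this new-style in Python 2
    def describe(self):
        return "base"


class Derived(Base):
    def describe(self):
        # super() only works on new-style classes under Python 2.
        return "derived+" + super(Derived, self).describe()


print(Derived().describe())
```

Under Python 2, dropping `(object)` from `Base` would make the `super()` call raise a TypeError; in Python 3 every class is new-style, so the explicit base is harmless.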
Kenneth Graunke
41954107c0 i965/fs: Fix segfault when using INTEL_DEBUG=perf with non-GLSL.
Now that ARB programs and fixed function are routed through the new
backend, shader might be NULL.  Don't do INTEL_DEBUG=perf support in
that case, since it relies on shader->compiled_once.

Since INTEL_DEBUG=perf wasn't previously supported, this maintains the
status quo.  It might be nice to support it someday, however.

This could be moved to brw_shader_program instead of brw_shader, but
it appears even prog can be NULL in that case.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-16 12:02:01 -07:00
Kenneth Graunke
56705cd36b mesa: Don't flatten IF statements by default.
MaxIfDepth of 0 means "flatten all the time", not "never flatten".
Flattening is only desirable on hardware that can't support control flow;
software rasterization and most hardware drivers want the new default.

This alters behavior for swrast as well as i915.  Tested on i915.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-16 12:01:39 -07:00
Kenneth Graunke
b2e0293213 mesa: Remove PROGRAM_WRITE_ONLY register type.
More dead code.  I'm not sure what it was for.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:35:25 -07:00
Kenneth Graunke
01d2bd34f4 mesa: Remove dead _mesa_num_parameters_of_type() function.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:35:24 -07:00
Kenneth Graunke
1366db2ef6 mesa: Remove dead _mesa_add_attribute() function.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:35:24 -07:00
Kenneth Graunke
d0021cb0fb mesa: Remove remnants of PROGRAM_VARYING.
The previous patch removed the producer of things in this file.
Since there aren't any, we can remove it.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:35:24 -07:00
Kenneth Graunke
eda4a4ae81 mesa: Remove dead _mesa_add_varying() function.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:35:24 -07:00
Kenneth Graunke
f7cfe3fc70 mesa: Remove dead program_parameter::Flags field.
All flags are now gone, so we can stop storing and passing this around.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:35:24 -07:00
Kenneth Graunke
5bb6f15f79 st/mesa: Remove the PROG_PARAM_BIT_CYL_WRAP flag. [v2]
Nobody ever set the flag, which makes this dead code.

v2: Leave the ureg_DECL_fs_input_cyl function in place, even though it's
    unused, since VMware uses it for their internal projects.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:35:24 -07:00
Kenneth Graunke
4b13252bba mesa: Remove GLSL-related PROG_PARAM_BIT flags.
GLSL doesn't use the program code anymore.  Accordingly, there were no
consumers of these flags, so there's no need to define them.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:35:24 -07:00
Kenneth Graunke
8d418d1616 mesa: Remove support for named parameters.
These were only part of NV_fragment_program, so we can kill them.

The fact that PROGRAM_NAMED_PARAM appears in r200_vertprog.c is rather
comedic, but also demonstrates that people just spam the various types
of parameters everywhere because they're confusing.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:35:24 -07:00
Kenneth Graunke
d67e52b027 driconf: Remove force enable for NV_vertex_program.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:35:23 -07:00
Kenneth Graunke
58c466519d mesa: Remove yet more remnants of NV_fragment_program.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:35:23 -07:00
Kenneth Graunke
e5f03f23a0 mesa: Remove some miscellaneous NV program stuff from arbprogram.c.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:35:23 -07:00
Kenneth Graunke
d213d27f84 mesa: Simplify _mesa_BindProgram() by removing NV program remnants.
Without NV programs, there's no need for the compatible_program_targets
function.  A simple (non-)equality check will do.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:35:23 -07:00
Kenneth Graunke
2f350f360b mesa: Remove get and enable bits for NV_fragment_program.
Also remove a leftover remnant from NV_vertex_program.

v2: Update for Imre's get changes.

Reviewed-by: Brian Paul <brianp@vmware.com> [v1]
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
2012-10-16 11:35:23 -07:00
Kenneth Graunke
d711717b4a mesa: Remove prog_print support for NV programs.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:24:34 -07:00
Kenneth Graunke
2254569bda mesa: Remove support for parsing NV fragment programs.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:24:34 -07:00
Kenneth Graunke
9dc2c28983 mesa: Remove the gl_program::Resident flag.
It apparently was only used for NV programs.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:24:34 -07:00
Kenneth Graunke
7742952f7e mesa: Remove the EmitNVTempInitialization shader compiler option.
Nobody uses it anymore.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:24:34 -07:00
Kenneth Graunke
7487b16128 mesa: Remove the NV program API functions.
These are all unused now.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:24:34 -07:00
Kenneth Graunke
16d8161962 mesa: Switch to the other glGetVertexAttribPointervARB implementation.
Previously, Mesa used nvprogram.c's _mesa_GetVertexAttribPointervNV()
function to implement this GL call.  There was also a second
implementation in varray.c, _mesa_GetVertexAttribPointervARB(), which
was entirely unused.

The varray.c variant has an additional assertion and checks the index
against ctx->Const.VertexProgram.MaxAttribs rather than
MAX_VERTEX_GENERIC_ATTRIBS.  However, that variable is defined to the
same value, so it should be fine.

This will allow us to kill the duplicate function.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:24:34 -07:00
Kenneth Graunke
070ba30c36 mesa: Remove some shared NV_vp/fp functions from the dispatch table.
Also kill the resulting dead code for display list handling.

v2: Also kill dlist's OPCODE_REQUEST_RESIDENT_PROGRAMS_NV.

Reviewed-by: Brian Paul <brianp@vmware.com> [v1]
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:24:34 -07:00
Kenneth Graunke
ff1943dec9 mesa: Unhook NV_fragment_program API from the dispatch table.
The NamedParameter functions were introduced in NV_fragment_program, and
are not shared with any other extensions.

Although this patch appears to remove the LocalParameter functions, it
does not: the ARB_fragment_program section also set them up.  Now we
simply initialize them a single time.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:24:34 -07:00
Kenneth Graunke
492feddb03 swrast: Remove support for the NV_fragment_program extension.
No hardware drivers support this, it's obsolete, and unlikely to be
useful without NV_vertex_program, which is gone now.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 11:24:33 -07:00
Alex Deucher
ed8d87c6a6 radeonsi: add some new SI pci ids
Note: this is a candidate for the stable branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-10-16 13:11:38 -04:00
Tom Stellard
b1e7bd7690 r600g: Fix segfault in r600_compute_global_transfer_map()
This segfault was caused by commit
369e468889; however, it is my fault for not
testing the patch while it was on the list.
2012-10-16 14:39:16 +00:00
Tom Stellard
a73c5d3f9d r600g: Fix build with --enable-opencl 2012-10-16 14:39:15 +00:00
Fredrik Höglund
762d9ace6b mesa/es: Enable GL_EXT_map_buffer_range
This extension is functionally the same as GL_ARB_map_buffer_range.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-16 13:21:41 +02:00
Kristian Høgsberg
017c6fb324 gbm: Reject buffers that are not wl_drm buffers in gbm_bo_import()
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2012-10-15 22:33:04 -04:00
Fredrik Höglund
0978707917 glx: Fix a regression in the new XCB code
dri2DrawableGetMSC(), dri2WaitForMSC() and dri2WaitForSBC() were
inadvertently changed to return 0 on success.  This resulted in the callers
returning an error to the client.

Restore the previous behavior and also check that the reply pointers are
valid before accessing them.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-16 02:24:42 +02:00
Brian Paul
df3721fd2e st/mesa: remove OPCODE_BRA switch case 2012-10-15 13:17:53 -06:00
Eric Anholt
59c4420fac docs: Add note about removal of GL_NV_vertex_program. 2012-10-15 11:53:24 -07:00
Eric Anholt
bc74c4bbaf mesa: Remove defines for NV_vertex_program limits.
Note that _mesa_GetVertexAttribPointervNV() is actually
glGetVertexAttribPointerv(), which operates on the generic attributes.  The
geometry shader initialization looks like arbitrary cruft to me.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-15 11:53:24 -07:00
Eric Anholt
09c006da9f mesa: Fix comments for NV_vp code that's now only used by other extensions.
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-15 11:53:24 -07:00
Eric Anholt
37fc983d03 mesa: Add notes about remaining NV_vertex_program code.
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-15 11:53:24 -07:00
Eric Anholt
8b2fe73897 mesa: Remove miscellaneous remains of NV_vertex_program.
v2: Rebase on top of get.c changes.

Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2012-10-15 11:53:24 -07:00
Eric Anholt
cb9a1bf316 mesa: Remove API specific to GL_NV_vertex_program's aliased attribs.
v2: Rebase on top of get.c changes.

Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2012-10-15 11:53:24 -07:00
Eric Anholt
8058a70763 mesa: Remove prog_instruction.h field for never-supported NV_vertex_program3.
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-15 11:53:24 -07:00
Eric Anholt
cc763f0f3f mesa: Remove support for GL_VERTEX_STATE_PROGRAMs and their execution.
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-15 11:53:24 -07:00
Eric Anholt
363643f540 mesa: Remove NV_vertex_program-specific parameters support.
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-15 11:53:24 -07:00
Eric Anholt
c0120c2509 mesa: Remove support for NV_vertex_program's attribute evaluation.
Note that the MAP2 getters were missing from the implementation.  Neat.

v2: Rebase on top of get.c changes.

Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2012-10-15 11:53:23 -07:00
Eric Anholt
4f9d351ef1 mesa: Remove support for NV_vertex_program's special attributes aliasing
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-15 11:53:23 -07:00
Eric Anholt
6a20f0e561 mesa: Fix NV_fragment_program's display list opcode for RequestResident.
While nuking NV_vertex_program, I noticed that one of my opcodes was used in a
strange place.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-15 11:53:23 -07:00
Eric Anholt
6ab9c04769 mesa: Remove support for NV_vertex_program's tracked matrices.
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-15 11:53:23 -07:00
Eric Anholt
bcfd51f8c4 mesa: Remove Mesa IR opcodes that existed only for NV_vertex_program.
v2: Remove dead positive() function, caught by Matt.

Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
2012-10-15 11:53:23 -07:00
Eric Anholt
422566e1c7 mesa: Remove support for parsing NV vertex programs.
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-15 11:53:23 -07:00
Eric Anholt
cff1b1df4b swrast: Remove support for GL_NV_vertex_program.
It's not supported in any hardware drivers, and doesn't appear to be useful on
Linux.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-15 11:53:23 -07:00
Eric Anholt
a1998673ba gallium: Remove #if 0-ed enable of NV_vp. It's going away.
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-15 11:53:22 -07:00
Eric Anholt
63c233cf08 r200: Remove support for software-only NV_vertex_program.
It wasn't supported in hardware, and the comments in the code indicated no
known uses (similar to my experience on Intel) and a possible intent to remove
it.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-15 11:53:22 -07:00
Eric Anholt
af90c8c511 intel: Remove NV_vertex_program support.
We were holding on to this code because we were aware that NWN 1 had some
support for vertex programs -- no other linux programs I've come across would
use it (since other software also has ARB_vp or GLSL support).  Only, it turns
out that NWN doesn't even give us any vertex programs.  Given that we have
known issues where the extension has never been fully supported, just give up
on it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46795
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-15 11:53:22 -07:00
Eric Anholt
1a8a0418f2 i965/vp: Remove more code for unused opcodes.
These don't appear in ARB_vp or NV_vp and I missed that fact on the first
pass of removing dead opcodes.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-15 11:53:22 -07:00
Andreas Boll
c5adfb21b3 r600g: drop useless switch statement
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-10-15 20:34:02 +02:00
Andreas Boll
0ce21660c2 gallium/docs: update some distro information
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-15 16:11:49 +02:00
Marek Olšák
023dae71ef r600g: emit the border color only when it's needed
That depends on the texture wrap modes and filtering.
2012-10-15 16:04:09 +02:00
Marek Olšák
33dda8f4fb r600g: cleanup create_sampler_state functions
- stopped using util_color
- reformatted to occupy fewer characters per line.
- used memcpy for the border color
- used pipe_color_union in the state structure
2012-10-15 16:04:09 +02:00
Marek Olšák
2bbd307fa6 st/mesa: fix integer texture border color for some formats (v2)
And the clear color too, though that may be an issue only with GL_RGB if it's
actually RGBA in the driver.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>

v2: The types of st_translate_color parameters were changed to gl_color_union
    and pipe_color_union as per Brian's comment.
2012-10-15 16:04:09 +02:00
Brian Paul
1ec12c53ba util: added debug_print_transfer_flags() function 2012-10-15 07:49:14 -06:00
Abdiel Janulgue
bcb10ca172 mesa: Fix a crash in update_texture_state() for external texture type
NOTE: This is a candidate for the stable branch.

Signed-off-by: Abdiel <abdiel.janulgue@intel.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-10-15 07:49:14 -06:00
Brian Paul
88ecd0ddb9 svga: remove needless debug-mode linked list code
LIST_DEL() always sets the prev/next pointers to NULL now.
2012-10-15 07:49:14 -06:00
Chris Fester
3fffe8f7b7 util: null-out the node's prev/next pointers in list_del()
Note: This is a candidate for the 9.0 branch.
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-10-15 07:49:14 -06:00
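
The effect of the list_del() change above can be modelled with a short Python sketch of a circular doubly-linked list. The real code is C in Mesa's util headers; `Node`, `list_add`, and this `list_del` are stand-ins written for this example, with `None` playing the role of NULL.

```python
class Node(object):
    def __init__(self, value=None):
        self.value = value
        self.prev = self  # an empty list head points at itself
        self.next = self


def list_add(node, head):
    """Insert node right after head."""
    node.prev = head
    node.next = head.next
    head.next.prev = node
    head.next = node


def list_del(node):
    """Unlink node and clear its own pointers."""
    node.prev.next = node.next
    node.next.prev = node.prev
    node.prev = None
    node.next = None
```

Clearing the removed node's pointers means a later accidental use of that node fails immediately instead of walking into a list it no longer belongs to.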
Daniel Stone
4004620d34 build: Don't fail if libX11 isn't installed
configure.ac would previously refuse to complete if libX11 wasn't
installed, even if we'd disabled GLX and weren't building an X11 EGL
platform.  Make the check simply set the no_x variable that's used (but
never set) immediately below for what looks like this very case.

Signed-off-by: Daniel Stone <daniel@fooishbar.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dan Nicholson <dbn.lists@gmail.com>
2012-10-14 20:41:35 -07:00
Christoph Bumiller
43e6c51aed nouveau: fix offset in nouveau_buffer_transfer_map
Before 369e468889, the transfer was
initialized before the call to map and had the correct value already.
2012-10-14 18:58:04 +02:00
Matt Turner
fb85b204d3 u_format_s3tc.c: Don't call getenv() twice
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-12 12:26:03 -07:00
Tapani Pälli
60565b564b android: generate matching remap_helper to dispatch table
Commit a010215463 removed the ES2-specific dispatch table and
remap_helper.  Since we now use dispatch.h, which is generated from
gl_and_es_API.xml, we need to generate a matching remap_helper from the
same XML.

Note: This is a candidate for the 9.0 branch.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-12 11:42:09 -07:00
José Fonseca
bf2edc776b gallivm: Don't use llvm.x86.avx.max/min.ps.256 inadvertently.
Could happen when CPU supports AVX, but LLVM doesn't.
2012-10-12 18:52:28 +01:00
José Fonseca
9ccf91f9ef tgsi: Dump register number when dumping immediates.
For example:

VERT
DCL IN[0]
DCL OUT[0], POSITION
DCL OUT[1], GENERIC[12]
DCL CONST[0..4]
DCL TEMP[0], LOCAL
DCL TEMP[1], LOCAL
IMM[0] UINT32 {4294967295, 0, 0, 0}
IMM[1] FLT32 {    0.0000,     1.0000,     0.0000,     0.0000}
  0: SEQ TEMP[0].x, CONST[3].xxxx, IMM[0].xxxx
  1: F2I TEMP[0].x, -TEMP[0]
  2: SEQ TEMP[1].x, CONST[4].xxxx, IMM[0].xxxx
  3: F2I TEMP[1].x, -TEMP[1]
  4: AND TEMP[0].x, TEMP[0].xxxx, TEMP[1].xxxx
  5: IF TEMP[0].xxxx :0
  6:   MOV TEMP[0], IMM[1].xyxy
  7: ELSE :0
  8:   MOV TEMP[0], IMM[1].yxxy
  9: ENDIF
 10: MOV OUT[1], TEMP[0]
 11: MOV OUT[0], IN[0]
 12: END

instead of

VERT
DCL IN[0]
DCL OUT[0], POSITION
DCL OUT[1], GENERIC[12]
DCL CONST[0..4]
DCL TEMP[0], LOCAL
DCL TEMP[1], LOCAL
IMM UINT32 {4294967295, 0, 0, 0}
IMM FLT32 {    0.0000,     1.0000,     0.0000,     0.0000}
  0: SEQ TEMP[0].x, CONST[3].xxxx, IMM[0].xxxx
  1: F2I TEMP[0].x, -TEMP[0]
  2: SEQ TEMP[1].x, CONST[4].xxxx, IMM[0].xxxx
  3: F2I TEMP[1].x, -TEMP[1]
  4: AND TEMP[0].x, TEMP[0].xxxx, TEMP[1].xxxx
  5: IF TEMP[0].xxxx :0
  6:   MOV TEMP[0], IMM[1].xyxy
  7: ELSE :0
  8:   MOV TEMP[0], IMM[1].yxxy
  9: ENDIF
 10: MOV OUT[1], TEMP[0]
 11: MOV OUT[0], IN[0]
 12: END
2012-10-12 18:52:14 +01:00
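
Both immediate forms shown above (with and without the register number) can be recognised with one pattern. The following Python sketch is only an illustration of that idea; it is not the TGSI text parser, and `IMM_RE` and `parse_imm` are hypothetical names.

```python
import re

# Optional "[n]" after IMM, then a type name, then a brace-enclosed value list.
IMM_RE = re.compile(r"^IMM(?:\[(\d+)\])?\s+(\w+)\s+\{([^}]*)\}$")


def parse_imm(line):
    """Return (index or None, type name, list of value strings)."""
    m = IMM_RE.match(line.strip())
    if m is None:
        raise ValueError("not an IMM declaration: %r" % line)
    index = int(m.group(1)) if m.group(1) is not None else None
    values = [v.strip() for v in m.group(3).split(",")]
    return index, m.group(2), values


print(parse_imm("IMM[0] UINT32 {4294967295, 0, 0, 0}"))
print(parse_imm("IMM FLT32 {    0.0000,     1.0000,     0.0000,     0.0000}"))
```

Returning `None` for a missing index lets a caller fall back to counting declarations in order, which is presumably how the unnumbered form has to be interpreted.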
Roland Scheidegger
d366520e85 gallivm: fix rsqrt failures
lp_build_rsqrt initially did not do any Newton-Raphson step. This meant that
precision was only ~11 bits, but this handled both input 0.0 and +infinity
correctly. It did not however handle input 1.0 accurately, and denormals
always generated an infinity result.
Doing a Newton-Raphson step increased precision significantly (but notably
input 1.0 still doesn't give output 1.0); however, this fails for inputs
0.0 and infinity (both result in NaNs).
Try to fix this up by using cmp/select but since this is all quite fishy
(and still doesn't handle denormals) disable for now. Note that even with
workarounds it should still have been faster since the fallback uses sqrt/div
(which both use the usually unpipelined and slow divider hw).
Also add some more test values to lp_test_arit and test lp_build_rcp() too while
there.

v2: based on José's feedback, avoid hacky infinity definition which doesn't
work with msvc (unfortunately using INFINITY won't cut it either on non-c99
compilers) in lp_build_rsqrt, and while here fix up the input infinity case
too (it's disabled anyway). Only test infinity input case if we have c99,
and use float cast for calculating reference rsqrt value so we really get
what we expect.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-10-12 18:51:18 +01:00
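
Why the refinement step produces NaNs for inputs 0.0 and infinity can be seen in a few lines of Python. This sketch assumes the standard Newton-Raphson update for reciprocal square root, y' = y * (1.5 - 0.5 * x * y * y); the commit does not spell out the exact formula, and `rsqrt_refine` and `rsqrt_select` are hypothetical names, not gallivm functions.

```python
import math


def rsqrt_refine(x, estimate):
    """One Newton-Raphson step on an approximate 1/sqrt(x)."""
    return estimate * (1.5 - 0.5 * x * estimate * estimate)


def rsqrt_select(x, estimate):
    """Keep the raw estimate where the refinement would be invalid,
    loosely mimicking a compare/select fixup."""
    refined = rsqrt_refine(x, estimate)
    if x == 0.0 or math.isinf(x):
        return estimate
    return refined


inf = float("inf")
print(rsqrt_refine(4.0, 0.49))   # closer to 0.5 than the rough estimate
print(rsqrt_refine(0.0, inf))    # 0 * inf inside the step yields NaN
print(rsqrt_refine(inf, 0.0))    # inf * 0 inside the step yields NaN
```

In both failing cases the product x * y * y multiplies zero by infinity, so the otherwise correct raw estimate (infinity or 0.0) is destroyed by the refinement unless it is selected back in afterwards.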
José Fonseca
2a4105cbc0 galahad: galahad_context_blit
must unwrap.
2012-10-12 18:38:05 +01:00
Marek Olšák
555c8d500a r600g: move shader structures into r600_shader.h 2012-10-12 19:00:30 +02:00
José Fonseca
23c6b8f2ed mesa/st: Fix assertions.
Can't access ptDraw before it is written.
2012-10-12 17:04:34 +01:00
Andreas Boll
c3dd8c358c doxygen: add gbm to .gitignore 2012-10-12 17:45:49 +02:00
Marek Olšák
7997b3c97c r600g: implement MSAA resolving for 8-bit and 16-bit integer formats
by changing the format to NORM.
2012-10-12 15:23:27 +02:00
Oliver McFadden
1b921acd5f intel: print debug either to stdout or `logcat' depending on platform.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-12 11:14:54 +03:00
Brian Paul
743d859e62 util: fix broken pipe_get_tile_rgba() call
Fix breakage from commit 369e468.
2012-10-11 15:53:16 -06:00
Tom Stellard
4cc530f452 radeon/llvm: Fix build with LLVM 3.2 2012-10-11 21:33:00 +00:00
Tom Stellard
dc54c49df9 clover: Fix build with LLVM 3.2
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2012-10-11 21:32:54 +00:00
Tom Stellard
c6b0132d1e clover: Don't link against libclangRewrite
This library does not exist in LLVM 3.2 and libOpenCL.so links fine
without it on LLVM 3.1

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2012-10-11 21:32:36 +00:00
Marek Olšák
7b01bc1e4c radeonsi: handle unhandled CAPs 2012-10-11 21:36:26 +02:00
Marek Olšák
dd9274df4f radeonsi: fixup the return type of is_format_supported 2012-10-11 21:32:47 +02:00
Marek Olšák
8e3e4145ce radeonsi: remove unused local variables 2012-10-11 21:31:36 +02:00
Marek Olšák
47b7af6337 r600g: put user indices in the command stream for small index counts
This improves performance a little bit if there are lots of small indexed
draw commands.
2012-10-11 21:21:59 +02:00
Marek Olšák
0369fc9725 r600g: inline r600_translate_index_buffer 2012-10-11 21:21:34 +02:00
Marek Olšák
369e468889 gallium: unify transfer functions
"get_transfer + transfer_map" becomes "transfer_map".
"transfer_unmap + transfer_destroy" becomes "transfer_unmap".

transfer_map must create and return the transfer object and transfer_unmap
must destroy it.

transfer_map is successful if the returned buffer pointer is not NULL.
If transfer_map fails, the pointer to the transfer object remains unchanged
(i.e. doesn't have to be NULL).

Acked-by: Brian Paul <brianp@vmware.com>
2012-10-11 21:12:16 +02:00
Marek Olšák
ec4c74a9dc st/mesa: use the renderbuffer chosen by core Mesa in CopyTexSubImage
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-11 21:12:12 +02:00
Marek Olšák
9fe06f8815 softpipe: remove unused functions
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-11 21:12:10 +02:00
Marek Olšák
1c02075df0 st/mesa: use transfer_inline_write in st_texture_image_data
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-11 21:12:07 +02:00
Marek Olšák
ce7ebdd29a st/mesa: remove useless checking in reset_cache
It's always NULL here.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-11 21:12:03 +02:00
Andreas Boll
f04a6a65cc docs: start release notes file for 9.1 2012-10-11 19:26:10 +02:00
Brian Paul
60a9390978 svga: don't use uninitialized framebuffer state
Only the first 'nr_cbufs' color buffers in the pipe_framebuffer_state are
valid.  The rest of the color buffer pointers might be uninitialized.
Fixes a regression in the piglit fbo-srgb-blit test since changes in the
gallium blitter code.

NOTE: This is a candidate for the 9.0 branch (just to be safe).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-10-11 09:13:59 -06:00
John Kåre Alsaker
6c53ec1ef2 svga: Remove weird code which forces non-sRGB formats.
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-10-10 18:22:22 -06:00
John Kåre Alsaker
1a4aad11b0 svga: Add support for 16-bit per channel RGBA
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-10-10 18:19:44 -06:00
Eric Anholt
34c58acb59 i965/vs: Add support for splitting virtual GRFs.
This should improve our ability to register allocate without spilling.
Unfortunately, due to the live variable analysis being ignorant of loops, we
still have register allocation failures on some programs.

v2: Add more context to the comment explaining the function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2012-10-10 13:22:56 -07:00
Eric Anholt
d4bcc65918 i965/vs: Try again when we've successfully spilled a reg.
Before, we'd spill one reg, then continue on without actually register
allocating, then assertion fail when we tried to use a vgrf number as a
register number.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-10 13:22:56 -07:00
Kenneth Graunke
9237f0ea8d i965/vs: Implement register spilling.
To validate this code, I ran piglit -t vs quick.tests with the "go spill
everything" debugging code enabled.  There was only one regression:
glsl-vs-unroll-explosion simply ran out of registers.  This should be
fine in the real world, since no one actually spills every single
register.

NOTE: This is a candidate for the 9.0 branch. Even if it proves to have
bugs, it's likely better than simply failing to compile.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-10 13:22:56 -07:00
Kenneth Graunke
46e529672b i965/vs: Fix unit mismatch in scratch base_offset parameter.
move_grf_array_access_to_scratch() calculates scratch buffer offsets in
bytes.  However, emit_scratch_read/write() expects the base_offset
parameter to be measured in OWords.

As a result, a shader using a scratch read/write offset greater than
zero (in practice, a shader containing more than one variable in
scratch) would use too large an offset, frequently exceeding the
available scratch space.

This patch corrects the mismatch by removing spurious conversion from
OWords to bytes in move_grf_array_access_to_scratch().

This is based on a patch by Paul Berry.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-10 13:22:55 -07:00
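A rough illustration of the unit mismatch described in 46e529672b above. This is only a sketch: the function names are hypothetical stand-ins rather than the driver's real code, and the only hardware fact assumed is that one OWord is 16 bytes.

```c
#include <stdint.h>

/* One OWord is 16 bytes (128 bits). */
#define OWORD_BYTES 16u

/* Hypothetical callee: takes a base offset measured in OWords and
 * returns the byte address it would access in the scratch buffer. */
static uint32_t scratch_byte_address(uint32_t base_offset_owords)
{
   return base_offset_owords * OWORD_BYTES;
}

/* Buggy caller: converts the offset to bytes before the call, so the
 * offset ends up scaled by 16 twice. */
static uint32_t buggy_access(uint32_t slot_owords)
{
   return scratch_byte_address(slot_owords * OWORD_BYTES);
}

/* Fixed caller: passes the OWord count through unchanged. */
static uint32_t fixed_access(uint32_t slot_owords)
{
   return scratch_byte_address(slot_owords);
}
```

An offset of zero is the same in both versions, which fits the observation that only shaders with more than one variable in scratch were affected.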
Matt Turner
587d5db11d egl: Return EGL_BAD_MATCH for invalid profile attributes
Version 12 of the EGL_KHR_create_context spec changed this behavior.

NOTE: This is a candidate for the 9.0 branch
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-10 13:15:06 -07:00
Vincent Lejeune
5090ce42e4 radeon/llvm: use ceil intrinsic instead of llvm.AMDIL.round.posinf
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-10 22:03:33 +02:00
Vincent Lejeune
9a6bb3f645 radeon/llvm: use floor intrinsic instead of llvm.AMDIL.floor
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-10 22:03:20 +02:00
Vincent Lejeune
bfdf26892c radeon/llvm: use llvm fabs intrinsic
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-10 22:03:03 +02:00
Vincent Lejeune
8db11bc4ed radeon/llvm: use llvm intrinsic for flog2
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-10 22:02:45 +02:00
Vincent Lejeune
23e11ac835 radeon/llvm: add support for cos/sin intrinsic
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-10 22:02:28 +02:00
Vincent Lejeune
876b42663c radeon/llvm: add a pattern for fsqrt
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-10 22:02:13 +02:00
Paul Berry
99802519b4 glapi: Reformat python code generation scripts to use 4-space indentation.
This brings us into accordance with the official Python style guide
(http://www.python.org/dev/peps/pep-0008/#indentation).

To preserve the indentation of the c code that is generated by these
scripts, I've avoided re-indenting triple-quoted strings (unless those
strings appear to be docstrings).

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-10 11:19:14 -07:00
José Fonseca
856464979b mesa: Avoid C99 indexed initializers.
Not supported by MSVC.

Reviewed-by: Imre Deak <imre.deak@intel.com>
2012-10-10 17:55:04 +01:00
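A small sketch of the kind of rewrite 856464979b above implies. The enum and table are invented for illustration and are not taken from the Mesa tree; the idea is to replace C99 designated (indexed) initializers with positional ones and guard the ordering at compile time.

```c
enum color { COLOR_RED, COLOR_GREEN, COLOR_BLUE, COLOR_COUNT };

/* C99 form that the commit message says MSVC does not support:
 *    static const char *const color_names[] = {
 *       [COLOR_GREEN] = "green", ...
 *    };
 * Portable form: list the entries positionally, in enum order. */
static const char *const color_names[] = {
   "red",    /* COLOR_RED */
   "green",  /* COLOR_GREEN */
   "blue",   /* COLOR_BLUE */
};

/* Compile-time guard: the array size becomes -1, and compilation
 * fails, if the table and the enum ever get out of sync. */
typedef char color_names_size_check[
   (sizeof(color_names) / sizeof(color_names[0]) == COLOR_COUNT) ? 1 : -1];

static const char *color_name(enum color c)
{
   return color_names[c];
}
```

The size check only catches a wrong entry count, not a wrong order, so the per-entry comments still matter.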
José Fonseca
3f228ed090 mesa: Prevent CONST macro re-definition.
Should fix MSVC build, as windows.h also defines CONST.

CONST usage in get.c is not new, so probably this just appeared now due
to changes in the includes.
2012-10-10 11:40:34 +01:00
José Fonseca
a555888151 mesa: Silence 'assignment makes integer from pointer without a cast' warnings. 2012-10-10 11:35:34 +01:00
Imre Deak
9c1c23331a glget: fix make check for glGet GL_POLYGON_OFFSET_BIAS
This got broken by:
7182a1f glapi: rename/move GL_POLYGON_OFFSET_BIAS to its extension
section

Fix it by appending the _EXT suffix to the enum in the test too.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
2012-10-10 12:56:02 +03:00
Imre Deak
dd6479160c mesa: glGet: remove the unused TYPE_API_MASK flags
Since we generate the hash tables in build time, these flags aren't used
any more, remove them.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
2012-10-10 12:43:26 +03:00
Imre Deak
d220435416 mesa: glGet: use the build time generated hash tables
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
2012-10-10 12:43:23 +03:00
Imre Deak
98f880e0c4 mesa: glGet: add script to generate hash tables in build time
This will be needed by the next patch, which will switch to using
the parameter descriptor- and hash tables generated by the script.

The hash algorithm remains the same, the output parameter descriptor
table format changes slightly. There the TYPE_API_MASK entries are
removed and an invalid NULL entry is inserted at the beginning. This is
ok, as get.c:find_value() doesn't rely on TYPE_API_MASK any more to
detect an invalid enum.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
2012-10-10 12:43:19 +03:00
Imre Deak
6678125eae scons/android: add flag to check for enabled GL APIs
Needed by the next patch.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
2012-10-10 12:43:16 +03:00
Imre Deak
ea637c5b64 mesa: glGet: rename *{_EXT,_ARB} enums missing from the XML spec
The following enums used to be extensions but later became part of the
core specification. The _EXT/_ARB versions of these are not present in
the current XML spec files, only defined in GL/glext.h.

Later we'll need to look up these in a python script using the XML spec.
As a preparation for that remove the _EXT,_ARB suffix from these enums
and rename GL_DISTANCE_ATTENUATION_EXT to GL_POINT_DISTANCE_ATTENUATION.
Naturally, all enums keep their numerical values.

Note that similar renames shouldn't be necessary in the future: in case
of a new extension the XML spec is updated with the new _EXT/_ARB etc.
name and this name is added to the enum table in get.c.  Later the
extension may become part of the core spec, at which point the name w/o
the _EXT/_ARB suffix is added to the XML spec and the table in get.c
remains the same.

GL_BLEND_DST_ALPHA_EXT
GL_BLEND_DST_RGB_EXT
GL_BLEND_SRC_ALPHA_EXT
GL_BLEND_SRC_RGB_EXT
GL_COLOR_SUM_EXT
GL_COMPRESSED_TEXTURE_FORMATS_ARB
GL_CURRENT_FOG_COORDINATE_EXT
GL_CURRENT_SECONDARY_COLOR_EXT
GL_DISTANCE_ATTENUATION_EXT
GL_FOG_COORDINATE_ARRAY_EXT
GL_FOG_COORDINATE_ARRAY_STRIDE_EXT
GL_FOG_COORDINATE_ARRAY_TYPE_EXT
GL_FOG_COORDINATE_SOURCE_EXT
GL_FRAGMENT_SHADER_DERIVATIVE_HINT_ARB
GL_PACK_IMAGE_HEIGHT_EXT
GL_PACK_SKIP_IMAGES_EXT
GL_SECONDARY_COLOR_ARRAY_EXT
GL_SECONDARY_COLOR_ARRAY_SIZE_EXT
GL_SECONDARY_COLOR_ARRAY_STRIDE_EXT
GL_SECONDARY_COLOR_ARRAY_TYPE_EXT
GL_UNPACK_IMAGE_HEIGHT_EXT
GL_UNPACK_SKIP_IMAGES_EXT

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
2012-10-10 12:43:11 +03:00
Imre Deak
59d3bf6542 mesa: glGet: simplify the 'enum not found' condition
When traversing the hash table looking up an enum that is invalid we
eventually reach the first element in the descriptor array. By looking
at the type of that element, which is always TYPE_API_MASK, we know that
we can stop the search and return error. Since this element is always
the first it's enough to check for its index being 0 without looking at
its type.

Later in this patchset, when we generate the hash tables during build
time, this will allow us to remove the TYPE_API_MASK and related flags
completely.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
2012-10-10 12:43:08 +03:00
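The "index 0 means not found" idea from 59d3bf6542 above can be sketched as follows. This is not the code in get.c: the descriptor layout, the hash function, the table size, and all names are assumptions made for the example. The hash table stores indices into a descriptor array whose first element is reserved, so an empty slot (index 0) ends the search.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct value_desc {
   uint32_t pname;   /* enum value being looked up */
   int offset;       /* illustrative payload */
};

/* Index 0 is reserved as the "invalid" entry. */
static const struct value_desc descs[] = {
   { 0, 0 },
   { 0x0B00, 4 },
   { 0x0B01, 8 },
   { 0x0B10, 12 },
};

#define TABLE_SIZE 8   /* must stay larger than the number of entries */
static uint16_t table[TABLE_SIZE];

static unsigned hash_slot(uint32_t pname)
{
   return (pname * 2654435761u) % TABLE_SIZE;
}

/* Fill the table using linear probing; 0 marks an empty slot. */
static void build_table(void)
{
   memset(table, 0, sizeof(table));
   for (size_t i = 1; i < sizeof(descs) / sizeof(descs[0]); i++) {
      unsigned slot = hash_slot(descs[i].pname);
      while (table[slot] != 0)
         slot = (slot + 1) % TABLE_SIZE;
      table[slot] = (uint16_t) i;
   }
}

/* Returns NULL as soon as the probe reaches index 0, without having
 * to inspect the type of the descriptor stored there. */
static const struct value_desc *find_value(uint32_t pname)
{
   unsigned slot = hash_slot(pname);
   for (;;) {
      uint16_t idx = table[slot];
      if (idx == 0)
         return NULL;
      if (descs[idx].pname == pname)
         return &descs[idx];
      slot = (slot + 1) % TABLE_SIZE;
   }
}
```

Because the check is on the index alone, the reserved element no longer needs a special type flag, which is what lets the later patches drop TYPE_API_MASK.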
Imre Deak
2ad4a47547 mesa: glGet: fix parameter lookup for apps using multiple APIs
The glGet hash was initialized only once for a single GL API, even if
the application later created a context for a different API. This
resulted in glGet failing for otherwise valid parameters in a context
if that parameter was invalid in another context created earlier.

Fix this by using a separate hash table for each API.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
2012-10-10 12:43:05 +03:00
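A simplified sketch of the fix in 2ad4a47547 above: keep one lookup table per API instead of a single table built for whichever API was used first. A flat array stands in for the real hash table, and the enum, masks, and function names are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

enum api { API_GL, API_GLES1, API_GLES2, API_COUNT };

#define API_BIT(a) (1u << (a))

struct param {
   unsigned pname;
   unsigned api_mask;   /* which APIs accept this parameter */
};

static const struct param params[] = {
   { 0x0B00, API_BIT(API_GL) },   /* desktop GL only */
   { 0x0B01, API_BIT(API_GL) | API_BIT(API_GLES1) | API_BIT(API_GLES2) },
};

#define NUM_PARAMS (sizeof(params) / sizeof(params[0]))

/* One table and one "initialized" flag per API. */
static bool valid_table[API_COUNT][NUM_PARAMS];
static bool table_ready[API_COUNT];

static void init_table(enum api api)
{
   if (table_ready[api])
      return;
   for (size_t i = 0; i < NUM_PARAMS; i++)
      valid_table[api][i] = (params[i].api_mask & API_BIT(api)) != 0;
   table_ready[api] = true;
}

static bool param_is_valid(enum api api, unsigned pname)
{
   init_table(api);
   for (size_t i = 0; i < NUM_PARAMS; i++) {
      if (params[i].pname == pname)
         return valid_table[api][i];
   }
   return false;
}
```

With a single shared table, querying from a GLES2 context first would have left the desktop-only parameter invalid for a later desktop GL context as well.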
Imre Deak
7182a1fc5e glapi: rename/move GL_POLYGON_OFFSET_BIAS to its extension section
This should be named GL_POLYGON_OFFSET_BIAS_EXT and listed under the
EXT_polygon_offset section. (Solution by Ian Romanick)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
2012-10-10 12:42:42 +03:00
Marek Olšák
87a34131c4 r600g: move SQ_GPR_RESOURCE_MGMT_1 into new config_state
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 00:17:07 +02:00
Marek Olšák
c5584e93b1 r600g: move DB_SHADER_CONTROL into db_misc_state
Also update the register value in more appropriate places
than r600_update_derived_state.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 00:17:05 +02:00
Marek Olšák
ae25b93245 r600g: emit PS_PARTIAL_FLUSH at the beginning of CS
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 00:17:03 +02:00
Marek Olšák
ef723613e0 r600g: atomize depth-stencil-alpha state
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 00:17:01 +02:00
Marek Olšák
711f3bae9d r600g: atomize rasterizer state
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 00:16:58 +02:00
Marek Olšák
9a683d1bd8 r600g: sort variables in r600_context
Some variables have been removed from there too.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 00:16:55 +02:00
Marek Olšák
30bcc5538f r600g: initialize SQ_VTX_SEMANTIC_* in the start_cs command buffer
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 00:16:49 +02:00
Marek Olšák
18a189188a r600g: atomize scissor state
The workaround for R600 lacking VPORT_SCISSOR_ENABLE has also been simplified.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 00:16:45 +02:00
Marek Olšák
ab075de53b r600g: atomize polygon offset state
POLY_OFFSET_DB_FMT_CNTL is moved to the framebuffer state, because it only
depends on the zbuffer format.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 00:16:42 +02:00
Marek Olšák
a50edc8ed8 r600g: atomize fetch shader
The state object is actually a buffer, it's literally a buffer containing
the shader code.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 00:16:38 +02:00
Marek Olšák
8bf7044ec6 r600g: remove the dual_src_blend flag from the shader key
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 00:16:35 +02:00
Marek Olšák
faaba52aed r600g: atomize blend state
This is not so trivial, because we disable blending if the dual src
blending is turned on and the number of color outputs is less than 2.
I decided to create 2 command buffers in the blend state object and just
switch between them when needed, because there are other states unrelated
to blending (like the color mask) and those shouldn't be changed
(the old code had it wrong).

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 00:16:32 +02:00
Marek Olšák
eb65fefa4b r600g: inline r600_atom_dirty
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 00:16:28 +02:00
Marek Olšák
d8ea64697b r600g: remove the "atom" variable from r600_command_buffer
r600_command_buffer is not an atom.

The "atoms" have evolved into state slots (or groups of state slots) where
you can bind states. There is a fixed amount of atoms (state slots)
in the context.

The command buffers are nothing like that. They represent states, not state
slots.

We could probably give r600_atom a better name someday.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 00:16:25 +02:00
Eric Anholt
1e7776ca2b egl: Remove bogus invalidate code.
The invalidate event support is a careful dance between driver and loader,
where both have to say they can handle it, and then the loader reports
invalidate events for the driver so the driver can do the optimization.

The EGL code doesn't report __DRIuseInvalidateExtension to the driver, so it
has no responsibility to call the driver's invalidate function, and the driver
is doing the glViewport hack because it assume.  This is not
the only time invalidate would need to be called (we need it *any* time an
invalidate event comes down the pipe, but we don't watch for them), so just
stop calling the driver's function.

Acked-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:04 -07:00
Eric Anholt
7e9bd2b2ed egl: Add support for driconf control of swapinterval.
This behavior mostly matches glx_dri2.  It's slightly complicated in
comparison because EGL exposes the implementation limits in the EGL config.

Note that platform_x11 was the only one setting swap_available, so the move of
the MaxSwapInterval into it is appropriate.

Acked-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:03 -07:00
Eric Anholt
8c472b8f6a glx: Replace DRI2SwapBuffers() custom protocol with XCB.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:03 -07:00
Eric Anholt
f02242a4fa glx: Fix some indentation.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:03 -07:00
Eric Anholt
811602885b glx: Replace DRI2SwapInterval custom protocol with XCB.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:03 -07:00
Eric Anholt
7acf8ae0e1 glx: Reuse setSwapInterval for setting initial swap interval. 2012-10-09 14:32:03 -07:00
Eric Anholt
d0937759db glx: Allow glXSwapInterval(0) when vblank_mode=0.
There's no reason to say no in this case.
2012-10-09 14:32:03 -07:00
Eric Anholt
ab8ae9301f glx: Replace DRI2GetMSC custom protocol with XCB.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:03 -07:00
Eric Anholt
8e61b9028a glx: Replace DRI2WaitForMSC custom protocol with XCB.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:03 -07:00
Eric Anholt
183ab9e14e glx: Replace DRI2WaitForSBC custom protocol with XCB.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:03 -07:00
Eric Anholt
1e74910bb7 glx/dri1: Remove uncompiled __DRI_SWAP_BUFFER_COUNTER code.
It's been in place but never enabled since 2010.  Note how one piece called a
DRI2 function, suggesting never being tested.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:02 -07:00
Eric Anholt
da3f7c127b egl: Quit checking for a bug in old xcb when we require new xcb.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:02 -07:00
Eric Anholt
b477384f40 egl: Drop xcb ifdefs by just requiring a version from this year.
glx and gallium's xcb_dri2 usage already require this version, so this is
nothing really new.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:02 -07:00
Eric Anholt
b49cd8495f egl: Unifdef dri_interface.h defines.
dri_interface.h comes from our tree, so why litter our tree with ifdefs for
older versions of it?

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:02 -07:00
Eric Anholt
c35a9388a3 glx: Unifdef some dri_interface.h defines.
dri_interface.h comes from our tree, so why litter our tree with ifdefs for
older versions of it?

I left in the DRI_TEX_BUFFER_VERSION ifdefs, which is broken and uncompiled
(the version wasn't bumped from 2 to 3 when the patch was landed), but I don't
know what should be done with it.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:02 -07:00
Eric Anholt
bb01f671bb glx: Require xcb_dri2 for building glxdri2.c.
I'm going to transition a bunch of the protocol to using XCB so we can stop
rolling it ourselves.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:02 -07:00
Eric Anholt
dc6fa41076 glx: Remove the last user of -DUSE_XCB.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:02 -07:00
Eric Anholt
3f0e3a7ad5 glx: Unifdef USE_XCB.
It's been required for building glx since
b518dfb513 in january.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:02 -07:00
Eric Anholt
31c7d4ec18 egl: Cleanly cast EGLNative* pointers to X11 types.
The EGLNative* types are all defined to be pointers across all our EGL
implementations, but in the X11 platform they're actually just XIDs (32-bit
integers).

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 14:32:01 -07:00
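One common way to do the kind of cast described in 31c7d4ec18 above is to go through uintptr_t, which avoids pointer-to-integer size warnings on 64-bit builds. The typedefs below are stand-ins for the EGL and X11 types, and this may well differ from what the commit actually does.

```c
#include <stdint.h>

typedef void *native_window_t;   /* stand-in for EGLNativeWindowType */
typedef uint32_t xid_t;          /* stand-in for an X11 XID */

/* Pointer -> integer: narrow via uintptr_t so the truncation to 32
 * bits is explicit. */
static xid_t native_to_xid(native_window_t window)
{
   return (xid_t) (uintptr_t) window;
}

/* Integer -> pointer: widen via uintptr_t first. */
static native_window_t xid_to_native(xid_t id)
{
   return (native_window_t) (uintptr_t) id;
}
```

The round trip is lossless as long as the value really is a 32-bit XID, which is the case the commit message describes for the X11 platform.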
Vincent Lejeune
11e08f42e4 r600g: use a select to handle front/back color in llvm
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-09 23:19:09 +02:00
Vincent Lejeune
80663cb185 r600g: frontcolor tracks its associated backcolor
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-09 23:19:09 +02:00
Matt Turner
900cc7cf80 Remove VAAPI support.
Not working and unmaintained.

Reviewed-by: Christian König <christian.koenig@amd.com>
2012-10-09 14:00:05 -07:00
Marcin Slusarz
63a15117a5 nv50: fix build after "nv50: fix printf warning"
When compiled with C++ compiler, inttypes.h defines PRI* macros only when
__STDC_FORMAT_MACROS is defined.
2012-10-09 22:42:54 +02:00
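A minimal sketch of the pattern behind 63a15117a5 above: define __STDC_FORMAT_MACROS before the first include of inttypes.h so that the PRI* macros are visible even when the file is built as C++ with an older C library. The helper function is invented for the example; in plain C the define is harmless.

```c
/* Must come before the first include of <inttypes.h>. */
#define __STDC_FORMAT_MACROS
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats a 64-bit value without relying on a
 * platform-specific length modifier such as %lu or %llu. */
static int format_u64(char *buf, size_t size, uint64_t value)
{
   return snprintf(buf, size, "%" PRIu64, value);
}
```

Passing the define on the compiler command line instead works equally well and avoids problems with include order.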
Marcin Slusarz
93eba26935 nouveau: use pre-calculated stride for resource_get_handle
Fixes FDO#55294.

NOTE: This is a candidate for the 9.0 branch.
2012-10-09 22:23:09 +02:00
Tom Stellard
45288cd2b6 r600g: Fix build with --enable-opencl 2012-10-09 19:54:12 +00:00
Ian Romanick
b25fbceb86 mesa/tests: Remove driverCtx parameter from call to _mesa_initialize_context
Fixes 'make check' breakage since 733dba2.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-09 11:21:14 -07:00
Quentin Glidic
7cb8764ca3 intel: Add missing #include <time.h>
Commit 006c1a3c65 introduced a call to
clock_gettime, but failed to include <time.h>, breaking the build in
some cases.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-09 09:12:41 -07:00
Kenneth Graunke
b6346749a8 i965: Delete some dead code from brw_eu_emit.c.
Presumably some of this was used by the old fragment shader backend.
2012-10-09 09:11:26 -07:00
Andreas Boll
840d8484c0 docs: add missing release date 2012-10-09 17:50:29 +02:00
Andreas Boll
c833d98ff9 docs: update release notes for 9.0 2012-10-09 17:36:41 +02:00
Andreas Boll
3699150d3b docs: add news item for 9.0 release
Reviewed-by: Brian Paul <brianp@vmware.com>

ported manually from
8e73273cb9
2012-10-09 17:29:37 +02:00
Brian Paul
541158fbb9 mesa: remove unused _mesa_cpal_compressed_format_type() function
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-09 07:05:47 -06:00
Marek Olšák
30ebc8650c nv50: fix printf warning 2012-10-09 14:38:43 +02:00
Marek Olšák
51872e8bb3 nv30: fix type conversion warning 2012-10-09 14:34:27 +02:00
Marek Olšák
cf9081b37c i915g: fix unused variable and type conversion warnings 2012-10-09 14:33:16 +02:00
Daniel Stone
4f310984a9 teximage: Remove unnecessary compressed format check
Ever since df4a88ac, the check for compressed formats has been
unnecessary.  And ever since cb72ec5f, the build has been broken with
FEATURE_ES.  Remove it, as it does nothing.

Signed-off-by: Daniel Stone <daniel@fooishbar.org>
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2012-10-09 14:32:03 +02:00
Andreas Boll
b534c39ece docs: update FAQ
Reported-by: Fabio Pedretti <fabio.ped@libero.it>

v2: (Chad Versace <chad.versace@linux.intel.com>)
  - Rewrite FAQ - proper place for installing mesa.

v3: fix some typos

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-09 09:00:18 +02:00
Ben Skeggs
63c3a799ae nv50: point vertex runout at a valid address
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2012-10-09 09:56:36 +10:00
Ben Skeggs
c47a01c29c nvc0: point vertex runout at a valid address
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2012-10-09 09:56:34 +10:00
Ben Skeggs
d53bbabe61 nvc0: fix missing permanent bo reference on poly cache
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2012-10-09 09:56:12 +10:00
Brian Paul
1aa8ad8b50 Revert "st/mesa: remove unused variables to fix compile warnings"
This reverts commit 810d2e167c.

The pscreen variable is used in an assertion.  Use "(void) pscreen;"
to silence the warning.
2012-10-08 17:32:54 -06:00
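The "(void) pscreen;" idiom mentioned in 1aa8ad8b50 above, sketched on made-up types. In release builds (NDEBUG defined) the assert compiles away and the variable would otherwise be flagged as unused; the cast to void counts as a use without generating any code.

```c
#include <assert.h>
#include <stddef.h>

struct screen { int max_levels; };
struct context { struct screen *screen; };

/* Hypothetical function: the local is only read inside the assert. */
static int clamp_levels(struct context *ctx, int requested)
{
   struct screen *pscreen = ctx->screen;
   int limit = ctx->screen->max_levels;

   (void) pscreen;   /* silences the unused-variable warning under NDEBUG */
   assert(pscreen != NULL);

   return requested > limit ? limit : requested;
}
```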
Brian Paul
bad1b271a0 mesa: minor whitespace fixes in teximage.c 2012-10-08 17:30:21 -06:00
Marek Olšák
810d2e167c st/mesa: remove unused variables to fix compile warnings 2012-10-09 01:14:55 +02:00
Marek Olšák
cb72ec5fc5 mesa: remove unused variables to fix compile warnings 2012-10-09 01:14:55 +02:00
Marek Olšák
fd3219962d softpipe: initialize quadColor2 to fix compile warnings 2012-10-09 01:14:24 +02:00
Marek Olšák
d0349c91c8 r600g: remove unused variables to fix compile warnings 2012-10-09 01:11:56 +02:00
Marek Olšák
d284613422 llvmpipe: remove unused variables to fix compile warnings 2012-10-09 01:10:58 +02:00
Stéphane Marchesin
437a2560b1 i915g: Don't clobber I915_NEW_FS on new framebuffer.
This snuck in with a previous commit.
2012-10-08 12:30:46 -07:00
Eric Anholt
6a514494fa i965/fs: Improve performance of copy/constant propagation.
Use a simple chaining hash table for the ACP.  This is not really very good,
because we still do a full walk of the tree per destination write, but it
still reduces fp-long-alu runtime from 5.3 to 3.9s.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-08 08:50:38 -07:00
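A toy version of the data structure 6a514494fa above describes: a chaining hash table of available copies (the ACP), keyed by destination register. Register numbers are plain integers here, and the struct and function names are assumptions rather than the i965 code. Lookups touch a single bucket, while invalidating a register still walks every bucket, which matches the limitation the commit message admits.

```c
#include <stdlib.h>

#define ACP_HASH_SIZE 16

struct acp_entry {
   unsigned dst;             /* register that holds the copy */
   unsigned src;             /* register it was copied from */
   struct acp_entry *next;   /* chain within one bucket */
};

static struct acp_entry *acp[ACP_HASH_SIZE];

/* Record "dst = src" in the bucket chosen by dst. */
static void acp_add(unsigned dst, unsigned src)
{
   struct acp_entry *e = malloc(sizeof(*e));
   if (!e)
      return;   /* on allocation failure, just skip the optimization */
   e->dst = dst;
   e->src = src;
   e->next = acp[dst % ACP_HASH_SIZE];
   acp[dst % ACP_HASH_SIZE] = e;
}

/* Return the register to use in place of reg (reg itself if no copy
 * is recorded).  Only one bucket is searched. */
static unsigned acp_lookup(unsigned reg)
{
   struct acp_entry *e;
   for (e = acp[reg % ACP_HASH_SIZE]; e; e = e->next) {
      if (e->dst == reg)
         return e->src;
   }
   return reg;
}

/* A write to reg invalidates every entry that mentions it as source
 * or destination, so this still walks the whole table. */
static void acp_kill(unsigned reg)
{
   for (unsigned b = 0; b < ACP_HASH_SIZE; b++) {
      struct acp_entry **link = &acp[b];
      while (*link) {
         struct acp_entry *e = *link;
         if (e->dst == reg || e->src == reg) {
            *link = e->next;
            free(e);
         } else {
            link = &e->next;
         }
      }
   }
}
```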
Eric Anholt
fb5bf03a20 i965/fs: Move constant propagation to the same codebase as copy prop.
This means that we don't get constant prop across into the first block after a
BRW_OPCODE_IF or a BRW_OPCODE_DO, but we have hope for properly doing it
across control flow at some point.  More importantly, with the next commit it
will help avoid runtime that is O(n^2) in instruction count for shaders that have
many constant moves.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-08 08:50:38 -07:00
Eric Anholt
098acf6c84 i965: Remove the old ARB_fragment_program backend.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-08 08:50:38 -07:00
Eric Anholt
97615b2d8c i965: Replace brw_wm_* with dumping code into the fs_visitor.
This makes a giant pile of code newly dead.  It also fixes TXB on newer
chipsets, which has been totally broken (I now have a piglit test for that).
It passes the same set of Ian's ARB_fragment_program tests.  It also improves
high-settings ETQW performance by 3.2 +/- 1.9% (n=3), thanks to better
optimization and having 8-wide along with 16-wide shaders.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=24355
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-08 08:50:27 -07:00
Eric Anholt
014aaa97d3 i965: Reduce maximum GL_ARB_fragment_program instruction count to 1024.
I don't know of any programs that would need more than this.  The larger
programs I've seen have neared 100 instructions.  This prevents excessive
runtimes of automatic tests that attempt to test up to the exposed maximums
(like fp-long-alu).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-08 08:38:49 -07:00
Eric Anholt
9cfc00a84c i965/fs: Add a couple more algebraic cases that help some ARB_fp patterns.
ARB_fp doesn't go through the GLSL optimizer, and these were things you see
frequently thanks to conditionals being lowered to SLT/SGE and MUL.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-08 08:38:49 -07:00
Eric Anholt
d81d7a4b65 i965/fs: Pull ir_binop_min/ir_binop_max handling to a separate function.
This will be reused from the ARB_fp compiler.  I touched up the pre-gen6 path
to not overwrite dst in the first instruction, which prevents the need for
aliasing checks (we'll need that in the ARB_fp compiler, but it actually
hasn't been needed in this codebase since the revert of the nasty old
MOV-avoidance code).  I also made the conditional_mod between gen6 and
pre-gen6 consistent, which shouldn't matter except for denorm/(+/-)0
comparisons where the choice between left and right hand side of the
comparison changes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-08 08:38:49 -07:00
Eric Anholt
5c26874546 i965/fs: Refactor rectangle/GL_CLAMP texture coordinate adjustment.
We'll want to reuse this for ARB_fp handling.

v2: Fold the remaining bit of emit_texcoord back into visit(ir_texture).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-08 08:38:49 -07:00
Eric Anholt
e7149d390c i965/fs: Pass fragment depth to the fb write as a fs_reg, not an ir_variable.
This will be used for the ARB_fp change to use this backend.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-08 08:38:49 -07:00
Eric Anholt
6589c0bd56 mesa: Note that OPCODE_RFL is not part of ARB_fp (it's NV_fp only).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-08 08:38:49 -07:00
José Fonseca
88e417d761 st/wgl: Don't cache HDC anywhere.
Applications may destroy HDC at any time. So always get a HDC as needed.

Fixes lack of presents with Solidworks eDrawings when screen resolution is
changed.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-08 15:42:50 +01:00
Ian Romanick
86de501f14 meta: Make shader template literal strings be parameters to asprintf
This enables the C compiler to generate warnings if the formats and the
arguments don't match.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-10-07 20:35:50 -07:00
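A sketch of why 86de501f14 above helps. When the format string is a literal at the call site, compilers such as GCC and Clang can check it against the arguments (-Wformat); when it arrives through a variable, they generally cannot. The example uses snprintf instead of asprintf to stay within standard C, and the shader text and function name are made up.

```c
#include <stddef.h>
#include <stdio.h>

/* Unchecked style (shown only as a comment):
 *    const char *tmpl = "uniform %s texSampler;\n";
 *    snprintf(buf, size, tmpl, sampler_type);
 * Checked style: the literal is passed directly, so a mismatch such
 * as supplying an int for %s is diagnosed at compile time. */
static int write_sampler_decl(char *buf, size_t size,
                              const char *sampler_type)
{
   return snprintf(buf, size, "uniform %s texSampler;\n", sampler_type);
}
```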
Ian Romanick
751737f497 meta: Always enable GL_EXT_texture_array in mipmap shader
'#extension foo: enable' is harmless.  The functionality is only
actually enabled if the extension is supported.  The shader won't use
the functionality if it's not supported, so we're fine.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-10-07 20:35:47 -07:00
Ian Romanick
0e973b7498 meta: Since mipmap output type is always vec4, don't sprintf it
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-10-07 20:35:45 -07:00
Ian Romanick
0242381f06 meta: Don't use GLSL 1.30 shader on OpenGL ES 2
Fixes GLES2 CoverageGL conformance test.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-10-07 20:35:42 -07:00
Ian Romanick
3308c079bd meta: Rearrange shader creation in setup_glsl_generate_mipmap
The diff looks weird, but this moves the code from the first 'if
(ctx->Const.GLSLVersion < 130)' block down into the second block.  It
also moves some variable declarations closer to their use.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-10-07 20:35:39 -07:00
Ian Romanick
ab097dde0c meta: Remove unsafe global mem_ctx pointer
NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-10-07 20:34:17 -07:00
Stéphane Marchesin
6ef37f71b0 i915g: Call draw_set_mapped_vertex_buffer from draw_vbo
This regressed with the draw rework. Fixes glest and vdrift crash.
2012-10-06 13:15:04 -07:00
Marek Olšák
9dfca930d7 r600g: fix possible issue with stencil mipmap rendering
Somehow I only hit this issue with my latest libdrm changes.
This won't be needed with DB texturing.

NOTE: This is a candidate for the 9.0 branch.
2012-10-06 05:31:01 +02:00
Marek Olšák
6fa22b840e r600g: ensure PERFECT_ZPASS+NOOP_CULL_DISABLE are 0 for blits+decompression
When an occlusion query was active, the derived DB state wasn't changed
for u_blitter even though all the occlusion queries were suspended.

It's fixed by moving the state update into the emit functions, which are
called whenever queries are stopped or suspended.
2012-10-06 04:31:16 +02:00
Marek Olšák
6db53ca490 r600g: don't modify pipe_resource in resource_copy_region, fixing race condition
pipe_resource can be shared between contexts, we shouldn't modify its
description. Instead, let's use the resource "views" (sampler views and
surfaces), where we can freely change almost any property of a resource.
2012-10-06 04:31:16 +02:00
Marek Olšák
d063c7b142 r600g: fix streamout on RS780 and RS880
The latest kernel from git is required. Transform feedback (along with GL3.0)
is turned off on older kernels.
2012-10-06 03:49:29 +02:00
Marek Olšák
588263e7a7 gallium: allow debug helpers in the release build
No idea why this is #ifdef'd. Trace and Noop are definitely useful no matter
how Mesa is built.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-06 03:34:40 +02:00
Brian Paul
733dba2a08 mesa: remove the driverCtx parameter to _mesa_create/initialize_context()
No longer used.
2012-10-05 17:13:03 -06:00
Brian Paul
917d273928 mesa: remove unused gl_context::DriverCtx field 2012-10-05 17:13:03 -06:00
Brian Paul
4c9042d21d radeon/r200: remove use of gl_context::DriverCtx field 2012-10-05 17:13:03 -06:00
Brian Paul
5a63634a13 radeon/r200: make radeon_context subclass of gl_context
radeon_context now contains a gl_context, rather than a pointer to one.
This will allow some minor core Mesa clean-up.
2012-10-05 17:13:03 -06:00
Kenneth Graunke
7fa0f10cd8 mesa: Flag _NEW_VARYING_VP_INPUTS when TexEnv programs are active.
The idea here is to not flag _NEW_VARYING_VP_INPUTS when shaders (either
GLSL or ARB vp/fp) are in use.  If either TNL or TexEnv programs are
active, at least one stage is using fixed function.

On Pineview, fixes 20 Piglit, 60 oglconforms, and 7 ES 1.1 conformance
tests, as well as missing textures in Xonotic.  These were all
regressions since commit fb4a34e60e.

NOTE: This is a candidate for the 9.0 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49127
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54807
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-05 13:19:53 -07:00
Stéphane Marchesin
74b6ea49df i915g: Get rid of the fixup state functions.
Now that the saved_* state is gone, we don't need those any longer.
2012-10-05 12:45:02 -07:00
Stéphane Marchesin
dca9e3c477 i915g: Remove the i915_context->saved_* stuff.
When using u_blitter, the state was being saved from saved_*, but we
don't use that. So after u_blitter resumed we got some corrupted
state in.

So let's just remove the saved_* stuff. I thought it was weird but
harmless, but it's actually broken.
2012-10-05 12:45:01 -07:00
Stéphane Marchesin
98600c5ff6 i915g: Don't update I915_HW_PROGRAM in update_framebuffer
It's already going to be updated in update_dst_buf_vars.
2012-10-05 12:45:00 -07:00
Stéphane Marchesin
762ac0a218 Revert "i915g: Don't bind 0-length programs"
This reverts commit 8c28a9bd73.
2012-10-05 12:44:58 -07:00
Vinson Lee
df0de93206 glapi: Do not use backtrace on Cygwin.
execinfo.h is not available on Cygwin.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-04 22:28:15 -07:00
Paul Berry
8f0b81bf7d mesa: don't enable glVertexPointer() when using API_OPENGLES2.
This function is only present in GLES1 and in the OpenGL compatibility
profile.

Fixes the following "make check" failure:

    [----------] 1 test from DispatchSanity_test
    [ RUN      ] DispatchSanity_test.GLES2
    Mesa warning: couldn't open libtxc_dxtn.so, software DXTn
    compression/decompression unavailable
    dispatch_sanity.cpp:122: Failure
    Value of: table[i]
       Actual: 0x4de54e
    Expected: (_glapi_proc) _mesa_generic_nop
    Which is: 0x41af72
    i = 321
    [  FAILED  ] DispatchSanity_test.GLES2 (4 ms)
    [----------] 1 test from DispatchSanity_test (4 ms total)

NOTE: This is a candidate for stable release branches.

Reviewed-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Tested-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-04 12:46:42 -07:00
Stéphane Marchesin
8c28a9bd73 i915g: Don't bind 0-length programs
Since we started doing fixups for different render target formats,
this has been an issue. Instead just don't do anything, when the
program gets emitted later it'll get the correct fixup.

Fixes a bunch of piglit tests.
2012-10-04 12:39:06 -07:00
Brian Paul
91d8409649 mesa: don't call TexImage driver hooks for zero-sized images
This simply avoids some failed assertions but there's no reason to
call the driver hooks for storing a tex image if its size is zero.

Note: This is a candidate for the stable branches.
2012-10-04 07:59:11 -06:00
Rob Bradford
185d6df3c1 intel: Fix intel_texsubimage_tiled_memcpy to skip GL_EXT_unpack_subimage case
413c49141 added an optimisation to improve the performance of teximage
under a limited set of circumstances. If GL_EXT_unpack_subimage has been
used then we must also skip this optimisation since the optimised
codepath does not take the packing values into consideration.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-10-03 16:44:22 -07:00
Matt Turner
31ab61cac1 dri drivers: Link dricommon before dynamic libraries
I think libtool should be handling this for us, but the build fails for
Jordan because libdricommon (a static library, which uses expat) appears
before -lexpat on the linker command.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
2012-10-03 13:41:09 -07:00
Paul Berry
551c991606 register_allocate: don't consider trivially colorable registers for spilling.
Previously, we considered all registers as candidates for spilling.
This was counterproductive--for any registers that have already been
removed from the interference graph, there is no benefit to spilling
them, since they don't contribute to register pressure.

This patch ensures that we will only try to spill registers that are
still in the interference graph after register allocation has failed.

This is consistent with the recommendations of the paper "Retargetable
Graph-Coloring Register Allocation for Irregular Architectures", on
which our register allocator is based.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-10-03 12:54:42 -07:00
Marek Olšák
53d06ecdd0 glx/dri2: use uint64_t instead of double to represent time for FPS calculation
Wine or a windows app changes fpucw to 0x7f, causing doubles to be equivalent
to floats, which broke the calculation of FPS.
We should be very careful about using doubles in Mesa.

Henri Verbeet adds:
  For reference, this is done by for example d3d9 when a D3D device is
  created without D3DCREATE_FPU_PRESERVE set. In the general case
  applications can do all kinds of terrible things to the FPU control
  word of course.
2012-10-03 16:55:48 +02:00
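A minimal sketch of the idea in 53d06ecdd0, not the actual glx/dri2 code: keeping timestamps as integer microseconds means the result does not depend on the FPU precision setting. The function name and the fixed-point scaling by 100 are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: frames per second, scaled by 100 so that two
 * decimal places survive without any floating-point arithmetic.
 * Timestamps are integer microseconds. */
static uint64_t
frames_per_second_x100(uint64_t frames, uint64_t start_usec, uint64_t end_usec)
{
   assert(end_usec > start_usec);

   /* frames / elapsed_seconds * 100, rearranged to stay in integers:
    * frames * 100 * 1000000 / elapsed_usec */
   return frames * 100u * 1000000u / (end_usec - start_usec);
}
```

For example, 120 frames over 2000000 microseconds gives 6000, i.e. 60.00 FPS.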
Oliver McFadden
ff835724b5 mesa: tests: EnumStrings.LookUpByNumber
[ RUN      ] EnumStrings.LookUpByNumber
enum_strings.cpp:43: Failure
Value of: _mesa_lookup_enum_by_nr(everything[i].value)
  Actual: "GL_COMPRESSED_RGBA_S3TC_DXT3_ANGLE"
Expected: everything[i].name
Which is: "GL_COMPRESSED_RGBA_S3TC_DXT3_EXT"
enum_strings.cpp:43: Failure
Value of: _mesa_lookup_enum_by_nr(everything[i].value)
  Actual: "GL_COMPRESSED_RGBA_S3TC_DXT5_ANGLE"
Expected: everything[i].name
Which is: "GL_COMPRESSED_RGBA_S3TC_DXT5_EXT"
[  FAILED  ] EnumStrings.LookUpByNumber (2 ms)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55505
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
2012-10-03 14:11:58 +03:00
Andreas Boll
336cc6499b docs: add link to the GLSL compiler page
This reverts commit 9e0931e355

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-03 08:54:12 +02:00
Andreas Boll
d495669965 docs: update shading documentation
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-03 08:53:46 +02:00
Matt Turner
159ca32fec build: Remove autoconf check for signbit
rebase failure in 7da12426f7.
2012-10-02 22:50:02 -07:00
Stéphane Marchesin
fe3aeb7ea3 i915g: Implement srgb textures the easy way.
Since the hw can do it, let's use the hw. It's less accurate
but doesn't have the shader instruction count shortcomings.
2012-10-02 17:54:50 -07:00
Stéphane Marchesin
2acc719374 i915g: Use X tiling for textures
This is what the classic driver does, and it allows faster
texture uploads.
2012-10-02 17:54:48 -07:00
Robert Bragg
0a523a8820 SwapBuffersRegionNOK: invert rectangles on y axis
The EGL_NOK_swap_region2 spec states that the rectangles are specified
with a bottom-left origin within a surface coordinate space also with a
bottom left origin, so this patch ensures the rectangles are flipped
before passing them on to dri2_copy_region.

Fixes piglit's egl-nok-swap-region test.

Tested-by: Matt Turner <mattst88@gmail.com>
2012-10-02 14:49:00 -07:00
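The flip described in 0a523a8820 can be sketched as follows. This is an illustration only: struct rect and flip_rect_y are hypothetical names, not the EGL or dri2 types, and the sketch assumes the rectangle lies entirely inside the surface.

```c
#include <assert.h>

/* Hypothetical rectangle type: (x, y) is the corner nearest the origin. */
struct rect {
   int x, y, width, height;
};

/* Convert a rectangle between a bottom-left-origin and a top-left-origin
 * coordinate space of the same surface. Applying it twice returns the
 * original rectangle. */
static struct rect
flip_rect_y(struct rect r, int surface_height)
{
   assert(r.y >= 0 && r.y + r.height <= surface_height);

   r.y = surface_height - r.y - r.height;
   return r;
}
```

A 20-pixel-high rectangle at y = 0 on a 100-pixel-high surface ends up at y = 80 after the flip.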
Brian Paul
df4a88ac43 mesa: remove bogus compressed texture size checks
A compressed texture image size doesn't have to be a multiple of the
compressed block size (only sub-images do).  Fixes issues when building
compressed mipmaps because we often wind up with non-block-size images
for the higher mipmap levels.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=55445

Note: This is a candidate for the stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Sven Arvidsson <sa@whiz.se>
2012-10-02 15:19:00 -06:00
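To see why the size check in df4a88ac43 was bogus, consider how small mipmap levels map onto compressed blocks. The sketch below is not Mesa code; the helper names are made up, and the 4x4 block with 8 bytes per block matches the DXT1 layout.

```c
/* Hypothetical helpers for illustration. */

/* Size of one dimension at a given mipmap level, never below 1. */
static unsigned
mip_size(unsigned base, unsigned level)
{
   unsigned s = base >> level;
   return s ? s : 1;
}

/* Number of blocks needed to cover `size` texels, rounding up. */
static unsigned
block_count(unsigned size, unsigned block)
{
   return (size + block - 1) / block;
}

/* Storage for one compressed image. Partial blocks are stored whole. */
static unsigned
compressed_image_bytes(unsigned width, unsigned height,
                       unsigned block_w, unsigned block_h,
                       unsigned bytes_per_block)
{
   return block_count(width, block_w) * block_count(height, block_h) *
          bytes_per_block;
}
```

A 16x16 base image has levels of 16, 8, 4, 2 and 1 texels per side. The 2x2 and 1x1 levels are not multiples of the 4x4 block size, yet each is a valid image that occupies exactly one block.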
Michel Dänzer
82e38ac91f radeonsi: Fix double compilation of shader variants.
Fixes crash in piglit glsl-max-varyings.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-10-02 17:53:47 +02:00
Michel Dänzer
c3db19efba radeonsi: Better indexing of parameters in the pixel shader.
We were previously using the TGSI input index, which can exceed the number of
parameters passed from the vertex shader via the parameter cache. Now we use
a separate index which only counts those parameters.

Prevents piglit regressions with the following fix.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-10-02 17:50:58 +02:00
Michel Dänzer
dbb4a7f950 radeon/llvm: Disable SI flow control again for now.
It makes piglit unreliable due to VM protection faults and GPU lockups.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-10-02 16:50:36 +02:00
Andreas Boll
48e4eb695a docs/helpwanted: cleanup todo list links
split into common and driver specific To-Do lists
add an explanation for each To-Do list

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-02 15:44:34 +02:00
Andreas Boll
1f38fb2697 docs: document how to apply a candidate to a stable branch
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-02 15:44:28 +02:00
Andreas Boll
f07784d9ba docs: document how to mark a candidate for a stable branch
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-02 15:44:00 +02:00
Negreanu Marius Adrian
e00abb00f0 android: glcpp: fix abuse of yylex
Port the 'glcpp: fix abuse of yylex' commit to Android.mk
Also, since the Android.*.mk are sourced in a global namespace,
the local-y-to-c-and-h is prefixed with the LOCAL_MODULE name,

The initial fix commit is 53d46bc787

There's also a bugzilla for this: 54947

Signed-off-by: Negreanu Marius Adrian <adrian.m.negreanu@intel.com>
Reviewed-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2012-10-02 08:14:34 +03:00
Matt Turner
523c015246 build: Don't build libdricore if not building classic drivers 2012-10-01 15:23:05 -07:00
Matt Turner
b6c0fa1280 libdricore: Remove dead C(XX)FLAGS_NOVISIBILITY 2012-10-01 15:23:05 -07:00
Matt Turner
24ded89876 build: Add visibility CFLAGS to OSMesa 2012-10-01 15:23:05 -07:00
Matt Turner
1762ec28db build: Link OSMesa with glapi, libdl, libstdc++
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=399813
          https://bugs.freedesktop.org/show_bug.cgi?id=53179
2012-10-01 15:23:05 -07:00
Matt Turner
4cfff7211c build: Set visibility CFLAGS in dri/swrast 2012-10-01 15:23:05 -07:00
Matt Turner
3628402707 build: Set visibility CFLAGS in dri/r200 2012-10-01 15:23:05 -07:00
Matt Turner
55d45efdd8 build: Set visibility CFLAGS in dri/radeon 2012-10-01 15:23:05 -07:00
Matt Turner
340637d54d build: Set visibility CFLAGS in dri/nouveau 2012-10-01 15:23:04 -07:00
Matt Turner
381d120b8a build: Set visibility CFLAGS in dri/i915 2012-10-01 15:23:04 -07:00
Matt Turner
d2872b5612 build: Set visibility CFLAGS in dri/common 2012-10-01 15:23:04 -07:00
Matt Turner
8746f641bb build: Build src/glsl with visibility CFLAGS 2012-10-01 15:23:04 -07:00
Matt Turner
710a90ccaf build: Turn on visibility CFLAGS for core mesa 2012-10-01 15:23:04 -07:00
Matt Turner
63c3a051cd build: Order src/Makefile correctly 2012-10-01 15:23:04 -07:00
Matt Turner
814345f54b build: Use AX_PTHREAD's HAVE_PTHREAD preprocessor definition 2012-10-01 15:23:04 -07:00
Matt Turner
b6651ae6ad build: Use PTHREAD_LIBS and PTHREAD_CFLAGS 2012-10-01 15:23:04 -07:00
Matt Turner
dd4fde8f67 build: Set PTHREAD_LIBS for pkgconfig files if empty 2012-10-01 15:20:50 -07:00
Tom Stellard
00d80b3a6f llvmpipe: Fix build with LLVM 2.8
Commit 8d9778589f added all-targets to the
LLVM_COMPONENTS list, but this component does not exist with LLVM 2.8.

Adding all-targets is not necessary for any drivers, and it seems to be
left over from earlier versions of the commit mentioned above.

Tested-by: Stéphane Marchesin <marcheu@chromium.org>
2012-10-01 17:42:56 -04:00
Tom Stellard
67fcb3c2b4 configure.ac: Use amdgpu component for LLVM 3.2
The amdgpu component actually does exist.  I must have been using an
older version of llvm-config by accident when I first made this change.
2012-10-01 21:14:10 +00:00
Tom Stellard
f2f17fc348 radeon/llvm: Only initialize the AMDGPU target 2012-10-01 21:14:10 +00:00
Tom Stellard
cbd09a9e5c radeon: Fix build with LLVM 3.1
The build was broken by commit 8d9778589f
2012-10-01 15:47:31 -04:00
Tom Stellard
8d9778589f radeon: Support LLVM 3.2
LLVM 3.2 and newer requires that the R600/SI backend be part of the
LLVM tree.
2012-10-01 15:37:17 +00:00
Tom Stellard
91ee735001 r600g: Re-enable growing of the compute memory pool 2012-10-01 15:37:16 +00:00
Tom Stellard
44b1050e6c r600g: Fix bug when adding new items to the compute memory pool
The items are ordered in the item list by their offsets, with the lowest
offset coming first in the list.  The old code was assuming that new
items being added to the list would always have a greater offset than
the first item in the list, however this is not always the case.
2012-10-01 15:37:16 +00:00
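The case fixed by 44b1050e6c, a new item whose offset is lower than that of the current first item, can be handled without special-casing the head. The following is a sketch under assumptions: struct pool_item and pool_insert_sorted are hypothetical and far simpler than the r600g compute memory pool.

```c
#include <stddef.h>

/* Hypothetical list node, ordered by ascending offset. */
struct pool_item {
   long offset;
   struct pool_item *next;
};

/* Insert `item` so the list stays sorted by offset and return the
 * (possibly new) head. Walking a pointer-to-link means insertion
 * before the first item needs no separate code path. */
static struct pool_item *
pool_insert_sorted(struct pool_item *head, struct pool_item *item)
{
   struct pool_item **link = &head;

   while (*link != NULL && (*link)->offset < item->offset)
      link = &(*link)->next;

   item->next = *link;
   *link = item;
   return head;
}
```

Inserting offsets 100, 50 and 75 in that order yields the list 50, 75, 100, with the second insertion replacing the head.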
Tom Stellard
eacca90f43 r600g: Use a RAT buffer as the backing bo for the compute memory pool 2012-10-01 15:37:16 +00:00
Tom Stellard
5cd1c65dc1 r600g: Make sure to init the compute memory pool with enough memory 2012-10-01 15:37:16 +00:00
Tom Stellard
2508d43c36 r600g: Add evergreen_init_color_surface_rat() v2
This can be used to initialize the CB* registers for buffers without a
radeon_surface.

v2:
  - Get correct group_bytes value from r600_screen
  - Stop setting unnecessary fields

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-10-01 15:37:16 +00:00
Tom Stellard
d13c3b19f9 r600g: Add register field definitions for 028C70_RESOURCE_TYPE
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-10-01 15:37:16 +00:00
Oliver McFadden
9545d9611f intel: add support for ANGLE_texture_compression_dxt.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-10-01 17:21:51 +03:00
Alex Deucher
304beb81bb radeonsi: emit PA_SU_PRIM_FILTER_CNTL
has no default value.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <deathsimple@vodafone.de>
2012-10-01 10:29:51 +02:00
Alex Deucher
7d76767f21 radeonsi: remove some old r600g cruft
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <deathsimple@vodafone.de>
2012-10-01 10:29:50 +02:00
Alex Deucher
918e302a19 radeonsi: fix range checking for state regs
end value is exclusive, but in practice we shouldn't
hit this.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-10-01 10:29:50 +02:00
Alex Deucher
f1a3de5e9d radeonsi: drop some cayman remnants
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <deathsimple@vodafone.de>
2012-10-01 10:29:50 +02:00
Christian König
22ae062fa1 radeonsi: define SGPR register numbers
Instead of hardcoding them.

Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-10-01 10:29:50 +02:00
Christoph Bumiller
c321b1bef1 nvc0: make sure handles for unbound textures/samplers are uploaded on nve4 2012-09-30 23:09:37 +02:00
Christoph Bumiller
2149ce41ed nv50,nvc0: fix 3d engine blit for nvc0 2012-09-30 23:09:29 +02:00
Christoph Bumiller
36ea744f58 nv50,nvc0: implement blit 2012-09-30 21:31:45 +02:00
Marek Olšák
de80660c2b gallium: remove resource_resolve
The functionality is provided by the new blit function.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:57 +02:00
Marek Olšák
d37e6b15ad st/mesa: implement decompress_with_blit using gallium blit
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:57 +02:00
Marek Olšák
d1b929a137 st/mesa: implement BlitFramebuffer using gallium blit
This also fixes a lot of tests, especially all the clip-and-scissor-blit MSAA
piglit tests.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:57 +02:00
Marek Olšák
ad3d5dbcc5 svga: implement blit
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:57 +02:00
Marek Olšák
3d9d4b1ce6 softpipe: implement blit
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:57 +02:00
Marek Olšák
5f3054dcc4 radeonsi: implement blit
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:57 +02:00
Marek Olšák
fc887d687b r600g: implement blit
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:57 +02:00
Marek Olšák
95b777e688 r300g: implement blit
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:56 +02:00
Marek Olšák
ced065a079 nv30: implement blit
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:56 +02:00
Marek Olšák
db85443922 nv30: use util_format_is_supported
Hardware drivers *must* use it.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:56 +02:00
Marek Olšák
ff2d192ec5 llvmpipe: implement blit
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:56 +02:00
Marek Olšák
6d2f59ce54 i915g: implement blit
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:56 +02:00
Marek Olšák
2a309dc2b4 gallium: implement blit in driver wrappers
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:56 +02:00
Marek Olšák
ab3070c5fa gallium: add helpers for dumping pipe_box and pipe_blit_info
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:56 +02:00
Marek Olšák
cecfb452ab gallium/u_blitter: add helper for blitting via resource_copy_region
v2: fix off-by-one error in is_box_inside_resource, add comments

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:56 +02:00
Marek Olšák
0b0697e80d gallium/u_blitter: add gallium blit implementation
The original blit function is extended and the other functions reuse it.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:56 +02:00
Marek Olšák
84d2f2295e gallium/u_blitter: add ability to disable and restore the render condition
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:56 +02:00
Marek Olšák
59dfe0af60 gallium/u_blitter: facilitate co-existence with the Draw module
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:56 +02:00
Marek Olšák
9cc257ad40 gallium/u_blitter: check PIPE_CAP_TEXTURE_MULTISAMPLE
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:56 +02:00
Marek Olšák
c4df2e3337 gallium: add blit into the interface
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:56 +02:00
Marek Olšák
b9c9dd4783 gallium: add PIPE_CAP_TEXTURE_MULTISAMPLE
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:56 +02:00
Marek Olšák
c15dbd7ef2 softpipe: fix set_framebuffer_state with uninitialized surfaces past nr_cbufs-1
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 18:57:56 +02:00
Vinson Lee
0615e8324c scons: Use full path of texture_builtins.py.
Fixes this build error on Cygwin.
Explicit dependency `src/glsl/builtins/tools/texture_builtins.py' not
found, needed by target
`build/cygwin-x86-debug/glsl/builtin_function.cpp'.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-29 14:38:37 -07:00
Brian Paul
46328296bd mesa: add fall-through comment, just to be clear 2012-09-29 08:53:59 -06:00
Brian Paul
bd81ebf085 mesa: remove useless GLenum casts 2012-09-29 08:53:59 -06:00
Brian Paul
e77fc1279a mesa: add const qualifier in check_for_ending() to silence warning 2012-09-29 08:24:44 -06:00
Kenneth Graunke
225276c696 i965: Complain about variable index lowering when INTEL_DEBUG=perf.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-29 00:36:56 -07:00
Kenneth Graunke
33dbac78a8 i965: Dump linked shaders on MESA_GLSL=dump.
Often, the original shader IR isn't terribly interesting because a lot
of crucial optimizations haven't been done (such as inlining built-ins).

ir_to_mesa used to print this out for us, but since we don't use it, we
have to do it ourselves.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-29 00:36:45 -07:00
Kenneth Graunke
5cadb3ef7e glsl: Rename variable_entry2 back to variable_entry in struct splitting.
The anonymous namespace should keep these private classes to file scope,
preventing clashes with other symbols of the same name elsewhere.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-29 00:36:01 -07:00
Anuj Phogat
ea0d088727 intel/i965: Disable SampleAlphaToOne if dual source blending enabled
From SandyBridge PRM, volume 2 Part 1, section 12.2.3, BLEND_STATE:
DWord 1, Bit 30 (AlphaToOne Enable):
"If Dual Source Blending is enabled, this bit must be disabled"

Note: This is a candidate for stable branches.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-29 00:10:09 -07:00
Vinson Lee
9549e55f11 scons: Disable build of assembly sources on Cygwin.
The assembly sources currently do not build on Cygwin.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-28 23:29:10 -07:00
Jordan Justen
00905dbf19 mesa: allow MESA_GL_VERSION_OVERRIDE to override the API type
Change the format to MAJOR.MINOR[FC]
For example: 2.1, 3.0FC, 3.1

The FC suffix indicates a forward compatible context, and
is only valid for versions >= 3.0.

Examples:
2.1:   GL Legacy/Compatibility context
3.0:   GL Legacy/Compatibility context
3.0FC: GL Core Profile context + Forward Compatible
3.1:   GL Core Profile context
3.1FC: GL Core Profile context + Forward Compatible

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-28 16:15:51 -07:00
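The MAJOR.MINOR[FC] format from 00905dbf19 could be parsed roughly as below. This is a sketch, not Mesa's parser: struct version_override and parse_version_override are hypothetical names, and the only rules enforced are the ones stated in the commit message (an optional FC suffix, valid only for versions >= 3.0).

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result type for illustration. */
struct version_override {
   int major;
   int minor;
   bool forward_compatible;
};

/* Parse "MAJOR.MINOR" with an optional "FC" suffix.
 * Returns false and leaves *out untouched on malformed input. */
static bool
parse_version_override(const char *s, struct version_override *out)
{
   int major, minor, consumed = 0;

   /* %n records how many characters the numeric part used. */
   if (sscanf(s, "%d.%d%n", &major, &minor, &consumed) != 2)
      return false;
   if (major < 1 || minor < 0)
      return false;

   const char *suffix = s + consumed;
   bool fc = strcmp(suffix, "FC") == 0;

   if (!fc && *suffix != '\0')
      return false;          /* unknown trailing characters */
   if (fc && major < 3)
      return false;          /* FC is only valid for >= 3.0 */

   out->major = major;
   out->minor = minor;
   out->forward_compatible = fc;
   return true;
}
```

With these rules "2.1", "3.0FC" and "3.1" are accepted, while "2.1FC" and "3.1XY" are rejected.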
Ian Romanick
e87c63f288 i965: brwInitVtbl needs to know the chipset generation
Fixes major regressions since de958de.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-09-28 15:39:17 -07:00
Ian Romanick
de958de71b i915: Don't free the intel_context structure when intelCreateContext fails.
intelDestroyContext will eventually be called, and it will clean things up.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53618
2012-09-28 15:05:24 -07:00
Ian Romanick
87f26214d6 i965: Don't free the intel_context structure when intelCreateContext fails.
intelDestroyContext will eventually be called, and it will clean things
up.  The call to brwInitVtbl is moved earlier so that
intelDestroyContext can call the device-specific destructor.  This also
makes the code look more like the i915 code.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54301
2012-09-28 15:05:24 -07:00
Ian Romanick
22897c7497 intel: Don't call intelDestroyContext if there is no context to destroy
Some error paths in the device-specific context creation functions can exit
before the intel_context structure is allocated.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53618
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54301
2012-09-28 15:05:24 -07:00
Ian Romanick
f93cb0bebb dri_util: Use calloc to allocate __DRIcontext
The __DRIcontext contains some pointers, and some drivers check for them to be
NULL in some failure paths.  Instead of sprinkling NULL assignments across the
various drivers, just zero out the whole thing.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Lu Hua <huax.lu@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53618
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54301
2012-09-28 15:05:24 -07:00
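The pattern behind f93cb0bebb, sketched with a made-up structure rather than the real __DRIcontext: allocating with calloc lets failure paths test pointers against NULL without every driver assigning them first. The sketch assumes, as mainstream platforms do, that all-zero bytes represent a null pointer.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a context with driver-owned pointers. */
struct fake_context {
   void *driver_private;
   void *draw;
};

/* calloc zeroes the whole structure, so both pointers start out NULL. */
static struct fake_context *
fake_context_create(void)
{
   return calloc(1, sizeof(struct fake_context));
}

/* Safe to call on a partially initialized context: free(NULL) is a
 * no-op, so an unset driver_private needs no special handling. */
static void
fake_context_destroy(struct fake_context *ctx)
{
   if (ctx == NULL)
      return;
   free(ctx->driver_private);
   free(ctx);
}
```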
Jordan Justen
4c704e5949 main/version: add "(Core Profile)" to version string for core profiles
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-28 14:17:12 -07:00
Eric Anholt
7ae332dc6d glx: Fix compile warnings since 99fee476a1
_glapi_table is a struct full of named function pointers, while the generated
code just wants to treat it as an array of function pointers.  Cast to avoid
the compiler warning.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-09-28 14:00:24 -07:00
Ian Romanick
66159f94a5 mesa/tests: Sanity check the ES2 dispatch table
This test is only built when shared-glapi is used.  Because of changes
elsewhere in the tree that were necessary to make shared-glapi work
correctly with GLX, it's not feasible to make the test function both ways.

The list of expected functions originally came from the functions set by
api_exec_es2.c.  This file no longer exists in Mesa (but api_exec_es1.c
is still generated).  It was the generated file that configured the
dispatch table for ES2 contexts.  This test verifies that all of the
functions set by the old api_exec_es2.c (with the recent addition of VAO
functions) are set in the dispatch table and everything else is a NOP.

When adding ES2 (or ES3) extensions that add new functions, this test
will need to be modified to expect dispatch functions for the new
extension functions.

v2: Expect VAO functions be non-NOP.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-09-28 08:19:54 -07:00
Ian Romanick
d0e1428349 mesa/main: Make no-op dispatch function public
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-09-28 08:19:54 -07:00
Ian Romanick
9c59d11cd2 mesa/tests: Move stub function to a separate file
When building with shared-glapi, we can just use Mesa's _mesa_warning without
problems.  stubs.cpp is only used when shared-glapi is not used.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-09-28 08:19:54 -07:00
Ian Romanick
6c01a0e770 mesa: Don't set uniform dispatch pointers for many things in ES2 or core
NOTE: This is a candidate for the 9.0 branch

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-28 08:19:54 -07:00
Ian Romanick
be66cf950e mesa: Don't set shaderapi dispatch pointers for many things in ES2 or core
v2: Allow GL_ARB_shader_objects functions in core profile because we
still expose the extension string there.  Don't allow
glBindFragDataLocation in GLES3 because it's not part of that API.
Based (mostly) on review comments from Eric Anholt.

NOTE: This is a candidate for the 9.0 branch

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-28 08:19:54 -07:00
Ian Romanick
aa0f588e2d mesa: Don't set vtxfmt dispatch pointers for many things in ES2 or core
NOTE: This is a candidate for the 9.0 branch

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-28 08:19:54 -07:00
Ian Romanick
a13c07f752 mesa: Don't set loopback dispatch pointers for most things in ES2 or core
NOTE: This is a candidate for the 9.0 branch

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-28 08:19:54 -07:00
Ian Romanick
3ef9e43865 mesa: Pass GL context to _mesa_create_save_table
This isn't used by this patch, but it will be necessary for several
follow-on patches.  Separating this out will make it easier to reorder
patches later.

NOTE: This is a candidate for the 9.0 branch

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-28 08:19:53 -07:00
Ian Romanick
ee77061277 mesa: Don't set dispatch pointer for glTexStorage in ES2
NOTE: This is a candidate for the 9.0 branch

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-28 08:19:53 -07:00
Ian Romanick
7f7268d385 mesa: Don't set dispatch pointer for glGetProgramivARB in ES2
This function is not the same as glGetProgramiv.

NOTE: This is a candidate for the 9.0 branch

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-28 08:19:53 -07:00
Ian Romanick
a83b01371e mesa: Don't set dispatch pointer for glResizeBuffersMESA in ES2
NOTE: This is a candidate for the 9.0 branch

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-28 08:19:53 -07:00
Ian Romanick
1c0a44aaf5 mesa: Don't set dispatch pointers for glPointParameter[if][v] in ES2
NOTE: This is a candidate for the 9.0 branch

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-28 08:19:53 -07:00
Ian Romanick
2a3a68e4c7 mesa: Don't set dispatch pointers for glClearDepth or glDepthRange in ES2
NOTE: This is a candidate for the 9.0 branch

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-28 08:19:53 -07:00
Ian Romanick
11927bfc4a mesa: Don't set dispatch pointer for glGetBufferSubData in ES2
NOTE: This is a candidate for the 9.0 branch

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-28 08:19:53 -07:00
Ian Romanick
850412b8ab mesa: Don't set dispatch pointer for glGetDoublev in ES2
NOTE: This is a candidate for the 9.0 branch

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-28 08:19:53 -07:00
Ian Romanick
aa129b0833 mesa: Don't set dispatch pointer for glPointSize in ES2
NOTE: This is a candidate for the 9.0 branch

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-28 08:19:53 -07:00
Ian Romanick
66b956618e mesa: Set dispatch pointer for glShaderBinary
NOTE: This is a candidate for stable branches

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-28 08:19:53 -07:00
Ian Romanick
23ff634c9c gles2: Alias glReadBufferNV with desktop glReadBuffer
NOTE: This is a candidate for the 9.0 branch

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: Kristian Høgsberg <krh@bitplanet.net>
2012-09-28 08:19:53 -07:00
Chad Versace
b589128620 intel: Fix yet-another-bug in intel_texsubimage_tiled_memcpy
The most recent commit that touched this function,

    commit b1d0fe022d
    Author: Chad Versace <chad.versace@linux.intel.com>
    Date:   Wed Sep 26 11:05:12 2012 -0700

        intel: Fix segfault in intel_texsubimage_tiled_memcpy

did fix the segfault, but introduced yet another bug. From Anholt: """You
need to still test format/type, because that's the incoming format (e.g.
GL_RGBA/GL_FLOAT) that you're trying to memcpy."""

This patch re-introduces the checks on the incoming format and type.

Note: This is a candidate for the 9.0 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-09-28 05:04:33 -07:00
Vinson Lee
d239cb1ccf mesa: Fix typo in error message.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2012-09-27 22:32:10 -07:00
Vincent Lejeune
92b3a99ce5 r600g: add some members to radeon_llvm_context
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-28 01:46:38 +02:00
Vincent Lejeune
a1a3792b18 r600g: tgsi-to-llvm path is taken after declarations have been parsed
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-28 01:46:23 +02:00
Kenneth Graunke
3767b25bd3 meta: Use float for temporary images, not (un)signed normalized.
In commit 091eb15b69, Jordan changed get_temp_image_type() to use
_mesa_get_format_datatype() instead of returning GL_FLOAT.  That has
several possible return values: GL_FLOAT, GL_INT, GL_UNSIGNED_INT,
GL_SIGNED_NORMALIZED, and GL_UNSIGNED_NORMALIZED.

We do want to use GL_INT/GL_UNSIGNED_INT for integer formats.  However,
we want to continue using GL_FLOAT for the normalized fixed-point types.
There isn't any code in pack.c to handle GL_(UN)SIGNED_NORMALIZED.
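
As a rough illustration of the policy described above, the sketch below maps a
format datatype to the type used for the temporary image. The helper name
temp_image_type_for() is hypothetical and is not the actual meta.c code; the
enum values are copied from the OpenGL headers only so the sketch builds on
its own.

```c
#include <assert.h>

/* Values as defined in the OpenGL headers, repeated for a standalone build. */
#define GL_INT                 0x1404
#define GL_UNSIGNED_INT        0x1405
#define GL_FLOAT               0x1406
#define GL_UNSIGNED_NORMALIZED 0x8C17
#define GL_SIGNED_NORMALIZED   0x8F9C

/* Hypothetical helper: pick the temporary image type for a format datatype. */
static unsigned
temp_image_type_for(unsigned datatype)
{
   switch (datatype) {
   case GL_INT:
   case GL_UNSIGNED_INT:
      /* Integer formats keep their integer type. */
      return datatype;
   case GL_FLOAT:
   case GL_SIGNED_NORMALIZED:
   case GL_UNSIGNED_NORMALIZED:
      /* Normalized fixed-point data is handled as float. */
      return GL_FLOAT;
   default:
      assert(!"unexpected format datatype");
      return GL_FLOAT;
   }
}
```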

Fixes oglconform's fboarb advanced.blit.copypix, which was regressed by
commit 091eb15b69.

NOTE: This is a candidate for the 9.0 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53573
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-27 15:37:04 -07:00
Chad Versace
7dc0be8a8b intel: Don't advertise GLX_SWAP_COPY_OML
This patch removes all gl_config's with swapMethod=GLX_SWAP_COPY_OML. When
page flipping, we are unable to comply with swap-copy semantics.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-09-27 14:32:40 -07:00
Eric Anholt
e917ed6eee i965: Remove stale comment about rebuilding tnl_program.
It gets built in Mesa core before we're called these days.

Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-27 13:22:52 -07:00
Eric Anholt
7f9e1a7720 i965: Add a comment explaining one of the brw_draw_upload.c loops.
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-27 13:22:49 -07:00
Eric Anholt
0334e8dc25 i965: Remove broken non-interleaved-to-interleaved upload code.
This failed when all the uploads to occur were uniform-type vertex data (like
glColor4f being active across a DrawArrays), because it would upload 1 element
instead of 1 element per vertex.  There was no citation for how this code
helped any particular application, and it breaks ETQW, so just remove it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47170
NOTE: This is a candidate for the 9.0 and 8.0 branches.
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-27 13:22:43 -07:00
Eric Anholt
f3984fbe33 intel: Remove dead intel_format_to_rb_datatype.
This was for some of the old spans-related code that is now gone.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-27 12:52:53 -07:00
Eric Anholt
9ba6f4733c intel: Mark some file-local code as static.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-27 12:52:53 -07:00
Eric Anholt
e0cd633f17 i965: Mark brw_disasm.c tables as static const.
v2: Make the strings in the tables const, too.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-27 12:52:53 -07:00
Eric Anholt
837f06b42f i965: Use visibility cflags on the driver code.
The only symbols that need to be public (those in intel_screen.c that the
loader looks for) are already marked public.  Saves 100k of compiled driver
size.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-27 12:52:53 -07:00
Eric Anholt
0f331bd385 i965/vp: Remove support for non-ARB_vp, non-NV_vp opcodes.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-27 12:52:53 -07:00
Eric Anholt
57bd069849 i965/vp: Remove support for relative addressing of destination registers.
This was added for GLSL support back in the day.  It's prohibited by both
ARB_vp and NV_vp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-27 12:52:53 -07:00
Eric Anholt
410197974b i965/vp: Remove support for reading destination registers.
It's prohibited by ARB_vp and NV_vp, and not used by fixed function t&l.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-27 12:52:53 -07:00
Eric Anholt
7a7081c45a i965/vp: Remove support for GLSL flow control from the old VS backend.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-27 12:52:53 -07:00
Matt Turner
9ed00075d8 build: Link libglapi with pthreads
NOTE: This is a candidate for the 9.0 branch.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=839060
          https://bugs.gentoo.org/show_bug.cgi?id=435152
Reviewed-by: Adam Jackson <ajax@redhat.com>
2012-09-27 10:25:26 -07:00
Matt Turner
7da12426f7 build: Use AX_PTHREAD to detect pthreads
NOTE: This is a candidate for the 9.0 branch.

Reviewed-by: Adam Jackson <ajax@redhat.com>
2012-09-27 10:25:20 -07:00
Marek Olšák
96f50d0cf7 r600g: fix EXP on Cayman
NOTE: This is a candidate for the stable branches.
2012-09-27 19:14:44 +02:00
Marek Olšák
fd5c538464 r600g: fix RSQ of negative value on Cayman
NOTE: This is a candidate for the stable branches.
2012-09-27 19:14:44 +02:00
Marek Olšák
836325bf7e r600g: fix instance divisor on Cayman
Not sure if this is the best way to fix it.

NOTE: This is a candidate for the stable branches.
2012-09-27 19:14:44 +02:00
Marek Olšák
933faae2b8 r600g: flush FMASK and CMASK when changing colorbuffers on Evergreen
This fixes rare graphical corruption.

NOTE: This is a candidate for the stable branches.
2012-09-27 19:14:44 +02:00
Marek Olšák
9f5d6320f2 r600g: use invalid DB hardware formats to disable depth/stencil 2012-09-27 19:14:44 +02:00
Chad Versace
b1d0fe022d intel: Fix segfault in intel_texsubimage_tiled_memcpy
The function segfaulted when a game called glTexSubImage2D on a texture
with internalformat/format/type = GL_SLUMINANCE8/GL_BGRA/GL_UNSIGNED_BYTE.

The function only supports MESA_FORMAT_ARGB8888 and returns early if it
detects an unsupported format. Clearly, its detection condition was
insufficient. This patch fixes it to explicitly check for
MESA_FORMAT_ARGB8888.

Note: This is a candidate for the 9.0 branch (fixes 413c491).
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-09-27 07:35:53 -07:00
Kenneth Graunke
6d6aef7974 i965: Do texture swizzling in hardware on Haswell.
Haswell supports EXT_texture_swizzle and legacy DEPTH_TEXTURE_MODE
swizzling by setting SURFACE_STATE entries.  This means we don't have to
bake the swizzle settings into the shader code by emitting MOV
instructions, and thus don't have to recompile shaders whenever the
swizzles change.

Unfortunately, we can't handle GL_ALPHA this way: unlike all the others,
which store the comparison result in the .r channel (and possibly others
as well), GL_ALPHA puts it in the .a channel.  The GLSL 1.30+ style
functions which return a float always simply return the .r channel,
which would be zero if we handled this as a surface override.  In this
case, fall back to doing it the old way.  DEPTH_TEXTURE_MODE = GL_ALPHA
isn't an interesting performance path anyway.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-26 22:58:30 -07:00
Kenneth Graunke
b5a042a657 i965: Refactor texture swizzle generation into a helper.
It's going to be reused in a second place soon.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-26 22:58:30 -07:00
Vincent Lejeune
ff947c6d65 radeon/llvm: improve select_cc lowering to generate CND* more often
v2: - Simplify isZero()
    - Remove an unused function prototype
    - Clean whitespace trails

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-27 01:43:35 +02:00
Chad Versace
bb7ecb29fb intel: Fix size of temporary etc1 buffer
Fixes valgrind errors in piglit test
oes_compressed_etc1_rgb8_texture-miptree: an invalid write in
_mesa_store_compressed_store_texsubimage() at line 4406 and invalid reads
in texcompress_etc_tmp.h:etc1_parse_block().

The calculation of the size of the temporary etc1 buffer allocated by
intel_miptree_map_etc1() was incorrect. Sometimes the allocated buffer was
too small, sometimes too large.  This patch corrects the size to that
expected by _mesa_store_compressed_store_texsubimage().

Note: This is a candidate for the 9.0 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-09-26 09:47:46 -07:00
Alex Deucher
0aa47b2d8b radeonsi: fix truncated register define.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-26 10:07:46 -04:00
Brian Paul
3ba9dbbabf mesa: move _mesa_es_error_check_format_and_type() to glformats.c
Where the non-ES _mesa_error_check_format_and_type() function lives.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-26 07:43:49 -06:00
Brian Paul
8348076ae4 mesa: move GL_HALF_FLOAT_OES definition to glheader.h
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-26 07:43:49 -06:00
Brian Paul
b52e05cecb mesa: minor fix to glTexSubImage error message 2012-09-26 07:43:49 -06:00
Brian Paul
d3aa6a5c56 mesa: consolidate sub-texture error checking code
Do all error checking of glTexSubImage, glCopyTexSubImage and
glCompressedTexSubImage's xoffset, yoffset, zoffset, width, height, and
depth params in one place.
2012-09-26 07:43:49 -06:00
Brian Paul
7e1ad9cd37 mesa: consolidate glTexSubImage() error checking 2012-09-26 07:43:49 -06:00
Brian Paul
f830f10a37 mesa: consolidate glCompressedTexSubImage() error checking
Do all the checking in one function instead of two and fix up some of
the error checking (alignment check).
2012-09-26 07:43:49 -06:00
Brian Paul
bd3caa50a5 mesa: consolidate subtexture xoffset/yoffset/width/height error checking code
This is the code that checks if a subtexture region is aligned to the
compressed format's block size.
2012-09-26 07:43:49 -06:00
Brian Paul
2558af7e93 mesa: consolidate glCopyTexSubImage error checking
Do all the checking in one function instead of two.
2012-09-26 07:43:49 -06:00
Brian Paul
1f586684d6 mesa: fix incorrect error for glCompressedSubTexImage
If a subtexture region isn't aligned to the compressed block size,
return GL_INVALID_OPERATION, not GL_INVALID_VALUE.
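
A simplified sketch of the rule, assuming a hypothetical helper
check_subtexture_alignment() and block dimensions bw x bh (4x4 for S3TC).
It is not the teximage.c code, and it deliberately ignores any special
handling a real implementation may need for regions that end at the texture
edge.

```c
/* Values as defined in the OpenGL headers, repeated for a standalone build. */
#define GL_NO_ERROR          0
#define GL_INVALID_OPERATION 0x0502

/* Hypothetical helper: return the GL error for a compressed sub-image region. */
static unsigned
check_subtexture_alignment(int xoffset, int yoffset, int width, int height,
                           int bw, int bh)
{
   /* Offsets and sizes must all fall on block boundaries. */
   if (xoffset % bw != 0 || yoffset % bh != 0 ||
       width % bw != 0 || height % bh != 0)
      return GL_INVALID_OPERATION;

   return GL_NO_ERROR;
}
```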

NOTE: This is a candidate for the stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-26 07:43:49 -06:00
Christian Koenig
421eeff463 radeonsi: move draw cmds to si_commands.c
Signed-off-by: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-26 11:05:35 +02:00
Christian Koenig
7773c7109c radeonsi: start separating commands into si_commands.c
Signed-off-by: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-26 11:05:31 +02:00
Christian Koenig
3c51c60ed0 radeonsi: get rid of evergreen_hw_context.c
Signed-off-by: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-26 11:05:27 +02:00
Christian Koenig
fcc9c125f4 radeonsi: remove unused code
Signed-off-by: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-26 11:05:23 +02:00
Christian König
04473db38a radeonsi: start reworking inferred state handling
Instead of tracking the inferred state changes separately
just check if queued and emitted states are the same.

This patch just reworks the update of the SPI map between
vs and ps, but there are probably more cases like this.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-26 11:04:36 +02:00
Paul Berry
112caa853d gles3: Prohibit set/get of GL_FRAMEBUFFER_SRGB.
GLES 3 supports sRGB functionality, but it does not expose the
GL_FRAMEBUFFER_SRGB enable/disable bit.  Instead the implementation
is expected to behave as though that bit is always enabled.

This patch ensures that ctx->Color.sRGBEnabled (the internal variable
tracking GL_FRAMEBUFFER_SRGB) is initially true in GLES 2/3 contexts,
and that it cannot be modified through the GLES 3 API.

This is safe for GLES 2, since ctx->Color.sRGBEnabled has no effect on
non-sRGB formats, and GLES 2 doesn't support any sRGB formats.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-09-25 15:02:43 -07:00
Paul Berry
414f69aaad meta: Properly save/restore GL_FRAMEBUFFER_SRGB in Meta.
Previously, meta logic was saving and restoring the value of
GL_FRAMEBUFFER_SRGB in an ad-hoc fashion.  As a result, it was not
properly disabled and/or restored for some meta operations.

This patch causes GL_FRAMEBUFFER_SRGB to be saved/restored in the
conventional way of meta-ops (using _mesa_meta_begin() and
_mesa_meta_end()).  It is now reliably saved/restored for
_mesa_meta_BlitFramebuffer, _mesa_meta_GenerateMipmap, and
decompress_texture_image, and preserved for all other meta ops.

Fixes piglit tests "ARB_framebuffer_sRGB/blit renderbuffer
{linear_to_srgb,srgb} scaled {disabled,enabled}".

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-09-25 15:01:13 -07:00
Paul Berry
8faa79764c enable: Create _mesa_set_framebuffer_srgb() function for use by meta ops.
GLES3 supports sRGB formats, but it does not support the
GL_FRAMEBUFFER_SRGB enable/disable flag (instead it behaves as if this
flag is always enabled).  Therefore, meta ops that need to disable
GL_FRAMEBUFFER_SRGB will need a backdoor mechanism to do so when the
API is GLES3.

We were already doing a similar thing for GL_MULTISAMPLE, which has
the same constraints.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-09-25 15:01:13 -07:00
Matt Turner
399a03fdd6 targets/xorg-i915: Rename driver to i915_drv.so.
modesetting_drv.so is undescriptive and collides with
xf86-video-modesetting.

Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
2012-09-25 12:04:10 -07:00
Chad Versace
413c491412 intel: Improve teximage perf for Google Chrome paint rects (v3)
This patch reduces the time spent in glTexImage and glTexSubImage by
over 5x on Sandybridge for the workload described below.

It adds a new fast path for glTexImage2D and glTexSubImage2D,
intel_texsubimage_tiled_memcpy, which is optimized for Google Chrome's
paint rectangles. The fast path is implemented only for 2D GL_BGRA
textures for chipsets with a LLC.

=== Performance Analysis ===

Workload description:

    Personalize your google.com page with a wallpaper.  Start chromium
with flags "--ignore-gpu-blacklist --enable-accelerated-painting
--force-compositing-mode".  Start recording with chrome://tracing. Visit
google.com and wait for page to finish rendering.  Measure the time spent
by process CrGpuMain in GLES2DecoderImpl::HandleTexImage2D and
HandleTexSubImage2D.

System config:

    cpu: Sandybridge Mobile GT2+ (0x0126)
    kernel 3.4.9 x86_64
    chromium 21.0.1180.89 (154005)

Statistics:

                  | N   Median  Avg   Stddev
    --------------|-------------------------
    before (msec) | 8   472.5  463.75 72.6
    after  (msec) | 8    78.0   79.6   5.7

    Arithmetic difference at 95.0% confidence:
       -384.1  +/- 55.2 msec
        -82.8% +/- 11.9%

    Ratio at 95.0% confidence:
          5.81 +/- 0.119

v2:
    - Replace check for `intel->gen >= 6` with `intel->has_llc`, per
      danvet.
    - Fix typo in comment, s/throuh/through/.
    - Swap 'before' and 'after' rows in stat table.

v3:
    - If the current batch references the bo, then flush batch before mapping
      the bo. Found by Chris.
    - Restrict supported texture images to level 0 of target
      GL_TEXTURE_2D. This avoids an arithmetic bug in calculating image
      offsets within the miptree, found by Paul. This restriction does not
      diminish this patch's benefit to Chrome OS performance.
    - Use fewer instructions for bit6 swizzling, suggested by Paul.
    - Remove erroneous comment about Y-tiling, for Paul.
    - Print perf_debug messages when flushing and stalling.
    - Update stats in commit message; run workload under a release build
      rather than a debug build.

Note: This is a candidate for the 9.0 branch.
Acked-by: Eric Anholt <eric@anholt.net>
CC: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-09-25 10:58:45 -07:00
Tom Stellard
581619f5a7 clover: Fix build with libclang v3.2
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2012-09-25 14:36:51 +00:00
Tom Stellard
71682cf65b clover: Query device for CL_DEVICE_MAX_MEM_ALLOC_SIZE v2
v2:
  - Use driver reported values and don't correct them to the OpenCL
    required minimum.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2012-09-25 14:36:50 +00:00
Tom Stellard
0e3c30cd6f gallium: Add PIPE_COMPUTE_CAP_MAX_MEM_ALLOC_SIZE v2
v2:
  - Add comment in screen.rst
  - Report OpenCL required minimum for r600g
2012-09-25 14:36:50 +00:00
Tom Stellard
b57eba3654 r600g: Handle multiple kernels in the same program v2
v2:
  - Use pc parameter of launch_grid
2012-09-25 14:36:46 +00:00
Blaž Tomažič
e59505e34b clover: Handle multiple kernels in the same program v2
v2: Tom Stellard
  - Use pc parameter of launch_grid()

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-25 14:27:47 +00:00
Brian Paul
68a4bb553b mesa: remove 'struct' from texenv_fragment_program
texenv_fragment_program is declared as a class.  Fixes warnings with MSVC.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-25 08:23:18 -06:00
Kenneth Graunke
097b4a3b28 i965: Allow fast depth clears if scissoring doesn't do anything.
A game we're working with leaves scissoring enabled, but frequently sets
the scissor rectangle to the size of the whole screen.  In that case,
scissoring has no effect, so it's safe to go ahead with a fast clear.

Chad believes this should help with Oliver McFadden's "Dante" as well.

v2/Chad: Use the drawbuffer dimensions rather than the miptree slice
dimensions.  The miptree slice may be slightly larger due to alignment
restrictions.
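
The check can be sketched roughly as follows. The struct and function names
are hypothetical, not the i965 code; the only point is that a scissor
rectangle covering the whole drawbuffer is treated the same as scissoring
being disabled.

```c
#include <stdbool.h>

/* Hypothetical scissor rectangle, in window coordinates. */
struct sketch_scissor {
   int x, y, width, height;
};

/* True if scissoring cannot clip anything in a fb_width x fb_height drawbuffer. */
static bool
scissor_is_noop(bool scissor_enabled, const struct sketch_scissor *s,
                int fb_width, int fb_height)
{
   if (!scissor_enabled)
      return true;

   /* Compare against the drawbuffer size, not the (possibly larger) miptree slice. */
   return s->x <= 0 && s->y <= 0 &&
          s->x + s->width >= fb_width &&
          s->y + s->height >= fb_height;
}
```

A fast clear would then be allowed whenever scissor_is_noop() returns true.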

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-and-tested-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
2012-09-25 07:03:59 -07:00
Paul Berry
ab5ce2789f i965: Don't spill "smeared" registers.
Fixes an assertion failure when compiling certain shaders that need both
pull constants and register spilling:

brw_eu_emit.c:204: validate_reg: Assertion `execsize >= width' failed.

NOTE: This is a candidate for release branches.

Signed-off-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-25 07:02:14 -07:00
Jay Cornwall
374925bec9 nv50/ir/ra: Fix register interference tracking.
See fdo bug 55224.
2012-09-25 14:00:51 +02:00
Paul Berry
124b214f09 i965/blorp: Fix sRGB MSAA resolves.
Commit e2249e8c4d (i965/blorp: Add
support for blits between SRGB and linear formats) changed blorp to
always configure surface states in linear format (even if the
underlying surface is sRGB).  This allowed sRGB-to-linear and
linear-to-sRGB blits to occur without causing the image to be
inappropriately brightened or darkened.

However, it broke sRGB MSAA resolves, since they rely on the
destination buffer format being sRGB in order to ensure that samples
are averaged together in sRGB-correct fashion.

This patch fixes the problem by instead configuring the source buffer
to use the *same* format as the destination buffer.  This ensures that
the image won't be brightened or darkened, but preserves proper sRGB
averaging.

Fixes piglit tests "EXT_framebuffer_multisample/accuracy srgb".

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55265

NOTE: This is a candidate for stable release branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-24 17:03:36 -07:00
Jonas Maebe
5fdf1f784b darwin: do not create double-buffered offscreen pixel formats
http://xquartz.macosforge.org/trac/ticket/536

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2012-09-24 16:06:07 -07:00
Tom Stellard
92b033a89e radeon/llvm: Fix instruction encoding for r600 family GPUs
Tested-by: Michel Dänzer <michel.daenzer@amd.com>

https://bugs.freedesktop.org/show_bug.cgi?id=55217
2012-09-24 17:01:31 -04:00
Brian Paul
24a8e0c3da build: remove signbit check in configure.ac
We now have a fallback macro in imports.h
This reverts part of 0f3ba405.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-24 14:48:23 -06:00
Brian Paul
14ca76646a mesa: add signbit() macro
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-24 14:48:06 -06:00
Tom Stellard
defe8f0da2 r600g: Set RADEON_FLUSH_KEEP_TILING_FLAGS when emitting compute cs 2012-09-24 18:35:50 +00:00
Robert Bragg
dda49c3cb7 build: substitute X11_INCLUDES variable
There are a few automake files that reference $(X11_INCLUDES) such as
src/glx/Makefile.am but configure.ac wasn't declaring the variable for
substitution. This would break builds of glx if libxcb, for example, was
installed in its own prefix since AM_CFLAGS wouldn't coincidentally
list the needed include path in that case.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-24 09:49:53 -07:00
Matt Turner
0f3ba405ea Use signbit() in IS_NEGATIVE and DIFFERENT_SIGNS
signbit() appears to be available everywhere (even MSVC according to
MSDN), so let's use it instead of open-coding some messy and confusing
bit twiddling macros.
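
For illustration only, the two tests can be written on top of the C99
signbit() macro as below. The lowercase function names are hypothetical
stand-ins for the IS_NEGATIVE and DIFFERENT_SIGNS macros, not their actual
definitions.

```c
#include <math.h>
#include <stdbool.h>

/* signbit() returns nonzero when the sign bit is set, including for -0.0f. */
static bool
is_negative(float x)
{
   return signbit(x) != 0;
}

/* Normalize both results to bool before comparing them. */
static bool
different_signs(float x, float y)
{
   return (signbit(x) != 0) != (signbit(y) != 0);
}
```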

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54805
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
2012-09-24 09:45:17 -07:00
Francisco Jerez
959fe586fb clover: Silence narrowing conversion warnings in resource.cpp. 2012-09-24 18:36:34 +02:00
Tom Stellard
01877a6fd0 clover: Handle NULL value for clEnqueueNDRangeKernel local_work_size
[ Francisco Jerez: Slight simplification. ]
2012-09-24 18:35:43 +02:00
Paul Berry
a33ce665a5 i965/blorp: Increase Y alignment for multisampled stencil blits.
This patch is a band-aid fix for a bug in commit 5fd67fa (i965/blorp:
Reduce alignment restrictions for stencil blits), which causes
multisampled stencil blits to work incorrectly on Sandy Bridge.

When blitting to or from a normal stencil buffer, we have to use a
coordinate transformation that swizzles coordinates to account for the
fact that stencil buffers use W tiling, but the most similar tiling
format available for textures and render targets is Y tiling.  The
differences between W and Y tiling cause pixels to be scrambled within
a block of size 8x4 (width x height) as measured relative to a W tile,
or 16x2 as measured relative to a Y tile.  So in order to make sure
that pixels at the edges of the blit aren't lost, we need to align the
rendering rectangle (and the buffer sizes) to multiples of the 8x4
block size.  This alignment happens in the brw_blorp_blit_params
constructor, whereas the determination of how to swizzle the
coordinates happens during code generation, in the
brw_blorp_blit_program class.

When blitting to or from a multisampled stencil buffer, the coordinate
swizzling is more complex, because it has to account for the
interleaving pattern of samples, which uses 4x4 blocks for 4x MSAA and
8x4 blocks for 8x MSAA.  The end result is that if multisampling is in
use, the 16x2 block size (relative to a Y tile) needs to be expanded
to 16x4, and the corresponding size relative to a W tile expands to
8x8.

The problem doesn't affect Ivy Bridge severely enough to crop up in
Piglit tests because on Ivy Bridge we have to disable multisampling
when blitting *to* a multisampled stencil buffer (the blorp compiler
generates code to compensate for the fact that multisampling is
disabled).  However I suspect a bug is still present because we don't
disable multisampling when blitting *from* a multisampled stencil
buffer.

This patch fixes the problem by doubling the vertical alignment
requirement when blitting to or from a multisampled stencil buffer,
and multisampling has not been disabled.
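
A minimal sketch of the alignment step, assuming coordinates measured
relative to a W tile (8x4 blocks, or 8x8 when multisampling is in use). The
struct and helper names are hypothetical and do not come from the
brw_blorp_blit_params constructor.

```c
#include <stdbool.h>

/* Hypothetical rectangle: [x0, x1) x [y0, y1), all coordinates non-negative. */
struct sketch_rect {
   int x0, y0, x1, y1;
};

static int
round_down(int v, int align)
{
   return v - (v % align);
}

static int
round_up(int v, int align)
{
   return round_down(v + align - 1, align);
}

/* Expand the rectangle outward so every edge lands on a block boundary. */
static struct sketch_rect
align_stencil_blit_rect(struct sketch_rect r, bool multisampled)
{
   const int x_align = 8;
   const int y_align = multisampled ? 8 : 4;   /* doubled for MSAA */
   struct sketch_rect out;

   out.x0 = round_down(r.x0, x_align);
   out.y0 = round_down(r.y0, y_align);
   out.x1 = round_up(r.x1, x_align);
   out.y1 = round_up(r.y1, y_align);
   return out;
}
```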

In the long run I would like to rework the brw_blorp_blit_params
constructor--it's difficult to follow and has had several subtle bugs
like this one.  However this band-aid fix should be suitable for
cherry-picking to release branches.

Fixes Piglit tests "unaligned-blit {2,4} stencil {msaa,upsample}" on
Sandy Bridge.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-24 09:07:31 -07:00
Brian Paul
68060cfb2b upgrade glext.h to version 85
NOTE: This is a candidate for the stable branches.
2012-09-24 08:07:08 -06:00
Brian Paul
f1c448d2e5 st/mesa: check for zero-size image in st_TestProxyTexImage()
Fixes divide by zero issue in llvmpipe driver.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-09-24 07:58:45 -06:00
Kenneth Graunke
c432c86e6a mesa: Silence narrowing warnings in ff_fragment_shader's emit_texenv().
Recent versions of GCC report a warning for the implicit conversion from
int to float:

  ff_fragment_shader.cpp:897:3: warning: narrowing conversion of '(1 << ((int)rgb_shift))' from 'int' to 'float' inside { } is ill-formed in C++11 [-Wnarrowing]

This is because floats cannot precisely represent all possible 32-bit
integer values.  However, texenv code is all expected to be floating
point, so this should not be a problem.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-23 22:38:58 -07:00
Marek Olšák
60e610e042 docs: fixup GL4.3 TODO list
From the OpenGL Registry:
  "2012/08/13: specs named GL_ARB_debug_group, GL_ARB_debug_label, and
   GL_ARB_debug_output2 were published in error during the initial OpenGL 4.3
   release. All functionality in these documents was combined into
   the extension GL_KHR_debug. They have been withdrawn from the registry,
   and a few other extensions were renumbered to avoid holes in the numbering
   scheme."
2012-09-23 17:19:52 +02:00
Vincent Lejeune
fb40f88338 radeon/llvm: support for interpolation intrinsics
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-22 18:12:11 +02:00
Marek Olšák
2988fa940e draw: fix non-indexed draw calls if there's an index buffer
pipe_draw_info::indexed determines if it should be indexed and not
the presence of an index buffer.
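
The distinction can be sketched as below. The struct is a cut-down,
hypothetical stand-in for pipe_draw_info, and fetch_vertex_index() is not a
real draw-module function; the sketch only shows that the indexed flag, not
a non-NULL index buffer, selects the path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced draw description. */
struct sketch_draw_info {
   bool indexed;      /* the only thing that selects indexed drawing */
   unsigned start;    /* first vertex, or first element in the index buffer */
};

/* Return the vertex number for the i-th vertex of the draw. */
static unsigned
fetch_vertex_index(const struct sketch_draw_info *info,
                   const unsigned short *indices, unsigned i)
{
   if (info->indexed) {
      assert(indices != NULL);
      return indices[info->start + i];
   }

   /* A bound index buffer is ignored for non-indexed draws. */
   return info->start + i;
}
```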

This fixes crashes in r300g.

NOTE: This is a candidate for the stable branches.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-22 14:20:37 +02:00
Tom Stellard
bbb2ebe2fc r600g: Fix build with LLVM compiler 2012-09-21 20:07:14 -04:00
Marek Olšák
bfe489c76b r600g: set QUANT_MODE on Cayman too
This fixes piglit/fbo-blit-stretched.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-22 00:31:59 +02:00
Marek Olšák
11e2a41b84 r600g: use CS helpers to emit streamout state
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-22 00:31:59 +02:00
Marek Olšák
669bfaaa1e r600g: remove initialization of unused loop register tables
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-22 00:31:59 +02:00
Marek Olšák
b71701d43e r600g: remove now-unused SURFACE_BASE_UPDATE logic
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-22 00:31:58 +02:00
Marek Olšák
e3ecfecada r600g: remove unused CB registers from register lists
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-22 00:31:58 +02:00
Marek Olšák
c8b06dccff r600g: atomize framebuffer state
Tested on RS880, Evergreen and Cayman.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-22 00:31:58 +02:00
Marek Olšák
b652180107 r600g: don't snoop context state while building shaders
Let's use the shader key describing the state.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-22 00:31:58 +02:00
Anuj Phogat
eb1d87fb94 meta: Add on demand compilation of per target shader programs
A call to glGenerateMipmap() follows the generation of a relevant
shader program in setup_glsl_generate_mipmap().

To support all texture targets and to avoid compiling shaders
every time, per target shader programs are compiled on demand
and saved for the next call.

Fixes float-texture(mipmap.manual):
See Comment 6: https://bugs.freedesktop.org/show_bug.cgi?id=54296

NOTE: This is a candidate for stable branches.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-21 13:31:58 -07:00
Tom Stellard
8ed9aaea51 clover: Initialize height and depth to 1 for transfers
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2012-09-21 19:45:17 +00:00
Tom Stellard
024e1732cb pipe-loader: Remove a few debug_printfs
On debug builds these were always being printed.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2012-09-21 19:45:07 +00:00
Tom Stellard
438b1da7e5 radeon/llvm: Handle loads from the constants address space.
Reading from constant memory is not supported yet, so constant reads use
global memory.
2012-09-21 19:30:58 +00:00
Tom Stellard
3882d7b5e4 radeon/llvm: Add support for v4f32 stores on R600 2012-09-21 19:30:58 +00:00
Tom Stellard
e866dbd1b5 radeon/llvm: Add support for i8 reads on R600 2012-09-21 19:30:57 +00:00
Tom Stellard
b282c9611e radeon/llvm: Expand vector fadd and fmul on R600 2012-09-21 19:30:57 +00:00
Tom Stellard
aa8367dd13 radeon/llvm: Add optimization for FP_ROUND 2012-09-21 19:30:57 +00:00
Tom Stellard
87decd6e66 radeon/llvm: Replace AMDGPU pow intrinsic with the llvm version 2012-09-21 19:30:53 +00:00
Paul Berry
aa3c2e3186 i965/blorp: Fix narrowing warnings.
Blorp has to convert rectangle coordinates from integers to floats in
order to send them down the GPU pipeline.  Recent versions of GCC
issue a warning for this, since a float is not capable of precisely
representing all possible 32-bit integer values.  Suppress the warning
with an explicit type cast in the case of blorp, since rectangle
coordinates will never be large enough to cause a loss of precision.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-21 10:53:25 +02:00
Kenneth Graunke
cd49025aff i965: Remove brw_set_predicate_inverse(p, true) from scratch offset code
Given that it exists between a push/pop of instruction state, this call
can only affect the MOV or ADD instruction generated just below it.
Neither of those instructions are predicated, so it makes no sense to
ask for the inverse predicate.

This fixes grumblings from the simulator debugger, which was
complaining about an invalid predicate.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-21 01:29:40 -07:00
Kenneth Graunke
328961d955 mesa: Don't override S3TC internalFormat if data is pre-compressed.
Commit 42723d88d intended to override an S3TC internalFormat to a
generic compressed format when the application requested online
compression of uncompressed data.  Unfortunately, it also broke
pre-compressed textures when libtxc_dxtn isn't installed but the
extensions are forced on.

Both glCompressedTexImage2D() and glTexImage2D() call teximage(), which
calls _mesa_choose_texture_format(), hitting this override code.  If we
have actual S3TC source data, we can't treat it as any other format, and
need to avoid the override.

Since glCompressedTexImage2D() passes in a format of GL_NONE (which is
illegal for glTexImage), we can use that to detect the pre-compressed
case and avoid the overrides.

Fixes a regression since 42723d88d3.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-and-tested-by: Jordan Justen <jordan.l.justen@intel.com>
2012-09-20 14:49:19 -07:00
Kenneth Graunke
e2249e8c4d i965/blorp: Add support for blits between SRGB and linear formats.
Fixes colorspace issues in L4D2 when multisampling is enabled (the
scene was far too dark, but the flashlight area was way too bright).

The nVidia and AMD binary drivers both allow this kind of blit.

NOTE: This is a candidate for the 9.0 branch.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-20 14:48:02 -07:00
Kenneth Graunke
c96828ecb4 mesa: Ignore SRGB when determining compatible resolve formats.
MSAA resolves and other blit-like operations ignore SRGB state anyway,
so we should be able to safely allow resolves between compatible
SRGB/linear formats like SRGBA8 and RGBA8888.

This matches the behavior of the nVidia and AMD binary drivers.

Fixes completely black rendering when using multisampling in L4D2.

NOTE: This is a candidate for the 9.0 branch.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-20 14:47:23 -07:00
Andreas Boll
8504f18c3d docs: update some more FAQs
v2: remove mention of XFree86

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-20 20:00:37 +02:00
Andreas Boll
0188b9371f docs: remove utility.html
This page is very old and some of the links are dead.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-20 20:00:37 +02:00
Andreas Boll
19195781c8 docs: remove science.html
This page is very old and some of the links are dead.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-20 20:00:37 +02:00
Andreas Boll
19fe84d8df docs: remove modelers.html
This page is very old and some of the links are dead.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-20 20:00:36 +02:00
Andreas Boll
ca6ff299c5 docs: remove libraries.html
This page is very old and some of the links are dead.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-20 20:00:36 +02:00
Andreas Boll
b1c75e7257 docs: remove games.html
This page is very old and some of the links are dead.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-20 20:00:36 +02:00
Andreas Boll
9e2af606b6 docs/contents: add autoconf.html link
make it easier to find the docs/autoconf.html site

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-20 20:00:36 +02:00
Andreas Boll
7b314b3b14 docs: convert last traces of progs to mesa/demos repository
v2: fix typo

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-20 20:00:36 +02:00
Andreas Boll
356a73145e docs: add IRC info
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-20 20:00:36 +02:00
Andreas Boll
76d4f9e404 docs/egl: improve markup
replace unordered list <ul> with definition list <dl>

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-20 20:00:36 +02:00
Andreas Boll
63eade4b60 docs/autoconf: improve markup
replace unordered list <ul> with definition list <dl>

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-20 20:00:36 +02:00
Andreas Boll
ab06629d5d docs/autoconf: remove obsolete demo options
removed with commit 56c3cce2a1
two years ago

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-20 20:00:36 +02:00
Andreas Boll
d61707d0f8 docs: improve quality of gears.png
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-20 20:00:36 +02:00
Brian Paul
58f386b20b gallium: mention PIPE_TIMEOUT_INFINITE in the fence_finish() comment 2012-09-20 09:49:12 -06:00
Brian Paul
0bcad02955 llvmpipe: fix overflow bug in total texture size computation
v2: use uint64_t for the total_size variable, per Jose.

Also add two earlier checks for exceeding the max texture size.
For example a 1K^3 RGBA volume would overflow the lpr->image_stride
variable.

Use simple algebra to avoid overflow in intermediate values.
So instead of "x * y > z" use "x > z / y".
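
A minimal sketch of that rearrangement, assuming unsigned 32-bit operands. The function name is hypothetical and is not the llvmpipe code itself:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when x * y would exceed limit, decided
 * without ever forming the (possibly overflowing) product x * y. */
bool product_exceeds(uint32_t x, uint32_t y, uint32_t limit)
{
    if (y == 0)
        return false;       /* x * 0 is 0, which never exceeds limit */
    /* For integers, x * y > limit is equivalent to x > limit / y
     * with truncating division. */
    return x > limit / y;
}
```

For example, 65536 * 65536 wraps to 0 in 32-bit arithmetic, but the divided form still reports that the product exceeds the limit.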

This should work if we happen to be on a platform that doesn't have
64-bit types.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-20 09:47:09 -06:00
Alex Deucher
7b4aefd3c9 r600g/llvm: rs780/rs880 are r600 asics
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-20 11:17:52 -04:00
Ian Romanick
ae3023e967 mesa: Allow glGetTexParameter of GL_TEXTURE_SRGB_DECODE_EXT
This was already (correctly) supported for glGetSamplerParameter paths.

NOTE: This is a candidate for stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-20 11:42:56 +02:00
Tom Stellard
bd8fb9e805 r300/compiler: Use precomputed q values in the register allocator 2012-09-19 19:25:53 -04:00
Tom Stellard
886a4d4a6a r300g: Init regalloc state during context creation
Initializing the regalloc state is expensive, and since it is always
the same for every compile we only need to initialize it once per
context.  This should help improve shader compile times for the driver.
2012-09-19 19:25:53 -04:00
Tom Stellard
9282adcae9 r300/compiler: Don't create register classes for inputs 2012-09-19 19:25:53 -04:00
Tom Stellard
e0f64a837f ra: Add q_values parameter to ra_set_finalize()
This allows the user to pass precomputed q values to the allocator.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-19 19:25:53 -04:00
Tom Stellard
cfeb99c7da ra: Clarify usage of ra_set_node_reg()
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-19 19:25:53 -04:00
Tom Stellard
69b387fbdc r600g: Invalidate texture cache when creating vertex buffers for compute v2
Compute shaders fetch data from vertex buffers via the texture cache, so
we need to make sure the texture cache is flushed.

v2:
  - Fix rebase mistake
  - Fix spelling in comment

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-09-19 14:58:53 -04:00
Tom Stellard
810345492e r600g: Use LOOP_START_DX10 for loops
LOOP_START_DX10 ignores the LOOP_CONFIG* registers, so it is not limited
to 4096 iterations like the other LOOP_* instructions.  Compute shaders
need to use this instruction, and since we aren't optimizing loops with
the LOOP_CONFIG* registers for pixel and vertex shaders, it seems like
we should just use it for everything.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-09-19 14:58:53 -04:00
Tom Stellard
3e3ca92718 r600g: Set the correct value of COLOR*_DIM for RATs
For buffers (which is what is being used for RATs), the
COLOR*_DIM.WIDTH_MASK field needs to be set to the low 16-bits of the
buffer size, and the COLOR*_DIM.HEIGHT_MAX needs to be set to the
high bits.
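
The split described above might look roughly like the following. The struct and function names are hypothetical, and the sketch ignores the real register's bit positions, field widths, and units; it only shows the low-16/high-bits division of the size.

```c
#include <stdint.h>

/* Hypothetical container for the two halves of a buffer size. */
struct rat_dim {
    uint32_t width_low16;   /* low 16 bits of the buffer size */
    uint32_t height_high;   /* remaining high bits of the buffer size */
};

struct rat_dim split_buffer_size(uint32_t size)
{
    struct rat_dim d;
    d.width_low16 = size & 0xffffu;
    d.height_high = size >> 16;
    return d;
}
```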

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-09-19 14:58:53 -04:00
Tom Stellard
9db64530bb r600g: Make sure to initialize DB_DEPTH_CONTROL register for compute
The kernel CS checker will fail if this register is not initialized.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-09-19 14:58:53 -04:00
Tom Stellard
69d814885b r600g: Add some comments and debug printfs to compute code
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-09-19 14:58:53 -04:00
Tom Stellard
6bd11bc9d5 r600g: Add missing break to case statement 2012-09-19 15:27:32 -04:00
Michal Sciubidlo
0e0c21e00e radeon/llvm: Emit ISA for ALU instructions in the R600 code emitter
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-19 13:17:41 -04:00
Tom Stellard
d525ed1a84 radeon/llvm: Only support 512 constant registers on R600
This is necessary for upcoming encoding changes, since we will only be
using 9-bits for register encoding.
2012-09-19 13:11:36 -04:00
Andreas Boll
5abb1f8bde docs: update faq 2012-09-19 18:23:45 +02:00
Andreas Boll
0aad2e400c docs: update sourcetree
- add OpenCL state tracker Clover

- add XvMC state tracker

- remove progs
  directory got moved into its own repository mesa/demos

- remove vf
  directory removed with abda64efce
2012-09-19 18:23:45 +02:00
Andreas Boll
7a40dc1992 docs: remove obsolete r300c traces 2012-09-19 18:23:45 +02:00
Brian Paul
ead9cfdcc4 Revert "mesa: consolidate subtexture x/y/width/height error checking code"
This reverts commit 5b807400a8.

accidentally pushed.
2012-09-19 10:07:45 -06:00
Brian Paul
e1e302c7f6 Revert "more comment"
This reverts commit 5205db6a7c.

accidentally pushed
2012-09-19 10:07:34 -06:00
Brian Paul
f51d232e5f Revert "mesa: clean-up and fix glCompressedTexSubImage error checking"
This reverts commit 0c67fe5d2d.

accidentally pushed.
2012-09-19 10:07:22 -06:00
Brian Paul
7c8c90c4e4 docs: fix "Cppyright" typo 2012-09-19 10:01:04 -06:00
Brian Paul
0c67fe5d2d mesa: clean-up and fix glCompressedTexSubImage error checking 2012-09-19 09:21:03 -06:00
Brian Paul
5205db6a7c more comment 2012-09-19 09:21:03 -06:00
Brian Paul
5b807400a8 mesa: consolidate subtexture x/y/width/height error checking code
This is the code that checks if a subtexture region is aligned to the
compressed format's block size.
2012-09-19 09:21:03 -06:00
Andreas Boll
a73c59b7a6 docs: remove obsolete target attribute 2012-09-19 17:15:48 +02:00
Andreas Boll
7b09254883 docs: news.html is the new index.html 2012-09-19 17:15:47 +02:00
Andreas Boll
ac5cee934f docs: remove obsolete frame layout 2012-09-19 17:15:47 +02:00
Andreas Boll
b5da52ac58 docs: add new iframe layout 2012-09-19 17:15:47 +02:00
Andreas Boll
ad05f2e429 docs/news: linkify some active links 2012-09-19 17:15:45 +02:00
Andreas Boll
cc7eea955a docs/news: deactivate dead links
I have left the links as <code> elements for the purpose of
documentation.
2012-09-19 17:15:39 +02:00
Andreas Boll
6e0c2702e3 docs/news: drop redundant link 2012-09-19 17:15:34 +02:00
Andreas Boll
9ddf74d443 docs/news: update link 2012-09-19 17:15:31 +02:00
Andreas Boll
83937a2c0f docs/news: remove link to a non-existent page 2012-09-19 17:15:24 +02:00
Andreas Boll
6fb8aeb2c5 docs: fix some issues in relnotes
improve markup
fix link to relnotes-9.0
add missing relnotes links
2012-09-19 12:12:38 +02:00
Andreas Boll
abb1c847ac docs/devinfo: fix typo 2012-09-19 12:10:32 +02:00
Vadim Girlin
9aa8bac98b winsys/radeon: fix relocs caching
Don't cache pointers to elements of reallocatable array.
In some circumstances it caused false cache hits resulting in incorrect
command stream and gpu lockup.

Note: This is a candidate for the stable branches.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-09-19 04:48:16 +04:00
Vincent Lejeune
175fdd7b86 radeon/llvm: Add a fdiv pattern.
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-09-18 18:00:20 +02:00
Vincent Lejeune
12c4526157 radeon/llvm: reserve also corresponding 128bits reg
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-09-18 17:59:51 +02:00
Andreas Boll
88c3647e0b docs: drop obsolete sourceforge link
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-09-18 08:32:50 -06:00
Brian Paul
7d624799b9 softpipe: implement the new can_create_resource() function
And define a SP_MAX_TEXTURE_SIZE value as we do in llvmpipe.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:27 -06:00
Brian Paul
b9e88c5592 llvmpipe: implement the new can_create_resource() function
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:27 -06:00
Brian Paul
ead8847d44 st/mesa: implement new proxy texture code
If the gallium driver implements the can_create_resource() function, call
it to do proxy texture size checks.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:27 -06:00
Brian Paul
bd8b43a9f4 gallium: add new pipe_screen::can_create_resource() function
Used to implement proxy textures.  If a gallium driver doesn't implement
this function we'll just continue to use the core Mesa fallback code.

Without this hook we really have no good way to implement OpenGL proxy
textures with gallium drivers.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:27 -06:00
Brian Paul
a0fc7620f5 mesa: take cube faces into account in _mesa_test_proxy_teximage()
There will always be six cube faces so take that into consideration when
computing the texture size and comparing against the limit.
2012-09-17 19:49:27 -06:00
Brian Paul
90ca4c0c62 mesa: handle GL_PROXY_TEXTURE_CUBE_MAP in _mesa_num_tex_faces() 2012-09-17 19:49:27 -06:00
Brian Paul
df73be9105 llvmpipe: set max cube texture size to 4K x 4K
Before, the limit was 8K.  For 32-bit RGBA that would require 1.5 GB
of memory (w/out mipmaps).  That's well beyond the LP_MAX_TEXTURE_SIZE
of 1GB.
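
The arithmetic behind those figures can be checked with a short sketch. The function name is hypothetical; it counts only the base mipmap level of six square faces.

```c
#include <stdint.h>

/* Hypothetical helper: bytes needed for the base level of a cube map
 * with six edge x edge faces, using 64-bit math to avoid overflow. */
uint64_t cube_texture_bytes(uint32_t edge, uint32_t bytes_per_texel)
{
    return (uint64_t) edge * edge * bytes_per_texel * 6;
}
```

With 4 bytes per texel, an 8K edge gives 1610612736 bytes (1.5 GB), while a 4K edge gives 402653184 bytes (384 MB), which fits under a 1 GB cap.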

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:26 -06:00
Brian Paul
7dc76e9424 mesa: move/fix levels check for glTexStorage()
Fix copy&paste error and move min levels check closer to max levels check.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:26 -06:00
Brian Paul
ff24ed09fa mesa: rewrite glTexStorage() code
Simplify the code and make it more like the other glTexImage commands.
Call _mesa_legal_texture_dimensions() to validate width, height, depth.
Call ctx->Driver.TestProxyTexImage() to make sure texture is not too large.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:26 -06:00
Brian Paul
e6eaa85a43 mesa: rework texture size error checking
There are two aspects to texture image size checking:
1. Are the width, height, depth legal values (not negative, not larger
   than the max size for the mipmap level, etc)?
2. Is the texture just too large to handle?  For example, we might not be
   able to really allocate memory for a 3D texture of maxSize x maxSize x
   maxSize.

Previously, we did (1) via the ctx->Driver.TestProxyTextureImage() hook
but those tests are really device-independent.  Now we do (2) via that
hook since the max texture memory and texture shape are device-dependent.

Also, (1) is now done outside the general texture parameter error checking
functions because of the special interaction with proxy textures.  The
recently introduced PROXY_ERROR token is removed.

The teximage() and copyteximage() functions are a bit simpler now (less
if-then nesting, etc.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:26 -06:00
Brian Paul
ce2ae3c3a2 mesa: refactor _mesa_test_proxy_teximage() code
Basically, move the body into a new _mesa_legal_texture_dimensions() function.
More refactoring to come.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:26 -06:00
Brian Paul
b1874ec931 mesa: move glTexImage 'level' error checking
Move level checking out of _mesa_test_proxy_teximage() and into
the other error-checking functions.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:26 -06:00
Brian Paul
35f16600b3 mesa: change create_version_string() return type to void
Fixes "warning: no return statement in function returning non-void"
2012-09-17 19:46:20 -06:00
Dave Airlie
1ce9f25fde glsl: make _mesa_builtin_uniform_desc static
I can't see any reason this is global (unless for debugging)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-09-18 07:57:42 +10:00
Tom Stellard
bfd55711c1 radeon/llvm: Initial flow control support for SI
This adds basic flow control support for If-Then-Else blocks using
predicates (stored in the EXEC register) and a predicate stack for
nested flow control.
2012-09-17 21:09:43 +00:00
Xinya Zhang
ef0d7e13d7 r600g: Close a memory leak of llvm byte streams
No regressions found in the tests of opencl-example/run_tests.sh.

Signed-off-by: Xinya Zhang <zxy_thf@hotmail.com>
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-17 21:09:43 +00:00
Tom Stellard
0b1a182905 radeon/llvm: Fix unused variable warning 2012-09-17 21:09:43 +00:00
Tom Stellard
059a56bddb radeon/llvm: Move kernel arg lowering into R600TargetLowering class 2012-09-17 21:09:43 +00:00
Jordan Justen
9fac1d1c3a main/version: consolidate version string creation for ES/Desktop GL
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-09-17 13:42:09 -07:00
Eric Anholt
81dff4f752 i965: Stop putting 8 NOPs after each program.
As far as I can see, the intention of the requirement that we do so is to
prevent instruction prefetch from wandering out into either unmapped memory or
memory with a different caching type, and hanging the chip.  The kernel makes
sure that the page after your BO has a valid page of the same caching type,
which meets this requirement, so there's no need to waste space between our
programs (and in instruction cache) on this.

Saves another 9kb instructions in l4d2 shaders.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-17 12:32:52 -07:00
Eric Anholt
3e165ba62c i965: Test instruction compaction on gen7 2012-09-17 12:32:52 -07:00
Kenneth Graunke
bce72170ea i965: Add support for instruction compaction on Gen7.
Reduces l4d2 program size from 1195kb to 919kb.  Improves performance by 0.22%
+/- 0.11% (n=70).

v2: Rebase on compaction v2, fix up flag reg handling (by anholt).
v3: Fix uncompaction of the flag register number.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-17 12:32:52 -07:00
Eric Anholt
f25aefcebe i965: Support instruction compaction between control flow.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-09-17 12:32:52 -07:00
Eric Anholt
077d01b673 i965: Add support for instruction compaction.
This reduces program size by using some smaller encodings for common bit
patterns in the Gen ISA, with the hope of making programs fit in the
instruction cache better.

v2: Use larger bitshifts for the uncompressed field setups, in line with the
    way it's described in the spec.  Consistently name a brw_compile "p" like
    all other code.  Add a couple more tests.  Consistently call things
    "compacted" not "compressed" (which is a different feature).  Drop the
    explicit check for not compacting SENDs, which is unjustified and already
    implied by our lack of support for immediate values.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-09-17 12:32:52 -07:00
Eric Anholt
f5e2706395 i965: Prepare the break/cont uip/jip setting for compacted instructions.
The first cut at instruction compaction won't compact things that
would change control flow jump distances, but we do need to still be
able to walk the instruction stream, which involves jumping by 8 or 16
bytes between instructions.
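
Walking such a mixed stream could be sketched as below. This is not the driver's code: the function name is hypothetical, and the position of the compaction flag (bit 29 of the first dword) is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed flag position marking a compacted (8-byte) instruction. */
#define CMPT_CONTROL_BIT (1u << 29)

/* Hypothetical helper: count the instructions in a buffer that mixes
 * 16-byte full instructions with 8-byte compacted ones. */
size_t count_instructions(const uint8_t *store, size_t size)
{
    size_t offset = 0, count = 0;

    while (offset + 8 <= size) {
        uint32_t dw0;
        memcpy(&dw0, store + offset, sizeof dw0);   /* first dword */

        size_t len = (dw0 & CMPT_CONTROL_BIT) ? 8 : 16;
        if (offset + len > size)
            break;                                  /* truncated tail */

        offset += len;
        count++;
    }
    return count;
}
```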

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-09-17 12:32:52 -07:00
Eric Anholt
f2bd3e70b5 i965: Move program dump to a helper function in brw_eu.c.
It's going to get more complicated when we do instruction compaction.  This
also introduces putting the program offset in the output.

v2: Use next_insn_offset in brw_get_program(), too.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-09-17 12:32:51 -07:00
Eric Anholt
826ecbbe6e i965: Make a linkable library for the contents of i965_dri.so.
To do unit testing of i965, we want to be able to link against the
driver's symbols and prod them.  If we don't have a separate lib from
our loadable module, libtool gets super whiny.

Acked-by: Paul Berry <stereotype441@gmail.com>
2012-09-17 12:32:51 -07:00
Eric Anholt
5dafee1853 dri: Reuse dri_test.c for stub glapi symbols for unit testing.
This file is used to provide stubs for the link test in gallium dri drivers.
But the same stubs without the main can be used for making unit tests for code
in a dri driver.

Acked-by: Paul Berry <stereotype441@gmail.com>
2012-09-17 12:32:51 -07:00
Eric Anholt
3f98ba9c43 i965: Clear brw_compile on setup.
I noticed in valgrind that p->single_program_flow was used while
uninitialized.  Everything else zeroed out brw_compile, but this is better
API.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-09-17 12:32:51 -07:00
Andreas Boll
99f14bc789 docs: remove obsolete mesa subset documentation
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-17 10:34:12 -06:00
Michel Dänzer
14c12ca331 radeon/llvm: Match integer add/sub for SI.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-17 18:05:49 +02:00
Michel Dänzer
8d7dd68d2a radeon/llvm: Complete integer comparison patterns for SI.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-17 18:03:41 +02:00
Michel Dänzer
97d3d25e1c radeon/llvm: Match AMDGPUfract on SI.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-17 18:02:01 +02:00
Michel Dänzer
39fb7faf95 radeon/llvm: Match int_AMDGPU_floor for SI.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-17 17:30:09 +02:00
Michel Dänzer
6d3a1a5361 radeon/llvm: Match vector logical operations on SI.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-17 17:12:08 +02:00
Brian Paul
7b6b447fa3 softpipe: update SP_MAX_TEXTURE_3D_LEVELS comment
9 levels = max size of 256 texels.
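
The relationship is the usual mipmap one: each extra level doubles the largest dimension. A sketch with a hypothetical helper name:

```c
#include <assert.h>

/* Hypothetical helper: largest texture dimension (in texels) that a
 * given number of mipmap levels can describe, i.e. 2^(levels - 1). */
unsigned max_texture_size_for_levels(unsigned levels)
{
    assert(levels >= 1 && levels <= 32);   /* keep the shift well defined */
    return 1u << (levels - 1);
}
```

With 9 levels this yields 256, matching the comment being updated.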
2012-09-16 19:00:20 -06:00
Tomeu Vizoso
68d1a3afd4 mesa/es: Define GL_MAX_TEXTURE_MAX_ANISOTROPY_EXT enum for all GLs
instead of just for GL and ES1.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-16 12:03:29 -07:00
Chris Forbes
d30a7d2eb4 mesa: fix dropped && in glGetStringi()
This fixes glGetStringi(GL_EXTENSIONS, ...) for core contexts. Previously,
all extension names returned would be NULL.

NOTE: This is a candidate for release branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-16 01:47:00 -07:00
Kenneth Graunke
679c93ff89 meta: Don't _mesa_set_enable() invalid targets in ES 1.
GL_TEXTURE_1D, GL_TEXTURE_3D, GL_TEXTURE_RECTANGLE, and
GL_TEXTURE_GEN_S/T/R/Q don't exist in ES 1 contexts, so any meta ops
that used _mesa_meta_begin with MESA_META_TEXTURE would trigger GL
errors.  One such operation is _mesa_meta_Clear().

On ES 1, we want to disable GL_TEXTURE_GEN_STR_OES instead.

Fixes the ES1 conformance test miplin.c, which was regressed by commit
08be1d288f.

NOTE: This is a candidate for the 9.0 branch.

v2: Also blacklist GL_TEXTURE_3D, per Brian's comment.
v3: Disable GL_TEXTURE_GEN_STR_OES, per Ian's comment.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54297
Reviewed-by: Brian Paul <brianp@vmware.com> [v1]
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-15 20:13:09 -07:00
José Fonseca
b6c2234c22 Temporarily revert "mesa: remove remaining FEATURE_* defines where protected by API check."
This reverts commit 9f37b405a3.

Fixes windows builds.
2012-09-15 18:18:39 +01:00
Brian Paul
e78ebbc5f9 scons: add new -p (prefix) options for yacc
These were recently added to the Makefiles.
2012-09-15 09:01:15 -06:00
Brian Paul
2f5f7bd687 swrast: remove unused ati_fs_opcodes array 2012-09-15 08:29:47 -06:00
Brian Paul
e656c4a074 mesa: remove FEATURE_ES test in texcompress_cpal.c
Fixes a regression after removing the #if FEATURE_x tests.
2012-09-15 08:28:21 -06:00
Oliver McFadden
2bc8f03f49 mesa: remove never-defined FEATURE_histogram conditional.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:59 +03:00
Oliver McFadden
9f37b405a3 mesa: remove remaining FEATURE_* defines where protected by API check.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:57 +03:00
Oliver McFadden
ab1a9430c3 mesa: remove obsolete comments from mfeatures.h
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:55 +03:00
Oliver McFadden
961fcc45ad mesa: remove FEATURE_ATI_fragment_shader define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:52 +03:00
Oliver McFadden
dd44f80f81 mesa: remove FEATURE_APPLE_object_purgeable define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:51 +03:00
Oliver McFadden
dda982f1a7 mesa: remove FEATURE_EXT_transform_feedback define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:49 +03:00
Oliver McFadden
88233b0bc3 mesa: remove FEATURE_EXT_texture_sRGB define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:48 +03:00
Oliver McFadden
e9ccb5fe52 mesa: remove FEATURE_EXT_framebuffer_blit define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:46 +03:00
Oliver McFadden
d05d5d9a91 mesa: remove FEATURE_ARB_sync define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:45 +03:00
Oliver McFadden
02a19684f9 mesa: remove FEATURE_ARB_sampler_objects define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:43 +03:00
Oliver McFadden
c609bf9786 mesa: remove FEATURE_ARB_pixel_buffer_object define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:42 +03:00
Oliver McFadden
e8ba24cbfd mesa: remove FEATURE_ARB_map_buffer_range define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:41 +03:00
Oliver McFadden
32c3ba8753 mesa: remove FEATURE_ARB_framebuffer_object define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:39 +03:00
Oliver McFadden
e8a72d8282 mesa: remove FEATURE_ARB_(fragment|vertex)_program defines.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:38 +03:00
Oliver McFadden
b7d15977f6 mesa: remove FEATURE_NV_(fragment|vertex)_program defines.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:36 +03:00
Oliver McFadden
ae241747c8 mesa: remove unused FEATURE_NV_fence define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:26 +03:00
Oliver McFadden
b874db09cf mesa: remove unused FEATURE_OES_framebuffer_object define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:24 +03:00
Oliver McFadden
740cdfdea3 mesa: remove unused FEATURE_OES_mapbuffer define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:22 +03:00
Oliver McFadden
f88393afbe mesa: remove FEATURE_OES_EGL_image define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:20 +03:00
Oliver McFadden
cd28a19bd9 mesa: remove FEATURE_EXT_pixel_buffer_object define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:19 +03:00
Oliver McFadden
0c1ff721e1 mesa: remove FEATURE_EXT_framebuffer_object define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:18 +03:00
Oliver McFadden
528f48432e mesa: remove FEATURE_ARB_shader_objects and related defines.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:16 +03:00
Oliver McFadden
7ada8d371e mesa: remove FEATURE_ARB_fragment_shader define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:14 +03:00
Oliver McFadden
6c4cddadaa mesa: remove FEATURE_ARB_vertex_shader define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:13 +03:00
Oliver McFadden
5489fc7b9f mesa: remove FEATURE_OES_draw_texture define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:11 +03:00
Oliver McFadden
009250a096 mesa: remove FEATURE_es2_glsl and related defines.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:10 +03:00
Oliver McFadden
d09428c9cc mesa: remove FEATURE_point_size_array define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:07 +03:00
Oliver McFadden
fd232c6bd4 mesa: remove unused FEATURE_extra_context_init define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:06 +03:00
Oliver McFadden
ab8d76357f mesa: remove FEATURE_texture_s3tc define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:04 +03:00
Oliver McFadden
beb293e4cd mesa: remove FEATURE_texture_fxt1 define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:02 +03:00
Oliver McFadden
d4c2b1e8f8 mesa: remove FEATURE_rastpos define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:57:00 +03:00
Oliver McFadden
25ee9617ff mesa: remove FEATURE_queryobj define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:56:53 +03:00
Oliver McFadden
0ba82f9108 mesa: remove FEATURE_pixel_transfer define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:56:52 +03:00
Oliver McFadden
26a26e9992 mesa: remove FEATURE_feedback define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:56:50 +03:00
Oliver McFadden
fa9fc2332b mesa: remove FEATURE_evaluators define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:56:48 +03:00
Oliver McFadden
24c3d16f3b mesa: remove FEATURE_drawpix define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:56:46 +03:00
Oliver McFadden
53514b0326 mesa: remove FEATURE_draw_read_buffer define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:56:44 +03:00
Oliver McFadden
09df07373b mesa: remove FEATURE_dlist define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:56:42 +03:00
Oliver McFadden
dce8602251 mesa: remove FEATURE_convolve define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:56:40 +03:00
Oliver McFadden
97a8ca47ae mesa: remove FEATURE_colortable define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:56:37 +03:00
Oliver McFadden
004f032baf mesa: remove FEATURE_beginend define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:56:35 +03:00
Oliver McFadden
985b0cb22f mesa: remove FEATURE_attrib_stack define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:56:33 +03:00
Oliver McFadden
d6543599da mesa: remove FEATURE_arrayelt define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:56:31 +03:00
Oliver McFadden
016ba4cc2c mesa: remove FEATURE_accum define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:56:29 +03:00
Oliver McFadden
fc66313c96 mesa: remove FEATURE_userclip define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:56:27 +03:00
Oliver McFadden
eeed210c7d mesa: remove FEATURE_texgen define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:56:22 +03:00
Oliver McFadden
e5870d97eb mesa: remove FEATURE_dispatch define.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-15 12:56:09 +03:00
Dave Airlie
72f657c950 vbo: add a prefix to count_tessellated_primitives
Just to make it consistent with the rest of vbo, since it would
be an exported symbol anyways.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-09-15 18:04:09 +10:00
Dave Airlie
ee9f576637 mesa/fxt1: make fxt1_decode_1 static
No users outside this file.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-09-15 18:03:37 +10:00
Dave Airlie
da86e62d3c mesa/ati_fragshader: no need for opcodes to be global.
I can't see these in use anywhere outside this file.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-09-15 18:03:30 +10:00
Dave Airlie
14b4e727fb glsl: make tex_opcode_strs static
No reason for this to be global from what I can see

Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-09-15 18:03:24 +10:00
Dave Airlie
7b10d81fc8 mesa/dxtn: make function pointers static
These aren't used outside this file from what I can see.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-09-15 18:03:10 +10:00
Dave Airlie
36639ec6e9 meta: make mem_ctx non-global.
I can't see any external users, and this is a global symbol.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-09-15 18:03:03 +10:00
Dave Airlie
7056193a43 glsl: make builtin_mem_ctx a static
This isn't used outside the generated file.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-09-15 18:02:46 +10:00
Dave Airlie
0b45bd146a ir_to_mesa: make some global variable static
nothing outside this file uses these.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-09-15 18:02:20 +10:00
Dave Airlie
6f3deeae96 mesa: make global perm variable static const
this array doesn't look like it needs to be global or unconst.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-09-15 18:01:02 +10:00
Mike Frysinger
8f9bae615d mklib: clean up abi flags for x86 targets
The current code is duplicated in two places and relies on `uname` to
detect the flags.  This is no good for cross-compiling, and the current
logic uses -m64 for the x32 ABI which breaks things.

Unify the code in one place, avoid `uname` completely, and add support
for the new x32 ABI.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2012-09-14 15:27:16 -07:00
Dave Airlie
88b0790b1a mesa/glsl: rename preprocess to glcpp_preprocess
This symbol with dricore escapes into the namespace, it's too generic,
we should prefix it with something just to be nice.

Should be applied to stable + 9.0

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-09-15 08:22:55 +10:00
Dave Airlie
53d46bc787 glcpp: fix abuse of yylex
glcpp tried to work around yylex in its own way, but failed;
do it properly.

This fixes another crash found after fixing the first crash.

this is a candidate for 9.0 and stable branches

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-09-15 08:20:56 +10:00
Dave Airlie
cc943c8470 mesa: use a prefix for the program lex
This avoids us making a global yylex symbol which will interfere with
all sorts of apps.

With libdricore, which currently can't do symbol visibility, we pollute
the namespace with this.

This is a candidate for 9.0 & stable branches.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-09-15 08:20:56 +10:00
Paul Berry
a29a456635 meta: Refactor handling of GL_MULTISAMPLE.
In commit 055093e (meta: remove call to _meta_in_progress(), fix
multisample enable/disable), we created a meta_set_enable() function
that could be used by meta ops to enable and disable GL_MULTISAMPLE
even when the GLES API was in use (the GLES API doesn't support
GL_MULTISAMPLE; it behaves as if it is always enabled).  This created
some unfortunate code duplication between meta_set_enable() and the
existing _mesa_set_enable() function.

This patch eliminates the duplication by creating a
_mesa_set_multisample() function, which is used by both meta ops and
_mesa_set_enable() to enable/disable GL_MULTISAMPLE.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-14 14:50:41 -07:00
Anuj Phogat
15bf3103b4 _mesa_meta_GenerateMipmap: Generate separate shaders for glsl 120 / 130
glsl version of _mesa_meta_GenerateMipmap() would require separate
shaders for glsl 120 and 130.

V2: Removed the code for integer textures as ARB is planning to
    disallow automatic mipmap generation for integer textures.

NOTE: This is a candidate for stable branches.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-14 11:23:03 -07:00
Anuj Phogat
299acac849 _mesa_meta_GenerateMipmap: Support all texture targets by generating shaders at runtime
glsl path of _mesa_meta_GenerateMipmap() function would require different fragment
shaders depending on the texture target. This patch adds the code to generate
appropriate fragment shader programs at run time.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=54296

V2: Removed the code for integer textures as ARB is planning to
    disallow automatic mipmap generation for integer textures.
    Now using ralloc_asprintf in setup_glsl_generate_mipmap().

NOTE: This is a candidate for stable branches.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-14 11:23:03 -07:00
Christian König
fb541662eb radeon/llvm: Support frint on SI
Gets VDPAUs shaders working again.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-14 17:03:24 +02:00
Marek Olšák
fac7031a04 r600g: consolidate decompression code for the blitter 2012-09-14 05:55:00 +02:00
Marek Olšák
df5e2c058f r600g: do not require MSAA renderbuffer support if not asked for
to allow stencil-only sampler-only formats (like X24S8)

NOTE: This is a candidate for the stable branches.
2012-09-14 05:55:00 +02:00
Marek Olšák
61706915a3 gallium/u_blitter: fix stencil-only blits
NOTE: This is a candidate for the stable branches.
2012-09-14 05:55:00 +02:00
Marek Olšák
1e51d368eb r300g: fix colormask with non-BGRA formats
NOTE: This is a candidate for the stable branches.
2012-09-14 05:55:00 +02:00
Alex Deucher
b33d7eaa5e r600g: reduce quant mode on evergreen+
Seems to have an effect on the allowable range of
values.  Set evergreen+ to 1/256 to match 6xx/7xx.

fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=54877

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-13 17:33:35 -04:00
Marek Olšák
ee50d365ea radeonsi: don't use a staging resource for large transfers
It kills performance if the resource is linear.
2012-09-13 20:26:21 +02:00
Marek Olšák
e386972f5b r600g: don't use a staging resource for large transfers
It kills performance if the resource is linear.
2012-09-13 20:25:47 +02:00
Marek Olšák
1f5a7567e8 r600g: convert the remnants of VGT state into immediate register writes/atoms v4
v2: Group vgt register together to avoid lockup
v3: Split multi primitive register and index bias register
v4: Bump R600_NUM_ATOMS

Signed-off-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:45 +02:00
Marek Olšák
150decffb4 r600g: emit the primitive type and associated regs only if the type is changed
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:45 +02:00
Marek Olšák
c56dca909a r600g: add clip_misc_state for clip registers emitted in draw_vbo
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:45 +02:00
Marek Olšák
51d839edc8 r600g: fix computing how much space is needed for a draw command
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:45 +02:00
Marek Olšák
8faf3bcf07 r600g: fix the number of CS dwords of cb_misc_state
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:45 +02:00
Marek Olšák
2b8d39bbfc r600g: atomize clip state
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:44 +02:00
Marek Olšák
de89fe1e5d r600g: atomize blend color
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:44 +02:00
Marek Olšák
605fd0c14a r600g: atomize viewport state
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:44 +02:00
Marek Olšák
63bf0f905a r600g: atomize stencil ref state
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:44 +02:00
Marek Olšák
fd19aa4e12 r600g: remove unused state ID definitions
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:44 +02:00
Marek Olšák
c383a3cfb2 r600g: initialize the first CS just like any other CS
by reusing the CS initialization in r600_context_flush.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:44 +02:00
Marek Olšák
263045afbc r600g: add support for geometry shader samplers and constant buffers
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:44 +02:00
Marek Olšák
f2eac1423a r600g: put sampler states and views into an array indexed by shader type
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:44 +02:00
Marek Olšák
3fe78594b1 r600g: do fine-grained sampler state updates
Update only those sampler states which are changed in a shader stage,
instead of always updating all sampler states in the shader stage.
That requires keeping a bitmask of those states which are enabled, and those
states which are dirty at a given point (subset of enabled states).

This is similar to how sampler views, constant buffers, and vertex buffers
are handled.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:44 +02:00
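
The enabled/dirty bitmask bookkeeping described in the "r600g: do fine-grained sampler state updates" entry above can be sketched roughly as follows. This is only an illustration under stated assumptions: `struct sampler_stage`, `bind_sampler_states()`, `emit_dirty_sampler_states()` and `MAX_SAMPLERS` are hypothetical names, not the actual r600g code, and the "emit" step only counts slots instead of writing registers.

```c
#include <stdint.h>

#define MAX_SAMPLERS 16 /* hypothetical limit for this sketch */

/* Hypothetical per-shader-stage bookkeeping; not the actual r600g structs. */
struct sampler_stage {
    const void *states[MAX_SAMPLERS];
    uint32_t enabled_mask; /* slots with a non-NULL state bound */
    uint32_t dirty_mask;   /* subset of enabled_mask that must be re-emitted */
};

/* Bind `count` states starting at slot `start`; only changed slots become dirty. */
static void bind_sampler_states(struct sampler_stage *st, unsigned start,
                                unsigned count, const void *const *states)
{
    for (unsigned i = 0; i < count; i++) {
        unsigned slot = start + i;
        uint32_t bit = UINT32_C(1) << slot;

        if (st->states[slot] == states[i])
            continue; /* unchanged: leave the dirty bit alone */

        st->states[slot] = states[i];
        if (states[i]) {
            st->enabled_mask |= bit;
            st->dirty_mask |= bit;
        } else {
            st->enabled_mask &= ~bit;
            st->dirty_mask &= ~bit;
        }
    }
}

/* Walk only the dirty slots, return how many were "emitted", clear the mask. */
static unsigned emit_dirty_sampler_states(struct sampler_stage *st)
{
    unsigned emitted = 0;
    uint32_t mask = st->dirty_mask;

    while (mask) {
        mask &= mask - 1; /* clear lowest set bit; a driver would write registers here */
        emitted++;
    }
    st->dirty_mask = 0;
    return emitted;
}
```

Rebinding an identical set of states leaves `dirty_mask` at zero, so nothing is re-emitted for that stage.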
Marek Olšák
6c86124157 r600g: consolidate set_viewport_state functions
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:44 +02:00
Marek Olšák
48de30e760 r600g: consolidate set_sampler_views functions
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:44 +02:00
Marek Olšák
1bce17ee01 r600g: put constant buffer state into an array indexed by shader type
to easily and robustly handle multiple shader stages

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:44 +02:00
Marek Olšák
3bffd8a5eb r600g: cleanup state function names
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:44 +02:00
Marek Olšák
f96df32d62 r600g: consolidate initialization of common state functions
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:44 +02:00
Marek Olšák
fd2e34d557 r600g: simplify flushing
Based on the patch called "simplify and fix flushing and synchronization"
by Jerome Glisse.

Rebased, removed unneeded code, simplified more and cleaned up.

Also, SH_ACTION_ENA is not set when changing shaders (hw doesn't seem
to need it). It's only used to flush constant buffers.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 20:18:44 +02:00
Tom Stellard
6a5a4d59ce radeon/llvm: Fix lowering of vbuild
Some of the old AMDIL code was hard-coding subreg indices when creating
the VBUILD node, which was making it difficult to match the
vector_insert patterns.
2012-09-13 10:38:02 -04:00
Tom Stellard
70a50685a8 radeon/llvm: Support fmul on SI 2012-09-13 10:38:02 -04:00
Kenneth Graunke
28f4be9eb9 i965: Fix out-of-order sampler unit usage in ARB fragment programs.
ARB fragment programs use texture unit numbers directly, unlike GLSL
which has an extra indirection.  If a fragment program only uses one
texture assigned to GL_TEXTURE1, SamplersUsed will only contain a single
bit, which would make us only upload a single surface/sampler state
entry.  However, it needs to be the second entry.

Using _mesa_fls() instead of _mesa_bitcount() solves this.  For ARB
programs, this makes num_samplers the ID of the highest texture unit
used.  Since GLSL uses consecutive integers assigned by the linker,
_mesa_fls() should give the same result as _mesa_bitcount().

Fixes a regression since 85e8e9e000,
which caused GPU hangs in ETQW (and probably others), as well as
breaking piglit test fp-fragment-position.

v2: Add a comment, as suggested by Matt.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54098
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54179
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: meng <mengmeng.meng@intel.com>
2012-09-12 22:13:05 -07:00
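
To illustrate the difference the "Fix out-of-order sampler unit usage" entry above relies on, here is a minimal sketch. `bitcount32()` and `fls32()` are hypothetical stand-ins written for this example; they are not Mesa's `_mesa_bitcount()` or `_mesa_fls()`, whose exact implementations are not shown in this log.

```c
#include <stdint.h>

/* Number of set bits in v (stand-in for a population count). */
static unsigned bitcount32(uint32_t v)
{
    unsigned n = 0;
    while (v) {
        v &= v - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* 1-based position of the most significant set bit, or 0 if v is 0. */
static unsigned fls32(uint32_t v)
{
    unsigned pos = 0;
    while (v) {
        v >>= 1;
        pos++;
    }
    return pos;
}
```

With a mask of `1u << 1` (only texture unit 1 in use), `bitcount32()` yields 1 while `fls32()` yields 2, which is the number of table entries needed to reach the second slot. With a dense mask such as `0x7`, both yield 3, matching the commit's remark about linker-assigned consecutive units.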
Kenneth Graunke
0fc163408e mesa: Add a _mesa_fls() function to find the last bit set in a word.
ffs() finds the least significant bit set; _mesa_fls() finds the /most/
significant bit.

v2: Make it an inline function in imports.h, per Brian's suggestion.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-12 22:13:05 -07:00
Paul Berry
1a5d4f7cb2 i965/blorp: Fix offsets and width/height for stencil blits.
Fixes piglit test "framebuffer-blit-levels draw stencil".

NOTE: This is a candidate for stable release branches.

Acked-by: Eric Anholt <eric@anholt.net>
2012-09-12 14:44:13 -07:00
Paul Berry
5fd67fac14 i965/blorp: Reduce alignment restrictions for stencil blits.
Previously, we aligned all stencil blit operations to multiples of the
size of a tile, since stencil buffers use W-tiling, and blorp has to
approximate this by configuring the 3D pipeline for Y-tiling and
swizzling coordinates.

However, this was unnecessarily conservative; it turns out that the
differences between W-tiling and Y-tiling are confined to 32-byte
sub-tiles within the 4k tiling pattern; the layout of these 32-byte
sub-tiles within the larger 4k tile is the same (8 sub-tiles across by
16 sub-tiles down, in column-major order).  Therefore we only need to
align stencil blit operations to multiples of the sub-tile size.

Note: although the performance improvement of this change is probably
quite small, the fact that W-tiling and Y-tiling formats only differ
within 32-byte sub-tiles will be essential in a future patch to ensure
that stencil blits work correctly between parts of the miptree other
than level/layer 0.  Making this change provides handy documentation
(and validation) of this fact.

NOTE: This is a candidate for stable release branches.

Acked-by: Eric Anholt <eric@anholt.net>
2012-09-12 14:44:13 -07:00
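
The sub-tile arrangement described in the "Reduce alignment restrictions for stencil blits" entry above (32-byte sub-tiles, 8 across by 16 down, column-major, inside a 4k tile) can be written out as a small address calculation. This is a sketch derived only from that description; the enum names and `subtile_offset()` are hypothetical and do not come from the driver.

```c
#include <stdint.h>

enum {
    SUBTILE_BYTES   = 32,
    SUBTILES_ACROSS = 8,
    SUBTILES_DOWN   = 16,
    TILE_BYTES      = SUBTILE_BYTES * SUBTILES_ACROSS * SUBTILES_DOWN /* 4096 */
};

/*
 * Byte offset of sub-tile (col, row) within one 4k tile.
 * Column-major order: all 16 sub-tiles of a column are stored
 * before the next column starts.
 */
static uint32_t subtile_offset(uint32_t col, uint32_t row)
{
    return (col * SUBTILES_DOWN + row) * SUBTILE_BYTES;
}
```

Because this mapping is the same for both tiling formats according to the commit message, only the byte order inside each 32-byte sub-tile differs, which is why sub-tile alignment is sufficient.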
Paul Berry
1a75063d5f i965/blorp: don't reduce stencil alignment restrictions when multisampling.
When blitting to a stencil buffer, we need to align the rectangle we
send down the rendering pipeline, to account for the fact that the
stencil buffer uses a W-tiled layout, but we are configuring its
surface state as Y-tiled.

Previously, when the stencil buffer was multisampled, we assumed that
we could reduce the amount of alignment that was necessary, since each
pixel occupies a block of 2x2 or 4x2 samples in the stencil buffer.
That would have been correct if the coordinates we were adjusting were
measured in pixels.  However, the conversion from pixel coordinates to
coordinates within the interleaved buffer has already been done;
therefore the full alignment restriction applies.

Note: the reason this mistake wasn't previously uncovered by piglit
tests is because it is being masked by another mistake: the blorp
engine is using overly conservative alignment restrictions when doing
stencil blits.  The overly conservative alignment restrictions will be
removed in the patch that follows.  Doing this fix now will prevent
the subsequent patch from introducing regressions.

NOTE: This is a candidate for stable release branches.

Acked-by: Eric Anholt <eric@anholt.net>
2012-09-12 14:44:13 -07:00
Paul Berry
b760c9913d intel: Add map_stencil_as_y_tiled to intel_region_get_aligned_offset.
This patch modifies intel_region_get_aligned_offset() to make the
appropriate calculation when the blorp engine sets up a W-tiled
stencil buffer using a Y-tiled SURFACE_STATE.

NOTE: This is a candidate for stable release branches.

Acked-by: Eric Anholt <eric@anholt.net>
2012-09-12 14:44:13 -07:00
Paul Berry
50dec7fc2d intel: Add map_stencil_as_y_tiled to intel_region_get_tile_masks.
When the blorp engine is performing a blit from one stencil buffer to
another, it sets up the surface state for these buffers as Y-tiled, so
it needs to be able to force intel_region_get_tile_masks() to return
the appropriate masks for a Y-tiled region.

NOTE: This is a candidate for stable release branches.

Acked-by: Eric Anholt <eric@anholt.net>
2012-09-12 14:44:13 -07:00
Paul Berry
f04f219906 i965/blorp: Account for offsets when emitting SURFACE_STATE.
Fixes piglit tests "framebuffer-blit-levels {read,draw} depth".

NOTE: This is a candidate for stable release branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-12 14:44:12 -07:00
Paul Berry
3123f06215 i965/blorp: Thread level and layer through brw_blorp_blit_miptrees().
Previously, when performing a blit using the blorp engine, we failed
to account for the level and layer of the source and destination.  As
a result, all blits would occur between miplevel 0 and layer 0 of the
corresponding textures, regardless of which level/layer was bound to
the framebuffer.

This patch passes the correct level and layer through
brw_blorp_blit_miptrees() into the brw_blorp_blit_params data structure.

Further patches in the series will adapt
gen{6,7}_blorp_emit_surface_state to make use of these parameters.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-12 14:44:12 -07:00
Paul Berry
bc6cafa045 i965/blorp: Don't create a dummy renderbuffer just to fetch image offsets.
This is unnecessary--the image offsets can be read directly out of the
miptree using intel_miptree_get_image_offset.
2012-09-12 14:44:12 -07:00
Paul Berry
c130ce7b2b i965/blorp: store x and y offsets in brw_blorp_mip_info.
Currently, gen{6,7}_blorp_emit_surface_state assumes that the src and
dst surfaces are mapped to miplevel 0 and layer 0 (thus no surface
offset is required).  This is a bug, since the user might try to blit
to and from levels/layers other than 0.

To fix this bug, it will not be sufficient to have
gen{6,7}_blorp_emit_surface_state look up the surface offset at the
time they set up the surface state, since these offsets will need to
be tweaked when blitting stencil buffers (due to the fact that stencil
buffer blits have to swizzle between W and Y tiling formats).

So, to pave the way for the bug fix, this patch causes the x and y
offsets to be computed during blit setup and stored in
brw_blorp_mip_info.

As a result of this change, brw_blorp_mip_info doesn't need to store
the level and layer anymore.

For consistency, this patch makes a similar change to the handling of
depth buffers when doing HiZ operations.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-12 14:44:12 -07:00
Paul Berry
09b0fa8499 i965/blorp: store surface width/height in brw_blorp_mip_info.
Previously, gen{6,7}_blorp_emit_surface_state would look up the width
and height of the surface at the time they set up the surface state,
and then tweak it if necessary (it's necessary when a W-tiled surface
is being mapped as Y-tiled).  With this patch, we look up the width
and height when setting up the blit, and store them in
brw_blorp_mip_info.  This allows us to do the necessary tweak in the
brw_blorp_blit_params constructor (where it makes more sense).  It
also reduces the need to keep track of level and layer in
brw_blorp_mip_info, so that a future patch can eliminate them
entirely.

For consistency, this patch makes a similar change to the handling of
depth buffers when doing HiZ operations.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-12 14:44:12 -07:00
Paul Berry
e14b1288ef i965/blorp: Change gl_renderbuffer* params to intel_renderbuffer*.
This makes it more convenient for blorp functions to get access to
Intel-specific data inside the renderbuffer objects.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-12 14:44:12 -07:00
Paul Berry
32c7b2769c i965/blorp: Clarify why width/height must be adjusted for Gen6 IMS surfaces.
Also add a clarifying comment for why the width/height doesn't need
adjustment for Gen7.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-12 14:44:12 -07:00
Paul Berry
bde833c9d0 i965/gen6+: Adjust stencil buffer size after computing miptree layout.
Since Gen6+ stencil buffers use W-tiling (a tiling arrangement which
drm and the kernel are not aware of) we need to round up the width and
height of a stencil buffer to multiples of the W-tile size (64x64)
before allocating a stencil buffer.  Previously, we rounded up the
size of the base miplevel, and then computed the miptree layout based
on the rounded up size.  This was incorrect, because it meant that the
total size of the miptree would not be properly W-tile aligned, and
therefore we would not always allocate enough pages.

(Note: even though the GL API doesn't allow creation of mipmapped
stencil textures, it does allow mipmapping of a combined depth/stencil
texture, and on Gen6+, a combined depth/stencil texture is internally
implemented as a pair of separate depth and stencil buffers.)

For example, on Sandy Bridge, when allocating a mipmapped stencil
texture of size 128x128, we would first round up to the nearest
multiple of 64x64 (causing no change to the size), and then compute
the miptree layout (whose size worked out to 128x196).  Then we would
request an allocation of 128*196 bytes (6.125 pages), causing 7 pages
to be allocated to the texture.  However, the texture needs 8 pages,
since each W-tile occupies a page, and it takes 2 W-tiles to cover a
width of 128 and 4 W-tiles to cover a height of 196.

This patch changes the order of operations so that the miptree layout
is computed first and then the total size of the miptree is rounded up
to be W-tile aligned.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-12 14:44:12 -07:00
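
The page arithmetic in the "Adjust stencil buffer size after computing miptree layout" entry above can be reproduced with a short sketch. It assumes one byte per stencil sample and 4096-byte pages, as the message's numbers imply; the helper names are hypothetical and not taken from the driver.

```c
#include <stdint.h>

enum { PAGE_BYTES = 4096, W_TILE_DIM = 64 };

/* Round v up to the next multiple of a. */
static uint32_t align_up(uint32_t v, uint32_t a)
{
    return (v + a - 1) / a * a;
}

/* Pages needed to hold `bytes` bytes. */
static uint32_t pages_for_bytes(uint32_t bytes)
{
    return (bytes + PAGE_BYTES - 1) / PAGE_BYTES;
}

/* Old order: total miptree size is used as-is (one byte per sample). */
static uint32_t pages_unaligned(uint32_t width, uint32_t height)
{
    return pages_for_bytes(width * height);
}

/* New order: total miptree size is first rounded up to whole 64x64 W-tiles. */
static uint32_t pages_w_tile_aligned(uint32_t width, uint32_t height)
{
    return pages_for_bytes(align_up(width, W_TILE_DIM) *
                           align_up(height, W_TILE_DIM));
}
```

For the 128x196 layout from the message, `pages_unaligned()` gives 7 (25088 bytes) while `pages_w_tile_aligned()` gives 8 (128x256 = 32768 bytes, i.e. 2 by 4 W-tiles).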
Matt Turner
af6aeae4e1 build: Don't list glproto and dri2proto in pkg-config file
No files provided by glproto or dri2proto are needed for building
something with Mesa.

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=342393
Reviewed-by: Dan Nicholson <dbn.lists@gmail.com>
2012-09-12 11:26:28 -07:00
Michel Dänzer
7443e4e697 radeonsi: Properly handle NULL sampler views.
Fixes piglit shaders/glsl-fs-uniform-sampler-array and many other similar
tests.

In fact, I just completed a piglit quick-driver.tests run without any GPU
lockups or even VM protection faults. Yay!

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-12 15:53:51 +02:00
Michel Dänzer
d67d8e2471 radeonsi: Fix calculation of number of records in buffer resource.
The value was too small by 1 in some cases (non-first of several vertex
elements interleaved in a single buffer).

Fixes intermittent incorrect geometry in many apps, e.g. piglit
spec/EXT_texture_snorm/fbo-generatemipmap-formats.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2012-09-12 13:23:09 +02:00
Imre Deak
9f30cbe9ee mesa: glGet: fix API check for EGL_image_external enums
These enums are valid only in ES1 and ES2. So far they were marked valid
incorrectly, depending on the previous API mask in the enum list.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-09-11 17:38:21 -06:00
Imre Deak
ae310e37fb mesa: glGet: fix indentation of print_table_stats
No functional change.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-09-11 17:38:21 -06:00
Imre Deak
97a693d1fa mesa: glGet: fix indentation of find_value
No functional change.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-09-11 17:38:21 -06:00
Imre Deak
746e82fff4 mesa: glGet: fix indentation of _mesa_init_get_hash
No functional change.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-09-11 17:38:21 -06:00
Brian Paul
35c75f6777 mesa: fix proxy texture error handling in glTexStorage()
This is basically a follow-on to 1f5b1f9846.
Basically, generate GL errors for ordinary invalid parameters for proxy
targets the same as for non-proxy targets.  Only texture size and OOM
errors should be handled specially for proxies.

Note: This is a candidate for the stable branches.
2012-09-11 17:38:21 -06:00
Brian Paul
d17440dcaa mesa: make _mesa_get_proxy_target() non-static
Needed for the next patch.

Note: This is a candidate for the stable branches.
2012-09-11 17:38:21 -06:00
Brian Paul
2e4fc54977 mesa: do internal format error checking for glTexStorage()
Turns out we weren't doing any format checking before.  Now check
the internal format and, in particular, make sure that unsized internal
formats aren't accepted.

Note: This is a candidate for the stable branches.
2012-09-11 17:38:21 -06:00
Paul Berry
5d5f0f3491 mesa/msaa: Allow X and Y flips in multisampled blits.
From the GL 4.3 spec, section 18.3.1 "Blitting Pixel Rectangles":

    If SAMPLE_BUFFERS for either the read framebuffer or draw
    framebuffer is greater than zero, no copy is performed and an
    INVALID_OPERATION error is generated if the dimensions of the
    source and destination rectangles provided to BlitFramebuffer are
    not identical, or if the formats of the read and draw framebuffers
    are not identical.

It is not clear from the spec whether "dimensions" should mean both
sign and magnitude, or just magnitude.

Previously, Mesa interpreted "dimensions" as meaning both sign and
magnitude, so any multisampled blit that attempted to flip the image
in the X and/or Y direction would fail.

However, Y flips are likely to be commonplace in OpenGL applications
that have been ported from DirectX applications, as a result of the
fact that DirectX and OpenGL differ in their orientation of the Y
axis.  Furthermore, at least one commercial driver (nVidia) permits Y
flips, and L4D2 relies on them being permitted.  So it seems prudent
for Mesa to permit them.

This patch changes Mesa to allow both X and Y flips, since there is no
language in the spec to indicate that X and Y flips should be treated
differently.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-09-11 15:50:55 -07:00
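
The "magnitude only" reading of "dimensions" adopted in the "Allow X and Y flips in multisampled blits" entry above could be checked along these lines. This is a sketch, not Mesa's actual validation code; `msaa_blit_dimensions_match()` is a hypothetical helper, and the parameters simply mirror the rectangle corners passed to BlitFramebuffer.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Compare source and destination rectangle sizes by magnitude only,
 * so a blit that flips in X and/or Y still counts as "identical
 * dimensions". Assumes coordinates small enough that the subtractions
 * do not overflow.
 */
static bool msaa_blit_dimensions_match(int srcX0, int srcY0, int srcX1, int srcY1,
                                       int dstX0, int dstY0, int dstX1, int dstY1)
{
    return abs(srcX1 - srcX0) == abs(dstX1 - dstX0) &&
           abs(srcY1 - srcY0) == abs(dstY1 - dstY0);
}
```

A Y-flipped blit from (0,0)-(64,64) to (0,64)-(64,0) passes this check, whereas a blit that changes the width, for example to (0,0)-(32,64), does not.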
Tom Stellard
843ac06ad2 radeon/llvm: Fix operand order of V_CNDMASK in custom inserter
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-11 14:53:48 -04:00
Tom Stellard
d399ce7615 radeon/llvm: Assert if we try to encode an unknown register
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-11 14:53:48 -04:00
Tom Stellard
0df2753ad2 radeon/llvm: Add register encoding for VCC
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-11 14:53:47 -04:00
Tom Stellard
056d9c6ef1 radeon/llvm: Ignore special registers when calculating reg count
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-11 14:53:47 -04:00
Tom Stellard
0fb1e68a0b radeonsi: Handle position input parameter for pixel shaders v2
v2:
  - Don't increment ninterp or set any of the have_* flags for
    TGSI_SEMANTIC_POSITION

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-11 14:53:47 -04:00
Tom Stellard
0410e9e8c7 radeon/llvm: Coding style fixes
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-11 14:53:47 -04:00
Tom Stellard
d3e58f75d2 radeonsi: Move interpolation mode check into the compiler
The compiler needs to know which interpolation modes are enabled, so
it knows which values will be preloaded into the VGPRs.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-11 14:53:47 -04:00
Tom Stellard
5fff032dd5 radeonsi: Add missing interpolation mode to check for enabled modes
At least one interpolation mode must be enabled, but the code that checks
this was not checking for perspective center.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-11 14:53:47 -04:00
Tom Stellard
cc571a367e radeonsi: Pass shader type to the compiler
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-11 14:53:47 -04:00
Tom Stellard
dfd3d61abf radeon/llvm: Add SHADER_TYPE instruction
This allows the program to specify the type of shader being compiled
(e.g. PIXEL, VERTEX, etc.)

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-11 14:53:47 -04:00
Jerome Glisse
841c1b5f54 r600g: avoid GPU doing constant preload from random address
A previous command stream might have set any of the constant buffers,
and the previous address might no longer be valid, so the GPU might
preload constants from a random invalid address and possibly trigger
a lockup.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-09-11 12:57:54 -04:00
Michel Dänzer
9ccaa24f84 radeonsi: Texture border colour fixes.
* Handle arbitrary border colours.
* Use correct packing format for detecting special border colours.

Fixes piglit tex-border-1 and probably many other tests using border colours.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-11 11:06:56 +02:00
Michel Dänzer
03dfa30596 radeonsi: Handle NULL sampler states.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-11 11:06:16 +02:00
Kenneth Graunke
23cd6c43da i965: Remove incorrect comment above opt_algebraic.
The comment was cut-and-pasted from propagate_constants(), and had no
relation at all to opt_algebraic().
2012-09-10 22:58:25 -07:00
Kenneth Graunke
354f2cb5c7 glsl: Generate compile errors for explicit blend indices < 0 or > 1.
According to the GLSL 4.30 specification, this is a compile time error.
Earlier specifications don't specify a behavior, but since 0 and 1 are
the only valid indices for dual source blending, it makes sense to
generate the error.

Fixes (the fixed version of) piglit's layout-12.frag.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-09-10 20:56:11 -07:00
Marek Olšák
87389d4e5c r600g: remove unused function 2012-09-11 00:02:58 +02:00
Marek Olšák
830b6f3273 r600g: fix printf warning 2012-09-11 00:02:58 +02:00
Andreas Boll
e81ee67b51 mesa: bump version to 9.1 (devel)
Now that branch 9.0 is created, bump the minor version in
master.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-09 03:00:17 -07:00
Johannes Obermayr
10a96f4a4d Set OSMESA_VERSION=8.
VERSION_NUMBER is not required anymore. So it will be removed.

Reviewed-by: Adam Jackson <ajax@redhat.com>
2012-09-07 14:44:48 -04:00
Christoph Bumiller
3433471e8b nvc0/ir: add initial code to support GK110 ISA encoding 2012-09-07 19:03:40 +02:00
Michel Dänzer
8a497e5955 radeonsi: Float format fixups.
Fixes piglit spec/ARB_texture_float/fbo-generatemipmap-formats.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-07 18:23:08 +02:00
Michel Dänzer
15c009af28 radeonsi: Handle more SNORM formats.
Fixes piglit spec/EXT_texture_snorm/fbo-generatemipmap-formats (except for
what seems like a random fluke).

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-07 18:23:08 +02:00
Eric Anholt
39aca5076f i965: Fix virtual_grf_interferes() between calculate_live_intervals() and DCE.
This fixes the blue zombies bug in l4d2.

NOTE: This is a candidate for the 9.0 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-07 08:29:49 -07:00
Eric Anholt
7b3fe776e2 i965: Make the param pointer arrays for the VS dynamically sized.
Saves 96MB of wasted memory in the l4d2 demo.

v2: Rebase on compare func change, change brace style.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-09-07 08:29:49 -07:00
Eric Anholt
f144b78dfb i965: Make the param pointer arrays for the WM dynamically sized.
Saves 26.5MB of wasted memory allocation in the l4d2 demo.

v2: Rebase on compare func change, fix comments.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-07 08:29:48 -07:00
Eric Anholt
99596cba78 i965: Add functions for comparing two brw_wm/vs_prog_data structs.
Currently, this just avoids comparing all unused parts of param[] and
pull_param[], but it's a step toward getting rid of those giant statically
sized arrays.

v2: Actually use the new function instead of just looking at its
    address.  This required changing the args to const pointers.
    (review by Kenneth)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-07 08:29:48 -07:00
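The idea of comparing only the used part of param[] can be sketched roughly as follows. This is a minimal illustration, not the driver's code: `struct prog_data`, `MAX_PARAMS`, `nr_params` and `prog_data_equal()` are hypothetical names, and the real brw_wm/vs_prog_data structs carry more fields.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_PARAMS 1024   /* hypothetical static array size */

/* Hypothetical stand-in for brw_wm/vs_prog_data; only the fields
 * needed for the comparison are shown. */
struct prog_data {
    unsigned nr_params;               /* number of param[] slots in use */
    const float *param[MAX_PARAMS];   /* slots past nr_params may be stale */
};

/* Compare two structs while ignoring the unused tail of param[]. */
static bool prog_data_equal(const struct prog_data *a,
                            const struct prog_data *b)
{
    if (a->nr_params != b->nr_params)
        return false;

    /* Only the first nr_params entries carry meaning. */
    return memcmp(a->param, b->param,
                  a->nr_params * sizeof a->param[0]) == 0;
}
```

Taking const pointers matches the v2 note about changing the args; everything else here is an assumption made for the sketch.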
Eric Anholt
5bb94f2bc4 glsl: Count builtin uniforms against uniform component limits.
We don't fully process the builtin uniforms, but at least
num_uniform_components reflects reality now.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-09-07 08:29:48 -07:00
Michel Dänzer
30b303743d radeonsi: Handle TGSI_SEMANTIC_FOG.
Fixes exponential fog. The pixel shaders for linear fog seem to get
miscompiled still somehow.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-07 16:12:04 +02:00
Michel Dänzer
3144821ef6 radeon/llvm: Match fexp2 for SI.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-07 12:16:32 +02:00
Brian Paul
043f66204b glapi/glx: rename 'table' variable to 'disp_table'
This fixes an issue where the local 'table' variable was hiding the
function parameter name in glGetColorTable(..., void *table).

This should be OK as long as there's never a GL entrypoint that uses
'disp_table' as a parameter name.

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-06 18:58:49 -06:00
Brian Paul
14f55869a4 glx: move 'prime' var into #ifdef'd code block
To silence unused var warning.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-06 18:58:49 -06:00
Kenneth Graunke
815d9d405c i965: Fix primitive restart on Haswell.
Haswell moved the "Cut Index Enable" bit from the INDEX_BUFFER packet to
a new 3DSTATE_VF packet, so we need to emit that.  Also, it requires us
to specify the cut index rather than assuming it's 0xffffffff.

This adds a new Haswell-specific tracked state atom to gen7_atoms.
Normally, we would create a new generation-specific atom list, but since
there's only one difference over Ivybridge so far, I chose to simply
make it return without doing any work on non-Haswell systems.

Fixes five piglit tests:
- general/primitive-restart-DISABLE_VBO
- general/primitive-restart-VBO_COMBINED_VERTEX_AND_INDEX
- general/primitive-restart-VBO_INDEX_ONLY
- general/primitive-restart-VBO_SEPARATE_VERTEX_AND_INDEX
- general/primitive-restart-VBO_VERTEX_ONLY

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-09-06 16:39:48 -07:00
Matt Turner
058fb00716 build: Disable building of d3d1x
It's broken and unmaintained, and I'm tired of seeing bug reports about
it.
2012-09-06 16:20:18 -07:00
Paul Berry
78a34d868d intel: avoid undefined variable warnings in intel_screen.c
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-06 14:49:25 -07:00
Jerome Glisse
5ceb87286f r600g: order atom emission v3
To avoid GPU lockup registers must be emitted in a specific order
(no kidding ...). This patch reworks atom emission so the order in which
atoms are emitted with respect to each other is always the same. We
don't have any information on what the correct order is, so the order
will need to be inferred from the fglrx command stream.

v2: add comment warning that atom order should not be taken lightly
v3: rebase on top of alphatest atom fix

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-09-06 15:09:17 -04:00
Jerome Glisse
935a729447 r600g: fix num of dwords needed for alphatest_state atom
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-09-06 15:09:14 -04:00
Chad Versace
f29a4b0157 mesa: Don't advertise GLES extensions in GL contexts
glGetStringi(GL_EXTENSIONS) failed to respect the context's API, and so
returned all internally enabled GLES extensions from a GL context.
Likewise, glGetIntegerv(GL_NUM_EXTENSIONS) also failed to respect the
context's API.

Note: This is a candidate for the 8.0 and 9.0 branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-09-06 11:46:04 -07:00
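Respecting the context's API when counting and indexing extensions could look something like the sketch below. The table layout, the `api_bit` flags and all function names are hypothetical and only illustrate the filtering; they are not Mesa's extension table.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical API flags; one bit per context API. */
enum api_bit { API_GL = 1 << 0, API_GLES1 = 1 << 1, API_GLES2 = 1 << 2 };

/* Hypothetical extension table entry. */
struct extension {
    const char *name;
    unsigned api_mask;   /* APIs in which the extension may be advertised */
    bool enabled;        /* enabled internally by the driver */
};

/* An extension is advertised only if it is enabled AND valid for the API. */
static bool extension_visible(const struct extension *e, unsigned ctx_api)
{
    return e->enabled && (e->api_mask & ctx_api) != 0;
}

/* What a GL_NUM_EXTENSIONS style query would report for this context. */
static unsigned count_extensions(const struct extension *exts, size_t n,
                                 unsigned ctx_api)
{
    unsigned count = 0;
    for (size_t i = 0; i < n; i++)
        if (extension_visible(&exts[i], ctx_api))
            count++;
    return count;
}

/* What an indexed GL_EXTENSIONS style query would return, or NULL. */
static const char *extension_at(const struct extension *exts, size_t n,
                                unsigned ctx_api, unsigned index)
{
    for (size_t i = 0; i < n; i++) {
        if (!extension_visible(&exts[i], ctx_api))
            continue;
        if (index == 0)
            return exts[i].name;
        index--;
    }
    return NULL;
}
```

Both queries share one visibility predicate, so the count and the indexed lookup cannot disagree about which extensions exist.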
José Fonseca
edc0a00377 llvmpipe: Make driver name more informative.
Such as

  "llvmpipe (LLVM 3.1, 128 bits)"

or

  "llvmpipe (LLVM 3.1, 256 bits)"

when leveraging AVX 8-wide registers.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-09-06 16:35:25 +01:00
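A name of that shape can be built with a plain snprintf(). The sketch below is only an illustration: `format_driver_name()` and its parameters are hypothetical, and where the real driver gets the LLVM version and vector width from is not shown in the commit message.

```c
#include <stdio.h>

/* Build a name such as "llvmpipe (LLVM 3.1, 128 bits)".
 * Returns the snprintf() result (length that was or would be written). */
static int format_driver_name(char *buf, size_t size,
                              unsigned llvm_major, unsigned llvm_minor,
                              unsigned vector_bits)
{
    return snprintf(buf, size, "llvmpipe (LLVM %u.%u, %u bits)",
                    llvm_major, llvm_minor, vector_bits);
}
```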
Michel Dänzer
694617a5b4 radeonsi: Handle more L/I/A format cases.
Fixes piglit fbo-generatemipmap-formats.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-06 16:48:16 +02:00
Michel Dänzer
cfebaf9dbd radeonsi: Enable whole quad mode for pixel shaders.
Fixes wrong mipmap level being sampled at some triangle edges.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-06 16:46:55 +02:00
Michel Dänzer
5edb80cee0 radeon/llvm: Add intrinsic for enabling whole quad mode in SI pixel shaders.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-06 16:46:42 +02:00
Michel Dänzer
e7383b74ef radeon/llvm: SI shader vector instructions implicitly use the EXEC register.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-06 16:46:27 +02:00
Michel Dänzer
ab162f80c3 radeon/llvm: Extend SI EXEC register support.
Add 32 bit lo and hi variants, and binary encodings.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-06 16:15:44 +02:00
Tom Stellard
2baaa5c7eb radeon/llvm: Remove R600InstrInfo.td from TD_FILES
Fixes build bug introduced by
cebbdd4ac2
2012-09-06 14:16:59 +00:00
Michel Dänzer
d0f51fe567 radeonsi: Enable NPOT textures again.
Should be at least mostly working now (with the corresponding fixes in
libdrm_radeon).

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2012-09-06 15:39:20 +02:00
Michel Dänzer
cf697e875c radeonsi: Mipmaps require memory footprint to be padded to powers of two.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2012-09-06 15:39:13 +02:00
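Padding a size up to the next power of two can be sketched as below. Which quantities the driver actually pads (width, height, pitch) is not stated in the commit message, so `next_power_of_two()` is a generic, hypothetical helper rather than the driver's code.

```c
#include <stdint.h>

/* Round x up to the next power of two; values that already are a power
 * of two are returned unchanged.  Returns 0 if the result would not fit
 * in 32 bits. */
static uint32_t next_power_of_two(uint32_t x)
{
    uint32_t p = 1;

    if (x > 0x80000000u)
        return 0;            /* no 32-bit power of two is large enough */

    while (p < x)
        p <<= 1;
    return p;
}
```

For an NPOT texture 300 texels wide, the padded footprint under this assumption would be 512.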
Michel Dänzer
b7d96ca35e radeonsi: Sampler view state simplification.
We can always use the offset and tiling mode from level 0 and restrict the
first and last mipmap level to be used in the sampler resource.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2012-09-06 15:39:01 +02:00
Michel Dänzer
396af00ffe radeonsi: Untiled textures are linear aligned, not linear general.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2012-09-06 15:38:45 +02:00
Tom Stellard
cebbdd4ac2 radeon/llvm: Cleanup makefile
Hopefully, this will fix all the parallel make problems people have
been having.
2012-09-06 13:30:42 +00:00
Matt Turner
b6109de34f Remove useless checks for NULL before freeing
Same as earlier commit, except for "FREE"

This patch has been generated by the following Coccinelle semantic
patch:

// Remove useless checks for NULL before freeing
//
// free (NULL) is a no-op, so there is no need to avoid it

@@
expression E;
@@
+ FREE (E);
+ E = NULL;
- if (unlikely (E != NULL)) {
-   FREE(E);
(
-   E = NULL;
|
-   E = 0;
)
   ...
- }

@@
expression E;
type T;
@@
+ FREE ((T) E);
+ E = NULL;
- if (unlikely (E != NULL)) {
-   FREE((T) E);
(
-   E = NULL;
|
-   E = 0;
)
   ...
- }

@@
expression E;
@@
+ FREE (E);
- if (unlikely (E != NULL)) {
-   FREE (E);
- }

@@
expression E;
type T;
@@
+ FREE ((T) E);
- if (unlikely (E != NULL)) {
-   FREE ((T) E);
- }

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-05 22:28:50 -07:00
Matt Turner
da3282b6e2 Replace another malloc/memset-0 combination with calloc
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-05 22:28:50 -07:00
Matt Turner
52789496a7 Remove useless memset after calloc
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-05 22:28:50 -07:00
Matt Turner
6bda027e01 Use calloc instead of malloc/memset-0
This patch has been generated by the following Coccinelle semantic
patch:

@@
expression E;
identifier I;
@@
- I = malloc(E);
+ I = calloc(1, E);
...
- memset(I, 0, sizeof *I);

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-05 22:28:50 -07:00
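On a small hand-written example, the effect of the semantic patch above looks roughly like this. `struct widget` and both constructors are hypothetical and exist only to show the before and after shapes.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical object used only for illustration. */
struct widget {
    int id;
    void *data;
};

/* Before: allocate, then zero by hand. */
static struct widget *widget_new_old(void)
{
    struct widget *w = malloc(sizeof *w);
    if (w)
        memset(w, 0, sizeof *w);
    return w;
}

/* After: calloc() returns zero-filled memory in one call. */
static struct widget *widget_new(void)
{
    return calloc(1, sizeof(struct widget));
}
```

Both versions return either NULL or a zero-filled object; the calloc() form simply has one less step to get wrong.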
Matt Turner
5067506ea6 Remove useless checks for NULL before freeing
This patch has been generated by the following Coccinelle semantic
patch:

// Remove useless checks for NULL before freeing
//
// free (NULL) is a no-op, so there is no need to avoid it

@@
expression E;
@@
+ free (E);
+ E = NULL;
- if (unlikely (E != NULL)) {
-   free(E);
(
-   E = NULL;
|
-   E = 0;
)
   ...
- }

@@
expression E;
type T;
@@
+ free ((T) E);
+ E = NULL;
- if (unlikely (E != NULL)) {
-   free((T) E);
(
-   E = NULL;
|
-   E = 0;
)
   ...
- }

@@
expression E;
@@
+ free (E);
- if (unlikely (E != NULL)) {
-   free (E);
- }

@@
expression E;
type T;
@@
+ free ((T) E);
- if (unlikely (E != NULL)) {
-   free ((T) E);
- }

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-05 22:28:50 -07:00
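Applied to a small example, the result of the semantic patch above would read roughly as follows. `struct buffer` and `buffer_clear()` are hypothetical names used only for this sketch.

```c
#include <stdlib.h>

/* Hypothetical owner of a heap allocation. */
struct buffer {
    char *data;
};

/* No "if (b->data != NULL)" guard: free(NULL) is defined to do nothing. */
static void buffer_clear(struct buffer *b)
{
    free(b->data);
    b->data = NULL;   /* avoid leaving a dangling pointer behind */
}
```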
Matt Turner
a9e8054fff glX_proto_send.py: Don't cast the return value of malloc 2012-09-05 22:28:50 -07:00
Matt Turner
2b7a972e3f Don't cast the return value of malloc/realloc
This patch has been generated by the following Coccinelle semantic
patch:

// Don't cast the return value of malloc/realloc.
//
// Casting the return value of malloc/realloc only stands to hide
// errors.

@@
type T;
expression E1, E2;
@@
- (T)
(
_mesa_align_calloc(E1, E2)
|
_mesa_align_malloc(E1, E2)
|
calloc(E1, E2)
|
malloc(E1)
|
realloc(E1, E2)
)
2012-09-05 22:28:50 -07:00
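A minimal sketch of the cast-free style follows; `make_counters()` is a hypothetical function. In C a `void *` converts implicitly to any object pointer type, so the cast adds nothing, and with older compilers it can silence the diagnostic that would otherwise point at a missing `<stdlib.h>` include.

```c
#include <stdlib.h>

/* Allocate n zero-initialized counters, or return NULL on failure. */
static int *make_counters(size_t n)
{
    /* No (int *) cast needed; sizeof *counters follows the variable's
     * type automatically if that type ever changes. */
    int *counters = malloc(n * sizeof *counters);
    if (!counters)
        return NULL;

    for (size_t i = 0; i < n; i++)
        counters[i] = 0;
    return counters;
}
```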
Matt Turner
812931f602 glX_proto_send.py: Remove deprecated Xmalloc/Xfree calls
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-05 22:28:49 -07:00
Matt Turner
7c7b7b068b Remove Xcalloc/Xmalloc/Xfree calls
These calls allowed Xlib to use a custom memory allocator, but Xlib has
used the standard C library functions since at least its initial import
into git in 2003. It seems unlikely that it will grow a custom memory
allocator. The functions now just add extra overhead. Replacing them
will make future Coccinelle patches simpler.

This patch has been generated by the following Coccinelle semantic
patch:

// Remove Xcalloc/Xmalloc/Xfree calls

@@ expression E1, E2; @@
- Xcalloc (E1, E2)
+ calloc (E1, E2)

@@ expression E; @@
- Xmalloc (E)
+ malloc (E)

@@ expression E; @@
- Xfree (E)
+ free (E)

@@ expression E; @@
- XFree (E)
+ free (E)

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-05 22:28:49 -07:00
Vinson Lee
17a574d7cd Use the correct macro _WIN32 for Windows.
The correct predefined macro for Windows is _WIN32, not WIN32 or
__WIN32__.  _WIN32 is defined for 32-bit and 64-bit versions of Windows
by both MSVC and MinGW compilers.

http://sourceforge.net/p/predef/wiki/OperatingSystems
http://msdn.microsoft.com/en-us/library/b0084kay.aspx

This patch also fixes a MinGW automake build error.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-05 22:14:32 -07:00
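A platform check based on _WIN32 can be sketched like this; `platform_name()` is a hypothetical helper, not something from the tree.

```c
/* Select a code path at compile time using the compiler-predefined
 * _WIN32 macro rather than WIN32 or __WIN32__. */
static const char *platform_name(void)
{
#if defined(_WIN32)
    return "windows";
#else
    return "other";
#endif
}
```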
Brian Paul
df5eb0c9bc mesa: remove #undef CONST in get.c
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-05 21:20:31 -06:00
Brian Paul
97992b05fb mesa: remove now unused CONST macro
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-05 21:20:27 -06:00
Brian Paul
2e23a76eb9 mesa: s/CONST/const/ in a comment
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-05 21:20:21 -06:00
Brian Paul
9f2a7a38e8 mesa: s/CONST/const/ in math/ files
The CONST macro hack will go away soon.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-05 21:20:03 -06:00
Tom Stellard
d220e2de7f radeon/llvm: Fix operand ordering for V_CNDMASK_B32
This fixes several hundred piglit tests.
2012-09-05 13:17:49 -04:00
Tom Stellard
12d3d6f6ab radeon/llvm: Use correct float->int conversion opcode on SI.
V_CVT_I32_F32 converts floats to signed integers, but we were using
V_CVT_F32_I32, which converts signed integers to floats.
2012-09-05 13:17:17 -04:00
Tom Stellard
d68e337c60 configure.ac: Don't link gallium drivers with libdricore
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-05 14:01:56 -04:00
Paul Berry
e42f16c192 i965/blorp: Fix incorrect indentation. 2012-09-05 10:42:06 -07:00
Paul Berry
772ea84b35 mapi: Add shared-glapi-test to .gitignore 2012-09-05 10:41:42 -07:00
Brian Paul
771e7b6d88 mesa: fix per-level max texture size error checking
This is a long-standing omission in Mesa's texture image size checking.
We need to take the mipmap level into consideration when checking if the
width, height and depth are too large.

Fixes the new piglit max-texture-size-level test.
Thanks to Stéphane Marchesin for finding this problem.

Note: This is a candidate for the stable branches.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-05 08:44:26 -06:00
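Taking the mipmap level into account could be sketched as below. This assumes the limit is expressed as a number of supported levels, so that level 0 may be at most `1 << (max_levels - 1)` texels per dimension and each further level halves that. `texture_size_ok()` and its parameters are hypothetical, and border handling is ignored.

```c
#include <stdbool.h>

/* Return true if a width x height x depth image is acceptable at the
 * given mipmap level, assuming max_levels supported levels. */
static bool texture_size_ok(int max_levels, int level,
                            int width, int height, int depth)
{
    if (level < 0 || level >= max_levels)
        return false;
    if (width < 0 || height < 0 || depth < 0)
        return false;

    /* The largest legal dimension shrinks by half with every level. */
    int max_size = (1 << (max_levels - 1)) >> level;

    return width <= max_size && height <= max_size && depth <= max_size;
}
```

With 12 levels, a 2048x2048 image passes at level 0 but is rejected at level 1, where the limit is 1024.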
Kenneth Graunke
456c7355e0 i965: Don't use brw->fragment_program in the old brw_wm_pass2.c.
According to Eric, this shouldn't matter since we don't do precompiles
using the old backend.  In other words, brw->fragment_program (the
currently active program) should equal c->fp (the program currently
being compiled).

However, it's just not a good idea to access brw->fragment_program
directly in compiler code.  It's totally illegal in the new backend, so
let's just not do it here either.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reported-by: Paul Berry <stereotype441@gmail.com>
2012-09-05 06:50:10 -07:00
Tom Stellard
446d19c12a radeon/llvm: Fix lowering of SI_V_CNDLT
SREG_LIT_0 is a scalar register, so it can only be used in the
first argument of vector instructions.
2012-09-04 14:21:10 -04:00
Tom Stellard
f9fede884b radeon/llvm: Fix encoding of V_CNDMASK_B32
The CodeEmitter was not setting the VGPR bit for src0, because the
instruction definition had the VCC register in the src0 slot, instead of
the actual src0 register.  This has been fixed by moving the VCC
register to the end of the operand list.
2012-09-04 14:21:10 -04:00
Brian Paul
f73ffacbf0 mesa: fix DIFFERENT_SIGNS() function
Looks like converting this to a macro, returning bool, caused us to
lose the high (31st) bit result.  Fixes piglit fbo-1d test.  Strange
that none of the other tests I ran caught this.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=54365

Tested-by: Vinson Lee <vlee@freedesktop.org>
2012-09-04 11:36:58 -06:00
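One plausible reading of how the high bit gets lost is sketched below. It assumes the boolean return type is 8 bits wide; the function names are hypothetical and this is not the code from the commit.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float's bits as a 32-bit unsigned integer. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Buggy variant: the masked value is 0 or 0x80000000, and narrowing it
 * to an 8-bit type keeps only the low byte, which is always 0. */
static uint8_t different_signs_truncated(float x, float y)
{
    return (uint8_t)((float_bits(x) ^ float_bits(y)) & 0x80000000u);
}

/* Fixed variant: collapse the masked value to 0 or 1 before returning. */
static uint8_t different_signs(float x, float y)
{
    return ((float_bits(x) ^ float_bits(y)) & 0x80000000u) != 0;
}
```

XOR-ing the two bit patterns sets bit 31 exactly when the sign bits differ, so the test only works if that bit survives the conversion to the return type.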
Vincent Lejeune
8eaa36317a radeon/llvm: do not convert f32 operand of select_cc node
v2:-use camel coding style

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-04 17:54:37 +02:00
Vincent Lejeune
a4325b3229 radeon/llvm: custom lowering for FP_TO_UINT when dst is i1 (bool)
v2:-wrap line at 80 characters

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-04 17:54:01 +02:00
Vincent Lejeune
d9e135e18c radeon/llvm: support setcc on f32
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-04 17:52:53 +02:00
Vincent Lejeune
a383142436 radeon/llvm: br_cc f32 now lowered without cast
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-04 17:50:44 +02:00
Vincent Lejeune
6a85725f13 radeon/llvm: swap wrong OPCODE_IS_*_ZERO_* opcode and use
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-04 17:44:48 +02:00
Christian König
73dd82061e winsys/radeon: create only one winsys for each fd
Fixing problems with GLAMOR.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-04 10:51:38 +02:00
Christian König
88a4fd8fe6 radeonsi: stop big offsets from hanging the GPU v2
v2: rebased on radeon/llvm fix.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-04 10:51:32 +02:00
Christian König
de7d3825a0 radeonsi: adjust PIPE_SHADER_CAP_MAX_CONSTS
So it matches what we really can do.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-04 10:51:26 +02:00
Christian König
8758183f0a radeon/llvm: fix SelectADDR8BitOffset
The offset is unsigned, not signed.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-09-04 10:51:11 +02:00
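The difference between treating an 8-bit offset as signed or unsigned can be shown with two range checks. This is a plain C illustration with hypothetical helper names; the backend itself is LLVM C++ code that is not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Range accepted when the 8-bit field is treated as unsigned. */
static bool fits_unsigned_8bit(int64_t offset)
{
    return offset >= 0 && offset <= 255;
}

/* Range accepted when the same field is (wrongly) treated as signed. */
static bool fits_signed_8bit(int64_t offset)
{
    return offset >= -128 && offset <= 127;
}
```

An offset of 200 fits the unsigned field but fails the signed check, while -1 passes the signed check even though an unsigned field cannot encode it.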
José Fonseca
7eb5040197 gallivm,llvmpipe: Use 4-wide vectors on AMD Bulldozer.
8-wide vectors are slower.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-09-04 08:49:00 +01:00
Brian Paul
9a31e090ef mesa: add missing return statements after recording errors
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-03 18:07:41 -06:00
Brian Paul
2ffc7fd2d2 mesa: remove more null pointer checks before free() calls
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-03 18:07:41 -06:00
Brian Paul
2276bb991a mesa: remove null pointer checks before free() calls
Since free(NULL) is fine.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-03 18:07:41 -06:00
Brian Paul
56ccdf7e30 mesa: remove SQRTF, use sqrtf. Convert INV_SQRT() to inline function.
We were already defining sqrtf where we don't have the C99 version.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-03 18:07:41 -06:00
Vadim Girlin
f44bda17f5 r600g: adjust QUANT_MODE for higher precision
Use 1/256 for R6xx/7xx, 1/4096 for evergreen, instead of default 1/16.

Helps to pass some piglit tests (fbo, multisample).

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-04 00:18:13 +04:00
Vinson Lee
19b3910bd5 util: Add cpuid for Solaris Studio.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-03 12:28:07 -07:00
Ian Romanick
51b069e7aa meta: Don't save and restore fog state when there is no fog state
I wonder if the better solution is to have _mesa_meta_GenerateMipmap not
use MESA_META_ALL for the GLSL path.  Even on compatibility profiles
there is no reason to save and restore fog on this path.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Lu Hua <huax.lu@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54295
2012-09-03 10:33:54 -07:00
Brian Paul
0b90da3252 mesa: remove accidentally committed __SUNPRO_C sqrtf() code 2012-09-03 08:03:07 -06:00
Christian König
e1673d2001 radeonsi: disable array-textures for now
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-09-03 11:23:25 +02:00
Christian König
aa5daa61a1 radeonsi: disable Z16 for now
It's causing crashes.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-03 11:23:20 +02:00
Christian König
74a55392b6 radeonsi: disable NPOT textures for now
Looks like we have an alignment issue with NPOT textures
and mipmaps. So disable NPOT textures until we figure out
what is going wrong here.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-03 11:23:14 +02:00
Christian König
e7723b5bdf radeonsi: handle indirect constants gracefully
It's not supported yet, so at least don't try to crash the box.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-03 11:23:08 +02:00
Brian Paul
a96119cc8c radeon: fix free/FREE mistake 2012-09-01 09:47:29 -06:00
Brian Paul
12bf268aab vega: include u_debug.h for assert() 2012-09-01 09:03:24 -06:00
Brian Paul
fe72a069d1 mesa: s/FREE/free/
v2: replace instances in dri/common/ dirs

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-01 07:47:24 -06:00
Brian Paul
4fdac659f8 mesa: s/CALLOC/calloc/
v2: replace instances in dri/common/ dirs

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-01 07:47:24 -06:00
Brian Paul
33bb8c051d mesa: s/MALLOC/malloc/
v2: replace instances in dri/common/ dirs

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-01 07:47:24 -06:00
Brian Paul
66d6ba2d83 util: remove u_debug.h from u_math.h
No debug code is used in u_math.h
2012-09-01 07:41:26 -06:00
Brian Paul
a7663729d2 util: include u_debug.h 2012-09-01 07:41:26 -06:00
Brian Paul
b114e37179 tgsi: include u_debug.h 2012-09-01 07:41:26 -06:00
Brian Paul
36f3f7ebfa mesa: clean-up LOG2() function 2012-09-01 07:41:26 -06:00
Brian Paul
c8a86f717f mesa: move IS_NEGATIVE() and DIFFERENT_SIGNS() to macros.h 2012-09-01 07:41:26 -06:00
Brian Paul
a2cf265c8d mesa: clean up F_TO_I, IFLOOR, ICEIL functions
Put all the #ifdef stuff inside the function bodies instead of outside.
2012-09-01 07:41:26 -06:00
Kenneth Graunke
4d9abd96cc i965/fs: Don't use brw->fragment_program in calculate_urb_setup().
Reading brw->fragment_program is nonsensical in compiler code: it
contains the currently active program (if any), not the one currently
being compiled.  Attempting to access it may either lead to crashes
(null pointer dereference if no program is active) or wrong results.

Fixes piglit regressions since 9ef710575b
on pre-Sandybridge hardware.  The actual bug was created in commit
7b1fbc6889.

NOTE: This is a candidate for the 9.0 and 8.0 branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54183
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-08-31 16:36:09 -07:00
Matt Turner
e0f510b1c9 build: Remove left over echo from GLU removal 2012-08-31 15:12:21 -07:00
Vadim Girlin
b05a1fc156 mesa: don't wait in _mesa_ClientWaitSync if timeout is 0
From ARB_sync spec:

    If the value of <timeout> is zero, then ClientWaitSync does not
    block, but simply tests the current state of <sync>. TIMEOUT_EXPIRED
    will be returned in this case if <sync> is not signaled, even though
    no actual wait was performed.

Fixes random fails of the arb_sync-timeout-zero piglit test on r600g.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-01 01:02:24 +04:00
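The quoted behaviour can be sketched as follows. `struct sync_object`, `driver_wait()` and `client_wait_sync()` are hypothetical stand-ins, and the fake wait simply marks the object signaled so the control flow can be exercised.

```c
#include <stdbool.h>
#include <stdint.h>

enum wait_result { ALREADY_SIGNALED, TIMEOUT_EXPIRED, CONDITION_SATISFIED };

/* Hypothetical sync object; wait_calls counts blocking waits taken. */
struct sync_object {
    bool signaled;
    int wait_calls;
};

/* Hypothetical stand-in for a driver's blocking wait. */
static void driver_wait(struct sync_object *s, uint64_t timeout_ns)
{
    (void)timeout_ns;
    s->wait_calls++;
    s->signaled = true;
}

static enum wait_result client_wait_sync(struct sync_object *s,
                                         uint64_t timeout_ns)
{
    if (s->signaled)
        return ALREADY_SIGNALED;

    /* Zero timeout: only test the state, never enter the wait path. */
    if (timeout_ns == 0)
        return TIMEOUT_EXPIRED;

    driver_wait(s, timeout_ns);
    return s->signaled ? CONDITION_SATISFIED : TIMEOUT_EXPIRED;
}
```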
Matt Turner
b95d598323 Remove libGLU
It's been moved to its own repository, found at
	http://cgit.freedesktop.org/mesa/glu/

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-31 10:58:15 -07:00
Jakob Bornecrantz
6a7dea93fa dri: Rework planar image interface
As discussed with Kristian on #wayland. Pushes the decision of components into
the dri driver, giving it greater freedom to implement YUV samplers
in hardware and to choose which mode to use.

This interface will also allow drivers like SVGA to implement YUV surfaces
without the need to sub-allocate and instead send 3 separate buffers, one for
each channel; this is currently not implemented.

I have tested these changes on Gallium Svga. Scott tested them on both intel
and Gallium Radeon. Kristian and Pekka tested them on intel.

v2: Fix typo in dri2_from_planar.
v3: Merge in intel changes.

Tested-by: Scott Moreau <oreaus@gmail.com>
Tested-by: Pekka Paalanen <ppaalanen@gmail.com>
Tested-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
2012-08-31 19:51:02 +02:00
Tom Stellard
022f6d8861 radeon/llvm: Rework how immediate operands are handled with SI
Immediate operands were previously handled in the CodeEmitter, but that
code was buggy and very confusing.  This commit adds a pass that simplifies
the handling of immediate operands by splitting the loading of the
immediate into a separate instruction that is bundled with the original.
2012-08-31 12:54:58 -04:00
Tom Stellard
1cee70c5d8 radeon/llvm: Fix typo in assert 2012-08-31 12:54:58 -04:00
Tom Stellard
1247549734 radeon/llvm: Fix isEG tablegen predicate
This predicate incorrectly included SI GPUs, so some Evergreen
instructions were being emitted on SI.
2012-08-31 12:54:58 -04:00
Tom Stellard
ee45dec7c4 radeon/llvm: Add support for RCP instruction on SI 2012-08-31 12:54:58 -04:00
Tom Stellard
fc8b4765d0 radeon/llvm: Support AMDGPUfmin DAG node on SI 2012-08-31 12:54:57 -04:00
Tom Stellard
c3c323a164 radeonsi: Handle TGSI_SEMANTIC_PSIZE
The relevant POINT_SIZE registers are being set using the
pipe_rasterizer_state, so we just need to tell the shader compiler which
export type to use.

This fixes several of the glean glsl tests.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-31 12:53:51 -04:00
Tapani Pälli
d58ca43b80 android: do not expose single buffered eglconfigs
On Android we want to add only double buffered configs for visuals.
Earlier implementation set the SurfaceType as 0 for single buffered
configs but driver still exposed these configs that were not compatible
with any egl surface type.  This caused Khronos conformance test runs to
fail on Android. This patch fixes the issue by skipping single buffered
configs earlier and not exposing them.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-31 09:47:02 -07:00
Tapani Pälli
29d394b9ba android: fix liblog API changes
Android logging macros changed their names in JellyBean.

Signed-off-by: Bruce E. Robertson <bruce.e.robertson@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-31 09:45:56 -07:00
Tapani Pälli
4d02b018f4 xmlconfig: use __progname when building for Android
__progname symbol and strrchr are available with bionic.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-31 09:45:49 -07:00
Vinson Lee
f3bb6bd9b3 scons: Remove leftover print statement.
Remove print statement left over from commit
c57fb034b1.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2012-08-31 08:26:29 -07:00
Andreas Boll
0dcf555104 docs: update relnotes-9.0
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-08-31 09:22:18 -06:00
Andreas Boll
3678f8904c mesa: also bump version in Makefile.am and configure.ac to 9.0
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-08-31 09:22:16 -06:00
Vinson Lee
c57fb034b1 scons: Add default libraries to Solaris build.
Fixes SCons build on Solaris.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54293
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-08-31 08:24:38 -06:00
Brian Paul
43ed822a50 st/mesa: s/CALLOC/calloc/ to fix allocation bug
The CALLOC() macro only takes one argument so this was being treated
as a comma expression.  Simply use calloc() instead.

A follow-on patch will replace all CALLOC() calls with calloc().

NOTE: This is a candidate for the 8.0 and 9.0 branches.
2012-08-31 08:05:38 -06:00
Brian Paul
c5f9cf8232 util: add casts to silence signed/unsigned comparison warnings 2012-08-31 08:04:40 -06:00
Brian Paul
8472bb4508 mesa: fix-up and use _mesa_delete_renderbuffer()
_mesa_delete_renderbuffer() should free the mutex (though that may be a
no-op) and then free the renderbuffer object itself.  Subclasses of
gl_renderbuffer can use this function too.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-31 08:04:40 -06:00
Ian Romanick
2d2f1fd164 docs: Add some missing features to 9.0 release notes and GL3.txt
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-30 18:23:29 -07:00
Ian Romanick
0791484c42 mesa: Bump version to 9.0
Now that OpenGL 3.1 is supported by at least one driver, follow
tradition and bump the major version number.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-30 18:23:28 -07:00
Marek Olšák
0e470533ad r600g: enable transform feedback on Cayman
There doesn't seem to be anything wrong with it.
2012-08-31 01:19:03 +02:00
Marek Olšák
64db3cc6ad r600g: implement MSAA for Cayman
Everything works except for blitting MSAA colorbuffers, which isn't
so trivial on Cayman. It's a rarely-used feature anyway.
2012-08-31 01:19:03 +02:00
Anuj Phogat
f8a8f069ee i965/msaa: flag _NEW_MULTISAMPLE in the brw_tracked_state
This is required to get the program recompiled when SampleAlphaToCoverage
is enabled.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-08-30 11:10:50 -07:00
Marek Olšák
c2e9dd0276 r600g: enable MSAA on r6xx by default
DRM 2.22.0 is required though. Also require the new DRM for r700, as
there are some important fixes for that generation too.
2012-08-30 19:43:56 +02:00
Marek Olšák
2f6eb3afb7 r600g: disable MSAA depth decompression on r6xx 2012-08-30 19:43:56 +02:00
Marek Olšák
78354011f9 r600g: implement color resolve for r600
The blend state is different and the resolve single-sample buffer must have
FMASK and CMASK enabled. I decided to have one CMASK and one FMASK
per context instead of per resource.

There are new FMASK and CMASK allocation helpers and a new buffer_create
helper for that.
2012-08-30 19:43:56 +02:00
Marek Olšák
863e2c85b9 r600g: fix CB_SHADER_MASK and CB_TARGET_MASK for r6xx 2012-08-30 19:43:56 +02:00
Marek Olšák
187d7fb2fe r600g: implement draw_rectangle callback
The color resolve on r6xx needs PT_RECTLIST. Using conventional primitive
types (triangles and quads) produces an ugly line between two diagonally
opposite corners. I guess a rectangular point sprite would work too.
2012-08-30 19:43:55 +02:00
Marek Olšák
8698a3b85d r600g: implement MSAA for r700
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-08-30 19:43:55 +02:00
Marek Olšák
edf22a5c6d r600g: change programming of CB_SHADER_MASK on r600-r700
This one actually makes more sense and gives the expected value
for MSAA resolve.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-08-30 19:43:55 +02:00
Marek Olšák
1ff5f08823 configure.ac: require libdrm_radeon 2.6.39 for MSAA 2012-08-30 19:43:55 +02:00
Brian Paul
055093e33f meta: remove call to _meta_in_progress(), fix multisample enable/disable
This partially reverts d638da23d2.

With gallium the meta code is not always built so the call to
_meta_in_progress() was unresolved.  Simply special-case the
GL_MULTISAMPLE case in the meta code.  There might be other special
cases in the future given all the differences between legacy GL,
core GL, GLES, etc.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=54234
and https://bugs.freedesktop.org/show_bug.cgi?id=54239

v2 (Paul Berry <stereotype441@gmail.com>): keep _meta_in_progress
function, since it's needed by the i965 driver, but don't call it from
core mesa.

Signed-off-by: Brian Paul <brianp@vmware.com>
2012-08-30 08:28:19 -07:00
Brian Paul
aad7ccd261 meta: add parentheses to silence compiler warnings
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-08-30 09:26:51 -06:00
Tapani Pälli
9121460f13 scons: add HAVE_DLOPEN to build environment
fixes dlopen issue caused by 57c57df7b4

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54140

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-08-30 12:02:03 +01:00
Christian König
f1fd94f355 radeonsi: fix stupid bug added in commit 07838603b9
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-30 10:23:32 +02:00
Eric Anholt
8393360659 i965/fs: Remove a dead member from live variables analysis.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-29 20:46:51 -07:00
Kenneth Graunke
6928bea7ca i965/fs: Initialize output_components[] by filling it with zeros.
Prior to commit 2f1869822, emit_fb_writes() looped from 0 to 3, writing
all four components of a vec4 color output.  However, that broke for
smaller output types (float, vec2, or vec3).  To fix that, I introduced
a new variable (output_components[]) containing the size of the output
type for each render target.

Unfortunately, I forgot to actually initialize it in the constructor,
which meant that unless a shader wrote to gl_FragColor, or the specific
output for each render target, output_components would contain a garbage
value, and we'd loop for a completely non-deterministic amount of time.

Not actually emitting any color writes seems like the right approach.
We may still need to emit a render target write (to terminate the
thread), but don't have to put in any sensible values (the shader didn't
write anything, after all).

Fixes a regression since 2f18698220.
NOTE: This is a candidate for stable release branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54193
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Ian Romanick <idr@freedesktop.org>
2012-08-29 15:10:57 -07:00
Ian Romanick
42723d88d3 mesa: Do something sensible when on-line compression is requested but not possible
It is possible to force S3TC extensions to be enabled.  This is
generally done to support applications that will only supply
pre-compressed textures.  This accounts for the vast majority of
applications.

However, there is still the possibility of an application asking for
on-line compression.  In that case, generate a warning and substitute a
generic compressed format.  The driver will either pick an uncompressed
format or a compressed format that Mesa can handle on-line (e.g., FXT1).

This should only cause problems for applications that request on-line
compression and read the compressed texture back.  This is likely an
infinitesimal subset of an already infinitesimal subset.
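
The substitution described above can be sketched as follows. This is an illustration in Python rather than Mesa's actual C code: the helper name choose_internal_format and the can_compress_s3tc flag are hypothetical, only the four non-sRGB S3TC formats are covered, and the enum values are the standard ones from the GL headers.

```python
import warnings

# Standard GL enum values (generic compressed formats and S3TC formats).
GL_COMPRESSED_RGB = 0x84ED
GL_COMPRESSED_RGBA = 0x84EE
GL_COMPRESSED_RGB_S3TC_DXT1_EXT = 0x83F0
GL_COMPRESSED_RGBA_S3TC_DXT1_EXT = 0x83F1
GL_COMPRESSED_RGBA_S3TC_DXT3_EXT = 0x83F2
GL_COMPRESSED_RGBA_S3TC_DXT5_EXT = 0x83F3

# Generic replacement for each S3TC format (same channel layout).
_GENERIC_FOR_S3TC = {
    GL_COMPRESSED_RGB_S3TC_DXT1_EXT: GL_COMPRESSED_RGB,
    GL_COMPRESSED_RGBA_S3TC_DXT1_EXT: GL_COMPRESSED_RGBA,
    GL_COMPRESSED_RGBA_S3TC_DXT3_EXT: GL_COMPRESSED_RGBA,
    GL_COMPRESSED_RGBA_S3TC_DXT5_EXT: GL_COMPRESSED_RGBA,
}


def choose_internal_format(requested, can_compress_s3tc):
    """Hypothetical helper: pick the format used for on-line compression.

    If the application asks for S3TC but no compressor is available,
    warn and fall back to the matching generic compressed format.
    """
    generic = _GENERIC_FOR_S3TC.get(requested)
    if generic is None or can_compress_s3tc:
        return requested  # not S3TC, or S3TC compression is possible
    warnings.warn("S3TC on-line compression unavailable; using generic format")
    return generic
```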

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-29 15:09:38 -07:00
Ian Romanick
0e0d664461 i965: Allow creation of OpenGL 3.1 contexts
v2: Fix API_OPENGL_CORE handling when TEXTURE_FLOAT_ENABLED is not
defined.  Based on review feedback from Eric Anholt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-29 15:09:38 -07:00
Ian Romanick
2a33a99737 i965: Advertise GLSL 1.40 and TexBOs in core contexts
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:38 -07:00
Ian Romanick
91473485fc intel: Clean up bits of cruft in intelCreateContext
This and the previous three commits should probably be squashed together...

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:37 -07:00
Ian Romanick
bf8644e64d i965: Set context flags
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:37 -07:00
Ian Romanick
ca2b1fcb30 mesa/dri: Allow creation of forward-compatible contexts
This is done by changing the API to API_OPENGL_CORE.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:37 -07:00
Ian Romanick
36ceabfb74 mesa/es: Enable GL_OES_vertex_array_object
Functionally the same as GL_ARB_vertex_array_object.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-29 15:09:37 -07:00
Ian Romanick
35cf6aeb8c mesa: Enable GL_{ARB,APPLE}_vertex_array_object in all drivers
This is a purely software extension.  The drivers don't need to do any
work to support it.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-29 15:09:37 -07:00
Ian Romanick
d1cf5c77b7 meta: Don't use deprecated keyword in 1.30 shader
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:37 -07:00
Ian Romanick
ae88281b7b mesa: Disallow alpha, luminance, and LA textures in core context
Also disallow the 1, 2, 3, and 4 formats.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:37 -07:00
Ian Romanick
04d6ffa06d mesa: Disallow more deprecated functions in core context
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:37 -07:00
Ian Romanick
91107b4ccf mesa: Require names from Gen in core context
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:37 -07:00
Ian Romanick
843b876ba3 mesa: Allow NULL vertex pointer without a VBO
There is text in the OpenGL 3.x specs to explicitly allow this case.
Weird.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:37 -07:00
Ian Romanick
792214e8d4 mesa: Disallow VertexAttribPointer without a VAO in a core context
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:37 -07:00
Ian Romanick
29512df635 mesa: Disallow wide lines in forward compatible context
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:37 -07:00
Ian Romanick
7e1cab09a1 mesa: Only FRONT_AND_BACK is allowed for PolygonMode in core context
Page 407 (page 423 of the PDF) of the OpenGL 3.0 spec says (in the list
of deprecated functionality):

    "Separate polygon draw mode - PolygonMode face values of FRONT and
    BACK; polygons are always drawn in the same mode, no matter which
    face is being rasterized."

Also modify meta to not use FRONT or BACK in a core context.
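
A minimal sketch of the face check, in Python for illustration only (Mesa's code is C). The function name polygon_mode_face_error is hypothetical, and returning GL_INVALID_ENUM for FRONT or BACK in a core context is an assumption about the error chosen; the enum values are the standard GL ones.

```python
# Standard GL enum values.
GL_NO_ERROR = 0
GL_FRONT = 0x0404
GL_BACK = 0x0405
GL_FRONT_AND_BACK = 0x0408
GL_INVALID_ENUM = 0x0500


def polygon_mode_face_error(face, is_core_profile):
    """Hypothetical helper: return the GL error for a PolygonMode face."""
    if face == GL_FRONT_AND_BACK:
        return GL_NO_ERROR  # always allowed
    if face in (GL_FRONT, GL_BACK) and not is_core_profile:
        return GL_NO_ERROR  # separate modes survive only in legacy GL
    return GL_INVALID_ENUM  # assumed error for deprecated or unknown faces
```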

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:36 -07:00
Paul Berry
d638da23d2 meta: Don't stray outside the confines of the API specified in the context
Signed-off-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:36 -07:00
Ian Romanick
8e7b6a69e9 mesa: Don't allow display lists or evaluators in core context
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:36 -07:00
Ian Romanick
2bcf555490 mesa: Don't allow GL_EXTENSIONS query in core context
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:36 -07:00
Ian Romanick
c85a9a9996 mesa: Non-sprite points are deprecated
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:36 -07:00
Eric Anholt
7d8d1c7819 mesa: Fix VAO deletion on GL 3.1 core.
We were calling through a dispatch table entry that was NULL, since the apple
variant is only on legacy desktop.  Just call the function we mean instead of
indirecting through the dispatch.
2012-08-29 15:09:36 -07:00
Eric Anholt
8a4d560796 mesa: Enable a bunch of missing getters on 3.1 core.
NOTE: maybe I enabled too many?
2012-08-29 15:09:36 -07:00
Eric Anholt
bb4a39ec95 mesa: Expose texture buffer objects when the context is GL 3.1 core.
v2: Use API_OPENGL_CORE.

v3: Only require desktop GL.  If a driver can't support TexBOs in a non-core
context, it should not enable them.

Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-29 15:09:36 -07:00
Ian Romanick
1b86a91c64 mesa: Allow PACK / UNPACK queries for ES2
These are part of the GL_EXT_unpack_subimage extension and ES 3.0.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:36 -07:00
Ian Romanick
a010215463 mesa: Kill ES2 wrapper functions
v2: Fix completely broken condition around ClearColorIiEXT and
ClearColorIuiEXT.

v3: Add special VertexAttrib handling for ES2.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:36 -07:00
Ian Romanick
fc2219e448 mesa: glGetVertexAttribPointerv is part of core profile and ES2
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:36 -07:00
Ian Romanick
917f68071b mesa/es: Validate glPointParameter pname in Mesa code rather than the ES wrapper
v2: Add proper core-profile filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-29 15:09:36 -07:00
Ian Romanick
f778174ea1 mesa: Require OpenGL 2.0 for GL_POINT_SPRITE_COORD_ORIGIN
The comment in the code even says this is the right thing to do.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-29 15:09:36 -07:00
Ian Romanick
25ffb86893 mesa: Require that drivers supporting point sprites support point parameters
All drivers in Mesa do.  This allows a lot of extension checking code to be
gutted from the function.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-29 15:09:35 -07:00
Ian Romanick
33e01d93ca mesa/es: Validate glGetTexEnv parameters in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:35 -07:00
Ian Romanick
8a263b6efd mesa/es: Validate glTexEnv parameters in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:35 -07:00
Ian Romanick
d2b03f6e99 mesa/es: Validate glGetTexGen parameters in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:35 -07:00
Ian Romanick
f329adfa49 mesa/es: Validate glTexGen parameters in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:35 -07:00
Ian Romanick
0fa4ed05cf mesa/es: Validate glLightModel pname in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:35 -07:00
Ian Romanick
fb4f2d3425 mesa/es: Validate glMaterial face and pname in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:35 -07:00
Ian Romanick
8df3f9bd5f mesa/es: Validate glGetMaterial pname in Mesa code rather than the ES wrapper
Fixes a bug where glGetMaterial[fx]v in ES1 contexts would (try to) allow
queries of GL_AMBIENT_AND_DIFFUSE.  This enum can only be used in glMaterial,
not in the get.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:35 -07:00
Ian Romanick
9555d7bdc1 mesa/es: Validate glGetPointerv pname in Mesa code rather than the ES wrapper
v2: Add proper core-profile, GLES1, and GLES3 filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:35 -07:00
Ian Romanick
d6c8913bc6 mesa/es: Validate glMatrixMode mode in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:35 -07:00
Ian Romanick
10e7db1ccf mesa/es: Validate glFog pname in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:35 -07:00
Ian Romanick
b7c7e5e45a mesa/es: Validate glReadPixels format and type in Mesa code rather than the ES wrapper
v2: Add proper GLES3 filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:35 -07:00
Ian Romanick
4114dee99e mesa/es: Validate glPixelStore pname in Mesa code rather than the ES wrapper
v2: Add proper core-profile and GLES3 filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:35 -07:00
Ian Romanick
08be1d288f mesa/es: Validate glEnable cap in Mesa code rather than the ES wrapper
Also handle glDisable, glIsEnabled, glEnableClientState, and
glDisableClientState.

v2: Add proper core-profile and GLES3 filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:34 -07:00
Ian Romanick
bca2cece02 mesa/es: Validate glHint target in Mesa code rather than the ES wrapper
v2: Add proper core-profile and GLES3 filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:34 -07:00
Ian Romanick
2c87030a00 mesa/es: Validate glGetVertexAttribf pname in Mesa code rather than the ES wrapper
v2: Add proper core-profile and GLES3 filtering.

v3: Allow glGetVertexAttribfv(0, GL_CURRENT_VERTEX_ATTRIB_ARB, param) in
OpenGL 3.1, just like OpenGL ES 2.0.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:34 -07:00
Ian Romanick
c13f36ce4e mesa/es: Validate glGetString pname in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:34 -07:00
Ian Romanick
6a9b8f897a mesa/es: Validate primitive modes in Mesa code rather than the ES wrapper
v2: Add proper core-profile filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:34 -07:00
Ian Romanick
72e076cb17 mesa: Refactor _mesa_valid_prim_mode to use a switch-statement
This makes the next change a bit easier.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:34 -07:00
Ian Romanick
01497a3560 mesa/es: Validate blend function enums in Mesa code rather than the ES wrapper
v2: Add proper core-profile filtering.

v3: Allow GL_SRC_ALPHA_SATURATE as a destination factor in GLES3.  Based
on review feedback from Eric Anholt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:34 -07:00
Ian Romanick
e58c19a204 mesa/es: Validate glClear mask in Mesa code rather than the ES wrapper 2012-08-29 15:09:34 -07:00
Ian Romanick
f0c99d0a6a mesa/es: Validate glRenderbufferStorage internalFormat in Mesa code rather than the ES wrapper
v2: Add proper core-profile and GLES3 filtering.

v3: Allow GL_RGB10_A2UI in GLES3 based on review feedback from Eric
Anholt.

v4: Arg.  Reject unsized RED and RG enums on GLES.  More feedback from
Eric.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:34 -07:00
Ian Romanick
ae86ebfcc9 mesa/es: Validate glGetRenderbufferParameter pname in Mesa code rather than the ES wrapper
v2: Add proper core-profile and GLES3 filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-29 15:09:34 -07:00
Ian Romanick
0cdaa471ec mesa/es: Validate glGetFramebufferAttachmentParameter pname in Mesa code rather than the ES wrapper
v2: Add proper core-profile, GLES1, and GLES3 filtering.

v3: Fix the GL_FRAMEBUFFER_ATTACHMENT_OBJECT_NAME query when the
attachment type is GL_NONE on GLES3.  Other cleanups.  Based on review
feedback from Eric Anholt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:34 -07:00
Ian Romanick
5b44a77428 mesa/es: Validate glGenerateMipmap target in Mesa code rather than the ES wrapper
v2: Add proper core-profile and GLES3 filtering.

v3: Fix a typo in GL_TEXTURE_2D_ARRAY checking.

v4: Change !_mesa_is_desktop_gl tests to _mesa_is_gles test.  The test
around GL_TEXTURE_2D_ARRAY got some other changes because that enum is
also available with GLES3 (which uses API_OPENGLES2).  Based on review
feedback from Eric Anholt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:34 -07:00
Ian Romanick
7f991d26ad mesa/es: Validate glFramebufferTexture2D textarget in Mesa code rather than the ES wrapper
v2: Add proper core-profile and GLES3 filtering.

v3: Change !_mesa_is_desktop_gl tests to _mesa_is_gles test.  The test
around GL_TEXTURE_2D_ARRAY got some other changes because that enum is
also available with GLES3 (which uses API_OPENGLES2).  Based on review
feedback from Eric Anholt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-29 15:09:34 -07:00
Tom Stellard
2809ae3d44 radeon/llvm: Fix encoding of FP immediates on SI 2012-08-29 15:52:10 -04:00
Tom Stellard
05113fd266 radeon/llvm: Create a register class for the M0 register
The Common Subexpression Elimination pass will not operate on
instructions with physical register defs, so we end up with
several redundant copies to M0 when using interpolation.

Adding a register class that only contains the M0 register allows
us to use a virtual register to represent M0, and makes it possible
for the Common Subexpression Elimination pass to remove the extra
copies.
2012-08-29 15:52:10 -04:00
Tom Stellard
733c28a0d9 radeon/llvm: Set the neverHasSideEffects bit on more instructions
This flag makes these instructions candidates for dead code
elimination and common subexpression elimination.
2012-08-29 15:52:10 -04:00
Tom Stellard
cf4ac69928 radeon/llvm: Declare the interpolation intrinsics as ReadOnly
This signals to the Dead Code Elimination pass that it is safe to
remove these instructions when they are dead.
2012-08-29 15:52:10 -04:00
Tom Stellard
73a2c4b9db radeon/llvm: Mark M0 as a def when lowering interpolation instructions 2012-08-29 15:52:10 -04:00
Anuj Phogat
0fc11a24c8 meta: Add GLSL variant of _mesa_meta_GenerateMipmap() function
This reduces the overhead of using the fixed function internally
in the driver.

V2: Use setup_glsl_generate_mipmap() and setup_ff_generate_mipmap()
    functions to avoid code duplication.
    Use glsl version when ARB_{vertex, fragment}_shader are present.
    Remove redundant code.

V3: Remove redundant border related code leaving the assertion.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-29 11:43:52 -07:00
Brian Paul
c824804c6f glsl: s/class/struct/ for ast_type_qualifier
To silence an MSVC compiler warning about class vs. struct.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-29 12:08:46 -06:00
Brian Paul
ec6478fd32 mesa: convert a few more macros to inline functions 2012-08-29 08:20:58 -06:00
Brian Paul
cf41d7c63a mesa: remove COPY_4V_CAST() macro
Only used in one place, and not really needed.
2012-08-29 08:20:58 -06:00
Brian Paul
fd9afb87d8 mesa: convert a bunch of math macros to inline functions 2012-08-29 08:20:58 -06:00
Brian Paul
454e23776d tnl: use INTERP_4F() instead of four INTERP_F() calls 2012-08-29 08:20:58 -06:00
Brian Paul
ba6f47132d swrast: fix wrong assignments in _swrast_add_spec_terms_line() 2012-08-29 08:20:58 -06:00
Brian Paul
1aee8803f8 mesa: test for GL_EXT_framebuffer_sRGB in glPopAttrib()
To avoid spurious GL_INVALID_ENUM errors if the extension isn't supported.
2012-08-29 08:20:57 -06:00
Martin Pieuchot
c4c4d4ad1e mesa: Define CPU_TO_LE32 to work on OpenBSD
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-08-29 08:05:17 -06:00
Brian Paul
4aede0018a docs: remove mention of old driver maintenance
People who need old drivers can use older versions of Mesa.
2012-08-28 13:09:02 -06:00
Andreas Boll
6eaccbfeeb docs/utilities: add/update some useful utilities
the progs/util directory is now in mesa demos
replace glean with piglit
add ApiTrace

markup: replace the unordered list <ul> with a definition list <dl>

Signed-off-by: Brian Paul <brianp@vmware.com>
2012-08-28 13:08:56 -06:00
Eric Anholt
67e9ae8563 i965: Disable the swrast context setup on GL 3.1 core.
I've reviewed the code, and the swrast callsites remaining are all in
drawpixels/copypixels/bitmap/accum, or _swrast_BlitFramebuffer that shouldn't
be hit.  A piglit run with the context setup disabled on legacy GL and GLES2
showed regressions only in the copypixels and drawpixels tests.

If the context type is forced, this reduces the shader_runner maximum heap
size for glsl-algebraic-add-add-1.shader_test from 15,137,496b to 4,165,376b.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-28 11:43:04 -07:00
Eric Anholt
993c52d0be i965: Replace general sw fallback support with a manual check for rendermode.
There were no other cases that set it any more.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-28 11:43:04 -07:00
Eric Anholt
b0d23b66cf intel: Move RenderMode fallback func to i915 driver.
The Fallback field of the context struct doesn't work that way on i965, and
it's the only caller of FALLBACK() in the driver.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-28 11:43:04 -07:00
Eric Anholt
628dfe9511 i965: Drop the old sw fallback for position array being disabled.
This code has been in the driver since the first commit.  I think it was
trying to stop rendering from happening with a disabled position array.  Core
mesa has since had changes to deal with disabled position arrays correctly.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-28 11:43:04 -07:00
Eric Anholt
5e3c093ff8 i965: Drop support for forcing drawing through sw fallbacks.
It turns out it hasn't worked since at least 8.0.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-28 11:43:04 -07:00
Eric Anholt
bfae8650ec i965: Move depth resolve for span fallbacks to a simpler place.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-28 11:43:04 -07:00
Eric Anholt
707f242c4b i965: Drop manual hiz resolves in span rendering.
swrast uses MapRenderbuffer, which leads to intel_miptree_map, which does the
depth resolve.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-28 11:43:04 -07:00
Michel Dänzer
70f9dbe298 radeon/llvm: Handle TGSI KIL opcode for SI.
Fixes piglit fp-kil and glBitmap() with radeonsi.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-28 20:27:23 +02:00
Michel Dänzer
16e42a5dd0 radeon/llvm: Basic support for SI EXEC register.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-28 20:26:50 +02:00
Michel Dänzer
6ca64393c9 radeonsi: Don't write to the PA_SC_RASTER_CONFIG register.
It should be initialized by the kernel as necessary.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2012-08-28 20:24:52 +02:00
Marek Olšák
999b7f6665 r600g: fix relative addressing on RS780 and RS880
They should be treated like RV670.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2012-08-28 18:27:03 +02:00
Andreas Boll
3e20605c16 docs/helpwanted: add radeonsi todo list
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-08-28 17:36:07 +02:00
Andreas Boll
17f09b664b configure.ac: add radeonsi to --with-gallium-drivers help string
the help string is used by ./configure --help

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-08-28 17:35:36 +02:00
José Fonseca
bc8509b43b llvmpipe: Bump the maximum texture size (in pixels).
But cap the size in bytes, to avoid depleting the whole system memory
with humongous textures.

Tested with max-texture-size piglit test.
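
A rough sketch of a two-level limit like the one described above, in Python for illustration. The constants MAX_DIMENSION and MAX_BYTES and the helper texture_fits are hypothetical; the commit message does not state llvmpipe's real limits.

```python
MAX_DIMENSION = 8192   # hypothetical per-axis limit, in pixels
MAX_BYTES = 1 << 30    # hypothetical total size cap, in bytes (1 GiB)


def texture_fits(width, height, depth, bytes_per_pixel):
    """Hypothetical check: accept a texture only if both limits hold."""
    if max(width, height, depth) > MAX_DIMENSION:
        return False  # too large along one axis
    # A texture within the per-axis limit can still be huge in total.
    return width * height * depth * bytes_per_pixel <= MAX_BYTES
```

With these assumed numbers, an 8192x8192 RGBA8 texture (256 MiB) is accepted, while a two-layer 8192x8192 texture at 16 bytes per pixel (2 GiB) is rejected even though each axis is within range.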

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-28 15:18:43 +01:00
Vadim Girlin
6463eb013f u_vbuf: avoid unnecessary update of the vertex elements
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-08-28 18:01:13 +04:00
Matt Turner
971750e1cd egl: fix invalid flag detection for EGL_KHR_create_context
We want to check whether there are bits set outside of the valid flags.
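
The check amounts to masking off the valid bits and testing what remains. A minimal sketch in Python (the real code is C); the helper name has_invalid_flags is hypothetical, and the bit values are those defined by EGL_KHR_create_context.

```python
# Flag bits defined by EGL_KHR_create_context.
EGL_CONTEXT_OPENGL_DEBUG_BIT_KHR = 0x1
EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR = 0x2
EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR = 0x4

VALID_FLAGS = (EGL_CONTEXT_OPENGL_DEBUG_BIT_KHR
               | EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR
               | EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR)


def has_invalid_flags(flags):
    """Hypothetical helper: True if any bit outside VALID_FLAGS is set."""
    return (flags & ~VALID_FLAGS) != 0
```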

Fixes piglit test egl-create-context-invalid-flag-gl

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-27 15:11:11 -07:00
Kenneth Graunke
77d675926a i965: Make VS programs obey the shader_precompile driconf option.
Now that it's on by default, we may as well make it obey the flag,
for consistency's sake if nothing else.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-27 14:23:40 -07:00
Kenneth Graunke
9ef710575b i965: Reenable the fragment shader precompile.
Precompiling the shader at link time often allows us to avoid compiling
it at the first use.  This moves the expensive compilation and
optimization process to game or level load time, rather than at draw
time, where we really can't avoid any cycles and don't want to risk
stalling the GPU.

The downside is that we have to guess the non-orthogonal state the
program will have set when it draws with the shader.  Previously, we
guessed wrong for nearly every shader, so it wasn't useful.  With the
recent SamplerUnits rework and this series, we've either eliminated
state or made smarter guesses, and usually get it right now.

In the L4D2 time demo, I now have 39 fragment shader recompiles and no
vertex shader recompiles.  Before this series and the SamplerUnits
rework, I had 206 fragment shader recompiles and 192 vertex shader
recompiles.
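
The trade-off can be illustrated with a toy cache keyed by (shader, program key). This is a Python sketch, not the i965 state cache; the class ProgramCache and its method names are hypothetical. A precompile with a correct guess makes the draw-time lookup a hit, and a wrong guess costs one recompile.

```python
class ProgramCache:
    """Hypothetical cache of compiled programs keyed by (shader, key)."""

    def __init__(self, compile_fn):
        self._compile = compile_fn
        self._programs = {}
        self.recompiles = 0  # draw-time misses

    def precompile(self, shader, guessed_key):
        # Link time: compile once using a guess at the draw-time state.
        self._programs[(shader, guessed_key)] = self._compile(shader, guessed_key)

    def lookup(self, shader, actual_key):
        # Draw time: reuse the precompiled program if the guess was right.
        entry = (shader, actual_key)
        if entry not in self._programs:
            self.recompiles += 1
            self._programs[entry] = self._compile(shader, actual_key)
        return self._programs[entry]
```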

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-27 14:23:40 -07:00
Kenneth Graunke
88b3850c27 i965: Set swizzle fields in the VS precompile program key.
This fixes a regression since 76d1301e8e:
I began setting SWIZZLE_XYZW for unused sampler units in the actual
program keys, since this matched the FS precompile behavior.  However,
the VS precompile was expecting zero, so that commit made essentially
every vertex shader (even those not using texturing) mismatch and need
to be recompiled.

Setting them in the VS precompile key solves the issue.  It also is an
improvement over our old behavior: previously we guessed that vertex
shaders didn't use any textures at all.  Now we actually look to see if
the VS had any sampler uniforms and guess based on that.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-27 14:23:40 -07:00
Kenneth Graunke
c20cb8d1f6 i965/vs: Add VS program key dumping to INTEL_DEBUG=perf.
Eric added support for WM key debugging.  This adds it for the VS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-27 14:23:40 -07:00
Kenneth Graunke
85b24b0751 i965/fs: Assume shadow sampler swizzling is <X, X, X, 1>.
Our previous assumption, SWIZZLE_XYZW, was completely bogus for depth
textures.  There are no Y, Z, or W components.

DEPTH_TEXTURE_MODE has three options:
- GL_LUMINANCE: <X, X, X, 1>
- GL_INTENSITY: <X, X, X, X>
- GL_ALPHA:     <0, 0, 0, X>

The default value is GL_LUMINANCE, and most applications don't seem to
alter DEPTH_TEXTURE_MODE.  Make that our precompile guess.
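
The three modes listed above map a single depth value X to a four-component result. A small Python sketch of that mapping, for illustration only; the helper apply_depth_swizzle is hypothetical, and the enum values are the standard GL ones.

```python
# Standard GL enum values.
GL_ALPHA = 0x1906
GL_LUMINANCE = 0x1909
GL_INTENSITY = 0x8049

# Swizzle per DEPTH_TEXTURE_MODE: "X" is the depth value, "0"/"1" constants.
_SWIZZLES = {
    GL_LUMINANCE: ("X", "X", "X", "1"),
    GL_INTENSITY: ("X", "X", "X", "X"),
    GL_ALPHA: ("0", "0", "0", "X"),
}


def apply_depth_swizzle(depth, mode=GL_LUMINANCE):
    """Hypothetical helper: expand a depth sample to (r, g, b, a)."""
    pick = {"X": depth, "0": 0.0, "1": 1.0}
    return tuple(pick[c] for c in _SWIZZLES[mode])
```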

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-27 14:23:40 -07:00
Kenneth Graunke
f3d0daf7ea i965: Index sampler program key data by linker-assigned index.
Now that most things are based on the linker-assigned index, it makes
sense to convert the arrays in the VS/WM program key as well.  It seems
silly to leave them indexed by texture unit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-27 14:23:40 -07:00
Kenneth Graunke
ab17762c70 i965: Only set proj_attrib_mask for fixed function.
brw_wm_prog_key's proj_attrib_mask field is designed to enable an
optimization for fixed-function programs, letting us avoid projecting
attributes where the divisor is 1.0.

However, for shaders, this is not useful, and is pretty much impossible
to guess when building the FS precompile key.  Turning it off for
shaders should allow the precompile to work and not lose much.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Suggested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-27 14:23:40 -07:00
Kenneth Graunke
6cc14c2493 i965: Don't set stats_wm in the WM program key on Gen6+.
It's only needed for Gen4/5 IZ lookup workarounds.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-27 14:23:40 -07:00
Kenneth Graunke
b6b1fc1261 i965: Don't set vp_outputs_written in the WM program key on Gen6+.
It's only used on pre-Sandybridge hardware.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-27 14:23:39 -07:00
Kenneth Graunke
87cdefed40 i965: Double the size of the state cache.
We probably want to do something more sophisticated here, but this at
least makes it through L4D2 without dumping the program cache.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-27 14:23:39 -07:00
Julien Cristau
ac889b2410 glapi/glx: call __glEmptyImage if USE_XCB, not memcpy directly
We were stomping on the caller's buffer by ignoring their alignment
requests and other pixel store modes.  This patch makes the USE_XCB path match
the older one more closely.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52059

Signed-off-by: Julien Cristau <julien.cristau@logilab.fr>
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-08-27 13:32:53 -06:00
Brian Paul
f308c80490 gallium/util: implement tile code for PIPE_FORMAT_Z32_FLOAT
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-27 13:32:53 -06:00
Brian Paul
a971476cc7 st/mesa: use fallback path for glCopyTexSubImage(GL_TEXTURE_1D_ARRAY)
Fixes many failing cases in piglit copyteximage test.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-27 13:32:53 -06:00
Chad Versace
88edbdf9f0 i965: Move hiz resolve to after renderbuffer resizing (v2)
Do all pre-draw hiz resolves *after* the renderbuffers are resized by
intel_prepare_render. Otherwise, we may resolve buffers that are
immediately discarded afterwards.

Fixes the assertion failure below when resizing windows in KDE and under
some unknown circumstance in Chrome OS:
    intel_resolve_map.c:46: intel_resolve_map_set: Assertion
    `(*tail)->need == need' failed.

Also, remove the comment that "resolves must occur [...] before setting up
any hardware state". That was true when resolves were implemented with
meta-ops, but no longer with blorp.

v2:
  - Keep brw_predraw_resolve_buffers in its current position, which is
    before any brw_context bits are modified. Instead, move the call to
    intel_prepare_render.

Note: This is a candidate for the 8.0 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52252
Reported-by: Lu Hua <huax.lu@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-27 07:48:28 -07:00
Chad Versace
a2a7e640a4 i965: Remove redundant null check
intel_renderbuffer_resolve_hiz checks if rb->mt is null, so there is no
need for the caller to do so.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-27 07:47:09 -07:00
Marek Olšák
7f0fcf17c3 r300g: implement TRUNC correctly
This fixes some integer division tests.
2012-08-27 14:35:18 +02:00
Michel Dänzer
f402acdbe2 radeonsi: Use FP16 shader export format when necessary / possible.
Fixes piglit fbo-blending-formats.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-27 11:51:56 +02:00
Michel Dänzer
26c7139d2c radeonsi: Refactor initialization of shader export intrinsic arguments.
In preparation for extending this code, which would make it rather unwieldy in
its current place.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-27 11:51:49 +02:00
Michel Dänzer
d1e40b3d40 radeonsi: Maintain cache of pixel shader variants according to context state.
Mostly inspired by r600g commit 4acf71f01e
('r600g: cache shader variants instead of rebuilding v3').

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-27 11:51:41 +02:00
Michel Dänzer
84fdda280f radeonsi: Drop extraneous semicolons from pm4 state macro definitions.
Could cause build failures if trying to use the macros in certain constructs.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-27 11:50:38 +02:00
Marek Olšák
a3d9d7ec79 r600g: implement compression for MSAA colorbuffers for evergreen
This adds the FMASK and CMASK buffers. They share the same resource
with color data.

COMPRESSION and FAST_CLEAR are always enabled if both FMASK and CMASK are
allocated. We initialize the CMASK to a "compressed" state (not "fast cleared"),
so that we can keep FAST_CLEAR enabled all the time.

Both FMASK and CMASK must be present at the moment. If either one is missing,
the other one is not used.

v2: add cayman regs in the list

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-08-27 04:31:00 +02:00
Marek Olšák
48edfe0505 r600g: cleanup names around depth decompression
for consistency with the upcoming color decompression naming

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-08-27 04:31:00 +02:00
Marek Olšák
3ac54ac2c8 r600g: fix evergreen 8x MSAA sample positions
The original sample positions took samples outside of the pixel boundary,
leading to dark pixels on the edge of the colorbuffer, among other things.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-08-27 04:31:00 +02:00
Marek Olšák
1cfec6e2c8 r600g: set CB_TARGET_MASK to 0xf and not 0xff for resolve on evergreen
independent_blend_enable must be true, so that the colormask isn't replicated
in all colorbuffers.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-08-27 04:30:59 +02:00
Marek Olšák
1516a4f353 gallium/u_blitter: initialize sample mask in resolve
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-08-27 04:30:59 +02:00
Tom Stellard
07c71d6ede r300/compiler: Use variable lists in the rename_regs pass 2012-08-26 20:39:49 -04:00
Eric Anholt
7540f25a34 i965: Rewrite the comment describing the query object support.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-26 10:40:33 -07:00
Eric Anholt
f0159018d7 i965/gen6+: Add support for GL_ARB_timer_query.
Needs updated libdrm.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-26 10:40:33 -07:00
Eric Anholt
9a2943ddf2 i965: Add support for GL_ARB_occlusion_query2.
This extension is just a bit of core code on top of the GL_ARB_occlusion_query
support.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-26 10:40:33 -07:00
Eric Anholt
b765119c5d mesa: Add constants for the GL_QUERY_COUNTER_BITS per target.
Drivers need to be able to communicate their actual number of bits populated
in the field in order for applications to be able to properly handle rollover.

There's a small behavior change here: Instead of reporting the
GL_SAMPLES_PASSED bits for GL_ANY_SAMPLES_PASSED (which would also be valid),
just return 1, because more bits don't make any sense.
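
As a rough illustration of why applications need the real bit count, the
sketch below computes the difference between two counter reads modulo
2**bits.  It is Python rather than GL code: counter_delta and the bit widths
used are made up for the example, and the bit count is assumed to come from
a glGetQueryiv(target, GL_QUERY_COUNTER_BITS, ...) query.

```python
def counter_delta(start, end, counter_bits):
    """Return end - start for a counter that wraps at 2**counter_bits."""
    if counter_bits <= 0:
        # Zero bits means the target has no usable counter at all.
        raise ValueError("target reports no usable counter bits")
    mask = (1 << counter_bits) - 1
    # Masking the difference undoes a single wrap between the two reads.
    return (end - start) & mask


# A 32-bit counter that wrapped between the two reads.
wrapped = counter_delta(0xFFFFFFF0, 0x00000010, 32)
# A wider counter that did not wrap.
plain = counter_delta(5, 9, 36)
```

If the driver over-reported the width, the mask would be too wide and a
wrapped counter would yield a huge bogus delta instead of a small one.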

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-26 10:40:28 -07:00
Eric Anholt
6754ec831e i965: Fix accumulator_contains() test to also reject swizzles of the dst.
When faced with this sequence:

	MOV	R1, c[1];
	MAD	R0, R2, R1.x, R1.y;

we were concluding that the MOV of R1 set up our accumulator and so we could
just use the previous result.  Only, it's got R1.xyzw in it instead of the
r1.y we're looking for.
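
A toy model of the check, in Python rather than the driver's C: the Operand
tuple and both function names are hypothetical, and the real test looks at
more than is shown here.  It only illustrates why comparing the register
alone is not enough.

```python
from collections import namedtuple

# Toy operand: register file, register number, 4-component swizzle.
Operand = namedtuple("Operand", "file index swizzle")
NOOP_SWIZZLE = "xyzw"


def accumulator_contains_old(prev_dst, val):
    """Old behaviour: match on the register only, ignoring the swizzle."""
    return (prev_dst.file, prev_dst.index) == (val.file, val.index)


def accumulator_contains_new(prev_dst, val):
    """Also require that the value is read without rearranging channels."""
    return accumulator_contains_old(prev_dst, val) and val.swizzle == NOOP_SWIZZLE


mov_dst = Operand("GRF", 1, "xyzw")   # MOV R1, c[1]
mad_src = Operand("GRF", 1, "yyyy")   # the R1.y operand of the MAD
```

With these definitions the old check accepts mad_src against mov_dst and the
new one rejects it, which matches the sequence above.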

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46784
NOTE: This is a candidate for the 8.0 branch.
2012-08-26 09:58:40 -07:00
Jakob Bornecrantz
33ee019422 st/dri: Support width and height getters
Tested-by: Scott Moreau <oreaus@gmail.com>
Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
2012-08-26 15:40:18 +02:00
Jakob Bornecrantz
15effe1fab st/dri: Claim to support validate_usage
Support version 3 as well as 2, since that is only the new format query,
which Jesse added support for to st/dri when he added it to dri_interface.h.

Tested-by: Scott Moreau <oreaus@gmail.com>
Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
2012-08-26 15:40:10 +02:00
Jakob Bornecrantz
93ebec87ed dri: Make query image WIDTH and HEIGHT be version 4
Tested-by: Scott Moreau <oreaus@gmail.com>
Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
2012-08-26 15:39:50 +02:00
Jakob Bornecrantz
6bb71b8cbe dri: Remove image write function
Since it's not used by anything anymore and no release has gone out
where it was being used.

Tested-by: Scott Moreau <oreaus@gmail.com>
Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
2012-08-26 15:39:41 +02:00
Jakob Bornecrantz
a669a5055e gbm: Use libkms to replace DRI cursor images
Uses libkms instead of the DRI image cursor. Since this is the only user of
the DRI cursor and write interface, we can remove cursor surfaces entirely
from the DRI interface and, as a consequence, from the Gallium interface as
well. To make everybody happy with this we should probably add a
kms_bo_write function, but that is probably wise anyway.

The only downside is that it adds a dependency on libkms; this could,
however, be replaced with the dumb_bo drm ioctl interface.

Tested-by: Scott Moreau <oreaus@gmail.com>
Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
2012-08-26 15:39:23 +02:00
Kenneth Graunke
a3685544e1 i965: Don't set iz_lookup in the FS precompile's program key on Gen6+.
We already changed the actual program key builder to only set these bits
on gen < 6; this patch just brings the precompile state back in line so
it doesn't mismatch every time.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-25 23:05:35 -07:00
Kenneth Graunke
98211d5af7 i965/fs: Fix INTEL_DEBUG=perf program key printing.
When dumping differences in program keys, it printed messages of the
format:

   [Name of thing that changed]  [new]->[old]

This was terribly confusing: the right arrow implies "the value changed
from this to that", when in fact the message conveyed the opposite.

Except that some of the time, it didn't, since we accidentally swapped
the arguments to brw_debug_recompile_sampler_key.  With two swaps, it
would often come out in the expected format.

This patch fixes it to properly print:

   [Name of thing that changed]  [old]->[new]
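
For illustration only, a Python sketch of a key differ that prints in the
corrected order.  describe_key_changes and the field names are invented for
the example; the driver's own debug code is C and is not reproduced here.

```python
def describe_key_changes(old_key, new_key):
    """List every field whose value differs, formatted as old->new."""
    lines = []
    for name in sorted(old_key):
        old, new = old_key[name], new_key[name]
        if old != new:
            # The old value goes on the left so the arrow reads naturally.
            lines.append(f"  {name}  {old}->{new}")
    return lines


report = describe_key_changes(
    {"field_a": 0, "field_b": 1},
    {"field_a": 1, "field_b": 1},
)
```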

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-25 23:01:50 -07:00
Kenneth Graunke
174d44a9c4 mesa: Use a new, more specific hook for shader uniform changes.
Gallium drivers and i965 don't require special notification when
sampler uniforms change.  They simply see the _NEW_TEXTURE and adjust
their indirection tables.  These drivers don't want ProgramStringNotify:
it simply causes pointless recompiles.

Unfortunately, i915 still requires shader recompiles and needs
ProgramStringNotify.  Rather than trying to fix that, simply change the
hook to a new, more specific one: ShaderUniformChange.  On i915, this
translates to ProgramStringNotify; others simply ignore it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-25 12:01:10 -07:00
Kenneth Graunke
85e8e9e000 i965: Use linker-assigned sampler IDs in instruction encoding.
When assigning uniform locations, the linker assigns each sampler
uniform a sequential numerical ID.  gl_shader_program::SamplerUnits maps
these sampler variable IDs to the actual texture units they reference
(specified via glUniform1i).

Previously, we encoded this mapping in the SEND instruction encoding:
the "sampler" was the texture unit number, and the binding table index
was SURF_INDEX_TEXTURE(the texture unit number).  This unfortunately
meant that whenever the application changed the value of a sampler
uniform, we had to recompile the shader to change the SEND instructions.

This was horrible for the game Cogs, which repeatedly switches between
using texture unit 0 and 1.  It also made fragment shader precompiles
useless: we'd do the precompile at glLinkProgram() time, before the
application called glUniform1i to set the sampler values.  As soon as
it did that, we'd have to recompile, wasting time and space in the
program cache.

This patch encodes the SamplerUnits indirection in the binding table,
sampler state, and sampler default color tables.  Instead of baking the
texture unit number into the shader, we bake in the sampler variable ID
assigned by the linker.  Since those never change, we don't need to
recompile programs on uniform changes.

This does mean that the tables now depend on the linked shader program
being used for rendering, rather than simply representing all available
texture units.  This could cause an increase in state emission.

Another plus is that the sampler state and sampler default color tables
are now compact: we only emit as many entries as there are sampler
uniforms, with no holes in the table since the new sampler IDs are
sequential.  Previously we had to emit a full 16 entries every time,
since the tables tracked the state of all active texture units.
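
The indirection can be modelled in a few lines.  This is a Python toy, not
driver code: every function name is hypothetical, "compiling" just builds a
tuple, and sampler_units stands in for gl_shader_program::SamplerUnits.

```python
def compile_shader_old(sampler_units):
    """Old scheme: texture unit numbers are baked into the compiled code."""
    return tuple(("SEND", unit) for unit in sampler_units)


def compile_shader_new(sampler_units):
    """New scheme: only the linker-assigned sampler IDs are baked in."""
    return tuple(("SEND", sampler_id) for sampler_id in range(len(sampler_units)))


def build_binding_table(sampler_units, textures_by_unit):
    """Per-draw state: entry i holds the texture for sampler ID i."""
    return [textures_by_unit[unit] for unit in sampler_units]


textures_by_unit = {0: "tex_a", 1: "tex_b"}

units_before = [0]   # application called glUniform1i(loc, 0)
units_after = [1]    # ... and later glUniform1i(loc, 1)

old_changed = compile_shader_old(units_before) != compile_shader_old(units_after)
new_changed = compile_shader_new(units_before) != compile_shader_new(units_after)
table_before = build_binding_table(units_before, textures_by_unit)
table_after = build_binding_table(units_after, textures_by_unit)
```

In the model only the table changes when the uniform changes, and the table
has one entry per sampler uniform rather than one per texture unit.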

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-25 12:01:10 -07:00
Kenneth Graunke
2faa592e7f i965: Add a "sampler state index" parameter to update_sampler_state().
This represents the index into the sampler state table or sampler
default color table (the two are identical).

Right now, this is still the texture unit, but that will change shortly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-25 12:01:10 -07:00
Kenneth Graunke
28fab4295e i965: Un-hardcode WM binding table from update_texture_surface.
Currently, we mirror the VS and WM binding tables' texture entries.
That may not continue to be true, so in preparation, pass in the binding
table and surface index as arguments.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-25 12:01:10 -07:00
Kenneth Graunke
96a22f3583 i965/vs: Rename "sampler" to "texunit" in texturing code.
The number we're passing around is actually the ID of the texture unit,
as opposed to the numerical value of our sampler uniforms.  Calling it
"texunit" clarifies this slightly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-25 12:01:09 -07:00
Kenneth Graunke
0ad2dce24a i965/fs: Rename "sampler" to "texunit" in texturing code.
The number we're passing around is actually the ID of the texture unit,
as opposed to the numerical value of our sampler uniforms.  Calling it
"texunit" clarifies this slightly.

Don't bother renaming fs_instruction::sampler.  Although it's currently
the texture unit, this series will change that.  No need for the churn.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-25 12:01:09 -07:00
Kenneth Graunke
bf0308d8d6 i965/fs: Remove unused 'sampler' parameter in emit_texture_genX().
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-25 12:01:09 -07:00
Kenneth Graunke
76d1301e8e i965: Set SWIZZLE_NOOP for unused texture units in the program keys.
Previously, we left the swizzle key field as zero for unused texture
units.  The precompile sets all of them to SWIZZLE_NOOP, which meant
that we mismatched almost every time.

Since either works equally well, change it to SWIZZLE_NOOP to match
the precompiles.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-25 12:01:09 -07:00
Kenneth Graunke
f510dd5d60 i965: Remove four and a half year old TODO comments about samplers.
I can't actually understand what these mean, and they seem to
essentially say "we should simplify things", which is a nice goal but
not very specific.

Presumably things got cleaned up at some point.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-25 12:01:09 -07:00
Kenneth Graunke
d1447f5bc9 i965: Fix brw_link_shader to return false rather than NULL.
Fixes brw_shader.cpp:101:9: warning: converting to non-pointer type
'GLboolean {aka unsigned char}' from NULL [-Wconversion-null]

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-with-great-enthusiasm-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-25 12:01:09 -07:00
Ian Romanick
f9767dac9a mesa/es: Validate glGetBufferParameteriv pname in Mesa code rather than the ES wrapper
v2: Add proper core-profile and GLES3 filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-24 19:15:20 -07:00
Ian Romanick
93d109645a mesa/es: Validate glMapBuffer access in Mesa code rather than the ES wrapper
v2: Add proper core-profile and GLES3 filtering.

v3: *Really* add proper core-profile and GLES3 filtering based on review
feedback from Eric Anholt.  It looks like previously there was some
rebase / merge fail.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-24 19:13:18 -07:00
Ian Romanick
bd4e5dd355 mesa/es: Validate glBufferData usage in Mesa code rather than the ES wrapper
v2: Add proper core-profile and GLES3 filtering based on review feedback
from Eric Anholt.  It looks like previously there was some rebase /
merge fail.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-24 19:13:18 -07:00
Ian Romanick
b0b6b76d52 mesa/es: Validate buffer object targets in Mesa code rather than the ES wrapper
v2: Add proper core-profile and GLES3 filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-24 19:13:18 -07:00
Ian Romanick
e2cf14d7b2 mesa/es: Validate VertexPointer types in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-24 19:13:18 -07:00
Ian Romanick
ef723ecce4 mesa/es: Remove redundant vertex pointer size validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-24 19:13:18 -07:00
Ian Romanick
a8f475d8f6 mesa/es: Validate TexCoordPointer size in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-24 19:13:18 -07:00
Ian Romanick
c3e9a207d0 mesa/es: Validate TexCoordPointer types in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-24 19:13:18 -07:00
Ian Romanick
e5ef0cbe0e mesa/es: Validate NormalPointer types in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-24 19:13:18 -07:00
Ian Romanick
fb8218508a mesa/es: Validate ColorPointer size in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-24 19:13:17 -07:00
Ian Romanick
07ccfef8d1 mesa/es: Validate ColorPointer types in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-24 19:13:17 -07:00
Ian Romanick
28ee443d7b mesa/es: Remove redundant vertex attrib pointer type validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-24 19:13:17 -07:00
Ian Romanick
ae633d0b2e mesa/es: Remove redundant vertex attrib pointer size validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-24 19:13:17 -07:00
Ian Romanick
946ddec163 mesa/es: Disallow BGRA vertex arrays in ES or ES2 contexts
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-24 19:13:17 -07:00
Ian Romanick
bbceed268e mesa: Rearrange array type checking, filter more types in ES
v2: Fix handling of GL_INT and GL_UNSIGNED_INT types pre-ES3.0, and fix
handling of GL_INT_2_10_10_10_REV and GL_UNSIGNED_INT_2_10_10_10_REV in
ES3.0.  Based on review comments by Ken Graunke.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-24 19:13:17 -07:00
Ian Romanick
a33f360e8f mesa: Refactor element type checking into its own function
This consolidates the tests and makes the emitted error message
consistent.

v2: Rename _mesa_valid_element_type to valid_elements_type.  Log the
enum string instead of the hex value in error messages.  Based on review
comments from Brian Paul and Ken Graunke.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-24 19:13:12 -07:00
Brian Paul
229868edf7 wgl: update some comments 2012-08-24 14:09:03 -06:00
Brian Paul
4b7c0938e4 st/mesa: don't do (generic) compression of 1D or 1D_ARRAY textures
As with the previous commit for core Mesa.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-08-24 14:09:03 -06:00
Brian Paul
a3af27e993 mesa: add generic compressed -> uncompressed format helper
_mesa_generic_compressed_format_to_uncompressed_format() probably wins the
prize for longest function name in Mesa.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-08-24 14:09:03 -06:00
Brian Paul
13d0bb21a9 mesa: don't try (generic) compression of 1D and 1D_ARRAY textures
See comments in the code for details.

Note: we only need to special-case the generic compressed formats since
specific texture formats are error-checked earlier to see if the compression
format is compatible with the texture type.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-08-24 14:09:03 -06:00
Brian Paul
d47a6ada9c mesa: add texture target field to ChooseTextureFormat() driver hook
This will let us choose the actual hardware format depending on the
type of texture.

v2: fixup radeon, nouveau, intel and swrast drivers too

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-24 14:08:57 -06:00
Brian Paul
ba7218061b xlib: remove texture compression hackery
I think this was left-over debug code from long ago.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-24 13:15:27 -06:00
Brian Paul
09fafd3b85 st/mesa: clean up use of 'target' variable in st_context_teximage()
'target' was used both as a parameter of type st_texture_type and then
re-used for GL_TEXTURE_x targets.  Rename the function parameter and
add a new local 'GLenum target'.

And remove an extraneous break statement.
2012-08-24 13:15:27 -06:00
Matt Turner
261719b21c automake: convert vgapi 2012-08-24 11:08:19 -07:00
Matt Turner
ba4a36d8cd build: Check for bison-generated file before bailing because of no bison
.y/.c was a typo.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-24 11:08:19 -07:00
Matt Turner
179d8aa331 Move _mesa_dl* functions into dlopen.h and inline them
No point in having an extra function call for inlinable functions.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2012-08-24 11:08:19 -07:00
Tapani Pälli
57c57df7b4 mesa/dlopen: use HAVE_DLOPEN instead of _GNU_SOURCE
This patch changes Mesa to use 'HAVE_DLOPEN', defined by configure and
Android.mk, instead of _GNU_SOURCE for detecting dlopen capability. This
makes dlopen also work on Android, where _GNU_SOURCE is not defined.

[mattst88] v2: HAVE_DLOPEN is sufficient for including dlfcn.h, remove
	       mingw/blrts checks around dlfcn.h inclusion.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
2012-08-24 11:08:19 -07:00
Matt Turner
df4dccc7a9 build: Only add links to .so files if we're building them
Xlib-GLX and OSMesa support static building.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=53962
2012-08-24 11:08:19 -07:00
Matt Turner
c56b57f4a1 build: Add libOSMesa.so.$(VERSION) link to libdir 2012-08-24 11:08:19 -07:00
Matt Turner
a8fd8cb9e7 build: Replace OSMESA_VERSION with generic VERSION_NUMBER
Can be used by other modules.
2012-08-24 11:08:19 -07:00
Matt Turner
383a70bf9a build: Order AC_CONFIG_FILES list
Makefiles before .pc files before directories. Alphabetize files of the
same type.
2012-08-24 11:08:19 -07:00
Matt Turner
8cdce6c136 build: Only build libmesa.la when needed
Namely, for Xlib-GLX, OSMesa, or test programs.
2012-08-24 11:08:19 -07:00
Matt Turner
00f3d9b11a build: Remove duplicate DRI automake conditionals 2012-08-24 11:08:19 -07:00
Matt Turner
d23b1b7977 build: Remove GLU_DIRS 2012-08-24 11:08:19 -07:00
Matt Turner
0abb26ebff build: Only generate dispatch assembly code that will be built 2012-08-24 11:08:19 -07:00
Paul Berry
5133bd6585 i965: don't clear resolve map when doing fast depth clears.
Previously, when performing a fast depth clear, we would also clear
the miptree's resolve map.  This destroyed important information,
since the resolve map contains information about needed resolves for
all levels and layers of the miptree, whereas a depth clear only
applies to a single level/layer combination at a time.  As a result,
resolves would sometimes fail to occur, leading to incorrect
rendering.

Fixes rendering artifacts with shadow maps in Unigine Heaven and
Unigine Sanctuary.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50270

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-24 09:59:27 -07:00
Paul Berry
4b8b6f385e i965/HiZ: remove assertion from intel_resolve_map_set().
There are three possible resolve map states for each (level, layer) of
a depth miptree: "needs HiZ resolve", "needs depth resolve", and
"needs neither".  When HiZ was first implemented on i965, any attempt
to directly transition between "needs HiZ resolve" and "needs depth
resolve" without passing through the "needs neither" state would have
been a bug indicating that a necessary resolve hadn't been performed.
Accordingly, intel_resolve_map_set() contained an assertion to verify
that no such direct transition happened.

However, now that we support fast depth clears, there is a valid
transition from the "needs HiZ resolve" to the "needs depth resolve"
state.  When doing a fast depth clear, the old state of the buffer is
irrelevant, since we are completely replacing it with the clear value,
so it is not necessary to do any resolves before clearing--we can
transition, if necessary, directly from the "needs HiZ resolve" state
to the "needs depth resolve" state.

To avoid spurious assertions in this valid case, this patch just
removes the assertion.
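
A small model of the three states, written in Python with invented names
(Need, ResolveMap, fast_depth_clear); the real resolve map is a C data
structure and is not reproduced here.  The point is only that set() accepts
any previous state and that a clear touches a single (level, layer) slice.

```python
from enum import Enum


class Need(Enum):
    NEITHER = 0
    HIZ_RESOLVE = 1
    DEPTH_RESOLVE = 2


class ResolveMap:
    def __init__(self):
        self._needs = {}  # (level, layer) -> Need

    def get(self, level, layer):
        return self._needs.get((level, layer), Need.NEITHER)

    def set(self, level, layer, need):
        # No assertion on the previous state: a direct switch from
        # HIZ_RESOLVE to DEPTH_RESOLVE is allowed.
        if need is Need.NEITHER:
            self._needs.pop((level, layer), None)
        else:
            self._needs[(level, layer)] = need

    def fast_depth_clear(self, level, layer):
        # The old contents are being replaced, so no resolve is needed
        # first; only this one slice changes state.
        self.set(level, layer, Need.DEPTH_RESOLVE)


rmap = ResolveMap()
rmap.set(0, 0, Need.HIZ_RESOLVE)
rmap.set(1, 0, Need.HIZ_RESOLVE)
rmap.fast_depth_clear(0, 0)
```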

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-24 09:59:27 -07:00
Christian König
9aacd5cc67 radeonsi: remove old tiling handling
Just use the functionality provided by the surface manager instead.

This fixes yet another bunch of piglit tests.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-08-24 18:11:31 +02:00
Ian Romanick
86f29cf7d0 mesa/es: Validate glCreateShader targets in Mesa code rather than the ES wrapper
v2: Add proper core-profile filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-24 09:06:31 -07:00
Ian Romanick
b042f7a1ff mesa/es: Validate glGetProgramiv pnames in Mesa code rather than the ES wrapper
v2: Add proper core-profile and GLES3 filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-24 09:06:31 -07:00
Ian Romanick
1a200b68cd mesa: Filter glGetProgramiv pnames based on available extensions
Previously you could always glGetProgramiv one of the transform feedback
or geometry shader enums even if the extension wasn't supported.

In addition, this reverts part of bda6ad27.  I think the hunks involving
GL_PROGRAM_BINARY_LENGTH_OES were spurious.  Mesa has no support for any
other part of GL_OES_get_program_binary.

v2: Remove redundant return in get_programiv based on review feedback
from Matt Turner.

v3: Correctly handle UBO related enums.

v4: Emit the bad enum in the _mesa_error call based on review feedback
from Brian Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-24 09:06:31 -07:00
Brian Paul
9282ebbaa5 swrast: implement cubical depth texture sampling
Fixes a few more failures in the piglit copyteximage test.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-24 09:38:44 -06:00
Blaž Tomažič
87280d56a3 clover: Accept CL_MEM_READ_WRITE flag
Fix API functions for memory objects to accept CL_MEM_READ_WRITE flag.

Signed-off-by: Blaž Tomažič <blaz.tomazic@gmail.com>
[ Francisco Jerez: Drop incorrect change in clCreateSubBuffer. ]
2012-08-24 17:10:14 +02:00
Tom Stellard
167ecf5ba3 radeon/llvm: Cleanup R600Instructions.td 2012-08-24 14:14:55 +00:00
Brian Paul
388af5b6f4 main: fix ES compile breakage 2012-08-24 06:40:06 -06:00
Brian Paul
4fec5e9154 mesa/swrast: fix GL_TEXTURE_2D_ARRAY texture fetches for dxt formats
As with the previous commit.

This fixes the last crash in the piglit copyteximage test but there
are still some failures.
2012-08-24 06:18:42 -06:00
Brian Paul
d78b44c265 mesa/swrast: fix GL_TEXTURE_2D_ARRAY texture fetches for latc/rgtc formats
Fix-up the texel fetch functions so that they handle 3D coords (as used for
array textures) and remove the "f_2d" part from their names.

Helps fix swrast crashes in piglit's copyteximage test.  More to come.
2012-08-24 06:18:41 -06:00
Brian Paul
fe2cc65fbb mesa: code movement in teximage.c
To get rid of a forward declaration.
2012-08-24 06:18:41 -06:00
Brian Paul
bdff1dfb39 mesa: consolidate glTexImage and glCompressedTexImage code
There was a lot of similar or duplicated code before.
To minimize this patch's size, use a forward declaration for
compressed_texture_error_check().  Move the function in the next patch.
2012-08-24 06:18:41 -06:00
Brian Paul
e93cb4b34f mesa: make glTexImage, glCompressedTexImage proxy code more alike
Next up, we can combine the teximage() and compressed_teximage() functions.
2012-08-24 06:18:41 -06:00
Brian Paul
c1a9e6010b mesa: rename texpal.[ch] to texcompress_cpal.[ch]
To be consistent with other files related to texture compression.
2012-08-24 06:18:41 -06:00
Brian Paul
aab06dc0f0 mesa: s/GLuint/gl_format/ in _mesa_compressed_format_to_glenum()
No real change here, just use the right type.
2012-08-24 06:18:41 -06:00
Brian Paul
46751edca9 mesa: new _mesa_num_tex_faces() helper
Not a real big help now, but will be useful for the
GL_ARB_texture_cube_map_array extension in the future.
2012-08-24 06:18:41 -06:00
Brian Paul
8a935d71ff mesa: make _mesa_get_proxy_tex_image() static
It's not used by any other file.
2012-08-24 06:18:41 -06:00
Brian Paul
637a79aa23 mesa: don't clear proxy image fields when regular GL error is generated
If a proxy texture call generates a regular GL error, we should not
clear the proxy image's width/height/depth/format fields.  Use a new
PROXY_ERROR token to distinguish proxy errors from regular GL errors.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-08-24 06:18:41 -06:00
Brian Paul
1f5b1f9846 mesa: fix glTexImage proxy texture error generation
When calling glTexImage() with a proxy target most error conditions should
generate a GL error.  We were erroneously doing the proxy-error behaviour
(where we zeroed-out the image's width/height/depth/format fields) in too
many places.

There's another issue with proxy textures, but that'll be fixed in the
next patch.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-08-24 06:18:41 -06:00
José Fonseca
3e3f99277d draw: Fix regression in draw_set_sampler(_views).
draw->samplers(_views) now has PIPE_SHADER_TYPES elements, instead of
PIPE_MAX_SAMPLERS as before.

Also, shader_stage must be less than PIPE_SHADER_TYPES to prevent buffer
overflow.

Trivial.
2012-08-24 11:28:00 +01:00
Vadim Girlin
e84d45fdb7 build: don't leave git_sha1.h.tmp after build/install
Fixes "`main/git_sha1.h.tmp': Permission denied" build error.
See https://bugs.freedesktop.org/show_bug.cgi?id=52064

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-24 11:16:14 +04:00
Tom Stellard
1434a86f50 radeon/llvm: Set End of Program bit on RAT instructions
This code was accidentally dropped during the MCCodeEmitter conversion.
2012-08-23 21:54:32 +00:00
Tom Stellard
1bd7b29a66 radeon/llvm: Use correct instruction for moving immediates
This should fix an assertion failure that was happening in some compute
shaders.
2012-08-23 21:54:32 +00:00
Tom Stellard
2ad8608cb3 radeon/llvm: Fix some coding style issues 2012-08-23 21:54:32 +00:00
Tom Stellard
228a6641cc radeon/llvm: Pull changes from external version of the backend 2012-08-23 21:54:32 +00:00
Tom Stellard
5a1edb8655 radeon/llvm: Simplify the convert to ISA pass 2012-08-23 21:54:32 +00:00
Tom Stellard
cb5227b403 radeon/llvm: Make sure to use the Text section in the AsmPrinter 2012-08-23 21:54:31 +00:00
Matt Turner
68a2c510a6 build: Fix installation of GLES2 headers
Reported-by: U. Artie Eoff <ullysses.a.eoff@intel.com>
Tested-by: U. Artie Eoff <ullysses.a.eoff@intel.com>
2012-08-23 14:07:35 -07:00
Matt Turner
fc9ea7c74d build: Fix GLES linkage with libglapi
Reported-by: Ian Romanick <idr@freedesktop.org>
2012-08-23 14:07:35 -07:00
Anuj Phogat
e592f7df03 i965/msaa: Add sample-alpha-to-coverage support for multiple render targets
The Render Target Write message should include the source zero alpha value
when sample-alpha-to-coverage is enabled for an FBO with multiple render
targets. The source zero alpha value is used as fragment coverage for all
the render targets.

This patch makes the piglit tests draw-buffers-alpha-to-coverage and
alpha-to-coverage-no-draw-buffer-zero pass on Sandybridge. No
regressions are observed with piglit all.tests.

V2: Revert all the changes made in the emit_color_write() function to
include src0 alpha for targets > 0. This case is now handled in an if
block.

V3: Correctly calculate the instruction length for buffer zero.
Properly handle the case of dual_src_blend when alpha-to-coverage
is enabled.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-08-23 13:30:54 -07:00
Stéphane Marchesin
ff996cafce glsl/linker: Avoid buffer over-run in parcel_out_uniform_storage::visit_field
When too many uniforms are used, the error will be caught in
check_resources (src/glsl/linker.cpp).

NOTE: This is a candidate for the 8.0 branch.

Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Benoit Jacob <bjacob@mozilla.com>
2012-08-23 11:42:19 -07:00
Ian Romanick
9b028faeaa mesa/es: Validate glCompressedTexSubImage internalFormat in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:31 -07:00
Ian Romanick
dd0eb00487 mesa/es: Validate glCompressedTexImage internalFormat in Mesa code rather than the ES wrapper
v2: Add proper core-profile filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:31 -07:00
Ian Romanick
c11096e94a mesa/es: Validate glCopyTexImage internalFormat in Mesa code rather than the ES wrapper
v2: Add GLES3 filtering.  I'm not 100% sure this is correct.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:31 -07:00
Ian Romanick
9848e86af0 mesa/es: Validate glTexSubImage format and type in Mesa code rather than the ES wrapper
v2: Add proper GLES3 filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:31 -07:00
Ian Romanick
409620e477 mesa/es: Validate glTexImage format, type, and internalFormat in Mesa code rather than the ES wrapper
v2: Add proper GLES3 filtering.

v3: Collapse ALPHA, LUMINANCE, and LUMINANCE_ALPHA cases per review
comment from Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:31 -07:00
Ian Romanick
0686ccac95 mesa/es: Validate glTexImage border in Mesa code rather than the ES wrapper
Also validate glCopyTexImage border.  This fixes a bug in the APIspec.
Previously glTexImage3DOES could be passed a non-zero border without error.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:31 -07:00
Ian Romanick
59d965333c mesa: Generate an error when glCopyTexImage border is invalid
NOTE: This is a candidate for stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:30 -07:00
Ian Romanick
2dcb40bb44 mesa/es: Add support for GL_APPLE_texture_max_level
This is desktop OpenGL functionality that has always existed.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:30 -07:00
Ian Romanick
c9689e3e55 mesa/es: Validate glGetTexParameter pnames in Mesa code rather than the ES wrapper
This also adds a missing extension (and API) check around
GL_TEXTURE_CROP_RECT_OES.

v2: Add proper core-profile and GLES3 filtering.  GL_TEXTURE_MAX_LEVEL
is (incorrectly) accepted in ES contexts.  A future patch will add
GL_APPLE_texture_max_level, and meta really needs this.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:30 -07:00
Ian Romanick
b3dd524a10 mesa/es: Validate glTexParameter pnames in Mesa code rather than the ES wrapper
This also adds a missing extension (and API) check around
GL_TEXTURE_CROP_RECT_OES.

v2: Add proper core-profile, GLES1, and GLES3 filtering.  GL_TEXTURE_MAX_LEVEL
is (incorrectly) accepted in ES contexts.  A future patch will add
GL_APPLE_texture_max_level, and meta really needs this.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:30 -07:00
Ian Romanick
4269cace79 mesa/es: Remove redundant glBindTexture target validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:30 -07:00
Ian Romanick
3f7c8364cf mesa: Filter glBindTexture targets based on supported features.
Fixed the piglit test arb_texture_buffer_object-negative-unsupported.

NOTE: This is a candidate for stable release branches.

v2: Add proper core-profile and GLES3 filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:30 -07:00
Ian Romanick
530c9d764b mesa/es: Validate tex image targets in Mesa code rather than the ES wrapper
This should take care of all the TexImage, TexSubImage, CopyTexImage,
CompressedTexImage3DOES, and CopyTexSubImage type paths.

v2: Add proper core-profile and GLES3 filtering.

v3: Squash the CompressedTexImage3DOES patch per review comment from
Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:30 -07:00
Ian Romanick
ea9b212fca mesa/es: Validate EGLImageTargetTexture2DOES target in Mesa code rather than the ES wrapper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:30 -07:00
Ian Romanick
a0595cb450 mesa/es: Validate glTexParameter targets in Mesa code rather than the ES wrapper
Ditto for glGetTexParameter targets.

v2: Add proper core-profile and GLES3 filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:30 -07:00
Ian Romanick
842efb9447 mesa/es: Validate GL_TEXTURE_WRAP param in Mesa code rather than the ES wrapper
v2: Add proper core-profile filtering.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:29 -07:00
Ian Romanick
d53101a9f3 mesa: Refactor validate_texture_wrap_mode to use a switch-statement
This makes the next couple changes a little easier.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-23 10:15:29 -07:00
Ian Romanick
2abf555496 meta: Don't modify GL_GENERATE_MIPMAP state when it doesn't exist
This is a bit of a hack.  _mesa_meta_GenerateMipmap shouldn't even be
used in contexts where GL_GENERATE_MIPMAP doesn't exist (i.e., core
profile and ES2) because it uses fixed-function, and fixed-function
doesn't exist there either!

A GLSL-based _mesa_meta_GenerateMipmap should be available soon.  When
that is available, this patch will be irrelevant and should be reverted.

v2: Change (ctx->API != API_OPENGLES2 && ctx->API != API_OPENGL_CORE) to
(ctx->API == API_OPENGL || ctx->API == API_OPENGLES) based on review
comment from Brian Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-23 10:15:29 -07:00
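The v2 note in the commit above swaps a negative API test for a positive one. A minimal sketch of the two predicates follows, assuming the four API values named in the message are the only ones; the `sk_` names are hypothetical mirrors, not Mesa's definitions.

```c
#include <stdbool.h>

/* Hypothetical mirror of the four API values named in the commit message. */
enum sk_api {
    SK_API_OPENGL,
    SK_API_OPENGLES,
    SK_API_OPENGLES2,
    SK_API_OPENGL_CORE
};

/* v1 form: every API that is not ES2 or core profile. */
bool sk_has_generate_mipmap_v1(enum sk_api api)
{
    return api != SK_API_OPENGLES2 && api != SK_API_OPENGL_CORE;
}

/* v2 form: only the APIs listed as having GL_GENERATE_MIPMAP state. */
bool sk_has_generate_mipmap_v2(enum sk_api api)
{
    return api == SK_API_OPENGL || api == SK_API_OPENGLES;
}
```

With exactly these four values the two forms agree. The positive form stays false for any API value added later, which is one plausible reason to prefer it.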
Tapani Pälli
2ddfca9837 build/glsl: fix android build v2
Commit 77a3efc6b9 broke the Android build, which sets its own value for
GLSL_SRCDIR before including Makefile.sources. This patch moves the
override to after the include; this works because the GLSL_SRCDIR
variable is expanded only later.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
2012-08-23 10:13:38 -07:00
Matt Turner
a6b8b709cd automake: convert es1api 2012-08-23 09:40:06 -07:00
Matt Turner
0f8110cb0c automake: convert es2api 2012-08-23 09:38:32 -07:00
Vadim Girlin
68d6441930 st/dri: pass config options to the state tracker
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2012-08-23 19:57:51 +04:00
Vadim Girlin
a6457c0692 st/mesa: accept and handle configuration options from st/dri
Currently there is a single option - force_glsl_extensions_warn.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2012-08-23 19:57:51 +04:00
Vadim Girlin
44f69fc825 st/dri: add force_glsl_extensions_warn option to dri options
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2012-08-23 19:57:51 +04:00
Vadim Girlin
e7c177ec9e st/dri: use driver name for driconf section lookup
The name is taken from the driver_descriptor, so it will be the same as
expected by driconf utility.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2012-08-23 19:57:51 +04:00
Vadim Girlin
6547733593 swrast: add DRM_DRIVER_DESCRIPTOR to store driver name 2012-08-23 19:57:50 +04:00
Paulo Alcantara
b41f36bde7 egl_dri2: Fix segmentation fault
The segmentation fault occurs when DRI2 is not loaded and the
dri2_setup_screen() function dereferences dri2_dpy->dri2 (which is NULL
at this point).

This patch fixes the segmentation fault by checking that the dri2 pointer
is not NULL before dereferencing it.

Signed-off-by: Paulo Alcantara <pcacjr@profusion.mobi>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-08-23 09:17:23 -06:00
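The guard described in the commit above amounts to testing the extension pointer before reading through it. The sketch below shows that pattern with hypothetical types (`sk_display`, `sk_dri2_extension`); it is not the egl_dri2 code itself.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the display and its optional DRI2 extension. */
struct sk_dri2_extension { int version; };
struct sk_display { const struct sk_dri2_extension *dri2; };

/* Returns the DRI2 extension version, or 0 when DRI2 was never loaded. */
int sk_dri2_version(const struct sk_display *dpy)
{
    if (dpy->dri2 == NULL)
        return 0;               /* nothing to dereference */
    return dpy->dri2->version;
}
```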
Tom Stellard
90bd1d52bb radeon/llvm: Use the MCCodeEmitter for R600 2012-08-23 15:00:48 +00:00
Tom Stellard
235318a578 radeon/llvm: Use the MCCodeEmitter for SI 2012-08-23 15:00:48 +00:00
Tom Stellard
2de24024c1 radeon/llvm: Set 64BitPtr feature bit for SI 2012-08-23 15:00:48 +00:00
Tom Stellard
3f9b6aa0f4 radeon/llvm: Lower RETFLAG DAG Node to S_ENDPGM on SI 2012-08-23 15:00:48 +00:00
Tom Stellard
e30b4644b6 radeon/llvm: Add AsmPrinter 2012-08-23 15:00:48 +00:00
Tom Stellard
e61c54cb6b radeon/llvm: Mark JUMP as a pseudo instruction 2012-08-23 15:00:48 +00:00
Tom Stellard
ead72204f1 radeon/llvm: Remove the last uses of MachineOperand flags 2012-08-23 15:00:47 +00:00
Tom Stellard
67a47a445b radeon/llvm: Add flag operand to some instructions
This new operand replaces the MachineOperand flags in LLVM, which
will be deprecated soon.  Eventually all instructions should have a flag
operand, but for now this operand has only been added to instructions
that need it.
2012-08-23 15:00:47 +00:00
Tom Stellard
3a7a56e7aa radeon/llvm: Encapsulate setting of MachineOperand flags
MachineOperand flags will be removed soon, so it is convenient to
have only one function that modifies them.
2012-08-23 15:00:47 +00:00
Matt Turner
bee2edbf3d build: Link DRI drivers with dricore in case of no direct rendering
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-22 11:08:06 -07:00
Matt Turner
bfd7d6f58b build: Only build libmesagallium.la if building Gallium
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-22 11:08:06 -07:00
Matt Turner
f9786394e5 build: Clean glx Makefile.am
mapi/glapi is already built when make is run in src/glx.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-22 11:08:06 -07:00
Matt Turner
d9b109892d build: Put mapi/shared-glapi in CORE_DIRS
SRC_DIRS was overwritten (visible in the second hunk).

Also don't require mapi/shared-glapi to be built for GLES.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-22 11:08:06 -07:00
Matt Turner
8c9b78aad1 build: Only allow shared-glapi with DRI
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-22 11:08:06 -07:00
Matt Turner
32e8ce6d24 build: Set sensible DRI/X11/OSMesa defaults
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-22 11:08:06 -07:00
Matt Turner
53248e5f95 build: Print whether shared-glapi is enabled
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-22 11:08:06 -07:00
Matt Turner
625651cf81 build/x11: Force usage of C++ linker
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-22 11:08:06 -07:00
Matt Turner
9049b7f0fa build/x11: Don't link against shared-glapi
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-22 11:08:06 -07:00
Matt Turner
be5fe7b320 build: Remove deprecated --with-driver= flag
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-22 11:08:06 -07:00
Christian König
302c66ff81 radeonsi: rework vertex format handling
Prevents piglit's draw-vertices test from hanging the GPU.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-08-22 15:33:54 +02:00
Christian König
07838603b9 radeonsi: fix SPI_PS_INPUT_ENA handling
We need to enable at least one interpolation mode,
otherwise the GPU will hang.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-08-22 15:33:49 +02:00
Vadim Girlin
8d1a9a984f r600g: fix lockups with dual_src_blend v2
Disable blending when dual_src_blend is enabled and number of color exports
in the current fragment shader is less than 2.

Fixes lockups with the
ext_framebuffer_multisample-alpha-to-coverage-dual-src-blend piglit test.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2012-08-22 12:12:22 +04:00
Jakob Bornecrantz
c4610e9f92 st/dri: Add shared usage on buffers created
Tested-by: Scott Moreau <oreaus@gmail.com>
Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
2012-08-22 00:01:28 +02:00
Jakob Bornecrantz
61e95b8a5f gbm: Add shared usage on images created
Tested-by: Scott Moreau <oreaus@gmail.com>
Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
2012-08-22 00:01:28 +02:00
Anuj Phogat
df2c4cbced mesa: Fix generic compressed texture formats' handling in glTexImage/glCopyTexImage
The generic texture formats should be accepted by the <internalformat>
parameter of TexImage1D, TexImage2D, TexImage3D, CopyTexImage1D, and
CopyTexImage2D functions. When the application specifies a generic
format, the driver is free to pick an uncompressed format.

This patch reverts the changes made by the following commit:
commit a36581ccc0
mesa: do more teximage error checking for generic compressed formats

This patch fixes compressed texture format failures in intel oglconform
pxconv-gettex test case:
https://bugs.freedesktop.org/show_bug.cgi?id=47220

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-21 15:00:06 -07:00
Tom Stellard
1cb07bd3b8 radeon/llvm: ExpandSpecialInstrs - Add support for cube instructions 2012-08-21 15:42:44 +00:00
Tom Stellard
6c99f2101f radeon/llvm: ExpandSpecialInstrs - Add support for vector instructions 2012-08-21 15:42:44 +00:00
Tom Stellard
82a5d0c641 radeon/llvm: Add R600ExpandSpecialInstrs pass
This pass expands reduction instructions into a MachineInstrBundle that
contains 4 instructions, one for each instruction slot.
2012-08-21 15:42:44 +00:00
Tom Stellard
0588298575 radeon/llvm: Add helper function for getting sub reg indices 2012-08-21 15:42:44 +00:00
Michel Dänzer
1a25ebe3ce radeonsi: Handle NULL sampler views getting passed in by the state tracker.
Don't dereference NULL pointers, and if all views are NULL, don't generate an
invalid PM4 packet which locks up the GPU.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2012-08-21 15:42:25 +02:00
Ian Romanick
c1114c619a APIspec: Remove cruft about AMD_compressed_???_texture
Mesa doesn't support these extensions, and it seems unlikely that it
ever will.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:34 -07:00
Ian Romanick
4c32ee5bca mesa/es: Remove redundant glFramebufferTexture3D textarget validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:34 -07:00
Ian Romanick
7c9afe50fd mesa/es: Remove redundant glGetShaderiv pname validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:34 -07:00
Ian Romanick
aaef441638 mesa/es: Remove redundant glCompressedTexImage border validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:33 -07:00
Ian Romanick
d39cb8e9ef mesa/es: Remove redundant glPointSizePointer type validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:33 -07:00
Ian Romanick
d54004c352 mesa/es: Remove redundant glGetBufferPointer pname validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:33 -07:00
Ian Romanick
68d7ce3e9e mesa/es: Remove redundant glGetVertexAttribPointer pname validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:33 -07:00
Ian Romanick
5be5cf6934 mesa/es: Remove redundant element type validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:33 -07:00
Ian Romanick
b99a8caff1 mesa/es: Remove redundant glGetShaderPrecisionFormat shader type validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:33 -07:00
Ian Romanick
c914ac239e mesa/es: Remove redundant depth func validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:33 -07:00
Ian Romanick
af276d9d4b mesa/es: Remove redundant stencil op fail/zfail/zpass validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:33 -07:00
Ian Romanick
f3f993153c mesa/es: Remove redundant shade model mode validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:33 -07:00
Ian Romanick
5a193557d1 mesa/es: Remove redundant light pname and light validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:32 -07:00
Ian Romanick
0234410791 mesa/es: Remove redundant hint mode validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:32 -07:00
Ian Romanick
a4251da3b2 mesa/es: Remove redundant separate stencil face validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:32 -07:00
Ian Romanick
9113d0e686 mesa/es: Remove redundant stencil function validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:32 -07:00
Ian Romanick
1087745afe mesa/es: Remove redundant logic op operand validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:32 -07:00
Ian Romanick
bf03589882 mesa/es: Remove redundant alpha function validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:32 -07:00
Ian Romanick
8f55d83569 mesa/es: Remove redundant separate stencil mask face validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:32 -07:00
Ian Romanick
52d57985c6 mesa/es: Remove redundant front-face mode validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:32 -07:00
Ian Romanick
e1dbf56a10 mesa/es: Remove redundant face culling mode validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:32 -07:00
Ian Romanick
66404557db mesa/es: Remove redundant blend equation mode validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:06:31 -07:00
Ian Romanick
e39ea674d0 mesa/es: Remove redundant texture target validation
Mesa doesn't check the parameter passed to glMultiTexCoord*.  It does,
however, mask the texture value to prevent out-of-bounds writes.  This
patch will promote this non-conformant behavior to OpenGL ES 1.  I don't
think anyone will care, and this gets some silly code out of a hot path.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 16:05:11 -07:00
Ian Romanick
386e2f3289 mesa/es: Rearrange placement of GL_TEXTURE_MAX_ANISOTROPY_EXT in APIspec
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 09:52:45 -07:00
Ian Romanick
27e55805fb mesa/es: Remove redundant min/mag filter validation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-20 09:52:27 -07:00
Mathias Fröhlich
926a4a922f radeon-llvm: Start multithreaded before using llvm.
This is required to make some of LLVM's API calls
thread safe, in particular the PassRegistry, which is
implicitly accessed while compiling shader programs.
The PassRegistry uses a mutex that is only active if
llvm_is_multithreaded() returns true.
Calling llvm_start_multithreading() makes this happen,
and by calling this function we try to make sure that
we can safely compile shaders in parallel.
Since there is also a llvm_stop_multithreading() call
in the LLVM API, we cannot guarantee that this does
not get switched off while we are relying on it being
set, but for the simpler use cases this fixes a race in
the radeon LLVM compiler as it stands today.

Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-20 16:27:23 +00:00
archibald
59361d76a5 r600g: Move common compute/3D register init to its own function
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-20 15:35:09 +00:00
Christoph Bumiller
c51f8e2790 nv50/ir/tgsi: handle DP2 in tgsi Instruction srcMask
Solved by Tiziano Bacocco on IRC.
2012-08-18 17:38:56 +02:00
Christoph Bumiller
f3a7be740d nv50/ir/emit: don't forget saturation bit on f32 add immediate
Solved by Maxim Levitsky on IRC.
2012-08-18 17:38:45 +02:00
Tilman Sauerbeck
d0ace4e949 mesa: use #if over #ifdef in the FEATURE_ES1 check to fix a build failure.
mfeatures.h will define FEATURE_ES1 to 0 if it's not defined yet.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53664

Signed-off-by: Brian Paul <brianp@vmware.com>
2012-08-18 07:53:54 -06:00
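The difference between `#ifdef` and `#if` that the commit above relies on can be sketched as follows. `FEATURE_ES1` and its default of 0 come from the commit message; the `SK_` macros and `sk_` functions are hypothetical helpers added only to make the two outcomes observable.

```c
#include <assert.h>

/* What the commit message says mfeatures.h does: default the macro to 0. */
#ifndef FEATURE_ES1
#define FEATURE_ES1 0
#endif

/* #ifdef only asks "is it defined?", so it is taken even for the value 0. */
#ifdef FEATURE_ES1
#define SK_SEEN_BY_IFDEF 1
#else
#define SK_SEEN_BY_IFDEF 0
#endif

/* #if evaluates the value, so a 0 default disables the guarded code. */
#if FEATURE_ES1
#define SK_SEEN_BY_IF 1
#else
#define SK_SEEN_BY_IF 0
#endif

int sk_es1_seen_by_ifdef(void)
{
    return SK_SEEN_BY_IFDEF;
}

int sk_es1_seen_by_if(void)
{
    /* After the default above the macro is always defined. */
    assert(SK_SEEN_BY_IFDEF == 1);
    return SK_SEEN_BY_IF;
}
```

In a build that never sets `FEATURE_ES1`, the `#ifdef` branch is still compiled while the `#if` branch is not, which is how ES1-only code could end up in a build without ES1 support.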
Brian Paul
5b542681dc st/mesa: fix sampler view counting
In the past, when we called pipe::set_sampler_views(n) the drivers set
samplers [n..MAX] to NULL.  We no longer do that.  The state tracker
code was already trying to set unused sampler views to NULL to cover
that case, but the logic was broken and unnoticed until now.  This patch
fixes it.

Strictly speaking, this patch shouldn't be necessary.  Drivers should simply
ignore unused samplers and sampler views.  But some drivers like llvmpipe (and
others?) count those things and they figure into state validation.  That could
be fixed in the future.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=53617

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-08-18 07:40:10 -06:00
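One way to read the fix above: when fewer sampler views are bound than last time, the leftover slots have to be passed explicitly as NULL, so the count handed to the driver must cover both the new and the old range. The sketch below illustrates that reading with hypothetical names (`sk_view_state`, `sk_update_views`, `SK_MAX_VIEWS`); it is an assumption about the logic, not the state tracker's actual code.

```c
#include <stddef.h>

enum { SK_MAX_VIEWS = 16 };  /* illustrative limit, not a Mesa constant */

struct sk_view_state {
    const void *views[SK_MAX_VIEWS];
    unsigned num_views;      /* slots in use after the previous update */
};

/* Copies `new_count` views into the state and NULLs any slots left over
 * from the previous update.  Returns the number of slots the driver should
 * be given: enough to cover the new views and the old ones being cleared. */
unsigned sk_update_views(struct sk_view_state *st,
                         const void **new_views, unsigned new_count)
{
    if (new_count > SK_MAX_VIEWS)
        new_count = SK_MAX_VIEWS;

    unsigned old_count = st->num_views;
    unsigned n = new_count > old_count ? new_count : old_count;

    for (unsigned i = 0; i < n; i++)
        st->views[i] = i < new_count ? new_views[i] : NULL;

    st->num_views = new_count;
    return n;
}
```

Going from three views to one returns 3 on that call, so slots 1 and 2 are cleared explicitly; the next call with one view returns 1.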
Brian Paul
d65eb02537 util: update and fix u_upload_mgr.h comments 2012-08-18 07:39:52 -06:00
Brian Paul
84e5cb37d3 st/mesa: use Elements() instead of hard-coded number
And add a comment about the velems_util_draw[] array.
2012-08-18 07:39:52 -06:00
Brian Paul
1a9e4d5113 mesa: remove unused params, add const qualifiers
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-18 07:39:52 -06:00
Brian Paul
a6af24ee14 mesa: querying GL_TEXTURE_COMPRESSED_IMAGE_SIZE for a buffer obj is illegal
GL_INVALID_OPERATION is to be raised when querying a non-compressed
image/buffer.  Since a buffer object can't have a compressed format this
query always generates an error.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-18 07:39:51 -06:00
Ian Romanick
34472a0d87 mesa/es: Don't generate ES1 type conversion wrappers
These are gradually going to get whittled away and eventually folded into the
source files with the native type functions.

v2: Add (speculative) SConscript changes.  These may be broken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-17 18:12:20 -07:00
Eric Anholt
d707e337f5 i965: Fix bug in the old FS backend's projtex() calculation.
In the old backend, we looked at any FS attribute's proj_attrib_mask bits, not
just texcoords.  Now that we have _mesa_vert_result_to_frag_attrib(), we can
fill in the other FS inputs with correct proj_attrib_mask info.

NOTE: This is a candidate for stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46644
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-17 10:05:34 -07:00
Kenneth Graunke
3df13b32e5 mesa: Support GL_TEXTURE_BUFFER in GetTexLevelParameter[if]v in GL 3.1+.
The OpenGL 3.1 specification explicitly allows this.  Oddly, the
ARB_texture_buffer_object spec's issues section claims this isn't
allowed, but proceeds to explain that the extension simply doesn't edit
the underlying spec to allow it, and thus it didn't appear in the list
of legal texture targets.

Thus, this patch legalizes it only in 3.1+ contexts, but still returns
INVALID_ENUM in earlier contexts that expose ARB_texture_buffer_object.

Unfortunately, the behavior of the call is horrendously undefined.

Fixes oglconform's tbo/negative.textureParams test.

v2: Require desktop OpenGL.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-17 09:14:36 -07:00
Kenneth Graunke
8c37fc1e92 mesa: Split out part of glGetTexLevelParameter into a helper function.
Move the _mesa_GetTexLevelParameter[iv] functions below the helper
function so the prototype is available.

This will be useful in the next commit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-17 09:14:36 -07:00
Kenneth Graunke
58d11524da mesa: Add GL_TEXTURE_CUBE_MAP to _mesa_max_texture_levels(). [v2]
For cube maps, _mesa_generate_mipmap() calls this with
GL_TEXTURE_CUBE_MAP (the gl_texture_object's Target) rather than one
of the faces.  This caused _mesa_max_texture_levels() to return 0, which
resulted in maxLevels == -1 and the next line's assertion to fail.

This function is called from seven places:
- fbobject.c: framebuffer_texture()
- mipmap.c: _mesa_generate_mipmap()
- texgetimage.c:
  - getteximage_error_check()
  - getcompressedteximage_error_check()
- texparam.c: _mesa_GetTexLevelParameteriv()
- texstorage.c: tex_storage_error_check()

All of these (or their callers) now explicitly check for invalid targets
already, so this shouldn't cause invalid targets to slip through.
(Technically _mesa_generate_mipmap() doesn't check for invalid targets,
but the API-facing _mesa_GenerateMipmapEXT() function does.)

+2 oglconforms (float-texture/mipmap.automatic and mipmap.manual)

In addition to fixing the mipmap bug, it should also cause glTexStorage
to accept GL_TEXTURE_CUBE_MAP, which is explicitly allowed by the spec.

v2: Drop alterations to callers; this is now in a patch series that adds
    explicit checking to API functions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-17 09:14:36 -07:00
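The failure mode described in the commit above (an unrecognised target yields 0 levels, so the highest level index becomes -1) can be sketched like this. The enum values are the standard GL ones; the function names and the level limits are hypothetical and chosen only for illustration.

```c
/* Standard GL enum values, renamed to avoid clashing with real headers. */
enum {
    SK_GL_TEXTURE_2D                  = 0x0DE1,
    SK_GL_TEXTURE_CUBE_MAP            = 0x8513,
    SK_GL_TEXTURE_CUBE_MAP_POSITIVE_X = 0x8515,
    SK_GL_TEXTURE_CUBE_MAP_NEGATIVE_Z = 0x851A
};

/* Illustrative limits; the real ones come from the context's constants. */
enum { SK_MAX_2D_LEVELS = 14, SK_MAX_CUBE_LEVELS = 13 };

/* Returns the number of mipmap levels for a target, or 0 if unknown. */
int sk_max_texture_levels(unsigned target)
{
    if (target == SK_GL_TEXTURE_2D)
        return SK_MAX_2D_LEVELS;
    if (target == SK_GL_TEXTURE_CUBE_MAP)   /* the case the commit adds */
        return SK_MAX_CUBE_LEVELS;
    if (target >= SK_GL_TEXTURE_CUBE_MAP_POSITIVE_X &&
        target <= SK_GL_TEXTURE_CUBE_MAP_NEGATIVE_Z)
        return SK_MAX_CUBE_LEVELS;           /* the six cube faces */
    return 0;
}

/* Highest valid level index; -1 signals an unhandled target. */
int sk_max_level_index(unsigned target)
{
    return sk_max_texture_levels(target) - 1;
}
```

Without the `SK_GL_TEXTURE_CUBE_MAP` case, a caller passing the texture object's own target would get -1 here, matching the assertion failure the message describes.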
Kenneth Graunke
9e4fde85e4 mesa: Add explicit target checking to GetTexLevelParameter[if]v().
Previously, it relied on _mesa_max_texture_levels() for texture target
error checking.  This was somewhat dodgy, as _mesa_max_texture_levels()
is called in seven different places, not all of which necessarily accept
the same list of targets.

I copied the list of legal targets from _mesa_max_texture_levels(), so
this patch should not introduce any change in behavior.  Future patches
will cause the two to diverge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-17 09:14:36 -07:00
Kenneth Graunke
63396ce4c0 mesa: Add explicit target checking to Get[Compressed]TexImage().
Previously, they relied on _mesa_max_texture_levels() for texture target
error checking.  This was somewhat dodgy, as _mesa_max_texture_levels()
is called in seven different places, not all of which necessarily accept
the same list of targets.

I copied the list of legal targets from _mesa_max_texture_levels() but
removed the proxy targets, as both functions explicitly rejected those
targets.  This changes the order in which we check errors, which could
change whether we return INVALID_VALUE or INVALID_ENUM.  However, it
shouldn't change the list of accepted targets.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-17 09:14:36 -07:00
Brian Paul
f69273f952 llvmpipe: remove polygon stipple assertion
It's possible for us to have an unused sampler bound when the fragment
shader itself doesn't use any samplers.  So the assertion isn't valid.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=53616
2012-08-17 09:07:49 -06:00
Brian Paul
553a08d314 svga: minor code reformatting
To be consistent with other functions.
2012-08-16 17:03:43 -06:00
Matt Turner
81ba2c53b6 build: Remove -shared from OSMesa's LDFLAGS
Would break the static build.
2012-08-16 15:04:54 -07:00
Matt Turner
d12b07eb1a build: Remove EXTRA_LIB_PATH
You can add extra library paths to LDFLAGS directly.
2012-08-16 15:04:54 -07:00
Matt Turner
e273ed37ea build: Require X11 pkg-config files 2012-08-16 15:04:53 -07:00
Marek Olšák
f36c404f90 r600g: disable tiling for 422 formats again 2012-08-16 20:44:54 +02:00
Marek Olšák
795834432b r600g: fix blits of subsampled formats 2012-08-16 20:44:54 +02:00
Marek Olšák
6fd9218bb4 r600g: fix copying between NPOT mipmapped compressed textures
We aligned the dimensions to the blocksize, then divided by it
(in r600_blit.c), then minified, which was wrong.

The minification must be done first, not last.
This fixes piglit/fbo-generatemipmap-formats with S3TC and maybe
a bunch of other tests too. Tested on RV730.
2012-08-16 20:44:54 +02:00
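The ordering problem described in the commit above can be illustrated with a small sketch. This is not the code from r600_blit.c; the helper names (`minify`, `pixels_to_blocks`, `blocks_minify_last`, `blocks_minify_first`) are hypothetical and a 4-pixel block width (as in S3TC) is assumed.

```c
#include <assert.h>

/* Size of one mipmap dimension at `level`, never smaller than 1.
 * Assumes level is smaller than the bit width of unsigned. */
static unsigned minify(unsigned value, unsigned level)
{
    unsigned v = value >> level;
    return v ? v : 1;
}

/* Number of compressed blocks needed to cover `pixels` (rounds up). */
static unsigned pixels_to_blocks(unsigned pixels, unsigned block)
{
    assert(block > 0);
    return (pixels + block - 1) / block;
}

/* Order the commit message describes as wrong:
 * align and divide by the block size, then minify. */
static unsigned blocks_minify_last(unsigned width, unsigned level,
                                   unsigned block)
{
    return minify(pixels_to_blocks(width, block), level);
}

/* Order the commit message describes as correct:
 * minify the pixel size first, then convert to blocks. */
static unsigned blocks_minify_first(unsigned width, unsigned level,
                                    unsigned block)
{
    return pixels_to_blocks(minify(width, level), block);
}
```

For a 20-pixel-wide texture at level 2 with 4-pixel blocks, minifying last gives 1 block, while minifying first gives 2 blocks, which is what a 5-pixel-wide level actually needs.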
Marek Olšák
b8e9cf5d96 r600g: make F2U trans-only on r600-r700
This fixes a failing assertion in r600_asm.c.
2012-08-16 20:44:53 +02:00
Marek Olšák
0d7e002815 r600g: set CB_COLOR_INFO to INVALID for disabled colorbuffers on r600-r700
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-08-16 20:44:53 +02:00
Marek Olšák
951ac46a6a r600g: rename r600_resource_texture to r600_texture 2012-08-16 20:44:53 +02:00
Marek Olšák
952c905767 r600g: always put tiled textures in VRAM 2012-08-16 20:44:53 +02:00
Marek Olšák
773ff5705f r600g: cleanup r600_resource_texture in favor of radeon_surface 2012-08-16 20:44:53 +02:00
Marek Olšák
362a25aac5 r600g: remove unused parameter in r600_texture_create_object 2012-08-16 20:44:53 +02:00
Marek Olšák
c4993d15eb r600g: fixup the usage flag for the flushed depth texture 2012-08-16 20:44:53 +02:00
Philipp Brüschweiler
0efd564a09 wayland-drm: close fd after the display is uninitialized
This fixes a "kernel rejected pushbuf: Bad file descriptor" error on
wl_drm display destruction.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2012-08-16 13:17:06 -04:00
José Fonseca
50dec63790 scons: Fix MinGW cross compilation.
Compensate for the recent changes and assumptions added to
Makefiles.sources
2012-08-16 17:21:52 +01:00
Tom Stellard
5f82d19248 radeon/llvm: Lower implicit parameters before ISel 2012-08-16 16:04:51 +00:00
Brian Paul
0d308ef8fe gallium/draw: move misplaced brace 2012-08-16 09:16:42 -06:00
Brian Paul
f6b7157550 mesa: raise GL_INVALID_OPERATION in glGenerateMipmap for missing base image
This seems to be expected by the WebGL texture-mips test.  The error makes
sense, but I haven't found (yet) any OpenGL documentation specifying this
error condition.

See http://bugs.freedesktop.org/show_bug.cgi?id=44912

Note: This is a candidate for the 8.0 branch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-16 09:11:14 -06:00
Brian Paul
d663a557fd r600: update sampler, sampler_view code for the future
For when we have pipe->set_sampler_states(pipe, shader, start, num, samplers),
etc.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-08-16 09:01:31 -06:00
Brian Paul
10e552d056 rbug: update data structures, functions for future changes
To support geom/compute/etc shaders, samplers, sampler views, etc.
To support pipe->bind_sampler_states() w/ start_slot.
2012-08-16 09:01:31 -06:00
Brian Paul
109e87dc6a gallium/trace: add 'start' parameter to bind_sampler_states/views() 2012-08-16 09:01:31 -06:00
Brian Paul
d4ab8bd095 gallium/identity: add 'start' parameter to bind_sampler_states/views() 2012-08-16 09:01:31 -06:00
Brian Paul
f3cc4990a0 galahad: add 'start' parameter to bind_sampler_states/views() 2012-08-16 09:01:31 -06:00
Brian Paul
bd3733c0be svga: add 'start' parameter to bind_sampler_states/views() 2012-08-16 09:01:31 -06:00
Brian Paul
c969cb1447 llvmpipe: add 'start' parameter to bind_sampler_states/views() 2012-08-16 09:01:31 -06:00
Brian Paul
25a42f39e3 softpipe: add 'start' parameter to bind_sampler_states/views()
To support updating a sub-range of sampler states/views in the future.
Note that we always pass start=0 at this time.
2012-08-16 09:01:31 -06:00
Brian Paul
348ac08bfd gallium/trace: consolidate sampler, sampler_view code 2012-08-16 09:01:31 -06:00
Brian Paul
0ad95b923a gallium/identity: consolidate sampler, sampler_view code
This will simplify things when the pipe_context functions are consolidated.
2012-08-16 09:01:31 -06:00
Brian Paul
f3c3aff6ef st/mesa: add support for GS textures and samplers 2012-08-16 09:01:31 -06:00
Brian Paul
6c8a132158 st/mesa: combine vertex/fragment sampler state in arrays
As with other recent changes, put the vertex and fragment sampler state
into arrays indexed by the shader type.  This will let us easily add
support for other types of shaders in the future.
2012-08-16 09:01:31 -06:00
Brian Paul
cab2fed135 gallium: remove PIPE_MAX_VERTEX/GEOMETRY_SAMPLERS #define
PIPE_MAX_SAMPLERS, PIPE_MAX_VERTEX_SAMPLERS and PIPE_MAX_GEOMETRY_SAMPLERS
were all defined to the same value (16).

In various places we're creating arrays such as
sampler_views[PIPE_SHADER_TYPES][PIPE_MAX_SAMPLERS] so we were assuming
the same number of max samplers for all shader stages anyway.

Of course, drivers are still free to advertise different numbers of max
samplers for different shaders.
2012-08-16 09:01:31 -06:00
Brian Paul
a2c1df4c9a draw: index samplers and sampler_view state by shader type
So that we can handle GS state and other types of shaders in the future.
2012-08-16 09:01:31 -06:00
Brian Paul
bef196c792 draw: move tgsi-related state into a tgsi sub-struct
To better organize things a bit.
2012-08-16 09:01:31 -06:00
Brian Paul
df87fb5913 gallium: add a shader stage/type param to some draw functions
To prepare for geometry shader texture support in the draw module.
Note: we still only handle the vertex shader case.
2012-08-16 09:01:31 -06:00
Brian Paul
a8ed00d5f1 st/mesa: silence signed/unsigned comparison warning 2012-08-16 09:00:08 -06:00
Brian Paul
d733e5da9c svga: move result->key expression after result != NULL check 2012-08-16 08:58:55 -06:00
Brian Paul
50188adf7d svga: fix result==NULL logic in emit_fs_consts()
The previous test for result != NULL was kind of bogus since we dereferenced
the pointer earlier in the code.  Now, check for result != NULL first, then
get the result->key info.

Also, remove the useless "offset +=" code at the end.
2012-08-16 08:58:55 -06:00
Brian Paul
d55e0f1ba0 svga: update comment (s/SVGA_NEW_VS_RESULT/SVGA_NEW_VS_PRESCALE/) 2012-08-16 08:58:55 -06:00
Brian Paul
2a5eeeaebe svga: rename svga_hw_vs_parameters -> svga_hw_vs_constants
and similarly for svga_hw_fs_parameters
2012-08-16 08:58:55 -06:00
Niels Ole Salscheider
8cc1860d4a st/mesa: index can be negative in the PROGRAM_CONSTANT case
NOTE: This is a candidate for the 8.0 branch.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-16 08:56:09 -06:00
Brian Paul
fd41cbc557 mesa: add cast to silence warning in _mesa_pack_rgba_span_from_ints() 2012-08-16 08:55:48 -06:00
Brian Paul
658044cde1 meta: remove unused variable 2012-08-16 08:53:55 -06:00
Michel Dänzer
1b11395a36 radeonsi: Fix symbol conflicts with r600g.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50389

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-16 12:01:16 +02:00
Michel Dänzer
51d9f37a72 radeonsi: Fix memory leaks if returning early from some state functions.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-16 11:58:24 +02:00
Michel Dänzer
4b64fa2ff1 radeonsi: Fix LLVM context leak.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-16 11:58:24 +02:00
Michel Dänzer
18abc270c5 gallium/radeon: Don't assign virtual address space for BO that already has one.
We'd end up re-using the old one and throwing away the new one anyway, but only
after a roundtrip to the kernel.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-16 11:58:24 +02:00
Michel Dänzer
a60be05284 gallium/radeon: Create hole for waste when allocating from va_offset.
Otherwise, the wasted area could never be used for an allocation again.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-16 11:58:24 +02:00
Michel Dänzer
1f455ef5bc gallium/radeon: Fix potential address space loss in radeon_bomgr_force_va().
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-16 11:58:23 +02:00
Michel Dänzer
6d59b7f6dc gallium/radeon: Delete uppermost virtual address space hole if it's at the top.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-16 11:58:23 +02:00
Michel Dänzer
f5fe81daea gallium/radeon: Fix losing holes when allocating virtual address space.
If a hole exactly matches the allocated size plus alignment, we would fail to
preserve the alignment as a hole. This would result in never being able to use
the alignment area for an allocation again.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-16 11:58:23 +02:00
Michel Dänzer
206d07625c gallium/radeon: Merge holes when freeing virtual address space.
Otherwise we'll likely end up with an ever increasing amount of ever smaller
holes.

Requires keeping the list ordered wrt offsets.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-16 09:39:36 +02:00
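A minimal sketch of the idea in the commit above, assuming holes are kept in a singly linked list sorted by offset. The `va_hole` structure and `va_free` function are hypothetical and are not the winsys code; they only show why an ordered list makes merging adjacent holes straightforward.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical hole record: a free range [offset, offset + size). */
struct va_hole {
    uint64_t offset;
    uint64_t size;
    struct va_hole *next;   /* next hole, at a higher offset */
};

/* Return a freed range to the list, keeping the list sorted by offset
 * and merging with adjacent holes. Returns the (possibly new) head. */
static struct va_hole *
va_free(struct va_hole *head, uint64_t offset, uint64_t size)
{
    struct va_hole *prev = NULL, *cur = head, *hole;

    /* Find the first hole that starts at or after the freed range. */
    while (cur && cur->offset < offset) {
        prev = cur;
        cur = cur->next;
    }

    /* Merge into the previous hole if it ends where the range starts. */
    if (prev && prev->offset + prev->size == offset) {
        prev->size += size;
        /* The grown hole may now touch the next hole as well. */
        if (cur && prev->offset + prev->size == cur->offset) {
            prev->size += cur->size;
            prev->next = cur->next;
            free(cur);
        }
        return head;
    }

    /* Merge into the next hole if the range ends where it starts. */
    if (cur && offset + size == cur->offset) {
        cur->offset = offset;
        cur->size += size;
        return head;
    }

    /* No adjacent hole: insert a new node between prev and cur. */
    hole = malloc(sizeof(*hole));
    assert(hole != NULL);
    hole->offset = offset;
    hole->size = size;
    hole->next = cur;
    if (prev)
        prev->next = hole;
    else
        head = hole;
    return head;
}
```

Freeing the ranges at offsets 0 and 32 (16 bytes each) leaves two holes; freeing the range at offset 16 afterwards collapses all three into a single 48-byte hole instead of three small ones.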
Michel Dänzer
c25968f3e2 gallium/radeon: Make va_offset 64 bits wide.
Otherwise we'd wrap around after 32 bits. The kernel currently limits GPU
virtual address space to 4GB anyway, but that will probably change sooner or
later, and this would result in confusing error messages when running out of
virtual address space even now.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-16 09:37:33 +02:00
Vinson Lee
1597176f70 llvmpipe: Silence Coverity incorrect sizeof expression defect.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-15 22:15:49 -07:00
Vinson Lee
3d6892c479 scons: Add option to enable floating-point textures.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-15 22:04:24 -07:00
Dave Airlie
6a3ac03f2b glx/dri2: add dri2 prime support.
This adds support for having libGL pick a different driver for prime support.

DRI_PRIME env var is set to the value retrieved from the server randr
provider calls, by the calling process. (generally DRI_PRIME=1 will be
the right answer).

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-08-16 10:02:10 +10:00
Vincent Lejeune
565a4e2a86 radeon/llvm: Enable if-cvt
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-15 21:07:14 +00:00
Vincent Lejeune
a614979286 radeon/llvm: Add callbacks needed by if-cvt
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-15 21:07:14 +00:00
Vincent Lejeune
0eca5fd919 radeon/llvm: Lower branch/branch_cond into predicated jump
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-15 21:07:14 +00:00
Vincent Lejeune
6db2e9fdb0 radeon/llvm: Add a predicated JUMP instruction
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-15 21:07:13 +00:00
Vincent Lejeune
8263408a91 radeon/llvm: Support for predicate bit
Tom Stellard:
  - A few changes to predicate register defs

Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-15 21:07:13 +00:00
Vincent Lejeune
8f597d57e9 r600g: Glue to handle predicate aware output from llvm
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-15 21:07:13 +00:00
Vincent Lejeune
72f7632c6b r600g: Fix instruction group merge when there are predicated insts.
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-15 21:07:13 +00:00
Vincent Lejeune
56227f875b radeon/llvm: Do not use PV/PS if PRED_SEL does not match
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-15 21:07:13 +00:00
Vincent Lejeune
da676eab93 r600g: Add support for predicates
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-15 21:07:13 +00:00
Christian König
cf76edd300 radeonsi: move ps sampler state into PM4 stream
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-15 22:13:19 +02:00
Christian König
ec5b698525 radeonsi: move ps sampler views into PM4 stream
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-15 22:13:19 +02:00
Christian König
54de6f452c radeonsi: move vertex state descriptors into PM4 stream
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-15 22:13:19 +02:00
Christian König
f2c95d93db radeonsi: add shader data infrastructure
With this we can embed data for the shaders (like resource
descriptors) into the PM4 stream.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-15 22:13:19 +02:00
Christian König
4444b9d1ec radeon/llvm: add support to fetch temps as vectors
Necessary for texture fetches with temp regs as source on SI.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-15 22:13:19 +02:00
Tom Stellard
b6051bc785 radeon/llvm: Remove AMDGPUUtil.cpp 2012-08-15 18:35:26 +00:00
Apostolos Bartziokas
040c2e0456 radeon/llvm: Cleanup AMDGPUUtil.cpp 2012-08-15 18:35:25 +00:00
Tom Stellard
3aaa209293 radeon/llvm: Lower loads from USE_SGPR address space during DAG lowering 2012-08-15 18:35:25 +00:00
Tom Stellard
40c41fe890 radeon/llvm: Add live-in registers during DAG lowering
Pseudo instructions emulating live-in registers have been removed
and their corresponding intrinsics are now being lowered during DAG
lowering.
2012-08-15 18:35:25 +00:00
Tom Stellard
f3480f9234 radeon/llvm: Lower store_output intrinsic during DAG lowering 2012-08-15 18:35:25 +00:00
Tom Stellard
a76a0f7422 radeon/llvm: Force VTX_READ instructions to use same reg for src and dst
I was seeing some GPU hangs that seemed to be caused by ALU instructions
writing to the same register used as the source for VTX_READ.  Adding
this constraint to the VTX_READ instructions avoids this situation.
2012-08-15 18:35:25 +00:00
Marek Olšák
97b4b97b2f radeonsi: fix build breakage after u_blitter changes 2012-08-15 20:03:37 +02:00
Marek Olšák
e0cc61bd91 gallium/u_blitter: document custom meta helpers 2012-08-15 19:20:58 +02:00
Marek Olšák
b3b5bb9ddb r600g: disable handling of DISCARD_RANGE
https://bugs.freedesktop.org/show_bug.cgi?id=53130
2012-08-15 19:20:58 +02:00
Marek Olšák
44f14ebd7b r600g: implement timestamp query and get_timestamp hook
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-15 19:20:58 +02:00
Marek Olšák
1932bc8aae r600g: enable MSAA on evergreen by default
v2: add the DRM version check
2012-08-15 19:20:58 +02:00
Marek Olšák
870af19d70 r600g: implement copying between MSAA textures 2012-08-15 19:20:58 +02:00
Marek Olšák
0f86915c53 r600g: implement MSAA color resolve 2012-08-15 19:20:58 +02:00
Marek Olšák
94b634eca0 r600g: implement MSAA depth-stencil decompression and resolve
and integer textures, which are resolved the same as depth, I think.
2012-08-15 19:20:58 +02:00
Marek Olšák
6d3ad2dd2b r600g: implement TXQ_LZ opcode 2012-08-15 19:20:57 +02:00
Marek Olšák
4b78df9c81 r600g: implement MSAA rendering and texturing for evergreen and cayman 2012-08-15 19:20:57 +02:00
Marek Olšák
a01791add0 r600g: implement set_sample_mask 2012-08-15 19:20:57 +02:00
Marek Olšák
6517225078 r600g: implement alpha-to-coverage 2012-08-15 19:20:57 +02:00
Marek Olšák
26cb887ea2 r600g: implement alpha-to-one 2012-08-15 19:20:57 +02:00
Marek Olšák
4f21595276 r600g: remove support for 3-channel colorbuffers
We have no sampler support for them.
2012-08-15 19:20:57 +02:00
Marek Olšák
2f14202f52 configure.ac: bump libdrm_radeon requirement to 2.6.38 2012-08-15 19:20:57 +02:00
Marek Olšák
a7f4d3b740 winsys/radeon: print error if CS is overflowed
and don't submit the CS to the kernel.
2012-08-15 19:20:57 +02:00
Marek Olšák
dc5e61d884 gallium/u_blitter: implement X and Y texture flipping 2012-08-15 19:20:57 +02:00
Marek Olšák
825b45366d gallium/u_blitter: implement blitting multisample resources
It can blit only one sample at a time (it should be called in a loop).
2012-08-15 19:20:57 +02:00
Marek Olšák
dacf5dc9ac gallium: add TGSI support for multisample textures
The only allowed instructions are TXQ_LZ and TXF.

TXQ_LZ is like TXQ, but without the LOD parameter (which is always zero
with MSAA textures)

The 3rd or the 4th texcoord component in TXF should contain the sample index
for a 2D_MSAA or 2D_ARRAY_MSAA texture, respectively.
2012-08-15 19:20:57 +02:00
Marek Olšák
ba53573a8b gallium/tgsi: fix TGSI text parser
The problem was that the string matching succeeded e.g. for "2D" when there
was actually "2D_MSAA" and then failed parsing "_MSAA".

To prevent similar failures in the future, let's fix this kind of error
everywhere.
2012-08-15 19:20:57 +02:00
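The prefix-matching failure described in the commit above can be sketched as follows. This is an illustration only, not the TGSI parser: `is_ident_char` and `match_whole_token` are hypothetical helpers, and the comparison here is case-sensitive for simplicity. The point is that a keyword match should be accepted only if the keyword is not followed by another identifier character.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Characters that may continue a token such as "2D_MSAA". */
static bool is_ident_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Return true only if `keyword` appears at `cur` as a whole token,
 * i.e. it is not merely a prefix of a longer identifier. */
static bool match_whole_token(const char *cur, const char *keyword)
{
    size_t len = strlen(keyword);

    assert(len > 0);
    if (strncmp(cur, keyword, len) != 0)
        return false;
    /* strncmp matched len characters, so cur[len] is either the
     * terminating NUL or the character following the keyword. */
    return !is_ident_char(cur[len]);
}
```

With this check, "2D" no longer matches the start of "2D_MSAA", so the parser can go on to try the longer keyword instead of failing on the leftover "_MSAA".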
Marek Olšák
b7c4ee21c5 gallium/u_blit: set dst format from pipe_resource, not pipe_surface
We use it to decide whether we can use resource_copy_region.

NOTE: This is a candidate for the 8.0 branch.
2012-08-15 19:20:57 +02:00
Marek Olšák
1a17c42344 gallium: make pipe_box signed in order to represent flipped blits
This will be used by u_blitter.
2012-08-15 19:20:57 +02:00
Marek Olšák
03b78ceb50 st/mesa: don't clamp fragment color with integer colorbuffer 2012-08-15 19:20:57 +02:00
Marek Olšák
e06d6168cb mesa: flush vertices in test_framebuffer_completeness 2012-08-15 19:20:57 +02:00
Michel Dänzer
538085c5d4 st/egl: Fix up for ClientVersion -> ClientMajorVersion rename.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53513

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-08-15 10:49:39 +02:00
Jordan Justen
b3900ed5ad i965: add ARB_texture_rgb10_a2ui support
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-14 17:07:42 -07:00
Jordan Justen
091eb15b69 meta: allow CopyTexSubImage on integer formats
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-14 17:07:42 -07:00
Jordan Justen
6671d0dad3 mesa ReadPixels: handle signed/unsigned integer clamping
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-14 17:07:42 -07:00
Jordan Justen
f7333b6345 mesa pack: handle packed integer formats with clamping
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-14 17:07:42 -07:00
Jordan Justen
1a814217c3 mesa unpack: call _mesa_problem when unpack function is not available
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-14 17:07:42 -07:00
Jordan Justen
b3dd048cbb mesa texstore: handle signed/unsigned integer clamping
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-14 17:07:42 -07:00
Jordan Justen
7208505d30 mesa GetTexImage: handle signed/unsigned integer clamping
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-14 17:07:42 -07:00
Jordan Justen
7ef270867c mesa pack: handle uint and int clamping properly
Rename _mesa_pack_rgba_span_int to _mesa_pack_rgba_span_from_uints.
Add _mesa_pack_rgba_span_from_ints.

These separate routines allow the integer clamping to be handled
properly for signed versus unsigned integers.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-14 17:07:42 -07:00
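A sketch of why separate routines are needed for signed and unsigned sources, assuming the destination is an unsigned byte. The function names below are hypothetical and are not the Mesa pack routines; they only show that the same 32-bit pattern clamps differently depending on whether it is read as signed or unsigned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an unsigned source value to the [0, 255] range. */
static uint8_t ubyte_from_uint(uint32_t v)
{
    return v > 255u ? 255u : (uint8_t)v;
}

/* Clamp a signed source value to the [0, 255] range:
 * negative values go to 0, not to the maximum. */
static uint8_t ubyte_from_int(int32_t v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}

/* Pack a span of signed integers into unsigned bytes. */
static void pack_ubytes_from_ints(const int32_t *src, uint8_t *dst, size_t n)
{
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < n; i++)
        dst[i] = ubyte_from_int(src[i]);
}
```

The value -1 clamps to 0 when treated as signed, but the same bits read as unsigned (0xffffffff) clamp to 255, so a single routine cannot serve both cases.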
Chad Versace
1938501fbf intel: Fix rendering to a multisample front buffer
We need to downsample before flushing BUFFER_FAKE_FRONT_LEFT to
BUFFER_FRONT_LEFT in intel_flush_front.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-14 16:19:25 -07:00
Chad Versace
a43599d1d1 intel: Clean up intel_flush_front
Stop repeating ourselves. Replace the 4 instances of
`driContext->driDrawablePriv` with `driDrawable`.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-14 16:19:25 -07:00
Chad Versace
38b748ce29 intel: Refactor intel_downsample_for_dri2_flush
Move it from intel_screen.c to intel_context.c. Redeclare as non-static.
A future commit will use it in multiple files.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-14 16:19:25 -07:00
Ian Romanick
cde2b7e55d docs: Add EGL extensions to release notes
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-14 15:45:17 -07:00
Ian Romanick
dbecb41300 egl: Allow OpenGL ES 3.0 as a version
In the DRI2 back-end this will get the same API as GLES 2.0.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 15:41:03 -07:00
Ian Romanick
a2ce2eba26 dri2: Note that __DRI_API_GLES2 is also used for OpenGL ES 3.0
Unlike 1.x to 2.0, OpenGL ES 3.0 is backwards compatible with 2.0.  Use the
same API flag for both.  Applications that specifically want 3.0 will specify
this using the major / minor version attributes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 15:41:03 -07:00
Ian Romanick
7b4b4f8e68 egl_dri2: Add support for EGL_KHR_create_context and EGL_EXT_create_context_robustness
Just like in GLX, EGL_KHR_create_context requires DRI2 version >= 3, and
EGL_EXT_create_context_robustness requires both DRI2 version >= 3 and the
__DRI2_ROBUSTNESS extension.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 15:41:03 -07:00
Ian Romanick
f171571bfc egl: Implement front-end support for EGL_EXT_create_context_robustness
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 15:41:03 -07:00
Ian Romanick
63beb3df98 egl: Implement front-end support for EGL_KHR_create_context
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 15:41:03 -07:00
Ian Romanick
9d76ad2fac egl_dri2: Silence warnings about missing initializers
egl_dri2.c: At top level:
egl_dri2.c:325:4: warning: missing initializer [-Wmissing-field-initializers]
egl_dri2.c:325:4: warning: (near initialization for 'swrast_driver_extensions[2].version') [-Wmissing-field-initializers]
egl_dri2.c:330:4: warning: missing initializer [-Wmissing-field-initializers]
egl_dri2.c:330:4: warning: (near initialization for 'swrast_core_extensions[1].version') [-Wmissing-field-initializers]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 15:41:03 -07:00
Ian Romanick
3fd79dd988 egl: Rename ClientVersion to ClientMajorVersion, add ClientMinorVersion
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 15:41:03 -07:00
Ian Romanick
ce55741cbc egl_dri2: Use createContextAttribs if DRI2 version >= 3
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 15:41:02 -07:00
Ian Romanick
38f91f2b08 egl_dri2: Require DRI2 version 2
The extra block in dri2_create_context is to prevent extra white space noise
in the next patch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 15:41:02 -07:00
Ian Romanick
0c445bb618 dri_util: Compare against the correct API enums
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 15:41:02 -07:00
Ian Romanick
258771882d mesa: Enable GL_ARB_invalidate_subdata
v2: Add GL_ARB_invalidate_subdata to release notes at Brian's
suggestion.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-14 14:39:33 -07:00
Ian Romanick
07e12c4917 mesa: Add skeleton implementations of glInvalidateTex{Sub,}Image
These are part of GL_ARB_invalidate_subdata (but not OpenGL ES 3.0).

v2: Add comment explaining why minimum dimensions are set to 1 for some
texture targets.  Add default case to switch statement to silence
compiler warnings and detect new texture targets.  Both changes
suggested by Brian.  Also use _mesa_is_desktop_gl as suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 14:39:33 -07:00
Ian Romanick
f241ffd48c mesa: Add skeleton implementations of glInvalidateBuffer{Sub,}Data
These are part of GL_ARB_invalidate_subdata (but not OpenGL ES 3.0).

v2: Use _mesa_bufferobj_mapped instead of testing
gl_buffer_object::Pointer as suggested by Brian.  Also use
_mesa_is_desktop_gl as suggested by Ken.

v3: Add a comment by the map subrange / discard range overlap test and
fix an off-by-one error noticed by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 14:39:33 -07:00
Ian Romanick
e2370bcc1d mesa/es: Pass context to _mesa_init_bufferobj_dispatch
With this change _mesa_init_bufferobj_dispatch won't set function
pointers that don't exist in OpenGL ES.

v2: Use _mesa_is_desktop_gl and _mesa_is_gles3 as suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 14:39:33 -07:00
Ian Romanick
342be8aa88 mesa: Add skeleton implementations of glInvalidate{Sub,}Framebuffer
These are part of GL_ARB_invalidate_subdata and OpenGL ES 3.0.

v2: Reject aux buffers in core context, and use _mesa_is_desktop_gl and
_mesa_is_gles3.  Both suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 14:39:33 -07:00
Ian Romanick
12249b9c96 glapi: Add GL_ARB_invalidate_subdata
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-14 14:39:33 -07:00
Ian Romanick
2a1ca4ff73 mesa/es3: Add _mesa_is_gles3 predicate
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 14:39:29 -07:00
Ian Romanick
9bcb9fad65 intel: Implement ARB_texture_storage
This is basically cut-and-paste from the swrast implementation, and it
could probably be (slightly) more optimal.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 14:39:19 -07:00
Ian Romanick
92b614172f mesa: update glext.h to version 83
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-14 12:19:24 -07:00
Matt Turner
79e9e1b32f build: Use MKDIR_P in src/mesa/Makefile.am
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 10:54:39 -07:00
Matt Turner
02f52e8df5 build: Use AM_V_GEN in src/mesa/Makefile.am
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 10:54:39 -07:00
Matt Turner
1b200d9001 build: Fix autogen.sh to allow out-of-tree builds
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 10:54:39 -07:00
Matt Turner
85d355f122 build: Fix out-of-tree generation of builtin_function.cpp
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 10:54:39 -07:00
Matt Turner
2191a79b4e build: Fix gtest out-of-tree build
Introduced by 3d000e7dd.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 10:54:39 -07:00
Matt Turner
e939250b63 build: Fix out-of-tree generation of api_exec_es{1,2}.c
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 10:54:39 -07:00
Matt Turner
5c2a6b74ed build/sources.mak: Add src/glsl/glcpp to INCLUDE_DIRS
Fixes problem where libdricore's out-of-tree build couldn't find
glcpp.h.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 10:54:38 -07:00
Matt Turner
fa74175210 build/sources.mak: Remove unused GLSL_LIBS
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-14 10:54:38 -07:00
Ian Romanick
707f067915 mesa: Kill GL_ARB_shadow_ambient with fire
No driver supports this extension, and it seems unlikely that any driver
ever will.  I think r300c may have supported it at one time, but that
driver has already been removed.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-08-14 10:40:04 -07:00
Tom Stellard
b49771970b radeon/llvm: Inline immediate offset when lowering implicit parameters 2012-08-14 14:06:20 +00:00
Tom Stellard
2fae8227ad radeon/llvm: Use correct opcode for BREAK_LOGICALNZ_i32 2012-08-14 13:26:30 +00:00
José Fonseca
ea8dcfc90d scons: Populate top_srcdir and top_builddir variables when reading Makefiles.sources.
This is not entirely correct, as scons doesn't put binaries in a
"src" subdirectory, but doesn't seem to be a problem for now.
2012-08-14 12:19:56 +01:00
Kenneth Graunke
605f964d5c mesa: Use GLdouble for depthMax in final unpack conversions.
The final step of _mesa_unpack_depth_span is to take the temporary
GLfloat depth values and convert them to the desired format.  When
converting to GL_UNSIGNED_INTEGER with depthMax > 0xffffff, we use
double-precision math to avoid overflow and precision problems.

Or at least that's the idea.  Unfortunately

   GLdouble z = depthValues[i] * (GLfloat) depthMax;

actually causes single-precision multiplication, since both operands are
GLfloats.  Casting depthMax to GLdouble causes the scaling to be done
with double-precision math.

Fixes a regression in oglconform's depth-stencil basic.read.ds test
since c60ac7b179, where the expected and
actual values differed slightly.  For example, 0xcfa7a6 vs. 0xcfa7a4.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49772
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-13 19:16:38 -07:00
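The precision issue in the commit above can be reproduced in isolation. The sketch below uses plain C types in place of GLfloat and GLdouble, and the function names are hypothetical; it assumes float arithmetic is evaluated in single precision, as is usual on x86-64.

```c
#include <assert.h>

/* Both operands are float, so the multiplication is done in single
 * precision; widening the result to double afterwards cannot recover
 * the bits that were already lost. */
static double scale_single(float depth, unsigned depth_max)
{
    float max_f = (float)depth_max;
    float product = depth * max_f;

    assert(depth >= 0.0f && depth <= 1.0f);
    return (double)product;
}

/* Casting depth_max to double promotes the whole multiplication to
 * double precision. */
static double scale_double(float depth, unsigned depth_max)
{
    assert(depth >= 0.0f && depth <= 1.0f);
    return depth * (double)depth_max;
}
```

With depth 0.5 and depth_max 0xffffffff, the single-precision path first rounds depth_max up to 4294967296 and returns 2147483648.0, while the double-precision path returns 2147483647.5. Errors of this kind are consistent with the small mismatches the commit message reports.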
Eric Anholt
43e3a7533d i965: Fix the scaling of seconds to ms in perf debug.
*headdesk*
2012-08-13 17:50:25 -07:00
Ian Romanick
d606926013 i965: Validate API and version in brwCreateContext
v2: Use base-10 for versions like gl_context::Version.  Suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-13 17:38:55 -07:00
Ian Romanick
db273724c9 i915: Validate API and version in i915CreateContext
v2: Use base-10 for versions like gl_context::Version.  Suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-13 17:36:50 -07:00
Ian Romanick
a81e4b3e92 i830: Validate API and version before calling i830CreateContext
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-13 17:23:48 -07:00
Ian Romanick
2b63624326 intel: In the i915 driver, the chipset cannot be i965
In the i965 driver, the chipset must be i965.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-13 17:23:24 -07:00
Ian Romanick
70f47505a2 dri: Pass API_OPENGL_CORE through to the drivers
This forces the drivers to do at least some validation of context API
and version before creating the context.  In r100 and r200 drivers, this
means that they don't do any post-hoc validation.

v2: Actually reject compatibility profile 3.2+ contexts.  Thanks Ken.
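
A rough sketch of this kind of up-front check follows. It is not the driver code: the enum, the function names, and the max_version parameter are hypothetical, and the version is assumed to use the base-10 encoding (major * 10 + minor) of gl_context::Version.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the API_OPENGL / API_OPENGL_CORE values. */
enum toy_api {
    TOY_API_OPENGL_COMPAT,
    TOY_API_OPENGL_CORE
};

/* Base-10 version encoding: 3.2 becomes 32, 2.1 becomes 21. */
static unsigned toy_version(unsigned major, unsigned minor)
{
    return major * 10 + minor;
}

/* Decide whether a context request is acceptable before creating it. */
static bool toy_validate_context(enum toy_api api, unsigned major,
                                 unsigned minor, unsigned max_version)
{
    unsigned version = toy_version(major, minor);

    /* The driver cannot expose more than it implements. */
    if (version > max_version)
        return false;

    /* Reject compatibility-profile requests for 3.2 and later. */
    if (api == TOY_API_OPENGL_COMPAT && version >= 32)
        return false;

    return true;
}
```

Doing the check on a single integer keeps comparisons such as "3.2 or later" to one expression instead of a major/minor pair.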

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-13 17:17:12 -07:00
Ian Romanick
7e81f553bc mesa: Filter a bunch more functions based on API
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-13 17:17:00 -07:00
Ian Romanick
0fef911ce4 mesa: Don't advertise extensions that are part of GL 1.5 in a core context
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-13 16:19:36 -07:00
Ian Romanick
aa0b1e902b mesa: Don't advertise extensions that are part of GL 1.4 in a core context
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-13 16:19:36 -07:00
Ian Romanick
213945385a mesa: Don't advertise extensions that are part of GL 1.3 in a core context
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-13 16:19:36 -07:00
Ian Romanick
7ef1869d69 mesa: Don't advertise extensions that are part of GL 1.2 in a core context
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-13 16:19:36 -07:00
Ian Romanick
4d39b86315 mesa: Don't advertise deprecated extensions in a core context
It may be possible to trim the list of extensions further.  These are
just the obvious extensions that add functionality that the core context
explicitly forbids.  Apple's core-context extension list is *just* the
extensions on top of the core GL version.  I'm not sure we want to go
that far, but removing some things that have been in core since 2.1 may
be okay.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-13 16:19:36 -07:00
Christopher James Halse Rogers
cd4a61100d build: Fix libdricore out-of-tree builds (v2)
v2: Add both top_srcdir and top_builddir to mesa asm include dirs.
    These require both in-tree and build-time-generated files.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
2012-08-13 12:24:54 -07:00
Christopher James Halse Rogers
73fef0178a build/mapi: More killing of TOP in favour of top_srcdir
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
2012-08-13 12:24:47 -07:00
Christopher James Halse Rogers
77a3efc6b9 build/glsl: fix location of generated files.
Like in src/mesa, use GLSL_BUILDDIR/GLSL_SRCDIR to unambiguously
distinguish between in-tree and generated files.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
2012-08-13 12:24:39 -07:00
Christopher James Halse Rogers
37a1b8083e build/glapi: fix includes for generated files
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
2012-08-13 12:24:31 -07:00
Christopher James Halse Rogers
3fe69bac49 build: fix out of tree generation of glapi_mapi_tmp.h
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
2012-08-13 12:24:25 -07:00
Christopher James Halse Rogers
726f534bbb build/glx: fix include paths for out-of-tree builds
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
2012-08-13 12:24:17 -07:00
Christopher James Halse Rogers
b2ecaab7ad build: fix location of generated files in src/mesa (v4)
Also fix include paths for the generated headers.

v2: Switch to using self-explanatory BUILDDIR/SRCDIR defined from
    top_builddir/top_srcdir rather than the ambiguous TOP.
v3: Add both top_builddir and top_srcdir to include flags for mesa asm.
    These rely on both in-tree and build-time-generated includes.
v4: Rebased on top of 948c8f502a.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-08-13 12:24:04 -07:00
Kenneth Graunke
4e087de51a intel: Reserve enough space to finish occlusion queries on Gen6.
After realizing that brw_finish_batch emitted some final PIPE_CONTROLs
to record occlusion queries, Chris noted that we probably hadn't
reserved enough space to actually emit them.

Reserving a full 60 bytes seems a bit harsh, since we only need that
much if occlusion queries are actually active.  Plus, 28 bytes would be
sufficient for Gen7, and 24 for Gen4-5.

We could optimize this in the future, but it doesn't seem too critical.
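
The idea of reserving space can be sketched as follows. This is a toy model under stated assumptions, not the intel batchbuffer code: the struct, its fields, and the function names are hypothetical, and the caller is assumed to keep reserved <= size.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_batch {
    size_t size;      /* total bytes in the buffer */
    size_t used;      /* bytes emitted so far */
    size_t reserved;  /* bytes held back for end-of-batch commands */
};

/* Bytes an ordinary emitter may still use; the reserve is off limits. */
static size_t toy_batch_space(const struct toy_batch *b)
{
    size_t limit = b->size - b->reserved;
    return b->used < limit ? limit - b->used : 0;
}

/* Ordinary emit: fails (the caller would flush) once only the reserve is left. */
static bool toy_batch_emit(struct toy_batch *b, size_t bytes)
{
    if (bytes > toy_batch_space(b))
        return false;
    b->used += bytes;
    return true;
}

/* End-of-batch emit: allowed to consume the reserve. */
static bool toy_batch_emit_final(struct toy_batch *b, size_t bytes)
{
    if (bytes > b->size - b->used)
        return false;
    b->used += bytes;
    return true;
}
```

With a 128-byte buffer and a 60-byte reserve, ordinary emits stop succeeding after 68 bytes, and a final 60-byte emit still fits.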

NOTE: This is a candidate for stable release branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53311
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-12 20:12:28 -07:00
Kenneth Graunke
9da50667f4 intel: Move finish_batch() call before MI_BATCH_BUFFER_END and padding.
On Gen4+, brw_finish_batch() calls brw_emit_query_end(), which emits
some extra PIPE_CONTROLs to capture the current occlusion query data.
Unfortunately, it was being called *after* _intel_batchbuffer_flush
added the MI_BATCH_BUFFER_END, meaning those PIPE_CONTROLs didn't get
inside the batch.

Not only does this likely cause bogus occlusion query values, it can
also cause crashes: with the recent change to use 64-bit depth count
writes on Gen6+, we started emitting an odd-length PIPE_CONTROL, which
happened after the MI_NOOP padding.  This resulted in an odd-length
batch buffer, which resulted in execbuf2 returning -EINVAL and the
application dying with an intel_do_flush_locked failure.
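
The ordering problem can be illustrated with a toy batch builder. This is a sketch only: the struct and functions are hypothetical, the token values are placeholders rather than authoritative hardware encodings, and the kernel check is modeled simply as "the dword count must be even".

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_BATCH_DWORDS        64
#define TOY_MI_NOOP             0x00000000u  /* placeholder padding token */
#define TOY_MI_BATCH_BUFFER_END 0x05000000u  /* placeholder end token */
#define TOY_FINAL_STATE         0x12345678u  /* placeholder for PIPE_CONTROL dwords */

struct toy_batch {
    uint32_t dwords[TOY_BATCH_DWORDS];
    size_t used;
};

static void toy_emit(struct toy_batch *b, uint32_t dw)
{
    if (b->used < TOY_BATCH_DWORDS)
        b->dwords[b->used++] = dw;
}

/* Stand-in for brw_finish_batch(): emits n dwords of last-minute state. */
static void toy_finish_batch(struct toy_batch *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        toy_emit(b, TOY_FINAL_STATE);
}

/* Fixed order: final state, then the end marker, then padding to even. */
static size_t toy_flush(struct toy_batch *b, size_t final_dwords)
{
    toy_finish_batch(b, final_dwords);
    toy_emit(b, TOY_MI_BATCH_BUFFER_END);
    if (b->used & 1)
        toy_emit(b, TOY_MI_NOOP);
    return b->used;
}

/* Old order: the final state lands after the end marker and the padding,
 * so it is never executed and can leave the length odd. */
static size_t toy_flush_buggy(struct toy_batch *b, size_t final_dwords)
{
    toy_emit(b, TOY_MI_BATCH_BUFFER_END);
    if (b->used & 1)
        toy_emit(b, TOY_MI_NOOP);
    toy_finish_batch(b, final_dwords);
    return b->used;
}
```

Starting from two dwords of ordinary state and five dwords of final state, toy_flush() ends at eight dwords with the end marker last, while toy_flush_buggy() ends at nine.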

On older generations, finish_batch() doesn't emit any state, so this
change shouldn't have any effect.

Huge thanks to Chris Wilson for helping me figure this out.

NOTE: This is a candidate for stable release branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53311
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-12 20:12:13 -07:00
Eric Anholt
006c1a3c65 i965: Add perf debug for stalls during shader compiles.
v2: fix bad comment from before I gave up and decided to just use doubles.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-12 19:08:25 -07:00
Eric Anholt
97a5f0ff2e i965: Add performance debug for when the state cache gets nuked.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-12 19:08:25 -07:00
Eric Anholt
fc3b7c9b56 i965: Add performance debug for shader recompiles.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-12 19:08:25 -07:00
Eric Anholt
b4da272a6e i965: Add performance debug for fast clear fallbacks.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-12 19:08:25 -07:00
Eric Anholt
0e723b135b intel: Add performance debug for some common GPU stalls.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-12 19:08:25 -07:00
Eric Anholt
4cfb9e3000 i965: Add performance debug for register spilling.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-12 19:08:25 -07:00
Eric Anholt
d72ff03e69 i965: Add INTEL_DEBUG=perf for failure to compile 16-wide shaders.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-12 19:08:25 -07:00
Eric Anholt
79198063b8 intel: Rename INTEL_DEBUG=fall to INTEL_DEBUG=perf.
I want to introduce some more debug output for performance surprises, which
include fallbacks but aren't necessarily software rasterization.  Leave
INTEL_DEBUG=fall in place for those that have used that flag before.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-12 19:08:24 -07:00
Pauli Nieminen
bf6c1b7470 meta: texture rectangle textures may not have mipmaps
Avoid an INVALID_OPERATION error when decompressing a rectangle texture.
Setting mipmap level limits on those textures is an error, and meta code
must not trigger it and mislead the user.

[v3/Kayden]: Resolve conflicts due to Eric picking a subset of Pauli's
original changes.

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-12 16:18:46 -07:00
Pauli Nieminen
b9daa83463 meta: Use sampler object for mipmap generation
Sampler objects are perfect for meta operations.  A sampler object
is a separate state object that shadows the sampling state in the texture
object.  With a sampler object, mipmap generation can keep the same
sampling state for all subsequent generation requests.

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-12 16:18:43 -07:00
Pauli Nieminen
ac4dc5e931 mesa/samplerobj: Avoid crash in sampler query if texture unit is disabled
Sampler queries are so far made only for enabled texture units.  But if
any code queried the sampler before checking the texture unit state, that
would result in a NULL dereference.

Making the inline helper easier to use with a NULL check makes a lot of sense,
because the compiler is likely to combine the checks for the current texture.
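
A NULL-checking helper of this kind could look roughly like the following. It is a sketch with hypothetical types and names, not the Mesa helper itself; it assumes a texture unit has an optional bound sampler object and an optional current texture.

```c
#include <stddef.h>

struct toy_sampler { int min_filter; };
struct toy_texture { struct toy_sampler sampler; };

struct toy_unit {
    struct toy_sampler *bound_sampler;  /* explicit sampler object, may be NULL */
    struct toy_texture *current;        /* NULL when the unit is disabled */
};

/* Return the sampler state in effect for a unit, or NULL if there is none. */
static inline const struct toy_sampler *
toy_get_sampler(const struct toy_unit *unit)
{
    if (unit->bound_sampler)
        return unit->bound_sampler;
    if (unit->current)
        return &unit->current->sampler;
    return NULL;  /* disabled unit: the caller must handle this */
}
```

Callers that have already tested unit->current let the compiler fold the second check away, which is the point made above.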

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-12 16:18:41 -07:00
Pauli Nieminen
5606bd574e mesa: Remove unnecessary parameters from CompressedTexImage
In tune with previous patches. Again there is duplication of information
in function parameters that is good to remove.

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-12 15:49:30 -07:00
Pauli Nieminen
c9a7dfcf92 mesa: Remove unnecessary parameters from AllocTextureImageBuffer
Size and format information is always stored in the gl_texture_image
structure.  It is therefore preferable to remove the duplicate information
from the parameters, which makes the interface easier to understand.

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-12 15:49:28 -07:00
Pauli Nieminen
c5af889180 mesa: Remove unnecessary parameters from TexImage
The gl_texture_image structure always holds the size and internal format
before the TexImage driver hook is called.  Passing the same information in
function parameters only duplicates it, making the interface harder to
understand.

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-12 15:49:13 -07:00
Tom Stellard
e98ace934e configure: Check xcb version when X11 pkgconfig exists
Commit 6882381a2e added a dependency on a
newer version of xcb, but the version check wasn't added in all the
necessary places.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-12 15:42:43 -07:00
Chí-Thanh Christopher Nguyễn
4c73282d2b gbm: Fix build without gallium_drm_loader
pipe_loader_drm_probe_fd only exists if HAVE_PIPE_LOADER_DRM is defined.
Patch improved as suggested by Vadim A. Misbakh-Soloviov.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=52962
2012-08-12 14:38:32 -07:00
Christian König
9f5ff5981c radeonsi: move drawing into new state handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-08-11 09:58:26 +02:00
Christian König
583c212115 radeonsi: move sync handling into new state handler
So we can remove all the old atom handling.

Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-08-11 09:58:26 +02:00
Christian König
303f4b7dcd radeonsi: separate and disable streamout for now
I have my doubts that this code still works on SI.

Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-08-11 09:58:26 +02:00
Christian König
696b6cf466 radeonsi: remove ps_partial_flush
Not needed any more.

Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-08-11 09:58:26 +02:00
Christian König
7acb194a7b radeonsi: remove r6xx_flush_and_inv atom
It is not used any more.

Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-08-11 09:58:25 +02:00
Christian König
708337e62e radeonsi: move init state to new state handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-08-11 09:58:25 +02:00
Christian König
862df0885a radeonsi: add support for PKT3 cmds to new state handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-08-11 09:58:25 +02:00
Christian König
ce40e4726c radeonsi: cleanup shader headers
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-08-11 09:58:25 +02:00
Chad Versace
996ff1c9bf Revert "mesa: Remove C++11 narrowing warnings"
This reverts commit 9f5a5d541d.

Fixes the following build error on GCC 4.2.3:
  cc1plus: error: unrecognized command line option "-Wno-narrowing"
The GCC Manual incorrectly stated that commit 9f5a5d54 would be safe for
old versions of GCC.

Reported-by: Andy Furniss <andyqos@ukfsn.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-10 14:05:14 -07:00
Brian Paul
16c702ef3b softpipe: fix softpipe_delete_fs_state() failed assertion
The var!=softpipe->fs_variant assertion was failing because we weren't
nulling the softpipe->fs_variant pointer when binding a new shader.
Since softpipe->fs_variant depends on the current fs, it's of no use
when a new FS is bound.
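
The invalidation described here can be sketched as below. The types and names are hypothetical and much simpler than softpipe's; the early return for re-binding the same shader is an assumption of this sketch.

```c
#include <stddef.h>

struct toy_variant { int key; };
struct toy_fs { int id; };

struct toy_context {
    struct toy_fs *fs;               /* currently bound fragment shader */
    struct toy_variant *fs_variant;  /* variant derived from fs, or NULL */
};

/* Bind a new fragment shader and drop the variant cached for the old one. */
static void toy_bind_fs(struct toy_context *ctx, struct toy_fs *fs)
{
    if (ctx->fs == fs)
        return;               /* nothing changed, the cached variant is still valid */

    ctx->fs = fs;
    ctx->fs_variant = NULL;   /* forces a fresh variant lookup at draw time */
}
```

Once the pointer is cleared on bind, deleting the old shader's variants can no longer find one of them still referenced as the current variant.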

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=53318

Note: This is a candidate for the 8.0 branch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-10 13:27:04 -06:00
Brian Paul
3487b93cc4 cso: rearrange some structure fields for consistency 2012-08-10 12:14:17 -06:00
Brian Paul
cf77c29e60 st/mesa: fix renderbuffer validation bug
After we attach a new renderbuffer in this function we need to make
sure Mesa's update_framebuffer() gets called.

Fixes crash in WebGL conformance/textures/texture-attachment-formats.html,
but the test still fails for other reasons.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=53316

Note: This is a candidate for the 8.0 branch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-10 11:49:36 -06:00
Chad Versace
9f5a5d541d mesa: Remove C++11 narrowing warnings
Add -Wno-narrowing to CXXFLAGS for gcc.

It is safe to add this flag even for versions of gcc that don't recognize
it.  From the GCC Manual [1]: "[GCC] allows the use of new -Wno- options
with old compilers".

This removes warnings of the form
    warning: narrowing conversion of X from 'int' to 'float' inside { } is
    ill-formed in C++11 [-Wnarrowing]
in ff_fragment_shader.cpp and gen6_blorp.cpp.  When building
i965, I observed no other difference in the build output.

[1] http://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-10 09:59:41 -07:00
Brian Paul
f7af4beae5 gallivm: fix crash in lp_sampler_static_state()
Fixes WebGL conformance/uniforms/uniform-default-values.html crash.

We need to check for the null view pointer before accessing view->texture.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=53317

Note: This is a candidate for the 8.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-08-10 09:45:25 -06:00
Brian Paul
9b04abe368 st/mesa: fix glCopyTexSubImage crash
Fixes a WebGL crash.  The dest texture image is at level 2 and is of
size 1x1 texel.  The st texture image is a stand-alone resource, not
a pointer into a complete mipmap.  So the resource has one level and
trying to write to level 2 blows up.

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=53314
and http://bugs.freedesktop.org/show_bug.cgi?id=53319

Note: This is a candidate for the 8.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-08-10 09:45:17 -06:00
Chad Versace
6cb9e99a75 intel: Always downsample in intel_miptree_map_multisample
Always downsample before mapping, even if the map mode contains
GL_MAP_INVALIDATE_RANGE_BIT. If we neglect to downsample when only
a subrect is mapped then the upsample in intel_miptree_unmap_multisample
may write garbage to the region outside the subrect.

(Eric gave my patch e88cfbb a conditional reviewed-by with the condition
that it always downsample before mapping. I forgot to make that change
before pushing the patch.)

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-09 15:21:02 -07:00
Eric Anholt
04a11b5f5e i965/gen6+: Add support for edge flags.
Fixes the 3 new piglit edgeflag tests.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=40707
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-09 09:07:50 -07:00
Eric Anholt
b3367f56d8 i965/vs: Convert EdgeFlagPointer values appropriately for the VS on gen4.
Fixes piglit gl-2.0/edgeflag.

NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-09 09:07:49 -07:00
Eric Anholt
3eb8d71225 i965/vs: Add comment noting copy_edgeflag state dependency.
It's already in the state struct.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-09 09:07:49 -07:00
Eric Anholt
e119f98472 i965/vs: Add support for copying user edge flags.
Fixes the glsl skinning demo regression since changing to the new GLSL
compiler, and is part of fixing piglit gl-2.0-edgeflag.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50079
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-09 09:07:49 -07:00
Olivier Galibert
7426d9d769 i965/fs: Fix the FS inputs setup when some SF outputs aren't used in the FS.
If there was an edge flag or a two-side-color pair present, we'd end up
mismatched and read values from earlier in the VUE for later FS inputs.

v2: Fix regression in gles2conform shaders generating point size. (change by
    anholt)

Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 8.0 branch.
2012-08-09 09:07:49 -07:00
Vinson Lee
3466538171 st/mesa: Initialize tgsi_texture_offset Padding field.
Fixes uninitialized scalar variable defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-08 22:36:27 -07:00
Kenneth Graunke
68bccc40f5 glx/dri: Initialize reset to __DRI_CTX_RESET_NO_NOTIFICATION.
If the application has requested reset notification, then
dri2_convert_glx_attribs will initialize this to the correct value.

Otherwise, it's supposed to initialize this to NO_NOTIFICATION, but
doesn't when num_attribs == 0.  (The consensus seems to be that we
should make it do so, but that's more invasive, so I'm pushing this for
now.)

Fixes a regression since a8724d85f8
where trying to run OilRush_x86 or apitrace heaven_x64 would result in:

dri_util.c:221: dri2CreateContextAttribs: Assertion `!"Should not get
here."' failed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53076
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2012-08-08 17:15:21 -07:00
Tapani Pälli
94f22fbe78 intel: use _mesa_meta_Clear with OpenGL ES 1.1 v2
Patch changes i915 and i965 drivers to use fixed function version of
meta clear when running on ES 1.1. This fixes rendering errors seen with
Google Maps, Angry Birds and Gallery3D on Android platform.

Change 88128516d4 exposes all extensions
internally to be available independent of GL flavour, therefore check
against ARB_fragment_shader does not work.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50333
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-08 17:15:21 -07:00
Kenneth Graunke
5deb1d1a1f i965: Rework the extra flushes surrounding occlusion queries.
This removes the CS stall on Ivybridge.

On Sandybridge, the depth stall needs to be preceded by a non-zero
post-sync op, which requires a CS stall, which needs a stall at
scoreboard.  Emit the full workaround.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-08 17:15:21 -07:00
Eric Anholt
b0adbda75a i965/vs: Protect pow(x,y) MOV of y on gen4 from other instruction flags.
I don't know if it was possible to trigger this bug -- we don't merge
saturates into the math instruction because we're bad at coalescing currently,
and there's nothing generating these with predicates.  Still, let's avoid
future bugs when we do smarter codegen.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-08 16:21:31 -07:00
Eric Anholt
9b4053cabd i965: Drop the confusing saturate argument to math instruction setup.
This was ridiculous.  We were ignoring the inst->header.saturate flag in the
case of math and only math.  On gen4, we would leave inst->header.saturate in
place if it happened to be set, which would end up being applied to the
implicit mov and thus trash the first argument.  On gen6, we would overwrite
inst->header.saturate with the saturate flag from the argument, which was not
set appropriately in brw_vec4_emit.cpp, and was only not a bug due to our
incompetence at coalescing saturate moves.

By ripping the argument out and making saturate work just like all the other
brw_eu_emit.c code generation, we can avoid both these classes of bugs.

Fixes piglit fog-modes, and the new specific fs-saturate-exp2 case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48628
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-08 16:21:30 -07:00
Eric Anholt
33dfdc735e i965: Make brw_set_saturate() use stdbool.
There was a chance for brw_wm_emit.c to screw up and pass (1 << 4) instead of
1, which would get converted to 0 when stored.  Instead, use stdbool which
converts nonzero to true/1 like we want.
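
The truncation is easy to reproduce in isolation. The sketch below uses a hypothetical one-bit field and setter names rather than the real brw instruction header.

```c
#include <stdbool.h>

struct toy_header {
    unsigned saturate:1;  /* one-bit flag, as in a packed instruction header */
};

/* Integer parameter: only the low bit of the argument survives the store. */
static void toy_set_saturate_uint(struct toy_header *h, unsigned value)
{
    h->saturate = value;
}

/* bool parameter: any nonzero argument is converted to 1 at the call. */
static void toy_set_saturate_bool(struct toy_header *h, bool value)
{
    h->saturate = value;
}
```

Passing (1 << 4) to the unsigned version stores 0, while the bool version stores 1.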
2012-08-08 16:21:30 -07:00
Eric Anholt
1b148e660e mesa: In conditional rendering fallback, check the query status.
Otherwise, conditional rendering always takes the fallthrough "render it
anyway" case unless the application had itself done a check or wait on the
query.
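
In outline, the fallback needs to poll the query itself before deciding. The following sketch models a no-wait style check with hypothetical types and names; the gpu_done and gpu_result fields stand in for whatever the driver would report.

```c
#include <stdbool.h>

struct toy_query {
    bool ready;           /* result has been fetched into 'result' */
    unsigned result;      /* samples passed, valid only when ready */
    bool gpu_done;        /* stand-in for "the hardware has finished" */
    unsigned gpu_result;  /* stand-in for the value the hardware wrote */
};

/* Stand-in for a driver hook that polls the query without blocking. */
static void toy_check_query(struct toy_query *q)
{
    if (q->gpu_done) {
        q->result = q->gpu_result;
        q->ready = true;
    }
}

/* Decide whether conditional rendering should draw. */
static bool toy_should_render(struct toy_query *q)
{
    if (!q->ready)
        toy_check_query(q);   /* the step that was missing */

    /* If the result is still unknown, render anyway. */
    return q->ready ? q->result > 0 : true;
}
```

Without the call to toy_check_query(), ready stays false unless the application polls the query itself, and every draw takes the render-anyway path.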

Fixes intel oglconform's conditional_render advanced.nofbo.readpixels.

Reviewed-by: Brian Paul <brianp@vmware.com>
NOTE: This is a candidate for the 8.0 branch.
2012-08-08 16:21:30 -07:00
Eric Anholt
4bbd120368 mesa: Fix glPopAttrib() behavior on GL_FRAMEBUFFER_SRGB.
I happened to notice this while looking at a blit pass in l4d2, which had an
optional push/pop around framebuffer srgb setting.  It didn't matter in the
end, but the fix is sitting in my tree now.

Reviewed-by: Brian Paul <brianp@vmware.com>
NOTE: This is a candidate for the 8.0 branch.
2012-08-08 16:21:30 -07:00
Ian Romanick
9f7b3d1713 Make shared-glapi the default
You can't practically have desktop OpenGL and OpenGL ES on the same system
without this.  The benefits of not having it (e.g., a more compact dispatch
table) are irrelevant.

v2: Don't mark shared-glapi as experimental.  Review suggestion by Chad.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-08 10:06:26 -07:00
Ian Romanick
5602f0f955 mesa/tests: Fix trivial typos in src/mapi/glapi tests
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-08 10:06:26 -07:00
Ian Romanick
45d3d0ad21 mesa/tests: Add tests for the generated shared-glapi dispatch table
These are largely based on the src/mapi/glapi/tests.  However,
shared-glapi provides less external visibility into the dispatch table,
so there is less to test.  Also, shared-glapi does not implement
_glapi_get_proc_name, so that test was removed.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-08 10:06:26 -07:00
Ian Romanick
d9f899bb93 glapi: Prevent accidental use of lies w/shared-glapi
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-08 10:06:26 -07:00
Ian Romanick
99fee476a1 glx: Don't use glapitable.h at all
When --enable-shared-glapi is used, all non-ABI entries in the table are
lies.  Avoiding the use of glapitable.h avoids the lies.  The only
entries used in this code are entries that are ABI.  For these, the ABI
offset can be used directly.

Since this code is in src/glx, it can't use src/mesa/main/dispatch.h to
get the pretty names for these offsets.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-08 10:06:26 -07:00
Ian Romanick
f5dffb7e36 glx: Don't rely on struct _glapi_table
When --enable-shared-glapi is used, all non-ABI entries in the table are
lies.  There are two completely separate code generation paths used to
assign dispatch offsets.  Neither has any clue about the other.
Unsurprisingly, they can't agree on what offsets to assign.

This adds a bunch of overhead to __glXNewIndirectAPI, but this function
is called at most once.

The test ExtensionNopDispatch was removed.  There was just no way to
make this test work with the information provided in shared-glapi.
Since indirect_glx.c uses _glapi_get_proc_offset now, it was also
impossible to make the tests work without shared-glapi.  So much pain.

This fixes indirect rendering with shared-glapi.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-08 10:06:26 -07:00
Ian Romanick
52d6df8aa7 mesa/tests: Don't build glapi tests with shared-glapi
This fixes 'make check' with --enable-shared-glapi.  This test cannot work
in that environment.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-08 10:06:25 -07:00
Kenneth Graunke
e45a9ce474 i965: Use 64-bit writes for occlusion queries.
The hardware seems to use the length of the PIPE_CONTROL command to
indicate whether the write is 64 bits or 32 bits.  Which makes sense
for immediate writes.

Daniel discovered this by writing a pattern into the query object bo
and noticing that the high 32 bits were left intact, even on those
pipe control writes that seemingly worked.
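
The detection technique can be mimicked on the CPU. In this sketch the two write functions are stand-ins for what the hardware does; the pattern value and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PATTERN 0xdeadbeefdeadbeefull

/* Stand-in for a 32-bit hardware write: the high half is not touched. */
static void toy_write32(uint64_t *slot, uint64_t value)
{
    *slot = (*slot & 0xffffffff00000000ull) | (value & 0xffffffffull);
}

/* Stand-in for a 64-bit hardware write: the whole slot is replaced. */
static void toy_write64(uint64_t *slot, uint64_t value)
{
    *slot = value;
}

/* True if the high half still holds the pattern, i.e. only 32 bits landed. */
static bool toy_high_half_untouched(uint64_t slot)
{
    return (slot >> 32) == (TOY_PATTERN >> 32);
}
```

The check gives a false positive only if a genuine 64-bit result happens to have the pattern in its high half, which is why an unlikely pattern is chosen.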

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-08 09:24:23 -07:00
Kenneth Graunke
20c09b82d0 i965: Refactor depth count write PIPE_CONTROLs into a helper function.
This consolidates the complexity in one place, which is important
because it's about to get even more complicated.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-08 09:24:21 -07:00
Kenneth Graunke
a2cdd5ada8 i965: Emit a CS stall before timestamp writes.
This implements one of the Sandybridge PIPE_CONTROL workarounds.  It
doesn't appear to be required for Ivybridge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-08 09:24:19 -07:00
Kenneth Graunke
c4c78c275a i965: Use 64-bit writes for timestamp queries.
The hardware seems to use the length of the PIPE_CONTROL command to
indicate whether the write is 64 bits or 32 bits.  Which makes sense
for immediate writes.

Daniel discovered this by writing a pattern into the query object bo
and noticing that the high 32 bits were left intact, even on those
pipe control writes that seemingly worked.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-08 09:24:16 -07:00
Kenneth Graunke
03f14664b6 i965: Refactor timestamp write PIPE_CONTROLs into a helper function.
This consolidates the complexity in one place, which is important
because it's about to get even more complicated.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-08 09:24:14 -07:00
Kenneth Graunke
61d0b9f52c intel: Make the length for PIPE_CONTROL explicit.
PIPE_CONTROL has variable length, depending upon generation and whether
we want to do 32-bit or 64-bit data writes.  Make it explicit, rather
than hiding a length of 4 in the #define for _3DSTATE_PIPE_CONTROL.

Generated by s/3DSTATE_PIPE_CONTROL/3DSTATE_PIPE_CONTROL | (4 - 2)/g.
This is equivalent since the #define used to have | 2 in it.  A grep
through the sources shows that all instances have been converted, so
it's safe to remove the | 2 from the #define.
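
The effect of the substitution can be shown with a small helper. This is a sketch: the macro name, the helper, and the opcode value are assumptions made for illustration; only the (length - 2) encoding comes from the text above.

```c
#include <stdint.h>

/* Assumed opcode bits for PIPE_CONTROL with the length field left at zero. */
#define TOY_3DSTATE_PIPE_CONTROL 0x7a000000u

/* Build the header dword; the length field holds the total dword count minus 2. */
static uint32_t toy_pipe_control_header(unsigned total_dwords)
{
    return TOY_3DSTATE_PIPE_CONTROL | (total_dwords - 2);
}
```

A 4-dword command gets the same header as the old #define with | 2 baked in, and a longer command, for example 5 dwords, can now be expressed without a second #define.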

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-08 09:23:57 -07:00
Brian Paul
ecac178aa2 swrast: add missing switch case for API_OPENGL_CORE
To silence compiler warning.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-08 09:39:36 -06:00
Brian Paul
b4d6502fcd gallivm: remove unused src_elem_type variable
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-08 09:39:36 -06:00
Brian Paul
f21669e9a2 svga: remove unused svga_shader::use_sm30 field, add comments
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-08 09:39:36 -06:00
Brian Paul
16a289195e svga: remove unused svga_winsys_handle type
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-08 09:39:36 -06:00
Michel Dänzer
82cd9c0fc2 radeonsi: If pixel shader compilation fails, use a dummy shader.
Otherwise we're likely to hang the GPU.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-08 15:33:38 +02:00
Christian König
be42a45e02 radeonsi: fix memory leak and/or segfaults
Fix a stupid typo that could lead to memory
leaks and/or segfaults.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-08 12:36:49 +02:00
Christian König
8c44e5a144 radeon/winsys: fix winsys VM handling
Move releasing the VM area after closing the bo handle.

This partially fixes: https://bugs.freedesktop.org/show_bug.cgi?id=45018

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-08 12:35:10 +02:00
Vinson Lee
7528e2104f translate: Fix typo in is_legal_int_format_combo.
Fixes same on both sides defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-07 22:34:28 -07:00
Marek Olšák
1ea263fccb r600g: remove unused parameters in texture functions 2012-08-07 23:39:52 +02:00
Eric Anholt
4a078516b6 i965: Enable uniform buffer objects on gen6+.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:52 -07:00
Eric Anholt
04871058eb i965/vs: Add support for loading uniform buffer variables as pull constants.
Unlike the FS side in the previous commit, this does variable indexing just
fine, using the same code as we used for other variable-indexed pull
constants.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:52 -07:00
Eric Anholt
90de96ff0d i965/fs: Add support for loading uniform buffer variables as pull constants.
Variable array indexing isn't finished, because the lowering pass
turns it all into conditional moves of constant index accesses so I
can't test it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:51 -07:00
Eric Anholt
bb020d09c3 i965/vs: Add a surface index to VS_OPCODE_PULL_CONSTANT instructions.
Similar to the previous commit for the fragment shader, now we have a buffer
index and an offset.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:51 -07:00
Eric Anholt
454dc83f66 i965/fs: Communicate the pull constant block read parameters through fs_regs.
I wanted to add the surface index as a variable value for UBO support,
and a reg seemed like the obvious way to go.  This exposes more of the
information to CSE, which we'll probably want to apply to pull
constant loads for UBOs eventually (you might access 4 floats in a
row, each of which would produce an oword block read of the same
block).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:51 -07:00
Eric Anholt
25d2bf3845 i965: Bind UBOs as surfaces like we do for pull constants.
v2: Comment fix, drop extraneous parens (review by Kenneth)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:51 -07:00
Eric Anholt
5bffbd7ba2 i965: Add an offset argument to constant buffer setup.
We'll use this for UBO surfaces.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:51 -07:00
Eric Anholt
5fc5b29a54 mesa: Add support for glUniformBlockBinding() in display lists.
Fixes piglit GL_ARB_uniform_buffer_object/dlist.

v2: Use the .ui fields instead of .i for type consistency (review by Brian
    Paul)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:51 -07:00
Eric Anholt
bfa046b5f2 mesa: Unbind uniform buffer bindings on glDeleteBuffers().
Fixes piglit GL_ARB_uniform_buffer_object/deletebuffers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:51 -07:00
Eric Anholt
1eb3c06ae8 mesa: Default to GL 3.1's limits on uniform blocks.
The ARB spec lets you get away with the default block counting against the
blocks for combined size limits.  The core spec says you need to be able to
support the maximum size of default block *and* the maximum size of each
uniform block.  I see no reason that any driver would have a problem with
that.

Fixes gl 3.1/minmax (with an associated fix to the test)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:51 -07:00
Eric Anholt
803262a5f5 glsl: Refuse to parse uniform block declarations when UBOs aren't available.
Fixes piglit
GL_ARB_uniform_buffer_object/compiler/extension-disabled-block.frag

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:51 -07:00
Eric Anholt
e45f1b11c0 glsl: Align GL_UNIFORM_BLOCK_DATA_SIZE according to std140 rules.
Fixes piglit GL_ARB_uniform_buffer_object/data-size test.
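
A rough sketch of the rounding, assuming the rule in question is that the reported block size is padded up to a multiple of 16 bytes (one vec4). The helper names are hypothetical and are not Mesa functions.

```c
#include <assert.h>

/* Round value up to the next multiple of alignment (alignment > 0). */
static unsigned align_up(unsigned value, unsigned alignment)
{
   assert(alignment > 0);
   return (value + alignment - 1) / alignment * alignment;
}

/* Assumed std140 rule: pad the end of the last member to a vec4 boundary. */
static unsigned std140_block_data_size(unsigned end_of_last_member)
{
   return align_up(end_of_last_member, 16);
}
```

Under that assumption, a block holding a single float would report 16 bytes rather than 4.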

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:51 -07:00
Eric Anholt
86e0045578 glsl: Only flag RowMajor on matrix-type variables.
We were only propagating it to the API when the variable was a matrix type,
but we were still tripping over it in lower_ubo_reference when it was set on a
vector.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:50 -07:00
Eric Anholt
ffb2d43059 glsl: Fix calculation of std140 offset alignment for mat2s.
We were getting the base offset of a vec2, not of a vec2[2] like the quoted
spec text says we should.
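
The difference between the two alignments can be sketched like this. It follows my reading of the std140 rules (a column-major matrix is laid out as an array of column vectors, and array elements are aligned to at least a vec4). The function names are hypothetical.

```c
#include <assert.h>

/* std140 base alignment, in bytes, of a float vector with 1..4 components. */
static unsigned std140_vec_align(unsigned components)
{
   assert(components >= 1 && components <= 4);
   if (components == 1)
      return 4;
   if (components == 2)
      return 8;
   return 16;   /* vec3 and vec4 both align to 16 bytes */
}

/* Array elements use the element alignment rounded up to a vec4. */
static unsigned std140_array_align(unsigned element_align)
{
   return element_align < 16 ? 16 : element_align;
}

/* A column-major float matrix is treated as an array of column vectors. */
static unsigned std140_matrix_align(unsigned rows)
{
   return std140_array_align(std140_vec_align(rows));
}
```

So a lone vec2 aligns to 8 bytes, while a mat2, treated as vec2[2], aligns to 16.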

v2: Fix swapped then/else cases.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:50 -07:00
Eric Anholt
300315fe69 glsl: Fix glGetActiveUniformsiv(GL_UNIFORM_BLOCK_INDEX).
Previously, we were returning the index into the UniformBlocks of one of the
linked shaders, when it's supposed to be the program global index.

Fixes piglit getactiveuniformsiv-uniform_block_index.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:50 -07:00
Eric Anholt
af3fc6bb28 ir_to_mesa: Don't whack the ->location field of uniform block variables.
Fixes some failures in GL_ARB_uniform_buffer_object/maxblocks.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:50 -07:00
Eric Anholt
56e82e30cb mesa: Make glBindBufferBase/glBindBufferRange() work on just-genned names.
In between glGenBuffers() and glBindBuffer(), the buffer object points to this
dummy buffer with a name of 0, and a glBindBufferBase() would point to that.
It seems pretty clear, given that glBindBufferBase() only cares about the
current size of the buffer at render time, that it should bind up the buffer
that you passed in instead of pointing it at this useless dummy buffer.

However, what should glBindBufferRange() do?  As of this patch, it will
promote the genned buffer to a proper buffer like it had been
glBindBuffer()ed, and then detect that the size is greater than the buffer's
current size of 0 and throw INVALID_VALUE.  It seems like the most reasonable
answer here.

Note that this also changes the behavior of these two on non-glGenBuffers() bo
names.  We haven't yet set up the error throwing for glBindBuffer() on gl
3.1+, and my assumption is that these two functions should inherit their
behavior on un-genned names from glBindBuffer().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 13:54:50 -07:00
Eric Anholt
a75f2681d2 glsl: Add a lowering pass to turn complicated UBO references to vector loads.
v2: Reduce the impenetrable code in emit_ubo_loads() by 23 lines by keeping
    the ir_variable as the variable part of the offset from handle_rvalue(),
    and track the constant offsets from that with a plain old integer value,
    avoiding a bunch of temporary variables in the array and struct handling.
    Also, fix file description doxygen.
v3: Fix a row vs col typo, and fix spelling in a comment.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-07 13:54:47 -07:00
Eric Anholt
8c2a983835 glsl: Add a variant of the rvalue visitor for handle_rvalue() on the way down.
For the UBO lowering pass, I want to see the whole dereference chain for
replacing, not the innermost ir_dereference_variable.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 11:47:49 -07:00
Eric Anholt
2ea3ab14f2 glsl: Add a "ubo_load" expression type for fetches from UBOs.
Drivers will probably want to be able to take UBO references in a
shader like:

        uniform ubo1 {
                float a;
                float b;
                float c;
                float d;
        };

        void main() {
             gl_FragColor = vec4(a, b, c, d);
        }

and generate a single aligned vec4 load out of the UBO.  For intel,
this involves recognizing the shared offset of the aligned loads and
CSEing them out.  Obviously that involves breaking things down to
loads from an offset from a particular UBO first.  Thus, the driver
doesn't want to see

	variable_ref(ir_variable("a")),

and even more so does it not want to see

	array_ref(record_ref(variable_ref(ir_variable("a")),
          "field1"), variable_ref(ir_variable("i"))).

where a.field1[i] is a row_major matrix.

Instead, we're going to make a lowering pass to break UBO references
down to expressions that are obvious to codegen, and amenable to
merging through CSE.
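
To illustrate why offset-based loads are easy to merge, here is a small sketch. It assumes the four floats in the example sit at byte offsets 0, 4, 8 and 12 of the block; the struct and function names are hypothetical and do not reflect the actual IR.

```c
#include <assert.h>

/* One lowered load: which UBO, and the byte offset of the scalar in it. */
struct ubo_scalar_load {
   unsigned block;
   unsigned byte_offset;
};

/* Start of the aligned vec4 (16 bytes) that contains this scalar. */
static unsigned vec4_base(const struct ubo_scalar_load *load)
{
   return load->byte_offset & ~15u;
}

/* Component (0..3) of that vec4 which holds the scalar. */
static unsigned vec4_component(const struct ubo_scalar_load *load)
{
   assert(load->byte_offset % 4 == 0);
   return (load->byte_offset & 15u) / 4;
}

/* Two loads can share one vec4 fetch if block and aligned base match. */
static int can_share_fetch(const struct ubo_scalar_load *a,
                           const struct ubo_scalar_load *b)
{
   return a->block == b->block && vec4_base(a) == vec4_base(b);
}
```

Once every reference is reduced to a (block, offset) pair, spotting the common fetch is a plain equality test, which is the kind of thing CSE handles well.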

v2: Fix some partial thoughts in the ir_binop comment (review by Kenneth)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 11:47:49 -07:00
Eric Anholt
71ba6de342 glsl: Fix a reference to UniformBlocks during uniform linking.
When converting var->location from pointing at the program's UniformBlocks to
pointing at the linked shader's UniformBlocks, I missed this change.  It
usually worked out in the end because the two lists happen to be the same in
many testcases.

Fixes a valgrind complaint on
oglconform ubo-compile.cpp advanced.std140.2stage

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 11:47:49 -07:00
Eric Anholt
7e42302e71 glsl: Update the notes on adding a new expression type.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 11:47:49 -07:00
Eric Anholt
9c1b41879a mesa: Replace VersionMajor/VersionMinor with a Version field.
As we get into supporting GL 3.x core, we come across more and more features
of the API that depend on the version number as opposed to just the extension
list.  This will let us more sanely do version checks than "(VersionMajor == 3
&& VersionMinor >= 2) || VersionMajor >= 4".
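
A minimal sketch of the two styles of check. It assumes the single field encodes the version as major * 10 + minor (so 3.2 becomes 32), which is what the "<= 30" note below suggests. The function names are hypothetical.

```c
#include <assert.h>

/* Old style: two fields, so ">= 3.2" needs a compound test. */
static int at_least_3_2_old(int major, int minor)
{
   return (major == 3 && minor >= 2) || major >= 4;
}

/* Assumed encoding of the single field: major * 10 + minor. */
static int make_version(int major, int minor)
{
   assert(minor >= 0 && minor < 10);
   return major * 10 + minor;
}

/* New style: one integer comparison. */
static int at_least_3_2_new(int version)
{
   return version >= 32;
}
```

Both checks agree for every valid major/minor pair, but the second is much harder to get wrong.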

v2: Fix a bad <= 30 check.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 11:47:19 -07:00
Eric Anholt
3aaeb3e5e7 intel: Fix compiler warnings from winsys msaa. 2012-08-07 11:47:11 -07:00
Chad Versace
e943e5c291 intel: Advertise multisample DRI2 configs on gen >= 6
This turns on window system MSAA.

This patch changes the id of many GLX visuals and configs, but that
couldn't be prevented. I attempted to preserve the id's of extant configs
by appending the multisample configs to the end of the extant ones. But
somewhere, perhaps in the X server, the configs are reordered with
multisample configs interspersed among the singlesample ones.

Test results:
  Tested with xonotic and `glxgears -samples 1` on Ivybridge.

  No piglit regressions on Ivybridge.

  On Sandybridge, passes 68/70 of oglconform's
  winsys multisample tests.  The two failing tests are:
      multisample(advanced.pixelmap.depth)
      multisample(advanced.pixelmap.depthCopyPixels)
  These tests hang the gpu (on kernel 3.4.6) due to
  a glDrawPixels/glReadPixels pair on an MSAA depth buffer.  I don't expect
  real-world apps to do that, so I'm not too concerned about the hang.

  On Ivybridge, passes 69/70. The failing case is
  multisample(advanced.line.changeWidth).

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-07 09:30:34 -07:00
Chad Versace
8b5d68dd28 intel: Clarify intel_screen_make_configs
This function felt sloppy, so this patch cleans it up a little bit.

- Rename `color` to `i`. It is not a color value, only an iterator int.
- Move `depth_bits[0] = 0` into the non-accum loop because that is where
  it is used. The accum loop later overwrites depth_bits[0].
- Rename `depth_factor` to `num_depth_stencil_bits`.
- Redefine `msaa_samples_array` as static const because it is never
  modified. Rename to `singlesample_samples`.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-07 09:30:33 -07:00
Chad Versace
a4bf68ca50 dri: Simplify use of driConcatConfigs
If either argument to driConcatConfigs(a, b) is null or the empty list,
then simply return the other argument as the resultant list.

All callers were accomplishing that same behavior anyway. And each caller
accomplished it with the same pattern. So this patch moves that external
pattern into the function.
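
The early-return pattern can be sketched on generic NULL-terminated pointer arrays. The names below are hypothetical stand-ins, not the real function or its types, and unlike a real implementation this sketch leaves ownership of the inputs with the caller.

```c
#include <assert.h>
#include <stdlib.h>

/* Count the entries of a NULL-terminated pointer array. */
static size_t list_len(void **list)
{
   size_t n = 0;
   assert(list != NULL);
   while (list[n] != NULL)
      n++;
   return n;
}

/* If either list is NULL or empty, return the other one unchanged;
 * otherwise build a new NULL-terminated array holding a then b. */
static void **concat_lists(void **a, void **b)
{
   if (a == NULL || a[0] == NULL)
      return b;
   if (b == NULL || b[0] == NULL)
      return a;

   size_t na = list_len(a), nb = list_len(b);
   void **all = malloc((na + nb + 1) * sizeof(*all));
   if (all == NULL)
      return NULL;

   for (size_t i = 0; i < na; i++)
      all[i] = a[i];
   for (size_t i = 0; i < nb; i++)
      all[na + i] = b[i];
   all[na + nb] = NULL;
   return all;
}
```

With the checks inside the helper, callers can concatenate unconditionally instead of each repeating the same NULL and empty-list tests.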

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-07 09:30:33 -07:00
Chad Versace
b2d428cb8d intel: Refactor creation of DRI2 configs
DRI2 configs were constructed in intelInitScreen2. That function already
does too much, so move verbatim the code for creating configs to a new
function, intel_screen_make_configs.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-07 09:30:33 -07:00
Chad Versace
61fd684782 intel: Downsample on DRI2 flush
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-07 09:30:33 -07:00
Chad Versace
e88cfbb95f intel: Support mapping multisample miptrees
Add two new functions: intel_miptree_{map,unmap}_multisample, to which
intel_miptree_{map,unmap} dispatch. Only mapping of flat, renderbuffer-like
miptrees is supported.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-07 09:30:33 -07:00
Chad Versace
4c0ccc13bd intel: Refactor use of intel_miptree_map
Move the opencoded construction and destruction of intel_miptree_map into
new functions, intel_miptree_attach_map and intel_miptree_release_map.
This patch prevents code duplication in a future commit that adds support
for mapping multisample miptrees.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-07 09:30:33 -07:00
Chad Versace
81980958d0 intel: Refactor intel_miptree_map/unmap
Move the body of intel_miptree_map into a new function,
intel_miptree_map_singlesample. Now intel_miptree_map dispatches to the
new function. A future commit adds a multisample variant.

Ditto for intel_miptree_unmap.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-07 09:30:33 -07:00
Chad Versace
6b56140b4b i965: Mark needed downsamples for msaa winsys buffers
Add function intel_renderbuffer_set_needs_downsample. It is a no-op
except on multisample winsys buffers shared with DRI2.

Mark the needed downsamples with the new function at two locations:
    - Immediately after drawing is complete.
    - After blitting.
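
A reduced sketch of the marking scheme. The field names singlesample_mt and needs_downsample are the ones this series adds to the miptree; everything else (the struct, the function names, the counter) is hypothetical and only stands in for the real driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for a miptree. */
struct miptree {
   struct miptree *singlesample_mt; /* resolved copy, or NULL if none */
   bool needs_downsample;           /* multisample data is newer */
   int downsample_count;            /* bookkeeping for this sketch only */
};

/* After drawing or blitting: a no-op unless a resolved copy exists. */
static void mark_needs_downsample(struct miptree *mt)
{
   assert(mt != NULL);
   if (mt->singlesample_mt != NULL)
      mt->needs_downsample = true;
}

/* On flush: resolve only if something was rendered since the last one. */
static void downsample_if_needed(struct miptree *mt)
{
   assert(mt != NULL);
   if (mt->singlesample_mt == NULL || !mt->needs_downsample)
      return;
   mt->downsample_count++;   /* a real driver would blit here */
   mt->needs_downsample = false;
}
```

The point of the flag is that several draws between two flushes cost a single downsample, and buffers without a single-sample copy never pay for one.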

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-07 09:30:33 -07:00
Chad Versace
d3746354fb intel: Define functions for up/downsampling on miptrees
Flesh out the stub functions intel_miptree_{up,down}sample.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-07 09:30:33 -07:00
Chad Versace
6cc9df331b i965: Add function brw_blorp_blit_miptrees
Define a function, brw_blorp_blit_miptrees, that simply wraps
brw_blorp_blit_params + brw_blorp_exec with C calling conventions. This
enables intel_miptree.c, in a following commit, to perform blits with
blorp for the purpose of downsampling multisample miptrees.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-07 09:30:33 -07:00
Chad Versace
f4873babdc intel: Allocate miptree for multisample DRI2 buffers
Immediately after obtaining, with DRI2GetBuffersWithFormat, the DRM buffer
handle for a DRI2 buffer, we wrap that DRM buffer handle with a region and
a miptree. This patch additionally allocates an accompanying multisample
miptree if the DRI2 buffer is multisampled.

Since we do not yet advertise multisample GL configs, the code for
allocating the multisample miptree is currently inactive.

This patch adds the following fields to intel_mipmap_tree:
    singlesample_mt
    needs_downsample
and the following function stubs:
    intel_miptree_downsample
    intel_miptree_upsample

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-07 09:30:33 -07:00
Chad Versace
4eba67285f intel: Refactor creation of hiz and mcs miptrees
Move the logic for creating the ancillary hiz and mcs miptrees for winsys
and non-texture renderbuffers from intel_alloc_renderbuffer_storage to
intel_miptree_create_for_renderbuffer. Let's try to isolate complex
miptree logic to intel_mipmap_tree.c.

Without this refactor, code duplication would be required along the
intel_process_dri2_buffer codepath in order to create the mcs miptree.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-07 09:30:33 -07:00
Chad Versace
e2f2376e88 intel: Set num samples for winsys renderbuffers
Add a new param, num_samples, to intel_create_renderbuffer and
intel_create_private_renderbuffer.

No multisample GL config is yet advertised, so the value of num_samples is
currently 0.  For server-owned winsys buffers, gl_renderbuffer::NumSamples
is not yet used.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com> (v1)
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-07 09:30:33 -07:00
Chad Versace
53fa28f7b1 intel: Refactor quantize_num_samples
Rename quantize_num_samples to intel_quantize_num_samples and change the
first param from struct intel_context* to struct intel_screen*. The
function will later be used by intelCreateBuffer, which is not bound to
any context but is bound to a screen.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com> (v1)
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-07 09:30:33 -07:00
Chad Versace
7a2e40ed28 intel: Update stale comment for intel_miptree_slice::map
The comment referred to intel_tex_image_map/unmap, but should more
accurately refer to intel_miptree_map/unmap.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-07 09:30:33 -07:00
Paulo Zanoni
4b40375c43 i965: add more Haswell PCI IDs
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-07 11:13:47 -03:00
Brian Paul
8433f80add egl: remove redundant PFNEGLQUERYSTREAMTIMEKHRPROC typedef
This typedef is present earlier in the header and isn't part of the
EGL_KHR_stream_cross_process_fd extension.  Looks like a Khronos glitch.
2012-08-07 07:31:05 -06:00
Brian Paul
99695f58fd softpipe: fix loop limit for tex_cache[] array
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=53199
2012-08-07 08:00:46 -06:00
Vinson Lee
7d65356d8a st/mesa: Fix a potential memory leak in get_mesa_program.
Fixes resource leak defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-06 22:08:56 -07:00
Vinson Lee
c3894bc2d5 gallivm: Add constructor for raw_debug_ostream.
Fixes uninitialized scalar field defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-06 22:07:31 -07:00
Brian Paul
e622723918 docs: update ARB_debug_output status to DONE 2012-08-06 16:48:00 -06:00
Jason Wood
56c1f55c51 docs: Add OpenGL 4.3 requirements
v2: Note that GLSL 4.3 has not been started, and that
ARB_compute_shader has been started in Gallium drivers.

Signed-off-by: Jason Wood <sandain@hotmail.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-08-06 16:41:24 -06:00
Ian Romanick
45e592c3dd egl: Import eglext.h version 14
This is necessary for EGL_KHR_create_context work (including writing
piglit tests).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-06 15:37:04 -07:00
Ian Romanick
b50703aea5 egl: Replace KHR_surfaceless_* extensions with KHR_surfaceless_context
The KHR extension name is reserved for Khronos ratified extensions, and there is
no such thing as EGL_KHR_surfaceless_{gles1,gles2,opengl}.  Replace these
three extensions with EGL_KHR_surfaceless_context since that extension
actually exists.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-06 15:37:04 -07:00
Ian Romanick
cb77f5dd1f egl_dri2: Refactor dereference of dri2_ctx_shared
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-06 15:37:04 -07:00
Ian Romanick
05413ddb1d egl_dri2: Remove swrast version >= 2 checks
Since support for swrast version 2 was added (f55d027a), it has also been
required.  In swrast_driver_extensions, version 2 is set for __DRI_SWRAST
extension.  Remove the spurious version checks sprinkled through the code.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-06 15:37:04 -07:00
Ian Romanick
63adb6b9ea dri2: Fix bug in attribute handling for non-desktop OpenGL contexts
Previously an error would be generated if any attributes were specified when
creating a non-desktop OpenGL context.  This was a mistake, and it will
prevent old drivers from working with new EGL libraries that add support for
the createContextAttribs interface.  Instead, match the behavior of
EGL_KHR_create_context: allow versions that make sense, reject non-zero flags.

NOTE: This is a candidate for the 8.0 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-06 15:37:04 -07:00
Andreas Boll
102617bc52 docs: update piglit url
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-08-06 16:23:43 -06:00
Andreas Boll
933e13e2af docs/helpwanted: add r600g and i915g todo lists
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-08-06 16:23:43 -06:00
Kenneth Graunke
caa4ae5d7d i965: Allocate dummy slots for point sprites before computing VUE map.
Commit f0cecd43d6 moved the VUE map computation to be only once, at
VS compile time.  However, it did so in slightly the wrong place: it
made the one call to brw_vue_compute_map happen right before the
allocation of dummy slots for replaced point sprite coordinates, causing
a different VUE map to be generated (at least on Ironlake).

Fixes a regression in Piglit's point-sprite test on Ironlake.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46489
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-06 11:16:40 -07:00
Kenneth Graunke
54c045b93c i965/vs: Don't clobber sampler message MRFs with subexpressions.
See the preceding commit for a description of the problem.

NOTE: This is a candidate for stable release branches.

v2: Use a separate dPdx variable rather than reusing the lod src_reg.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52129
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-06 11:16:15 -07:00
Kenneth Graunke
c0f60106df i965/fs: Don't clobber sampler message MRFs with subexpressions.
Consider a texture call such as:

   textureLod(s, coordinate, log2(...))

First, we begin setting up the sampler message by loading the texture
coordinates into MRFs, starting with m2.  Then, we realize we need the
LOD, and go to compute it with:

   ir->lod_info.lod->accept(this);

On Gen4-5, this will generate a SEND instruction to compute log2(),
loading the operand into m2, and clobbering our texcoord.

Similar issues exist on Gen6+.  For example, nested texture calls:

  textureLod(s1, c1, texture(s2, c2).x)

Any texturing call where evaluating the subexpression trees for LOD or
shadow comparator would generate SEND instructions could potentially
break.  In some cases (like register spilling), we get lucky and avoid
the issue by using non-overlapping MRF regions.  But we shouldn't count
on that.
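
The ordering hazard can be shown with a toy model. This is not driver code: the register file, the "math send" and the setup functions are all hypothetical, and evaluating the subexpression into a temporary first is just one way to avoid the clobber, not necessarily the exact change made here.

```c
#include <assert.h>

#define NUM_MRF 16
static float mrf[NUM_MRF];   /* toy model of the message register file */

/* Bounds-checked write to a message register. */
static void mrf_write(unsigned reg, float value)
{
   assert(reg < NUM_MRF);
   mrf[reg] = value;
}

/* Toy math message: stages its operand in m2, like the log2() case. */
static float send_math_double(float x)
{
   mrf_write(2, x);          /* clobbers whatever was in m2 */
   return mrf[2] * 2.0f;
}

/* Buggy order: the texcoord is loaded into m2, then the LOD is computed. */
static float setup_buggy(float texcoord, float lod_operand)
{
   mrf_write(2, texcoord);
   mrf_write(3, send_math_double(lod_operand));
   return mrf[2];            /* coordinate the sampler would see */
}

/* Safer order: compute the LOD into a temporary before touching MRFs. */
static float setup_fixed(float texcoord, float lod_operand)
{
   float lod = send_math_double(lod_operand);
   mrf_write(2, texcoord);
   mrf_write(3, lod);
   return mrf[2];
}
```

In the buggy order the sampler sees the math operand instead of the coordinate, while the second order leaves m2 intact.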

Fixes four Piglit test regressions on Gen4-5:
- glsl-fs-shadow2DGradARB-{01,04,07,cumulative}

NOTE: This is a candidate for stable release branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52129
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-06 11:16:11 -07:00
Kenneth Graunke
27bf9c1997 i965/fs: Factor out texcoord setup into a helper function.
With the textureRect support and GL_CLAMP workarounds, it's grown
sufficiently that it deserves its own function.  Separating it out
makes the original function much more readable.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-06 11:16:09 -07:00
Kenneth Graunke
82bfb4b41a i965/fs: Move message header and texture offset setup to generate_tex().
Setting the texture offset bits in the message header involves very
specific hardware register descriptions.  As such, I feel it's better
suited for the lower level "generate" layer that has direct access to
the weird register layouts, rather than at the fs_inst abstraction layer.

This also parallels the approach I took in the VS backend.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-06 11:16:00 -07:00
Jerome Glisse
2df399c34b r600g: atomize sampler state v2
Use an atom for sampler state. Does not provide new functionality
or fix any bug. Just a step toward a fully atom-based r600g.

v2: Split seamless on r6xx/r7xx into its own atom. Make sure it's
    emitted after sampler and with a pipeline flush before; otherwise
    it does not take effect.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-08-06 12:04:55 -04:00
Alex Deucher
d3f8000bfc radeonsi: add some new pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-06 10:55:41 -04:00
Alex Deucher
a6146d2566 r600g: add additional evergreen pci ids
Note: this is a candidate for the stable branches.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-06 10:55:41 -04:00
Brian Paul
8eeeef3705 st/mesa: merge fragment/vertex sampler update code
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-06 08:50:20 -06:00
Brian Paul
819e786339 st/mesa: massage update_vertex_samplers() code
...to look like update_fragment_samplers() code, as with the previous
commit.  The next step would be to merge the two functions.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-06 08:50:19 -06:00
Brian Paul
2aac0d145a st/mesa: merge fragment/vertex texture update code
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-06 08:50:11 -06:00
Brian Paul
dd6aafcf72 st/mesa: massage the update_vertex_textures() code
...to look like update_fragment_textures() code.  The next step would
be to merge the two functions.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-06 08:41:07 -06:00
Brian Paul
5749ae919e st/mesa: rename some vertex/fragment state fields for better consistency
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-06 08:41:07 -06:00
Brian Paul
29604441de llvmpipe: consolidate the sampler and sampler view setting code
Less code.  And as with softpipe, if/when we consolidate the pipe_context
functions for binding sampler state, this will make the llvmpipe changes
trivial.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-06 08:33:17 -06:00
Brian Paul
b3538d3563 llvmpipe: combine vertex/fragment sampler state into an array
This will allow code consolidation in the next patch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-06 08:33:17 -06:00
Brian Paul
1f34e1a6cb softpipe: consolidate vert/frag/geom sampler setting functions
The functions for setting samplers and sampler views for vertex,
fragment and geometry shaders were nearly identical.  Now they
use shared code.

In the future, if the pipe_context functions for setting samplers
and sampler views for vert/frag/geom/compute are combined, this
will make updating the softpipe driver a snap.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-06 08:33:17 -06:00
Brian Paul
d6c3e6d8f3 softpipe: consolidate sampler-related arrays
Combine separate arrays for vertex/fragment/geometry samplers, etc into
one array indexed by PIPE_SHADER_x.

This allows us to collapse separate code for vertex/fragment/geometry
state into loops over the shader stage.  More to come.
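
A sketch of the consolidation, under the assumption that it looks roughly like the following. The enum stands in for the PIPE_SHADER_x values, and the struct and function names are hypothetical rather than the driver's own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stage enum standing in for the PIPE_SHADER_x values. */
enum shader_stage { STAGE_VERTEX, STAGE_FRAGMENT, STAGE_GEOMETRY, STAGE_COUNT };

#define MAX_SAMPLERS 16

/* One table indexed by stage replaces three separate per-stage arrays. */
struct sampler_state_table {
   const void *samplers[STAGE_COUNT][MAX_SAMPLERS];
   unsigned num_samplers[STAGE_COUNT];
};

/* A single setter serves every stage. */
static void set_samplers(struct sampler_state_table *t, enum shader_stage stage,
                         unsigned count, const void *const *states)
{
   assert(stage < STAGE_COUNT && count <= MAX_SAMPLERS);
   for (unsigned i = 0; i < count; i++)
      t->samplers[stage][i] = states[i];
   for (unsigned i = count; i < MAX_SAMPLERS; i++)
      t->samplers[stage][i] = NULL;
   t->num_samplers[stage] = count;
}

/* Per-stage code collapses into a loop over the stages. */
static unsigned total_samplers(const struct sampler_state_table *t)
{
   unsigned total = 0;
   for (int s = 0; s < STAGE_COUNT; s++)
      total += t->num_samplers[s];
   return total;
}
```

Adding another stage then means adding an enum value rather than another copy of each array and function.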

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-06 08:33:17 -06:00
Brian Paul
0a14e9f09f softpipe: combine vert/frag/geom texture caches in an array
This lets us consolidate some code now, and more in subsequent patches.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-08-06 08:33:17 -06:00
Vinson Lee
61b62c007a mesa: Fix off-by-one error in Parse_TextureImageId.
Fixes out-of-bounds write defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-04 21:42:23 -07:00
Vinson Lee
3e7b3a04bf util: Move dereference after null check in util_resource_copy_region.
Fixes dereference before null check defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-04 21:41:27 -07:00
Brian Paul
a5ca29100b i915g: silence a const pointer warning 2012-08-04 08:38:11 -06:00
Marek Olšák
f9a498d1bc radeonsi: fix build failure after blitter changes 2012-08-04 16:34:24 +02:00
Marek Olšák
cb922b63eb r600g: precompute color buffer state in pipe_surface and reuse it 2012-08-04 14:05:52 +02:00
Marek Olšák
cdc681c3ad r600g: precompute depth buffer state in pipe_surface and reuse it
This is done on-demand, because we don't know in advance if a zbuffer
will be bound as depth or color.
2012-08-04 14:05:51 +02:00
Marek Olšák
e6dfc8c77b r600g: simplify create_surface 2012-08-04 14:05:51 +02:00
Marek Olšák
581f7e3101 r600g: drop the old texture allocation code
Made obsolete by the libdrm surface allocator.
2012-08-04 14:05:51 +02:00
Marek Olšák
7c371f4695 r600g: make sure copying of all texture formats is accelerated 2012-08-04 14:05:51 +02:00
Marek Olšák
84645fa613 gallium/u_blitter: add a query for checking whether copying is supported
v2: add comments
2012-08-04 14:05:37 +02:00
Marek Olšák
e2f623f1d6 r600g: don't decompress depth or stencil if there isn't any 2012-08-04 13:53:07 +02:00
Marek Olšák
ea72351a91 r600g: correct texture memory size for Z32F_S8X24 on evergreen 2012-08-04 13:53:07 +02:00
Marek Olšák
c8ff737a18 gallium/u_blitter: remove fallback for stencil copy that all drivers skipped
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-04 13:53:07 +02:00
Marek Olšák
ef1bf6d69e gallium/u_blitter: add ability to blit only depth or only stencil
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-04 13:53:07 +02:00
Marek Olšák
8842678047 gallium: define PIPE_MASK_RGBAZS
I need this and it seems like it could be useful.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-04 13:53:07 +02:00
Marek Olšák
8aaf6972d1 gallium/u_blitter: minor cleanup
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-04 13:53:07 +02:00
Marek Olšák
67a3e5bc32 gallium/tgsi: fixup texture name strings
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-04 13:53:07 +02:00
Marek Olšák
6c420b1668 gallium/u_blitter: set sample mask to ~0
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-04 13:53:07 +02:00
Marek Olšák
9d1ef354f9 gallium/u_blit: bail out if src is a multisample texture
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-04 13:53:07 +02:00
Marek Olšák
6b3f1ae12b gallium/u_blit: check nr_samples before using resource_copy_region
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-04 13:53:07 +02:00
Marek Olšák
e7689303a8 gallium: set sample mask to ~0 for clear, blit and gen_mipmap
The sample mask affects single-sampled rendering too (it's orthogonal
to the color mask).
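
As a rough illustration of why a stale sample mask can break single-sampled operations, the sketch below models a sample as written only when it is both covered and enabled in the mask. The function name and the 32-bit mask width are assumptions made for this example, not Gallium definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: a sample is written only if it is covered by the
 * primitive AND its bit is set in the sample mask. */
static bool sample_written(uint32_t coverage, uint32_t sample_mask,
                           unsigned sample)
{
   assert(sample < 32);
   return ((coverage & sample_mask) >> sample) & 1u;
}
```

With a single-sampled target only sample 0 exists, so a leftover mask with bit 0 cleared would discard every pixel; setting the mask to ~0 avoids that.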

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-04 13:53:06 +02:00
Dave Airlie
cd97a5f660 r600g: fix F2U opcode translation
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2012-08-04 13:45:27 +02:00
Vinson Lee
5bce0b5175 draw: Ensure channel in convert_to_soa is initialized.
Fixes uninitialized pointer read defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-03 22:28:31 -07:00
Vinson Lee
9d36b3abfd u_blitter: Move a pointer dereference after null check.
Fixes dereference before null check defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-08-03 22:27:13 -07:00
Matt Turner
fb85558ab1 Use C99 NAN and INFINITY macros 2012-08-03 15:02:09 -07:00
Brian Paul
65da837fcf gallium/tests/trivial: updates for CSO interface changes 2012-08-03 11:58:43 -06:00
Brian Paul
c61d3fe8bd st/xorg: updates for CSO interface changes 2012-08-03 11:56:36 -06:00
Brian Paul
459dd56897 st/xa: updates for CSO interface changes 2012-08-03 11:56:28 -06:00
Brian Paul
3d1bec5d9a vega: fix build breakage from cso sampler/view changes 2012-08-03 08:33:23 -06:00
Brian Paul
832706a80b cso: remove unreachable break statements 2012-08-03 07:16:35 -06:00
Brian Paul
076e5eacf1 cso: 80-column wrapping, remove trailing whitespace, etc 2012-08-03 07:16:35 -06:00
Brian Paul
ea6f035ae9 gallium: consolidate CSO sampler and sampler_view functions
Merge the vertex/fragment versions of the cso_set/save/restore_samplers()
functions.  Now we pass the shader stage (PIPE_SHADER_x) to the function
to indicate vertex/fragment/geometry samplers.  For example:

cso_single_sampler(cso, PIPE_SHADER_FRAGMENT, unit, sampler);

This results in quite a bit of code reduction, fewer CSO functions and
support for geometry shaders.
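
A minimal sketch of the consolidation idea, assuming a per-stage slot table; every `demo_` / `DEMO_` name below is a hypothetical stand-in and not the real CSO interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real Gallium definitions. */
enum demo_shader_stage {
   DEMO_VERTEX,
   DEMO_FRAGMENT,
   DEMO_GEOMETRY,
   DEMO_STAGES
};

#define DEMO_MAX_SAMPLERS 16

struct demo_cso {
   const void *samplers[DEMO_STAGES][DEMO_MAX_SAMPLERS];
};

/* One function serves every stage; the stage selects the slot array. */
static void demo_single_sampler(struct demo_cso *cso,
                                enum demo_shader_stage stage,
                                unsigned unit, const void *sampler)
{
   assert(stage < DEMO_STAGES && unit < DEMO_MAX_SAMPLERS);
   cso->samplers[stage][unit] = sampler;
}
```

Indexing by stage is what lets a new stage, such as geometry, be added as one more enum value instead of another family of functions.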

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-08-03 07:16:35 -06:00
Vinson Lee
350f12fb65 st/mesa: Ensure dst in compile_instruction is initialized.
Fixes uninitialized scalar variable defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-02 21:10:49 -07:00
Tom Stellard
f6ad8b45c2 radeon/llvm: Add $(LLVM_LDFLAGS) to the loader linker flags 2012-08-02 20:12:11 +00:00
Tom Stellard
4a89a20717 radeon/llvm: Add support for more f32 CMP instructions on SI 2012-08-02 20:12:11 +00:00
Tom Stellard
a35eea7868 radeon/llvm: Add support for fneg on SI 2012-08-02 20:12:10 +00:00
Tom Stellard
4104bae063 radeon/llvm: Add support for fp_to_sint on SI 2012-08-02 20:12:10 +00:00
Tom Stellard
f7fcaa07df radeon/llvm: Remove CMOVLOG DAG node 2012-08-02 20:12:06 +00:00
Tom Stellard
a5ac8ee2c5 radeonsi: Properly initialize si_shader_ctx.radeon_bld 2012-08-02 13:21:30 -04:00
Michel Dänzer
c2bae6b91d radeonsi: Handle TGSI TXP opcode.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-02 18:38:47 +02:00
Michel Dänzer
93b4f1f97e radeonsi: Handle TGSI DIV opcode.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-02 18:38:16 +02:00
Brian Paul
daf4254d07 svga: remove questionable INLINE qualifiers 2012-08-02 09:40:41 -06:00
Brian Paul
421f134028 svga: sort #includes 2012-08-02 09:40:40 -06:00
Brian Paul
81f2f3f65c svga: add some comments in svga_screen_cache.c 2012-08-02 09:40:40 -06:00
Brian Paul
4b5a5898b1 svga: whitespace, formatting fixes 2012-08-02 09:40:40 -06:00
Brian Paul
bcd8d9713d svga: remove unneeded 'struct svga_screen' declarations 2012-08-02 09:40:40 -06:00
Brian Paul
8551635242 mesa: fix default_access_mode() result for ES2
The GL_OES_mapbuffer extension is supported by OpenGL ES 1 and ES 2 so return
GL_MAP_WRITE_BIT for both ES versions, not just ES 1.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-02 09:40:40 -06:00
Brian Paul
3eb2b5c5e4 mesa: default_access_mode() returns a GLbitfield, not GLenum 2012-08-02 09:40:40 -06:00
José Fonseca
4bd36956f8 scons: set YACCHXXFILESUFFIX to stop needless rebuilding of the parser
Before, the GLSL parser was getting rebuilt every time that scons was
run.  The problem was scons was expecting a glsl_parser.hpp file but
we were generating a glsl_parser.h file.

Signed-off-by: Brian Paul <brianp@vmware.com>
2012-08-02 09:40:40 -06:00
Christian König
41625afa2f radeonsi: initial VDPAU target
Windowed speed is of course way too slow, but fullscreen
works like a charm now.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-08-02 15:15:23 +02:00
Christian König
a3c6607be1 radeon/llvm: fix fp immediates on SI
I don't know if this is a good idea, but it
fixes the problem at hand.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-02 15:15:00 +02:00
Christian König
250b7fdd26 radeonsi: fix TEX writemask
Using the writemask in the sampler results in packed
VGPRs. For now just sample all components and let
llvm choose the right one.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-08-02 12:05:33 +02:00
Christian König
3508815d17 radeonsi: fix shader param and color count
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-08-02 11:22:57 +02:00
Christian König
92b96a883f radeonsi: fix texture loads from sampler > 0
The backend is multiplying the offset by the number of
elements anyway, so doing it twice just makes everything
crash.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-08-02 11:22:52 +02:00
Christian König
9b7dc5e81c radeonsi: disable tiling until we fixed all bugs
Currently there are more important things to worry about.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-08-02 11:22:40 +02:00
Vinson Lee
8734584952 scons: Add support for Intel Compiler.
The patch makes the SCons build with Intel Compiler successful.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-01 21:28:47 -07:00
Pauli Nieminen
204bfb904b meta: Use sampler object in framebuffer blit
Framebuffer blit needs to setup texture sampling with no reference to the
user's texturing state, and a sampler object lets us avoid a bunch of changes
to the user's state setup.

We don't bother caching the sampler object since we're changing parameters in
it based on the filtering option to glBlitFramebuffer().

Fixes piglit GL_ARB_sampler_objects/framebufferblit and rendering in l4d2 (our
setting of srgb decode wasn't being respected due to the user's sampler object
being active).

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-01 15:57:12 -07:00
Pauli Nieminen
676a563d5b meta: Add sampler object to texture decompression
Sampler objects can be used to shadow texture object state without
modifying the original application state. The decompression path feels a bit
like a path where caching shouldn't happen, but as everything else is
cached already I decided to cache the sampler state too.

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-01 15:57:12 -07:00
Pauli Nieminen
5a320d5bcf mesa: Allow meta module to call sampler functions
To allow the meta module to use sampler objects, the Mesa GL functions need
to be visible and linkable for the meta module.

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-01 15:57:12 -07:00
Pauli Nieminen
cbdc1d5354 swrast: Support sampler object for texture fetching state
swrast needs to pass the sampler object into all texture fetching functions
to use the correct sampling state when a sampler object is bound to the unit.
The changes were made using a semi-manual regular expression replace.

v2: Fix NULL deref in _swrast_choose_triangle(), because the _Current
    values aren't set yet, so we need to look at our texObj2D. (anholt)

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-01 15:55:51 -07:00
Pauli Nieminen
8129dabb5f mesa: Make ARB_sampler_objects mandatory
To allow meta acceleration operations to use sampler objects the
ARB_sampler_objects extension needs to be mandatory for all drivers.
Because the extension doesn't have any hardware dependencies it is
trivial to implement.

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-01 15:31:17 -07:00
Pauli Nieminen
ae58f9696c mesa/program: Use sampler object state if present
CompareFailValue is part of Sampler state that needs to be read from
bound sampler object if present.

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-01 15:31:17 -07:00
Pauli Nieminen
cae7636852 mesa/ff_shader: Fix sampler state reading
The fixed-function fragment shader generator was incorrectly reading texture
sampling state directly from the texture object. To make sure that
ARB_sampler_objects works correctly, the shader generator has to use the
bound sampler if one exists.

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-01 15:31:17 -07:00
Pauli Nieminen
6f6bd8aedc radeon&r200: Add support for ARB_sampler_objects
Preparation for the mandatory support of ARB_sampler_objects. I have tested
this patch with rv280 only.

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-01 15:31:16 -07:00
Pauli Nieminen
10169e7adc radeon: Fix printf format not to warn in 64bit
When I build-tested the radeon changes I noticed two warnings about a format
size mismatch on 64-bit. I decided to clean them up to make relevant
compiler warnings easier to spot.

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-08-01 15:31:16 -07:00
Pauli Nieminen
54808e560f nouveau: Add support for ARB_sampler_objects
ARB_sampler_objects is a very simple, software-only extension to support.  I want
to make it a mandatory extension for Mesa drivers to allow the meta module to
use it.

This patch adds support for the extension to nouveau. It is a completely untested
search-and-replace patch, except for flagging the texture state as needing to
be recomputed when a sampler object is present.

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
2012-08-01 15:31:16 -07:00
Pauli Nieminen
765509903b mesa/samplerobj: Support EXT_texture_sRGB_decode
sRGBDecode state is part of sampler object state but mesa was missing
handlers to access the state. This patch adds the support for required
state changes and queries.

GL_EXT_texture_sRGB_decode issue 4:
"4) Should we add forward-looking support for ARB_sampler_objects?

        RESOLVED: YES

        If ARB_sampler_objects exists in the implementation, the sampler
        objects should also include this parameter per sampler."

Fixes piglit GL_ARB_sampler_objects/GL_EXT_texture_sRGB_decode.

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-01 15:31:16 -07:00
Pauli Nieminen
c37efbfe4c mesa: Move DepthMode to texture object
GL_DEPTH_TEXTURE_MODE isn't meant to be part of sampler state based on
compatibility profile specifications.

OpenGL specification 4.1 compatibility 20100725 3.9.2:
"... The values accepted in the pname parameter
are TEXTURE_WRAP_S, TEXTURE_WRAP_T, TEXTURE_WRAP_R, TEXTURE_MIN_FILTER,
TEXTURE_MAG_FILTER, TEXTURE_BORDER_COLOR, TEXTURE_MIN_LOD,
TEXTURE_MAX_LOD, TEXTURE_LOD_BIAS, TEXTURE_COMPARE_MODE, and
TEXTURE_COMPARE_FUNC. Texture state listed in table 6.25 but not listed here and
in the sampler state in table 6.26 is not part of the sampler state, and remains in the
texture object."

The list of states is in Table 6.24 "Textures (state per texture
object)" instead of 6.25 mentioned in the specification text.

The same text can be found in the 3.3 compatibility specification.

Signed-off-by: Pauli Nieminen <pauli.nieminen@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-08-01 15:30:13 -07:00
Paul Berry
c18806cebf i965/msaa: Allow GL_SAMPLES to be set to 1 prior to Gen6.
This patch allows GL_SAMPLES to be set to either 0 or 1 on i965
platforms that don't support MSAA (those prior to Gen6).  Setting
GL_SAMPLES=1 has the same effect as setting it to 0 on these platforms
(because MSAA is unsupported), but is distinguishable via the GL API.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50165

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-01 12:45:20 -07:00
Paul Berry
97fc89c6cb i965/msaa: Treat GL_SAMPLES=1 as equivalent to GL_SAMPLES=0.
EXT_framebuffer_multisample is a required subpart of
ARB_framebuffer_object, which means that we must support it even on
platforms that don't support MSAA.  Fortunately
EXT_framebuffer_multisample allows for this by allowing GL_MAX_SAMPLES
to be set to 1.

This leads to a tricky quirk in the GL spec: since
glRenderbufferStorageMultisample() accepts any value for its
"samples" parameter up to and including GL_MAX_SAMPLES, that means
that on platforms that don't support MSAA, GL_SAMPLES is allowed to be
set to either 0 or 1.  On platforms that do support MSAA, GL_SAMPLES=1
is not used; 0 means no MSAA, and 2 or higher means MSAA.

In other words, GL_SAMPLES needs to be interpreted as follows:
  =0  no MSAA (possible on all platforms)
  =1  no MSAA (only possible on platforms where MSAA unsupported)
  >1  MSAA (only possible on platforms where MSAA supported)

This patch modifies all MSAA-related code to choose between
multisampling and single-sampling based on the condition (GL_SAMPLES >
1) instead of (GL_SAMPLES > 0) so that GL_SAMPLES=1 will be treated as
"no MSAA".

Note that since GL_SAMPLES=1 implies GL_SAMPLE_BUFFERS=1, we can no
longer use GL_SAMPLE_BUFFERS to distinguish between MSAA and non-MSAA
rendering.
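
The interpretation table above can be sketched as a single predicate. The enum and function names are made up for this example, and `max_samples` stands in for the driver's GL_MAX_SAMPLES value.

```c
#include <assert.h>

enum sample_mode { SINGLE_SAMPLED, MULTISAMPLED };

/* A sample count of 0 or 1 means single-sampled; only values above 1
 * select multisampling. */
static enum sample_mode classify_samples(unsigned num_samples,
                                         unsigned max_samples)
{
   /* The API only accepts sample counts up to GL_MAX_SAMPLES. */
   assert(num_samples <= max_samples);
   return num_samples > 1 ? MULTISAMPLED : SINGLE_SAMPLED;
}
```

On a platform without MSAA, `max_samples` would be 1, so both legal inputs (0 and 1) classify as single-sampled.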

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-08-01 12:45:15 -07:00
Tomeu Vizoso
d5c918f6ad glsl: Add support for OES_standard_derivatives in GLSL ES.
Previously, we advertised the extension but the builtin functions
were enabled only for GLSL and not for ES.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52003

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-08-01 10:44:44 -07:00
Chad Versace
8c94f6bbd8 intel: Use consistent pattern in intelCreateBuffer
The 16-bit depth case did not follow the function's prevalent pattern.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-01 10:33:40 -07:00
Chad Versace
2b4fbc4d7d intel: Decrease nesting level in intelCreateBuffer
Nearly the whole function body was contained in the 'else' branch. The
'if' branch did one thing: return early with an error. Clean things up by
moving all the code out of the 'else' branch. Decreases max nesting level
from 4 to 3.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-01 10:33:38 -07:00
Chad Versace
83fa0842ca intel: Remove dead code in intelAllocateBuffer
After commit "intel: Convert to using private depth/stencil buffers", we
request from DRI2GetBuffersWithFormat only the front left and back left
buffers. We no longer request depth and stencil buffers.

Assert that in intelAllocateBuffer and remove the related dead code.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-08-01 10:33:36 -07:00
Matt Turner
84ead7b4e8 configure.ac: Remove extra ;;
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=53053
2012-08-01 10:12:50 -07:00
Matt Turner
33ae29c93b configure.ac: Don't duplicate CFLAGS
These assignments caused CFLAGS specified on the configure line to
appear twice in the final CFLAGS. Removing them makes the behavior
reasonable -- USER_CFLAGS are appended at the end of CFLAGS, allowing
the builder to override flags added by configure.ac like
-fno-strict-aliasing.

Reviewed-by: Adam Jackson <ajax@redhat.com>
2012-08-01 10:12:50 -07:00
Matt Turner
14819eb588 configure.ac: Remove contractions to stop breaking syntax highlighting
Reviewed-by: Adam Jackson <ajax@redhat.com>
2012-08-01 10:12:50 -07:00
Matt Turner
0e38a3ca52 configure.ac: remove remnants of ppc asm support
Missed by d387899388.

Reviewed-by: Adam Jackson <ajax@redhat.com>
2012-08-01 10:12:22 -07:00
Adam Jackson
33ef67ab20 linux: Default to dri not xlib on all arches
Even on s390{,x} where there's no video card, you still want this so GLX
protocol works.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2012-08-01 12:37:25 -04:00
Christoph Bumiller
8592933de8 nv50,nvc0: make resolve sampler objects allow sRGB conversion
Just figured out what that bit does.

Note: It's converted back to sRGB on write, so no effective
conversion occurs.
2012-08-01 15:39:46 +02:00
Christoph Bumiller
6286d9810b Revert "gallium: specify resource_resolve destination via a pipe_surface"
This reverts commit 5d5af7d359.

It turns out the issue this was supposed to fix merely counter-acted
a bug in the hardware driver that I wasn't aware of.

The resource_resolve is not supposed to do sRGB conversion, period.
(This would violate the requirement that source and destination must
be of the same format).
2012-08-01 15:39:46 +02:00
Roland Scheidegger
be2dcc5e9f r200: get rid of dubious aux scissor bits
no point in emitting aux scissor values if we
a) never enable them
b) never set the actual values

plus it is enough to have that aux scissor enable reg (which we never set to
enable) in one place not two.
2012-08-01 14:58:47 +02:00
Roland Scheidegger
c0c216c469 radeon/r200: get rid of some unneeded cliprect/scissor code
No one was interested in the number of cliprects, and no one cared
about the intersect result either. So just nuke this.
2012-08-01 14:58:38 +02:00
Roland Scheidegger
549470aa1a r200: get rid of old gart memory functions from old dri1
Those functions are SO dead.
2012-08-01 14:58:29 +02:00
Roland Scheidegger
de694b6b10 radeon/r200: fix bogus clears
There were several problems with these functions (which are a remnant
of dri1 hyperz mostly - should bring it back somehow someday).
First, it would always do a swrast clear if the buffer to clear was a fbo.
Second, for buffers whose clear we don't handle (aux/accum, I guess?), we
would still have tried to clear them later even though we had already
cleared them with swrast.
2012-08-01 14:58:23 +02:00
Roland Scheidegger
5b88a2a22d radeon/r200: fix bogus assert/scissor wrt width/height 2048
This addresses one issue raised in bug #51658 discovered by Eugene St Leger.
The assert is bogus since there's no problem with texture width/height being
2048 (the width/height programmed is width/height minus one).
OTOH though the programmed size for scissor rect should be width/height
minus one too otherwise bad things may happen (as it is inclusive, and there's
not enough bits for more than a value of 2047).
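
The arithmetic can be sketched as follows. The 11-bit field width is an assumption inferred from the 2047 limit mentioned above, and the names are illustrative rather than taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SCISSOR_BITS 11u   /* assumed field width */
#define DEMO_SCISSOR_MAX  ((1u << DEMO_SCISSOR_BITS) - 1u)

/* Inclusive bottom-right coordinate for an extent of `size` pixels. */
static uint32_t scissor_max_coord(uint32_t size)
{
   assert(size >= 1 && size - 1 <= DEMO_SCISSOR_MAX);
   return size - 1;
}
```

A 2048-pixel extent therefore programs 2047, which still fits, while programming 2048 itself would not.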
2012-08-01 14:58:15 +02:00
Christian König
6574fe3c4a radeon/llvm: fix calculation of max register number
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-08-01 11:15:06 +02:00
Tom Stellard
a488fdd3d9 radeon/llvm: Add pseudo-support for 64-bit immediate types on SI
SI does not support 64-bit immediates natively, but llvm will generate
i64 immediates when indexing loads and stores (since SI has 64-bit
pointers).  The i64 indices will always be small enough to fit into
32-bits (i.e. the high 32 bits will always be all zeros), so we can
treat these index values as 32-bits.
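
A small sketch of the narrowing this relies on, written as plain C rather than backend code; `narrow_index` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Narrow a 64-bit index to 32 bits; valid only when the high half is zero. */
static uint32_t narrow_index(uint64_t imm)
{
   assert((imm >> 32) == 0);
   return (uint32_t)imm;
}
```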
2012-07-31 20:19:21 +00:00
Tom Stellard
be46874281 radeon/llvm: Fix incorrect return value in SelectADDRReg()
We need to return true when we match the pattern.
2012-07-31 20:19:20 +00:00
Tom Stellard
056b77ca22 radeon/llvm: Move SMRD IMM pattern before SMRD SGPR pattern
In tablegen, if two patterns match, the one that comes first in the file
is given preference.  We want the SMRD IMM pattern to be given
preference, because it encodes the pointer offset in its immediate
field, which saves us an add instruction.
2012-07-31 20:19:20 +00:00
Eric Anholt
877a897adc glsl: Reject linking shaders with too many uniform blocks.
Part of fixing piglit maxblocks.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-31 12:06:20 -07:00
Eric Anholt
fa08b8ad54 mesa: Return -1 for glGetUniformLocation on UBOs.
Fixes piglit ARB_uniform_buffer_object/getuniformlocation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-31 12:06:20 -07:00
Eric Anholt
bbd1d6124d glsl: Assign array and matrix stride values according to std140 layout.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-31 12:06:20 -07:00
Eric Anholt
551bdf25bc glsl: Add support for default layout qualifiers for uniforms.
I ended up having to add rallocing of the ast_type_qualifier in order
to avoid pulling in ast.h for glsl_parser_extras.h, because I wanted
to track an ast_type_qualifier in the state.

Fixes piglit ARB_uniform_buffer_object/row-major.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-31 12:06:20 -07:00
Eric Anholt
7b77c64254 glsl: Merge UBO layout qualifiers in a qualifier list.
Yes, you get to say things like "layout(row_major, column_major)" and
get column major.
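
One way to picture the merge is a left-to-right fold in which a later qualifier overrides an earlier one. This is a simplified sketch that covers only the matrix layout and uses hypothetical names; it is not the parser's actual data structure.

```c
#include <assert.h>

enum demo_matrix_layout { DEMO_COLUMN_MAJOR, DEMO_ROW_MAJOR };

/* Fold a qualifier list left to right; a later entry overrides an
 * earlier one. */
static enum demo_matrix_layout
merge_layouts(const enum demo_matrix_layout *q, unsigned n)
{
   assert(n > 0);
   enum demo_matrix_layout result = q[0];
   for (unsigned i = 1; i < n; i++)
      result = q[i];
   return result;
}
```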

Part of fixing piglit ARB_uniform_buffer_object/row_major.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-31 12:06:20 -07:00
Eric Anholt
eed967bc9c mesa: Add support for GL_ARB_ubo's glGetActiveUniformName().
This is like a stripped-down version of glGetActiveUniform that just
returns the name, since the other return values (type and size) of
that function are now meant to be handled with
glGetActiveUniformsiv().

Fixes piglit ARB_uniform_buffer_object/getactiveuniformname

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-31 12:06:19 -07:00
Eric Anholt
dc654370c3 mesa: Add support for most of the other pnames of glGetActiveUniformBlockiv().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-31 12:06:19 -07:00
Eric Anholt
5a165d1f3a mesa: Add support for getting active uniform block names.
Fixes piglit ARB_uniform_buffer_object/getactiveuniformblockname.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-31 12:06:19 -07:00
Eric Anholt
467304dfe5 mesa: Add support for glUniformBlockBinding() and the API to get it back.
Fixes piglit ARB_uniform_buffer_object/uniformbufferbinding.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-31 12:06:19 -07:00
Eric Anholt
fafa394c15 glsl: Incorporate all UBO language changes into GLSL 1.40.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-31 12:06:19 -07:00
Eric Anholt
4070036259 mesa: Add support for glGetProgramiv pnames for UBOs.
Fixes piglit ARB_uniform_buffer_object/getprogramiv.

v2: Add extension checks.
v3: Appease MSVC.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-31 12:06:19 -07:00
Kenneth Graunke
3a90dc22d1 glsl: Refactor #version validation to be more future-proof.
The previous implementation required a flag in _mesa_glsl_parse_state
and line of code to initialize it for every version of the shading
language we intend to support.  As we look to add 150, 330, 400, 410,
420, and beyond, this gets rather unwieldy.

This patch retains the switch statement (to reject, say, #version 111),
but removes all the bits.  Code to check for ctx->API == API_OPENGL_CORE
could easily be added to the 110 and 120 cases to reject those.

v2: Use _mesa_is_desktop_gl to preserve the existing behavior in the
    presence of the new API_OPENGL_CORE enumeration.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
2012-07-31 11:20:49 -07:00
Eric Anholt
19bd5936af i965: Add support for GL_SKIP_DECODE_EXT on other SRGB formats.
Fixes some failures in getteximage-formats.

v2: Remove stray include, and drop extra test for encoding == GL_SRGB --
    _mesa_get_srgb_format_linear() returns the same format if it wasn't SRGB.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48120
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
NOTE: This is a candidate for the 8.0 branch.
2012-07-31 11:14:23 -07:00
Kenneth Graunke
03ac5c54b5 glsl: Fix #pragma invariant(all) language version check.
It was using state->Const.GLSL_100ES, which is set if the driver
supports ARB_ES2_compatibility or we're in ES2 mode.  Instead, it should
use state->language_version, as that represents the actual GLSL version
of the shader being compiled.

Since the correct logic is < 120 && !100, just make it == 110.
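
The simplification holds as long as the language version is one of a known set. The sketch below assumes that set is 100 (ES), 110, 120, 130 and 140; the function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Long form: below 1.20 and not the ES 1.00 dialect. */
static bool check_long(unsigned version)
{
   return version < 120 && version != 100;
}

/* Short form used instead. */
static bool check_short(unsigned version)
{
   return version == 110;
}

/* Confirms both forms agree on every assumed version. */
static void check_equivalent(void)
{
   static const unsigned versions[] = { 100, 110, 120, 130, 140 };
   for (size_t i = 0; i < sizeof(versions) / sizeof(versions[0]); i++)
      assert(check_long(versions[i]) == check_short(versions[i]));
}
```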

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-31 10:52:54 -07:00
Kenneth Graunke
d84b3a5a3c mesa: Support glGetString(GL_SHADING_LANGUAGE_VERSION) for >= 1.40.
This will need to get refactored when we add support for core profiles
or forward-compatible contexts, but we may as well have it in the
meantime.  This allows us to override the GLSL version and experiment.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-31 10:52:54 -07:00
Brian Paul
591594ea1e ir_to_mesa: make size_swizzles[] array static const 2012-07-31 09:00:41 -06:00
Jon TURNEY
27013e5164 Move installing osmesa.pc to drivers/osmesa
Move installing osmesa.pc to drivers/osmesa, where it belongs better

This also restores the installation of gl.pc if we are building osmesa at the
same time as libGL, which was broken in commit 39785488 when the .pc
installation was converted to automake

v2:
Remove HAVE_OSMESA_DRIVER automake conditional, it's now pointless as we
will only be building in the drivers/osmesa directory if the condition it
checked was true.

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-31 12:48:33 +01:00
Vinson Lee
2faa2b4f7e gallium/util: Use GCC built-in functions for NaN and infinity.
This patch fixes this build failure with Intel Compiler.

src/gallium/auxiliary/util/u_format_tests.c(903): error: floating-point operation result is out of range
     {PIPE_FORMAT_R16_FLOAT, PACKED_1x16(0xffff), PACKED_1x16(0x7c01), UNPACKED_1x1(        NAN, 0.0, 0.0, 1.0)},

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-07-30 23:27:19 -07:00
Jordan Justen
3d0b54c7c6 mesa: don't enable legacy GL functions when using API_OPENGL_CORE
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-30 16:25:56 -07:00
Jordan Justen
1fea3df6f4 intel: add support for using API_OPENGL_CORE
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-30 16:25:56 -07:00
Jordan Justen
0f099df567 meta: add support for using API_OPENGL_CORE
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-30 16:25:56 -07:00
Jordan Justen
4aecd8f031 glsl: add support for using API_OPENGL_CORE
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-30 16:25:56 -07:00
Jordan Justen
09714c09a4 mesa: add support for using API_OPENGL_CORE
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-30 16:18:57 -07:00
Jordan Justen
3d284dcba6 mesa: add api check functions
These functions make it easier to check for multiple API types.
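
As a rough sketch of what such a helper might look like, the example below groups the two desktop APIs behind one predicate. The enum and function names are simplified stand-ins chosen for this example, not the real Mesa declarations.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the context API enumeration. */
typedef enum {
   DEMO_API_OPENGL,
   DEMO_API_OPENGLES,
   DEMO_API_OPENGLES2,
   DEMO_API_OPENGL_CORE
} demo_api;

/* True for either desktop flavour, so callers need not list both. */
static bool demo_is_desktop_gl(demo_api api)
{
   return api == DEMO_API_OPENGL || api == DEMO_API_OPENGL_CORE;
}

/* Small usage check. */
static void demo_api_self_check(void)
{
   assert(demo_is_desktop_gl(DEMO_API_OPENGL_CORE));
   assert(!demo_is_desktop_gl(DEMO_API_OPENGLES2));
}
```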

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-30 16:18:57 -07:00
Jordan Justen
1c29b73f4d mesa: add API_OPENGL_CORE api
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-30 16:18:57 -07:00
Ian Romanick
d3de40742f glsl: Fix ir_last_opcode value.
Now that ir_quadop_vector exists, ir_last_binop and ir_last_opcode are
no longer the same.  Only one place currently uses this enumeration, and
already handles ir_quadop_vector correctly.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Olivier Galibert <galibert@pobox.com>
2012-07-30 15:15:48 -07:00
Ian Romanick
9d998a2a59 glsl: Request an Nx1 type instance in ir_quadop_vector lowering pass.
No types have 0 columns.  The glsl_type::get_instance method contains

   if ((rows < 1) || (rows > 4) || (columns < 1) || (columns > 4))
      return error_type;

To get a vector, use columns = 1.
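
The quoted check can be restated as a standalone predicate to show why a request with 0 columns fails. `dims_valid` is a hypothetical name and the sketch ignores the base type.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the quoted range check: both dimensions must lie in 1..4. */
static bool dims_valid(unsigned rows, unsigned columns)
{
   return !((rows < 1) || (rows > 4) || (columns < 1) || (columns > 4));
}

/* Small usage check. */
static void dims_self_check(void)
{
   assert(dims_valid(4, 1));    /* four rows, one column: a vector */
   assert(!dims_valid(4, 0));   /* zero columns is rejected */
}
```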

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Olivier Galibert <galibert@pobox.com>
2012-07-30 15:14:34 -07:00
Kenneth Graunke
13cb99dc73 glsl: Make bvec and ivec types accessible without using get_instance.
It's more convenient to use shortcuts like glsl_type::bvec2_type than
the longwinded glsl_type::get_instance(GLSL_TYPE_BOOL, 2, 1).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Olivier Galibert <galibert@pobox.com>
2012-07-30 15:14:09 -07:00
Tom Stellard
cd0949eb28 radeon/llvm: Cleanup AMDIL.h 2012-07-30 21:10:14 +00:00
Tom Stellard
2f921101c0 radeon/llvm: Rename all AMDIL* classes to AMDGPU* 2012-07-30 21:10:14 +00:00
Tom Stellard
b72ab79d73 radeon/llvm: Merge AMDILSubtarget into AMDGPUSubtarget 2012-07-30 21:10:13 +00:00
Tom Stellard
27ae41c83d radeon/llvm: Merge AMDILTargetLowering class into AMDGPUTargetLowering 2012-07-30 21:10:13 +00:00
Tom Stellard
c96490e3b5 radeon/llvm: Remove IL_cmp DAG node 2012-07-30 21:10:13 +00:00
Tom Stellard
aece7970eb radeon/llvm: Cleanup and reorganize AMDIL .td files 2012-07-30 21:10:13 +00:00
Tom Stellard
0ce6e50601 radeon/llvm: Remove lowering code for unsupported features
e.g. function calls, load/store from stack
2012-07-30 21:10:08 +00:00
Tom Stellard
caeaf43dad radeon/llvm: Remove AMDILVersion.td 2012-07-30 20:31:57 +00:00
Tom Stellard
c3111eb639 radeon/llvm: Remove AMDILAlgorithms.tpp 2012-07-30 20:31:57 +00:00
Tom Stellard
ac669c32c6 radeon/llvm: Merge AMDILInstrInfo.cpp into AMDGPUInstrInfo.cpp 2012-07-30 20:31:57 +00:00
Tom Stellard
3a0187b1b5 radeon/llvm: Merge AMDILRegisterInfo into AMDGPURegisterInfo 2012-07-30 20:31:57 +00:00
Tom Stellard
9c42fb6f26 radeon/llvm: Change the tablegen target from AMDIL to AMDGPU 2012-07-30 20:31:56 +00:00
Kenneth Graunke
f56dfc3213 i965: Support MESA_FORMAT_SIGNED_RGBA_16.
The hardware supports this format with no known quirks, so we may as
well enable it.

Alpha blending is not supported until Sandybridge, but as far as I can
tell, OpenGL doesn't require alpha blending on SNORM formats.  Plus, we
already expose R8G8B8A8_SNORM which has a similar restriction.

Fixes 6 piglit texwrap-2D-*SNORM* cases,
gl-3.1/required-sized-texture-formats, and 10 oglconform snorm-textures
subcases

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-30 09:35:58 -07:00
Elvis Lee
e7a4a2b18b gbm: Fix build for wayland include
backends/gbm_dri.c fails to find wayland-server.h.

Signed-off-by: Elvis Lee <kwangwoong.lee@lge.com>
2012-07-30 11:58:02 -04:00
Brian Paul
b51be8786f mesa: fix _math_matrix_copy(), again
The matrix is 16 GLfloats in size.  Since from->inv is just a pointer (not
an array), sizeof(*from->inv) wasn't right.
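A minimal sketch of the pitfall described above, assuming a simplified struct rather than Mesa's actual matrix type. The names `matrix_sketch`, `wrong_inv_size`, `right_inv_size`, and `copy_inv` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a matrix type whose inverse is a pointer. */
struct matrix_sketch {
    float m[16];   /* array member: sizeof(m) is 16 * sizeof(float) */
    float *inv;    /* pointer member: sizeof(*inv) is one float only */
};

/* What sizeof(*from->inv) evaluates to: the size of a single float. */
static size_t wrong_inv_size(const struct matrix_sketch *from)
{
    return sizeof(*from->inv);
}

/* Size needed to copy a full 4x4 matrix of floats. */
static size_t right_inv_size(void)
{
    return 16 * sizeof(float);
}

/* Copy the inverse using the explicit element count. */
static void copy_inv(struct matrix_sketch *to, const struct matrix_sketch *from)
{
    memcpy(to->inv, from->inv, right_inv_size());
}
```

Because `inv` is a pointer, the compiler has no array length to consult, so the element count has to be written out explicitly.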
2012-07-30 08:30:15 -06:00
Vinson Lee
502c10839e mesa: Fix wrong sizeof argument in _math_matrix_copy.
Fixes Coverity wrong sizeof argument defect.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-07-30 08:13:55 -06:00
Christian König
86490bc150 radeonsi: fix db and stencil setup v2
v2: fix tiling for small pitches, that finally makes
    glxgears and readPixSanity work

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-07-30 15:02:04 +02:00
Christian König
7dace3a3cf radeonsi: fix stencil op mapping
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-07-30 15:02:00 +02:00
Christian König
ad15c8c0f1 radeonsi: fix assertion in si_bind_vs_sampler
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-07-30 15:01:55 +02:00
Christian König
1fb8ee62fa radeonsi: fix shader binding
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-07-30 15:01:51 +02:00
Christian König
f18fd255cf radeonsi: fix dummy export in shaders v2
v2: add assertion for vertex shader

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-07-30 15:01:34 +02:00
Christian König
b15e3ae5b4 radeonsi: fix vertex buffer and elements
Let's just use the T# descriptors until we get a fetch shader.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-07-30 14:45:32 +02:00
Christian König
d51b9b70d5 radeonsi: fix shader size and handling
We should always upload the shader here.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-07-30 14:45:08 +02:00
Christian König
fe41287ffa radeonsi: rename r600_resource to si_resource
Also split it into a separate header and add
some helper functions.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-07-30 14:44:38 +02:00
Kenneth Graunke
dcf8754cce glcpp: Add a newline to expanded #line directives.
Otherwise, the preprocessor happily outputs

    #line 2 4 <your next line of code>

and the main compiler gets horribly confused and fails to compile.

This is not the right solution (line numbers in error messages will
likely be off-by-one in certain circumstances), but until Carl comes
up with a proper fix, this gets programs running again.

Fixes regressions in Regnum Online, Overgrowth, Piglit, and others since
commit aac78ce823.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51802
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51506
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41152
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-28 13:33:50 -07:00
Christoph Bumiller
5d5af7d359 gallium: specify resource_resolve destination via a pipe_surface
The format member of pipe_surface may differ from that of the
pipe_resource, which is used to communicate, for instance, whether
sRGB encode should be enabled in the resolve operation or not.

Fixes resolve to sRGB surfaces in mesa/st when GL_FRAMEBUFFER_SRGB
is disabled.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-28 14:58:18 +02:00
Christoph Bumiller
51e41a0d89 st/mesa: call update_renderbuffer_surface for sRGB renderbuffers, too
sRGBEnabled should affect both textures and renderbuffers, so we need
to check/update the pipe_surface format for both.

Fixes, for instance, rendering appearing too bright in wine applications
using sRGB multisample renderbuffers.

NOTE: This is a candidate for the 8.0 branch.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-28 13:14:30 +02:00
Christoph Bumiller
acd66ec033 nv50: fix depth/stencil multisample memory storage types
Leftover from libdrm_nouveau v2 interface change.
2012-07-28 13:14:03 +02:00
Christoph Bumiller
cd3d85b63d nv50: fix resource_resolve shader start offsets 2012-07-28 13:11:56 +02:00
Brian Paul
f612e55e45 st/mesa: undo a couple static asserts
Hmm, gcc didn't catch these mistakes, but MSVC did.
2012-07-27 16:10:58 -06:00
Brian Paul
322a2938f3 st/mesa: use STATIC_ASSERT in a few places 2012-07-27 15:47:38 -06:00
Brian Paul
59c67f8116 mesa: whitespace, etc. fixes in program.h 2012-07-27 15:43:53 -06:00
Brian Paul
906febaf8b meta: fix glDrawPixels fallback test, stencil drawing
Remove the check for pixel transfer ops.  If any RGB/depth scale/bias
is in effect, it'll be applied in the glTexImage step.

If drawing stencil pixels we need to disable pixel transfer so that
alpha scale/bias are not applied to the stencil data.

These issues were spotted by Roland.

Fixes Blender performance issues reported in
http://bugs.freedesktop.org/show_bug.cgi?id=47375

NOTE: This is a candidate for the 8.0 branch.

Tested-by: Barto <mister.freeman@laposte.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-07-27 14:53:16 -06:00
Brian Paul
a80b7407f3 radeon: fix 'sowftware' typo 2012-07-27 14:53:16 -06:00
Eric Anholt
fbf86c7f0f i965/gen7: Reduce GT1 WM thread count according to updated BSpec.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>

https://bugs.freedesktop.org/show_bug.cgi?id=52382
2012-07-27 11:42:19 -07:00
Kenneth Graunke
cbcf750d5f i965: Fix typo in shader channel select field name.
"chanel" isn't very searchable.  I can type, honest!

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-27 11:31:07 -07:00
Paul Berry
ee9f6a34cc i965/msaa: Use MESA_FORMAT_R8 for MCS buffer.
No functional change.  This patch modifies intel_miptree_alloc_mcs to
allocate the 4x MCS buffer using MESA_FORMAT_R8 instead of
MESA_FORMAT_A8.  In principle it doesn't matter, since we only access
the buffer using MCS-specific hardware mechanisms, so all that's
important is to use a format with the correct size.  However,
MESA_FORMAT_A8 has enough unusual behaviours that it seems prudent to
avoid it.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-27 10:42:19 -07:00
Zou Nan hai
588881430a intel: increase wm thread number to 80 on gen6 GT2
It seems reset is not required for setting the max_wm_threads to 80
on gen6 GT2.

Increases performance in the Counter-Strike: Source video stress test
by 7.18% (n=5).

Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Matt Turner <mattst88@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
2012-07-27 10:32:17 -07:00
Tom Stellard
fdd8df20e4 r600g: Emit dispatch state for compute directly to the cs
We no longer rely on an evergreen_compute_resource for emitting dispatch
state.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-07-27 17:08:09 +00:00
Tom Stellard
dc0b8a4628 r600g: Initialize VGT_PRIMITIVE_TYPE in the start_cs_cmd atom
The value of this register will always be DI_PT_POINTLIST for compute
shaders.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-07-27 17:08:09 +00:00
Tom Stellard
d3b0130491 r600g: Atomize compute shader state
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-07-27 17:08:09 +00:00
Tom Stellard
5497391067 r600g: Add helper functions for emitting compute SET_CONTEXT packets
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-07-27 17:08:09 +00:00
Tom Stellard
c9ef27276f radeon/llvm: Add instruction defs for branches on SI 2012-07-27 17:08:09 +00:00
Tom Stellard
ee0f0f03c6 radeon/llvm: Fix VOPC and V_CNDMASK encoding 2012-07-27 17:08:09 +00:00
Tom Stellard
d4bdd09d47 radeon/llvm: Assert if we try to copy SCC reg 2012-07-27 17:08:09 +00:00
Tom Stellard
fd1f19a191 radeon/llvm: Add SI DAG optimizations for setcc, select_cc
These are needed for correctly lowering branch instructions in some
cases.
2012-07-27 17:08:08 +00:00
Tom Stellard
cd5d4c5073 radeon/llvm: Add support for encoding SI branch instructions 2012-07-27 17:08:08 +00:00
Tom Stellard
50ff2dc0a4 radeon/llvm: Add special nodes for SALU operations on VCC
The VCC register is tricky because the SALU views it as 64-bit, but the
VALU views it as 1-bit.  In order to deal with this we've added some
special bitcast and binary operations to help convert from the 64-bit
SALU view to the 1-bit VALU view and vice versa.
2012-07-27 17:08:08 +00:00
Tom Stellard
c424975572 radeon/llvm: Add i1 registers for SI. 2012-07-27 17:08:08 +00:00
Tom Stellard
bdda1cb914 radeon/llvm: Fix CCReg definitions on SI 2012-07-27 17:08:08 +00:00
Tom Stellard
ae9be358f2 radeonsi: Enable PIPE_SHADER_CAP_INTEGERS 2012-07-27 17:08:08 +00:00
Tom Stellard
022b54359a radeonsi: Add support for loading integers from constant memory 2012-07-27 17:08:07 +00:00
Tom Stellard
ad95bcb31f radeon/llvm: Add bitconvert patterns for SI 2012-07-27 17:08:07 +00:00
Tom Stellard
4cab682184 radeon/llvm: Add custom lowering for SELECT_CC nodes on SI 2012-07-27 17:08:07 +00:00
Tom Stellard
ba76684292 radeon/llvm: Move conditional pattern leafs to common tablegen file 2012-07-27 17:08:07 +00:00
Tom Stellard
d36455ba2c radeon/llvm: Implement getSetCCResultType for SI 2012-07-27 17:08:07 +00:00
Tom Stellard
e8825ce6e1 radeon/llvm: Custom lower BR_CC for SI 2012-07-27 17:08:07 +00:00
Tom Stellard
87272e9e25 radeon/llvm: Move lowering of BR_CC node to R600ISelLowering
SI will handle BR_CC differently from R600, so we need to move it
out of the shared instruction selector.
2012-07-27 17:08:07 +00:00
Tom Stellard
92823fb72a radeon/llvm: Move lowering of SETCC node to R600ISelLowering
SI will handle SETCC differently from R600, so we need to move it
out of the shared instruction selector.
2012-07-27 17:08:06 +00:00
Tom Stellard
46d12c99a2 radeon/llvm: Use correct node type when lowering SETCC 2012-07-27 17:08:06 +00:00
Tom Stellard
47d1b0a809 radeon/llvm: Move LowerSELECT_CC into R600ISelLowering
SI will handle SELECT_CC differently from R600, so we need to move it out
of the shared instruction selector.
2012-07-27 17:08:06 +00:00
Eric Anholt
11ff18fcf5 automake: Remove OPT_FLAGS.
If you want to change your compiler arguments, just set CFLAGS/CXXFLAGS.
Having Mesa have this separate variable is a great way to have your arguments
not thoroughly propagated to all compiler invocations.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-26 17:30:06 -07:00
Eric Anholt
87a1c4f233 automake: Remove ARCH_FLAGS.
In all current uses, it was appended to CFLAGS, which already had -m32.  If
you want to do some other flag supplied to compiler invocations, there's
CFLAGS/CXXFLAGS.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-26 17:30:06 -07:00
Paul Berry
4df2848786 i965/msaa: use ROUND_DOWN_TO macro.
No functional change.  This patch modifies brw_blorp_blit.cpp to use
the ROUND_DOWN_TO macro instead of open-coded bit manipulations, for
clarity.
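For illustration, a round-down macro of this kind might look like the sketch below. The name `ROUND_DOWN_TO_SKETCH` and both wrapper functions are hypothetical, and the macro assumes the alignment is a power of two.

```c
/* Sketch of a round-down macro; alignment must be a power of two. */
#define ROUND_DOWN_TO_SKETCH(value, alignment) ((value) & ~((alignment) - 1))

/* Open-coded form for an alignment of 4: clear the two low bits. */
static unsigned round_down_to_4_open_coded(unsigned x)
{
    return x & ~3u;
}

/* The same operation expressed through the macro. */
static unsigned round_down_to_4_macro(unsigned x)
{
    return ROUND_DOWN_TO_SKETCH(x, 4u);
}
```

Both functions compute the same value; the macro form states the intended alignment directly instead of leaving it implied by a bit mask.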

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-26 15:02:10 -07:00
Brian Paul
f37f1a7209 svga: initialize svga_compile_key to zeros to be safe 2012-07-26 16:00:31 -06:00
Brian Paul
dafa77201f svga: fix invalid memory reference in needs_to_create_zero()
The emit->key.fkey info is only valid if we're generating a fragment shader.
We should not look at it if we're generating a vertex shader.

When generating a vertex shader, the value of emit->key.fkey.num_textures was
garbage and the loop over num_textures would read invalid data.  At best
this would cause us to emit an unused constant.  At worst, we could segfault.
Just by dumb luck, fkey.num_textures was usually a smallish integer.

NOTE: This is a candidate for the 8.0 branch.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-07-26 16:00:31 -06:00
Brian Paul
38184dcd54 radeon: fix Base/base typo
Fixes http://bugs.freedesktop.org/show_bug.cgi?id=52563
2012-07-26 15:57:20 -06:00
Daniel Charles
948c8f502a android-build: fix dricore build for autogenerated files (v3)
Recently more files were removed from control to be auto-generated
in the dricore library. Android build was not able to locate the
new files if they were not created beforehand.

LOCAL_SRC_FILES includes some of those files and Android.gen.mk
re-defines this variable by filtering out the auto-generated files.
Unfortunately for this variable it is not the same to have the SRCDIR
variable defined as the current directory.

By re-defining SRCDIR for the autotools build the Android build system
is happy again and the new files were actually removed from the sources
to use the auto generated versions.

Also patch d5c1801a01 was partially reverted as the files
can not be compiled to the LOCAL_PATH, instead they should live on the
intermediates folder so that a clean can wipe them out.

v3: [chad] Fix the definition of SRCDIR in libdricore/Makefile.am.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Signed-off-by: Daniel Charles <daniel.charles@intel.com>
2012-07-26 14:51:20 -07:00
Brian Paul
0e893b4261 radeon: set swrast_renderbuffer::ColorType field when mapping renderbuffers
Fixes http://bugs.freedesktop.org/show_bug.cgi?id=47375

NOTE: This is a candidate for the 8.0 branch.

Tested-by: Barto <mister.freeman@laposte.net>
2012-07-26 13:59:44 -06:00
Brian Paul
a73e9207da xlib: add X error handler around XGetImage() call
XGetImage() will generate a BadMatch error if the source window isn't
visible.  When that happens, create a new XImage.  Fixes piglit 'select'
test failures with swrast/xlib driver.

NOTE: This is a candidate for the 8.0 branch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-26 13:59:44 -06:00
Brian Paul
66adc807c4 mesa: remove obsolete matrix comment 2012-07-26 13:59:44 -06:00
Brian Paul
1e37d54d9d mesa: fix comment typo: s/pointer/point/ 2012-07-26 13:59:44 -06:00
Brian Paul
66d9ac5ac7 mesa: remove _math_matrix_alloc_inv()
Always allocate space for the inverse matrix in _math_matrix_ctr()
since we were always calling _math_matrix_alloc_inv() anyway.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-26 13:59:44 -06:00
Brian Paul
50db812915 mesa: loosen small matrix determinant check
When computing a matrix inverse, if the determinant is too small we could hit
a divide by zero.  There's a check to prevent this (we basically give up on
computing the inverse and return the identity matrix.)  This patch loosens
this test to fix a lighting bug reported by Lars Henning Wendt.

v2: use abs(det) to handle negative values
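A minimal sketch of the guard described above, using a 2x2 matrix to keep it short. The function name `invert_2x2` and the threshold `DET_EPSILON` are assumptions for illustration; the threshold is not the value used in Mesa.

```c
#include <stdbool.h>

/* Illustrative threshold only; not the value used in Mesa. */
#define DET_EPSILON 1e-12

/* Invert a 2x2 row-major matrix. On a near-singular input, write the
 * identity matrix to out and return false. */
static bool invert_2x2(const double in[4], double out[4])
{
    double det = in[0] * in[3] - in[1] * in[2];

    /* Compare the magnitude, so a negative determinant is not rejected. */
    double mag = det < 0.0 ? -det : det;

    if (mag < DET_EPSILON) {
        out[0] = 1.0; out[1] = 0.0;
        out[2] = 0.0; out[3] = 1.0;
        return false;
    }

    out[0] =  in[3] / det;
    out[1] = -in[1] / det;
    out[2] = -in[2] / det;
    out[3] =  in[0] / det;
    return true;
}
```

Without the magnitude step, any matrix with a negative determinant (a reflection, for example) would be treated as singular even though it inverts cleanly.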

NOTE: This is a candidate for the 8.0 branch.

Tested-by: Lars Henning Wendt <lars.henning.wendt@gris.tu-darmstadt.de>
2012-07-26 13:59:43 -06:00
Paul Berry
148c8e639d i965: Use sendc for all render target writes on Gen6+.
The sendc instruction causes the fragment shader thread to wait for
any dependent threads (i.e. threads rendering to overlapping pixels)
to complete before sending the message.  We need to use sendc on the
first render target write in order to guarantee that fragment shader
outputs are written to the render target in the correct order.

Previously, we only used the "sendc" instruction when writing to
binding table index 0.  This did the right thing for fragment shaders,
because our fragment shader back-ends always issue their first render
target write to binding table index 0.  However, it did the wrong
thing for blorp, which performs its render target writes to binding
table index 1.

A more robust solution is to use sendc for all render target writes.
This should not produce any performance penalty, since after the first
sendc, all of the dependent threads will have completed.

For more information about sendc, see the Ivy Bridge PRM, Vol4 Part3
p218 (sendc - Conditional Send Message), and p54 (TDR Registers).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-26 10:49:38 -07:00
Paul Berry
8f37ea414f i965/msaa: Remove TODO comments that are no longer relevant.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-26 10:49:38 -07:00
Paul Berry
c738ea1191 intel: Make more consistent use of _mesa_is_{user,winsys}_fbo()
A lot of code was still differentiating between winsys and
user fbos by testing the fbo's name against zero.  This converts
everything in the i915 and 965 drivers over to use _mesa_is_user_fbo()
and _mesa_is_winsys_fbo().
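The distinction these helpers capture can be sketched as follows, assuming a framebuffer is a window-system buffer exactly when its name is zero, as the message above implies. The struct and function names here are hypothetical stand-ins, not Mesa's definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a framebuffer object; only the name matters. */
struct fbo_sketch {
    unsigned name;   /* 0 for window-system framebuffers */
};

/* True for application-created framebuffer objects. */
static bool is_user_fbo(const struct fbo_sketch *fb)
{
    return fb->name != 0;
}

/* True for framebuffers provided by the window system. */
static bool is_winsys_fbo(const struct fbo_sketch *fb)
{
    return fb->name == 0;
}
```

Named predicates make the intent visible at each call site, where a bare comparison against zero leaves the reader to recall what a zero name means.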

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-26 10:48:36 -07:00
Paul Berry
284ad9c3b2 mesa: Make more consistent use of _mesa_is_{user,winsys}_fbo()
A lot of code was still differentiating between winsys and
user fbos by testing the fbo's name against zero.  This converts
everything in core mesa, the state tracker, and src/mesa/program over
to use _mesa_is_user_fbo() and _mesa_is_winsys_fbo().

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-26 10:38:05 -07:00
Oliver McFadden
e72f20641a glsl: warning: pragma `invariant(all)' not supported in GLSL ES 1.00
The OpenGL(R) ES Shading Language
Version 1.00 Revision 17 (12 May, 2009)

> 4.6.1 The Invariant Qualifier
> ... To force all output variables to be invariant, use the pragma
> #pragma STDGL invariant(all)

Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-26 13:09:15 +03:00
Kenneth Graunke
16cba717c2 shared-glapi: Install libglapi.so.0.0.0 and .0 links in lib/.
We already provided these files on 'make install', but only created a
'libglapi.so' in the top-level lib/ convenience folder.  We used to
create all three, but at some point in the build system churn, it broke.

Various applications (like the ES2 conformance suite) seem to link
against libglapi.so.0, so without these links, setting LD_LIBRARY_PATH
and LIBGL_DRIVERS_PATH can lead to using /usr/lib/libglapi.so.0 with
/home/whatever/libGL.so, which leads to API calls getting routed
incorrectly (i.e. glCompileShader -> _mesa_LinkProgramARB), which leads
to rage problems.

Preserve developer sanity...install links.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-25 22:37:24 -07:00
Vinson Lee
4f109ca4e8 scons: Fix build with clang.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-25 17:04:30 -07:00
Eric Anholt
cc44aa7749 i965: Remove unused param conversion code.
Ever since ctx->NativeIntegers was set, the conversion flag has been
PARAM_NO_CONVERT.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-25 10:29:56 -07:00
Olivier Galibert
fa76d04aea softpipe: fix copy/paste error in tex sample code
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=52369

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-25 07:47:19 -06:00
Jon TURNEY
f9089f4022 Remove redundant osmesa shared library install from Makefile.old
Since osmesa now has been converted to Makefile.am, an appropriate install: rule
is generated to install the shared library, so we no longer need to do that in
src/mesa/Makefile.old

This leaves nothing in src/mesa/Makefile.old but the tags: rule, so move that to
Makefile.am and remove Makefile.old

Also, nothing now uses OSMESA_LIB_GLOB anymore, so remove it

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-25 12:41:07 +01:00
Jon TURNEY
bd4a3cce96 Update mesa/drivers/x11/Makefile.am for xm_image.h removal
Commit 6c6803f28d removed xm_image.[ch], and removed
xm_image.c, but not xm_image.h from the Makefile, this was subsequently carried over
into Makefile.am

Remove xm_image.h from Makefile.am.  This allows 'make dist' to succeed, even if it
doesn't do anything useful

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-25 12:40:10 +01:00
Jon TURNEY
9f84d645a4 drivers/osmesa: Link OSMesa using -no-undefined libtool flag
"Use -no-undefined to assure libtool that the library has no
unresolved symbols at link time, so that libtool will build a shared
library on platforms require that all symbols are resolved when the
library is linked."

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-25 12:39:42 +01:00
Jon TURNEY
50b13217ba drivers/X11: Link X11 libGL with -no-undefined libtool flag
"Use -no-undefined to assure libtool that the library has no
unresolved symbols at link time, so that libtool will build a shared
library on platforms require that all symbols are resolved when the
library is linked."

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-25 12:38:38 +01:00
Vinson Lee
491d82e9df Revert "scons: Add instrumentation component libraries to linking on llvm-3.2."
This reverts commit e2e7b467d8.

No longer needed after llvm-3.2svn r160611.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2012-07-24 22:49:49 -07:00
Paul Berry
497bf5dd2b i965/msaa: Switch on 8x MSAA for Gen7.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-24 14:52:59 -07:00
Paul Berry
7285612713 i965/msaa: Adjust MCS buffer allocation for 8x MSAA.
MCS buffers use 32 bits per pixel in 8x MSAA, and 8 bits per pixel in
4x MSAA.  This patch adjusts the format we use to allocate the buffer
so that enough memory is set aside for 8x MSAA.
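The per-pixel sizes stated above can be turned into a rough size estimate. This is a sketch only: the function names are hypothetical, and it ignores the tiling and alignment padding a real allocation adds.

```c
#include <stddef.h>

/* Bits per pixel of the MCS buffer, per the commit message:
 * 8 for 4x MSAA, 32 for 8x MSAA. Returns 0 for other sample counts. */
static unsigned mcs_bits_per_pixel(unsigned num_samples)
{
    switch (num_samples) {
    case 4:  return 8;
    case 8:  return 32;
    default: return 0;
    }
}

/* Unpadded byte size for a width x height surface. */
static size_t mcs_min_bytes(unsigned width, unsigned height,
                            unsigned num_samples)
{
    return (size_t)width * height * (mcs_bits_per_pixel(num_samples) / 8);
}
```

For the same surface, the 8x MCS buffer needs four times the memory of the 4x one, which is why the allocation format had to change.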

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-24 14:52:58 -07:00
Paul Berry
304be9db14 i965/msaa: Remove assertion in 3DSTATE_SAMPLE_MASK to allow 8x MSAA.
The code to emit 3DSTATE_SAMPLE_MASK was already correct for 8x
MSAA--this patch just removes an assertion that would have prevented
it from being used for 8x MSAA.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-24 14:52:58 -07:00
Paul Berry
2a9ab29ed9 i965/msaa: Adjust 3DSTATE_MULTISAMPLE packet for 8x MSAA.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-24 14:52:58 -07:00
Paul Berry
7fae97c98b i965/blorp: Encode and decode IMS format for 8x MSAA correctly.
This patch updates the blorp functions encode_msaa() and decode_msaa()
to properly handle the encoding of IMS MSAA buffers when
num_samples=8.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-24 14:52:58 -07:00
Paul Berry
619471dc32 i965/blorp: Compute sample number correctly for 8x MSAA.
When operating in persample dispatch mode, the blorp engine would
previously assume that subspan N always represented sample N (this is
correct assuming 4x MSAA and a 16-wide dispatch).  In order to support
8x MSAA, we must compute which sample is associated with each subspan,
using the "Starting Sample Pair Index" field in the thread payload.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-24 14:52:58 -07:00
Paul Berry
082874e389 i965/blorp: Properly adjust primitive size for 8x MSAA.
When rendering to an IMS MSAA surface on Gen7, blorp sets up the
rendering pipeline as though it were rendering to a single-sampled
surface; accordingly it must adjust the size of the primitive it sends
down the pipeline to account for the interleaving of samples in an IMS
surface.

This patch modifies the size adjustment code to properly handle 8x
MSAA, which makes room for the extra samples by using an interleaving
pattern that is twice as wide as 4x MSAA.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-24 14:52:58 -07:00
Paul Berry
17eae9762c i965/blorp: Parameterize manual_blend() by num_samples.
This patch adds a num_samples argument to the blorp function
manual_blend(), allowing it to be told how many samples need to be
blended together.  Previously it assumed 4x MSAA, since that was all
we supported.

We also bump up LOG2_MAX_BLEND_SAMPLES from 2 to 3, so that
manual_blend() will be able to handle 8x MSAA.
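As a CPU-side analogy only (the real manual_blend() emits GPU instructions), blending a parameterized number of samples could be sketched as a plain average. The function `blend_samples` and the macro `MAX_BLEND_SAMPLES` are hypothetical; only LOG2_MAX_BLEND_SAMPLES and its new value of 3 come from the message above.

```c
/* Value after this commit; 1 << 3 allows up to 8 samples. */
#define LOG2_MAX_BLEND_SAMPLES 3
#define MAX_BLEND_SAMPLES (1u << LOG2_MAX_BLEND_SAMPLES)

/* Average num_samples values. Returns 0 if the count is zero or
 * exceeds the supported maximum. */
static float blend_samples(const float *samples, unsigned num_samples)
{
    float sum = 0.0f;
    unsigned i;

    if (num_samples == 0 || num_samples > MAX_BLEND_SAMPLES)
        return 0.0f;

    for (i = 0; i < num_samples; i++)
        sum += samples[i];

    return sum / (float)num_samples;
}
```

With the limit at 2, the maximum would be 4 samples, so an 8x request would fall outside the supported range.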

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-24 14:52:58 -07:00
Paul Berry
4afee38a2f i965/msaa: Remove comment about falsely claiming to support MSAA.
Gen6+ hardware now supports MSAA properly.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-24 14:52:58 -07:00
Paul Berry
ff9313fac7 i965/blorp: Handle DrawBuffers properly.
When the client program uses glDrawBuffer() or glDrawBuffers() to
select more than one color buffer for drawing into, and then performs
a blit, we need to blit into every single enabled draw buffer.

+2 oglconforms.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50407

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-24 14:52:57 -07:00
Paul Berry
fa1d267beb i965/blorp: Rearrange order of blit validation and preparation steps.
This patch rearranges the order of steps performed by a blorp blit
from this:

- Sync up state of window system buffers.
- Find buffers.
- Find miptrees.
- Make sure buffer formats match.
- Handle mirroring.
- Make sure width and height match.
- Handle clipping/scissoring.
- Account for window system origin conventions.
- Do depth resolves, if applicable.
- Do the blit.
- Record the need for a future HiZ resolve, if applicable.

To this:

- Sync up state of window system buffers.
- Handle mirroring.
- Make sure width and height match.
- Handle clipping/scissoring.
- Account for window system origin conventions.
- Find buffers.
- Make sure buffer formats match.
- Find miptrees.
- Do depth resolves, if applicable.
- Do the blit.
- Record the need for a future HiZ resolve, if applicable.

The steps are the same, but they are now performed in an order that
will make it possible to implement correct DrawBuffers support.  Note
that the last four steps are now in a separate function
(do_blorp_blit), since they will need to be executed repeatedly when
DrawBuffers support is added.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-24 14:52:57 -07:00
Paul Berry
eac4f1a707 i965/blorp: Don't fall back to swrast when miptrees absent.
Previously, the blorp engine would fall back to swrast if the source
or destination of a blit had no associated miptree.  This was
unnecessary, since _mesa_BlitFramebufferEXT() already takes care of
making the blit silently succeed if there are no buffers bound, so the
fallback paths could never actually happen in practice.

Removing these fallback paths will simplify the implementation of
correct DrawBuffers support in blorp.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-24 14:52:57 -07:00
Paul Berry
0dbec6ae07 i965/blorp: Fixup scissoring of blits to window system buffers.
This patch modifies the order of operations in the blorp engine so
that clipping and scissoring are performed before adjusting the
coordinates to account for the difference in origin convention between
window system buffers and framebuffer objects.  Previously, we would
do clipping and scissoring after adjusting for origin conventions, so
we would get scissoring wrong in window system buffers.

Fixes Piglit test "fbo-scissor-blit window".

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-24 14:52:57 -07:00
Paul Berry
da54d2e576 i965/blorp: Simplify check that src/dst width/height match.
When checking that the source and destination dimensions match, we
don't need to store the width and height in variables; doing so just
risks confusion since right after the check, we do clipping and
scissoring, which may alter the width and height.

No functional change.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-24 14:52:57 -07:00
Paul Berry
bac43b8bb7 i965/msaa: Work around problems with null render targets on Gen6.
On Gen6, multisampled null render targets don't seem to work
properly--they cause the GPU to hang.  So, as a workaround, we render
into a dummy color buffer.

Fortunately this situation (multisampled rendering without a color
buffer) is rare, and we don't have to waste too much memory, because
we can give the workaround buffer a very small pitch.

Fixes piglit test "EXT_framebuffer_multisample/no-color {2,4}
depth-computed *" on Gen6.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-24 14:52:57 -07:00
Paul Berry
0aeb87023e i965: Set width, height, and tiling properly for null render targets.
The HW docs say that the width and height of null render targets need
to match the width and height of the corresponding depth and/or
stencil buffers, and that they need to be marked as Y-tiled.  Although
leaving these values at 0 doesn't seem to cause any ill effects, it
seems wise to follow the documented requirements.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-24 14:52:57 -07:00
Paul Berry
691c55f356 i965/msaa: Control multisampling behaviour via the visual.
Previously, we used the number of samples in draw buffer 0 to
determine whether to set up the 3D pipeline for multisampling.  Using
the visual is cleaner, and has the benefit of working properly when
there is no color buffer.

Fixes all piglit tests "EXT_framebuffer_multisample/no-color" on Gen7.
On Gen6, the "depth-computed" variants of these tests still fail; this
will be addressed in a later patch.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-24 14:52:57 -07:00
Paul Berry
48fdfbcb58 msaa: Compute visual samples/sampleBuffers from all buffers.
This patch ensures that Visual.samples and Visual.sampleBuffers are
set correctly even in the case where there is no color buffer.
Previously, these values would retain their default value of 0 in this
circumstance, even if the depth or stencil buffer was multisampled.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-24 14:52:56 -07:00
Anthony G. Basile
f35e380dd2 Fix compile time errors when building against uclibc
Mesa misses a few checks when compiling on a uclibc system
which cause it to fall back on glibc-ism.  This patch
addresses those issues.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Anthony G. Basile <blueness@gentoo.org>
2012-07-24 13:00:47 -07:00
Jerome Glisse
1ffac44e83 r600g: enable streamout only on 2.14 or later kernel
The kernel streamout support was supposed to land in 3.3 along with
the tiling change, and thus share the same kernel driver version bump
to 2.13 to tell userspace that the streamout registers were supported.

That is not what happened. Since streamout kernel support did not
bump the kernel driver version, rely on the 2.14 version bump
to know whether streamout is enabled. This means you need at
least a 3.4 kernel.
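
A minimal sketch of such a check, assuming a hypothetical struct that holds the kernel driver interface version (the struct and function names are illustrative, not r600g's):

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the version reported by the kernel
 * DRM driver; the names are illustrative only. */
struct drm_version_info {
    int major;
    int minor;
};

/* Streamout support did not get its own version bump, so the first
 * interface version that guarantees it is 2.14. */
bool
streamout_supported(const struct drm_version_info *v)
{
    return v->major > 2 || (v->major == 2 && v->minor >= 14);
}
```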

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-07-24 15:08:31 -04:00
Jordan Justen
881bb4ac72 intel: move error on create context to proper path
The error was being set on the non-error path, rather
than the error path.

NOTE: This is a candidate for the 8.0 branch.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-24 11:59:19 -07:00
Jordan Justen
01168df4d9 mesa context: generate an error for uninstalled context functions
For 'non-legacy' contexts we will want to generate an error
if an uninstalled function is called.

The effect of this change will be that we can avoid installing
legacy functions, and they will then generate an error as
needed for deprecated functions in GL >= 3.1.
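
The mechanism can be sketched as a dispatch table whose slots all start out pointing at an error-raising stub. Everything below is a hypothetical simplification: the struct, the table size, and the choice of GL_INVALID_OPERATION as the error code are assumptions for illustration.

```c
#include <stddef.h>

#define GL_NO_ERROR          0
#define GL_INVALID_OPERATION 0x0502

#define NUM_ENTRIES 4   /* hypothetical, tiny dispatch table */

struct context;
typedef void (*entry_fn)(struct context *ctx);

struct context {
    unsigned error;                 /* first error recorded, GL-style */
    entry_fn dispatch[NUM_ENTRIES];
};

/* Stub installed in every slot that has no real implementation. */
void
uninstalled_entry(struct context *ctx)
{
    if (ctx->error == GL_NO_ERROR)
        ctx->error = GL_INVALID_OPERATION;
}

/* Fill the whole table with the stub; real functions are installed over
 * it afterwards, so anything left untouched raises an error when called. */
void
init_dispatch(struct context *ctx)
{
    ctx->error = GL_NO_ERROR;
    for (size_t i = 0; i < NUM_ENTRIES; i++)
        ctx->dispatch[i] = uninstalled_entry;
}
```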

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-24 11:50:35 -07:00
Brian Paul
1f9239ec8d nouveau: include glformats.h to get missing prototype
Fixes http://bugs.freedesktop.org/show_bug.cgi?id=52449
2012-07-24 10:33:20 -06:00
Brian Paul
a271a0c9f6 mesa: improve comment in build_tnl_program() 2012-07-24 09:54:50 -06:00
Brian Paul
8f2a13c5e3 docs: the legacy makefile system is removed in Mesa 8.1 2012-07-24 08:49:02 -06:00
Brian Paul
7e18a039ee mesa: move _mesa_error_check_format_and_type() to glformats.c
Now all the format/type-related helper functions are in glformats.c
and image.c is just image-related functions.
2012-07-24 08:37:29 -06:00
Brian Paul
a1287f549a mesa: move more format helper functions to glformats.c 2012-07-24 08:37:29 -06:00
Brian Paul
8b762ebd72 mesa: move some format helper functions to glformats.c 2012-07-24 08:37:29 -06:00
Christian König
de3335dba8 radeonsi: remove old state handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:30 +02:00
Christian König
9b213c871a radeonsi: move everything else into the new handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:30 +02:00
Christian König
53d47889e6 radeonsi: move format handling into si_state.c
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:30 +02:00
Christian König
73dd906ba0 radeonsi: move remaining sampler state into si_state.c
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:30 +02:00
Christian König
ca9cf611b6 radeonsi: move draw state into new handling
Split it out into si_state_draw.c

Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:30 +02:00
Christian König
0d6b0b512a radeonsi: move constants to new state handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:30 +02:00
Christian König
baf2039756 radeonsi: move sampler states into new handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:30 +02:00
Christian König
3c09f11e5c radeonsi: move shaders to new handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:30 +02:00
Christian König
bd2a5cf328 radeonsi: move spi into new handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:30 +02:00
Christian König
840f05da6b radeonsi: move init state to new handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:30 +02:00
Christian König
e4e6f954ae radeonsi: move draw_info to new state handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:30 +02:00
Christian König
76660dfcce radeonsi: move CB_TARGET_MASK into fb/blend state
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:30 +02:00
Christian König
e6937211da radeonsi: move stencil_ref to new handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:30 +02:00
Christian König
b41b3eb989 radeonsi: move dsa state to new handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:29 +02:00
Christian König
bd18a316e1 radeonsi: move inferred fb/rs state to new handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:29 +02:00
Christian König
f67fae0e43 radeonsi: move rasterizer state into new handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:29 +02:00
Christian König
835098a529 radeonsi: move framebuffer to new handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:29 +02:00
Christian König
7e011d92c9 radeonsi: move viewport to new handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:29 +02:00
Christian König
43f414f7b7 radeonsi: move scissor state to new state handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:29 +02:00
Christian König
9cbbe0d4e6 radeonsi: move clip state to new handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:29 +02:00
Christian König
0a091a4824 radeonsi: move blend color to new state handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:29 +02:00
Christian König
63636ae52a radeonsi: move blender to new state handling
Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:29 +02:00
Christian König
bf7302a6e1 radeonsi: rework state handling v2
Add a complete new state handling for SI.

v2: fix spelling error

Signed-off-by: Christian König <deathsimple@vodafone.de>
2012-07-24 12:29:29 +02:00
Brad King
27382c0f7b automake: Honor GL_LIB for mangled/custom lib names
Commit 2d4b77c7 (automake: Convert src/mesa/drivers/x11/Makefile to
automake, 2012-06-12) dropped the old Makefile, which used GL_LIB, and
replaced it with a Makefile.am hard-coding the name "GL".  This broke
handling of --enable-mangling and --with-gl-lib-name options which
depend on GL_LIB to specify the GL library name.

Use "@GL_LIB@" in src/mesa/drivers/x11/Makefile.am to configure the
library name.  Also use this approach to simplify src/glx/Makefile.am
and drop the HAVE_MANGLED_GL conditional.  While at it, fix the
compatibility link we create in "lib" for the software-only driver to
use version GL_MAJOR instead of hard-coding "1".

Reviewed-by: Dan Nicholson <dbn.lists@gmail.com>
2012-07-23 22:34:13 -07:00
Marek Olšák
82fc813ca8 st/mesa: fix DDY opcode for FBOs
This fixes piglit/fbo-deriv.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-23 19:23:53 +02:00
Marek Olšák
f40b5723f0 st/mesa: set the centroid qualifier in fragment shader inputs
This fixes some centroid tests in the EXT_framebuffer_multisample piglit group.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-23 19:23:53 +02:00
Marek Olšák
162b3ad94d st/mesa: flush the glBitmap cache before changing framebuffer state
This fixes the piglit EXT_framebuffer_multisample/bitmap tests.

Note that we must not rely on ctx->DrawBuffer when flushing the cache, because
that's already updated with a new framebuffer. We want to draw into the old
framebuffer where glBitmap was called.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-23 19:23:53 +02:00
Marek Olšák
07b9b3c37b st/mesa: set the correct window renderbuffer internal format
The multisample-resolve blit relies on this being correct.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-23 19:23:52 +02:00
Marek Olšák
5927227576 mesa: fix format checking when doing a multisample resolve
v2: make it more bullet-proof

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-23 19:23:52 +02:00
José Fonseca
c30bf68946 gallivm: Prefer the standard JIT engine whenever possible.
Testing shows that the standard JIT engine retrofitted with AVX support is quite
stable and as capable of handling AVX instructions as MC-JIT is.

And the old JIT is much more memory efficient, as we don't need to
allocate one engine instance per shader, as we do for MC-JIT due to its
incompleteness.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-07-23 17:46:38 +01:00
Jerome Glisse
cb149bf9e1 r600g: don't emit forbidden reg with old kernel on evergreen
Fix https://bugs.freedesktop.org/show_bug.cgi?id=52313

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-07-23 11:42:36 -04:00
Jerome Glisse
b7b5a77ec0 r600g: don't emit forbidden register on old kernel
Fix https://bugs.freedesktop.org/show_bug.cgi?id=52313

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-07-23 11:28:25 -04:00
Vincent Lejeune
bc4b4c605c radeon/llvm: Fix a bug with IF LOGICALNZ with int operand
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2012-07-23 15:04:36 +00:00
Tom Stellard
044de40cb0 pipe_loader: Try to connect with the X server before probing pciids v2
When X is running it is necessary for pipe_loader to authenticate with
DRM, in order to be able to use the device.

This makes it possible to run OpenCL programs while X is running.

v2:
  - Fix C++ style comments
  - Drop Xlib-xcb dependency
  - Close the X connection when done
  - Split auth code into separate function

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2012-07-23 13:25:36 +00:00
Tom Stellard
17f6c9195f configure.ac: Add --with-llvm-prefix option
This option allows you to specify the llvm install prefix.  It is
useful for switching between different versions of LLVM.
2012-07-23 13:25:36 +00:00
Kenneth Graunke
c3bc41011f mesa: Prevent repeated glDeleteShader() from blowing away our refcounts.
Calling glDeleteShader() should mark shaders as pending for deletion,
but shouldn't decrement the refcount every time.  Otherwise, repeated
glDeleteShader() is not safe.

This is particularly bad since glDeleteProgram() frees shaders: if you
first call glDeleteShader() on the shaders attached to the program (thus
decrementing the refcount), then call glDeleteProgram(), it will try
to free them again (decrementing the refcount another time), causing
a refcount > 0 assertion to fail.
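
A minimal sketch of the intended behaviour, using a hypothetical simplified shader object (the field and function names are illustrative, not Mesa's):

```c
#include <stdbool.h>

/* Hypothetical, simplified shader object. */
struct shader {
    int  ref_count;      /* one reference held by the name, one per attachment */
    bool delete_pending; /* set by the first delete request */
};

/* Drop the name's reference only on the first delete request; repeated
 * calls find delete_pending already set and leave the count alone. */
void
delete_shader(struct shader *sh)
{
    if (!sh->delete_pending) {
        sh->delete_pending = true;
        sh->ref_count--;
    }
}
```

Calling delete_shader() twice on a shader that is still attached to a program leaves the program's reference intact, so the program can release it later without tripping the assertion.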

Similar to commit d950a778.

NOTE: This is a candidate for the 8.0 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-22 14:34:44 -07:00
Matt Turner
cfdf60f236 imports.h: Correct ceilf typo.
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-22 14:06:08 -07:00
Marek Olšák
f96405f254 st/mesa: remove st_flush_bitmap wrapper
just a cleanup
2012-07-22 03:32:55 +02:00
Jordan Justen
749c9060ac mesa formats: add MESA_FORMAT_ABGR2101010_UINT
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-21 16:49:42 -07:00
Jordan Justen
1c8812c244 mesa formats: unpack ARGB8888/XRGB8888
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-21 16:49:42 -07:00
Jordan Justen
8c265cf5ef mesa pack: use _mesa_problem instead of assert
If the pack type is not supported, use _mesa_problem
rather than asserting.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-21 16:49:42 -07:00
Jordan Justen
9ad8f431b2 mesa: add glformats integer type/format detection routines
_mesa_is_integer_format is moved to formats.c and renamed
as _mesa_is_enum_format_integer.

_mesa_is_format_unsigned, _mesa_is_type_integer,
_mesa_is_type_unsigned, and _mesa_is_enum_format_or_type_integer
are added.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-21 16:49:42 -07:00
Vinson Lee
e2e7b467d8 scons: Add instrumentation component libraries to linking on llvm-3.2.
llvm-3.2svn r160587 moved createBoundsCheckingPass from
lib/Transforms/Scalar to lib/Transforms/Instrumentation.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-07-21 10:38:25 -07:00
Matt Turner
d24cf88a1a Remove unused _mesa_memset16
Unused since commit fd104a845.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-21 08:23:38 -07:00
Matt Turner
f58ba6ca91 Remove _mesa_inv_sqrtf in favor of 1/SQRTF
Except for a couple of explicit uses, _mesa_inv_sqrtf was disabled since
its addition in 2003 (see f9b1e524).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-21 08:23:38 -07:00
Matt Turner
948b1c541f Remove _mesa_sqrt* in favor of plain sqrt
Temporarily disabled since 2003 (see 386578c5b).

This saves us from calling sqrt() 128 times to generate the sqrttab in
one_time_init().

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-21 08:23:38 -07:00
Matt Turner
ec79138138 Use INV_SQRT instead of 1/SQRTF
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-21 08:23:38 -07:00
José Fonseca
bd9bf7a424 autoconf: Only link mcjit component when available.
Should fix build failures with older LLVM versions, but only tested on
LLVM 3.1.
2012-07-21 11:43:35 +01:00
Chad Versace
735070c45b i830: Fix stack corruption
Found by compiler warning:
    i830_texstate.c:131:28: warning: argument to 'sizeof' in 'memset' call
          is the same expression as the destination; did you mean to
          dereference it?  [-Wsizeof-pointer-memaccess]
       memset(state, 0, sizeof(state));
              ~~~~~            ^~~~~

On 64-bit systems, memset here would write an extra 4 bytes.
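
The general shape of the fix, sketched with a hypothetical function rather than the actual i830 code, is to size the memset by the pointed-to object instead of by the pointer:

```c
#include <string.h>

/* Hypothetical helper: 'state' points at a single 4-byte word. */
void
clear_state(unsigned *state)
{
    /* sizeof(*state) is the size of the pointee.  The buggy form,
     * sizeof(state), is the size of the pointer itself, which is
     * 8 bytes on typical 64-bit systems and so overruns the word. */
    memset(state, 0, sizeof(*state));
}
```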

Note: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-20 16:01:57 -07:00
José Fonseca
1a8f6ac5a4 mesa: disable MSVC global optimization in pack.c
To reduce excessive compilation time in release mode.

NOTE: This is a candidate for the 8.0 branch.

Tested-by: Brian Paul <brianp@vmware.com>
2012-07-20 16:23:22 -06:00
Brian Paul
9fd4e9e9e6 mesa: whitespace fixes in pbo.c 2012-07-20 16:22:59 -06:00
Brian Paul
ac14f569fe mesa: update texstore.c comment 2012-07-20 15:13:19 -06:00
Roland Scheidegger
70a969f123 llvmpipe: use runtime loop instead of static loop for looping over quads
This can potentially cut shader program size by a factor of 4 for 4-wide
execution, or 2 for 8-wide execution. These ratios aren't quite reached
for more complex shaders, but it can be close.
I could not really measure a performance difference so far, except for trivial
shaders (glxgears).
There seems to be a fair number of unnecessary moves generated, especially
at the beginning; it might be possible to optimize those away somehow.
Things aren't quite as clean, since some additional work is needed to
keep both paths working (though llvm might be able to optimize this away).
glxgears seems to lose about 5-10% of performance. Looking at the generated
shaders, this is actually less than I'd think it would be: both 4 and 8-wide
shaders, despite containing a loop, actually have about 10% more instructions
in total, and will have roughly 50% more executed instructions (though mostly
cheap ones). Need to figure out how to reduce overhead...

v2: keep complex interpolation for 4-wide mode, adapt to interface changes.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-07-20 20:17:15 +01:00
Roy Spliet
542bd6941f nv30: Support negative offsets in indirect constant access.
Fixes piglit vp-address-01 amongst several others.

Signed-off-by: Roy Spliet <r.spliet@student.tudelft.nl>
Reviewed-by: Lucas Stach <dev@lynxeye.de>
Tested-by: Lucas Stach <dev@lynxeye.de>
2012-07-20 20:31:40 +02:00
Bryan Cain
248e6f0331 nv50/ir: set position before i instead of i->next in NV50LoweringPreSSA::visit
Fixes rendering glitches in Psychonauts such as Raz's eyes flickering white.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=51962.
2012-07-20 20:30:07 +02:00
Eric Anholt
b2a44cde64 i965/gen7: Increase the WM threads to hardware limits.
This thread count is only supposed to be enabled when "WIZ Hashing Disable in
GT_MODE register enabled."  I've always been confused whether that means the
bit in the register should be 1 or 0.  For my IVB GT2's register 0x7008 value
of 0x0, this appears to work fine.

Improves l4d2 performance at 640x480 by 0.88 +/- 0.11% (n=88).  Improves
performance with rasterization at 1280x1024 by 1.45% +/- 0.36% (n=6).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-20 11:05:39 -07:00
Eric Anholt
8ab5842a6d glsl: Assign locations for uniforms in UBOs using the std140 rules.
Fixes piglit layout-std140.
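
For reference, the scalar and vector part of the std140 rules can be sketched as follows. This is an illustrative simplification with hypothetical function names, not the linker code from this commit; it covers only 32-bit scalars and vectors, not arrays, matrices or structs.

```c
#include <stddef.h>

/* Base alignment in bytes of a 32-bit scalar or vector with the given
 * number of components (1..4) under std140: N, 2N, 4N, 4N with N = 4. */
unsigned
std140_base_alignment(unsigned components)
{
    switch (components) {
    case 1:  return 4;
    case 2:  return 8;
    default: return 16;   /* vec3 and vec4 */
    }
}

/* Assign byte offsets to consecutive scalar/vector members of a block.
 * Returns the number of bytes used by the members. */
unsigned
std140_assign_offsets(const unsigned *components, unsigned *offsets,
                      size_t count)
{
    unsigned next = 0;

    for (size_t i = 0; i < count; i++) {
        unsigned align = std140_base_alignment(components[i]);

        next = (next + align - 1) / align * align;  /* round up */
        offsets[i] = next;
        next += 4 * components[i];
    }
    return next;
}
```

For a block declared as { float a; vec3 b; vec2 c; float d; } this yields offsets 0, 16, 32 and 40.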

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-20 10:44:04 -07:00
Eric Anholt
9feb403b0e glsl: Don't resize arrays in uniform blocks.
This is a requirement for std140 uniform blocks, and optional for
packed/shared blocks.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-20 10:43:59 -07:00
Eric Anholt
0cea8a56b6 glsl: Don't dead-code eliminate uniforms declared in uniform blocks.
This is a requirement for std140 uniform blocks, and optional for
packed/shared blocks.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-20 10:43:52 -07:00
Eric Anholt
548bce4733 mesa: Implement the UBO-specific pnames of glGetActiveUniformsiv.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-20 10:43:50 -07:00
Eric Anholt
a74507dc94 glsl: Propagate uniform block information into gl_uniform_storage.
Now we can actually return information on uniforms in uniform blocks
in the new queries.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-20 10:43:47 -07:00
Eric Anholt
ddc88fbf51 mesa: Add implementation of glGetUniformBlockIndex().
Now that we finally have a list of uniform blocks in the linked shader
program, we can tell what their indices are.

Fixes piglit GL_ARB_uniform_buffer_object/getuniformblockindex.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-20 10:43:44 -07:00
Eric Anholt
093b20666d glsl: Set the uniform_block index for the linked shader variables.
At this point in the linking, we've totally lost track of the struct
gl_uniform_buffer that this pointed to in the original unlinked
shader, so we do a nasty n^2 walk to find the new one based on the
variable name.

Note that these point into the shader's list of gl_uniform_buffers,
not the linked program's.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-20 10:43:42 -07:00
Eric Anholt
9f1a4a6340 mesa: Add support for glGetActiveUniformsiv on non-UBO pnames.
We'll need to propagate the UBO fields to the uniform storage records
before we can handle the other pnames.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-20 10:43:40 -07:00
Eric Anholt
acfbdfcbc8 mesa: Add support for glGetUniformIndices().
This is a single entrypoint that maps from a series of names to the
indices of those names within the active uniforms list.  Each index is
like glGetUniformLocation()'s return value, except that it doesn't
encode an array offset.
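
The name-to-index mapping can be sketched as below. The function name and the flat array of active uniform names are hypothetical simplifications; GL_INVALID_INDEX is the value the GL API uses for names that do not match an active uniform.

```c
#include <stddef.h>
#include <string.h>

#define GL_INVALID_INDEX 0xFFFFFFFFu

/* Map each requested name to its position in the active uniform list,
 * or to GL_INVALID_INDEX when the name is not an active uniform. */
void
get_uniform_indices(const char *const *active, size_t num_active,
                    const char *const *names, size_t count,
                    unsigned *indices)
{
    for (size_t i = 0; i < count; i++) {
        indices[i] = GL_INVALID_INDEX;
        for (size_t j = 0; j < num_active; j++) {
            if (strcmp(names[i], active[j]) == 0) {
                indices[i] = (unsigned)j;
                break;
            }
        }
    }
}
```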

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-20 10:43:35 -07:00
Eric Anholt
abcdbdf9cc mesa: Move the _mesa_uniform_merge_location_offset to glGetUniformLocation().
With the upcoming GL_ARB_uniform_buffer_object changes, the only
other caller that will want the cooked value is state_tracker.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-20 10:43:33 -07:00
Eric Anholt
f609cf782a glsl: Merge the lists of uniform blocks into the linked shader program.
This attempts error-checking, but the layout isn't done yet.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-20 10:43:28 -07:00
Eric Anholt
b3c093c79c glsl: Translate the AST for uniform blocks into some IR structures.
We're going to need this structure to cross-validate the uniform
blocks between shader stages, since unused ir_variables might get
dropped.  It's also the place we store the RowMajor qualifier, which
is not part of the GLSL type (since that would cause a bunch of type
equality checks to fail).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-20 10:43:19 -07:00
Eric Anholt
f7561e8ecd glsl: Turn UBO variable declarations into ir_variables and check qualifiers.
Fixes piglit layout-*-non-uniform and layout-*-within-block.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-20 10:43:12 -07:00
Lucas Stach
cdad337fec st/xorg: fix masked transformations
Someone tried to be clever and "optimized" add_vertex_data2() to just use
two points for the texture coordinates and then reuse individual
components. Sadly this is not how matrix multiplication works.

Fixes rendercheck -t tmcoords

Signed-off-by: Lucas Stach <dev@lynxeye.de>
2012-07-20 18:47:54 +02:00
Paul Berry
60c3e69dbf i965/blorp: Use IMS layout when texturing from depth/stencil surfaces.
Previously, on Gen7, when texturing from a depth or stencil surface,
the blorp engine would configure the 3D pipeline as though the input
surface was non-multisampled, and perform the necessary coordinate
transformations in the fragment shader to account for the IMS layout.
This meant outputting a lot of extra fragment shader code, and it
raised some uncertainty about how to deal with very large surfaces.

This patch modifies blorp to configure the 3D pipeline properly for
IMS layout when reading from depth and stencil surfaces.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-07-20 09:35:38 -07:00
Paul Berry
0dd5e98aa5 i965/blorp: Loosen assertions in compute_msaa_layout_for_pipeline.
Previously, on Gen7, compute_msaa_layout_for_pipeline() would verify
that IMS layout is not used.  However, now that we configure
SURFACE_STATE correctly for IMS surfaces, IMS layout is available.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-07-20 09:35:38 -07:00
Paul Berry
989218b980 i965/blorp: Configure SURFACE_STATE correctly for IMS surfaces.
This patch modifies gen7_set_surface_num_multisamples() to set up the
SURFACE_STATE appropriately for texturing from IMS format MSAA
surfaces (which are only used on Gen7 for depth and stencil buffers).
Since the function now sets more than just the number of multisamples,
it's been renamed to gen7_set_surface_msaa().

This will make it possible to remove some kludginess from the blorp
engine.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-07-20 09:35:38 -07:00
Paul Berry
f91b4d92b9 i965/blorp: Optimize manual_blend() for compressed multisampled surfaces.
When downsampling a compressed multisampled surface, we can take a
shortcut to downsample any pixels that were completely covered by a
single primitive.  In this case, the first color value we fetch is the
correct final color for the downsampled pixel, so we can skip the rest
of the blending operation.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-07-20 09:35:37 -07:00
Paul Berry
e5d983267a i965/blorp: Fix integer downsampling on Gen7.
When downsampling an integer-format buffer on Gen7, we need to use the
"avg" instruction rather than the "add" instruction, to ensure that we
don't overflow the range of 32-bit integers.  Also, we need to use the
proper register type (BRW_REGISTER_TYPE_D or BRW_REGISTER_TYPE_UD) for
intermediate color data and for writing to the render target.

Note: this patch causes blorp to use the proper register type for all
operations (downsampling, upsampling, and ordinary blits).  Strictly
speaking, this is only necessary for downsampling, because the other
operations exclusively use MOV instructions on the color data.  But
it's simpler to use the proper register type in all cases.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-07-20 09:35:37 -07:00
Paul Berry
b961d37e61 i965/blorp: Modify manual_blend() to avoid unnecessary loss of precision.
When downsampling from an MSAA image to a single-sampled image, it is
inevitable that some loss of numerical precision will occur, since we
have to use 32-bit floating point registers to hold the intermediate
results while blending.  However, it seems reasonable to expect that
when all samples corresponding to a given pixel have the exact same
color value, there will be no loss of precision.

Previously, we averaged samples as follows:

    blend = (((sample[0] + sample[1]) + sample[2]) + sample[3]) / 4

This had the potential to lose numerical precision when all samples
have the same color value, since ((sample[0] + sample[1]) + sample[2])
may not be precisely representable as a 32-bit float, even if the
individual samples are.

This patch changes the formula to:

    blend = ((sample[0] + sample[1]) + (sample[2] + sample[3])) / 4

This avoids any loss of precision in the event that all samples are
the same, by ensuring that each addition operation adds two equal
values.
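
The two formulas can be written out in plain C as a sketch (the function names are illustrative; the real code emits GPU instructions). The pairwise form is exact for equal samples by construction, because doubling a binary floating-point value and dividing by 4 only change the exponent, barring overflow and underflow. The sequential form instead depends on how the intermediate three-sample sum rounds.

```c
/* Old formula: the three-sample intermediate sum may need rounding. */
float
blend_sequential(const float s[4])
{
    return (((s[0] + s[1]) + s[2]) + s[3]) / 4.0f;
}

/* New formula: when all samples are equal, every addition adds two
 * equal values, so each step is an exact doubling. */
float
blend_pairwise(const float s[4])
{
    return ((s[0] + s[1]) + (s[2] + s[3])) / 4.0f;
}
```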

As a side benefit, this puts the formula in the form we will need in
order to implement correct blending of integer formats.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-07-20 09:35:37 -07:00
Paul Berry
6a27506181 i965: Add support for AVG instruction.
From the Ivy Bridge PRM, Vol4 Part3 p152:

    "The avg instruction performs component-wise integer average of
    src0 and src1 and stores the results in dst. An integer average
    uses integer upward rounding. It is equivalent to increment one to
    the addition of src0 and src1 and then apply an arithmetic right
    shift to this intermediate value."
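
Read literally, the quoted description corresponds to the following C sketch for signed 32-bit operands. The function name is illustrative, and the 64-bit intermediate is an assumption made here so that the sum cannot overflow; the PRM text does not describe the hardware's internal width.

```c
#include <stdint.h>

/* Integer average with upward rounding: (src0 + src1 + 1) >> 1.
 * Right-shifting a negative value is implementation-defined in C;
 * mainstream compilers implement it as an arithmetic shift, which is
 * what the quoted description calls for. */
int32_t
avg_d(int32_t src0, int32_t src1)
{
    int64_t sum = (int64_t)src0 + (int64_t)src1 + 1;

    return (int32_t)(sum >> 1);
}
```

For example, avg_d(1, 2) is 2 rather than 1, and averaging INT32_MAX with itself returns INT32_MAX, where a plain 32-bit add would overflow.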

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-07-20 09:35:37 -07:00
Paul Berry
9544e44262 i965: Replace fs_visitor::kill_emitted with gl_fragment_program::UsesKill.
The kill_emitted variable was duplicating the functionality of
gl_fragment_program::UsesKill.  There's no need for both.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-20 09:33:07 -07:00
Paul Berry
0f1f2ff8db mesa: Set gl_fragment_program::UsesKill in do_set_program_inouts.
Previously, the code for setting this flag for GLSL programs was
duplicated in three places: brw_link_shader(), glsl_to_tgsi_visitor,
and ir_to_mesa_visitor.  In addition to the unnecessary duplication,
there was a performance problem on i965: brw_link_shader() set the
flag before doing its final round of optimizations, which meant that
if the optimizations managed to eliminate all the discard operations,
the flag would still be set, resulting (at least in theory) in slower
performance.

This patch consolidates all of the code that sets UsesKill for GLSL
programs into do_set_program_inouts(), which already is doing a
similar job for UsesDFdy, and which occurs after i965's final round of
optimizations.

Non-GLSL programs (ARB programs and the state tracker's glBitmap
program) are unaffected.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-20 09:33:07 -07:00
Kristian Høgsberg
a8c092266e gallium-egl: Move wayland query_buffer implementation
Move it to native_wayland_drm_bufmgr_helper.c which only gets compiled when
wayland is enabled and which already includes the right headers.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2012-07-19 16:11:06 -04:00
Olivier Galibert
fbe3fa74e5 softpipe: Fix segfault with fbo-cubemap.
The cube sampler generates two-dimensional texture coordinates and
hence passes NULL for the array for the third one.  The actual 2D
sampler, lower in the pipe, knew not to used that array since it
didn't need it.  But the samplers have become single-texel and the
coordinate array dereference has been moved up one step, to a level
where the code does not know only two coordinates are used.  Hence the
segfault.

The simplest fix by far is to add a third dummy coordinate array in
the call to the next pipe step, which will be dereferenced to a
harmless 0 which then will be happily ignored by the sampler.
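
The shape of that fix can be sketched as follows. Both functions are hypothetical simplifications, not softpipe's real sampler interface; the point is only that the caller hands over a zero-filled array where it used to pass NULL.

```c
#define QUAD_SIZE 4

/* Hypothetical next pipeline step: it dereferences all three coordinate
 * arrays before it knows that only two of them matter. */
float
sample_2d_texel(const float *s, const float *t, const float *p, int i)
{
    float unused = p[i];   /* would crash if p were NULL */

    (void)unused;
    return s[i] + t[i];    /* placeholder for the real texel fetch */
}

/* Hypothetical cube-face step: it only has two coordinates, so it
 * passes a dummy third array instead of NULL. */
float
sample_cube_face(const float s[QUAD_SIZE], const float t[QUAD_SIZE], int i)
{
    static const float dummy_p[QUAD_SIZE] = { 0.0f, 0.0f, 0.0f, 0.0f };

    return sample_2d_texel(s, t, dummy_p, i);
}
```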

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=52250

Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-19 13:19:14 -06:00
Kristian Høgsberg
d7522ed130 wayland: Support EGL_WIDTH and EGL_HEIGHT queries for wl_buffer
We're going to make the public wl_buffer struct as small as possible.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2012-07-19 14:03:17 -04:00
Kristian Høgsberg
e23bfdb329 wayland: Use existing EGL_TEXTURE_FORMAT for querying wl_buffer texture format
We also reuse EGL_TEXTURE_RGBA and EGL_TEXTURE_RGB, adding only the new
planar YUV texture formats: EGL_TEXTURE_Y_U_V_WL, EGL_TEXTURE_Y_UV_WL and
EGL_TEXTURE_Y_XUXV_WL.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2012-07-19 14:03:17 -04:00
Kristian Høgsberg
e1b45a3c06 gallium-egl: Implement eglQueryWaylandBufferWL
Support this query for gallium EGL too.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2012-07-19 14:03:17 -04:00
Kenneth Graunke
d43f4181e1 glsl: Remove open coded version of ir_variable::interpolation_string().
Presumably the function didn't exist when we wrote this code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-19 11:00:00 -07:00
Paul Berry
d08fdacd58 i965: Avoid unnecessary recompiles for shaders that don't use dFdy().
The i965 back-end needs to compile dFdy() differently for FBOs and
window system framebuffers, because Y coordinates are flipped between
the two (see commit 82d2596: i965: Compute dFdy() correctly for FBOs).
This patch avoids unnecessarily recompiling shaders that don't use
dFdy(), by only setting render_to_fbo in the wm program key if the
shader actually uses dFdy().
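
A minimal sketch of the idea, with a hypothetical one-field program key (the real key has many more fields, and these function names are illustrative):

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, stripped-down program key. */
struct wm_prog_key {
    bool render_to_fbo;
};

/* Only let the FBO/window distinction into the key when the shader
 * actually uses dFdy(); otherwise both cases share one compiled program. */
void
populate_key(struct wm_prog_key *key, bool uses_dfdy, bool drawing_to_fbo)
{
    memset(key, 0, sizeof(*key));
    key->render_to_fbo = uses_dfdy && drawing_to_fbo;
}

/* Two equal keys mean the cached program can be reused. */
bool
keys_equal(const struct wm_prog_key *a, const struct wm_prog_key *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```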

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-19 10:02:25 -07:00
Paul Berry
ce1d2f08f9 glsl: Set UsesDFdy appropriately for GLSL shaders.
This patch updates the ir_set_program_inouts_visitor so that it also
sets gl_fragment_program::UsesDFdy.

This is a bit of a hack (since dFdy() isn't an input or an output),
but there's no other obvious visitor to squeeze this functionality
into, and it would be silly to create a brand new visitor just for
this purpose.

v2: use local 'fprog' var to avoid repeated casting.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-19 10:02:21 -07:00
Paul Berry
a0f7b86959 mesa: Set UsesDFdy appropriately for assembly programs.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-19 10:02:19 -07:00
Paul Berry
5e310e9f83 mesa: Add UsesDFdy to struct gl_fragment_program.
The i965 back-end needs to compile dFdy() differently for FBOs and
window system framebuffers, because Y coordinates are flipped between
the two (see commit 82d2596: i965: Compute dFdy() correctly for FBOs).
This boolean will allow it to avoid unnecessarily recompiling shaders
that don't use dFdy().

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-19 10:02:01 -07:00
Kenneth Graunke
658a63e5d9 drirc: Add disable_blend_func_extended workaround for Unigine OilRush.
The previous commit implemented the workaround, cited a bug report
about OilRush, but actually only enabled the workaround for the demos.

Turn it on for OilRush too.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50291
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-19 01:40:24 -07:00
Kenneth Graunke
040894391a i965: Add a driconf option to disable GL_ARB_blend_func_extended.
Unigine Heaven (at least) has a bug where it incorrectly uses the
GL_ARB_blend_func_extended extension.

Dual source blending allows two color outputs per render target;
individual shader outputs can be assigned to be either the first or
second blending input by setting the 'index' via one of two methods:

- An API call: glBindFragDataLocationIndexed()
- The GLSL 'layout' qualifier provided by GL_ARB_explicit_attrib_location

Both of these only work on user defined fragment shader outputs; it's an
error to use either on built-in outputs like gl_FragData.

Unigine uses gl_FragData and gl_FragColor exclusively, and doesn't even
attempt to use either method to set index == 1.  However, it does set
the blending function to SRC1 enums, which requires a fragment shader
output with index == 1 or else rendering is undefined.

In other words, enabling ARB_blend_func_extended causes Unigine to
render incorrectly, resulting in an apparent regression, even though our
driver code (as far as I can tell) is perfectly fine.
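
The rule the application violates can be sketched as a small check. This is an illustration under stated assumptions, not driver code: `blend_is_well_defined` is a hypothetical helper, and the factor names are the GL SRC1 enums written without the `GL_` prefix.

```python
# Blend factors that read the second (index == 1) fragment shader output.
SRC1_FACTORS = {
    "SRC1_COLOR",
    "ONE_MINUS_SRC1_COLOR",
    "SRC1_ALPHA",
    "ONE_MINUS_SRC1_ALPHA",
}


def blend_is_well_defined(blend_factors, output_indices):
    """Return True unless a SRC1 factor is used without an index-1 output.

    blend_factors: factor names set through the blend function API.
    output_indices: set of 'index' values assigned to fragment outputs
    (built-ins such as gl_FragColor can only ever have index 0).
    """
    uses_src1 = any(f in SRC1_FACTORS for f in blend_factors)
    return (not uses_src1) or (1 in output_indices)
```

A shader that writes only gl_FragColor has `output_indices == {0}`, so any SRC1 factor makes the result undefined in this model, which is the situation described for Unigine.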

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50291
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-19 01:22:34 -07:00
Brian Paul
768be75c44 mesa: remove stale comment 2012-07-18 16:51:47 -06:00
Brian Paul
e4f8d33aea mesa: use gl_program cast wrappers
In a few cases, remove unneeded casts.
And fix a few other const-correctness issues.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-18 16:51:47 -06:00
Brian Paul
1170b5aa9f mesa: add some gl_program cast wrappers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-18 16:51:47 -06:00
Marek Olšák
c3c83af380 r600g: setup streamout before calling last r600_need_cs_space before drawing
This fixes CS checker errors due to registers not being initialized, because
the flush occurred after dirty state was emitted but before drawing.
2012-07-18 22:42:58 +02:00
Eric Anholt
a40c1f9522 i965/fs: Make register spill/unspill only do the regs for that instruction.
Previously, if we were spilling the result of a texture call, we would store
all 4 regs, then for each use of one of those regs as the source of an
instruction, we would unspill all 4 regs even though only one was needed.

In both lightsmark and l4d2 with my current graphics config, the shaders that
produce spilling do so on split GRFs, so this doesn't help them out.  However,
in a capture of the l4d2 shaders with a different snapshot and playing the
game instead of using a demo, it reduced one shader from 2817 instructions to
2179, due to choosing a now-cheaper texture result to spill instead of piles
of texcoords.

v2: Fix comment noted by Ken, and fix the if condition associated with it for
    the current state of what constitutes a partial write of the destination.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2012-07-18 12:30:06 -07:00
Eric Anholt
a454f8ec6d i965/fs.h: Refactor tests for instructions modifying a register.
There's one instance of a potential behavior change: propagate_constants may
now propagate into a part of a vgrf after a different part of it was
overwritten by a send that returns multiple registers.  I don't think we ever
generate IR that meets that condition, but it's something to note if we bisect
behavior change to this.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-18 12:30:06 -07:00
Eric Anholt
fc01376c50 i965/fs: Replace usage of is_tex() with regs_written() checks.
In these places, we care about any sort of send that hits more than one reg,
not just textures.  We don't yet have anything else returning more than one
reg, so there's no change.

v2: Use mlen instead of is_tex() for the is-it-a-send check.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-18 12:30:06 -07:00
Eric Anholt
a6411520b4 i965/fs: Rename virtual_grf_next to virtual_grf_count.
"count" is a more useful name, since most of the time we're using it for
looping over the variables.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-18 12:30:06 -07:00
Eric Anholt
40cd60a315 i965/fs: Move a block out of a loop in live variables setup.
This was accidentally copy-and-pasted inside.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-18 12:30:06 -07:00
Anuj Phogat
cd5cd85a43 i965/msaa: Disable alpha-to-{coverage, one} when drawbuffer zero is in integer format
OpenGL specification 3.3 (page 196), section 4.1.3 says:
"If drawbuffer zero is not NONE and the buffer it references has an
integer format, the SAMPLE_ALPHA_TO_COVERAGE and SAMPLE_ALPHA_TO_ONE
operations are skipped."
This should work properly even if there are other draw buffers that
are not in integer format.
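
The quoted rule can be sketched as a predicate. This is a hypothetical model, not the i965 state code: `alpha_ops_enabled` and the dictionary layout for draw buffers are assumptions made for the example.

```python
def alpha_ops_enabled(draw_buffers, alpha_to_coverage, alpha_to_one):
    """Return the effective (alpha_to_coverage, alpha_to_one) enables.

    draw_buffers: list whose entry 0 is drawbuffer zero; each entry is
    either None (NONE) or a dict like {"integer": bool}.
    """
    db0 = draw_buffers[0] if draw_buffers else None
    # Only drawbuffer zero matters; other buffers may have any format.
    skip = db0 is not None and db0["integer"]
    return (alpha_to_coverage and not skip, alpha_to_one and not skip)
```

Note that only the format of drawbuffer zero is consulted, which matches the remark that other, non-integer draw buffers do not change the outcome.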

This patch makes the following piglit tests pass on mesa:
int-draw-buffers-alpha-to-coverage
int-draw-buffers-alpha-to-one

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-07-18 11:54:12 -07:00
Lucas Stach
fb18ec4f27 st/xorg: attach EDID to outputs
Allows tools like GNOME's monitor configuration to show meaningful names.

v2: fix resource leak

Signed-off-by: Lucas Stach <dev@lynxeye.de>
2012-07-18 17:19:16 +02:00
Lucas Stach
9de16ac0a8 st/xorg: remove superfluous memset
exaDriverAlloc() uses calloc, which already initialises pExa to zero.

Signed-off-by: Lucas Stach <dev@lynxeye.de>
2012-07-18 17:19:07 +02:00
Lucas Stach
70f0eda127 st/xorg: reorder exa context creation and use screen param queries
Gives the x-server a more accurate description of the exa hardware
capabilities.

v2: drop NPOT check

Signed-off-by: Lucas Stach <dev@lynxeye.de>
2012-07-18 17:18:55 +02:00
Olivier Galibert
229a1a7e4d softpipe: Take all lods into account when texture sampling.
This patch churns a lot because it needs to change 4-wide filters into
single pixel filters, since each fragment may use a different filter.

The only case not entirely supported is the anisotropic filtering.
Not sure what we want to do there, since a full quad is required by
that filter.

Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-18 08:02:39 -06:00
Marek Olšák
99c65bac34 r600g: implement wait-free buffer transfer for DISCARD_RANGE
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2012-07-18 07:16:30 +02:00
Marek Olšák
8ac9801669 r600g: accelerate buffer copying
This will be useful for efficient handling of the DISCARD transfer flags.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2012-07-18 06:32:57 +02:00
Marek Olšák
f237fd431b r600g: update R600_MAX_DRAW_CS_DWORDS to take draw-opaque into account 2012-07-18 06:25:37 +02:00
Marek Olšák
30257c3291 r600g: move VGT_STRMOUT_DRAW_OPAQUE_OFFSET initialization into invariant state 2012-07-18 06:25:37 +02:00
Marek Olšák
d9ba1b0beb r600g: only set the index type if drawing is indexed 2012-07-18 06:25:37 +02:00
Marek Olšák
1cfb55c509 r600g: remove debug code for streamout 2012-07-18 06:25:37 +02:00
Marek Olšák
ff9a49328e r600g: inline r600_context_draw_opaque_count 2012-07-18 06:25:37 +02:00
Marek Olšák
1b699a4832 r600g: fix alphatest without a colorbuffer on evergreen 2012-07-18 06:25:36 +02:00
Marek Olšák
82a1d24175 r600g: fix alphatest without a colorbuffer on r6xx-r7xx 2012-07-18 04:35:38 +02:00
Marek Olšák
de4fd087cb r600g: always derive alphatest state from the first colorbuffer 2012-07-18 04:17:11 +02:00
Marek Olšák
bc2f5fc01e r600g: atomize alphatest state 2012-07-18 03:45:25 +02:00
Marek Olšák
5130196c0b r600g: try to fix line stippling with lineloops
The piglit test is failing, but visually it looks almost correct.
2012-07-18 02:17:10 +02:00
Marek Olšák
43e226b6ef r600g: optimize uploading depth textures
Make it only copy the portion of a depth texture being uploaded and
not the whole 2D layer.

There is also a little code cleanup.
2012-07-18 00:32:50 +02:00
Marek Olšák
b242adbe5c r600g: remove needless wrapper r600_texture_depth_flush 2012-07-18 00:21:53 +02:00
Marek Olšák
611dd52942 r600g: init_flushed_depth_texture should be able to report errors 2012-07-18 00:21:53 +02:00
Paul Berry
e9b908b014 msaa: Generate proper error for operations prohibited on MSAA buffers.
From the GL 3.0 spec, section 4.3.3, in the documentation for
CopyPixels():

    "An INVALID_OPERATION error will be generated if the object bound
    to READ_FRAMEBUFFER_BINDING is framebuffer complete and the value
    of SAMPLE_BUFFERS is greater than zero."

The same applies to CopyTexImage...() and CopyTexSubImage...()
functions, since they are defined in terms of CopyPixels().

Previously we were generating an INVALID_FRAMEBUFFER_OPERATION error
in these cases.
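
A minimal sketch of the resulting error selection, assuming the usual rule that an incomplete read framebuffer yields INVALID_FRAMEBUFFER_OPERATION. `copy_pixels_error` is a hypothetical helper, not the Mesa function.

```python
def copy_pixels_error(read_fb_complete, sample_buffers):
    """Pick the GL error for CopyPixels()-style reads (illustrative only)."""
    if not read_fb_complete:
        # Incomplete read framebuffer: a framebuffer-specific error.
        return "INVALID_FRAMEBUFFER_OPERATION"
    if sample_buffers > 0:
        # Complete but multisampled: the spec text quoted above applies.
        return "INVALID_OPERATION"
    return "NO_ERROR"
```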

Fixes piglit tests
"EXT_framebuffer_multisample/negative-{copypixels,copyteximage}".

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-17 14:40:01 -07:00
Brian Paul
c4d2a14d6e gallivm: silence uninitialized variable warnings 2012-07-17 14:41:29 -06:00
Marek Olšák
9d699cd845 r600g: fix lockups with and enable dual source blending on evergreen
GL_ARB_blend_func_extended is now enabled on all chipsets.
2012-07-17 21:22:15 +02:00
Marek Olšák
c26fadf195 r600g: remove unused code after conversion of sampler views 2012-07-17 21:22:15 +02:00
Marek Olšák
5d8d4252f2 r600g: convert sampler view emission into atoms
Vertex and constant buffers are emitted in the same way.
This is mainly a simplification of the code. The cleanup is in another patch.
2012-07-17 21:22:15 +02:00
Marek Olšák
7022f49b52 r600g: only make constant buffers dirty if there's something to update 2012-07-17 21:22:15 +02:00
Marek Olšák
80755ff563 r600g: properly track which textures are depth
This fixes the issue with have_depth_texture never being set to false.
2012-07-17 21:22:15 +02:00
Marek Olšák
e5de73cafd r600g: consolidate and optimize sampler states changes for evergreen
Only set sampler states which changed.
2012-07-17 21:22:14 +02:00
Marek Olšák
883c43cdd4 r600g: don't invalidate texture caches when setting sampler states
Changing sampler states doesn't change resource bindings.
2012-07-17 21:22:14 +02:00
Marek Olšák
ba48f47ebf r600g: consolidate code for setting sampler views and fix bugs in the process
Issues fixed:

- set_vs_sampler_views for evergreen is now properly implemented.

- Added the missing inval_texture_cache call for evergreen.

- have_depth_texture was sometimes incorrectly set to false on evergreen even
  if there were depth textures in other shader stages. To fix this, set it
  to true once and never set it to false again. It's stupid, but it matches
  the r600 code. The proper fix is left to another patch.

- Optimization: The sampler views which aren't changed aren't updated.
2012-07-17 21:22:14 +02:00
Marek Olšák
d1ca16b273 r600g: remove unused flag have_depth_fb
This is a leftover from:

commit fe1fd67556
Author: Marek Olšák <maraeo@gmail.com>
Date:   Sun Jul 8 03:10:37 2012 +0200

    r600g: don't flush depth textures set as colorbuffers
2012-07-17 21:22:14 +02:00
Marek Olšák
585baac652 r600g: do fine-grained vertex buffer updates
If only some buffers are changed, the other ones don't have to be re-emitted.
This uses bitmasks of enabled and dirty buffers just like
emit_constant_buffers does.
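
The enabled/dirty bitmask scheme can be sketched as below. This is an illustration of the general technique, not the r600g code: `VertexBufferState` and its methods are hypothetical names.

```python
class VertexBufferState:
    """Track bound buffers with 'enabled' and 'dirty' bitmasks (sketch)."""

    def __init__(self):
        self.buffers = {}      # slot -> buffer object
        self.enabled_mask = 0  # bit i set: slot i has a buffer bound
        self.dirty_mask = 0    # bit i set: slot i must be re-emitted

    def set(self, slot, buf):
        bit = 1 << slot
        if self.buffers.get(slot) == buf:
            return  # unchanged binding: nothing becomes dirty
        if buf is None:
            self.buffers.pop(slot, None)
            self.enabled_mask &= ~bit
            self.dirty_mask &= ~bit
        else:
            self.buffers[slot] = buf
            self.enabled_mask |= bit
            self.dirty_mask |= bit

    def emit(self):
        """Return the slots that need emitting and clear the dirty mask."""
        emitted = []
        mask = self.dirty_mask
        while mask:
            slot = (mask & -mask).bit_length() - 1  # lowest set bit
            emitted.append(slot)
            mask &= mask - 1                        # clear that bit
        self.dirty_mask = 0
        return emitted
```

After an initial emit of all slots, rebinding one slot makes the next `emit()` return only that slot.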
2012-07-17 21:22:14 +02:00
Marek Olšák
f4f2e8ebe1 r600g: don't call inval_shader_cache in r600_context_flush twice
It's already called in r600_constant_buffers_dirty.
2012-07-17 21:22:14 +02:00
Marek Olšák
6694a68d89 gallium/util: add util_bit_last - finds the last bit set in a word 2012-07-17 21:22:14 +02:00
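A sketch of what "last bit set" usually means. The return convention here (1-based position of the highest set bit, 0 for an empty word) is an assumption; the commit title does not state which convention the helper uses, and `bit_last` is a hypothetical name.

```python
def bit_last(word):
    """Return the 1-based position of the highest set bit, or 0 if none."""
    word &= 0xFFFFFFFF  # model a 32-bit unsigned word
    pos = 0
    while word:
        pos += 1
        word >>= 1
    return pos
```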
Marek Olšák
018e3f75d6 r600g: fix all failing depth-stencil tests for evergreen 2012-07-17 21:22:14 +02:00
Michel Dänzer
761131ce45 configure.ac: Further LLVM fixups.
* Also add mcjit in the non-OpenCL case.
* Replace hardcoded llvm-config with $LLVM_CONFIG everywhere.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-07-17 19:12:01 +02:00
Michel Dänzer
39c4bc7fdf glsl: Drop obsolete .gitignore entries.
Helps spotting and removing the obsolete generated files, which otherwise break
the build.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-07-17 18:30:32 +02:00
Tom Stellard
ed41a559dc configure.ac: Add libLLVMMCJIT to the LLVM_LDFLAGS
This is necessary for linking the llvmpipe tests.  It appears this
dependency was introduced by the "wider native register" changes.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-07-17 12:08:24 -04:00
Eric Anholt
fadc9eaf97 intel: Add a comment explaining why we early return on matching BO names.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-17 08:18:08 -07:00
Eric Anholt
2b311fd802 intel: Drop other checks for old loader version.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-17 08:18:06 -07:00
Eric Anholt
1b4374d364 intel: Replace the non-getBuffersWithFormat compat path with an error message.
It's been broken (using NULL getBuffersWithFormat() instead of
getBuffers()) due to a copy and paste error for a year now.
GetBuffersWithFormat has been around since 2009, so I don't feel any
guilt in not supporting it.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-17 08:18:04 -07:00
Eric Anholt
9bbf7c139b intel: Remove dead intel_framebuffer_has_hiz().
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-17 08:18:02 -07:00
Eric Anholt
bce58e155d intel: Convert to using private depth/stencil buffers (v2)
This means that GLX buffer sharing of these no longer works.  On the
other hand, just *look* at this code reduction.

v2:
  - [chad] Fix intelCreateBuffer for gen < 6. When the branch for
    !screen->hw_has_separate_stencil was taken,
    intel_create_private_renderbuffer was incorrectly not used.

  - [chad] Remove all code in intel_process_dri2_buffer for processing
    depth, stencil, and hiz buffers. That code is now dead.

CC: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-17 08:17:56 -07:00
Eric Anholt
433ff3e16e intel: Add a function for creating a private window system buffer.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-17 08:17:38 -07:00
Roland Scheidegger
bf484024b9 gallivm: (trivial) remove unnecessary bogus include 2012-07-17 17:11:18 +02:00
Kristian Høgsberg
2023bf996e gbm: Add gbm_bo_import for gallium gbm backend 2012-07-17 10:54:00 -04:00
Elvis Lee
1f2c87cc8f st/egl: Fix build for wayland includes
common/native_wayland_drm_bufmgr_helper.c fails to find wayland-server.h

Signed-off-by: Elvis Lee <kwangwoong.lee@lge.com>
2012-07-17 10:54:00 -04:00
Elvis Lee
23f1e551cc st/gbm: renaming pitch to stride on gallium
commit '7250cd506baa0bd4649b30d87509cdd0cbc06a57'
changes struct gbm_bo, renaming its 'pitch' to 'stride'.
This applies to Gallium.

Signed-off-by: Elvis Lee <kwangwoong.lee@lge.com>
2012-07-17 10:54:00 -04:00
Matt Turner
f42e601ce0 glx: build tests after libglx.la
Previously, if you ran make followed by make check it would work, but
if you just ran make check the test program would fail to compile.

Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>
2012-07-17 06:59:00 -07:00
José Fonseca
3469715a8a gallivm,draw,llvmpipe: Support wider native registers.
Squashed commit of the following:

commit 7acb7b4f60dc505af3dd00dcff744f80315d5b0e
Author: José Fonseca <jfonseca@vmware.com>
Date:   Mon Jul 9 17:46:31 2012 +0100

    draw: Don't use dynamically sized arrays.

    Not supported by MSVC.

commit 5810c28c83647612cb372d1e763fd9d7780df3cb
Author: José Fonseca <jfonseca@vmware.com>
Date:   Mon Jul 9 17:44:16 2012 +0100

    gallivm,llvmpipe: Don't use expressions with PIPE_ALIGN_VAR().

    MSVC doesn't accept expressions in _declspec(align(...)). Use a
    define instead.

commit 8aafd1457ba572a02b289b3f3411e99a3c056072
Author: José Fonseca <jfonseca@vmware.com>
Date:   Mon Jul 9 17:41:56 2012 +0100

    gallium/util: Make u_cpu_detect.h header C++ safe.

commit 5795248350771f899cfbfc1a3a58f1835eb2671d
Author: José Fonseca <jfonseca@vmware.com>
Date:   Mon Jul 2 12:08:01 2012 +0100

    gallium/util: Add ULL suffix to large constants.

    As suggested by Andy Furniss: it looks like some old gcc versions
    require it.

commit 4c66c22727eff92226544c7d43c4eb94de359e10
Author: José Fonseca <jfonseca@vmware.com>
Date:   Fri Jun 29 13:39:07 2012 +0100

    gallium/util: Truly disable INF/NAN tests on MSVC.

    Thanks to Brian for spotting this.

commit 8bce274c7fad578d7eb656d9a1413f5c0844c94e
Author: José Fonseca <jfonseca@vmware.com>
Date:   Fri Jun 29 13:39:07 2012 +0100

    gallium/util: Disable INF/NAN tests on MSVC.

    Somehow they are not recognized as constants.

commit 6868649cff8d7fd2e2579c28d0b74ef6dd4f9716
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu Jul 5 15:05:24 2012 +0200

    gallivm: Cleanup the 2 x 8 float -> 16 ub special path in lp_build_conv.

    No behaviour change intended, like 7b98455fb40c2df84cfd3cdb1eb7650f67c8a751.

    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 5147a0949c4407e8bce9e41d9859314b4a9ccf77
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu Jul 5 14:28:19 2012 +0200

    gallivm: (trivial) fix issues with multiple-of-4 texture fetch

    Some formats can't handle non-multiple of 4 fetches I believe, but
    everything must support length 1 and multiples of 4.
    So avoid going to scalar fetch (which is very costly) just because length
    isn't 4.
    Also extend the hack to not use shift with variable count for yuv formats to
    arbitrary length (larger than 1) - doesn't matter how many elements we
    have we always want to avoid it unless we have variable shift count
    instruction (which we should get with avx2).

    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 87ebcb1bd71fa4c739451ec8ca89a7f29b168c08
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Wed Jul 4 02:09:55 2012 +0200

    gallivm: (trivial) fix typo for wrap repeat mode in linear filtering aos code

    This would lead to bogus coordinates at the edges.
    (undetected by piglit because this path is only taken for block-based
    formats).

    Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit 3a42717101b1619874c8932a580c0b9e6896b557
Author: José Fonseca <jfonseca@vmware.com>
Date:   Tue Jul 3 19:42:49 2012 +0100

    gallivm: Fix TGSI integer translation with AVX.

commit d71ff104085c196b16426081098fb0bde128ce4f
Author: José Fonseca <jfonseca@vmware.com>
Date:   Fri Jun 29 15:17:41 2012 +0100

    llvmpipe: Fix LLVM JIT linear path.

    It was not working properly because it was looking at the JIT function
    before it was actually compiled.

    Reviewed-by: Roland Scheidegger <sroland@vmware.com>

commit a94df0386213e1f5f9a6ed470c535f9688ec0a1b
Author: José Fonseca <jfonseca@vmware.com>
Date:   Thu Jun 28 18:07:10 2012 +0100

    gallivm: Refactor lp_build_broadcast(_scalar) to share code.

    Doesn't really change the generated assembly, but produces more compact IR,
    and of course, makes code more consistent.

    Reviewed-by: Brian Paul <brianp@vmware.com>

commit 66712ba2731fc029fa246d4fc477d61ab785edb5
Author: José Fonseca <jfonseca@vmware.com>
Date:   Wed Jun 27 17:30:13 2012 +0100

    gallivm: Make LLVMContextRef a singleton.

    There are many places inside LLVM that depend on it.  Too many to attempt
    to fix.

    Reviewed-by: Brian Paul <brianp@vmware.com>

commit ff5fb7897495ac263f0b069370fab701b70dccef
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu Jun 28 18:15:27 2012 +0200

    gallivm: don't use 8-wide texture fetch in aos path

    This appears to be a slight loss usually.
    There are probably several reasons for that:
    - fetching itself is scalar
    - filtering is pure int code hence needs splitting anyway, same
      for the final texel offset calculations
    - texture wrap related code, which can be done 8-wide, is slightly more
      complex with floats (with clamp_to_edge) and float operations generally
      more costly hence probably not much faster overall
    - the code needed to split when encountering different mip levels for the
      quads, adding complexity
    So, just split always for aos path (but leave it 8-wide for soa, since we
    do 8-wide filtering there when possible).
    This should certainly be revisited if we'd have avx2 support.

    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit ce8032b43dcd8e8d816cbab6428f54b0798f945d
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Wed Jun 27 18:41:19 2012 +0200

    gallivm: (trivial) don't extract fparts variable if not needed

    Did not have any consequences but unnecessary.

commit aaa9aaed8f80dc282492f62aa583a7ee23a4c6d5
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Wed Jun 27 18:09:06 2012 +0200

    gallivm: fix precision issue in aos linear int wrap code

    now not just passes at a quick glance but also with piglit...
    If we do the wrapping with floats, we also need to set the
    weights accordingly. We can potentially end up with different
    (integer) coordinates than what the integer calculations would
    have chosen, which means the integer weights calculated previously
    in this case are completely wrong. Well at least that's what I think
    happens, at least recalculating the weights helps.
    (Some day really should refactor all the wrapping, so we do whatever is
    fastest independent of 16bit int aos or 32bit float soa filtering.)

    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit fd6f18588ced7ac8e081892f3bab2916623ad7a2
Author: José Fonseca <jfonseca@vmware.com>
Date:   Wed Jun 27 11:15:53 2012 +0100

    gallium/util: Fix parsing of options with underscore.

    For example

      GALLIVM_DEBUG=no_brilinear

    which was being parsed as two options, "no" and "brilinear".
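
The difference can be sketched with a tokenizer that either does or does not treat the underscore as part of an option name. This is an illustration, not the gallium/util parser: `has_option` and its `allow_underscore` switch are hypothetical.

```python
import re


def has_option(value, name, allow_underscore=True):
    """Check whether `name` appears as a whole option in `value`."""
    # Without '_' in the character class, "no_brilinear" splits in two.
    pattern = r"[A-Za-z0-9_]+" if allow_underscore else r"[A-Za-z0-9]+"
    return name in re.findall(pattern, value)
```

With `allow_underscore=False` the string "no_brilinear" matches "no" and "brilinear" but not "no_brilinear", which reproduces the misparse described above.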

commit 09a8f809088178a03e49e409fa18f1ac89561837
Author: James Benton <jbenton@vmware.com>
Date:   Tue Jun 26 15:00:14 2012 +0100

    gallivm: Added a generic lp_build_print_value which prints a LLVMValueRef.

    Updated lp_build_printf to share common code.
    Removed specific lp_build_print_vecX.

    Reviewed-by: José Fonseca <jfonseca@vmware.com>
    Reviewed-by: Brian Paul <brianp@vmware.com>

commit e59bdcc2c075931bfba2a84967a5ecd1dedd6eb0
Author: José Fonseca <jfonseca@vmware.com>
Date:   Wed May 16 15:00:23 2012 +0100

    draw,llvmpipe: Avoid named struct types on LLVM 3.0 and later.

    Starting with LLVM 3.0, named structures are meant not for debugging, but
    for recursive data types, previously also known as opaque types.

    The recursive nature of these types leads to several memory management
    difficulties.  Given that we don't actually need recursive types, avoid
    them altogether.

    This is an attempt to address fdo bugs 41791 and 44466. The issue is
    somewhat random so there's no easy way to check how effective this is.

    Cherry-picked from 9af1ba565d

commit df6070f618a203c7a876d984c847cde4cbc26bdb
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Wed Jun 27 14:42:53 2012 +0200

    gallivm: (trivial) fix typo in faster aos linear int wrap code

    no longer crashes, now REALLY tested.

commit d8f98dce452c867214e6782e86dc08562643c862
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Jun 26 18:20:58 2012 +0200

    llvmpipe: (trivial) remove bogus optimization for float aos repeat wrap

    This optimization for nearest filtering on the linear path generated
    likely bogus results, and the int path didn't have any optimizations
    there since the only shader using force_nearest apparently uses
    clamp_to_edge not repeat wrap anyway.

    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit c4e271a0631087c795e756a5bb6b046043b5099d
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Jun 26 23:01:52 2012 +0200

    gallivm: faster repeat wrap for linear aos path too

    Even if we already have scaled integer coords, it's way faster to use
    the original float coord (plus some conversions) rather than use URem.
    The choice of what to do for texture wrapping is not really tied to int
    aos or float soa filtering though for some modes there can be some gains
    (because of easier weight calculations).

    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 1174a75b1806e92aee4264ffe0ffe7e70abbbfa3
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Jun 26 14:39:22 2012 +0200

    gallivm: improve npot tex wrap repeat in linear soa path

    URem gets translated into series of scalar divisions so
    just about anything else is faster.

    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit f849ffaa499ed96fa0efd3594fce255c7f22891b
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Jun 26 00:40:35 2012 +0100

    gallivm: (trivial) fix near-invisible shift-space typo

    I blame the keyboard.

commit 5298a0b19fe672aebeb70964c0797d5921b51cf0
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Jun 25 16:24:28 2012 +0200

    gallivm: add new intrinsic helper to deal with arbitrary vector length

    This helper will split vectors which are too large for the hw, or expand
    them if they are too small, so a caller of a function using intrinsics which
    uses such sizes need not split (or expand) the vectors manually and the
    function will still use the intrinsic instead of dropping back to generic
    llvm code. It can also accept scalars for use with pseudo-vector intrinsics
    (only useful for float arguments, all x86 scalar simd float intrinsics use
    4vf32).
    Only used for lp_build_min/max() for now (also added the scalar float case
    for these while there). (Other basic binary functions could use it easily,
    whereas functions with a different interface would need different helpers.)
    Expanding vectors isn't widely used, because we always try to use
    build contexts with native hw vector sizes. But it might (or not) be nicer
    if this wouldn't need to be done, the generated code should in theory stay
    the same (it does get hit by lp_build_rho though already since we
    didn't have an intrinsic for the scalar lp_build_max case before).
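
The split/expand behaviour can be sketched on plain lists. This is a conceptual model only, not the gallivm helper: `call_with_native_width` and `min4` are hypothetical, and zero is an arbitrary choice of padding value.

```python
def call_with_native_width(op, width, a, b):
    """Apply a fixed-width binary op to operands of arbitrary length.

    `op` takes two lists of exactly `width` elements, mimicking a SIMD
    intrinsic. Scalars are accepted and treated as length-1 vectors.
    """
    scalar = not isinstance(a, list)
    if scalar:
        a, b = [a], [b]
    out = []
    for i in range(0, len(a), width):
        ca, cb = a[i:i + width], b[i:i + width]
        pad = width - len(ca)
        # Expand a short chunk with padding lanes, then drop them afterwards.
        res = op(ca + [0.0] * pad, cb + [0.0] * pad)
        out.extend(res[:width - pad])
    return out[0] if scalar else out


def min4(x, y):
    assert len(x) == 4 and len(y) == 4  # the "intrinsic" only handles 4 lanes
    return [min(p, q) for p, q in zip(x, y)]
```

An 8-element input is processed as two 4-wide calls, while a scalar or 2-element input is padded to 4 lanes and the extra lanes are discarded.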

    v2: incorporated Brian's feedback, and also made the scalar min/max case work
        instead of crash (all scalar simd float intrinsics take 4vf32 as argument,
        probably the reason why it wasn't used before).
        Moved to lp_bld_intr based on José's request, and passing intrinsic size
        instead of length.
        Ideally we'd derive the source type info from the passed in llvm value refs
        and process some llvmtype return type so we could handle intrinsics where
        the source and destination type isn't the same (like float/int conversions,
        packing instructions) but that's a bit too complicated for now.

    Reviewed-by: Brian Paul <brianp@vmware.com>
    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 01aa760b99ec0b2dc8ce57a43650e83f8c1becdf
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Jun 25 16:19:18 2012 +0200

    gallivm: (trivial) increase max code size for shader disassembly

    64kB was just short of what I needed (which caused a crash) hence
    increase to 96kB (should probably be smarter about that).

commit 74aa739138d981311ce13076388382b5e89c6562
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Jun 25 11:53:29 2012 +0100

    gallivm: simplify aos float tex wrap repeat nearest

    just handle pot and npot the same. The previous pot handling
    ended up with exactly the same instructions plus 2 more (leave it
    in the soa path though since it is probably still cheaper there).
    While here also fix an issue which would cause a crash after an assert.

commit 0e1e755645e9e49cfaa2025191e3245ccd723564
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Jun 25 11:29:24 2012 +0100

    gallivm: (trivial) skip floor rounding in ifloor when not signed

    This was only done for the non-sse41 case before, but even with
    sse41 this is obviously unnecessary (some callers already call
    itrunc in this case anyway but some might not).

commit 7f01a62f27dcb1d52597b24825931e88bae76f33
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Jun 25 11:23:12 2012 +0100

    gallivm: (trivial) fix bogus comments

commit 5c85be25fd82e28490274c468ce7f3e6e8c1d416
Author: José Fonseca <jfonseca@vmware.com>
Date:   Wed Jun 20 11:51:57 2012 +0100

    translate: Free elt8_func/elt16_func too.

    These were leaking.

    Reviewed-by: Brian Paul <brianp@vmware.com>
    Reviewed-by: Roland Scheidegger <sroland@vmware.com>

commit 0ad498f36fb6f7458c7cffa73b6598adceee0a6c
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Jun 19 15:55:34 2012 +0200

    gallivm: fix bug for tex wrap repeat with linear sampling in aos float path

    The comparison needs to be against length not length_minus_one, otherwise
    the max texel is never chosen (for the second coordinate).
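
The off-by-one can be sketched as follows. The exact comparison used in the generated code is not shown in the commit, so both functions are hypothetical reconstructions of the idea rather than the gallivm logic.

```python
def second_texel(coord0, length):
    """Second sample coordinate for linear filtering with repeat wrap."""
    coord1 = coord0 + 1
    return 0 if coord1 >= length else coord1  # wrap only past the last texel


def second_texel_buggy(coord0, length):
    """Same, but comparing against length - 1 as in the bug described."""
    coord1 = coord0 + 1
    return 0 if coord1 >= length - 1 else coord1
```

With `length == 5`, the buggy variant never returns 4, so the last texel is never sampled as the second coordinate.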

    Fixes piglit texwrap-1D-npot-proj (and 2D/3D versions).

    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit d1ad65937c5b76407dc2499b7b774ab59341209e
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Jun 19 16:13:43 2012 +0200

    gallivm: simplify soa tex wrap repeat with npot textures and no mip filtering

    Similar to what is already done in aos sampling for the float path (but not
    the int path since we don't get normalized float coordinates there).
    URem is expensive and the calculation is done trivially with
    normalized floats instead (at least with sse41-capable cpus).
    (Some day should probably do the same for the mip filter path but it's much
    more complicated there hence the gain is smaller.)
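
A sketch of the two ways to compute a nearest-filter repeat wrap for a non-power-of-two size. Function names are hypothetical, and the clamp in the float path is an assumption added to guard against rounding, not something the commit describes.

```python
import math


def wrap_repeat_int(s, size):
    # Integer path: scale first, then take a remainder per coordinate.
    return math.floor(s * size) % size


def wrap_repeat_float(s, size):
    # Float path: take the fractional part of the normalized coordinate,
    # then scale. The clamp guards against frac rounding up to 1.0.
    frac = s - math.floor(s)
    return min(int(frac * size), size - 1)
```

Both paths select the same texel for exactly representable coordinates; the float path avoids the per-element remainder that the commit identifies as expensive.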

    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit e1e23f57ba9b910295c306d148f15643acc3fc83
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Jun 18 20:38:56 2012 +0200

    llvmpipe: (trivial) remove duplicated function declaration

    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 07ca57eb09e04c48a157733255427ef5de620861
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Jun 18 20:37:34 2012 +0200

    llvmpipe: destroy setup variants on context destruction

    lp_delete_setup_variants() used to be called in garbage collection,
    but this no longer exists hence the setup shaders never got freed.

    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit ed0003c633859a45f9963a479f4c15ae0ef1dca3
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Jun 18 16:25:29 2012 +0100

    gallivm: handle different ilod parts for multiple quad sampling

    This fixes filtering when the integer part of the lod is not the same
    for all quads. I'm not fully convinced of that solution yet as it just
    splits the vector if the levels to be sampled from are different.
    But otherwise we'd need to do things like some minify steps, and getting
    mip level base address separately anyway hence it wouldn't really look
    like much of a win (and making the code even more complex).
    This should now give identical results to single quad sampling.

commit 8580ac4cfc43a64df55e84ac71ce1a774d33c0d2
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu Jun 14 18:14:47 2012 +0200

    gallivm: de-duplicate sample code common to soa and aos sampling

    There doesn't seem to be any reason why this code dealing with cube face
    selection, lod and mip level calculation is separate in aos and
    soa sampling, and I am sick of having to change it in both places.

commit fb541e5f957408ce305b272100196f1e12e5b1e8
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu Jun 14 18:15:41 2012 +0200

    gallivm: do mip filtering with per quad lod_fpart

    This gives better results for mip filtering, though the generated code might
    not be optimal. For now it also creates some artifacts if the lod_ipart isn't
    the same for all quads, since instead of using the same mip weight for all
    quads as previously (which just caused non-smooth gradients) this now will
    use the right weights but with the wrong mip level in this case (can easily
    be seen with things like texfilt, mipmap_tunnel).
    v2: use logic helper suggested by José, and fix issue with negative lod_fpart
        values

commit f1cc84eef7d826a20fab6cd8ccef9a275ff78967
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Wed Jun 13 18:35:25 2012 +0200

    gallivm: (trivial) fix bogus assert in lp_build_unpack_broadcast_aos_scalars

commit 7c17dbae8ae290df9ce0f50781a09e8ed640c044
Author: James Benton <jbenton@vmware.com>
Date:   Tue Jun 12 12:11:14 2012 +0100

    util: Reimplement half <-> float conversions.

    Removed u_half.py used to generate the table for previous method.

    The previous implementation of float to half conversion was faulty for
    denormals and NaNs and would require extra logic to fix,
    thus making the speedup of using tables irrelevant.
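
Editor's sketch, not the code from this commit: a table-free half to float
conversion that treats denormals, infinities and NaNs explicitly might look
roughly like this. The function name half_to_float is used for illustration
only, and IEEE 754 binary16/binary32 layouts are assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative only: convert an IEEE 754 binary16 bit pattern to float. */
static float half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000) << 16;
    uint32_t exp = (h >> 10) & 0x1f;
    uint32_t man = h & 0x3ff;
    uint32_t bits;
    float f;

    assert(sizeof(float) == sizeof(uint32_t));

    if (exp == 0x1f) {
        /* Inf (man == 0) or NaN (man != 0): keep the mantissa bits */
        bits = sign | 0x7f800000u | (man << 13);
    } else if (exp == 0) {
        if (man == 0) {
            bits = sign;                    /* signed zero */
        } else {
            /* denormal: shift until the implicit leading one appears */
            int e = -1;
            do {
                e++;
                man <<= 1;
            } while (!(man & 0x400));
            man &= 0x3ff;
            bits = sign | ((uint32_t)(112 - e) << 23) | (man << 13);
        }
    } else {
        /* normal number: rebias the exponent from 15 to 127 */
        bits = sign | ((exp + 112) << 23) | (man << 13);
    }

    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The extra branches for the special cases are the kind of "extra logic" a pure
table lookup would also have needed.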

commit 7762f59274070e1dd4b546f5cb431c2eb71ae5c3
Author: James Benton <jbenton@vmware.com>
Date:   Tue Jun 12 12:12:16 2012 +0100

    tests: Updated tests to properly handle NaN for half floats.

commit fa94c135aea5911fd93d5dfb6e6f157fb40dce5e
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Jun 11 18:33:10 2012 +0200

    gallivm: do mip level calculations per quad

    This is the final piece which shouldn't change the rendering output yet.

    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 23cbeaddfe03c09ca18c45d28955515317ffcf4c
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Sat Jun 9 00:54:21 2012 +0200

    gallivm: do per-quad cube face selection

    Doesn't quite fix the piglit cubemap test (not sure why actually)
    but doing per-quad face selection is doing the right thing and
    definitely an improvement.

    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit abfb372b3702ac97ac8b5aa80ad1b94a2cc39d33
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Jun 11 18:22:59 2012 +0200

    gallivm: do all lod calculations per quad

    Still no functional change but lod is now converted to scalar after
    lod calculations.

    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 519368632747ae03feb5bca9c655eccbc5b751b4
Author: James Benton <jbenton@vmware.com>
Date:   Tue May 22 16:46:10 2012 +0100

    gallivm: Added support for half-float to float conversion in lp_build_conv.

    Updated various utility functions to support this change.

commit 135b4d683a4c95f7577ba27b9bffa4a6fbd2c2e7
Author: James Benton <jbenton@vmware.com>
Date:   Tue May 22 16:02:46 2012 +0100

    gallivm: Added function for half-float to float conversion.

    Updated lp_build_format_aos_array to support half-float source.

commit 37d648827406a20c5007abeb177698723ed86673
Author: James Benton <jbenton@vmware.com>
Date:   Tue May 22 14:55:18 2012 +0100

    util: Updated u_format_tests to rigidly test half-float boundary values.

commit 2ad18165d96e578aa9046df7c93cb1c3284d8c6b
Author: James Benton <jbenton@vmware.com>
Date:   Tue May 22 14:54:16 2012 +0100

    llvmpipe: Updated lp_test_format to properly handle Inf/NaN results.

commit 78740acf25aeba8a7d146493dd5c966e22c27b73
Author: James Benton <jbenton@vmware.com>
Date:   Tue May 22 14:53:30 2012 +0100

    util: Added functions for checking NaN / Inf for double and half-floats.

commit 35e9f640ae01241f9e0d67fe893bbbf564c05809
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu May 24 21:05:13 2012 +0200

    gallivm: Fix calculating rho for 3d textures for the single-quad case

    Discovered by accident, this looks like a very old typo bug.

commit fc1220c636326536fd0541913154e62afa7cd1d8
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu May 24 21:04:59 2012 +0200

    gallivm: do calcs per-quad in lp_build_rho

    Still convert to scalar at the end of the function.

commit 50a887ffc550bf310a6988fa2cea5c24d38c1a41
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon May 21 23:21:50 2012 +0200

    gallivm: (trivial) return scalar in lp_build_extract_range for length 1 vectors

    Our type system on top of llvm's one doesn't generally support vectors of
    length 1, instead using scalars. So we should return a scalar from this
    function instead of having to bitcast the vector with length 1 later elsewhere.

commit 80c71c621f9391f0f9230460198d861643324876
Author: James Benton <jbenton@vmware.com>
Date:   Tue May 22 17:49:15 2012 +0100

    draw: Fixed bad merge error

commit c47401cfad0c9167de20ff560654f533579f452c
Author: James Benton <jbenton@vmware.com>
Date:   Tue May 22 15:29:30 2012 +0100

    draw: Updated store_clip to store whole vectors instead of individual elements.

commit 2d9c1ad74b0b0b41861fffcecde39f09cc27f1cf
Author: James Benton <jbenton@vmware.com>
Date:   Tue May 22 15:28:32 2012 +0100

    gallivm: Added lp_build_fetch_rgba_aos_array.

    A version of lp_build_fetch_rgba_aos which is targeted at simple array formats.

    Reads the whole vector from memory in one, instead of reading each element
    individually.

    Tested with mesa tests and demos.

commit ff7805dc2b6ef6d8b11ec4e54aab1633aef29ac8
Author: James Benton <jbenton@vmware.com>
Date:   Tue May 22 15:27:40 2012 +0100

    gallivm: Added lp_build_pad_vector.

    This function pads a vector with undef to a desired length.

commit 701f50acef24a2791dabf4730e5b5687d6eb875d
Author: James Benton <jbenton@vmware.com>
Date:   Fri May 18 17:27:19 2012 +0100

    util: Added util_format_is_array.

    This function checks whether a format description is in a simple array format.

commit 5e0a7fa543dcd009de26f34a7926674190fa6246
Author: James Benton <jbenton@vmware.com>
Date:   Fri May 18 19:13:47 2012 +0100

    draw: Removed draw_llvm_translate_from and draw/draw_llvm_translate.c.

    This is "replaced" by adding an optimised path in lp_build_fetch_rgba_aos
    in an upcoming patch.

commit 8c886d6a7dd3fb464ecf031de6f747cb33e5361d
Author: James Benton <jbenton@vmware.com>
Date:   Wed May 16 15:02:31 2012 +0100

    draw: Modified store_aos to write the vector as one, not individual elements.

commit 37337f3d657e21dfd662c7b26d61cb0f8cfa6f17
Author: James Benton <jbenton@vmware.com>
Date:   Wed May 16 14:16:23 2012 +0100

    draw: Changed aos_to_soa to use lp_build_transpose_aos.

commit bd2b69ce5d5c94b067944d1dcd5df9f8e84548f1
Author: James Benton <jbenton@vmware.com>
Date:   Fri May 18 19:14:27 2012 +0100

    draw: Changed soa_to_aos to use lp_build_transpose_aos.

commit 0b98a950d29a116e82ce31dfe7b82cdadb632f2b
Author: James Benton <jbenton@vmware.com>
Date:   Fri May 18 18:57:45 2012 +0100

    gallivm: Added lp_build_transpose_aos which converts between aos and soa.

commit 69ea84531ad46fd145eb619ed1cedbe97dde7cb5
Author: James Benton <jbenton@vmware.com>
Date:   Fri May 18 18:57:01 2012 +0100

    gallivm: Added lp_build_interleave2_half aimed at AVX unpack instructions.

commit 7a4cb1349dd35c18144ad5934525cfb9436792f9
Author: José Fonseca <jfonseca@vmware.com>
Date:   Tue May 22 11:54:14 2012 +0100

    gallivm: Fix build on Windows.

    MC-JIT not yet supported there.

    Reviewed-by: Roland Scheidegger <sroland@vmware.com>

commit afd105fc16bb75d874e418046b80d9cc578818a1
Author: James Benton <jbenton@vmware.com>
Date:   Fri May 18 16:17:26 2012 +0100

    llvmpipe: Added an error counter to lp_test_conv.

    Useful for keeping track of progress when fixing errors!

    Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit b644907d08c10a805657841330fc23db3963d59c
Author: James Benton <jbenton@vmware.com>
Date:   Fri May 18 16:16:46 2012 +0100

    llvmpipe: Changed known failures in lp_test_conv.

    To comply with the recent fixes to lp_bld_conv.

    Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit d7061507bd94f6468581e218e61261b79c760d4f
Author: James Benton <jbenton@vmware.com>
Date:   Fri May 18 16:14:38 2012 +0100

    llvmpipe: Added fixed point types tests to lp_test_conv.

    Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit 146b3ea39b4726dbe125ac666bd8902ea3d6ca8c
Author: James Benton <jbenton@vmware.com>
Date:   Fri May 18 16:26:35 2012 +0100

    llvmpipe: Changed lp_test_conv src/dst alignment to be correct.

    Now based on the define rather than a fixed number.

    Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit f3b57441f834833a4b142a951eb98df0aa874536
Author: James Benton <jbenton@vmware.com>
Date:   Fri May 18 16:06:44 2012 +0100

    gallivm: Fixed erroneous optimisation in lp_build_min/max.

    Previously assumed normalised was 0 to 1, but it can be -1 to 1
    if type is signed.
    Tested with lp_test_conv and lp_test_format, reduced errors.

    Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit a0613382e5a215cd146bb277646a6b394d376ae4
Author: James Benton <jbenton@vmware.com>
Date:   Fri May 18 16:04:49 2012 +0100

    gallivm: Compensate for lp_const_offset in lp_build_conv.

    Fixing a /*FIXME*/ to remove errors in integer conversion in lp_build_conv.
    Tested using lp_test_conv and lp_test_format, reduced errors.

    Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit a3d2bf15ea345bc8a0664f8f441276fd566566f3
Author: James Benton <jbenton@vmware.com>
Date:   Fri May 18 16:01:25 2012 +0100

    gallivm: Fixed overflow in lp_build_clamped_float_to_unsigned_norm.

    Tested with lp_test_conv and lp_test_format, reduced errors.

    Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit e7b1e76fe237613731fa6003b5e1601a2e506207
Author: José Fonseca <jfonseca@vmware.com>
Date:   Mon May 21 20:07:51 2012 +0100

    gallivm: Fix build with LLVM 2.6

    Trivial, and useful.

commit d3c6bbe5c7f5ba1976710831281ab1b6a631082d
Author: José Fonseca <jfonseca@vmware.com>
Date:   Tue May 15 17:15:59 2012 +0100

    gallivm: Enable MCJIT/AVX with vanilla LLVM 3.1.

    Add the necessary C++ glue, so that we don't need any modifications
    to the soon to be released LLVM 3.1.

    Reviewed-by: Roland Scheidegger <sroland@vmware.com>

commit 724a019a14d40fdbed21759a204a2bec8a315636
Author: José Fonseca <jfonseca@vmware.com>
Date:   Mon May 14 22:04:06 2012 +0100

    gallivm: Use HAVE_LLVM 0x0301 consistently.

commit af6991e2a3868e40ad599b46278551b794839748
Author: José Fonseca <jfonseca@vmware.com>
Date:   Mon May 14 21:49:06 2012 +0100

    gallivm: Add MCRegisterInfo.h to silence benign warnings about missing implementation.

    Trivial.

commit 6f8a1d75458daae2503a86c6b030ecc4bb494e23
Author: Vinson Lee <vlee@freedesktop.org>
Date:   Mon Apr 2 22:14:15 2012 -0700

    gallivm: Pass in a MCInstrInfo to createMCInstPrinter on llvm-3.1.

    llvm-3.1svn r153860 makes MCInstrInfo available to the MCInstPrinter.

    Signed-off-by: Vinson Lee <vlee@freedesktop.org>
    Reviewed-by: Brian Paul <brianp@vmware.com>

commit 62555b6ed8760545794f83064e27cddcb3ce5284
Author: Vinson Lee <vlee@freedesktop.org>
Date:   Tue Mar 27 21:51:17 2012 -0700

    gallivm: Fix method overriding in raw_debug_ostream.

    Use matching type qualifiers to avoid method hiding.

    Signed-off-by: Vinson Lee <vlee@freedesktop.org>
    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit 6a9bd784f4ac68ad0a731dcd39e5a3c39989f2be
Author: Vinson Lee <vlee@freedesktop.org>
Date:   Tue Mar 13 22:40:52 2012 -0700

    gallivm: Fix createOProfileJITEventListener namespace with llvm-3.1.

    llvm-3.1svn r152620 refactored the OProfile profiling code.
    createOProfileJITEventListener was moved from the llvm namespace to the
    llvm::JITEventListener namespace.

    Signed-off-by: Vinson Lee <vlee@freedesktop.org>
    Reviewed-by: Brian Paul <brianp@vmware.com>

commit b674955d39adae272a779be85aa1bd665de24e3e
Author: Vinson Lee <vlee@freedesktop.org>
Date:   Mon Mar 5 22:00:40 2012 -0800

    gallivm: Pass in a MCRegisterInfo to MCInstPrinter on llvm-3.1.

    llvm-3.1svn r152043 changes createMCInstPrinter to take an additional
    MCRegisterInfo argument.

    Signed-off-by: Vinson Lee <vlee@freedesktop.org>
    Reviewed-by: Brian Paul <brianp@vmware.com>

commit 11ab69971a8a31c62f6de74905dbf8c02884599f
Author: Vinson Lee <vlee@freedesktop.org>
Date:   Wed Feb 29 21:20:53 2012 -0800

    Revert "gallivm: Change getExtent and readByte to non-const with llvm-3.1."

    This reverts commit d5a6c17254.

    llvm-3.1svn r151687 makes MemoryObject accessor members const again.

    Signed-off-by: Vinson Lee <vlee@freedesktop.org>
    Reviewed-by: Brian Paul <brianp@vmware.com>

commit 339960c82d2a9f5c928ee9035ed31dadb7f45537
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon May 14 16:19:56 2012 +0200

    gallivm: (trivial) fix assertion failure for mipmapped 1d textures

    In lp_build_rho, we may end up with a 1-element vector (for mipmapped 1d
    textures), but in this case we require the type to be a non-vector type,
    so need a cast.

commit 9d73edb727bd6d196030dc3026b7bf0c574b3e19
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu May 10 18:12:07 2012 +0200

    gallivm: prepare for per-quad lod calculations for large vectors

    To be able to handle multiple quads at once in texture sampling and still
    do lod calculations per quad, it is necessary to get the per-quad derivatives
    into the lp_build_rho function.
    Until now these derivative values were just scalars, which isn't going to work.
    So we now use vectors, and since the interface needs to change we also do some
    different (slightly more efficient) packing of the values.
    For 8-wide vectors the packed derivative values for 3 coords would look like
    this (the layout scales to an arbitrary multiple-of-4 vector size):
    ds1dx ds1dy dt1dx dt1dy ds2dx ds2dy dt2dx dt2dy
    dr1dx dr1dy _____ _____ dr2dx dr2dy _____ _____
    The second vector will be unused for 1d and 2d textures.
    To facilitate future changes the derivative values are put into a struct, since
    quite some functions just pass these values through.
    The generated code seems to be very slightly better for 2d textures (with
    4-wide vectors) than before with sse2 (if you have a cpu with physical 128bit
    simd units - otherwise it's probably not a win).
    v2: suggestions from José, rename variables, add comments, use swizzle helper
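
Editor's sketch, not from the commit: the packed derivative layout described
above can be pictured with plain arrays. The struct quad_derivs and the
function pack_derivs are hypothetical names; the real code builds LLVM vectors
and leaves the unused lanes undefined rather than zeroed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-quad derivative record; names are illustrative only. */
struct quad_derivs {
    float dsdx, dsdy, dtdx, dtdy, drdx, drdy;
};

/*
 * Pack derivatives for num_quads quads into two arrays of 4 * num_quads
 * floats each:
 *   st: ds/dx ds/dy dt/dx dt/dy   (per quad)
 *   r:  dr/dx dr/dy  ----  ----   (per quad, last two lanes unused)
 */
static void pack_derivs(const struct quad_derivs *q, size_t num_quads,
                        float *st, float *r)
{
    size_t i;

    assert(q && st && r);

    for (i = 0; i < num_quads; i++) {
        st[4 * i + 0] = q[i].dsdx;
        st[4 * i + 1] = q[i].dsdy;
        st[4 * i + 2] = q[i].dtdx;
        st[4 * i + 3] = q[i].dtdy;

        r[4 * i + 0] = q[i].drdx;
        r[4 * i + 1] = q[i].drdy;
        r[4 * i + 2] = 0.0f;    /* hole: zeroed here for determinism */
        r[4 * i + 3] = 0.0f;    /* hole */
    }
}
```

With two quads this yields the 8-wide arrangement shown in the message; the
second array would simply go unused for 1d and 2d textures.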

commit 0aa21de0d31466dac77b05c97005722e902517b8
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu May 10 18:10:31 2012 +0200

    gallivm: add undefined swizzle handling to lp_build_swizzle_aos

    This is useful for vectors with "holes", it lets llvm choose the most
    efficient shuffle instructions if some elements aren't needed without having to
    worry what elements to manually pick otherwise.

commit 00faf3f370e7ce92f5ef51002b0ea42ef856e181
Author: José Fonseca <jfonseca@vmware.com>
Date:   Fri May 4 17:25:16 2012 +0100

    gallivm: Get the LLVM IR optimization passes before JIT compilation.

    MC-JIT engine compiles the module immediately on creation, so the optimization
    passes were being run too late.

    So now we create a target data layout from a string, that matches the
    ABI parameters reported by the compiler.

    The backend optimization passes were always being run, so the performance
    improvement is modest (3% on multiarb mesa demo).

    Reviewed-by: Roland Scheidegger <sroland@vmware.com>
    Reviewed-by: Brian Paul <brianp@vmware.com>

commit 40a43f4e2ce3074b5ce9027179d657ebba68800a
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Wed May 2 16:03:54 2012 +0200

    gallivm: (trivial) fix wrong define used in lp_build_pack2

    should fix stack-smashing crashes.

commit e6371d0f4dffad4eb3b7a9d906c23f1c88a2ab9e
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Apr 30 21:25:29 2012 +0200

    gallivm: add perf warnings when not using intrinsics with 256bit vectors

    Helper functions using integer sse2 intrinsics could split the vectors with AVX
    instead of using generic fallback (which should be faster).
    We don't actually expect to hit these paths (hence don't fix them up to actually
    do the vector splitting) so just emit warnings (for those functions where it's
    obvious doing split/intrinsic is faster than using generic path).
    Only emit warnings for 256bit vectors since we _really_ don't expect to hit
    arbitrarily large vectors, which would affect a lot more functions.
    The warnings do not actually depend on avx since the same logic applies to
    plain sse2 too (but of course again there's _really_ no reason we should hit
    these functions with 256bit vectors without avx).

commit 8a9ea701ea7295181e846c6383bf66a5f5e47637
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue May 1 20:37:07 2012 +0200

    gallivm: split vectors manually for avx in lp_build_pack2 (v2)

    There are 2 reasons for this:
    First, there's a llvm bug (fixed in 3.1) which generates tons of byte
    inserts/extracts otherwise, and second, more importantly, we want to use
    pack intrinsics instead of shuffles.
    We do this in lp_build_pack2 and not the calling code (aos sample path)
    because potentially other callers might find that useful too, even if
    for larger sequences of code using non-native vector sizes it might be
    better to manually split vectors.
    This should boost texture performance in the aos path considerably.
    v2: fix issues with intrinsics types with old llvm

commit 27ac5b48fa1f2ea3efeb5248e2ce32264aba466e
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue May 1 20:26:22 2012 +0200

    llvmpipe: refactor lp_build_pack2 (v2)

    prettify, and it's unnecessary to assert when there's no intrinsic due to
    unsupported bit width - the shuffle path will work regardless.
    In contrast, lp_build_packs2 should only rely on lp_build_pack2 doing the
    clamping for element sizes for which there is a sse2 intrinsic.
    v2: fix bug spotted by Jose regarding the intrinsic type for packusdw
    on old llvm versions.

commit ddf279031f0111de4b18eaf783bdc0a1e47813c8
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue May 1 20:13:59 2012 +0200

    gallivm: add src width check in lp_build_packs2()

    Not doing so would skip clamping even if no sse2 pack instruction is
    available, which is incorrect (in theory only, since such widths would also
    always hit an (unnecessary) assertion in lp_build_pack2()).

commit e7f0ad7fe079975eae7712a6e0c54be4fae0114b
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Fri Apr 27 15:57:00 2012 +0200

    gallivm: (trivial) fix crash-causing typo for npot textures with avx

commit 28a9d7f6f655b6ec508c8a3aa6ffefc1e79793a0
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Wed Apr 25 19:38:45 2012 +0200

    gallivm: (trivial) remove code mistakenly added twice.

commit d5926537316f8ff67ad0a52e7242f7c5478d919b
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Apr 24 21:16:15 2012 +0200

    gallivm: add a new avx aos sample path (v2)

    Try to avoid mixing float and int address calculations. This does texture wrap
    modes with floats, and then the offset calculations still with ints (because
    of lack of precision with floats, though we could do some effort to make it work
    with not too large (16MB) textures).
    This also handles wrap repeat mode with npot-sized textures differently than
    either the old soa or aos int path (likely way faster but untested).
    Otherwise the actual address wrap code is largely similar to the soa path (not
    quite the same as this one also has some int code), it should get used by avx
    soa sampling later as well but doesn't handle more complex address modes yet
    (this will also have the benefit that we can use aos sampling path for all
    texture address modes).
    Generated code for that looks reasonable, but still does not split vectors
    explicitly for fetch/filter, which means we still get hit by the llvm bug (fixed
    upstream) which generates hundreds of pinsrb/pextrb instead of two shuffles.
    It is not obvious though if it's much of a win over just doing address calcs
    4-wide but with ints, even if it is definitely far fewer instructions on avx.
    piglit's texwrap seems to look exactly the same but tests neither the
    non-normalized nor the npot cases.
    v2: fix comments, prettify based on Brian's and Jose's feedback.

commit bffecd22dea66fb416ecff8cffd10dd4bdb73fce
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu Apr 19 01:58:29 2012 +0200

    gallivm: refactor aos lp_build_sample_image_nearest/linear

    split them up to separate address calculations and fetching/filtering.
    Need this for being able to do 8-wide float address calcs and 4-wide
    fetch/filter later (for avx). Plus the functions were very big scary monsters
    anyway (in particular lp_build_sample_image_linear).

commit a80b325c57529adddcfa367f96f03557725c4773
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Apr 16 17:17:18 2012 +0200

    gallivm: fix lp_build_resize when truncating width but expanding vector size

    Missed this case which I thought was impossible - the assertion for it was
    right after the division by zero...
    (AoS) texture sampling may ask us to do this, for things like 8 4x32int
    vectors to 1 32x8int vector conversion (eventually, we probably don't want
    this to happen).

commit f9c8337caa3eb185830d18bce8b95676a065b1d7
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Sat Apr 14 18:00:59 2012 +0200

    gallivm: fix cube maps with larger vectors

    This makes the branchless cube face selection code work with larger vectors.
    Because the complexity is quite high (cannot really be improved it seems,
    per-face selection would reduce complexity a lot but this leads to errors
    unless the derivatives are calculated all from the same face which almost
    doubles the work to be done) it is still slower than the branching version,
    hence only enable this with large vectors.
    It doesn't actually do per-quad face selection yet (only makes sense with
    matching lod selection, in fact it will select the same face for all pixels
    based on the average of the first four pixels for now) but only different
    shuffles are required to make it work (the branching version actually should
    work with larger vectors too now thanks to the improved horizontal add but of
    course it cannot be extended to really select the face per-quad unless doing
    branching per quad).

commit 7780c58869fc9a00af4f23209902db7e058e8a66
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Fri Mar 30 21:11:12 2012 +0100

    llvmpipe: (trivial) fix compiler warning

    and also clarify comment regarding availability of popcnt instruction.

commit a266dccf477df6d29a611154e988e8895892277e
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Fri Mar 30 14:21:07 2012 +0100

    gallivm: remove unneeded members in lp_build_sample_context

    Minor cleanup, the texture width, height, depth aren't accessed in their
    scalar form anywhere. Makes it more obvious those values should probably be
    fetched already vectorized (but this requires more invasive changes)...

commit b678c57fb474e14f05e25658c829fc04d2792fff
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu Mar 29 15:53:55 2012 +0100

    gallivm: add a helper for concatenating vectors

    Similar to the extract_range helper intended to get around slow code generated
    by llvm for 128bit insertelements.
    Concatenating two 128bit vectors this way will result in a single vinsertf128
    operation rather than two 64bit stores plus one 128bit load, though it might be
    mildly useful for other purposes as well.

commit 415ff228bcd0cf5e44a4c15350a661f0f5520029
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Wed Mar 28 19:41:15 2012 +0100

    gallivm: add a custom 2x8f->1x16ub avx conversion path

    Similar to the existing 4x4f->1x16ub sse2 path, shaves off a couple
    instructions (min/max mostly) because it relies on pack intrinsics clamping.
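
Editor's sketch, not from the commit: the reason explicit min/max can be
dropped is that the pack instructions saturate. A scalar emulation of that
chain is shown below; pack_ssdw, pack_uswb and float_to_unorm8 are
hypothetical names, and the rounding (half away from zero) is a simplification
of what the hardware conversion does.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates signed saturation of a 32-bit lane to 16 bits (packssdw-like). */
static int16_t pack_ssdw(int32_t v)
{
    if (v > 32767)
        return 32767;
    if (v < -32768)
        return -32768;
    return (int16_t)v;
}

/* Emulates unsigned saturation of a 16-bit lane to 8 bits (packuswb-like). */
static uint8_t pack_uswb(int16_t v)
{
    if (v > 255)
        return 255;
    if (v < 0)
        return 0;
    return (uint8_t)v;
}

/*
 * Convert a float to an 8-bit unsigned normalized value without any
 * explicit clamp of the float: out-of-range inputs are caught by the
 * saturating packs. Assumes f is not NaN and f * 255 fits in an int32_t.
 */
static uint8_t float_to_unorm8(float f)
{
    int32_t i;

    assert(f == f); /* NaN is out of scope for this sketch */

    i = (int32_t)(f * 255.0f + (f >= 0.0f ? 0.5f : -0.5f));
    return pack_uswb(pack_ssdw(i));
}
```

Inputs above 1.0 or below 0.0 end up as 255 or 0 purely through saturation,
which is the property the custom conversion path relies on.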

commit 78c08fc89f8fbcc6dba09779981b1e873e2a0299
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Wed Mar 28 18:44:07 2012 +0100

    gallivm: add avx arithmetic intrinsics

    Add all avx intrinsics for arithmetic functions (with the exception
    of the horizontal add function which needs another look).
    Seems to pass basic tests.

    Reviewed-by: José Fonseca <jfonseca@vmware.com>

commit a586caa2800aa5ce54c173f7c0d4fc48153dbc4e
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Wed Mar 28 15:31:35 2012 +0100

    gallivm: add avx logic intrinsics

    Add the blend intrinsics for 8-wide float and 4-wide double vectors.
    Since we lack 256bit int instructions these are used for int vectors as well,
    though obviously not for byte or word element values.
    The comparison intrinsics aren't extended for avx since these are only used
    for pre-2.7 llvm versions.

commit 70275e4c13c89315fc2560a4c488c0e6935d5caf
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Wed Mar 28 00:40:53 2012 +0100

    gallivm: new helper function for extract shuffles.

    Based on José's idea as we may need that in a couple of places.
    Note that such shuffles should not be used lightly, since data layout
    of <4 x i8> is different to <16 x i8> for instance, hence might cause
    data rearrangement.

commit 4d586dbae1b0c55915dda1759d2faea631c0a1c2
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Mar 27 18:27:25 2012 +0100

    gallivm: (trivial) don't overallocate shuffle variable

    using wrong define meant huge array...

commit 06b0ec1f6d665d98c135f9573ddf4ba04b2121ad
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Mar 27 17:54:20 2012 +0100

    gallivm: don't do per-element extract/insert for vector element resize

    Instead of doing per-element extract/insert if the src vectors
    and dst vector differ in total size (which generates atrocious code)
    first change the src vectors size by using shuffles to destination
    vector size.
    We can still do better than that on AVX for packing to color buffer
    (by exploiting pack intrinsics characteristics hence eliminating the
    need for some clamps) but this already generates much better code.

    v2: incorporate feedback from José, Keith and use shuffle instead of
    bitcasts/extracts. Due to llvm deficiencies the latter cause all data
    to get moved to GPRs and back in pieces (even though the data in the
    regs actually stays the same...).

commit c9970d70e05f95d3f52fe7d2cd794176a52693aa
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Fri Mar 23 19:33:19 2012 +0000

    gallivm: fix bug in simple position interpolation

    Accidental use of position attribute instead of just pixel coordinates.
    Caused failures in piglit glsl-fs-ceil and glsl-fs-floor.

commit d0b6fcdb008d04d7f73d3d725615321544da5a7e
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Fri Mar 23 15:31:14 2012 +0000

    gallivm: fix emission of ceil opcode

    lp_build_ceil seems more appropriate than lp_build_trunc.
    This seems to never be hit though, since someone performs some ceil
    to floor magic.

commit d97fafed7e62ffa6bf76560a92ea246a1a26d256
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu Mar 22 11:46:52 2012 +0000

    gallivm: new vectorized path for cubemap calculations

    should be faster when adapted to multiple quads as only selection masks need to be different.
    The code is more or less a per-pixel version adapted to only do it per quad.
    A per pixel version would be much simpler (could drop 2 selects, 6 broadcasts and the messy
    horizontal add of 3 vectors at the expense of only 2 more absolute value instructions -
    would also just work for arbitrarily large vectors).
    This version doesn't yet work with larger vectors because the horizontal add isn't adjusted
    to be able to work with 2x4 vectors (and also because face selection wouldn't be done per
    quad just per block though that would be only a correctness issue just as with lod selection).
    The downside is this code is quite a bit slower. On a Core2 it can be sped up by disabling the
    hw blend instructions for selection and using logicop fallbacks instead, but it is still slower
    than the old code, hence leave that in for now. Probably will choose one or the other version
    based on vector length in the end.

commit b375fbb18a3fd46859b7fdd42f3e9908ea4ff9a3
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Wed Mar 21 14:42:29 2012 +0000

    gallivm: fix optimized occlusion query intrinsic name

commit a9ba0a3b611e48efbb0e79eb09caa85033dbe9a2
Author: José Fonseca <jfonseca@vmware.com>
Date:   Wed Mar 21 16:19:43 2012 +0000

    draw,gallivm,llvmpipe: Call gallivm_verify_function everywhere.

commit f94c2238d2bc7383e088b8845b7410439a602071
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Mar 20 18:54:10 2012 +0000

    gallivm: optimize calculations for cube maps a bit

    this does some more vectorized calculations and uses horizontal adds if possible.
    A definite win with sse3 otherwise it doesn't seem to make much of a difference.
    In any case this is arithmetically identical, cannot handle larger vectors.
    Should be useful as a reference point against larger vector version later...

commit 21a2c1cf3c8e1ac648ff49e59fdc0e3be77e2ebb
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Mar 20 15:16:27 2012 +0000

    llvmpipe: slight optimization of occlusion queries

    using movmskps when available.
    While this is slightly better for cpus without popcnt we should
    really sum the vectors ourselves (it is also possible to cast to i4 before
    doing the popcnt but that doesn't help that much either since llvm
    is using some optimized popcnt version for i32)
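
Editor's sketch, not from the commit: counting the pixels that pass in a
4-wide mask by first collapsing the lane sign bits into an integer and then
counting set bits can be emulated in scalar C as below. The names movmsk4,
popcount4 and count_live_pixels are hypothetical, and each mask lane is
assumed to be either 0 or all ones.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates movmskps: gather the sign bit of each of 4 lanes into bits 0..3. */
static unsigned movmsk4(const int32_t mask[4])
{
    unsigned bits = 0;
    int i;

    for (i = 0; i < 4; i++)
        bits |= (unsigned)((uint32_t)mask[i] >> 31) << i;
    return bits;
}

/* Portable bit count for a 4-bit value (stands in for popcnt). */
static unsigned popcount4(unsigned bits)
{
    unsigned count = 0;

    assert(bits < 16);

    while (bits) {
        count += bits & 1u;
        bits >>= 1;
    }
    return count;
}

/* Number of lanes whose mask is set. */
static unsigned count_live_pixels(const int32_t mask[4])
{
    return popcount4(movmsk4(mask));
}
```

Collapsing to a 4-bit value first means only one small integer has to be
counted per vector instead of one full 32-bit lane per pixel.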

commit 5ab5a35f216619bcdf55eed52b0db275c4a06c1b
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Mar 20 13:32:11 2012 +0000

    llvmpipe: fix occlusion queries with larger vectors

    need to adjust casts etc.

commit ff95e6fdf5f16d4ef999ffcf05ea6e8c7160b0d5
Author: José Fonseca <jfonseca@vmware.com>
Date:   Mon Mar 19 20:15:25 2012 +0000

    gallivm: Restore optimization passes.

commit 57b05b4b36451e351659e98946dae27be0959832
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Mar 19 19:34:22 2012 +0000

    llvmpipe: use existing min2 macro

commit bc9a20e19b4f600a439f45679451f2e87cd4b299
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Mar 19 19:07:27 2012 +0000

    llvmpipe: add some safeguards against really large vectors

    As per José's suggestion, prevent things from blowing up if some cpu
    would have 1024bit or larger vectors.

commit 0e2b525e5ca1c5bbaa63158bde52ad1c1564a3a9
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Mar 19 18:31:08 2012 +0000

    llvmpipe: fix mask generation for uberwide vectors

    this was the only piece preventing 16-wide vectors from working
    (apart from the LP_MAX_VECTOR_WIDTH define that is), which is the maximum
    as we don't get more pixels in the fragment shader at once.
    Hence adjust that so things could be tested properly with that size
    even though there seems to be no practical value.

commit 3c8334162211c97f3a11c7f64e9e5a2a91ad9656
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Mar 19 18:19:41 2012 +0000

    llvmpipe: fix the simple interpolation method with larger vectors

    so both methods actually _really_ work now. Makes textures look
    nice with larger vectors...

commit 1cb0464ef8871be1778d43b0c56adf9c06843e2d
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Mar 19 17:26:35 2012 +0000

    llvmpipe: fix mask generation and position interpolation with 8-wide vectors

    trivial bugs, with these things start to look somewhat reasonable.
    Textures though have some swizzling issues it seems.

commit 168277a63ef5b72542cf063c337f2d701053ff4b
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Mar 19 16:04:03 2012 +0000

    llvmpipe: don't overallocate variables

    we never have more than 16 (stamp size) / 4 (minimum possible vector size).
    (With larger vectors those variables are still overallocated a bit.)
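The bound can be written out as a small sketch. The macro and function names below are hypothetical; only the values 16 and 4 come from the message above.

```c
#include <assert.h>

#define STAMP_SIZE 16        /* pixels per stamp, from the commit message */
#define MIN_VECTOR_LENGTH 4  /* smallest vector length, from the commit message */
#define MAX_VECTORS_PER_STAMP (STAMP_SIZE / MIN_VECTOR_LENGTH)

/* Number of vectors actually needed to cover one stamp at a given length. */
static unsigned vectors_per_stamp(unsigned vector_length)
{
    assert(vector_length >= MIN_VECTOR_LENGTH);
    assert(STAMP_SIZE % vector_length == 0);
    return STAMP_SIZE / vector_length;
}
```

With length 8 only 2 of the 4 slots are used, which is the remaining overallocation the message mentions.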

commit 409b54b30f81ed0aa9ed0b01affe15c72de9abd2
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Mar 19 15:56:48 2012 +0000

    llvmpipe: add some 32f8 formats to lp_test_conv

    Also add the ability to handle different sized vectors.

commit 55dcd3af8366ebdac0af3cdb22c2588f24aa18ce
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Mar 19 15:47:27 2012 +0000

    gallivm: handle different sized vectors in conversion / pack

    only fully generic path for now (extract/insert per element).

commit 9c040f78c54575fcd94a8808216cf415fe8868f6
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Sun Mar 18 00:58:28 2012 +0100

    llvmpipe: fix harmless use of uninitialized values

commit 551e9d5468b92fc7d5aa2265db9a52bb1e368a36
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Fri Mar 16 23:31:21 2012 +0100

    gallivm: drop special path in extract_broadcast with different sized vectors

    Not needed, llvm can handle shuffles with different sized result vector just
    fine. Should hopefully generate the same code in the end, but simpler IR.

commit 44da531119ffa07a421eaa041f63607cec88f6f8
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Fri Mar 16 23:28:49 2012 +0100

    llvmpipe: adapt interpolation for handling multiple quads at once

    this is still WIP. There are actually two methods possible and it is not quite
    clear which makes the most sense, so there's code for both for now:
    1) the iterative method as used before (compute attrib values at upper left
    corner of stamp and upper left corner of each quad initially).
    It is improved to handle more than one quad at once, and also do some more vectorized
    calculations initially for slightly better code - newer cpus have full throughput with
    4 wide float vectors, hence don't try to code up a path which might be faster if there's
    just one channel active per attribute.
    2) just do straight interpolation for each pixel.
    Method 2) is more work per quad, but less initially; if all quads are executed it is
    significantly more work overall, though this might change with larger vector lengths.
    This method would also be needed if we'd do some kind of active quad merging when
    operating on multiple quads at once.
    This path contains some hack to force llvm to generate better code, it is still far
    from ideal though, still generates far too many unnecessary register spills/reloads.
    Both methods should work with different sized vectors.
    Not very well tested yet, still seems to work with four-wide vectors, need changes
    elsewhere to be able to test with wider vectors.

commit be5d3e82e2fe14ad0a46529ab79f65bf2276cd28
Author: José Fonseca <jfonseca@vmware.com>
Date:   Fri Mar 16 20:59:37 2012 +0000

    draw: Cleanup.

commit f85bc12c7fbacb3de2a94e88c6cd2d5ee0ec0e8d
Author: José Fonseca <jfonseca@vmware.com>
Date:   Fri Mar 16 20:43:30 2012 +0000

    gallivm: More module compilation refactoring.

commit d76f093198f2a06a93b2204857e6fea5fd0b3ece
Author: José Fonseca <jfonseca@vmware.com>
Date:   Thu Mar 15 21:29:11 2012 +0000

    llvmpipe: Use gallivm_compile/free_function() in linear code.

    Should have been done before.

commit 122e1adb613ce083ad739b153ced1cde61dfc8c0
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Mar 13 14:47:10 2012 +0100

    llvmpipe: generate partial pixel mask for multiple quads

    still works with one quad, cannot be tested yet with more
    At least for now always fixed order with multiple quads.

commit 4c4f15081d75ed585a01392cd2dcce0ad10e0ea8
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu Mar 8 22:09:24 2012 +0100

    llvmpipe: refactor state setup a bit

    Refactor to make it easier to emit (and potentially later fetch in fs)
    coefficients for multiple attributes at once.
    Need to think more about how to make this actually happen however, the
    problem is different attributes can have different interpolation modes,
    requiring different handling in both setup and fs (though linear and
    perspective handling is close).

commit 9363e49722ff47094d688a4be6f015a03fba9c79
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu Mar 8 19:23:23 2012 +0100

    llvmpipe: vectorize tri offset calc

    cuts number of instructions in quad-offset-factor from 107 to 75.
    This code actually duplicated the (scalar) code calculating the determinant
    except it used different vertex order (leading to different sign but it doesn't
    matter) hence llvm could not have figured out it's the same (of course with
    determinant vectorized in the other place that wouldn't have worked any longer
    either).
    Note this particular piece doesn't actually vectorize well, not many arithmetic
    instructions left but tons of shuffle instructions...
    Probably would need to work on n tris at a time for better vectorization.

commit 63169dcb9dd445c94605625bf86d85306e2b4297
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu Mar 8 03:11:37 2012 +0100

    llvmpipe: vectorize some scalar code in setup

    reduces number of arithmetic instructions, and avoids loading
    vector x,y values twice (once as scalars once as vectors).
    Results in a reduction of instructions from 76 to 64 in fs setup for glxgears
    (16%) on a cpu with sse41.
    Since this code uses vec2 disguised as vec4, on old cpus which had physical
    64bit sse units (pre-Core2) it probably is less of a win in practice (and if
    you have no vectors you can only hope llvm eliminates the arithmetic for
    unneeded elements).

commit 732ecb877f951ab89bf503ac5e35ab8d838b58a1
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Wed Mar 7 00:32:24 2012 +0100

    draw: fix clipping

    bug introduced by 4822fea3f0440b5205e957cd303838c3b128419c broke
    clipping pretty badly (verified with lineclip test)

commit ef5d90b86d624c152d200c7c4056f47c3c6d2688
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Mar 6 23:38:59 2012 +0100

    draw: don't store vertex header per attribute

    storing the vertex header once per attribute is totally unnecessary.
    Some quick look at the generated assembly says llvm in fact cannot optimize
    away the additional stores (maybe due to potentially aliasing pointers
    somewhere).
    Plus, this makes the code cleaner and also allows using a vector "or"
    instead of scalar ones.

commit 6b3a5a57b0b9850854cfbd7b586e4e50102dda71
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Mar 6 19:11:01 2012 +0100

    draw: do the per-vertex "boolean" clipmask "or" with vectors

    no point extracting the values and doing it per component.
    Doesn't help that much since we still extract the values elsewhere anyway.

commit 36519caf1af40e4480251cc79a2d527350b7c61f
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Fri Mar 2 22:27:01 2012 +0100

    gallivm: fix lp_build_extract_broadcast with different sized vectors

    Fix the obviously wrong argument, so it doesn't blow up.

commit 76d0ac3ad85066d6058486638013afd02b069c58
Author: José Fonseca <jfonseca@vmware.com>
Date:   Fri Mar 2 12:16:23 2012 +0000

    draw: Compile per module and not per function (WIP).

    Enough to get gears w/ LLVM draw + softpipe to work on AVX doing:

      GALLIUM_DRIVER=softpipe SOFTPIPE_USE_LLVM=yes glxgears

    But still hackish -- will need to rethink and refactor this.

commit 78e32b247d2a7a771be9a1a07eb000d1e54ea8bd
Author: José Fonseca <jfonseca@vmware.com>
Date:   Wed Feb 29 12:01:05 2012 +0000

    llvmpipe: Remove lp_state_setup_fallback.

    Never used.

commit 6895d5e40d19b4972c361e8b83fdb7eecda3c225
Author: José Fonseca <jfonseca@vmware.com>
Date:   Mon Feb 27 19:14:27 2012 +0000

    llvmpipe: Don't emit EMMS on x86

    We already take precautions to ensure that LLVM never emits MMX code.

commit 4822fea3f0440b5205e957cd303838c3b128419c
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Wed Feb 29 15:58:19 2012 +0100

    draw: modifications for larger vector sizes

    We want to be able to use larger vectors especially for running the vertex
    shader. With this patch we build soa vectors which might have a different
    length than 4.
    Note that aos structures really remain the same, only when aos structures
    are converted to soa potentially different sized vectors are used.
    Samplers probably don't work yet, didn't look at them.
    Testing done:
    glxgears works with both 128bit and 256bit vectors.

commit f4950fc1ea784680ab767d3dd0dce589f4e70603
Author: José Fonseca <jfonseca@vmware.com>
Date:   Wed Feb 29 15:51:57 2012 +0100

    gallivm: override native vector width with LP_NATIVE_VECTOR_WIDTH env var for debug
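A minimal sketch of such a debug override using only the C standard library. The function names and the 4096-bit sanity cap are assumptions for illustration; only the variable name LP_NATIVE_VECTOR_WIDTH comes from the commit title, and gallivm itself may read it through a different helper.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a width in bits; fall back to the detected value on bad input. */
static unsigned parse_vector_width(const char *text, unsigned fallback)
{
    char *end = NULL;
    unsigned long value;

    if (text == NULL || *text == '\0')
        return fallback;
    value = strtoul(text, &end, 10);
    /* Reject trailing junk, zero, and implausibly large values
     * (the 4096 cap is an assumption, not taken from the source). */
    if (*end != '\0' || value == 0 || value > 4096)
        return fallback;
    return (unsigned)value;
}

/* Returns the env override if present and valid, else the detected width. */
static unsigned native_vector_width_with_override(unsigned detected)
{
    assert(detected != 0);
    return parse_vector_width(getenv("LP_NATIVE_VECTOR_WIDTH"), detected);
}
```

Keeping the parsing separate from the getenv() call makes the fallback behaviour easy to check without touching the process environment.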

commit 6ad6dbf0c92f3bf68ae54e5f2aca035d19b76e53
Author: José Fonseca <jfonseca@vmware.com>
Date:   Wed Feb 29 15:51:24 2012 +0100

    draw: allocate storage with alignment according to native vector width

commit 7bf0e3e7c9bd2469ae7279cabf4c5229ae9880c1
Author: José Fonseca <jfonseca@vmware.com>
Date:   Fri Feb 24 19:06:08 2012 +0000

    gallivm: Fix comment grammar.

    Was missing several words. Spotted by Roland.

commit b20f1b28eb890b2fa2de44a0399b9b6a0d453c52
Author: José Fonseca <jfonseca@vmware.com>
Date:   Thu Feb 23 19:22:09 2012 +0000

    gallivm: Use MC-JIT on LLVM 3.1 + (i.e, SVN)

    MC-JIT

    Note: MC-JIT is still WIP. For this to work correctly it requires
    LLVM changes which are not yet upstream.

commit b1af4dfcadfc241fd4023f4c3f823a1286d452c0
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu Feb 23 20:03:15 2012 +0100

    llvmpipe: use new lp_type_width() helper in lp_test_blend

commit 04e0a37e888237d4db2298f31973af459ef9c95f
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu Feb 23 19:50:34 2012 +0100

    llvmpipe: clean up lp_test_blend a little

    Using variables just sized and aligned right makes it a bit more obvious
    what's going on.
    The test still only tests vector length 4.
    For AoS anything else probably isn't going to work.
    For SoA other lengths should work (at least with floats).

commit e61c393d3ec392ddee0a3da170e985fda885a823
Author: José Fonseca <jfonseca@vmware.com>
Date:   Thu Feb 23 17:48:30 2012 +0000

    gallivm: Ensure vector width consistency.

    Instead of assuming that everything is the max native size.

commit 330081ac7bc41c5754a92825e51456d231bf84dd
Author: José Fonseca <jfonseca@vmware.com>
Date:   Thu Feb 23 17:44:14 2012 +0000

    draw: More simd vector width consistency fixes.

commit d90ca002753596269e37297e2e6c139b19f29f03
Author: José Fonseca <jfonseca@vmware.com>
Date:   Thu Feb 23 17:43:00 2012 +0000

    gallivm: Remove unused lp_build_int32_vec4_type() helper.

commit cae23417824d75869c202aaf897808d73a2c1db0
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Thu Feb 23 17:32:16 2012 +0100

    gallivm: use global variable for native vector width instead of define

    We do not know the simd extensions (and hence the simd width we should use)
    available at compile time.
    At least for now keep a define for maximum vector width, since a global
    variable obviously can't be used to adjust alignment of automatic stack
    variables.
    Leave the runtime-determined value at 128 for now in all cases.
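The split between a compile-time maximum and a run-time value can be sketched as follows. All names here are hypothetical and the 256-bit maximum is an assumption; the point is only that automatic (stack) storage needs constants for its size and alignment, while the loop bound can follow the global.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VECTOR_WIDTH 256                      /* bits, compile-time bound (assumed) */
#define MAX_VECTOR_LENGTH (MAX_VECTOR_WIDTH / 32) /* number of 32-bit lanes */

static unsigned native_vector_width = 128;        /* bits, decided at run time */

static void set_native_vector_width(unsigned bits)
{
    assert(bits >= 32 && bits <= MAX_VECTOR_WIDTH && bits % 32 == 0);
    native_vector_width = bits;
}

/* Sums as many float lanes as the current native width holds. */
static float sum_native_lanes(const float *src)
{
    /* Stack storage is sized and aligned for the maximum width, because a
     * global variable cannot appear in an array size or alignment here. */
    _Alignas(MAX_VECTOR_WIDTH / 8) float tmp[MAX_VECTOR_LENGTH] = {0};
    size_t lanes = native_vector_width / 32;
    float sum = 0.0f;

    for (size_t i = 0; i < lanes; i++)
        tmp[i] = src[i];
    for (size_t i = 0; i < lanes; i++)
        sum += tmp[i];
    return sum;
}
```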

commit 51270ace6349acc2c294fc6f34c025c707be538a
Author: José Fonseca <jfonseca@vmware.com>
Date:   Thu Feb 23 15:41:02 2012 +0000

    gallivm: Add a hunk inadvertently lost when rebasing.

commit bf256df9cfdd0236637a455cbaece949b1253e98
Author: José Fonseca <jfonseca@vmware.com>
Date:   Thu Feb 23 14:24:23 2012 +0000

    llvmpipe: Use consistent vector width in depth/stencil test.

commit 5543b0901677146662c44be2cfba655fd55da94b
Author: José Fonseca <jfonseca@vmware.com>
Date:   Thu Feb 23 14:19:59 2012 +0000

    draw: Use a consistent vector register width.

    Instead of 4x32 sometimes, LP_NATIVE_VECTOR_WIDTH other times.

commit eada8bbd22a3a61f549f32fe2a7e408222e5c824
Author: José Fonseca <jfonseca@vmware.com>
Date:   Thu Feb 23 12:08:04 2012 +0000

    gallivm: Remove garbage collection.

    MC-JIT will require one compilation per module (as opposed to one
    compilation per function), therefore no state will be shared,
    eliminating the need to do garbage collection.

commit 556697ea0ed72e0641851e4fbbbb862c470fd7eb
Author: José Fonseca <jfonseca@vmware.com>
Date:   Thu Feb 23 10:33:41 2012 +0000

    gallivm: Move all native target initialization to lp_set_target_options().

commit c518e8f3f2649d5dc265403511fab4bcbe2cc5c8
Author: José Fonseca <jfonseca@vmware.com>
Date:   Thu Feb 23 09:52:32 2012 +0000

    llvmpipe: Create one gallivm instance for each test.

commit 90f10af8920ec6be6f2b1e7365cfc477a0cb111d
Author: José Fonseca <jfonseca@vmware.com>
Date:   Thu Feb 23 09:48:08 2012 +0000

    gallivm: Avoid LLVMAddGlobalMapping() in lp_bld_assert().

    Brittle, complex, and unnecessary. Just use function pointer constant.

commit 98fde550b33401e3fe006af59db4db628bcbf476
Author: José Fonseca <jfonseca@vmware.com>
Date:   Thu Feb 23 09:21:26 2012 +0000

    gallivm: Add a lp_build_const_func_pointer() helper.

    To be reused in all places where we want to call C code.

commit 6cfedadb62c2ce5af8d75969bc95a607f3ece118
Author: José Fonseca <jfonseca@vmware.com>
Date:   Thu Feb 23 09:44:41 2012 +0000

    gallivm: Cleanup/simplify lp_build_const_string_variable.

    - Move to lp_bld_const where it belongs
    - Rename to lp_build_const_string
    - take the length from the argument (and don't count the zero terminator twice)
    - bitcast the constant to generic i8 *

commit db1d4018c0f1fa682a9da93c032977659adfb68c
Author: José Fonseca <jfonseca@vmware.com>
Date:   Thu Feb 23 11:52:17 2012 +0000

    gallivm: Set NoFramePointerElimNonLeaf to true where supported.

commit 088614164aa915baaa5044fede728aa898483183
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Wed Feb 22 19:38:47 2012 +0100

    llvmpipe: pass in/out pointers rather than scalar floats in lp_bld_arit

    we don't want llvm to potentially optimize away the vectors (though it doesn't
    seem to currently), plus we want to be able to handle in/out vectors of arbitrary
    length.

commit 3f5c4e04af8a7592fdffa54938a277c34ae76b51
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Feb 21 23:22:55 2012 +0100

    gallivm: fix lp_build_sqrt() for vector length 1

    since we optimize away vectors with length 1 need to emit intrinsic
    without vector type.

commit 79d94e5f93ed8ba6757b97e2026722ea31d32c06
Author: José Fonseca <jfonseca@vmware.com>
Date:   Wed Feb 22 17:00:46 2012 +0000

    llvmpipe: Remove lp_test_round.

commit 81f41b5aeb3f4126e06453cfc78990086b85b78d
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Tue Feb 21 23:56:24 2012 +0100

    llvmpipe: subsume lp_test_round into lp_test_arit

    Much simpler, and since the arguments aren't passed as 128bit values can run
    on any arch.
    This also uses the float instead of the double versions of the c functions
    (which probably was the intention anyway).
    In contrast to lp_test_round the output is much less verbose however.
    Tested vector width of 32 to 512 bits - all pass except 32 (length 1) which
    crashes in lp_build_sqrt() due to wrong type.

    Signed-off-by: José Fonseca <jfonseca@vmware.com>

commit 945b338b421defbd274481d8c4f7e0910fd0e7eb
Author: José Fonseca <jfonseca@vmware.com>
Date:   Wed Feb 22 09:55:03 2012 +0000

    gallivm: Centralize the function compilation logic.

    This simplifies a lot of code.

    Also doing this in a central place will make it easier to carry out the
    changes necessary to use MC-JIT in the future.

gallivm: Fix typo in explicit derivative shuffle.

Trivial.

draw: make DEBUG_STORE work again

adapt to lp_build_printf() interface changes

Reviewed-by: José Fonseca <jfonseca@vmware.com>

draw: get rid of vecnf_from_scalar()

just use lp_build_broadcast directly (cannot assign a name but don't really
need it, vecnf_from_scalar() was producing much uglier IR due to using
repeated insertelement instead of insertelement+shuffle).

Reviewed-by: José Fonseca <jfonseca@vmware.com>

llvmpipe: fix typo in complex interpolation code

Fixes position interpolation when using complex mode
(piglit fp-fragment-position and similar)

Reviewed-by: José Fonseca <jfonseca@vmware.com>

draw: fix clipvertex/position storing again

This appears to be the result of a bad merge.
Fixes piglit tests relying on clipping, like a lot of the interpolation tests.

Reviewed-by: José Fonseca <jfonseca@vmware.com>

gallivm: Fix explicit derivative manipulation.

Same counter variable was being used in two nested loops. Use more
meaningful variable names for the counter to fix and avoid this.
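A hypothetical illustration of this class of bug, not the actual gallivm loops: when the inner loop reuses the outer counter, the outer loop sees the inner loop's final value and stops early (or, with other bounds, never terminates). The loop bounds below are made up for the example.

```c
#include <assert.h>

#define NUM_QUADS 2   /* hypothetical outer bound */
#define NUM_COORDS 3  /* hypothetical inner bound */

/* Buggy shape: both loops share `i`, so only the first quad is visited. */
static unsigned visit_shared_counter(void)
{
    unsigned visited = 0;
    int i;

    for (i = 0; i < NUM_QUADS; i++) {
        for (i = 0; i < NUM_COORDS; i++)  /* clobbers the outer counter */
            visited++;
    }
    return visited;
}

/* Fixed shape: one descriptively named counter per loop. */
static unsigned visit_distinct_counters(void)
{
    unsigned visited = 0;

    for (int quad = 0; quad < NUM_QUADS; quad++) {
        for (int coord = 0; coord < NUM_COORDS; coord++)
            visited++;
    }
    return visited;
}

/* Usage: the shared counter visits 3 pairs instead of the expected 6. */
static void check_loop_counts(void)
{
    assert(visit_shared_counter() == NUM_COORDS);
    assert(visit_distinct_counters() == NUM_QUADS * NUM_COORDS);
}
```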

gallivm: Prevent buffer overflow in repeat wrap mode for NPOT.

Based on Roland's patch, discussion, and review.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

gallivm: Fix dims for TGSI_TEXTURE_1D in emit_tex.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

gallivm: Fix explicit volume texture derivatives.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

gallivm: fix 1d shadow texture sampling

The r coordinate is always used, hence we need 3 coords, not two
(the second one is unused).

Reviewed-by: José Fonseca <jfonseca@vmware.com>

gallivm: Enable AVX support without MCJIT, where available.

For now, this just enables AVX on Windows for testing.  If the code is
stable then we might consider preferring the old JIT wherever possible.

No change elsewhere.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-07-17 13:42:39 +01:00
José Fonseca
ba9c1773d7 gallivm: Allow to force nearest filtering on a per-axis basis.
Experimental code, not really used yet.
2012-07-17 13:42:39 +01:00
Kristian Høgsberg
b262f56738 wayland: Include wl_drm format enum in wayland-drm.h
This gets referenced before we get to generate the header files, so just include the
enum that we need and don't include the generated header.
2012-07-17 08:30:39 -04:00
James Benton
e253175c9c llvmpipe: Fix bug with blend factor in complementary optimisations.
Fixes fdo 52168.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-07-17 13:16:38 +01:00
Christian König
89e755d762 radeonsi: fix vertex element state
The vertex element state isn't in registers any more, so
remove that old code. That fixes a memory corruption with
the blend state and gets eglgears partially working.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-07-17 10:44:12 +02:00
Christian König
4247fd9928 radeon/llvm: fix compiling when llvm is active, but opencl isn't
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-07-17 10:43:53 +02:00
Brian Paul
aa0becdbb6 mesa: include inttypes.h to get uint8_t type
To fix MSVC build.
2012-07-16 16:12:02 -06:00
Brian Paul
fe2a7b7e7f st/egl: fix uninitialized pointer bug
If no format is matched in the loop the value of xconf was undefined.

NOTE: This is a candidate for the 8.0 branch.
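A minimal sketch of the pattern, with hypothetical types and names rather than the st/egl code: initializing the result pointer before the search loop makes the "no match" case well-defined, so the caller can test for NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a visual/config record. */
struct config {
    int format;
};

/* Returns the first config with the wanted format, or NULL if none matches. */
static const struct config *
find_config(const struct config *configs, size_t count, int wanted)
{
    const struct config *found = NULL; /* defined even if the loop never matches */

    for (size_t i = 0; i < count; i++) {
        if (configs[i].format == wanted) {
            found = &configs[i];
            break;
        }
    }
    return found;
}

/* Usage: a miss yields NULL instead of an indeterminate pointer. */
static void check_find_config(void)
{
    static const struct config table[] = { {1}, {2}, {3} };

    assert(find_config(table, 3, 2) == &table[1]);
    assert(find_config(table, 3, 9) == NULL);
}
```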
2012-07-16 16:03:31 -06:00
Brian Paul
2f92a9f721 r300g: silence uninitialized var warning 2012-07-16 16:03:31 -06:00
Elvis Lee
cf775c9cbf egl_dri2: NULL check for EGLNativeWindowType
Some applications call eglCreateWindowSurface with an
EGLNativeWindowType parameter of zero. This causes a SEGV
and disrupts error handling such as returning EGL_NO_SURFACE.

Signed-off-by: Elvis Lee <kwangwoong.lee@lge.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-07-16 16:03:31 -06:00
Jon TURNEY
d80fd04639 Fix building mesa with assembly enabled since a112ca5d
a112ca5d rather crassly smashed all the compiler flags together into AM_CFLAGS.
Separate them out the way they were before, putting pre-processor flags into
AM_CPPFLAGS, so assembly source gets preprocessed with the correct pre-processor
flags as well.

Also, remove unneeded CFLAGS from AM_CFLAGS, and CXXFLAGS from AM_CXXFLAGS

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Tested-by: Brian Paul <brianp@vmware.com>
2012-07-16 22:54:36 +01:00
Chad Versace
8dc074cd92 intel: Fix build broken by ETC1 patch
I suck at resolving merge conflicts and broke the build in a5a34b1.
This patch adds the missing field intel_mipmap_tree::wraps_etc1.

Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-16 14:29:24 -07:00
Chad Versace
a5a34b153d intel: Enable GL_OES_compressed_ETC1_RGB8_texture
Enable it for all hardware.

No current hardware supports ETC1, so this patch implements it by
translating the ETC1 data to RGBX data during the call to
glCompressedTexImage2D(). For details, see the doxygen for
intel_mipmap_tree::wraps_etc1.

Passes the Piglit test spec/OES_compressed_ETC1_RGB8_texture/miptree and
the ETC1 test in the GLES2 conformance suite.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-16 14:11:12 -07:00
Chad Versace
8ec721264c mesa: Add function for decoding ETC1 textures
Add function _mesa_etc1_unpack_rgba8888. It is intended to be used by
glCompressedTexSubImage2D to decode ETC1 textures into RGBA.

CC: Chia-I <olv@lunarg.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-16 14:07:57 -07:00
Chad Versace
d7458e401e gallium/util, mesa: Refactor etc1 unpack function
Move the body of util_etc1_rgb8_unpack_rgba_unorm8 into a new function
that can be shared between gallium and dri drivers,
texcompress_etc_tmp.h:etc1_unpack_rgba8888.

CC: Chia-I <olv@lunarg.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-16 14:07:57 -07:00
Kristian Høgsberg
7250cd506b gbm: Rename gbm_bo_get_pitch to gbm_bo_get_stride
We use pitch for 'pixels per row' and stride for 'bytes per row' pretty
consistently in mesa and most other places, so rename the gbm API.
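To make the distinction concrete, here is a small sketch with hypothetical helper names (these are not gbm functions): pitch counts pixels per row, stride counts bytes per row, and byte offsets into a buffer are computed from the stride.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* pitch: pixels per row; stride: bytes per row. */
static uint32_t stride_from_pitch(uint32_t pitch_pixels, uint32_t bytes_per_pixel)
{
    return pitch_pixels * bytes_per_pixel;
}

/* Byte offset of pixel (x, y) in a linear buffer with the given stride. */
static size_t pixel_offset(uint32_t stride_bytes, uint32_t bytes_per_pixel,
                           uint32_t x, uint32_t y)
{
    return (size_t)y * stride_bytes + (size_t)x * bytes_per_pixel;
}

/* Usage: a 256-pixel row of 4-byte pixels has a 1024-byte stride. */
static void check_stride(void)
{
    assert(stride_from_pitch(256, 4) == 1024);
    assert(pixel_offset(1024, 4, 3, 2) == 2060);
}
```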
2012-07-16 16:29:16 -04:00
Kristian Høgsberg
44f066b9ff gbm: Add new gbm_bo_import entry point
This generalizes and replaces gbm_bo_create_for_egl_image.  gbm_bo_import
will create a gbm_bo from either an EGLImage or a struct wl_buffer.
2012-07-16 16:29:15 -04:00
Roland Scheidegger
43ccded1e1 llvmpipe: destroy setup variants on context destruction
lp_delete_setup_variants() used to be called in garbage collection,
but this no longer exists hence the setup shaders never got freed.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-07-16 19:00:54 +01:00
James Benton
8684ffc141 llvmpipe: Unified common code between AoS and SoA blending.
Added a new file lp_bld_blend.c for the common code.
Merged and added some simple optimisations.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-07-16 19:00:54 +01:00
Kristian Høgsberg
636646a481 intel: Don't call _mesa_get_format_bytes for MESA_FORMAT_NONE
When we don't intend to texture from or render to a __DRIimage we
use __DRI_IMAGE_FORMAT_NONE.  In that case, we just create the __DRIimage
to reference the underlying buffer, and will create usable __DRIimages
from it using createSubImage later.

If we try to use _mesa_get_format_bytes() on MESA_FORMAT_NONE in
a debug build, we hit an assertion, so let's not do that.
2012-07-16 11:00:16 -04:00
Jon TURNEY
81de0431d6 Fix building glsl when using automake-1.12 after 68e04cc6
Commit 68e04cc6 was tested using automake-1.11.  Unfortunately, automake-1.12
made a "slightly backward-incompatible change" in the use of yacc with C++, and
for a .yy file, the generated header file is now named .hh, not .h

To work with both, write our own rule for running yacc, which generates a
header file named .h, rather than using automake's rule.

Also, remove things from BUILD_SOURCES which don't need to be there

Also, update EXCLUDE rules in doxygen/glsl.doxy, for change of generated files
from .cpp -> .cc, and glsl_lexer.h has never existed.

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
2012-07-15 15:27:26 +01:00
Marek Olšák
bc6bff7947 r600g: compute needed CS space for vertex buffers correctly 2012-07-15 15:26:14 +02:00
Marek Olšák
15ca9d159e r600g: don't check the R600_GLSL130 env var
GLSL 1.3 has been enabled by default for quite a while.
2012-07-15 02:16:46 +02:00
Jerome Glisse
e634651024 r600g: fix DB decompression on evergreen
Separated out of the hyperz patch by Marek with minor modifications.

Signed-off-by: Marek Olšák <maraeo@gmail.com>
2012-07-15 02:06:44 +02:00
Tom Stellard
c2f444c54d r600g: Emit vertex buffers using the same method as constant buffers
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2012-07-15 02:00:27 +02:00
Tom Stellard
9b76ee70b2 r600g: Unify 3D and compute vertex buffer emission
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2012-07-15 02:00:21 +02:00
Marek Olšák
0b4c5dbb8c r600g: fix grammar constant_buffer -> constant_buffers 2012-07-15 01:41:11 +02:00
Andreas Boll
e3ff4d4c10 radeon/llvm: Fix CR/LF in AMDILSIDevice.h 2012-07-13 16:35:22 +00:00
Tom Stellard
cc3907856e radeon/llvm: Clean up AMDILIntrinsicInfo.cpp 2012-07-13 16:29:46 +00:00
Tom Stellard
f323c6260d radeon/llvm: Coding style fixes 2012-07-13 16:29:46 +00:00
Jon TURNEY
39d82a1b20 Fix linking gallium drivers with dricore after defadf2b1
Commit defadf2b1 erroneously tries to make gallium drivers link with libdricore
as a static library, not a shared library

Also, change uses of DRI_LIB_DEPS in gallium driver Makefiles to
GALLIUM_DRI_LIB_DEPS, so the libraries added are used when linking the gallium
driver

Also, fix the path to the libdricore.so symlink, it's made in LIB_DIR, not in
the libdricore directory

Also repair quoting of dricore settings of DRI_LIB_DEPS and GALLIUM_DRI_LIB_DEPS
variables so VERSION is interpolated in configure but TOP and LIB_DIR are
interpolated later (where they are known, but VERSION isn't)

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2012-07-13 17:20:39 +01:00
Christoph Bumiller
9ed65301e0 nouveau: implement missing timer query functionality 2012-07-13 17:28:00 +02:00
Kristian Høgsberg
426a23af14 wayland: Stop trying to use make rules from aclocal, just copy and paste
Defeated by autotool, copy and paste to the rescue.

https://bugs.freedesktop.org/show_bug.cgi?id=51997
https://bugs.freedesktop.org/show_bug.cgi?id=51531

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2012-07-13 11:20:17 -04:00
José Fonseca
b3ba0a7afa mesa/st: Generates TGSI that always recognizes INSTANCEID/VERTEXID as integers.
Tested by running piglit draw-instanced, and by forcing llvmpipe to advertise no native
integer support, which now produces:

VERT
DCL IN[0]
DCL SV[0], INSTANCEID
DCL OUT[0], POSITION
DCL OUT[1], COLOR
DCL CONST[0..19]
DCL TEMP[0], LOCAL
DCL TEMP[1], LOCAL
DCL TEMP[2], LOCAL
DCL ADDR[0]
  0: U2F TEMP[0].x, SV[0]
  1: ARL ADDR[0].x, TEMP[0].xxxx
  2: MOV TEMP[1].xy, CONST[ADDR[0].x+8].xyxx
  3: ADD TEMP[2].x, IN[0].xxxx, TEMP[1].xxxx
  4: ADD TEMP[1].x, IN[0].yyyy, TEMP[1].yyyy
  5: MUL TEMP[2], CONST[16], TEMP[2].xxxx
  6: MAD TEMP[2], CONST[17], TEMP[1].xxxx, TEMP[2]
  7: MAD TEMP[2], CONST[18], IN[0].zzzz, TEMP[2]
  8: MAD TEMP[2], CONST[19], IN[0].wwww, TEMP[2]
  9: ARL ADDR[0].x, TEMP[0].xxxx
 10: MOV TEMP[1], CONST[ADDR[0].x]
 11: MOV OUT[0], TEMP[2]
 12: MOV OUT[1], TEMP[1]
 13: END
2012-07-13 13:01:52 +01:00
José Fonseca
6dddd18480 draw,gallivm: Fix draw_get_shader_param.
- Use LLVM limits when LLVM is being used, instead of TGSI limits
- Provide draw_get_shader_param_no_llvm for when llvm is never used (softpipe)
- Eliminate several of the hacks around draw shader caps in several drivers

Unfortunately the hack for PIPE_MAX_VERTEX_SAMPLERS is still necessary.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-07-13 13:01:51 +01:00
Jon TURNEY
99728076ec Don't explicitly link libOsmesa with libmesa's dependency libglsl
The libmesa convenience library is linked with the libglsl convenience
library.  libOsmesa is linked with libmesa, and also directly with libglsl.
When using libtool, this gives rise to duplicate symbol errors.

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Matt Turner <mattst88@gmail.com>
2012-07-13 12:44:44 +01:00
Jon TURNEY
b2a37e242e automake: convert libglapi
* "configure substitutions are not allowed in _SOURCES variables" in automake,
so remove the AC_SUBST'ed GLAPI_ASM_SOURCES and instead use some AM_CONDITIONALS
to choose which asm sources are used

* Change GLAPI_LIB to point to the .la file in other Makefile.am files, and make a link
to the .a file for the convenience of other Makefiles which have not yet been converted
to automake

v2:
- Use AM_CPPFLAGS for cleaner build output
- EXTRA_SOURCES is not needed
- Remove libglapi.a compatibility link on clean

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Matt Turner <mattst88@gmail.com>
2012-07-13 12:44:07 +01:00
Jon TURNEY
1e48dfeee6 Rename X86-64_API -> X86_64_API
automake doesn't allow hyphens in variable names

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Matt Turner <mattst88@gmail.com>
2012-07-13 12:44:05 +01:00
Jon TURNEY
defadf2b15 Link dri drivers with mesa or dricore libtool library
Now mesa/drivers/dri is converted to automake, we want to update DRI_LIB_DEPS
so that we link with the libmesa or libdricore libtool library, as appropriate.

However, this is complicated by the fact that gallium/targets is not (yet)
converted, so we can't share the DRI_LIB_DEPS autoconf variable with that anymore.

Add an additional autoconf variable GALLIUM_DRI_LIB_DEPS, which is now used in
gallium/targets/Makefile.dri, to link with the libdricore or libmesa native library.

v2: libdricore$VERSION.a needs to be libdricore$(VERSION).a

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Matt Turner <mattst88@gmail.com>
2012-07-13 12:44:03 +01:00
Jon TURNEY
cf362d00b9 Remove unused MESA_MODULES autoconf variable
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Matt Turner <mattst88@gmail.com>
2012-07-13 12:44:01 +01:00
Jon TURNEY
a112ca5d5f automake: convert libmesa and libmesagallium
* "configure substitutions are not allowed in _SOURCES variables" in automake, so instead of
MESA_ASM_FILES, use some AM_CONDITIONALS to choose which architecture's asm sources are used
in libmesa_la_SOURCES. (Can't remove MESA_ASM_FILES autoconf variable as it's still used in
sources.mak)

* Update to link with the .la file in other Makefile.am files, and make a link to the
.a file for the convenience of other Makefiles which have not yet been converted to automake

v2: Remove stray -static from LDFLAGS
v3: Remove .a compatibility link on clean

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Matt Turner <mattst88@gmail.com>
2012-07-13 12:43:58 +01:00
Jon TURNEY
8676890018 Rename sparc/clip.S -> sparc/sparc_clip.S
Automake can't handle having both clip.S and clip.c, even though they have different paths
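
Illustration (not part of the commit): a minimal Python sketch of why the
two files clash, assuming the object name is derived from the basename
alone. find_object_collisions is a hypothetical helper, not an automake
function.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_object_collisions(sources):
    """Group source paths by the object name a flat build would give them."""
    by_object = defaultdict(list)
    for src in sources:
        # Assumption: the object name is the basename with a .lo suffix,
        # so the directory part of the path is ignored.
        by_object[PurePosixPath(src).stem + ".lo"].append(src)
    return {obj: srcs for obj, srcs in by_object.items() if len(srcs) > 1}


before = ["sparc/clip.S", "main/clip.c", "main/context.c"]
after = ["sparc/sparc_clip.S", "main/clip.c", "main/context.c"]

print(find_object_collisions(before))  # clip.lo is produced twice
print(find_object_collisions(after))   # no collisions after the rename
```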

"src/mesa/Makefile.am: object `clip.lo' created by `$(SRCDIR)/sparc/clip.S' and `$(SRCDIR)/main/clip.c'"

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Matt Turner <mattst88@gmail.com>
2012-07-13 12:43:56 +01:00
Jon TURNEY
68e04cc601 automake: convert libglsl
v2: Use AM_V_GEN to silence generated code rules. Add BUILT_SOURCES to CLEANFILES
v3:
- Fix an accidental // in a path
- Use automake make rules for lex/yacc rather than writing our own
- Update .gitignore appropriately
- Build a libglcpp convenience library rather than awkwardly including
the files in libglsl and delegating the generation
- Remove libglsl.a compatibility link on clean
v4:
- Automake's rules for lex/yacc make .cc if source is .ll or .yy, and apparently we
must use those extensions "because of scons", so update everywhere glsl_parser.cpp
-> glsl_parser.cc and glsl_lexer.cpp -> glsl_lexer.cc. This fixes 'make tarballs'
and building with dricore enabled.

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Matt Turner <mattst88@gmail.com>
2012-07-13 12:43:41 +01:00
Laurent Carlier
284325d97b automake: convert libOSmesa
This also fixes the installation of libOSmesa.

v2: Remove old Makefile, libOSmesa is now versioned, fix typos
v3: Keep config substitution alphabetized
v4: Update .gitignore
v5: Libraries will be in the builddir, not the srcdir.

Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Matt Turner <mattst88@gmail.com>
2012-07-13 12:43:39 +01:00
Marek Olšák
1a06e8454e mesa,st/mesa: implement GL_RGB565 from ARB_ES2_compatibility
This was not implemented, because the spec was changed just recently.

Everything has been in place already.

Gallium has PIPE_FORMAT_B5G6R5_UNORM, while Mesa has MESA_FORMAT_RGB565.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-13 01:36:07 +02:00
Kenneth Graunke
fe911c1d43 i965: Move loop over texture units into brw_populate_sampler_prog_key.
The whole reason I avoided this was because it might operate on a
brw_vertex_program or a brw_fragment_program.  However, that isn't a
problem: all we need is the gl_program base type.

This avoids awkwardly passing the loop counter 'i' as a parameter,
simplifies both callers, and also plumbs prog in place for future use.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-12 14:17:44 -07:00
Kenneth Graunke
86e401b771 i965: Always emit alpha when nr_color_buffers == 0.
If alpha-testing is enabled, we need to send alpha down the pipeline
even if nr_color_buffers == 0.  However, tracking whether alpha-testing
is enabled in the WM program key is expensive: it causes us to compile
multiple specializations of the same shader, using program cache space.

This patch removes the check for alpha-testing, and simply emits alpha
whenever nr_color_buffers == 0.  We believe this will also be necessary
for alpha-to-coverage, and it should add minimal overhead to an uncommon
case.  Saving the recompiles should more than make up the difference.
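
Illustration (not part of the commit): a small Python model of the
recompile cost, assuming a program cache that compiles once per distinct
key. The field names and count_variants are hypothetical; the real WM
program key has many more fields.

```python
from itertools import product


def count_variants(states, key_fields):
    """Count distinct program keys, i.e. how many compiles the cache needs."""
    return len({tuple(state[f] for f in key_fields) for state in states})


# Every combination of the two pieces of state discussed above.
states = [
    {"nr_color_buffers": n, "alpha_test": a}
    for n, a in product((0, 1), (False, True))
]

with_alpha_test = count_variants(states, ("nr_color_buffers", "alpha_test"))
without_alpha_test = count_variants(states, ("nr_color_buffers",))
print(with_alpha_test, without_alpha_test)
```

Dropping a field from the key can only shrink the number of variants, which
is the saving the message refers to.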

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-12 13:35:46 -07:00
Kenneth Graunke
16060531ba i965: Use the blitter in intel_bufferobj_subdata for busy BOs on Gen6+.
Previously we only did this pre-Gen6, and used pwrite on Gen6+.
In one workload, this cuts a significant amount of overhead.

v2: Simplify the function based on Eric's suggestions.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-12 13:35:46 -07:00
José Fonseca
978807ef01 gallivm: Use %.9g to print floats.
So that we can see them in their full denormalized glory.
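
Illustration (not part of the commit): nine significant decimal digits are
enough to round-trip any finite 32-bit float, denormals included. The
Python sketch below checks this on a few samples; to_f32 and roundtrips are
hypothetical helpers, and Python's % formatting stands in for C's printf.

```python
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def roundtrips(x, fmt):
    """True if printing the float32 value of x with fmt and parsing it back is lossless."""
    value = to_f32(x)
    return to_f32(float(fmt % value)) == value


# Includes the smallest denormal (about 1.4e-45) and a value near the
# smallest normal float32.
samples = [0.1, 1.0 / 3.0, 16777217.0, 1.4e-45, 1.17549421e-38]

for x in samples:
    print("%.9g" % to_f32(x), roundtrips(x, "%.9g"), roundtrips(x, "%g"))
```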

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-07-12 21:14:35 +01:00
José Fonseca
5b8d80a783 scons: Remove -ffast-math.
We rely on proper IEEE 754 behavior in too many places for this.

See also commit 2fdbbeca43 with equivalent
change for autoconf.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-07-12 21:14:29 +01:00
José Fonseca
bd3aab8d79 scons: Also require recent XCB.
And don't trip when it's not found -- simply skip building src/glx.
2012-07-12 21:13:10 +01:00
Eric Anholt
6882381a2e mesa: Require current libxcb.
Without that, people with buggy apps that looked at just the server
string for GLX_ARB_create_context would call this function that just
threw an error when you tried to make a context.  Google shows plenty
of complaints about this.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-12 12:29:12 -07:00
Tom Stellard
f92873be2c radeon/llvm: Don't use lp_build_swizzle_aos() for swizzles
This function assumes that lp_build_context::type is a vector type,
which is not true for r600 or radeonsi.

This fixes an assertion failure using glamor 2D accel.
2012-07-12 13:53:22 -04:00
Tom Stellard
185fc9a5ef radeonsi: Dump TGSI code prior to doing TGSI->LLVM conversion.
This way if the conversion fails, we know what the TGSI shader looks
like.
2012-07-12 13:53:22 -04:00
Kenneth Graunke
b546aebae9 i965: Delete previous workaround for textureGrad with shadow samplers.
It had many problems:
- The shadow comparison was done post-filtering.
- It required state-dependent recompiles whenever the comparison
  function changed.
- It didn't even work: many cases hit assertion failures.
- I never implemented it for the VS.

The new lowering pass which converts textureGrad to textureLod by
computing the LOD value works much better.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-12 10:20:26 -07:00
Kenneth Graunke
b0c8d3be73 i965: Add a lowering pass to convert TXD to TXL by computing the LOD.
Intel hardware doesn't natively support textureGrad with shadow
comparisons.  So we need to generate code to handle it somehow.

Based on the equations of page 205 of the OpenGL 3.0 specification,
it's possible to compute the LOD value that would be selected given the
gradient values.  Then, we can simply convert the TXD to a TXL.
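
Illustration (not part of the commit): a Python sketch of the LOD
computation, assuming the usual scale-factor form from the GL
specification (rho is the longer of the two gradient vectors in texel
units, and the LOD is log2(rho)). Bias and clamping are left out, and
lod_from_gradients is a hypothetical name, not the lowering pass itself.

```python
import math


def lod_from_gradients(dPdx, dPdy, size):
    """Compute the base LOD from coordinate gradients and the texture size.

    dPdx, dPdy: per-axis derivatives of the normalized coordinate.
    size: texture dimensions in texels, one entry per axis.
    """
    # Scale the normalized gradients into texel units.
    sx = [d * s for d, s in zip(dPdx, size)]
    sy = [d * s for d, s in zip(dPdy, size)]
    # Vector length; for a single axis this reduces to abs().
    len_x = math.sqrt(sum(c * c for c in sx))
    len_y = math.sqrt(sum(c * c for c in sy))
    rho = max(len_x, len_y)
    if rho == 0.0:
        return float("-inf")  # zero gradients: maximally magnified
    return math.log2(rho)


# One texel per pixel on a 256x256 texture selects level 0.
print(lod_from_gradients((1 / 256, 0.0), (0.0, 1 / 256), (256, 256)))
# Two texels per pixel along x selects level 1.
print(lod_from_gradients((2 / 256, 0.0), (0.0, 1 / 256), (256, 256)))
```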

Currently, this passes 34/46 of oglconform's shadow-grad subtests;
four cubemap tests are regressed.  We should investigate this in the
future.

v2: Apply abs() to the scalar case (thanks to Eric).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-12 10:20:23 -07:00
Kenneth Graunke
d9da350a83 glsl/ir_builder: Add a new swizzle_for_size() function.
This swizzles away unwanted components, while preserving the order of
the ones that remain.
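
Illustration (not part of the commit): a Python model of the behavior
described, using a tuple as a stand-in for an IR vector. Clamping to the
operand's own size is an assumption made here, not something the message
states.

```python
def swizzle_for_size(value, components):
    """Keep the first `components` entries of a vector, in their original order."""
    # Assumption: asking for more components than exist keeps the whole vector.
    components = min(components, len(value))
    return tuple(value[:components])


print(swizzle_for_size((1.0, 2.0, 3.0, 4.0), 2))
print(swizzle_for_size((1.0, 2.0), 4))
```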

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-12 10:20:20 -07:00
Kenneth Graunke
0bb3d4ba54 glsl/ir_builder: Add a generic constructor for unary expressions.
I needed to compute logs and square roots in a patch I was working on,
and wanted to use the convenient interface.  We already have a similar
constructor for binops; adding one for unops seems reasonable.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-12 10:20:18 -07:00
Kenneth Graunke
b656df990f glsl: Initialize coordinate to NULL in ir_texture constructor.
I ran into this while trying to create a TXS query, which doesn't have a
coordinate.  Since it didn't get initialized to NULL, a bunch of
visitors tried to access it and crashed.

Most of the time, this won't be a problem, but it's just a good idea.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-12 10:19:38 -07:00
José Fonseca
d9a8cd76e5 st/xorg: Fix build failure due to symbol clash. 2012-07-12 16:02:49 +01:00
Marek Olšák
0f3659bb56 docs: update relnotes-8.1 and GL3 status 2012-07-12 13:05:59 +02:00
Marek Olšák
63d8c8baa9 st/mesa: expose new transform feedback extensions 2012-07-12 13:05:59 +02:00
Marek Olšák
d24ece97e5 mesa: add ARB_transform_feedback_instanced extension enable flag
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-12 13:05:59 +02:00
Marek Olšák
db7404defd mesa: implement new DrawTransformFeedback functions
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-12 13:05:59 +02:00
Marek Olšák
7e0cb473b0 mesa: implement display list support for new DrawTransformFeedback functions
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-12 13:05:59 +02:00
Marek Olšák
ce16ca4635 mesa: implement display list support for indexed query functions
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-12 13:05:59 +02:00
Marek Olšák
553e13dbc2 mesa: implement indexed query functions from ARB_transform_feedback3
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-12 13:05:58 +02:00
Marek Olšák
375e73d859 mesa: implement glGet queries and error handling for ARB_transform_feedback3
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-12 13:05:58 +02:00
Marek Olšák
21cb5ed20d glsl: implement ARB_transform_feedback3 in the linker
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-12 13:05:58 +02:00
Marek Olšák
9576d555e0 glapi: add ARB_transform_feedback_instanced
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-12 13:05:58 +02:00
Marek Olšák
6d13d91f4e glapi: add ARB_transform_feedback3
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-12 13:05:58 +02:00
Marek Olšák
e773a48a3b r600g: fix uploading non-zero mipmap levels of depth textures
This fixes piglit/depth-level-clamp.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:31 +02:00
Marek Olšák
fe1fd67556 r600g: don't flush depth textures set as colorbuffers
The only case a depth buffer can be set as a color buffer is when flushing.

That wasn't always the case, but now this code isn't required anymore.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:31 +02:00
Marek Olšák
6842d5fced r600g: don't set dirty_db_mask for a flushed depth texture
A flushed depth texture is never set as a depth buffer and never flushed.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:31 +02:00
Marek Olšák
5a17d8318e r600g: flush depth textures bound to vertex shaders
This was missing/broken. There are also minor code cleanups.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:31 +02:00
Marek Olšák
dee58f94af r600g: do fine-grained depth texture flushing
- maintain a mask of which mipmap levels are dirty (instead of one big flag)
- only flush what was requested at a given point and not the whole resource
  (most often only one level and one layer has to be flushed)
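
Illustration (not part of the commit): a Python sketch of per-level dirty
tracking with a bitmask, flushing only the requested range. The class and
field names are hypothetical, and layers are ignored for brevity.

```python
class DepthTexture:
    """Tracks which mipmap levels need flushing, one bit per level."""

    def __init__(self, num_levels):
        self.num_levels = num_levels
        self.dirty_level_mask = 0

    def mark_dirty(self, level):
        """Record that rendering touched the given mipmap level."""
        self.dirty_level_mask |= 1 << level

    def flush(self, first_level, last_level):
        """Flush only the dirty levels in the requested range; return them."""
        flushed = []
        for level in range(first_level, last_level + 1):
            bit = 1 << level
            if self.dirty_level_mask & bit:
                flushed.append(level)            # the real flush would go here
                self.dirty_level_mask &= ~bit    # this level is now clean
        return flushed


tex = DepthTexture(num_levels=4)
tex.mark_dirty(0)
tex.mark_dirty(2)
print(tex.flush(0, 0), bin(tex.dirty_level_mask))  # level 2 stays dirty
```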

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
df79eb5956 r600g: remove is_flush from DSA state
we can just update the state when decompressing, there's no need to add
additional info into the DSA state

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
43e3f19c76 r600g: set DISABLE in CB_COLOR_CONTROL if colormask is 0
this will be useful for in-place DB decompression, otherwise should be harmless

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
4fe74412cf r600g: move CB_SHADER_MASK setup into cb_misc_state
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
a1a1ff5ec0 r600g: move MULTIWRITE setup into cb_misc_state for r6xx-r7xx
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
0ea76916e6 r600g: move CB_TARGET_MASK setup into new cb_misc_state
to remove some overhead from draw_vbo. This is a derived state.

BTW, I've got no idea how compute interacts with 3D here, but it should
use cb_misc_state, so that 3D and compute don't conflict.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
5ba15d8d38 st/mesa: implement accelerated stencil blitting using shader stencil export
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
a7f3697eb8 st/mesa: set colormask to zero when blitting depth
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
5a74e17ab0 gallium/u_blit: remove useless memset calls
the structure is calloc'd.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
24e0a26335 gallium/u_blit: drop not-very-useful wrapper around util_blit_pixels_writemask
just rename it to util_blit_pixels

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
3f13b5da15 gallium/u_blit: don't do two copies for non-2D textures
Because u_blit couldn't sample a 1D, 3D, CUBE and ARRAY texture, we created
a 2D texture holding a copy of one slice of the source texture (even for 1D).

Let's just do it right.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
2dca61bcb3 gallium/util: move pipe_tex_to_tgsi_tex helper function into u_inlines
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
bdaf0a085b gallium/u_blitter: accelerate stencil-only copying
This doesn't seem to be used by anything yet, but better safe than sorry.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
12fd81f9e7 gallium/u_blitter: accelerate depth-stencil copying using shader stencil export
This fixes stencil buffer write transfers on r600g.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
76db2c121c gallium: add util_format_stencil_only helper function
used for stencil sampler views.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
a730838a42 gallium/u_blitter: minify depth0 when initializing last_layer
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
91cf9fe988 gallium/u_gen_mipmap: accelerate depth texture mipmap generation
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Marek Olšák
13b0af721a mesa: remove assertions that do not allow compressed 2D_ARRAY textures
NOTE: This is a candidate for the 8.0 branch.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-07-12 02:08:30 +02:00
Paul Berry
33202b4876 i965/msaa: Enable CMS layout on Gen7 for the formats that support it.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-11 15:14:50 -07:00
Paul Berry
4ebbc76621 i965/msaa: Add CMS support to blorp.
This patch updates the blorp engine to properly handle the case where
the surface being textured from uses Gen7's CMS MSAA layout.  The
following changes were necessary:

- Before reading color values from the surface, we need to read from
  the MCS buffer using the ld_mcs sampler message.  This is done by
  the mcs_fetch() function, and the result is stored in the mcs_data
  register.  This only needs to be done once per pixel, since the MCS
  value is shared between all samples belonging to a pixel.

- When reading color values from the surface, we need to use the
  ld2dms sampler message instead of the ld2dss message, and we need to
  provide the value read from the MCS buffer as an argument.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-11 15:14:49 -07:00
Paul Berry
754953693d i965/msaa: Add CMS-related sampler messages to brw_defines.h.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-11 15:14:49 -07:00
Paul Berry
7b3263af69 i965/msaa: Set SURFACE_STATE properly when CMS MSAA is in use.
When a buffer using Gen7's CMS MSAA layout is bound to a texture or a
render target, the SURFACE_STATE structure needs to point to the MCS
buffer and to indicate its pitch.  This patch updates the functions
that emit SURFACE_STATE to handle CMS layout properly.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-11 15:14:49 -07:00
Paul Berry
0ba813506d i965/msaa: Add CMS MSAA settings to brw_structs.h.
Previously the DWORD used to control the CMS MSAA layout was just a
pad value, because we didn't use it.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-11 15:14:49 -07:00
Paul Berry
ccae1b1cd7 i965/msaa: Allocate MCS buffer when CMS MSAA is in use.
To implement Gen7's CMS MSAA layout, we need an extra buffer, the MCS
(Multisample Control Surface) buffer.  This patch introduces code for
allocating and deallocating the buffer, and storing a pointer to it in
the intel_mipmap_tree struct.

No functional change, since the CMS layout is not enabled yet.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-11 15:14:49 -07:00
Paul Berry
1bd4d456cd i965/msaa: Add an enum to describe MSAA layout.
From the Ivy Bridge PRM, Vol 1 Part 1, p112:

    There are three types of multisampled surface layouts designated
    as follows:
      - IMS Interleaved Multisampled Surface
      - CMS Compressed Multisampled Surface
      - UMS Uncompressed Multisampled Surface

Previously, the i965 driver only used IMS and UMS formats, and
distinguished between them using the boolean
intel_mipmap_tree::msaa_is_interleaved.  To facilitate adding support
for the CMS format, this patch replaces that boolean (and other
booleans derived from it) with an enum
INTEL_MSAA_LAYOUT_{IMS,CMS,UMS}.  It also updates the terminology used
in comments throughout the driver to match the IMS/CMS/UMS terminology
used in the PRM.  CMS layout is not yet used.

The enum has a fourth possible value, INTEL_MSAA_LAYOUT_NONE, which is
used for non-multisampled surfaces.
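
Illustration (not part of the commit): a Python sketch of replacing the
boolean with a four-valued enum. The member names follow the
INTEL_MSAA_LAYOUT_* names above; the numeric values and the
layout_from_old_bool helper are assumptions for illustration only.

```python
from enum import Enum, auto


class MsaaLayout(Enum):
    NONE = auto()  # non-multisampled surface
    IMS = auto()   # interleaved multisampled surface
    CMS = auto()   # compressed multisampled surface
    UMS = auto()   # uncompressed multisampled surface


def layout_from_old_bool(num_samples, msaa_is_interleaved):
    """Map the old (sample count, boolean) pair onto the enum.

    The boolean could only tell IMS from UMS, so CMS is never produced here.
    """
    if num_samples <= 1:
        return MsaaLayout.NONE
    return MsaaLayout.IMS if msaa_is_interleaved else MsaaLayout.UMS


print(layout_from_old_bool(4, True), layout_from_old_bool(4, False))
```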

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-11 15:14:49 -07:00
Paul Berry
67b0f7c7dd i965/msaa: Move {rt,tex}_interleaved into blorp program key.
On Gen6, MSAA buffers always use an interleaved layout and non-MSAA
buffers always use a non-interleaved layout, so it is not strictly
necessary to keep track of the layout of the texture and render target
surfaces in the blorp program key.  However, it is cleaner to do so,
since (a) it makes the blorp compiler less dependent on implicit
knowledge about how the GPU pipeline is configured, and (b) it paves
the way for implementing compressed multisampled surfaces in Gen7.

This patch won't cause any redundant compiles, because the layout of
the texture and render target surfaces depends on other parameters
that are already in the blorp program key.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-11 15:14:49 -07:00
Kristian Høgsberg
2adfce4a18 mapi: Move GL_NV_draw_buffers extension to es_EXT.xml
We don't generate public entrypoints for GLES extensions, so move the
GL_NV_draw_buffers definition from ARB_draw_buffers.xml to es_EXT.xml.
When the extension is defined in ARB_draw_buffers.xml, we end up with a
public entry point for it, but no prototype, which gives an error when
compiled with --disable-asm and --disable-shared-glapi.

Instead, just move the GLES extension to es_EXT.xml so this doesn't happen.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2012-07-11 15:28:36 -04:00
Kristian Høgsberg
e6a33570b7 egl: Add EGL_WAYLAND_PLANE_WL attribute
This lets us specify which plane of a multiplanar wl_buffer to create the
image for.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2012-07-11 15:28:36 -04:00
Kristian Høgsberg
1aaec8c609 wayland-drm: Add protocol to create planar buffers 2012-07-11 15:28:35 -04:00
Kristian Høgsberg
379eb47ea6 wayland-drm: Pass struct wl_drm_buffer to the driver
We're going to extend this to support multi-plane buffers, so pass this
to the driver so it can access the details.
2012-07-11 15:28:35 -04:00
Kristian Høgsberg
95bc0527e9 intel: Implement __DRIimage::createSubImage and bump supported version to 5
We use the new miptree offset to pick out the sub-image when we bind
the EGLImage to a texture.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2012-07-11 15:28:35 -04:00
Kristian Høgsberg
02ebad900d intel: Add offset field to miptree
This lets us specify an offset into the bo where the miptree starts,
which will let us set up a texture for a single plane in a planar buffer.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-11 15:28:35 -04:00
Kristian Høgsberg
44a2b57f93 intel: Add support for new __DRIimage formats 2012-07-11 15:28:34 -04:00
Kristian Høgsberg
c029834808 __DRIimage: version 5, add new formats and createSubImage
The additions in version 5 enable creating EGLImages for different planes
of a YUV buffer.  createImageFromName is still used to create the containing
__DRIimage, and createSubImage can then be used on that __DRIimage to create
__DRIimages that correspond to the y, u, and v planes (__DRI_IMAGE_FORMAT_R8)
or the uv planes (__DRI_IMAGE_FORMAT_RG88) for formats such as NV12 where
the u and v components are interleaved.  Packed formats such as YUYV etc.
don't require any special treatment; we just sample those as a regular
ARGB texture.
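
Illustration (not part of the commit): a Python sketch of how the two
sub-images of an NV12 buffer could be described, assuming a tightly packed
buffer with even dimensions (pitch equal to the width in bytes, UV plane
directly after the Y plane). Real buffers carry their own offsets and
strides; nv12_planes is a hypothetical helper.

```python
def nv12_planes(width, height):
    """Describe the Y plane and the interleaved UV plane of an NV12 buffer."""
    y_size = width * height  # one byte per Y sample
    return [
        # Full-resolution luma, sampled as a one-channel 8-bit image.
        {"format": "R8", "offset": 0,
         "width": width, "height": height, "pitch": width},
        # Half-resolution chroma, two bytes (U, V) per sample, so the
        # pitch in bytes matches the luma pitch.
        {"format": "RG88", "offset": y_size,
         "width": width // 2, "height": height // 2, "pitch": width},
    ]


for plane in nv12_planes(640, 480):
    print(plane)
```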

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2012-07-11 15:28:34 -04:00
Tom Stellard
c0f7fe7b79 r600g/compute: Disable growing the memory pool
The code for growing the memory pool (which is used for storing all of
the global buffers) wasn't working.  There seem to be two separate issues
with the memory pool code.  The first was the way it was growing the pool.
When the memory pool needed more space, it would:

1. Copy the data from the memory pool's backing texture to system memory.
2. Delete the memory pool's texture
3. Create a bigger backing texture for the memory pool.
4. Copy the data from system memory into the bigger texture.
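
Illustration (not part of the commit): the four steps above modeled in
Python, with a bytearray standing in for the backing texture. MemoryPool
and grow are hypothetical names; the real copies went through the GPU,
which is where the message says things went wrong.

```python
class MemoryPool:
    def __init__(self, size):
        self.backing = bytearray(size)  # stands in for the backing texture

    def grow(self, new_size):
        """Replace the backing store with a larger one, preserving contents."""
        if new_size <= len(self.backing):
            return
        shadow = bytes(self.backing)          # 1. copy to system memory
        self.backing = None                   # 2. delete the old texture
        self.backing = bytearray(new_size)    # 3. create a bigger one
        self.backing[:len(shadow)] = shadow   # 4. copy the data back


pool = MemoryPool(4)
pool.backing[:] = b"abcd"
pool.grow(8)
print(bytes(pool.backing))
```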

The copy operations didn't seem to be working, and I suspect that since
they were using fragment shaders to do the copy, that there might have
been a problem with the mixing of compute and 3D state.

The other issue is that the size of 1D textures is limited, and I was
having trouble getting 2D textures to work.

I think these problems will be easier to solve once more code is shared
between 3D and compute, which is why I decided to disable it for now
rather than continue searching for a fix.
2012-07-11 17:53:54 +00:00
Tom Stellard
49ae102ee3 radeon/llvm: Use multiclasses for floating point loads
The original strategy for handling floating point loads, which was to
lower (f32 load) to (f32 bitcast (i32 load)) wasn't really working.  The
main problem was that the DAG legalizer couldn't handle replacing a node
with two results (load) with a node with only one result (bitcast).
2012-07-11 17:47:20 +00:00
Tom Stellard
bbdf3af857 radeon/llvm: Don't set the IMM bit in SMRD instruction definitions.
The IMM bit is already being set in SICodeEmitter.
2012-07-11 17:47:20 +00:00
Tom Stellard
d36499aa62 r600g/compute: Add more debugging output 2012-07-11 17:46:59 +00:00
Eric Anholt
f9b3e257d1 i965: Revert the VBOs-in-system-memory hack.
It didn't change performance on Lightsmark or Nexuiz, which both used
DYNAMIC_DRAW buffers, but it was killing performance (40% CPU wasted pwriting
buffers) on a closed-source app we're looking at.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-11 09:20:21 -07:00
Eric Anholt
b5c037f6b1 Add emacs setup for the docs/devinfo.html comment wrapping recommendation.
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-11 09:20:21 -07:00
Ian Romanick
a8724d85f8 glx/dri2: Add support for GLX_ARB_create_context_robustness
Add the infrastructure required for this extension.  There is no
xserver support and no driver support yet.  Drivers can enable this by
advertising DRI2 version 4 and accepting the
__DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS flag and the
__DRI_CTX_ATTRIB_RESET_STRATEGY attribute in create context.

Some additional Mesa infrastructure is needed before drivers can do
this.  The GL_ARB_robustness spec, which all Mesa drivers already
advertise, requires:

    "If the behavior is LOSE_CONTEXT_ON_RESET_ARB, a graphics reset
    will result in the loss of all context state, requiring the
    recreation of all associated objects."

It is necessary to land this infrastructure now so that the related
infrastructure can land in the xserver.  The xserver has very long
release schedules, and the remaining Mesa parts should land long, long
before the next xserver merge window opens.

v2: Expose robustness as a DRI2 extension rather than bumping
__DRI_DRI2_VERSION.

v3: Add a comment explaining why dri2->base.version >= 3 is also
required for GLX_ARB_create_context_robustness.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-11 08:54:50 -07:00
Ian Romanick
de9ed51525 dri2: Hard-code the DRI2 version
This allows revising the dri_interface.h separately from adding driver
support.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-11 08:54:50 -07:00
Ian Romanick
2879f758b5 glapi: Apply Xorg indent rules to all files generated for the xserver
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-11 08:54:50 -07:00
Kenneth Graunke
a0698b000b docs: Update GL3.txt.
We neglected to list the deprecation model/forward compatible context
support.

inverse() has been done for a while.

None of us know what "highp change" means; GLSL 1.30 already added the
ability to recognize precision keywords, and it doesn't look like 1.40
has any new requirements there (precision keywords still have no meaning).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-10 16:53:49 -07:00
Chad Versace
551078bb62 mesa: Remove unneeded extern qualifiers
Remove 'extern' from the functions declared in texcompress_etc.h.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-10 16:51:19 -07:00
Vadim Girlin
3770847960 r600g: improve flushed depth texture handling v2
Use r600_resource_texture::flushed_depth_texture for GPU access, and
allocate it in the VRAM. For transfers we'll allocate texture in the GTT
and store it in the r600_transfer::staging.

Improves performance when flushed depth texture is frequently used by the
GPU, e.g. in Lightsmark (~30%)

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2012-07-11 02:39:59 +04:00
Kenneth Graunke
860d5bdf98 i965: Add hardware context support.
With fixes and updates from Ben Widawsky and comments from Paul Berry.

v2: Use drm_intel_gem_context_destroy to destroy hardware context;
    remove useless initialization of hw_ctx, both suggested by Eric.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Paul Berry <stereotype441@gmail.com>
2012-07-10 15:09:58 -07:00
Ian Romanick
4fae5e32d5 mesa/test: Update name of GL_TIME_ELAPSED
4952caa caused the _EXT to fall off the name of this enum.  This is
fine.  Update the unit test to expect the new value.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51956
2012-07-10 14:46:25 -07:00
Andreas Boll
40742fa686 docs/relnotes-8.0.4: fix html markup 2012-07-10 12:59:34 -07:00
Marek Olšák
67a8ee891b gallium/docs: document interface changes for timestamp query
the query type is already documented
2012-07-10 19:04:13 +02:00
Marek Olšák
a3fccafda9 identity: implement get_timestamp 2012-07-10 19:04:13 +02:00
Marek Olšák
e66d90ec6b noop: implement get_timestamp 2012-07-10 19:04:13 +02:00
Marek Olšák
642539e3f9 trace: implement get_timestamp 2012-07-10 19:04:12 +02:00
Marek Olšák
a471d268ec galahad: implement get_timestamp 2012-07-10 19:04:12 +02:00
Marek Olšák
768589e836 docs: update relnotes-8.1 and GL3 status 2012-07-10 19:04:12 +02:00
Marek Olšák
5ddcda060c softpipe: implement get_timestamp and expose ARB_timer_query
PIPE_QUERY_TIMESTAMP is already implemented and working.
2012-07-10 19:04:12 +02:00
Marek Olšák
21f78d2189 st/mesa: implement ARB_timer_query 2012-07-10 19:04:12 +02:00
Marek Olšák
bcc735aaca gallium: add QUERY_TIMESTAMP cap and get_timestamp screen function 2012-07-10 19:04:12 +02:00
Marek Olšák
d5a7866902 mesa: implement glGet(GL_TIMESTAMP) v2
This adds a new driver function to retrieve the timestamp.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-10 19:04:12 +02:00
Marek Olšák
5094533040 mesa: add ARB_timer_query to the extension list
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-10 19:04:12 +02:00
Marek Olšák
204777c5dc mesa: add QueryCounter display list support
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-10 19:04:12 +02:00
Marek Olšák
f601dcdf70 mesa: implement TIMESTAMP query and glQueryCounter
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-10 19:04:12 +02:00
Marek Olšák
4952caad2d glapi: add ARB_timer_query
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-10 19:04:12 +02:00
Ian Romanick
25fec2e9ca docs: Add 8.0.4 release notes
Also add news story.  Extra, extra!  Read all about it!

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-10 09:05:39 -07:00
Eric Anholt
2d03f48a65 glsl: Add parsing for GLSL uniform blocks.
This doesn't do anything with the uniform block declarations yet, so
usage of those uniforms finds them to be undeclared.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-09 11:13:33 -07:00
Eric Anholt
912a429bc5 glsl: Don't hide the type of struct_declaration_list.
I've been trying to derive from this for UBO support, and the slightly
obfuscated types were putting me over the edge.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-09 11:12:18 -07:00
Kenneth Graunke
532e99cbf2 glcpp: Add built-in #define for GL_ARB_uniform_buffer_object.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-09 11:11:59 -07:00
Vincent Lejeune
7fabb2b593 glsl: Parser handles "#extension GL_ARB_uniform_buffer_object"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-09 11:11:38 -07:00
Eric Anholt
f4fb6bf088 glsl: Reduce a bit of extra code in the merging of layout qualifiers.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-09 11:05:33 -07:00
Eric Anholt
60a784d56e glsl: Take advantage of the layout qualifier flags union to clean up parsing.
The got_one variable was set iff one of the bits in flags.i was set.
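
Illustration (not part of the commit): a Python ctypes sketch of the idea,
where a union overlays one-bit flags with an integer so "any flag set" is
a single comparison. The flag names shown are a made-up subset and got_one
is rewritten here as a hypothetical function.

```python
import ctypes


class QualifierBits(ctypes.Structure):
    # Illustrative one-bit flags; the real structure has more fields.
    _fields_ = [
        ("origin_upper_left", ctypes.c_uint, 1),
        ("pixel_center_integer", ctypes.c_uint, 1),
        ("depth_any", ctypes.c_uint, 1),
    ]


class QualifierFlags(ctypes.Union):
    # q and i share the same storage, so i is nonzero iff some bit in q is set.
    _fields_ = [("q", QualifierBits), ("i", ctypes.c_uint)]


def got_one(flags):
    """True if at least one qualifier bit is set."""
    return flags.i != 0


flags = QualifierFlags()
print(got_one(flags))
flags.q.depth_any = 1
print(got_one(flags))
```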

v2: Fix incorrect dropping of the ARB_conservative_depth warning.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-07-09 11:04:45 -07:00
Tom Stellard
9b00edc79a r600g: Don't create a texture for the memory_pool during screen init
This fixes a segfault in r600_screen_create() introduced by
eb065f5d9d

Reported by tilman on irc.
2012-07-09 12:14:07 -04:00
Tom Stellard
76b44034b9 radeon/llvm: Rename namespace from AMDIL to AMDGPU 2012-07-09 13:43:11 +00:00
Tom Stellard
39323e8f79 r600g: Update number of gprs when adding a vertex instruction 2012-07-09 13:42:24 +00:00
Tom Stellard
da9c8a73ec r600g/compute: Use evergreen_cb() for binding RATs 2012-07-09 13:41:18 +00:00
Tom Stellard
960906d16b r600g: Add support for RATs in evergreen_cb() 2012-07-09 13:41:18 +00:00
Tom Stellard
eb065f5d9d r600g: Use a texture as the underlying resource for compute_memory_pool
This is the first step towards being able to use evergreen_cb to bind RATs.
2012-07-09 13:41:18 +00:00
Tom Stellard
9d36441374 r600g: Add is_rat flag to r600_resource_texture 2012-07-09 13:41:18 +00:00
Tom Stellard
3d3194e93c r600g: Add r600_context_pipe_state_emit()
This function is used when dispatching a compute shader in order to avoid
mixing compute and 3D registers in the context's dirty list.  This
allows the compute code to reuse 3D functions like evergreen_cb, which
return a struct r600_pipe_state, and still have control over when and how
the register writes are emitted.
2012-07-09 13:41:17 +00:00
Tom Stellard
e00e1586dd r600g: Add pkt_flag parameter to r600_context_block_emit_dirty()
This allows the shader type bit to be set in the pm4 header when
emitting registers for compute shaders.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-07-09 13:41:17 +00:00
Tom Stellard
25145de03e r600g/compute: Move LOOP_CONST initialization to start_compute_cs atom 2012-07-09 13:41:17 +00:00
Tom Stellard
5016fe2d47 r600g: Add start_compute_cs atom to struct r600_context
The start_compute_cs atom initializes some config and context registers
to the values needed for running compute shaders.  When a compute shader
is dispatched, this atom is emitted after the start_cs_cmd atom, which
initializes registers that are common to both 3D and compute.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-07-09 13:41:17 +00:00
Tom Stellard
38be0966c7 r600g: Add pkt_flag member to struct r600_command_buffer
Some packets require the shader type bit (bit 1) to be set when
used for compute shaders.  The pkt_flag will be initialized to
RADEON_CP_PACKET3_COMPUTE_MODE for any struct r600_command_buffer used
for dispatching compute shaders and it will be or'd against the result of
the PKT3 macro when adding a new packet to a struct r600_command_buffer.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-07-09 13:41:17 +00:00
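A minimal sketch of the OR step described in the commit message above. The names PKT3_SKETCH, COMPUTE_MODE_BIT, struct cmd_buffer_sketch and emit_header are hypothetical stand-ins, and the header field layout is an assumption made for illustration; only the idea (a per-buffer flag that sets bit 1 of every packet header for compute) comes from the message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for RADEON_CP_PACKET3_COMPUTE_MODE: bit 1 of the header. */
#define COMPUTE_MODE_BIT 0x2u

/* Simplified, illustrative type-3 packet header; the field layout is an assumption. */
#define PKT3_SKETCH(op, count) \
    ((3u << 30) | (((uint32_t)(count) & 0x3fffu) << 16) | (((uint32_t)(op) & 0xffu) << 8))

struct cmd_buffer_sketch {
    uint32_t pkt_flag;   /* 0 for 3D work, COMPUTE_MODE_BIT for compute dispatch */
};

/* Build a packet header and OR in the buffer's flag, as the message describes. */
uint32_t emit_header(const struct cmd_buffer_sketch *cb, unsigned op, unsigned count)
{
    assert(cb != 0);
    return PKT3_SKETCH(op, count) | cb->pkt_flag;
}
```

With this shape the code that builds packets does not need to know whether it is running for 3D or compute; the difference lives in how the command buffer was initialized.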
Tom Stellard
7d0c17fe74 r600g: Only emit start_cs_cmd atom once for compute command streams 2012-07-09 13:41:17 +00:00
Marek Olšák
0a21b561c7 r600g: fix stencil texturing with Z32_FLOAT_S8X24_UINT 2012-07-09 13:58:00 +02:00
Marek Olšák
a460df9299 r600g: add assertions after translate_colorswap/colorformat/dbformat/texformat 2012-07-09 13:57:59 +02:00
Marek Olšák
c1e8c845ea r600g: inline r600_hw_copy_region 2012-07-09 13:57:59 +02:00
Marek Olšák
9974e9ac5d r600g: enable dual src blending on r7xx
No lockups here.
2012-07-09 13:57:59 +02:00
Marek Olšák
6657a7af61 r600g: use depth format from pipe_surface, not pipe_resource 2012-07-09 13:57:59 +02:00
Marek Olšák
b278aba423 r600g: use u_box_origin_2d helper function 2012-07-09 13:57:59 +02:00
Marek Olšák
1f50f463eb gallium/u_blitter: consolidate some state changes 2012-07-09 13:57:59 +02:00
Marek Olšák
22d032707e r600g: remove stray semicolon 2012-07-07 15:09:57 +02:00
Marek Olšák
461e9f99c7 docs: document ARB_blend_func_extended and EXT_texture_rg in relnotes-8.1
also sort the extensions
2012-07-07 15:09:57 +02:00
Eric Anholt
1e28f55ab7 i965/fs: Invalidate live intervals after copy propagation.
For copy propagation, we've dropped the use of a GRF in favor of a
(probably later) use of a different GRF.  This definitely requires
invalidating intervals.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-06 14:20:33 -07:00
Eric Anholt
2343fe9a5d i965/fs: Invalidate live intervals in passes that remove an instruction.
Since live intervals are based on ip, removing an instruction trashes
the intervals unless we were to go do some surgery.  These happen to
usually remove a use of a grf, so it's time to recalculate, anyway.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 8.0 release branch.
2012-07-06 14:20:33 -07:00
Eric Anholt
25ca9cc823 i965/vs: Move the other two src_reg/dst_reg constructors to brw_vec4.cpp.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-06 14:20:33 -07:00
Eric Anholt
b2f5d4c3ec i965/vs: Move class functions to brw_vec4.cpp.
This has less impact than for the FS (4k savings), because it was partially
done already, but makes things more consistent.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-06 14:20:32 -07:00
Eric Anholt
fe27916ddf i965/fs: Move class functions from the header to .cpp files.
Cuts compile time for brw_fs.h changes from 2.7s to .7s and reduces
i965_dri.so size by 70k.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-06 14:20:32 -07:00
José Fonseca
8b1f1900d1 galahad: Check that texture format is supported. 2012-07-06 20:38:41 +01:00
José Fonseca
ff8ddf399a galahad: More detailed resource checks. 2012-07-06 20:22:29 +01:00
José Fonseca
f8e13e6d69 galahad: Fix zealous warnings. 2012-07-06 20:12:56 +01:00
José Fonseca
7bd926af89 galahad: Enumerate all methods that are missing. 2012-07-06 19:13:44 +01:00
José Fonseca
3d2550be9c galahad: Implement render_condition. 2012-07-06 18:45:14 +01:00
José Fonseca
5b45775e41 galahad: Don't implement context methods that are not implemented by the underlying pipe driver. 2012-07-06 18:38:51 +01:00
José Fonseca
3cb994afca galahad: Use debug_printf.
stderr is not visible on Windows.
2012-07-06 18:38:39 +01:00
José Fonseca
1abb070633 galahad: Silence creation messages.
Let galahad warnings be true warnings.
2012-07-06 18:37:48 +01:00
José Fonseca
d78dee1671 galahad: Use reference counting when destroying the wrapped objects.
As the wrapped pipe driver may hold internal references.
2012-07-06 18:35:44 +01:00
José Fonseca
fe602da63f galahad: Point to the galahad objects from the galahad sampler view.
And not the wrapped driver's objects.
2012-07-06 18:35:32 +01:00
José Fonseca
04d29afb8b galahad: Don't defer index buffer when it's NULL. 2012-07-06 17:02:39 +01:00
José Fonseca
232073b0d9 target-helpers: Enable debug helpers only on debug builds.
Some of these helpers use debug_get_option, which also works on release builds.
2012-07-06 15:05:16 +01:00
Marek Olšák
c445b0f76d st/mesa: only expose ARB_shader_bit_encoding with GLSL 1.3
I don't think it's possible or even useful to use the extension with GLSL 1.2.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-06 00:45:38 +02:00
Kristian Høgsberg
5f5746a692 egl_dri2: Reorganize the EGLImage constructors to share more code
We factor out all the EGL book-keeping into dri2_create_image() and
simplify the wayland case by using dupImage.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2012-07-05 14:22:07 -04:00
Kristian Høgsberg
1bb15c0a08 intel: Share common __DRIimage allocation code
We have the same switch and allocation code in two places.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-05 14:22:07 -04:00
Kristian Høgsberg
454fc07dde intel: Just look up image->internal_format using _mesa_get_format_base_format
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-05 14:22:07 -04:00
Kristian Høgsberg
e408c17767 intel: Remove unused __DRIimage::data_type field
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-05 14:22:06 -04:00
Brian Paul
bbe92dc608 svga: whitespace fixes 2012-07-05 08:07:26 -06:00
Brian Paul
76a6801240 Revert "mesa: #define fprintf to be __mingw_fprintf() on Mingw32"
This reverts commit cbffaf20e9.

Use the PRIx64 macro in the fprintf() call instead, as suggested
by Dylan Noblesmith.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-07-05 08:07:26 -06:00
Brian Paul
df2d81ea59 mesa: use the PRIx64 macro for printing 64-bit hexadecimal values
We'll revert the #define fprintf __mingw_fprintf change next.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-07-05 08:07:25 -06:00
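A small sketch of the PRIx64 approach the two commits above settle on. The helper name format_hex64 is made up for this example; PRIx64 itself is the standard macro from inttypes.h, which expands to the correct printf length modifier for uint64_t on the target C library.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Format a 64-bit value as zero-padded hex without hard-coding "%llx" or "%lx". */
int format_hex64(char *buf, size_t size, uint64_t value)
{
    assert(buf != 0 && size > 0);
    /* Adjacent string literals are concatenated, so this becomes e.g. "0x%016llx". */
    return snprintf(buf, size, "0x%016" PRIx64, value);
}
```

The same macro works in fprintf() calls, so no redefinition of fprintf is needed.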
Brian Paul
1ab37a2284 svga: implement TGSI_OPCODE_ROUND
ROUND and TRUNC are implemented with one function to reduce code duplication.
Note: ROUND isn't actually used yet, but probably will be soon.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-07-05 08:07:03 -06:00
Brian Paul
d594f72e16 svga: fix CMP translation for vertex shaders
Converting CMP to SLT+LRP didn't work when src2 or src3 was Inf/NaN.
That's the case for GLSL sqrt(0).  sqrt(0) actually happens in many
piglit auto-generated tests that use the distance() function.

v2: remove debug/devel code, per Jose

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-07-05 08:03:19 -06:00
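To illustrate why an SLT+LRP lowering of CMP breaks on Inf/NaN inputs, here is a scalar sketch. The op_* functions spell out the usual TGSI-style semantics in plain C; they are not the svga driver's code, and the function names are invented for the example. The point is that LRP multiplies the unselected operand by zero, and 0 * Inf is NaN.

```c
#include <assert.h>
#include <math.h>   /* INFINITY macro only; no libm calls are made */

/* TGSI-style semantics written out as scalar C. */
float op_slt(float a, float b) { return a < b ? 1.0f : 0.0f; }
float op_lrp(float t, float a, float b) { return t * a + (1.0f - t) * b; }
float op_cmp(float c, float a, float b) { return c < 0.0f ? a : b; }

/* The lowering the commit message describes: CMP as SLT followed by LRP. */
float cmp_via_slt_lrp(float c, float a, float b)
{
    return op_lrp(op_slt(c, 0.0f), a, b);
}

void cmp_lowering_demo(void)
{
    float r;

    /* A true select simply ignores the operand that was not chosen. */
    assert(op_cmp(-1.0f, 2.0f, INFINITY) == 2.0f);

    /* The arithmetic version computes 1*2 + 0*Inf, which is NaN. */
    r = cmp_via_slt_lrp(-1.0f, 2.0f, INFINITY);
    assert(r != r);   /* only NaN compares unequal to itself */
}
```

For finite operands both versions agree, which is why the problem only shows up in cases such as sqrt(0) that introduce Inf or NaN.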
Brian Paul
30f8575fde svga: properly implement TRUNC instruction
Was previously implemented with FLOOR.
Fixes quite a few piglit tests of float->int conversion, integer
division, etc.

v2: clean up left over debug/devel code, per Jose

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-07-05 08:03:19 -06:00
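The difference between TRUNC and FLOOR only shows for negative non-integers, which is why float->int conversion and integer division tests were affected. A quick sketch in plain C (function names are made up for the example, and inputs are assumed to fit in an int):

```c
#include <assert.h>

/* Round toward zero: what a float->int conversion (TRUNC) must do. */
int trunc_to_int(float x) { return (int)x; }

/* Round toward negative infinity (FLOOR), built on the cast above. */
int floor_to_int(float x)
{
    int i = (int)x;
    return ((float)i > x) ? i - 1 : i;
}

void trunc_vs_floor_demo(void)
{
    /* Positive values: both agree. */
    assert(trunc_to_int(2.7f) == 2 && floor_to_int(2.7f) == 2);
    /* Negative values: they differ by one. */
    assert(trunc_to_int(-2.7f) == -2);
    assert(floor_to_int(-2.7f) == -3);
}
```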
Brian Paul
0bd3a75de9 svga: fix register collision issue in emit_conditional()
If the 'dst' register is the same as the 'pass' register we'll generate
invalid code.  Use a temporary register in that case.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-07-05 08:03:19 -06:00
Brian Paul
9b3d87b092 svga: emit some debug messages when shader compilation fails 2012-07-05 07:59:20 -06:00
Eric Anholt
33526a2ffe intel: Fix a comment typo. 2012-07-04 13:59:14 -07:00
Gwenole Beauchesne
69f031cc19 mesa: add GL_EXT_texture_rg extension for OpenGL ES 2.x. 2012-07-04 15:26:22 -04:00
Kristian Høgsberg
3ed8d42853 GLES2: upgrade gl2ext.h to version 18099
Redo this commit, and remove the inclusion of gl2ext.h
from src/mapi/glapi/glapi_priv.h.  The include was added in
8f3be33985 to fix a missing prototype for
glDrawBuffersNV and others, but it's not possible to include both
glext.h and gl2ext.h from the same file.

I don't see the missing prototype here (with or without shared glapi)
so I'm just removing the offending #include.

Also, since we're redoing this, update to the most recent gl2ext.h.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2012-07-04 15:26:22 -04:00
Olivier Galibert
e620f3e763 mesa/st: gl_ClipDistance must be interpolated in 3d space.
That old bug was hidden by the clipper always interpolating in 3d space
no matter what it should have been doing.  Now that the interpolation
has been fixed, the bug shows up.

Fixes fdo 51364.

Signed-off-by: Olivier Galibert <galibert@pobox.com>
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-07-04 10:47:14 +01:00
Stuart Abercrombie
95ce454c8c gallium/util: Save and restore vertex buffer state in util_gen_mipmap.
Calling glGenerateMipmap could overwrite vertex buffer state, leading
to incorrect rendering or crashes depending on the Gallium driver.

This was happening on WebGL Conformance test texture-size.

Before 784dd51198 this was covered up
by redundant vertex buffer validation.

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2012-07-04 03:48:29 +02:00
Marek Olšák
567fcd2eb9 Revert "GLES2: upgrade gl2ext.h to version 16994."
This reverts commit 8818b88748.

I get a lot of errors like this one:

In file included from ../../../src/mapi/glapi/glapi_priv.h:49:0,
                 from glapi_dispatch.c:40:
../../../include/GLES2/gl2ext.h:1074:28: error: redefinition of typedef ‘PFNGLRENDERBUFFERSTORAGEMULTISAMPLEEXTPROC’
../../../include/GL/glext.h:10237:25: note: previous declaration of ‘PFNGLRENDERBUFFERSTORAGEMULTISAMPLEEXTPROC’ was here

This is with a clean build (with git clean -fdX).

I don't get the errors on my other machine. I didn't investigate why,
a wild guess is that this depends on the version of gcc.
2012-07-04 01:40:05 +02:00
Marek Olšák
2668aaa557 Revert "mesa: add GL_EXT_texture_rg extension for OpenGL ES 2.x."
This reverts commit d1665388ce.
2012-07-04 01:39:52 +02:00
Gwenole Beauchesne
d1665388ce mesa: add GL_EXT_texture_rg extension for OpenGL ES 2.x. 2012-07-03 16:23:38 -04:00
Gwenole Beauchesne
8818b88748 GLES2: upgrade gl2ext.h to version 16994. 2012-07-03 16:23:38 -04:00
Eric Anholt
dd4282e38f i965/fs: Allow copy propagation on uniforms.
This is a big win for savage2, hon and yofrankie.  62 new programs for
savage2/hon get 16-wide mode, along with one for humus demos and two
for tropics.  Even a few shaders from tropics see reductions of 15% or
more.

total instructions in shared programs: 216536 -> 207353 (-4.24%)
instructions in affected programs:     123941 -> 114758 (-7.41%)

In benchmarking Tropics, only a .040% +/- 034% performance improvement
was observed (n=90).  Rather disappointing, but I was primarily
motivated to do this patch by a regression in the number of 16-wide
shaders compiled after a GRF texturing on IVB patch I'm working on.
Hopefully this helps avoid that regression.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-03 12:57:10 -07:00
Eric Anholt
0c4630bae0 i965/fs: Allow copy propagation with source modifiers.
This shaves a few instructions off of a ton of programs.  For 12
shaders from tropics and sanctuary, it's enough reduction in register
pressure to get 16-wide mode.  7 shaders from heroes of newerth and
savage2 are hurt by about 1.1%, where copy propagation of negates ends
up preventing coalescing, but we could regain that by doing dataflow
analysis in our copy propagation.

No significant performance difference in tropics (n=11)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-03 12:57:04 -07:00
Eric Anholt
458f7f0141 i965/fs: Move copy propagation test out to a separate function.
It's going to get more complicated in a moment.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-03 12:55:47 -07:00
Ian Romanick
5fb178ee43 glx/tests: Fix off-by-one error in allocating extension string buffer
NOTE: This is a candidate for the 8.0 release branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50621
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=418161
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Markus Oehme <oehme.markus@gmx.de>
2012-07-03 12:28:45 -07:00
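The commit message does not say what form the off-by-one took, so the following is only a sketch of the most common variant when allocating a string buffer: forgetting the byte for the terminating NUL. The function name copy_extension_string is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate a string; the "+ 1" reserves room for the terminating NUL. */
char *copy_extension_string(const char *src)
{
    size_t len;
    char *dst;

    assert(src != NULL);
    len = strlen(src);
    dst = malloc(len + 1);          /* malloc(len) would be one byte short */
    if (dst != NULL)
        memcpy(dst, src, len + 1);  /* copies the NUL as well */
    return dst;
}
```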
Brian Paul
1853f467c6 glsl: fix unop/binop errors in comments 2012-07-03 09:42:59 -06:00
Paul Berry
f34764ea53 msaa: Make meta-ops save and restore state of GL_MULTISAMPLE.
The meta-ops _mesa_meta_Clear() and _mesa_meta_glsl_Clear() need to
ignore the state of GL_SAMPLE_ALPHA_TO_COVERAGE,
GL_SAMPLE_ALPHA_TO_ONE, GL_SAMPLE_COVERAGE, GL_SAMPLE_COVERAGE_VALUE,
and GL_SAMPLE_COVERAGE_INVERT when clearing multisampled buffers.  The
easiest way to accomplish this is to disable GL_MULTISAMPLE during the
clear meta-ops.

Note: this patch also causes GL_MULTISAMPLE to be disabled during
_mesa_meta_GenerateMipmap() and _mesa_meta_GetTexImage() (since those
two meta-ops use MESA_META_ALL).  Arguably this isn't strictly
necessary, since those meta-ops use their own non-MSAA fbo's, but it
shouldn't do any harm.

Fixes Piglit tests "EXT_framebuffer_multisample/clear {2,4}
{color,stencil}" on i965.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-07-02 14:09:27 -07:00
Paul Berry
8313f44409 i965/msaa: Fix centroid interpolation of unlit pixels.
From the Ivy Bridge PRM, Vol 2 Part 1 p280-281 (3DSTATE_WM:
Barycentric Interpolation Mode):

    "Errata: When Centroid Barycentric mode is required, HW may
    produce incorrect interpolation results when a 2X2 pixels have
    unlit pixels."

To work around this problem, after doing centroid interpolation, we
replace the centroid-interpolated values for unlit pixels with
non-centroid-interpolated values (which are interpolated at pixel
centers).  This produces correct rendering at the expense of a slight
increase in shader execution time.

I've conditioned the workaround with a runtime flag
(brw->needs_unlit_centroid_workaround) in the hopes that we won't need
it in future chip generations.

Fixes piglit tests "EXT_framebuffer_multisample/interpolation {2,4}
{centroid-deriv,centroid-deriv-disabled}".  All MSAA interpolation
tests pass now.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-02 13:27:36 -07:00
Paul Berry
3f929efa28 i965/fs: Add FS_OPCODE_MOV_DISPATCH_TO_FLAGS to fragment shader backend.
In order to compute centroid varyings correctly, the fragment shader
needs to be able to load the current pixel/sample mask into a flag
register.  This patch adds an opcode to the fragment shader back-end
to do this; the opcode gets translated into the instruction

mov(1)  f0<1>UW  g1.14<0,1,0>UW  { align1 WE_all }

Since this instruction clobbers f0, instruction scheduling has to
treat it the same as instructions that have a conditional modifier.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-02 13:27:36 -07:00
Jordan Justen
8aa78c104a i965: fix transform feedback with primitive restart
When querying GL_PRIMITIVES_GENERATED, if primitive restart
is also used, then take the software primitive restart
path so GL_PRIMITIVES_GENERATED is returned correctly.

GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN is also updated
since it is also affected by the same issue.

As noted in brw_primitive_restart.c, with further work we
should be able to move this situation back to a hardware
handled path.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-02 11:42:48 -07:00
Kenneth Graunke
14311ef3f2 i965: Re-enable rendering to SNORM formats.
Commit d73f6375f5 fixed the cause of the Piglit failure with
ARB_color_buffer_float fragment clamp modes.  Now that it's fixed,
there's no reason to leave snorm format rendering disabled.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-02 11:23:37 -07:00
Kenneth Graunke
b1802a2115 glsl: Remove unused ir_loop_jump::loop pointer.
Commit 0c005bd7 intended to make ir_loop_jump::mode public, but also
accidentally added a new pointer to the enclosing loop.  Furthermore, it
tried to initialize the new field by adding "this->loop = loop;" to the
constructor, but since there is no loop parameter, this only initialized
the field to itself---so it will likely be a garbage pointer.

A lot of code, such as lower_jumps, allocates new loop jumps without
setting this field appropriately, so any uses would probably just crash.

Thankfully, there were none, so we can just delete the field.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51574
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-07-02 11:08:59 -07:00
Kenneth Graunke
d73f6375f5 meta: Don't alter fragment color clamp in DrawPixels().
DrawPixels uses the MESA_META_CLAMP_FRAGMENT_COLOR flag to save/restore
the fragment color clamp mode.  This is unnecessary since it never
alters it.  It's also harmful: when the clamp mode is GL_FIXED_ONLY,
setting this flag causes _mesa_meta_begin to force it to GL_FALSE,
breaking clamping on SNORM formats.

DrawPixels should use the user-specified clamp mode and not change it.

Fixes Piglit's spec/ARB_color_buffer_float/GL_RGBA8_SNORM-drawpixels
test on i965/Sandybridge (with SNORM render targets re-enabled).

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-02 11:08:48 -07:00
Marek Olšák
9f0f2f9512 mesa: use FLUSH_CURRENT and not FLUSH_VERTICES in _mesa_validate_*
ASSERT_OUTSIDE_BEGIN_END_AND_FLUSH_WITH_RETVAL calls FLUSH_VERTICES, which
is not what we want.

This fixes a breakage in classic drivers, introduced in:

  62b9716739
  vbo: first ASSERT_OUTSIDE_BEGIN_END then FLUSH, not the other way around

It should fix:
  https://bugs.freedesktop.org/show_bug.cgi?id=51629
  https://bugs.freedesktop.org/show_bug.cgi?id=51642

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-02 17:48:36 +02:00
Dylan Noblesmith
876889b355 mesa: point to Makefile.old in the srcdir
Gets out-of-tree builds slightly closer to working.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-02 15:14:46 +00:00
Dylan Noblesmith
91ecba9d05 mesa: fix parser source gen for out-of-tree builds
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-02 15:14:39 +00:00
Dylan Noblesmith
261b1389eb mesa: fix api source gen for out-of-tree builds
Add $(srcdir) where needed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-02 15:14:27 +00:00
Dylan Noblesmith
43bca86c1b glapi/gen: fix out of tree build
Add "-f $(srcdir)/gl_API.xml" to the arguments of all
the scripts that by default look for gl_API.xml in the
working directory when run with no arguments, and prepend
$(srcdir) to those scripts that are already using an
explicit -f argument.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-07-02 15:13:58 +00:00
José Fonseca
f5c41e16d7 gallium/tgsi: Don't declare temps individually when they are all similar.
tgsi_ureg was recently enhanced to support local temporaries, and as a result
temps are declared individually.

This change avoids many TEMP register declarations on common shaders.

(And fixes performance regression due to mismatches against performance
sensitive shaders.)

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-02 12:14:53 +01:00
José Fonseca
e75fe7ba08 gallivm: Cleanup the 4 x float -> 16 ub special path in lp_build_conv.
No behaviour change intended.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-07-02 12:13:52 +01:00
José Fonseca
63e0e4b8f5 gallium/util: Add ULL suffix to large constants.
As suggested by Andy Furniss: it looks like some old gcc versions
require it.
2012-07-02 12:12:42 +01:00
Tom Stellard
1d21bd057a clover: Handle NULL devs argument in clBuildProgram
If devs is NULL, then the kernel should be compiled for all devices
associated with the program.
2012-07-01 15:45:24 +02:00
Francisco Jerez
c6bb41c28b clover: Define non-templated copy constructor for clover::ref_ptr.
The templated copy constructor doesn't prevent the compiler from
emitting a default copy constructor, which leads to inconsistent
memory handling and was reported to cause segfaults when doing event
manipulation.

Reported-by: Tom Stellard <thomas.stellard@amd.com>
2012-07-01 15:37:30 +02:00
Brian Paul
db2b6ca504 llvmpipe: fix comment typo 2012-06-29 17:19:12 -06:00
Brian Paul
9dfe92019a st/mesa: use DEBUG_INCOMPLETE_FBO debug flag 2012-06-29 17:19:12 -06:00
Brian Paul
b186a9df32 mesa: remove some unused gl_dlist_state fields 2012-06-29 17:19:12 -06:00
Tom Stellard
ca8fa02308 clover: Add a function internalizer pass before LTO v2
The function internalizer pass marks non-kernel functions as internal,
which enables optimizations like function inlining and global dead-code
elimination.

v2:
  - Pass vector arguments by const reference
2012-06-29 18:46:18 +00:00
Tom Stellard
a31b2f7107 radeon/llvm: Enable vec4 loads on R600 2012-06-29 18:46:18 +00:00
Tom Stellard
e17c586d08 radeon/llvm: Enable floating point stores on R600 2012-06-29 18:46:18 +00:00
Tom Stellard
b66ef1f48c radeon/llvm: Handle floating point loads on R600 2012-06-29 18:46:18 +00:00
Tom Stellard
c01199dfc0 radeon/llvm: Expand UDIV and UREM nodes 2012-06-29 18:46:18 +00:00
Tom Stellard
2c485cda20 radeon/llvm: Emit raw ISA for vertex fetch instructions 2012-06-29 18:46:18 +00:00
José Fonseca
16e0ebccb6 gallium/util: Truly disable INF/NAN tests on MSVC.
Thanks to Brian for spotting this.
2012-06-29 14:49:23 +01:00
José Fonseca
c9bada497c gallium/util: Disable INF/NAN tests on MSVC.
Somehow they are not recognized as constants.
2012-06-29 13:39:07 +01:00
José Fonseca
fa8dcb848f translate: Free elt8_func/elt16_func too.
These were leaking.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-06-29 12:21:08 +01:00
James Benton
6dd8e6f9cb util: Reimplement half <-> float conversions.
Removed u_half.py used to generate the table for previous method.

The previous implementation of float to half conversion was faulty for
denormals and NaNs and would require extra logic to fix,
thus making the speedup of using tables irrelevant.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-06-29 12:21:02 +01:00
James Benton
c8d3481cdb tests: Updated tests to properly handle NaN for half floats.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-06-29 12:20:59 +01:00
James Benton
60dca53833 util: Updated u_format_tests to rigidly test half-float boundary values.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-06-29 12:20:57 +01:00
James Benton
d069d8ef38 util: Added functions for checking NaN / Inf for double and half-floats.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-06-29 12:20:54 +01:00
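For reference, NaN and Inf checks on half-floats reduce to bit tests on the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 mantissa bits). The sketch below uses made-up function names and is not claimed to match the helpers this commit added.

```c
#include <assert.h>
#include <stdint.h>

/* NaN: exponent all ones and a non-zero mantissa. */
int half_is_nan(uint16_t h)
{
    return (h & 0x7c00) == 0x7c00 && (h & 0x03ff) != 0;
}

/* Inf: exponent all ones and a zero mantissa; the sign bit is ignored. */
int half_is_inf(uint16_t h)
{
    return (h & 0x7fff) == 0x7c00;
}

void half_class_demo(void)
{
    assert(half_is_inf(0x7c00));    /* +Inf */
    assert(half_is_inf(0xfc00));    /* -Inf */
    assert(half_is_nan(0x7e00));    /* a quiet NaN */
    assert(!half_is_nan(0x3c00));   /* 1.0 */
}
```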
James Benton
34075d4133 util: Added util_format_is_array.
This function checks whether a format description is in a simple array format.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-06-29 12:20:37 +01:00
Marek Olšák
fcebb157f0 vbo: optimize validation for glMultiDrawElements
Some parameters need to be checked only once.
check_valid_to_render needs to be called only once.

The validate function is based on the one for DrawElements.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-06-28 22:46:51 +02:00
Marek Olšák
62b9716739 vbo: first ASSERT_OUTSIDE_BEGIN_END then FLUSH, not the other way around
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-06-28 22:46:51 +02:00
Marek Olšák
d9eb1a1225 vbo: don't call twice _mesa_valid_to_render in DrawArraysInstancedBaseInstance
It's called in _mesa_validate_DrawArraysInstanced already.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-06-28 22:46:51 +02:00
Marek Olšák
15ac66e331 mesa: rename MaxTransformFeedbackSeparateAttribs to MaxTransformFeedbackBuffers
This is a cleanup for ARB_transform_feedback3, where
GL_MAX_TRANSFORM_FEEDBACK_BUFFERS is introduced for interleaved attribs and
has the same meaning as GL_MAX_.._SEPARATE_ATTRIBS for separate attribs.

Also, the maximum number of TFB buffers is reduced from 32 to 4, which makes
this patch useful even without the extension.
I don't know of any hardware which can do more than 4.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-06-28 22:46:51 +02:00
José Fonseca
638779e445 gallivm: Refactor lp_build_broadcast(_scalar) to share code.
Doesn't really change the generated assembly, but produces more compact IR,
and of course, makes code more consistent.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-06-28 20:20:34 +01:00
Johannes Obermayr
bf679ce1dc gallivm: Fix potential buffer overflowing in strncat.
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-06-28 11:47:23 +01:00
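A frequent source of strncat overflows is passing the destination's total size as the third argument. That argument is the maximum number of characters to append, so the space already used and the terminating NUL must be subtracted. The sketch below shows the safe pattern; bounded_append is a hypothetical helper, and the commit message does not say whether this was the exact mistake in gallivm.

```c
#include <assert.h>
#include <string.h>

/* Append src to the NUL-terminated string in dst without writing past
 * dst[size - 1]. */
void bounded_append(char *dst, size_t size, const char *src)
{
    size_t used = strlen(dst);

    assert(used < size);
    /* Room left = total size - bytes in use - 1 for the NUL strncat adds. */
    strncat(dst, src, size - used - 1);
}
```

The result is silently truncated when src does not fit, which is usually preferable to writing past the end of the buffer.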
Marcin Slusarz
1906d2b46b nv50: dynamically allocate space for shader local storage
Fixes 21 piglit tests:
spec/glsl-1.10/execution/variable-indexing/
fs-temp-array-mat4-index-col-row-wr
vs-temp-array-mat4-index-col-row-wr
vs-temp-array-mat4-index-row-wr

spec/glsl-1.20/execution/variable-indexing/
fs-temp-array-mat3-index-col-row-rd
fs-temp-array-mat3-index-row-rd
fs-temp-array-mat4-col-row-wr
fs-temp-array-mat4-index-col-row-rd
fs-temp-array-mat4-index-col-row-wr
fs-temp-array-mat4-index-row-rd
fs-temp-array-mat4-index-row-wr
vs-temp-array-mat3-index-col-row-rd
vs-temp-array-mat3-index-col-row-wr
vs-temp-array-mat3-index-row-rd
vs-temp-array-mat3-index-row-wr
vs-temp-array-mat4-col-row-wr
vs-temp-array-mat4-index-col-row-rd
vs-temp-array-mat4-index-col-row-wr
vs-temp-array-mat4-index-col-wr
vs-temp-array-mat4-index-row-rd
vs-temp-array-mat4-index-row-wr
vs-temp-array-mat4-index-wr

... and prevents a lot of GPU lockups
2012-06-28 00:01:02 +02:00
Marcin Slusarz
0fceaee4fd nv50: streamline screen_create error handling
Remove macro which changes control flow (it's evil).
Make all fail paths print (correct) error message.
2012-06-28 00:01:02 +02:00
Marcin Slusarz
96259b5128 nv50/ir: make colorful ir dump output optional 2012-06-28 00:01:02 +02:00
Brian Paul
9881bf6e69 mesa: more const qualifiers to match the latest glext.h
For some reason regular gcc on Linux didn't catch these but the mingw
compiler did (generated errors, not warnings).

v2: include the changes in src/mapi/ too
2012-06-27 15:37:10 -06:00
Brian Paul
827bdee7d1 glapi: add const qualifier to glShaderSourceARB() parameter
Fixes the es2 build with gcc.

Note: in glext.h the prototypes for glShaderSource() and glShaderSourceARB()
disagree:  only the former has the extra const qualifier.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-06-27 15:37:10 -06:00
Jordan Justen
3588098ed8 i965: enable ARB_instanced_arrays extension
Set the step_rate value when drawing to implement
ARB_instanced_arrays for gen >= 4.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-06-27 10:35:14 -07:00
Brian Paul
8fb1e4a462 glsl: be more careful about counting varying vars in the linker
Previously, we were counting gl_FrontFacing, gl_FragCoord and gl_PointCoord
against the limit of varying variables.  This prevented some valid shaders
from linking.

The other potential solution to this is to have the driver advertise
more varying vars or set the GLSLSkipStrictMaxVaryingLimitCheck flag.
But the above-mentioned variables aren't conventional varying attributes
so it doesn't seem right to count them.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-06-27 11:31:16 -06:00
Andreas Boll
d9d84068e7 docs/helpwanted: add some useful todo lists
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-06-27 11:19:21 -06:00
Brian Paul
098aa5f9ab softpipe: fix numFragsEmitted debug code 2012-06-27 07:50:57 -06:00
Brian Paul
81e2a238bc gallium: minor whitespace, comment changes 2012-06-27 07:50:57 -06:00
Brian Paul
51b0a0b33c mesa: update glext.h to version 81 2012-06-27 07:50:57 -06:00
Brian Paul
52dd8961eb mesa: update glxext.h to version 33 2012-06-27 07:50:57 -06:00
Brian Paul
8459f4a63a mesa: make _mesa_reference_array_object() an inline function
As we do for texture objects, buffer objects, etc.
2012-06-27 07:50:57 -06:00
Brian Paul
dcf1dafa9e mesa: look up enum name for glEnable/Disable errors 2012-06-27 07:50:56 -06:00
Brian Paul
86ccd9aaac mesa: move TEXGEN defines closer to gl_texgen struct 2012-06-27 07:50:56 -06:00
Brian Paul
4cb3579e52 mesa: rename ColorMaterialBitmask to _ColorMaterialBitmask
Since it's a derived field.
2012-06-27 07:50:56 -06:00
Brian Paul
b114ff3783 mesa: re-order, update comments on lighting-related structs 2012-06-27 07:50:56 -06:00
José Fonseca
d1c5ea9207 gallium/util: Fix parsing of options with underscore.
For example

  GALLIVM_DEBUG=no_brilinear

which was being parsed as two options, "no" and "brilinear".
2012-06-27 11:16:18 +01:00
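The kind of tokenizing involved can be sketched as follows. This is not the gallium/util code; option_list_has and is_name_char are names invented for the example. It only shows that treating '_' as part of an option name keeps "no_brilinear" as one token.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Characters that belong to an option name. Leaving '_' out of this set
 * would split "no_brilinear" into "no" and "brilinear". */
static int is_name_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Return 1 if `name` appears as a whole token in `str`, else 0. */
int option_list_has(const char *str, const char *name)
{
    size_t name_len = strlen(name);

    assert(name_len > 0);
    while (*str) {
        const char *start;

        while (*str && !is_name_char(*str))   /* skip separators */
            str++;
        start = str;
        while (*str && is_name_char(*str))    /* consume one token */
            str++;
        if ((size_t)(str - start) == name_len &&
            strncmp(start, name, name_len) == 0)
            return 1;
    }
    return 0;
}
```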
James Benton
789436f1e0 gallivm: Added a generic lp_build_print_value which prints a LLVMValueRef.
Updated lp_build_printf to share common code.
Removed specific lp_build_print_vecX.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-06-27 11:16:18 +01:00
Stéphane Marchesin
45fc069600 i915g: Implement sRGB textures
Since we don't have them in hw we emulate them in the shader. Although not
recommended by the spec it is legit.

As a side effect we also get GL 2.1. I think this is as far as we can take
the i915.
2012-06-26 23:18:15 -07:00
Brian Paul
3bc39414ab svga: return 120 for PIPE_CAP_GLSL_FEATURE_LEVEL
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-06-26 17:03:33 -06:00
Brian Paul
ac8613c298 llvmpipe: return 120 for PIPE_CAP_GLSL_FEATURE_LEVEL
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-06-26 17:03:33 -06:00
Carl Worth
d8e61f8f86 glsl: glcpp: Extend testing of #line directives
The most recent commit adds support for comments and macro expansion
on #line directives. Add testing to verify the new features.

Signed-off-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-06-26 15:23:55 -07:00
Carl Worth
aac78ce823 glsl: glcpp: Move handling of #line directives from lexer to parser.
The GLSL specification requires that #line directives be interpreted
after macro expansion. Our existing implementation of #line macros in
the lexer prevents conformance on this point.

Moving the handling of #line from the lexer to the parser gives us the
macro expansion we need. An additional benefit is that the
preprocessor also now supports comments on the same line as #line
directives.

Finally, the preprocessor now emits the (fully-macro-expanded) #line
directives into the output. This allows the full GLSL compiler to also
see and interpret these directives so it can also generate correct
line numbers in error messages.

Signed-off-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-06-26 15:23:49 -07:00
Carl Worth
39f8c46eaa glsl: glcpp: Rename and document _glcpp_parser_expand_if
This function is currently used only in the expansion of #if lines,
but we will soon be using it more generally (for the expansion of
#line directives). Rename it to _glcpp_parser_expand_and_lex_from and
add some more documentation.

Signed-off-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-06-26 15:21:16 -07:00
Carl Worth
1db463ce2e glsl: Consistently use length-based ralloc string functions for info_log.
Commit b823b99ec0 switched from using
functions such as ralloc_asprintf and ralloc_strcat to
ralloc_asprintf_rewrite_tail. This change maintains the string's
length as a parameter that is updated by the ralloc functions (rather
than recomputing it with strlen over and over).

However, the change failed to update two locations (glcpp_error and
glcpp_warning), with the result that the string's length wasn't
updated by these calls. Then, subsequent calls to other
ralloc_asprintf_rewrite_tail would overwrite the text appended by
glcpp_error.

This commit fixes the two missing updates, and restores line numbers
to the output of glcpp error messages, (as noticed by a glcpp unit
test case that has been failing since the above-mentioned commit).

Signed-off-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-06-26 15:20:53 -07:00
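The failure mode described above can be reproduced with a toy model. This is a sketch in Python under stated assumptions: `InfoLog`, `rewrite_tail`, and `append_without_length_update` are hypothetical names that only imitate the idea of a length-based append; they are not the ralloc API.

```python
class InfoLog:
    """Toy model of a log string whose length is tracked by the caller."""

    def __init__(self):
        self.text = ""
        self.length = 0   # where the next length-based append will write

    def rewrite_tail(self, s):
        # Length-based append: write at the tracked length, then update it.
        self.text = self.text[:self.length] + s
        self.length += len(s)

    def append_without_length_update(self, s):
        # The bug: text is appended, but the tracked length is left stale.
        self.text = self.text + s
```

Calling `rewrite_tail("a")`, then `append_without_length_update("ERR")`, then `rewrite_tail("b")` leaves the text as `"ab"`: the later length-based append starts at the stale length and discards the error text, which matches the lost glcpp messages described in the commit.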
Carl Worth
c96b8302a3 glsl: glcpp: Allow "#if undefined-macro" to evaluate to false.
A strict reading of the GLSL specification would have this be an
error, but we've received reports from users who expect the
preprocessor to interpret undefined macros as 0. This is the standard
behavior of the preprocessor for C, and according to these user
reports is also the behavior of other OpenGL implementations.

So here's one of those cases where we can make our users happier by
ignoring the specification. And it's hard to imagine users who really,
really want to see an error for this case.

The two affected test cases are updated to reflect the new behavior.

Signed-off-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-06-26 15:20:03 -07:00
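The relaxed rule above amounts to treating a missing macro as 0. A minimal sketch in Python, assuming a hypothetical `macros` dictionary from macro names to integer values (glcpp itself is a C lexer and parser, so this only models the outcome):

```python
def if_condition_is_true(identifier, macros):
    """Evaluate `#if identifier` the way the C preprocessor does.

    `macros` is a hypothetical dict mapping macro names to integer
    values; a name that is missing is treated as 0, not as an error.
    """
    return macros.get(identifier, 0) != 0
```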
Jerome Glisse
b75f1d973c r600g: enable DUAL_EXPORT mode when possible on r6xx/r7xx
DUAL_EXPORT can be enabled on r6xx/r7xx when all CBs use 16-bit export
and there is no depth/stencil export.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-06-27 02:06:55 +04:00
Vadim Girlin
470d00c0e2 r600g: enable DUAL_EXPORT mode when possible
It seems DUAL_EXPORT on evergreen may be enabled when all CBs use 16-bit export
mode (EXPORT_4C_16BPC), also there should be at least one CB, and the PS
shouldn't export depth/stencil.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2012-06-27 02:06:55 +04:00
Vadim Girlin
0c47d9dcab r600g: avoid unnecessary shader exports v2
In some cases TGSI shader has more color outputs than the number of CBs,
so it seems we need to limit the number of color exports. This requires
different shader variants depending on the nr_cbufs, but on the other hand
we are doing fewer exports, which are very costly.

v2: fix various piglit regressions

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-06-27 02:06:55 +04:00
Vadim Girlin
4acf71f01e r600g: cache shader variants instead of rebuilding v3
Shader variants are stored in the list, the key for lookup is based on the
states that require different hw shaders - currently it's rctx->two_side (all
gpus) and rctx->nr_cbufs (evergreen/cayman, when writes_all property is set).

v2:
 - use simple list instead of keymap as suggested by Marek on irc
 - call r600_adjust_gprs from r600_bind_vs_shader for r6xx/r7xx
   (r600_shader_select isn't used for vertex shaders currently)

v3:
 - fix call to r600_adjust_gprs - do it after updating current shader

Improves performance for some apps, e.g. FlightGear -
see https://bugs.freedesktop.org/show_bug.cgi?id=50360

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2012-06-27 02:06:55 +04:00
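The variant cache described above (a simple list searched by key) could look roughly like the following. This is an illustrative Python sketch, not the r600g code: `ShaderVariantCache`, `select`, and the `build` callback are hypothetical names, and the key is reduced to the two states named in the commit message.

```python
class ShaderVariantCache:
    """Keep one compiled variant per distinct state key."""

    def __init__(self, build):
        self._build = build      # callable that compiles a variant for a key
        self._variants = []      # simple list of (key, variant) pairs

    def select(self, two_side, nr_cbufs):
        key = (two_side, nr_cbufs)
        for existing_key, variant in self._variants:
            if existing_key == key:
                return variant   # reuse a previously built variant
        variant = self._build(key)
        self._variants.append((key, variant))
        return variant
```

A linear search is reasonable here on the assumption that only a handful of variants exist per shader, which is presumably why a plain list was preferred over a keymap.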
Brian Paul
55a89889ba svga: handle missing PIPE_CAP_x queries
And fix incorrect error message for a bad shader type/number.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-06-26 15:03:44 -06:00
Brian Paul
056e9b4511 llvmpipe: handle more PIPE_CAP_x queries
As with the previous commit for softpipe.

v2: remove 'default' case to get compile-time warning

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-06-26 15:03:44 -06:00
Brian Paul
7d23dcdacc softpipe: handle more PIPE_CAP_x queries
These all return zero.  Add a debug_printf() to catch the default case so
we don't accidentally mishandle something important in the future.

v2: remove 'default' case to get compile-time warning

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-06-26 15:03:43 -06:00
Brian Paul
80efb524ee svga: return 1 for PIPE_CAP_MIXED_COLORBUFFER_FORMATS
This is actually required for GL_ARB_framebuffer_object, but the state
tracker doesn't currently check it.
Direct3D 9 allows mixed format color buffers with some restrictions.
Setting this allows Unigine Heaven 2.5 and 3.0 to run.  Tested both on
GL and D3D hosts.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
2012-06-26 15:03:43 -06:00
Brian Paul
36b3ee2ffc glsl: fix comment typo 2012-06-26 10:01:03 -06:00
Olivier Galibert
27e94ba4ea u2f_emit: Fix type parameter in LLVM call.
The type is the destination type (i.e. float vector) and not the
source type.  Fixes piglit fs-{in,de}crement-uint.

Signed-off-by: Olivier Galibert <galibert@pobox.com>
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-06-26 16:55:40 +01:00
Paul Berry
6c355cca91 i965/msaa: Set KILL_ENABLE when GL_ALPHA_TO_COVERAGE enabled.
i965 hardware needs to be informed of situations in which it's
possible for pixels (or samples) to be discarded for reasons other
than depth/stencil testing (e.g. due to an explicit "discard" in the
fragment shader).  One of these situations is when
GL_ALPHA_TO_COVERAGE is enabled, since that can cause samples to be
discarded by the color calculator when the pixel's alpha value is less
than 1.0.

Without this patch, GL_ALPHA_TO_COVERAGE does not take effect on depth
buffers.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-06-26 07:45:54 -07:00
Paul Berry
bc53e14d98 i965/msaa: Implement GL_SAMPLE_ALPHA_TO_{COVERAGE,ONE}.
This patch enables the multisampling parameters
GL_SAMPLE_ALPHA_TO_COVERAGE and GL_SAMPLE_ALPHA_TO_ONE, which allow
the fragment shader's alpha output to be converted into a sample
coverage mask and ignored for blending.  i965 supports these
parameters through the BLEND_STATE structure.

The GL spec allows, but does not require, the implementation to dither
the conversion from alpha to a sample coverage mask, so that alpha
values that aren't a multiple of 1/num_samples result in the correct
proportion of samples being lit.  A bit exists in the BLEND_STATE
structure to enable this functionality, but according to the hardware
docs it must be disabled on Sandy Bridge (see the Sandy Bridge PRM,
Vol2, Part1, p379: AlphaToCoverage Dither Enable).  So it is enabled
for Gen7 only.

Fixes piglit tests
"EXT_framebuffer_multisample/sample-alpha-to-{coverage,one} {2,4}".

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-06-26 07:45:54 -07:00
Paul Berry
9ea60ce58f i965/msaa: Implement glSampleCoverage.
This patch enables glSampleCoverage() functionality, which allows the
client program to specify that only a portion of the samples be lit up
when performing multisampled rendering.  i965 supports
glSampleCoverage() through the 3DSTATE_SAMPLE_MASK command packet,
which allows the driver to specify a bitfield indicating which samples
to light up.

Fixes piglit tests "EXT_framebuffer_multisample/sample-coverage {2,4}
{inverted,non-inverted}".

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2012-06-26 07:45:54 -07:00
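To make the "bitfield indicating which samples to light up" concrete, here is a hedged Python sketch. The rounding rule and the choice of which bits to light are assumptions for illustration; the commit message does not state how the driver derives the mask, and `sample_coverage_mask` is a hypothetical name.

```python
def sample_coverage_mask(coverage, invert, num_samples):
    """Build a bitfield with one bit per sample.

    Assumption: the lowest round(coverage * num_samples) bits are lit;
    the mapping used by a real driver or GPU may differ.
    """
    coverage = min(max(coverage, 0.0), 1.0)      # clamp to [0, 1]
    lit = int(coverage * num_samples + 0.5)      # round to nearest count
    all_samples = (1 << num_samples) - 1
    mask = (1 << lit) - 1
    if invert:
        mask = ~mask & all_samples               # flip within num_samples bits
    return mask
```

For example, a coverage of 0.5 with 4 samples lights two of the four bits, and the inverted case lights the other two.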
José Fonseca
4bde1ba7fb st/wgl: Add a few more comments. 2012-06-26 10:15:36 +01:00
Marek Olšák
cc2cd8b356 r600g: don't disable streamout if it hasn't been started 2012-06-26 03:37:24 +02:00
Marek Olšák
496399d8e9 u_blitter: disable streamout before rendering
This fixes piglit EXT_transform_feedback tests:
- intervening-read output
- intervening-read prims_written
2012-06-26 03:37:23 +02:00
Chad Versace
cf0bbb30f6 i965/fs: Fix conversions float->bool, int->bool
Fixes gles2conform GL.equal.equal_bvec2_frag.

This fixes brw_fs_visitor's translation of ir_unop_f2b.  It used CMP to
convert the float to one of 0 or ~0. However, the convention in the
compiler is that true is represented by 1, not ~0. This patch adds an AND
to convert ~0 to 1.

By inspection, a similar problem existed with ir_unop_i2b, with a similar
fix.

[v2 kayden]: eliminate extra temporary register.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49621
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-06-25 15:56:40 -07:00
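The conversion fix above can be modeled with plain integers. This Python sketch only imitates the arithmetic (a compare that yields all ones, followed by an AND with 1); the function names are hypothetical and it is not the i965 backend code.

```python
MASK32 = 0xFFFFFFFF

def cmp_nonzero(value):
    """Model of a compare that writes all ones (~0) when true."""
    return MASK32 if value != 0 else 0

def to_bool(value):
    """Convert to the compiler's convention: true is 1, false is 0."""
    return cmp_nonzero(value) & 1
```

Without the final AND, comparing two "true" values produced by different paths (one 1, one ~0) would report them as unequal, which is consistent with the failing equality test named in the commit.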
3804 changed files with 374820 additions and 302564 deletions

View File

@@ -3,6 +3,7 @@
(tab-width . 8)
(c-basic-offset . 3)
(c-file-style . "stroustrup")
(fill-column . 78)
(eval . (progn
(c-set-offset 'innamespace '0)
(c-set-offset 'inline-open '0)))

7
.gitignore vendored
View File

@@ -4,6 +4,7 @@
*.ilk
*.la
*.lo
*.log
*.o
*.obj
*.os
@@ -17,6 +18,7 @@
*.tar
*.tar.bz2
*.tar.gz
*.trs
*.zip
*~
depend
@@ -36,8 +38,9 @@ config.py
build
libtool
manifest.txt
Makefile.in
.dir-locals.el
.deps/
.dirstamp
.libs/
/Makefile
Makefile
Makefile.in

View File

@@ -33,21 +33,24 @@ endif
LOCAL_C_INCLUDES += \
$(MESA_TOP)/include
MESA_VERSION=$(shell cat $(MESA_TOP)/VERSION)
# define ANDROID_VERSION (e.g., 4.0.x => 0x0400)
major := $(word 1, $(subst ., , $(PLATFORM_VERSION)))
minor := $(word 2, $(subst ., , $(PLATFORM_VERSION)))
LOCAL_CFLAGS += \
-DANDROID_VERSION=0x0$(major)0$(minor)
-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \
-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\" \
-DANDROID_VERSION=0x0$(MESA_ANDROID_MAJOR_VERSION)0$(MESA_ANDROID_MINOR_VERSION)
LOCAL_CFLAGS += \
-DPTHREADS \
-DHAVE_PTHREAD=1 \
-fvisibility=hidden \
-Wno-sign-compare
ifeq ($(strip $(MESA_ENABLE_ASM)),true)
ifeq ($(TARGET_ARCH),x86)
LOCAL_CFLAGS += \
-DUSE_X86_ASM
-DUSE_X86_ASM \
-DHAVE_DLOPEN \
endif
endif
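The ANDROID_VERSION encoding in the hunk above (e.g., 4.0.x => 0x0400) can be sketched in Python. This is an illustration of the string substitution only, assuming single-digit major and minor components and at least two dot-separated fields; `android_version_define` is a hypothetical name.

```python
def android_version_define(platform_version):
    """Mirror 0x0$(major)0$(minor) for a dotted version string."""
    # Like $(word 1, ...) and $(word 2, ...) after replacing dots with spaces.
    major, minor = platform_version.split(".")[:2]
    return "-DANDROID_VERSION=0x0%s0%s" % (major, minor)
```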

View File

@@ -24,12 +24,17 @@
# BOARD_GPU_DRIVERS should be defined. The valid values are
#
# classic drivers: i915 i965
# gallium drivers: swrast i915g nouveau r300g r600g radeonsi vmwgfx
# gallium drivers: swrast i915g ilo nouveau r300g r600g radeonsi vmwgfx
#
# The main target is libGLES_mesa. For each classic driver enabled, a DRI
# module will also be built. DRI modules will be loaded by libGLES_mesa.
MESA_TOP := $(call my-dir)
MESA_ANDROID_MAJOR_VERSION := $(word 1, $(subst ., , $(PLATFORM_VERSION)))
MESA_ANDROID_MINOR_VERSION := $(word 2, $(subst ., , $(PLATFORM_VERSION)))
MESA_ANDROID_VERSION := $(MESA_ANDROID_MAJOR_VERSION).$(MESA_ANDROID_MINOR_VERSION)
MESA_COMMON_MK := $(MESA_TOP)/Android.common.mk
MESA_PYTHON2 := python
@@ -37,7 +42,7 @@ DRM_TOP := external/drm
DRM_GRALLOC_TOP := hardware/drm_gralloc
classic_drivers := i915 i965
gallium_drivers := swrast i915g nouveau r300g r600g radeonsi vmwgfx
gallium_drivers := swrast i915g ilo nouveau r300g r600g radeonsi vmwgfx
MESA_GPU_DRIVERS := $(strip $(BOARD_GPU_DRIVERS))

View File

@@ -21,67 +21,49 @@
SUBDIRS = src
ACLOCAL_AMFLAGS = -I m4
doxygen:
cd doxygen && $(MAKE)
check-local:
$(MAKE) -C src/mapi/glapi/tests check
$(MAKE) -C src/mesa/main/tests check
$(MAKE) -C src/glsl/tests check
$(MAKE) -C src/glx/tests check
clean-local:
-@touch $(top_builddir)/configs/current
-@for dir in $(SUBDIRS) ; do \
if [ -d $$dir ] ; then \
(cd $$dir && $(MAKE) clean) ; \
fi \
done
-@test -s $(top_builddir)/configs/current || rm -f $(top_builddir)/configs/current
distclean-local:
-rm -rf lib*
-rm -f $(top_builddir)/configs/current
-find . '(' -name '*.o' -o -name '*.a' -o -name '*.so' -o \
-name depend -o -name depend.bak ')' -exec rm -f '{}' ';'
.PHONY: doxygen
# Rules for making release tarballs
PACKAGE_VERSION=8.1-devel
PACKAGE_DIR = Mesa-$(PACKAGE_VERSION)
PACKAGE_NAME = MesaLib-$(PACKAGE_VERSION)
EXTRA_FILES = \
aclocal.m4 \
configure \
bin/ar-lib \
bin/compile \
bin/config.sub \
bin/config.guess \
bin/depcomp \
bin/install-sh \
bin/ltmain.sh \
bin/missing \
bin/ylwrap \
bin/test-driver \
src/glsl/glsl_parser.cpp \
src/glsl/glsl_parser.h \
src/glsl/glsl_lexer.cpp \
src/glsl/glcpp/glcpp-lex.c \
src/glsl/glcpp/glcpp-parse.c \
src/glsl/glcpp/glcpp-parse.h \
src/mesa/main/api_exec_es1.c \
src/mesa/main/api_exec_es1_dispatch.h \
src/mesa/main/api_exec_es1_remap_helper.h \
src/mesa/main/api_exec_es2.c \
src/mesa/main/api_exec_es2_dispatch.h \
src/mesa/main/api_exec_es2_remap_helper.h \
src/mesa/program/lex.yy.c \
src/mesa/program/program_parse.tab.c \
src/mesa/program/program_parse.tab.h
src/mesa/program/program_parse.tab.h \
`git ls-files | grep "Makefile.am" | sed -e "s/Makefile.am/Makefile.in/"`
IGNORE_FILES = \
-x autogen.sh
parsers: configure
-@touch $(top_builddir)/configs/current
$(MAKE) -C src/glsl glsl_parser.cpp glsl_parser.h glsl_lexer.cpp
$(MAKE) -C src/glsl/glcpp glcpp-lex.c glcpp-parse.c glcpp-parse.h
$(MAKE) -C src/glsl glsl_parser.cpp glsl_parser.h glsl_lexer.cpp glcpp/glcpp-lex.c glcpp/glcpp-parse.c glcpp/glcpp-parse.h
$(MAKE) -C src/mesa program/lex.yy.c program/program_parse.tab.c program/program_parse.tab.h
# Everything for new a Mesa release:

View File

@@ -69,6 +69,13 @@ if env['gles']:
#######################################################################
# Environment setup
with open("VERSION") as f:
mesa_version = f.read().strip()
env.Append(CPPDEFINES = [
('PACKAGE_VERSION', '\\"%s\\"' % mesa_version),
('PACKAGE_BUGREPORT', '\\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\\"'),
])
# Includes
env.Prepend(CPPPATH = [
'#/include',

1
VERSION Normal file
View File

@@ -0,0 +1 @@
10.0.5

View File

@@ -1,119 +0,0 @@
# A few convenience macros for Mesa, mostly to keep all the platform
# specifics out of configure.ac.
# MESA_PIC_FLAGS()
#
# Find out whether to build PIC code using the option --enable-pic and
# the configure enable_static/enable_shared settings. If PIC is needed,
# figure out the necessary flags for the platform and compiler.
#
# The platform checks have been shamelessly taken from libtool and
# stripped down to just what's needed for Mesa. See _LT_COMPILER_PIC in
# /usr/share/aclocal/libtool.m4 or
# http://git.savannah.gnu.org/gitweb/?p=libtool.git;a=blob;f=libltdl/m4/libtool.m4;hb=HEAD
#
AC_DEFUN([MESA_PIC_FLAGS],
[AC_REQUIRE([AC_PROG_CC])dnl
AC_ARG_VAR([PIC_FLAGS], [compiler flags for PIC code])
AC_ARG_ENABLE([pic],
[AS_HELP_STRING([--disable-pic],
[compile PIC objects @<:@default=enabled for shared builds
on supported platforms@:>@])],
[enable_pic="$enableval"
test "x$enable_pic" = x && enable_pic=auto],
[enable_pic=auto])
# disable PIC by default for static builds
if test "$enable_pic" = auto && test "$enable_static" = yes; then
enable_pic=no
fi
# if PIC hasn't been explicitly disabled, try to figure out the flags
if test "$enable_pic" != no; then
AC_MSG_CHECKING([for $CC option to produce PIC])
# allow the user's flags to override
if test "x$PIC_FLAGS" = x; then
# see if we're using GCC
if test "x$GCC" = xyes; then
case "$host_os" in
aix*|beos*|cygwin*|irix5*|irix6*|osf3*|osf4*|osf5*)
# PIC is the default for these OSes.
;;
mingw*|os2*|pw32*)
# This hack is so that the source file can tell whether
# it is being built for inclusion in a dll (and should
# export symbols for example).
PIC_FLAGS="-DDLL_EXPORT"
;;
darwin*|rhapsody*)
# PIC is the default on this platform
# Common symbols not allowed in MH_DYLIB files
PIC_FLAGS="-fno-common"
;;
hpux*)
# PIC is the default for IA64 HP-UX and 64-bit HP-UX,
# but not for PA HP-UX.
case $host_cpu in
hppa*64*|ia64*)
;;
*)
PIC_FLAGS="-fPIC"
;;
esac
;;
*)
# Everyone else on GCC uses -fPIC
PIC_FLAGS="-fPIC"
;;
esac
else # !GCC
case "$host_os" in
hpux9*|hpux10*|hpux11*)
# PIC is the default for IA64 HP-UX and 64-bit HP-UX,
# but not for PA HP-UX.
case "$host_cpu" in
hppa*64*|ia64*)
# +Z the default
;;
*)
PIC_FLAGS="+Z"
;;
esac
;;
linux*|k*bsd*-gnu)
case `basename "$CC"` in
icc*|ecc*|ifort*)
PIC_FLAGS="-KPIC"
;;
pgcc*|pgf77*|pgf90*|pgf95*)
# Portland Group compilers (*not* the Pentium gcc
# compiler, which looks to be a dead project)
PIC_FLAGS="-fpic"
;;
ccc*)
# All Alpha code is PIC.
;;
xl*)
# IBM XL C 8.0/Fortran 10.1 on PPC
PIC_FLAGS="-qpic"
;;
*)
case `$CC -V 2>&1 | sed 5q` in
*Sun\ C*|*Sun\ F*)
# Sun C 5.9 or Sun Fortran
PIC_FLAGS="-KPIC"
;;
esac
esac
;;
solaris*)
PIC_FLAGS="-KPIC"
;;
sunos4*)
PIC_FLAGS="-PIC"
;;
esac
fi # GCC
fi # PIC_FLAGS
AC_MSG_RESULT([$PIC_FLAGS])
fi
AC_SUBST([PIC_FLAGS])
])# MESA_PIC_FLAGS

View File

@@ -3,17 +3,11 @@
srcdir=`dirname "$0"`
test -z "$srcdir" && srcdir=.
SRCDIR=`(cd "$srcdir" && pwd)`
ORIGDIR=`pwd`
if test "x$SRCDIR" != "x$ORIGDIR"; then
echo "Mesa cannot be built when srcdir != builddir" 1>&2
exit 1
fi
MAKEFLAGS=""
cd "$srcdir"
autoreconf -v --install || exit 1
cd $ORIGDIR || exit $?
if test -z "$NOCONFIGURE"; then
"$srcdir"/configure "$@"

39
bin/.cherry-ignore Normal file
View File

@@ -0,0 +1,39 @@
# Since we've disabled DRI3 completely in 10.0, this commit is no longer
# necessary.
f0f202e6b764be803470e27cba9102f14361ae22 glx: conditionaly build dri3 and present loader (v3)
# This patch makes bug #71870 worse, so it won't be cherry picked until that
# issue can be resolved. See
# http://lists.freedesktop.org/archives/mesa-dev/2013-November/048899.html
068a073c1d4853b5c8f33efdeb481026f42e23a5 meta: fix meta clear of layered framebuffers
# This patch isn't actually necessary because that bug that it fixes isn't in
# the 10.0 branch. See
# http://lists.freedesktop.org/archives/mesa-stable/2013-December/000500.html
a057b837ddd1c725a7504eedc53c6df05a012773 egl: add HAVE_LIBDRM define, fix EGL X11 platform
# Author requested skipping due to regressions
# Picking it would require at least also picking:
# 73c3c7e3, 3e0e9e3b, c59a605c
b2d1c579bb84a88179072a6a783f8827e218db55 glcpp: Set extension defines after resolving the GLSL version.
# These patches depend on other code not in stable branch.
# (at least 3b22146dc714b6090f7423abbc4df53d7d1fdaa9)
e190709119d8eb85c67bfbad5be699d39ad0118e mesa: Ensure that transform feedback refers to the correct program.
43e77215b13b2f86e461cd8a62b542fc6854dd1c i965/gen7: Use to the correct program when uploading transform feedback state.
# Author requested to ignore these four (since they depend on commits not in
# stable).
3313cc269bd428ca96a132d86da5fddc0f27386a i965: Add an option to ignore sample qualifier
a92e5f7cf63d496ad7830b5cea4bbab287c25b8e i965: Use sample barycentric coordinates with per sample shading
f5cfb4ae21df8eebfc6b86c0ce858b1c0a9160dd i965: Ignore 'centroid' interpolation qualifier in case of persample shading
dc2f94bc786768329973403248820a2e5249f102 i965: Ignore 'centroid' interpolation qualifier in case of persample shading
# This depends on the clear_buffer_object extensions work which is not in 10.0
# (See commit 5f7bc0c75904a40da0973329badea8497e53a26a on other branches)
aff7c5e78ab133866a90f67613508735c9b75094
# These patches are fixing code not present in 10.0
f34d75d6f69f4c0bf391e0adf1fd469601b01b04
e8d85034dad37177fce780ee3e09501e60be6e81
a61d859519d520b849c11ad5c1c1972870abd956

1
bin/.gitignore vendored
View File

@@ -6,3 +6,4 @@ install-sh
ylwrap
compile
ar-lib
/test-driver

52
bin/bugzilla_mesa.sh Executable file
View File

@@ -0,0 +1,52 @@
#!/bin/bash
# This script is used to generate the list of fixed bugs that
# appears in the release notes files, with HTML formatting.
#
# Note: This script could take a while until all details have
# been fetched from bugzilla.
#
# Usage examples:
#
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 > bugfixes
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee bugfixes
# $ DRYRUN=yes bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3
# $ DRYRUN=yes bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 | wc -l
# regex pattern: trim before url
trim_before='s/.*\(http\)/\1/'
# regex pattern: trim after url
trim_after='s/\(show_bug.cgi?id=[0-9]*\).*/\1/'
# regex pattern: always use https
use_https='s/http:/https:/'
# extract fdo urls from commit log
urls=$(git log $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before -e $trim_after -e $use_https | sort | uniq)
# if DRYRUN is set to "yes", simply print the URLs and don't fetch the
# details from fdo bugzilla.
#DRYRUN=yes
if [ "x$DRYRUN" = xyes ]; then
for i in $urls
do
echo $i
done
else
echo "<ul>"
echo ""
for i in $urls
do
id=$(echo $i | cut -d'=' -f2)
summary=$(wget --quiet -O - $i | grep -e '<title>.*</title>' | sed -e 's/ *<title>Bug [0-9]\+ &ndash; \(.*\)<\/title>/\1/')
echo "<li><a href=\"$i\">Bug $id</a> - $summary</li>"
echo ""
done
echo "</ul>"
fi
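The three sed patterns in the script above can be mirrored in Python to see what they do to a single log line. This is a sketch for illustration; `extract_bug_url` is a hypothetical name and the sample input in the usage note is made up.

```python
import re

def extract_bug_url(line):
    """Apply the script's three substitutions to one commit-log line."""
    # trim_before: drop everything before the (last) "http"
    line = re.sub(r".*(http)", r"\1", line, count=1)
    # trim_after: drop everything after the numeric bug id
    line = re.sub(r"(show_bug.cgi\?id=[0-9]*).*", r"\1", line, count=1)
    # use_https: normalize the scheme
    line = re.sub(r"http:", "https:", line, count=1)
    return line
```

For instance, `extract_bug_url("Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=49621 (v2)")` yields the bare https URL, so duplicates that differ only in scheme or trailing text collapse under `sort | uniq`.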

35
bin/get-pick-list.sh Executable file
View File

@@ -0,0 +1,35 @@
#!/bin/sh
# Script for generating a list of candidates for cherry-picking to a stable branch
#
# Usage examples:
#
# $ bin/get-pick-list.sh
# $ bin/get-pick-list.sh > picklist
# $ bin/get-pick-list.sh | tee picklist
# Grep for commits with "cherry picked from commit" in the commit message.
git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate.*10\.0\|CC:.*10\.0.*mesa-stable\)' HEAD..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.
if [ -f bin/.cherry-ignore ] ; then
if grep -q ^$sha bin/.cherry-ignore ; then
continue
fi
fi
# Check to see if it has already been picked over.
if grep -q ^$sha already_picked ; then
continue
fi
git log -n1 --pretty=oneline $sha | cat
done
rm -f already_picked
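The filtering loop in the script above boils down to two prefix checks per candidate. A minimal Python sketch, assuming the candidate SHAs and the two line lists have already been read into memory (`pick_list` and its parameters are hypothetical names):

```python
def pick_list(candidates, ignore_lines, picked_lines):
    """Return candidate SHAs that are neither ignored nor already picked.

    Mirrors the `grep -q ^$sha` checks: a SHA is filtered out when any
    line of the corresponding list starts with it.
    """
    def listed(sha, lines):
        return any(line.startswith(sha) for line in lines)

    return [sha for sha in candidates
            if not listed(sha, ignore_lines) and not listed(sha, picked_lines)]
```

Because the match is anchored at the start of the line, comment lines in the ignore file (those beginning with `#`) never match a SHA.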

View File

@@ -1,74 +0,0 @@
#!/bin/sh
#
# Simple shell script for installing Mesa's header and library files.
# If the copy commands below don't work on a particular system (i.e. the
# -f or -d flags), we may need to branch on `uname` to do the right thing.
#
TOP=.
INCLUDE_DIR="/usr/local/include"
LIB_DIR="/usr/local/lib"
if [ "x$#" = "x0" ] ; then
echo
echo "***** Mesa installation - You may need root privileges to do this *****"
echo
echo "Default directory for header files is:" ${INCLUDE_DIR}
echo "Enter new directory or press <Enter> to accept this default."
read INPUT
if [ "x${INPUT}" != "x" ] ; then
INCLUDE_DIR=${INPUT}
fi
echo
echo "Default directory for library files is:" ${LIB_DIR}
echo "Enter new directory or press <Enter> to accept this default."
read INPUT
if [ "x${INPUT}" != "x" ] ; then
LIB_DIR=${INPUT}
fi
echo
echo "About to install Mesa header files (GL/*.h) in: " ${INCLUDE_DIR}/GL
echo "and Mesa library files (libGL.*, etc) in: " ${LIB_DIR}
echo "Press <Enter> to continue, or <ctrl>-C to abort."
read INPUT
else
INCLUDE_DIR=$1/include
LIB_DIR=$1/lib
fi
# flags:
# -f = force
# -d = preserve symlinks (does not work on BSD)
if [ `uname` = "FreeBSD" ] ; then
CP_FLAGS="-f"
elif [ `uname` = "Darwin" ] ; then
CP_FLAGS="-f"
elif [ `uname` = "AIX" ] ; then
CP_FLAGS="-fh"
else
CP_FLAGS="-fd"
fi
set -v
mkdir -p ${INCLUDE_DIR}
mkdir -p ${INCLUDE_DIR}/GL
# NOT YET: mkdir -p ${INCLUDE_DIR}/GLES
mkdir -p ${LIB_DIR}
cp -f ${TOP}/include/GL/*.h ${INCLUDE_DIR}/GL
cp -f ${TOP}/src/glw/*.h ${INCLUDE_DIR}/GL
# NOT YET: cp -f ${TOP}/include/GLES/*.h ${INCLUDE_DIR}/GLES
cp ${CP_FLAGS} ${TOP}/lib*/lib* ${LIB_DIR}
echo "Done."

View File

@@ -1,112 +0,0 @@
#!/bin/sh
# A minimal replacement for 'install' that supports installing symbolic links.
# Only a limited number of options are supported:
# -d dir Create a directory
# -m mode Sets a file's mode when installing
# If these commands aren't portable, we'll need some "if (arch)" type stuff
SYMLINK="ln -s"
MKDIR="mkdir -p"
RM="rm -f"
MODE=""
if [ "$1" = "-d" ] ; then
# make a directory path
$MKDIR "$2"
exit 0
fi
if [ "$1" = "-m" ] ; then
# set file mode
MODE=$2
shift 2
fi
# install file(s) into destination
if [ $# -ge 2 ] ; then
# Last cmd line arg is the dest dir
for FILE in $@ ; do
DESTDIR="$FILE"
done
# Loop over args, moving them to DEST directory
I=1
for FILE in $@ ; do
if [ $I = $# ] ; then
# stop, don't want to install $DEST into $DEST
exit 0
fi
DEST=$DESTDIR
# On CYGWIN, because DLLs are loaded by the native Win32 loader,
# they are installed in the executable path. Stub libraries used
# only for linking are installed in the library path
case `uname` in
CYGWIN*)
case $FILE in
*.dll)
DEST="$DEST/../bin"
;;
*)
;;
esac
;;
*)
;;
esac
PWDSAVE=`pwd`
# determine file's type
if [ -h "$FILE" ] ; then
#echo $FILE is a symlink
# Unfortunately, cp -d isn't universal so we have to
# use a work-around.
# Use ls -l to find the target that the link points to
LL=`ls -l "$FILE"`
for L in $LL ; do
TARGET=$L
done
#echo $FILE is a symlink pointing to $TARGET
FILE=`basename "$FILE"`
# Go to $DEST and make the link
cd "$DEST" # pushd
$RM "$FILE"
$SYMLINK "$TARGET" "$FILE"
cd "$PWDSAVE" # popd
elif [ -f "$FILE" ] ; then
#echo "$FILE" is a regular file
# Only copy if the files differ
if ! cmp -s $FILE $DEST/`basename $FILE`; then
$RM "$DEST/`basename $FILE`"
cp "$FILE" "$DEST"
fi
if [ $MODE ] ; then
FILE=`basename "$FILE"`
chmod $MODE "$DEST/$FILE"
fi
else
echo "Unknown type of argument: " "$FILE"
exit 1
fi
I=`expr $I + 1`
done
exit 0
fi
# If we get here, we didn't find anything to do
echo "Usage:"
echo " install -d dir Create named directory"
echo " install [-m mode] file [...] dest Install files in destination"

1037
bin/mklib

File diff suppressed because it is too large

251
bin/perf-annotate-jit Executable file
View File

@@ -0,0 +1,251 @@
#!/usr/bin/env python
#
# Copyright 2012 VMware Inc
# Copyright 2008-2009 Jose Fonseca
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
#
"""Perf annotate for JIT code.
Linux `perf annotate` does not work with JIT code. This script takes the data
produced by the `perf script` command, plus the disassemblies output by gallivm
into /tmp/perf-XXXXX.map.asm and produces output similar to `perf annotate`.
See docs/llvmpipe.html for usage instructions.
The `perf script` output parser was derived from the gprof2dot.py script.
"""
import sys
import os.path
import re
import optparse
import subprocess
class Parser:
"""Parser interface."""
def __init__(self):
pass
def parse(self):
raise NotImplementedError
class LineParser(Parser):
"""Base class for parsers that read line-based formats."""
def __init__(self, file):
Parser.__init__(self)
self._file = file
self.__line = None
self.__eof = False
self.line_no = 0
def readline(self):
line = self._file.readline()
if not line:
self.__line = ''
self.__eof = True
else:
self.line_no += 1
self.__line = line.rstrip('\r\n')
def lookahead(self):
assert self.__line is not None
return self.__line
def consume(self):
assert self.__line is not None
line = self.__line
self.readline()
return line
def eof(self):
assert self.__line is not None
return self.__eof
mapFile = None
def lookupMap(filename, matchSymbol):
global mapFile
mapFile = filename
stream = open(filename, 'rt')
for line in stream:
start, length, symbol = line.split()
start = int(start, 16)
length = int(length,16)
if symbol == matchSymbol:
return start
return None
def lookupAsm(filename, desiredFunction):
stream = open(filename + '.asm', 'rt')
while stream.readline() != desiredFunction + ':\n':
pass
asm = []
line = stream.readline().strip()
while line:
addr, instr = line.split(':', 1)
addr = int(addr)
asm.append((addr, instr))
line = stream.readline().strip()
return asm
samples = {}
class PerfParser(LineParser):
"""Parser for linux perf callgraph output.
It expects output generated with
perf record -g
perf script
"""
def __init__(self, infile, symbol):
LineParser.__init__(self, infile)
self.symbol = symbol
def readline(self):
# Override LineParser.readline to ignore comment lines
while True:
LineParser.readline(self)
if self.eof() or not self.lookahead().startswith('#'):
break
def parse(self):
# read lookahead
self.readline()
while not self.eof():
self.parse_event()
asm = lookupAsm(mapFile, self.symbol)
addresses = samples.keys()
addresses.sort()
total_samples = 0
sys.stdout.write('%s:\n' % self.symbol)
for address, instr in asm:
try:
sample = samples.pop(address)
except KeyError:
sys.stdout.write(6*' ')
else:
sys.stdout.write('%6u' % (sample))
total_samples += sample
sys.stdout.write('%6u: %s\n' % (address, instr))
print 'total:', total_samples
assert len(samples) == 0
sys.exit(0)
def parse_event(self):
if self.eof():
return
line = self.consume()
assert line
callchain = self.parse_callchain()
if not callchain:
return
def parse_callchain(self):
callchain = []
while self.lookahead():
function = self.parse_call(len(callchain) == 0)
if function is None:
break
callchain.append(function)
if self.lookahead() == '':
self.consume()
return callchain
call_re = re.compile(r'^\s+(?P<address>[0-9a-fA-F]+)\s+(?P<symbol>.*)\s+\((?P<module>[^)]*)\)$')
def parse_call(self, first):
line = self.consume()
mo = self.call_re.match(line)
assert mo
if not mo:
return None
if not first:
return None
function_name = mo.group('symbol')
if not function_name:
function_name = mo.group('address')
module = mo.group('module')
function_id = function_name + ':' + module
address = mo.group('address')
address = int(address, 16)
if function_name != self.symbol:
return None
start_address = lookupMap(module, function_name)
address -= start_address
#print function_name, module, address
samples[address] = samples.get(address, 0) + 1
return True
def main():
"""Main program."""
optparser = optparse.OptionParser(
usage="\n\t%prog [options] symbol_name")
(options, args) = optparser.parse_args(sys.argv[1:])
if len(args) != 1:
optparser.error('wrong number of arguments')
symbol = args[0]
p = subprocess.Popen(['perf', 'script'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
parser = PerfParser(p.stdout, symbol)
parser.parse()
if __name__ == '__main__':
main()
# vim: set sw=4 et:
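The `call_re` pattern in the script above does the per-line work of splitting a callchain entry into address, symbol, and module. The following Python sketch reuses that regular expression verbatim; the wrapper `parse_call_line` and the sample line in the usage note are assumptions for illustration, not captured `perf script` output.

```python
import re

call_re = re.compile(
    r'^\s+(?P<address>[0-9a-fA-F]+)\s+(?P<symbol>.*)\s+\((?P<module>[^)]*)\)$')

def parse_call_line(line):
    """Return (address, symbol, module), or None if the line does not match."""
    mo = call_re.match(line)
    if mo is None:
        return None
    # The address is printed in hexadecimal without a 0x prefix.
    return int(mo.group('address'), 16), mo.group('symbol'), mo.group('module')
```

A line such as `"    7f3a10 fs_variant (/tmp/perf-1234.map)"` would split into the integer address, the symbol `fs_variant`, and the map file path, which the script then passes to `lookupMap` to compute an offset into the function.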

View File

@@ -2,6 +2,12 @@
# This script is used to generate the list of changes that
# appears in the release notes files, with HTML formatting.
#
# Usage examples:
#
# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3
# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3 > changes
# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee changes
typeset -i in_log=0

@@ -98,5 +98,6 @@ def AddOptions(opts):
opts.Add(BoolOption('debug', 'DEPRECATED: debug build', 'yes'))
opts.Add(BoolOption('profile', 'DEPRECATED: profile build', 'no'))
opts.Add(BoolOption('quiet', 'DEPRECATED: profile build', 'yes'))
opts.Add(BoolOption('texture_float', 'enable floating-point textures and renderbuffers', 'no'))
if host_platform == 'windows':
opts.Add(EnumOption('MSVS_VERSION', 'MS Visual C++ version', None, allowed_values=('7.1', '8.0', '9.0')))
opts.Add(EnumOption('MSVC_VERSION', 'MS Visual C++ version', None, allowed_values=('7.1', '8.0', '9.0', '10.0', '11.0', '12.0')))

configs/.gitignore

@@ -1,2 +0,0 @@
current
autoconf

@@ -1,227 +0,0 @@
# Autoconf configuration
# Pull in the defaults
include $(TOP)/configs/default
# This is generated by configure
CONFIG_NAME = autoconf
# Compiler and flags
CC = @CC@
CXX = @CXX@
OPT_FLAGS = @OPT_FLAGS@
ARCH_FLAGS = @ARCH_FLAGS@
PIC_FLAGS = @PIC_FLAGS@
DEFINES = @DEFINES@
API_DEFINES = @API_DEFINES@
SHARED_GLAPI = @SHARED_GLAPI@
CFLAGS_NOVISIBILITY = @CPPFLAGS@ @CFLAGS@ \
$(OPT_FLAGS) $(PIC_FLAGS) $(ARCH_FLAGS) $(DEFINES)
CXXFLAGS_NOVISIBILITY = @CPPFLAGS@ @CXXFLAGS@ \
$(OPT_FLAGS) $(PIC_FLAGS) $(ARCH_FLAGS) $(DEFINES)
CFLAGS = $(CFLAGS_NOVISIBILITY) @VISIBILITY_CFLAGS@
CXXFLAGS = $(CXXFLAGS_NOVISIBILITY) @VISIBILITY_CXXFLAGS@
LDFLAGS = @LDFLAGS@
EXTRA_LIB_PATH = @EXTRA_LIB_PATH@
RADEON_CFLAGS = @RADEON_CFLAGS@
RADEON_LIBS = @RADEON_LIBS@
NOUVEAU_CFLAGS = @NOUVEAU_CFLAGS@
NOUVEAU_LIBS = @NOUVEAU_LIBS@
INTEL_LIBS = @INTEL_LIBS@
INTEL_CFLAGS = @INTEL_CFLAGS@
X11_LIBS = @X11_LIBS@
X11_CFLAGS = @X11_CFLAGS@
LLVM_BINDIR = @LLVM_BINDIR@
LLVM_CFLAGS = @LLVM_CFLAGS@
LLVM_CPPFLAGS = @LLVM_CPPFLAGS@
LLVM_CXXFLAGS = @LLVM_CXXFLAGS@
LLVM_LDFLAGS = @LLVM_LDFLAGS@
LLVM_LIBDIR = @LLVM_LIBDIR@
LLVM_LIBS = @LLVM_LIBS@
LLVM_INCLUDEDIR = @LLVM_INCLUDEDIR@
GLW_CFLAGS = @GLW_CFLAGS@
GLX_TLS = @GLX_TLS@
# dlopen
DLOPEN_LIBS = @DLOPEN_LIBS@
# Source selection
MESA_ASM_FILES = @MESA_ASM_FILES@
GLAPI_ASM_SOURCES = @GLAPI_ASM_SOURCES@
# Misc tools and flags
MAKE = @MAKE@
SHELL = @SHELL@
MKLIB_OPTIONS = @MKLIB_OPTIONS@
MKDEP = @MKDEP@
MKDEP_OPTIONS = @MKDEP_OPTIONS@
INSTALL = @INSTALL@
AWK = @AWK@
GREP = @GREP@
NM = @NM@
# Perl
PERL = @PERL@
# Indent (used for generating dispatch tables)
INDENT = @INDENT@
INDENT_FLAGS = @INDENT_FLAGS@
# Python and flags (generally only needed by the developers)
PYTHON2 = @PYTHON2@
PYTHON_FLAGS = -t -O -O
# Flex and Bison for GLSL compiler
FLEX = @LEX@
BISON = @YACC@
# Library names (base name)
GL_LIB = @GL_LIB@
GLU_LIB = @GLU_LIB@
GLW_LIB = GLw
OSMESA_LIB = @OSMESA_LIB@
GLESv1_CM_LIB = GLESv1_CM
GLESv2_LIB = GLESv2
VG_LIB = OpenVG
GLAPI_LIB = glapi
# Library names (actual file names)
GL_LIB_NAME = @GL_LIB_NAME@
GLU_LIB_NAME = @GLU_LIB_NAME@
GLW_LIB_NAME = @GLW_LIB_NAME@
OSMESA_LIB_NAME = @OSMESA_LIB_NAME@
EGL_LIB_NAME = @EGL_LIB_NAME@
GLESv1_CM_LIB_NAME = @GLESv1_CM_LIB_NAME@
GLESv2_LIB_NAME = @GLESv2_LIB_NAME@
VG_LIB_NAME = @VG_LIB_NAME@
GLAPI_LIB_NAME = @GLAPI_LIB_NAME@
# Globs used to install the lib and all symlinks
GL_LIB_GLOB = @GL_LIB_GLOB@
GLU_LIB_GLOB = @GLU_LIB_GLOB@
GLW_LIB_GLOB = @GLW_LIB_GLOB@
OSMESA_LIB_GLOB = @OSMESA_LIB_GLOB@
EGL_LIB_GLOB = @EGL_LIB_GLOB@
GLESv1_CM_LIB_GLOB = @GLESv1_CM_LIB_GLOB@
GLESv2_LIB_GLOB = @GLESv2_LIB_GLOB@
VG_LIB_GLOB = @VG_LIB_GLOB@
GLAPI_LIB_GLOB = @GLAPI_LIB_GLOB@
# Directories to build
LIB_DIR = @LIB_DIR@
SRC_DIRS = @SRC_DIRS@
GLU_DIRS = @GLU_DIRS@
DRIVER_DIRS = @DRIVER_DIRS@
GALLIUM_DIRS = @GALLIUM_DIRS@
GALLIUM_DRIVERS_DIRS = @GALLIUM_DRIVERS_DIRS@
GALLIUM_WINSYS_DIRS = @GALLIUM_WINSYS_DIRS@
GALLIUM_TARGET_DIRS = @GALLIUM_TARGET_DIRS@
GALLIUM_STATE_TRACKERS_DIRS = @GALLIUM_STATE_TRACKERS_DIRS@
GALLIUM_AUXILIARIES = $(TOP)/src/gallium/auxiliary/libgallium.a
GALLIUM_DRIVERS = $(foreach DIR,$(GALLIUM_DRIVERS_DIRS),$(TOP)/src/gallium/drivers/$(DIR)/lib$(DIR).a)
# Driver specific build vars
DRI_DIRS = @DRI_DIRS@
EGL_PLATFORMS = @EGL_PLATFORMS@
EGL_CLIENT_APIS = @EGL_CLIENT_APIS@
# Dependencies
X11_INCLUDES = @X11_INCLUDES@
# GLw motif setup
GLW_SOURCES = @GLW_SOURCES@
MOTIF_CFLAGS = @MOTIF_CFLAGS@
# Library/program dependencies
GL_LIB_DEPS = $(EXTRA_LIB_PATH) @GL_LIB_DEPS@
OSMESA_LIB_DEPS = -L$(TOP)/$(LIB_DIR) @OSMESA_MESA_DEPS@ \
$(EXTRA_LIB_PATH) @OSMESA_LIB_DEPS@
EGL_LIB_DEPS = $(EXTRA_LIB_PATH) @EGL_LIB_DEPS@
GLU_LIB_DEPS = -L$(TOP)/$(LIB_DIR) @GLU_MESA_DEPS@ \
$(EXTRA_LIB_PATH) @GLU_LIB_DEPS@
GLW_LIB_DEPS = -L$(TOP)/$(LIB_DIR) @GLW_MESA_DEPS@ \
$(EXTRA_LIB_PATH) @GLW_LIB_DEPS@
GLESv1_CM_LIB_DEPS = $(EXTRA_LIB_PATH) @GLESv1_CM_LIB_DEPS@
GLESv2_LIB_DEPS = $(EXTRA_LIB_PATH) @GLESv2_LIB_DEPS@
VG_LIB_DEPS = $(EXTRA_LIB_PATH) @VG_LIB_DEPS@
GLAPI_LIB_DEPS = $(EXTRA_LIB_PATH) @GLAPI_LIB_DEPS@
# DRI dependencies
MESA_MODULES = @MESA_MODULES@
DRI_LIB_DEPS = $(EXTRA_LIB_PATH) @DRI_LIB_DEPS@
LIBDRM_CFLAGS = @LIBDRM_CFLAGS@
LIBDRM_LIB = @LIBDRM_LIBS@
DRI2PROTO_CFLAGS = @DRI2PROTO_CFLAGS@
GLPROTO_CFLAGS = @GLPROTO_CFLAGS@
EXPAT_INCLUDES = @EXPAT_INCLUDES@
# Autoconf directories
prefix = @prefix@
exec_prefix = @exec_prefix@
libdir = @libdir@
includedir = @includedir@
# Installation directories (for make install)
INSTALL_DIR = $(prefix)
INSTALL_LIB_DIR = $(libdir)
INSTALL_INC_DIR = $(includedir)
# DRI installation directories
DRI_DRIVER_INSTALL_DIR = @DRI_DRIVER_INSTALL_DIR@
# Where libGL will look for DRI hardware drivers
DRI_DRIVER_SEARCH_DIR = @DRI_DRIVER_SEARCH_DIR@
# EGL driver install directory
EGL_DRIVER_INSTALL_DIR = @EGL_DRIVER_INSTALL_DIR@
# XVMC library install directory
XVMC_LIB_INSTALL_DIR=@XVMC_LIB_INSTALL_DIR@
# VDPAU library install directory
VDPAU_LIB_INSTALL_DIR=@VDPAU_LIB_INSTALL_DIR@
# VA library install directory
VA_LIB_INSTALL_DIR=@VA_LIB_INSTALL_DIR@
# Xorg driver install directory (for xorg state-tracker)
XORG_DRIVER_INSTALL_DIR = @XORG_DRIVER_INSTALL_DIR@
# Path to OpenCL C library libclc
LIBCLC_PATH = @LIBCLC_PATH@
# pkg-config substitutions
GL_PC_REQ_PRIV = @GL_PC_REQ_PRIV@
GL_PC_LIB_PRIV = @GL_PC_LIB_PRIV@
GL_PC_CFLAGS = @GL_PC_CFLAGS@
DRI_PC_REQ_PRIV = @DRI_PC_REQ_PRIV@
GLU_PC_REQ = @GLU_PC_REQ@
GLU_PC_REQ_PRIV = @GLU_PC_REQ_PRIV@
GLU_PC_LIB_PRIV = @GLU_PC_LIB_PRIV@
GLU_PC_CFLAGS = @GLU_PC_CFLAGS@
GLW_PC_REQ_PRIV = @GLW_PC_REQ_PRIV@
GLW_PC_LIB_PRIV = @GLW_PC_LIB_PRIV@
GLW_PC_CFLAGS = @GLW_PC_CFLAGS@
OSMESA_PC_REQ = @OSMESA_PC_REQ@
OSMESA_PC_LIB_PRIV = @OSMESA_PC_LIB_PRIV@
GLESv1_CM_PC_LIB_PRIV = @GLESv1_CM_PC_LIB_PRIV@
GLESv2_PC_LIB_PRIV = @GLESv2_PC_LIB_PRIV@
EGL_PC_REQ_PRIV = @GL_PC_REQ_PRIV@
EGL_PC_LIB_PRIV = @GL_PC_LIB_PRIV@
EGL_PC_CFLAGS = @GL_PC_CFLAGS@
XCB_DRI2_CFLAGS = @XCB_DRI2_CFLAGS@
XCB_DRI2_LIBS = @XCB_DRI2_LIBS@
LIBUDEV_CFLAGS = @LIBUDEV_CFLAGS@
LIBUDEV_LIBS = @LIBUDEV_LIBS@
WAYLAND_CFLAGS = @WAYLAND_CFLAGS@
WAYLAND_LIBS = @WAYLAND_LIBS@
MESA_LLVM = @MESA_LLVM@
LLVM_VERSION = @LLVM_VERSION@
HAVE_XF86VIDMODE = @HAVE_XF86VIDMODE@
GALLIUM_PIPE_LOADER_DEFINES = @GALLIUM_PIPE_LOADER_DEFINES@
GALLIUM_PIPE_LOADER_LIBS = @GALLIUM_PIPE_LOADER_LIBS@

@@ -1,182 +0,0 @@
# Default/template configuration
# This is included by other config files which may override some
# of these variables.
# Think of this as a base class from which configs are derived.
CONFIG_NAME = default
# Version info
MESA_MAJOR=8
MESA_MINOR=1
MESA_TINY=0
MESA_VERSION = $(MESA_MAJOR).$(MESA_MINOR).$(MESA_TINY)
# external projects. This should be useless now that we use libdrm.
DRM_SOURCE_PATH=$(TOP)/../drm
# Compiler and flags
CC = cc
CXX = CC
CFLAGS = -O
CXXFLAGS = -O
LDFLAGS =
GLU_CFLAGS =
GLX_TLS = no
# Compiler for building demos/tests/etc
APP_CC = $(CC)
APP_CXX = $(CXX)
# Misc tools and flags
SHELL = /bin/sh
MKLIB = $(SHELL) $(TOP)/bin/mklib
MKLIB_OPTIONS =
MKDEP = makedepend
MKDEP_OPTIONS = -fdepend
MAKE = make
FLEX = flex
BISON = bison
PKG_CONFIG = pkg-config
# Use MINSTALL for installing libraries, INSTALL for everything else
MINSTALL = $(SHELL) $(TOP)/bin/minstall
INSTALL = $(MINSTALL)
# Tools for regenerating glapi (generally only needed by the developers)
PYTHON2 = python
PYTHON_FLAGS = -t -O -O
INDENT = indent
INDENT_FLAGS = -i4 -nut -br -brs -npcs -ce -T GLubyte -T GLbyte -T Bool
# Library names (base name)
GL_LIB = GL
GLU_LIB = GLU
GLW_LIB = GLw
OSMESA_LIB = OSMesa
EGL_LIB = EGL
GLESv1_CM_LIB = GLESv1_CM
GLESv2_LIB = GLESv2
VG_LIB = OpenVG
GLAPI_LIB = glapi
# Library names (actual file names)
GL_LIB_NAME = lib$(GL_LIB).so
GLU_LIB_NAME = lib$(GLU_LIB).so
GLW_LIB_NAME = lib$(GLW_LIB).so
OSMESA_LIB_NAME = lib$(OSMESA_LIB).so
EGL_LIB_NAME = lib$(EGL_LIB).so
GLESv1_CM_LIB_NAME = lib$(GLESv1_CM_LIB).so
GLESv2_LIB_NAME = lib$(GLESv2_LIB).so
VG_LIB_NAME = lib$(VG_LIB).so
GLAPI_LIB_NAME = lib$(GLAPI_LIB).so
# globs used to install the lib and all symlinks
GL_LIB_GLOB = $(GL_LIB_NAME)*
GLU_LIB_GLOB = $(GLU_LIB_NAME)*
GLW_LIB_GLOB = $(GLW_LIB_NAME)*
OSMESA_LIB_GLOB = $(OSMESA_LIB_NAME)*
EGL_LIB_GLOB = $(EGL_LIB_NAME)*
GLESv1_CM_LIB_GLOB = $(GLESv1_CM_LIB_NAME)*
GLESv2_LIB_GLOB = $(GLESv2_LIB_NAME)*
VG_LIB_GLOB = $(VG_LIB_NAME)*
GLAPI_LIB_GLOB = $(GLAPI_LIB_NAME)*
# Optional assembly language optimization files for libGL
MESA_ASM_FILES =
# GLw widget sources (Append "GLwMDrawA.c" here and add -lXm to GLW_LIB_DEPS in
# order to build the Motif widget too)
GLW_SOURCES = GLwDrawA.c
MOTIF_CFLAGS = -I/usr/include/Motif1.2
# Directories to build
LIB_DIR = lib
SRC_DIRS = glsl mapi/glapi mapi/vgapi mesa \
gallium egl gallium/winsys gallium/targets glu
GLU_DIRS = sgi
DRIVER_DIRS = x11 osmesa
# Gallium directories and
GALLIUM_DIRS = auxiliary drivers state_trackers
GALLIUM_AUXILIARIES = $(TOP)/src/gallium/auxiliary/libgallium.a
GALLIUM_DRIVERS_DIRS = softpipe trace rbug noop identity galahad i915 svga r300 nvfx nv50
GALLIUM_DRIVERS = $(foreach DIR,$(GALLIUM_DRIVERS_DIRS),$(TOP)/src/gallium/drivers/$(DIR)/lib$(DIR).a)
GALLIUM_WINSYS_DIRS = sw sw/xlib
GALLIUM_TARGET_DIRS = libgl-xlib
GALLIUM_STATE_TRACKERS_DIRS = glx vega
# native platforms EGL should support
EGL_PLATFORMS = x11
EGL_CLIENT_APIS = $(GL_LIB)
# Library dependencies
#EXTRA_LIB_PATH ?=
GL_LIB_DEPS = $(EXTRA_LIB_PATH) -lX11 -lXext -lm -lpthread
EGL_LIB_DEPS = $(EXTRA_LIB_PATH) -ldl -lpthread
OSMESA_LIB_DEPS = $(EXTRA_LIB_PATH) -L$(TOP)/$(LIB_DIR) -l$(GL_LIB)
GLU_LIB_DEPS = $(EXTRA_LIB_PATH) -L$(TOP)/$(LIB_DIR) -l$(GL_LIB) -lm
GLW_LIB_DEPS = $(EXTRA_LIB_PATH) -L$(TOP)/$(LIB_DIR) -l$(GL_LIB) -lXt -lX11
GLESv1_CM_LIB_DEPS = $(EXTRA_LIB_PATH) -lpthread
GLESv2_LIB_DEPS = $(EXTRA_LIB_PATH) -lpthread
VG_LIB_DEPS = $(EXTRA_LIB_PATH) -lpthread
GLAPI_LIB_DEPS = $(EXTRA_LIB_PATH) -lpthread
# Program dependencies - specific GL libraries added in Makefiles
X11_LIBS = -lX11
DLOPEN_LIBS = -ldl
# Installation directories (for make install)
INSTALL_DIR = /usr/local
INSTALL_LIB_DIR = $(INSTALL_DIR)/$(LIB_DIR)
INSTALL_INC_DIR = $(INSTALL_DIR)/include
DRI_DRIVER_INSTALL_DIR = $(INSTALL_LIB_DIR)/dri
# Where libGL will look for DRI hardware drivers
DRI_DRIVER_SEARCH_DIR = $(DRI_DRIVER_INSTALL_DIR)
# EGL driver install directory
EGL_DRIVER_INSTALL_DIR = $(INSTALL_LIB_DIR)/egl
# Xorg driver install directory (for xorg state-tracker)
XORG_DRIVER_INSTALL_DIR = $(INSTALL_LIB_DIR)/xorg/modules/drivers
# pkg-config substitutions
GL_PC_REQ_PRIV =
GL_PC_LIB_PRIV =
GL_PC_CFLAGS =
DRI_PC_REQ_PRIV =
GLU_PC_REQ = gl
GLU_PC_REQ_PRIV =
GLU_PC_LIB_PRIV =
GLU_PC_CFLAGS =
GLW_PC_REQ_PRIV =
GLW_PC_LIB_PRIV =
GLW_PC_CFLAGS =
OSMESA_PC_REQ =
OSMESA_PC_LIB_PRIV =
GLESv1_CM_PC_REQ_PRIV =
GLESv1_CM_PC_LIB_PRIV =
GLESv1_CM_PC_CFLAGS =
GLESv2_PC_REQ_PRIV =
GLESv2_PC_LIB_PRIV =
GLESv2_PC_CFLAGS =
VG_PC_REQ_PRIV =
VG_PC_LIB_PRIV =
VG_PC_CFLAGS =
# default targets
# this helps reduce the mismatch between our automake Makefiles and the old
# custom Makefiles while we transition.
all: default
am--refresh:
distclean: clean
check:
test:

File diff suppressed because it is too large.

@@ -7,106 +7,114 @@ infrastructure is complete but it may be the case that few (if any) drivers
implement the features.
OpenGL Core and Compatibility context support
OpenGL 3.1 and later versions are only supported with the Core profile.
There are no plans to support GL_ARB_compatibility. The last supported OpenGL
version with all deprecated features is 3.0. Some of the later GL features
are exposed in the 3.0 context as extensions.
Feature Status
----------------------------------------------------- ------------------------
GL 3.0:
GLSL 1.30 DONE
GLSL 1.30 DONE (i965, r600, radeonsi)
glBindFragDataLocation, glGetFragDataLocation DONE
Conditional rendering (GL_NV_conditional_render) DONE (i965, r300, r600, swrast)
Map buffer subranges (GL_ARB_map_buffer_range) DONE (i965, r300, r600, swrast)
Clamping controls (GL_ARB_color_buffer_float) DONE (i965, r300, r600)
Float textures, renderbuffers (GL_ARB_texture_float) DONE (i965, r300, r600)
GL_EXT_packed_float DONE (i965, r600)
GL_EXT_texture_shared_exponent DONE (i965, r600, swrast)
Float depth buffers (GL_ARB_depth_buffer_float) DONE (i965, r600)
Framebuffer objects (GL_ARB_framebuffer_object) DONE (i965, r300, r600, swrast)
Half-float DONE
Non-normalized Integer texture/framebuffer formats DONE (i965)
1D/2D Texture arrays DONE
Per-buffer blend and masks (GL_EXT_draw_buffers2) DONE (i965, r600, swrast)
GL_EXT_texture_compression_rgtc DONE (i965, r300, r600, swrast)
Red and red/green texture formats DONE (i965, swrast, gallium)
Transform feedback (GL_EXT_transform_feedback) DONE (i965)
Vertex array objects (GL_APPLE_vertex_array_object) DONE (i965, r300, r600, swrast)
sRGB framebuffer format (GL_EXT_framebuffer_sRGB) DONE (i965, r600)
Conditional rendering (GL_NV_conditional_render) DONE (i965, r300, r600, radeonsi, swrast)
Map buffer subranges (GL_ARB_map_buffer_range) DONE (i965, r300, r600, radeonsi, swrast)
Clamping controls (GL_ARB_color_buffer_float) DONE (i965, r300, r600, radeonsi)
Float textures, renderbuffers (GL_ARB_texture_float) DONE (i965, r300, r600, radeonsi)
GL_EXT_packed_float DONE (i965, r600, radeonsi)
GL_EXT_texture_shared_exponent DONE (i965, r600, radeonsi, swrast)
Float depth buffers (GL_ARB_depth_buffer_float) DONE (i965, r600, radeonsi)
Framebuffer objects (GL_ARB_framebuffer_object) DONE (i965, r300, r600, radeonsi, swrast)
Half-float DONE (i965, r300, r600, radeonsi, swrast)
Non-normalized Integer texture/framebuffer formats DONE (i965, r600, radeonsi)
1D/2D Texture arrays DONE (i965, r600, radeonsi)
Per-buffer blend and masks (GL_EXT_draw_buffers2) DONE (i965, r600, radeonsi, swrast)
GL_EXT_texture_compression_rgtc DONE (i965, r300, r600, radeonsi, swrast)
Red and red/green texture formats DONE (i965, r300, r600, radeonsi, swrast)
Transform feedback (GL_EXT_transform_feedback) DONE (i965, r600, radeonsi)
Vertex array objects (GL_APPLE_vertex_array_object) DONE (all drivers)
sRGB framebuffer format (GL_EXT_framebuffer_sRGB) DONE (i965, r600, radeonsi)
glClearBuffer commands DONE
glGetStringi command DONE
glTexParameterI, glGetTexParameterI commands DONE
glVertexAttribI commands ~50% done (converts int
values to floats)
Depth format cube textures DONE
glVertexAttribI commands DONE
Depth format cube textures DONE (i965, r600, radeonsi)
GLX_ARB_create_context (GLX 1.4 is required) DONE
GL 3.1:
GLSL 1.40 missing: UBOS, inverse(),
highp change
Instanced drawing (GL_ARB_draw_instanced) DONE (i965, gallium, swrast)
Buffer copying (GL_ARB_copy_buffer) DONE (i965, r300, r600, swrast)
Primitive restart (GL_NV_primitive_restart) DONE (i965, r600)
16 vertex texture image units DONE
Texture buffer objs (GL_ARB_texture_buffer_object) needs GL3.1 enabling (i965)
Rectangular textures (GL_ARB_texture_rectangle) DONE (i965, r300, r600, swrast)
Uniform buffer objs (GL_ARB_uniform_buffer_object) not started
Signed normalized textures (GL_EXT_texture_snorm) DONE (i965, r300, r600)
GLSL 1.40 DONE (i965, r600, radeonsi)
Forward compatible context support/deprecations DONE (i965, r600, radeonsi)
Instanced drawing (GL_ARB_draw_instanced) DONE (i965, r600, radeonsi, swrast)
Buffer copying (GL_ARB_copy_buffer) DONE (i965, r300, r600, radeonsi, swrast)
Primitive restart (GL_NV_primitive_restart) DONE (i965, r300, r600, radeonsi)
16 vertex texture image units DONE (i965, r600, radeonsi)
Texture buffer objs (GL_ARB_texture_buffer_object) DONE for OpenGL 3.1 contexts (i965, r600, radeonsi)
Rectangular textures (GL_ARB_texture_rectangle) DONE (i965, r300, r600, radeonsi, swrast)
Uniform buffer objs (GL_ARB_uniform_buffer_object) DONE (i965, r600, radeonsi, swrast)
Signed normalized textures (GL_EXT_texture_snorm) DONE (i965, r300, r600, radeonsi)
GL 3.2:
Core/compatibility profiles not started
GLSL 1.50 not started
Geometry shaders (GL_ARB_geometry_shader4) partially done (Zack)
BGRA vertex order (GL_ARB_vertex_array_bgra) DONE (i965, r300, r600, swrast)
Base vertex offset(GL_ARB_draw_elements_base_vertex) DONE (i965, r300, r600, swrast)
Frag shader coord (GL_ARB_fragment_coord_conventions) DONE (i965, r300, r600, swrast)
Provoking vertex (GL_ARB_provoking_vertex) DONE (i965, r300, r600, swrast)
Seamless cubemaps (GL_ARB_seamless_cube_map) DONE (i965, r600)
Multisample textures (GL_ARB_texture_multisample) not started
Frag depth clamp (GL_ARB_depth_clamp) DONE (i965, r600, swrast)
Fence objects (GL_ARB_sync) DONE (i965, r300, r600, swrast)
Core/compatibility profiles DONE
GLSL 1.50 DONE (i965)
Geometry shaders DONE (i965)
BGRA vertex order (GL_ARB_vertex_array_bgra) DONE (i965, r300, r600, radeonsi, swrast)
Base vertex offset(GL_ARB_draw_elements_base_vertex) DONE (i965, r300, r600, radeonsi, swrast)
Frag shader coord (GL_ARB_fragment_coord_conventions) DONE (i965, r300, r600, radeonsi, swrast)
Provoking vertex (GL_ARB_provoking_vertex) DONE (i965, r300, r600, radeonsi, swrast)
Seamless cubemaps (GL_ARB_seamless_cube_map) DONE (i965, r600, radeonsi)
Multisample textures (GL_ARB_texture_multisample) DONE (i965, r600, radeonsi)
Frag depth clamp (GL_ARB_depth_clamp) DONE (i965, r600, swrast, radeonsi)
Fence objects (GL_ARB_sync) DONE (i965, r300, r600, radeonsi, swrast)
GLX_ARB_create_context_profile DONE
GL 3.3:
GLSL 3.30 new features in this version pretty much done
GL_ARB_blend_func_extended DONE (i965, r600, softpipe)
GL_ARB_explicit_attrib_location DONE (i915, i965, r300, r600, swrast)
GL_ARB_occlusion_query2 DONE (r300, r600, swrast)
GL_ARB_sampler_objects DONE (i965, r300, r600)
GL_ARB_shader_bit_encoding DONE
GL_ARB_texture_rgb10_a2ui DONE (r600)
GL_ARB_texture_swizzle DONE (same as EXT version) (i965, r300, r600, swrast)
GL_ARB_timer_query ~60% done (the EXT variant)
GL_ARB_instanced_arrays DONE (r300, r600)
GL_ARB_vertex_type_2_10_10_10_rev DONE (r600)
GLSL 3.30 DONE (i965)
GL_ARB_blend_func_extended DONE (i965, r600, radeonsi, softpipe)
GL_ARB_explicit_attrib_location DONE (i915, i965, r300, r600, radeonsi, swrast)
GL_ARB_occlusion_query2 DONE (i965, r300, r600, radeonsi, swrast)
GL_ARB_sampler_objects DONE (i965, r300, r600, radeonsi)
GL_ARB_shader_bit_encoding DONE (i965, r600, radeonsi)
GL_ARB_texture_rgb10_a2ui DONE (i965, r600, radeonsi)
GL_ARB_texture_swizzle DONE (i965, r300, r600, radeonsi, swrast)
GL_ARB_timer_query DONE (i965, r600, radeonsi)
GL_ARB_instanced_arrays DONE (i965, r300, r600, radeonsi)
GL_ARB_vertex_type_2_10_10_10_rev DONE (i965, r600, radeonsi)
GL 4.0:
GLSL 4.0 not started
GL_ARB_texture_query_lod not started
GL_ARB_draw_buffers_blend DONE (i965, r600, softpipe)
GL_ARB_draw_indirect not started
GL_ARB_texture_query_lod DONE (i965)
GL_ARB_draw_buffers_blend DONE (i965, r600, radeonsi, softpipe)
GL_ARB_draw_indirect started (Christoph)
GL_ARB_gpu_shader5 started
GL_ARB_gpu_shader_fp64 not started
GL_ARB_sample_shading not started
GL_ARB_sample_shading DONE (i965)
GL_ARB_shader_subroutine not started
GL_ARB_tessellation_shader not started
GL_ARB_texture_buffer_object_rgb32 not started
GL_ARB_texture_cube_map_array not started
GL_ARB_texture_gather not started
GL_ARB_transform_feedback2 DONE
GL_ARB_transform_feedback3 not started
GL_ARB_texture_buffer_object_rgb32 DONE (i965, r600, radeonsi, softpipe)
GL_ARB_texture_cube_map_array DONE (i965, r600, softpipe)
GL_ARB_texture_gather DONE (i965)
GL_ARB_transform_feedback2 DONE (i965, r600, radeonsi)
GL_ARB_transform_feedback3 DONE (i965, r600, radeonsi)
GL 4.1:
GLSL 4.1 not started
GL_ARB_ES2_compatibility DONE (i965, r300, r600)
GL_ARB_get_program_binary not started
GL_ARB_ES2_compatibility DONE (i965, r300, r600, radeonsi)
GL_ARB_get_program_binary DONE (0 binary formats)
GL_ARB_separate_shader_objects some infrastructure done
GL_ARB_shader_precision not started
GL_ARB_vertex_attrib_64bit not started
@@ -114,18 +122,60 @@ GL_ARB_viewport_array not started
GL 4.2:
GLSL 4.2 not started
GL_ARB_texture_compression_bptc not started
GL_ARB_compressed_texture_pixel_storage not started
GL_ARB_shader_atomic_counters not started
GL_ARB_texture_storage DONE (r300, r600, swrast)
GL_ARB_transform_feedback_instanced not started
GL_ARB_base_instance DONE (nv50, nvc0, r600, radeonsi)
GL_ARB_shader_atomic_counters DONE (i965)
GL_ARB_texture_storage DONE (all drivers)
GL_ARB_transform_feedback_instanced DONE (i965, r600, radeonsi)
GL_ARB_base_instance DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_shader_image_load_store not started
GL_ARB_conservative_depth DONE (softpipe)
GL_ARB_shading_language_420pack not started
GL_ARB_internalformat_query not started
GL_ARB_map_buffer_alignment not started
GL_ARB_conservative_depth DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_420pack DONE (all drivers that support GLSL 1.30)
GL_ARB_internalformat_query DONE (i965, r300, r600, radeonsi)
GL_ARB_map_buffer_alignment DONE (r300, r600, radeonsi)
GL 4.3:
GLSL 4.3 not started
GL_ARB_arrays_of_arrays not started
GL_ARB_ES3_compatibility DONE (i965)
GL_ARB_clear_buffer_object not started
GL_ARB_compute_shader not started
GL_ARB_copy_image not started
GL_KHR_debug DONE (all drivers)
GL_ARB_explicit_uniform_location not started
GL_ARB_fragment_layer_viewport not started
GL_ARB_framebuffer_no_attachments not started
GL_ARB_internalformat_query2 not started
GL_ARB_invalidate_subdata DONE (all drivers)
GL_ARB_multi_draw_indirect not started
GL_ARB_program_interface_query not started
GL_ARB_robust_buffer_access_behavior not started
GL_ARB_shader_image_size not started
GL_ARB_shader_storage_buffer_object not started
GL_ARB_stencil_texturing not started
GL_ARB_texture_buffer_range DONE (nv50, nvc0, i965, r600, radeonsi)
GL_ARB_texture_query_levels DONE (i965)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_texture_view not started
GL_ARB_vertex_attrib_binding DONE (all drivers)
GL 4.4:
GLSL 4.4 not started
GL_MAX_VERTEX_ATTRIB_STRIDE not started
GL_ARB_buffer_storage not started
GL_ARB_clear_texture not started
GL_ARB_enhanced_layouts not started
GL_ARB_multi_bind not started
GL_ARB_query_buffer_object not started
GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv30, nv50, nvc0, r300, r600, radeonsi, swrast)
GL_ARB_texture_stencil8 not started
GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, r600)
More info about these features and the work involved can be found at

docs/README.UVD

@@ -0,0 +1,13 @@
The software may implement third party technologies (e.g. third party
libraries) that are not licensed to you by AMD and for which you may need
to obtain licenses from other parties. Unless explicitly stated otherwise,
these third party technologies are not licensed hereunder. Such third
party technologies include, but are not limited, to H.264, MPEG-2, MPEG-4,
AVC, and VC-1.
For MPEG-2 Encoding Products ANY USE OF THIS PRODUCT IN ANY MANNER OTHER
THAN PERSONAL USE THAT COMPLIES WITH THE MPEG-2 STANDARD FOR ENCODING VIDEO
INFORMATION FOR PACKAGED MEDIA IS EXPRESSLY PROHIBITED WITHOUT A LICENSE
UNDER APPLICABLE PATENTS IN THE MPEG-2 PATENT PORTFOLIO, WHICH LICENSES IS
AVAILABLE FROM MPEG LA, LLC, 6312 S. Fiddlers Green Circle, Suite 400E,
Greenwood Village, Colorado 80111 U.S.A.

@@ -1,38 +0,0 @@
VMS support contributed by Jouk Jansen (joukj@hrem.stm.tudelft.nl)
The latest version was tested on a VMSAlpha7.2 system using DECC6.0, but
probably also works for other versions.
At the moment only the libraries LIBMESGL.EXE/LIBMESGL.OLB,
LIBMESAGLU.EXE/LIBMESAGLU.OLB and LIBGLUT.EXE/LIBGLUT.OLB and the demos in the
directory [.DEMOS] can be built.
However, feel free to create the missing "descrip.mms" files in the other
directories.
The makefiles were tested
using the DIGITAL make utility called MMS. There is also a public domain
clone available (MMK), and I think, although this has not been tested, that
it will give hardly any problems.
To make everything, just type MMS (or MMK) in the main directory of
mesagl. For MMS the default makefile is called descrip.mms, and
that is what I have called it. I also included some config files,
all having mms somewhere in the name, which all the makefiles need
(just like your Unix makefiles).
On Alpha platforms, shareable images for the libraries are created by default.
To get a static library, type MMS/MACRO=(NOSHARE=1).
On VAX platforms only static libraries can be built.
23-sep-2005
changed default compilation to use /float=ieee/ieee=denorm. The reason for
this is that it makes Mesa on OpenVMS more compatible with other platforms
and other packages for VMS that I maintain.
For more information see
http://nchrem.tnw.tudelft.nl/openvms
https://bugs.freedesktop.org/show_bug.cgi?id=4270
You may want to compile Mesa to use VAX-floating point arithmetic, instead
of IEEE floating point by removing the /float=IEEE/denorm flag from the
compiler options in the descrip.mms files.

@@ -1,6 +1,6 @@
File: docs/README.WIN32
Last updated: 23 April 2011
Last updated: 21 June 2013
Quick Start
@@ -30,6 +30,23 @@ At this time, only the gallium GDI driver is known to work.
Source code also exists in the tree for other drivers in
src/mesa/drivers/windows, but the status of this code is unknown.
Recipe
------
Building on Windows requires several open-source packages. These are
the steps that work as of this writing.
1) install python 2.7
2) install scons (latest)
3) install mingw, flex, and bison
4) install libxml2 from here: http://www.lfd.uci.edu/~gohlke/pythonlibs
get libxml2-python-2.9.1.win-amd64-py2.7.exe
5) install pywin32 from here: http://www.lfd.uci.edu/~gohlke/pythonlibs
get pywin32-218.4.win-amd64-py2.7.exe
6) install git
7) download mesa from git
see http://www.mesa3d.org/repository.html
8) run scons
General
-------

@@ -1,92 +0,0 @@
Name
WL_bind_wayland_display
Name Strings
EGL_WL_bind_wayland_display
Contact
Kristian Høgsberg <krh@bitplanet.net>
Benjamin Franzke <benjaminfranzke@googlemail.com>
Status
Proposal
Version
Version 1, March 1, 2011
Number
EGL Extension #not assigned
Dependencies
Requires EGL 1.4 or later. This extension is written against the
wording of the EGL 1.4 specification.
EGL_KHR_base_image is required.
Overview
This extension provides entry points for binding and unbinding the
wl_display of a Wayland compositor to an EGLDisplay. Binding a
wl_display means that the EGL implementation should provide one or
more interfaces in the Wayland protocol to allow clients to create
wl_buffer objects. On the server side, this extension also
provides a new target for eglCreateImageKHR, to create an EGLImage
from a wl_buffer.
Adding an implementation-specific Wayland interface allows the
EGL implementation to define the specific Wayland requests and events
needed for buffer sharing in an EGL Wayland platform.
IP Status
Open-source; freely implementable.
New Procedures and Functions
EGLBoolean eglBindWaylandDisplayWL(EGLDisplay dpy,
struct wl_display *display);
EGLBoolean eglUnbindWaylandDisplayWL(EGLDisplay dpy,
struct wl_display *display);
New Tokens
Accepted as <target> in eglCreateImageKHR
EGL_WAYLAND_BUFFER_WL 0x31D5
Additions to the EGL 1.4 Specification:
To bind a server side wl_display to an EGLDisplay, call
EGLBoolean eglBindWaylandDisplayWL(EGLDisplay dpy,
struct wl_display *display);
To unbind a server side wl_display from an EGLDisplay, call
EGLBoolean eglUnbindWaylandDisplayWL(EGLDisplay dpy,
struct wl_display *display);
eglBindWaylandDisplayWL returns EGL_FALSE when there is already a
wl_display bound to the EGLDisplay; otherwise it returns EGL_TRUE.
eglUnbindWaylandDisplayWL returns EGL_FALSE when there is currently no
wl_display bound to the EGLDisplay; otherwise it returns EGL_TRUE.
Import a wl_buffer by calling eglCreateImageKHR with
wl_buffer as EGLClientBuffer, EGL_WAYLAND_BUFFER_WL as the target,
NULL context and an empty attribute_list.
Issues
Revision History
Version 1, March 1, 2011
Initial draft (Benjamin Franzke)
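
The return-value rules for binding and unbinding in the specification above can be read as a two-state machine per EGLDisplay. The following is a toy Python model of those rules only. It does not call EGL, the class name `DisplayModel` is hypothetical, and `True`/`False` stand in for EGL_TRUE/EGL_FALSE. The specification text does not say what happens when a different wl_display is passed to the unbind call, so the model simply ignores that argument.

```python
class DisplayModel:
    """Toy model of one EGLDisplay's wl_display binding state."""

    def __init__(self):
        self.bound = None  # the currently bound wl_display, if any

    def bind(self, wl_display):
        """Model of eglBindWaylandDisplayWL."""
        if self.bound is not None:
            return False  # EGL_FALSE: a wl_display is already bound
        self.bound = wl_display
        return True       # EGL_TRUE

    def unbind(self, wl_display):
        """Model of eglUnbindWaylandDisplayWL (argument is not checked)."""
        if self.bound is None:
            return False  # EGL_FALSE: nothing is bound
        self.bound = None
        return True       # EGL_TRUE


if __name__ == '__main__':
    dpy = DisplayModel()
    print(dpy.bind('compositor-display'))    # first bind succeeds
    print(dpy.bind('another-display'))       # second bind is rejected
    print(dpy.unbind('compositor-display'))  # unbind succeeds
```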

@@ -0,0 +1,83 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Application Issues</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Application Issues</h1>
<p>
This page documents known issues with some OpenGL applications.
</p>
<h2>Topogun</h2>
<p>
<a href="http://www.topogun.com/">Topogun</a> for Linux (version 2, at least)
creates a GLX visual without requesting a depth buffer.
This causes bad rendering if the OpenGL driver happens to choose a visual
without a depth buffer.
</p>
<p>
Mesa 9.1.2 and later (will) support a DRI configuration option to work around
this issue.
Using the <a href="http://dri.freedesktop.org/wiki/DriConf">driconf</a> tool,
set the "Create all visuals with a depth buffer" option before running Topogun.
Then, all GLX visuals will be created with a depth buffer.
</p>
<h2>Old OpenGL games</h2>
<p>
Some old OpenGL games (approx. ten years or older) may crash during
start-up because of an extension string buffer-overflow problem.
</p>
<p>
The problem is that a modern OpenGL driver returns a very long string
for the glGetString(GL_EXTENSIONS) query. If the application
naively copies the string into a fixed-size buffer, it can overflow the
buffer and crash.
</p>
<p>
The work-around is to set the MESA_EXTENSION_MAX_YEAR environment variable
to the approximate release year of the game.
This will cause the glGetString(GL_EXTENSIONS) query to report only extensions
defined in or before the given year.
</p>
<p>
For example, if the game was released in 2001, do
<pre>
export MESA_EXTENSION_MAX_YEAR=2001
</pre>
before running the game.
</p>
<h2>Viewperf</h2>
<p>
See the <a href="viewperf.html">Viewperf issues</a> page for a detailed list
of Viewperf issues.
</p>
</div>
</body>
</html>
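
The MESA_EXTENSION_MAX_YEAR work-around described in the page above amounts to filtering the extension list by year before building the GL_EXTENSIONS string. The following is a minimal Python sketch of that idea, not Mesa's implementation. The function name `filter_extensions` and the (name, year) table are hypothetical and for illustration only, and the sketch assumes the cutoff includes the given year itself.

```python
def filter_extensions(extensions, max_year):
    """Build a space-separated extension string from (name, year) pairs.

    Only entries whose year is at or before max_year are kept, which keeps
    the resulting string short for applications that copy it into a
    fixed-size buffer.
    """
    return ' '.join(name for name, year in extensions if year <= max_year)


# Illustrative table; names and years are assumptions, not Mesa's real data.
EXTENSIONS = [
    ('GL_ARB_multitexture', 1998),
    ('GL_EXT_texture_env_add', 1999),
    ('GL_ARB_vertex_buffer_object', 2003),
    ('GL_ARB_framebuffer_object', 2008),
]

if __name__ == '__main__':
    # Corresponds to running the game with MESA_EXTENSION_MAX_YEAR=2001.
    print(filter_extensions(EXTENSIONS, 2001))
```

With a cutoff of 2001, only the first two entries of this illustrative table survive, so the string an old game has to copy is much shorter.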

@@ -7,6 +7,13 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Compilation and Installation using Autoconf</h1>
<ol>
@@ -17,11 +24,6 @@
<li><a href="#dri">DRI Driver Options</a></li>
<li><a href="#osmesa">OSMesa Driver Options</a></li>
</ul>
<li><p><a href="#library">Library Options</a>
<ul>
<li><a href="#glu">GLU</a></li>
</ul>
<li><p><a href="#demos">Demo Program Options</a>
</ol>
@@ -60,83 +62,89 @@ configuration run <code>make realclean</code> before rebuilding.
<p>
Some of the generic autoconf options are used with Mesa:
<ul>
<li><code>--prefix=PREFIX</code> - This is the root directory where
</p>
<dl>
<dt><code>--prefix=PREFIX</code></dt>
<dd><p>This is the root directory where
files will be installed by <code>make install</code>. The default is
<code>/usr/local</code>.
</li>
<li><code>--exec-prefix=EPREFIX</code> - This is the root directory
<code>/usr/local</code>.</p>
</dd>
<dt><code>--exec-prefix=EPREFIX</code></dt>
<dd><p>This is the root directory
where architecture-dependent files will be installed. In Mesa, this is
only used to derive the directory for the libraries. The default is
<code>${prefix}</code>.
</li>
<li><code>--libdir=LIBDIR</code> - This option specifies the directory
<code>${prefix}</code>.</p>
</dd>
<dt><code>--libdir=LIBDIR</code></dt>
<dd><p>This option specifies the directory
where the GL libraries will be installed. The default is
<code>${exec_prefix}/lib</code>. It also serves as the name of the
library staging area in the source tree. For instance, if the option
<code>--libdir=/usr/local/lib64</code> is used, the libraries will be
created in a <code>lib64</code> directory at the top of the Mesa source
tree.
</li>
<li><code>--enable-static, --disable-shared</code> - By default, Mesa
tree.</p>
</dd>
<dt><code>--enable-static, --disable-shared</code></dt>
<dd><p>By default, Mesa
will build shared libraries. Either of these options will force static
libraries to be built. It is not currently possible to build static and
shared libraries in a single pass.
</li>
<li><code>CC, CFLAGS, CXX, CXXFLAGS</code> - These environment variables
shared libraries in a single pass.</p>
</dd>
<dt><code>CC, CFLAGS, CXX, CXXFLAGS</code></dt>
<dd><p>These environment variables
control the C and C++ compilers used during the build. By default,
<code>gcc</code> and <code>g++</code> are used with the options
<code>"-g -O2"</code>.
</li>
<li><code>LDFLAGS</code> - An environment variable specifying flags to
<code>"-g -O2"</code>.</p>
</dd>
<dt><code>LDFLAGS</code></dt>
<dd><p>An environment variable specifying flags to
pass when linking programs. These are normally empty, but can be used
to direct the linker to use libraries in nonstandard directories. For
example, <code>LDFLAGS="-L/usr/X11R6/lib"</code>.
</li>
<li><code>PKG_CONFIG_PATH</code> - When available, the
example, <code>LDFLAGS="-L/usr/X11R6/lib"</code>.</p>
</dd>
<dt><code>PKG_CONFIG_PATH</code></dt>
<dd><p>When available, the
<code>pkg-config</code> utility is used to search for external libraries
on the system. This environment variable is used to control the search
path for <code>pkg-config</code>. For instance, setting
<code>PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig</code> will search for
package metadata in <code>/usr/X11R6</code> before the standard
directories.
</li>
</ul>
directories.</p>
</dd>
</dl>
<p>
There are also a few general options for altering the Mesa build:
<ul>
<li><code>--with-x</code> - When the X11 development libraries are
needed, the <code>pkg-config</code> utility <a href="#pkg-config">will
be used</a> for locating them. If they cannot be found through
<code>pkg-config</code> a fallback routine using <code>imake</code> will
be used. In this case, the <code>--with-x</code>,
<code>--x-includes</code> and <code>--x-libraries</code> options can
control the use of X for Mesa.
</li>
<li><code>--enable-gl-osmesa</code> - The <a href="osmesa.html">OSMesa
library</a> can be built on top of libGL for drivers that provide it.
This option controls whether to build libOSMesa. By default, this is
enabled for the Xlib driver and disabled otherwise. Note that this
option is different than using OSMesa as the driver.
</li>
<li><code>--enable-debug</code> - This option will enable compiler
options and macros to aid in debugging the Mesa libraries.
</li>
<li><code>--disable-asm</code> - There are assembly routines
</p>
<dl>
<dt><code>--enable-debug</code></dt>
<dd><p>This option will enable compiler
options and macros to aid in debugging the Mesa libraries.</p>
</dd>
<dt><code>--disable-asm</code></dt>
<dd><p>There are assembly routines
available for a few architectures. These will be used by default if
one of these architectures is detected. This option ensures that
assembly will not be used.
</li>
<li><code>--enable-32-bit, --enable-64-bit</code> - By default, the
build will compile code as directed by the environment variables
assembly will not be used.</p>
</dd>
<dt><code>--enable-32-bit</code></dt>
<dt><code>--enable-64-bit</code></dt>
<dd><p>By default, the build will compile code as directed by the environment
variables
<code>CC</code>, <code>CFLAGS</code>, etc. If the compiler is
<code>gcc</code>, these options offer a helper to add the compiler flags
to force 32- or 64-bit code generation as used on the x86 and x86_64
architectures.
</li>
</ul>
architectures. Note that these options are mutually exclusive.</p>
</dd>
</dl>
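<p>
As a rough sketch, the options above can be assembled into a
<code>./configure</code> command line. The helper name
<code>configure_args</code> and its parameters are hypothetical; only the
option strings come from the text above.
</p>

```python
def configure_args(prefix="/usr/local", libdir=None, debug=False,
                   disable_asm=False, bits=None):
    """Build an argument list for ./configure (hypothetical helper)."""
    if bits not in (None, 32, 64):
        raise ValueError("bits must be None, 32 or 64")
    args = ["./configure", "--prefix=" + prefix]
    if libdir is not None:
        args.append("--libdir=" + libdir)
    if debug:
        args.append("--enable-debug")
    if disable_asm:
        args.append("--disable-asm")
    if bits is not None:
        # --enable-32-bit and --enable-64-bit are mutually exclusive;
        # a single 'bits' value can never request both.
        args.append("--enable-%d-bit" % bits)
    return args
```

<p>
The resulting list could be passed to <code>subprocess.run(args, check=True)</code>
from the top of the source tree.
</p>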
<h2 id="driver">2. Driver Options</h2>
@@ -145,19 +153,19 @@ architectures.
There are several different driver modes that Mesa can use. These are
described in more detail in the <a href="install.html">basic
installation instructions</a>. The Mesa driver is controlled through the
configure option --with-driver. There are currently three supported
options in the configure script.
configure options <code>--enable-xlib-glx</code>, <code>--enable-osmesa</code>,
and <code>--enable-dri</code>.
</p>
<h3 id="xlib">Xlib</h3><p>This is the default mode for building Mesa.
<h3 id="xlib">Xlib</h3><p>
It uses Xlib as a software renderer to do all rendering. It corresponds
to the option <code>--with-driver=xlib</code>. The libX11 and libXext
to the option <code>--enable-xlib-glx</code>. The libX11 and libXext
libraries, as well as the X11 development headers, will be needed to
support the Xlib driver.
<h3 id="dri">DRI</h3><p>This mode uses the DRI hardware drivers for
accelerated OpenGL rendering. Enable the DRI drivers with the option
<code>--with-driver=dri</code>. See the <a href="install.html">basic
<code>--enable-dri</code>. See the <a href="install.html">basic
installation instructions</a> for details on prerequisites for the DRI
drivers.
@@ -197,7 +205,8 @@ and <code>/usr/local/lib</code>, respectively.
<h3 id="osmesa">OSMesa </h3><p> No libGL is built in this
mode. Instead, the driver code is built into the Off-Screen Mesa
(OSMesa) library. See the <a href="osmesa.html">Off-Screen Rendering</a>
page for more details.
page for more details. It corresponds to the option
<code>--enable-osmesa</code>.
<!-- OSMesa specific options -->
<dl>
@@ -219,31 +228,6 @@ libraries that will be built. More details on the specific GL libraries
can be found in the <a href="install.html">basic installation
instructions</a>.
<dl>
<dt id="glu">GLU <dd><p> The libGLU library will be built by default
on all drivers. This can be disabled with the option
<code>--disable-glu</code>.
</dl>
<h2 id="demos">4. Demo Program Options</h2>
<p>
There are many demonstration programs in the MesaDemos tarball. If the
programs are available when <code>./configure</code> is run, a subset of
the programs will be built depending on the driver and library options
chosen. See the directory <code>progs</code> for the full set of demos.
<dl>
<dt><code>--with-demos=DEMOS,DEMOS,...</code>
<dd><p> This option allows a
specific set of demo programs to be built. For example,
<code>--with-demos="xdemos,slang"</code>. Beware that if this option is
used, it will not be ensured that the necessary GL libraries will be
available.
<dt><code>--without-demos</code> <dd><p> This completely disables building the
demo programs. It is equivalent to <code>--with-demos=no</code>.
</dl>
</div>
</body>
</html>

@@ -1,33 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Banner</title>
<style type="text/css">
<!--
body { background: black; color: white }
h1 {
font: x-large sans-serif; text-align: center;
height: 75px; margin-left: 100px; margin-right: 100px }
.gears { width: 100px; height: 73px; float: left; background: url('gears.png') right no-repeat }
div + .gears { float: right; background-position: left }
/*
This should happen in the future instead:
h1 {
border-left: 71px solid #c11800; border-right: 71px solid #00c130;
border-top: 0px; border-bottom: 0px;
border-image: url(gears.png) 100%; -webkit-border-image: url(gears.png) 100%;
}
*/
-->
</style>
</head>
<body>
<div class="gears"></div>
<div class="gears"></div>
<h1>The Mesa 3D Graphics Library</h1>
</body>
</html>

@@ -7,18 +7,24 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Bug Database</h1>
<p>
The Mesa bug database is hosted on
<a href="http://freedesktop.org" target="_parent">freedesktop.org</a>.
<a href="http://freedesktop.org">freedesktop.org</a>.
The old bug database on SourceForge is no longer used.
</p>
<p>
To file a Mesa bug, go to
<a href="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"
target="_parent">
<a href="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa">
Bugzilla on freedesktop.org</a>
</p>
@@ -50,5 +56,6 @@ If your bug report is vague or your test program doesn't compile
easily, the problem may not be fixed very quickly.
</p>
</div>
</body>
</html>

@@ -7,6 +7,13 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Conformance</h1>
<p>
@@ -691,6 +698,6 @@ Conformx passed.
NOTE: conformx passes for all machine path levels (-p option).
</div>
</body>
</html>

@@ -25,62 +25,65 @@
<b>Documentation</b>
<ul>
<li><a href="intro.html" target="MainFrame">Introduction</a>
<li><a href="news.html" target="MainFrame">News</a>
<li><a href="developers.html" target="MainFrame">Developers</a>
<li><a href="systems.html" target="MainFrame">Platforms and Drivers</a>
<li><a href="license.html" target="MainFrame">License &amp; Copyright</a>
<li><a href="faq.html" target="MainFrame">FAQ</a>
<li><a href="relnotes.html" target="MainFrame">Release Notes</a>
<li><a href="thanks.html" target="MainFrame">Acknowledgements</a>
<li><a href="conform.html" target="MainFrame">Conformance Testing</a>
<li><a href="intro.html" target="_parent">Introduction</a>
<li><a href="index.html" target="_parent">News</a>
<li><a href="developers.html" target="_parent">Developers</a>
<li><a href="systems.html" target="_parent">Platforms and Drivers</a>
<li><a href="license.html" target="_parent">License &amp; Copyright</a>
<li><a href="faq.html" target="_parent">FAQ</a>
<li><a href="relnotes.html" target="_parent">Release Notes</a>
<li><a href="thanks.html" target="_parent">Acknowledgements</a>
<li><a href="conform.html" target="_parent">Conformance Testing</a>
<li>more docs below...
</ul>
<b>Download / Install</b>
<ul>
<li><a href="download.html" target="MainFrame">Downloading / Unpacking</a>
<li><a href="install.html" target="MainFrame">Compiling / Installing</a>
<li><a href="precompiled.html" target="MainFrame">Precompiled Libraries</a>
<li><a href="download.html" target="_parent">Downloading / Unpacking</a>
<li><a href="install.html" target="_parent">Compiling / Installing</a>
<ul>
<li><a href="autoconf.html" target="_parent">Autoconf</a></li>
</ul>
</li>
<li><a href="precompiled.html" target="_parent">Precompiled Libraries</a>
</ul>
<b>Resources</b>
<ul>
<li><a href="lists.html" target="MainFrame">Mailing Lists</a>
<li><a href="bugs.html" target="MainFrame">Bug Database</a>
<li><a href="webmaster.html" target="MainFrame">Webmaster</a>
<li><a href="lists.html" target="_parent">Mailing Lists</a>
<li><a href="bugs.html" target="_parent">Bug Database</a>
<li><a href="webmaster.html" target="_parent">Webmaster</a>
<li><a href="http://dri.freedesktop.org/" target="_parent">Mesa/DRI Wiki</a>
</ul>
<b>User Topics</b>
<ul>
<li><a href="egl.html" target="MainFrame">EGL</a>
<li><a href="opengles.html" target="MainFrame">OpenGL ES</a>
<li><a href="openvg.html" target="MainFrame">OpenVG / Vega</a>
<li><a href="envvars.html" target="MainFrame">Environment Variables</a>
<li><a href="osmesa.html" target="MainFrame">Off-Screen Rendering</a>
<li><a href="debugging.html" target="MainFrame">Debugging Tips</a>
<li><a href="perf.html" target="MainFrame">Performance Tips</a>
<li><a href="extensions.html" target="MainFrame">Mesa Extensions</a>
<li><a href="mangling.html" target="MainFrame">Function Name Mangling</a>
<li><a href="llvmpipe.html" target="MainFrame">Gallium llvmpipe driver</a>
<li><a href="vmware-guest.html" target="MainFrame">VMware SVGA3D guest driver</a>
<li><a href="postprocess.html" target="MainFrame">Gallium post-processing</a>
<li><a href="viewperf.html" target="MainFrame">Viewperf Issues</a>
<li><a href="shading.html" target="_parent">Shading Language</a>
<li><a href="egl.html" target="_parent">EGL</a>
<li><a href="opengles.html" target="_parent">OpenGL ES</a>
<li><a href="openvg.html" target="_parent">OpenVG / Vega</a>
<li><a href="envvars.html" target="_parent">Environment Variables</a>
<li><a href="osmesa.html" target="_parent">Off-Screen Rendering</a>
<li><a href="debugging.html" target="_parent">Debugging Tips</a>
<li><a href="perf.html" target="_parent">Performance Tips</a>
<li><a href="extensions.html" target="_parent">Mesa Extensions</a>
<li><a href="mangling.html" target="_parent">Function Name Mangling</a>
<li><a href="llvmpipe.html" target="_parent">Gallium llvmpipe driver</a>
<li><a href="vmware-guest.html" target="_parent">VMware SVGA3D guest driver</a>
<li><a href="postprocess.html" target="_parent">Gallium post-processing</a>
<li><a href="application-issues.html" target="_parent">Application Issues</a>
<li><a href="viewperf.html" target="_parent">Viewperf Issues</a>
</ul>
<b>Developer Topics</b>
<ul>
<li><a href="http://sourceforge.net/projects/mesa3d" target="_parent">SourceForge homepage</a>
<li><a href="repository.html" target="MainFrame">Source Code Repository</a>
<li><a href="sourcetree.html" target="MainFrame">Source Code Tree</a>
<li><a href="glu.html" target="MainFrame">SGI's GLU</a>
<li><a href="utilities.html" target="MainFrame">Utilities</a>
<li><a href="helpwanted.html" target="MainFrame">Help Wanted</a>
<li><a href="devinfo.html" target="MainFrame">Development Notes</a>
<li><a href="sourcedocs.html" target="MainFrame">Source Documentation</a>
<li><a href="subset.html" target="MainFrame">Mesa Subset Driver</a>
<li><a HREF="dispatch.html" target="MainFrame">GL Dispatch</a>
<li><a href="repository.html" target="_parent">Source Code Repository</a>
<li><a href="sourcetree.html" target="_parent">Source Code Tree</a>
<li><a href="utilities.html" target="_parent">Utilities</a>
<li><a href="helpwanted.html" target="_parent">Help Wanted</a>
<li><a href="devinfo.html" target="_parent">Development Notes</a>
<li><a href="sourcedocs.html" target="_parent">Source Documentation</a>
<li><a href="dispatch.html" target="_parent">GL Dispatch</a>
</ul>
<b>Links</b>
@@ -88,11 +91,6 @@
<li><a href="http://www.opengl.org" target="_parent">OpenGL website</a>
<li><a href="http://dri.freedesktop.org" target="_parent">DRI website</a>
<li><a href="http://www.freedesktop.org" target="_parent">freedesktop.org</a>
<li><a href="games.html" target="MainFrame">Games and Entertainment</a>
<li><a href="libraries.html" target="MainFrame">Libraries and Toolkits</a>
<li><a href="modelers.html" target="MainFrame">Modeling and Rendering</a>
<li><a href="science.html" target="MainFrame">Science and Technical</a>
<li><a href="utility.html" target="MainFrame">Utilities</a>
</ul>
<b>Hosted by:</b>

@@ -7,6 +7,13 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Debugging Tips</h1>
<p>
@@ -35,5 +42,6 @@
src/dlist.c for details.
</p>
</div>
</body>
</html>

@@ -7,13 +7,20 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Developers</h1>
<p>
Both professional and volunteer developers contribute to Mesa.
</p>
<p>
<a href="http://www.vmware.com/" target="_parent">VMware</a>
<a href="http://www.vmware.com/">VMware</a>
employs several of the main Mesa developers including Brian Paul
and Keith Whitwell.
</p>
@@ -31,13 +38,13 @@ including:
<p>
Other companies including
<a href="http://www.intellinuxgraphics.org/index.html" target="_parent">Intel</a>
<a href="http://www.intellinuxgraphics.org/index.html">Intel</a>
and RedHat also actively contribute to the project.
Intel has recently contributed the new GLSL compiler in Mesa 7.9.
</p>
<p>
<a href="http://www.lunarg.com/" target="_parent">LunarG</a> can be contacted
<a href="http://www.lunarg.com/">LunarG</a> can be contacted
for custom Mesa / 3D graphics development.
</p>
@@ -46,5 +53,6 @@ Volunteers have made significant contributions to all parts of Mesa, including
complete device drivers.
</p>
</div>
</body>
</html>

@@ -7,6 +7,13 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Development Notes</h1>
@@ -29,7 +36,7 @@ To add a new GL extension to Mesa you have to do at least the following.
</pre>
</li>
<li>
In the src/mesa/glapi/ directory, add the new extension functions and
In the src/mapi/glapi/gen/ directory, add the new extension functions and
enums to the gl_API.xml file.
Then, a bunch of source files must be regenerated by executing the
corresponding Python scripts.
@@ -148,6 +155,53 @@ of <tt>bool</tt>, <tt>true</tt>, and
src/mesa/state_tracker/st_glsl_to_tgsi.cpp can serve as examples.
</p>
<h2>Submitting patches</h2>
<p>
You should always run the Mesa Testsuite before submitting patches.
The Testsuite can be run using the 'make check' command. All tests
must pass before patches will be accepted; this may mean you have
to update the tests themselves.
</p>
<p>
Patches should be sent to the Mesa mailing list for review.
When submitting a patch make sure to use git send-email rather than attaching
patches to emails. Sending patches as attachments prevents people from being
able to provide in-line review comments.
</p>
<p>
When submitting follow-up patches you can use --in-reply-to to make v2, v3,
etc patches show up as replies to the originals. This usually works well
when you're sending out updates to individual patches (as opposed to
re-sending the whole series). Using --in-reply-to makes
it harder for reviewers to accidentally review old patches.
</p>
<h2>Marking a commit as a candidate for a stable branch</h2>
<p>
If you want a commit to be applied to a stable branch,
you should add an appropriate note to the commit message.
</p>
<p>
Here are some examples of such a note:
</p>
<ul>
<li>NOTE: This is a candidate for the 9.0 branch.</li>
<li>NOTE: This is a candidate for the 8.0 and 9.0 branches.</li>
<li>NOTE: This is a candidate for the stable branches.</li>
</ul>
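<p>
A note in this form is easy to pick up with a script. The sketch below is not
part of Mesa's tooling; the function name and the regular expression are
assumptions that only cover the three example wordings listed above.
</p>

```python
import re

# Matches the example note wordings; other phrasings are not recognised.
NOTE_RE = re.compile(
    r"^NOTE: This is a candidate for the (.+?) branch(?:es)?\.$",
    re.MULTILINE)


def stable_candidates(commit_message):
    """Return the branch versions named in a stable-candidate note.

    Returns None when the message has no such note, and an empty list
    for the generic "the stable branches" wording.
    """
    match = NOTE_RE.search(commit_message)
    if match is None:
        return None
    return re.findall(r"\d+\.\d+", match.group(1))
```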
<h2>Cherry-picking candidates for a stable branch</h2>
<p>
Please use <code>git cherry-pick -x &lt;commit&gt;</code> for cherry-picking a commit
from master to a stable branch.
</p>
<h2>Making a New Mesa Release</h2>
@@ -162,33 +216,22 @@ branch is relevant.
</p>
<h3>Verify and update version info</h3>
<dl>
<dt>configs/default</dt>
<dd>MESA_MAJOR, MESA_MINOR and MESA_TINY</dd>
<dt>Makefile.am</dt>
<dd>PACKAGE_VERSION</dd>
<dt>autoconf.ac</dt>
<dd>AC_INIT</dd>
<dt>src/mesa/main/version.h</dt>
<dd>MESA_MAJOR, MESA_MINOR, MESA_PATCH and MESA_VERSION_STRING</dd>
</dl>
<h3>Verify and update version info in VERSION</h3>
<p>
Create a docs/relnotes-x.y.z.html file.
The bin/shortlog_mesa.sh script can be used to create a HTML-formatted list
of changes to include in the file.
Link the new docs/relnotes-x.y.z.html file into the main <a href="relnotes.html">relnotes.html</a> file.
Create a docs/relnotes/x.y.z.html file.
The bin/bugzilla_mesa.sh and bin/shortlog_mesa.sh scripts can be used to
create the HTML-formatted lists of bugfixes and changes to include in the file.
Link the new docs/relnotes/x.y.z.html file into the main <a href="relnotes.html">relnotes.html</a> file.
</p>
<p>
Update <a href="news.html">docs/news.html</a>.
Update <a href="index.html">docs/index.html</a>.
</p>
<p>
Tag the files with the release name (in the form <b>mesa-x.y</b>)
with: <code>git tag -a mesa-x.y</code>
with: <code>git tag -s mesa-x.y -m "Mesa x.y Release"</code>
Then: <code>git push origin mesa-x.y</code>
</p>
@@ -197,13 +240,14 @@ Then: <code>git push origin mesa-x.y</code>
<p>
Make the distribution files. From inside the Mesa directory:
<pre>
./autogen.sh
make tarballs
</pre>
<p>
After the tarballs are created, the md5 checksums for the files will
be computed.
Add them to the docs/relnotes-x.y.html file.
Add them to the docs/relnotes/x.y.html file.
</p>
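<p>
If the checksums need to be recomputed by hand, a short script along these
lines would do. It is a sketch rather than the release tooling itself; the
function name and the "checksum, two spaces, file name" output layout are
assumptions.
</p>

```python
import hashlib
import os


def md5_line(path, chunk_size=1 << 20):
    """Return '<md5 hex digest>  <file name>' for one tarball."""
    digest = hashlib.md5()
    with open(path, "rb") as tarball:
        # Read in chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: tarball.read(chunk_size), b""):
            digest.update(chunk)
    return "%s  %s" % (digest.hexdigest(), os.path.basename(path))
```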
<p>
@@ -213,15 +257,18 @@ compile everything, and run some demos to be sure everything works.
<h3>Update the website and announce the release</h3>
<p>
Follow the directions on SourceForge for creating a new "release" and
uploading the tarballs.
Make a new directory for the release on annarchy.freedesktop.org with:
<br>
<code>
mkdir /srv/ftp.freedesktop.org/pub/mesa/x.y
</code>
</p>
<p>
Then upload the tarball files with:
<br>
<code>
rsync -avP ssh Mesa*-X.Y.* USERNAME@frs.sourceforge.net:uploads/
rsync -avP -e ssh MesaLib-x.y.* USERNAME@annarchy.freedesktop.org:/srv/ftp.freedesktop.org/pub/mesa/x.y/
</code>
</p>
@@ -237,11 +284,12 @@ sftp USERNAME,mesa3d@web.sourceforge.net
<p>
Make an announcement on the mailing lists:
<em>m</em><em>e</em><em>s</em><em>a</em><em>-</em><em>d</em><em>e</em><em>v</em><em>@</em><em>l</em><em>i</em><em>s</em><em>t</em><em>s</em><em>.</em><em>f</em><em>r</em><em>e</em><em>e</em><em>d</em><em>e</em><em>s</em><em>k</em><em>t</em><em>o</em><em>p</em><em>.</em><em>o</em><em>r</em><em>g</em>,
<em>m</em><em>e</em><em>s</em><em>a</em><em>-</em><em>u</em><em>s</em><em>e</em><em>r</em><em>s</em><em>@</em><em>l</em><em>i</em><em>s</em><em>t</em><em>s</em><em>.</em><em>f</em><em>r</em><em>e</em><em>e</em><em>d</em><em>e</em><em>s</em><em>k</em><em>t</em><em>o</em><em>p</em><em>.</em><em>o</em><em>r</em><em>g</em>
<em>mesa-dev@lists.freedesktop.org</em>,
<em>mesa-users@lists.freedesktop.org</em>
and
<em>m</em><em>e</em><em>s</em><em>a</em><em>-</em><em>a</em><em>n</em><em>n</em><em>o</em><em>u</em><em>n</em><em>c</em><em>e</em><em>@</em><em>l</em><em>i</em><em>s</em><em>t</em><em>s</em><em>.</em><em>f</em><em>r</em><em>e</em><em>e</em><em>d</em><em>e</em><em>s</em><em>k</em><em>t</em><em>o</em><em>p</em><em>.</em><em>o</em><em>r</em><em>g</em>
<em>mesa-announce@lists.freedesktop.org</em>
</p>
</div>
</body>
</html>

@@ -6,6 +6,14 @@
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>GL Dispatch in Mesa</h1>
<p>Several factors combine to make efficient dispatch of OpenGL functions
@@ -197,7 +205,7 @@ few preprocessor defines.</p>
<ul>
<li>If <tt>GLX_USE_TLS</tt> is defined, method #4 is used.</li>
<li>If <tt>PTHREADS</tt> is defined, method #3 is used.</li>
<li>If <tt>HAVE_PTHREAD</tt> is defined, method #3 is used.</li>
<li>If <tt>WIN32_THREADS</tt> is defined, method #2 is used.</li>
<li>If none of the preceding are defined, method #1 is used.</li>
</ul>
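<p>
The selection rules above can be read as a priority chain. The following
sketch models that reading; the function is purely illustrative (the real
choice is made by the C preprocessor), and treating the list as ordered from
highest to lowest priority is an assumption.
</p>

```python
def dispatch_method(defines):
    """Return the dispatch method number (1-4) for a set of defined macros."""
    if "GLX_USE_TLS" in defines:
        return 4
    if "HAVE_PTHREAD" in defines:
        return 3
    if "WIN32_THREADS" in defines:
        return 2
    # None of the threading-related macros is defined.
    return 1
```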
@@ -266,5 +274,6 @@ included.</p>
<h2 id="autogen">4. Automatic Generation of Dispatch Stubs</h2>
</div>
</body>
</html>

@@ -7,17 +7,23 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Downloading</h1>
<p>
Primary Mesa download site:
<a href="ftp://ftp.freedesktop.org/pub/mesa/"
target="_parent">freedesktop.org</a> (FTP)
<a href="ftp://ftp.freedesktop.org/pub/mesa/">freedesktop.org</a> (FTP)
</p>
<p>
When a new release is coming, release candidates (betas) may be found
<a href="ftp://ftp.freedesktop.org/pub/mesa/beta/" target="_parent">here</a>.
<a href="ftp://ftp.freedesktop.org/pub/mesa/beta/">here</a>.
</p>
@@ -69,7 +75,6 @@ docs/ - documentation
src/ - source code for libraries
src/mesa - sources for the main Mesa library and device drivers
src/gallium - sources for Gallium and Gallium drivers
src/glu - libGLU source code
src/glx - sources for building libGL with full GLX and DRI support
</pre>
@@ -80,24 +85,33 @@ instructions</a>.
</p>
<h1>Demos and GLUT</h1>
<h1>Demos, GLUT, and GLU</h1>
<p>
A package of SGI's GLU library is available
<a href="ftp://ftp.freedesktop.org/pub/mesa/glu/">here</a>
</p>
<p>
A package of Mark Kilgard's GLUT library is available
<a href="ftp://ftp.freedesktop.org/pub/mesa/glut/" target="_parent">here</a>
<a href="ftp://ftp.freedesktop.org/pub/mesa/glut/">here</a>
</p>
<p>
The Mesa demos collection is available
<a href="ftp://ftp.freedesktop.org/pub/mesa/demos/" target="_parent">here</a>
<a href="ftp://ftp.freedesktop.org/pub/mesa/demos/">here</a>
</p>
<p>
In the past, GLUT and the Mesa demos were released in conjunction with
Mesa releases. But since GLUT and the demos change infrequently, they
were split off some time ago.
In the past, GLUT, GLU and the Mesa demos were released in conjunction with
Mesa releases. But since GLUT, GLU and the demos change infrequently, they
were split off into their own git repositories:
<a href="http://cgit.freedesktop.org/mesa/glut/">GLUT</a>,
<a href="http://cgit.freedesktop.org/mesa/glu/">GLU</a> and
<a href="http://cgit.freedesktop.org/mesa/demos/">Demos</a>.
</p>
</div>
</body>
</html>

@@ -7,11 +7,18 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Mesa EGL</h1>
<p>The current version of EGL in Mesa implements EGL 1.4. More information
about EGL can be found at
<a href="http://www.khronos.org/egl/" target="_parent">
<a href="http://www.khronos.org/egl/">
http://www.khronos.org/egl/</a>.</p>
<p>Mesa's implementation of EGL uses a driver architecture. The main
@@ -53,28 +60,32 @@ or more EGL drivers.</p>
<p>There are several options that control the build of EGL at configuration
time</p>
<ul>
<li><code>--enable-egl</code>
<dl>
<dt><code>--enable-egl</code></dt>
<dd>
<p>By default, EGL is enabled. When disabled, the main library and the drivers
will not be built.</p>
</li>
</dd>
<li><code>--with-egl-driver-dir</code>
<dt><code>--with-egl-driver-dir</code></dt>
<dd>
<p>The directory EGL drivers should be installed to. If not specified, EGL
drivers will be installed to <code>${libdir}/egl</code>.</p>
</li>
</dd>
<li><code>--enable-gallium-egl</code>
<dt><code>--enable-gallium-egl</code></dt>
<dd>
<p>Enable the optional <code>egl_gallium</code> driver.</p>
</li>
</dd>
<li><code>--with-egl-platforms</code>
<dt><code>--with-egl-platforms</code></dt>
<dd>
<p>List the platforms (window systems) to support. Its argument is a comma
separated string such as <code>--with-egl-platforms=x11,drm</code>. It decides
@@ -88,30 +99,34 @@ types such as <code>EGLNativeDisplayType</code> or
only be built with SCons. Barring special needs, the build system should
select the right platforms automatically.</p>
</li>
</dd>
<li><code>--enable-gles1</code> and <code>--enable-gles2</code>
<dt><code>--enable-gles1</code></dt>
<dt><code>--enable-gles2</code></dt>
<dd>
<p>These options enable OpenGL ES support in OpenGL. The result is one big
internal library that supports multiple APIs.</p>
</li>
</dd>
<li><code>--enable-shared-glapi</code>
<dt><code>--enable-shared-glapi</code></dt>
<dd>
<p>By default, <code>libGL</code> has its own copy of <code>libglapi</code>.
This option makes <code>libGL</code> use the shared <code>libglapi</code>. This
is required if applications mix OpenGL and OpenGL ES.</p>
</li>
</dd>
<li><code>--enable-openvg</code>
<dt><code>--enable-openvg</code></dt>
<dd>
<p>OpenVG must be explicitly enabled by this option.</p>
</li>
</dd>
</ul>
</dl>
<h2>Use EGL</h2>
@@ -125,8 +140,9 @@ mesa/demos repository.</p>
<p>There are several environment variables that control the behavior of EGL at
runtime</p>
<ul>
<li><code>EGL_DRIVERS_PATH</code>
<dl>
<dt><code>EGL_DRIVERS_PATH</code></dt>
<dd>
<p>By default, the main library will look for drivers in the directory where
the drivers are installed to. This variable specifies a list of
@@ -144,18 +160,20 @@ may set</p>
<p>to test a build without installation</p>
</li>
</dd>
<li><code>EGL_DRIVER</code>
<dt><code>EGL_DRIVER</code></dt>
<dd>
<p>This variable specifies a full path to or the name of an EGL driver. It
forces the specified EGL driver to be loaded. It comes in handy when one wants
to test a specific driver. This variable is ignored for setuid/setgid
binaries.</p>
</li>
</dd>
<li><code>EGL_PLATFORM</code>
<dt><code>EGL_PLATFORM</code></dt>
<dd>
<p>This variable specifies the native platform. The valid values are the same
as those for <code>--with-egl-platforms</code>. When the variable is not set,
@@ -167,28 +185,31 @@ create displays for non-native platforms. These extensions are usually used by
applications that support non-native platforms. Setting this variable is
probably required only for some of the demos found in the mesa/demos repository.</p>
</li>
</dd>
<li><code>EGL_LOG_LEVEL</code>
<dt><code>EGL_LOG_LEVEL</code></dt>
<dd>
<p>This changes the log level of the main library and the drivers. The valid
values are: <code>debug</code>, <code>info</code>, <code>warning</code>, and
<code>fatal</code>.</p>
</li>
</dd>
<li><code>EGL_SOFTWARE</code>
<dt><code>EGL_SOFTWARE</code></dt>
<dd>
<p>For drivers that support both hardware and software rendering, setting this
variable to true forces the use of software rendering.</p>
</li>
</ul>
</dd>
</dl>
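<p>
When launching a test program from a script, these variables can be set on a
copy of the environment instead of being exported globally. The helper below
is a hypothetical sketch; only the variable names and the list of log levels
are taken from the descriptions above.
</p>

```python
import os

EGL_LOG_LEVELS = ("debug", "info", "warning", "fatal")


def egl_environment(log_level=None, platform=None, software=False, base=None):
    """Return a copy of the environment with EGL variables applied."""
    env = dict(os.environ if base is None else base)
    if log_level is not None:
        if log_level not in EGL_LOG_LEVELS:
            raise ValueError("unknown EGL_LOG_LEVEL: %r" % (log_level,))
        env["EGL_LOG_LEVEL"] = log_level
    if platform is not None:
        # Valid values are the same as for --with-egl-platforms.
        env["EGL_PLATFORM"] = platform
    if software:
        env["EGL_SOFTWARE"] = "true"
    return env
```

<p>
The returned dictionary can be passed as the <code>env</code> argument of
<code>subprocess.run</code> when starting the program under test.
</p>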
<h2>EGL Drivers</h2>
<ul>
<li><code>egl_dri2</code>
<dl>
<dt><code>egl_dri2</code></dt>
<dd>
<p>This driver supports both <code>x11</code> and <code>drm</code> platforms.
It functions as a DRI driver loader. For <code>x11</code> support, it talks to
@@ -196,9 +217,10 @@ the X server directly using (XCB-)DRI2 protocol.</p>
<p>This driver can share DRI drivers with <code>libGL</code>.</p>
</li>
</dd>
<li><code>egl_gallium</code>
<dt><code>egl_gallium</code></dt>
<dd>
<p>This driver is based on Gallium3D. It supports all rendering APIs and
hardware supported by Gallium3D. It is the only driver that supports OpenVG.
@@ -208,16 +230,17 @@ The supported platforms are X11, DRM, FBDEV, and GDI.</p>
(<code>pipe_&lt;hw&gt;</code>) and client API modules
(<code>st_&lt;api&gt;</code>).</p>
</li>
</dd>
<li><code>egl_glx</code>
<dt><code>egl_glx</code></dt>
<dd>
<p>This driver provides a wrapper to GLX. It uses exclusively GLX to implement
the EGL API. It supports both direct and indirect rendering when the GLX does.
It is accelerated when the GLX is. As such, it cannot provide functions that
are not available in GLX or GLX extensions.</p>
</li>
</ul>
</dd>
</dl>
<h2>Packaging</h2>
@@ -317,5 +340,6 @@ not be called with the sample display at the same time. If a driver has access
to an <code>EGLDisplay</code> without going through the EGL APIs, the driver
should as well lock the display before using it.
</div>
</body>
</html>

@@ -7,6 +7,13 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Environment Variables</h1>
<p>
@@ -25,6 +32,8 @@ sometimes be useful for debugging end-user issues.
<li>LIBGL_ALWAYS_INDIRECT - forces an indirect rendering context/connection.
<li>LIBGL_ALWAYS_SOFTWARE - if set, always use software rendering
<li>LIBGL_NO_DRAWARRAYS - if set do not use DrawArrays GLX protocol (for debugging)
<li>LIBGL_SHOW_FPS - print framerate to stdout based on the number of glXSwapBuffers
calls per second.
</ul>
@@ -62,9 +71,25 @@ If the extension string is too long, the buffer overrun can cause the game
to crash.
This is a work-around for that.
<li>MESA_GL_VERSION_OVERRIDE - changes the value returned by
glGetString(GL_VERSION). Valid values are point-separated version numbers,
such as "3.0". Mesa will not really implement all the features of the given
version if it's higher than what's normally reported.
glGetString(GL_VERSION) and possibly the GL API type.
<ul>
<li> The format should be MAJOR.MINOR[FC]
<li> FC is an optional suffix that indicates a forward compatible context.
This is only valid for versions &gt;= 3.0.
<li> GL versions &lt; 3.0 are set to a compatibility (non-Core) profile
<li> GL versions = 3.0, see below
<li> GL versions &gt; 3.0 are set to a Core profile
<li> Examples: 2.1, 3.0, 3.0FC, 3.1, 3.1FC
<ul>
<li> 2.1 - select a compatibility (non-Core) profile with GL version 2.1
<li> 3.0 - select a compatibility (non-Core) profile with GL version 3.0
<li> 3.0FC - select a Core+Forward Compatible profile with GL version 3.0
<li> 3.1 - select a Core profile with GL version 3.1
<li> 3.1FC - select a Core+Forward Compatible profile with GL version 3.1
</ul>
<li> Mesa may not really implement all the features of the given version.
(for developers only)
</ul>
<li>MESA_GLSL_VERSION_OVERRIDE - changes the value returned by
glGetString(GL_SHADING_LANGUAGE_VERSION). Valid values are integers, such as
"130". Mesa will not really implement all the features of the given language version
@@ -121,14 +146,13 @@ Mesa EGL supports different sets of environment variables. See the
<h2>Gallium environment variables</h2>
<ul>
<li>GALLIUM_HUD - draws various information on the screen, like framerate,
cpu load, driver statistics, performance counters, etc.
Set GALLIUM_HUD=help and run e.g. glxgears for more info.
<li>GALLIUM_LOG_FILE - specifies a file for logging all errors, warnings, etc.
rather than stderr.
<li>GALLIUM_PRINT_OPTIONS - if non-zero, print all the Gallium environment
variables which are used, and their current values.
<li>GALLIUM_NOSSE - if non-zero, do not use SSE runtime code generation for
shader execution
<li>GALLIUM_NOPPC - if non-zero, do not use PPC runtime code generation for
shader execution
<li>GALLIUM_DUMP_CPU - if non-zero, print information about the CPU on start-up
<li>TGSI_PRINT_SANITY - if set, do extra sanity checking on TGSI shaders and
print any errors to stderr.
@@ -136,6 +160,9 @@ Mesa EGL supports different sets of environment variables. See the
<LI>DRAW_NO_FSE - ???
<li>DRAW_USE_LLVM - if set to zero, the draw module will not use LLVM to execute
shaders, vertex fetch, etc.
<li>ST_DEBUG - controls debug output from the Mesa/Gallium state tracker.
Setting to "tgsi", for example, will print all the TGSI shaders.
See src/mesa/state_tracker/st_debug.c for other options.
</ul>
<h3>Softpipe driver environment variables</h3>
@@ -162,11 +189,22 @@ Mesa EGL supports different sets of environment variables. See the
cores present.
</ul>
<h3>VMware SVGA driver environment variables</h3>
<ul>
<li>SVGA_FORCE_SWTNL - force use of software vertex transformation
<li>SVGA_NO_SWTNL - don't allow software vertex transformation fallbacks
(will often result in incorrect rendering).
<li>SVGA_DEBUG - for dumping shaders, constant buffers, etc. See the code
for details.
<li>See the driver code for other, lesser-used variables.
</ul>
<p>
Other Gallium drivers have their own environment variables. These may change
frequently so the source code should be consulted for details.
</p>
</div>
</body>
</html>
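
The MESA_GL_VERSION_OVERRIDE rules listed in the Environment Variables diff above (format MAJOR.MINOR[FC], with the profile depending on the version) can be sketched as a small parser. This is only an illustration of the documented rules, not Mesa's actual parsing code; the function name `describe_override` and the profile labels are made up for this example.

```python
import re

# MAJOR.MINOR with an optional "FC" (forward compatible) suffix.
_OVERRIDE_RE = re.compile(r"(\d+)\.(\d+)(FC)?")


def describe_override(value):
    """Return (major, minor, profile) for an override string.

    Hypothetical helper: it mirrors the rules in the documentation text,
    not the code inside Mesa.
    """
    match = _OVERRIDE_RE.fullmatch(value)
    if match is None:
        raise ValueError("expected MAJOR.MINOR[FC], got %r" % value)

    major, minor = int(match.group(1)), int(match.group(2))
    forward_compatible = match.group(3) is not None
    version = (major, minor)

    # The FC suffix is only valid for versions >= 3.0.
    if forward_compatible and version < (3, 0):
        raise ValueError("FC suffix requires version >= 3.0")

    if version < (3, 0):
        profile = "compatibility"
    elif version == (3, 0):
        # 3.0 stays a compatibility profile unless FC is requested.
        profile = "core+forward-compatible" if forward_compatible else "compatibility"
    else:
        profile = "core+forward-compatible" if forward_compatible else "core"

    return major, minor, profile


for example in ("2.1", "3.0", "3.0FC", "3.1", "3.1FC"):
    print(example, "->", describe_override(example)[2])
```

The loop walks the same five examples the documentation lists, so the printed profiles can be compared against the list directly.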

View File

@@ -7,6 +7,13 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Mesa Extensions</h1>
<p>
@@ -16,20 +23,29 @@ The specifications follow.
<ul>
<li><a href="MESA_agp_offset.spec">MESA_agp_offset.spec</a>
<li><a href="MESA_copy_sub_buffer.spec">MESA_copy_sub_buffer.spec</a>
<li><a href="MESA_packed_depth_stencil.spec">MESA_packed_depth_stencil.spec</a>
<li><a href="MESA_pack_invert.spec">MESA_pack_invert.spec</a>
<li><a href="MESA_pixmap_colormap.spec">MESA_pixmap_colormap.spec</a>
<li><a href="MESA_release_buffers.spec">MESA_release_buffers.spec</a>
<li><a href="MESA_resize_buffers.spec">MESA_resize_buffers.spec</a>
<li><a href="MESA_set_3dfx_mode.spec">MESA_set_3dfx_mode.spec</a>
<li><a href="MESA_sprite_point.spec">MESA_sprite_point.spec</a> (obsolete)
<li><a href="MESA_texture_signed_rgba.spec">MESA_texture_signed_rgba.spec</a>
<li><a href="MESA_trace.spec">MESA_trace.spec</a> (obsolete)
<li><a href="MESA_window_pos.spec">MESA_window_pos.spec</a>
<li><a href="MESA_ycbcr_texture.spec">MESA_ycbcr_texture.spec</a>
<li><a href="specs/MESA_agp_offset.spec">MESA_agp_offset.spec</a>
<li><a href="specs/MESA_copy_sub_buffer.spec">MESA_copy_sub_buffer.spec</a>
<li><a href="specs/MESA_drm_image.spec">MESA_drm_image.spec</a>
<li><a href="specs/MESA_multithread_makecurrent.spec">MESA_multithread_makecurrent.spec</a>
<li><a href="specs/OLD/MESA_packed_depth_stencil.spec">MESA_packed_depth_stencil.spec</a> (obsolete)
<li><a href="specs/MESA_pack_invert.spec">MESA_pack_invert.spec</a>
<li><a href="specs/MESA_pixmap_colormap.spec">MESA_pixmap_colormap.spec</a>
<li><a href="specs/OLD/MESA_program_debug.spec">MESA_program_debug.spec</a> (obsolete)
<li><a href="specs/MESA_release_buffers.spec">MESA_release_buffers.spec</a>
<li><a href="specs/OLD/MESA_resize_buffers.spec">MESA_resize_buffers.spec</a> (obsolete)
<li><a href="specs/MESA_set_3dfx_mode.spec">MESA_set_3dfx_mode.spec</a>
<li><a href="specs/MESA_shader_debug.spec">MESA_shader_debug.spec</a>
<li><a href="specs/OLD/MESA_sprite_point.spec">MESA_sprite_point.spec</a> (obsolete)
<li><a href="specs/MESA_swap_control.spec">MESA_swap_control.spec</a>
<li><a href="specs/MESA_swap_frame_usage.spec">MESA_swap_frame_usage.spec</a>
<li><a href="specs/MESA_texture_array.spec">MESA_texture_array.spec</a>
<li><a href="specs/MESA_texture_signed_rgba.spec">MESA_texture_signed_rgba.spec</a>
<li><a href="specs/OLD/MESA_trace.spec">MESA_trace.spec</a> (obsolete)
<li><a href="specs/MESA_window_pos.spec">MESA_window_pos.spec</a>
<li><a href="specs/MESA_ycbcr_texture.spec">MESA_ycbcr_texture.spec</a>
<li><a href="specs/WL_bind_wayland_display.spec">WL_bind_wayland_display.spec</a>
</ul>
</div>
</body>
</html>

View File

@@ -7,9 +7,16 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<center>
<h1>Mesa Frequently Asked Questions</h1>
Last updated: 21 August 2006
Last updated: 9 October 2012
</center>
<br>
@@ -38,21 +45,25 @@ See the <a href="http://www.opengl.org/">OpenGL website</a> for more
information.
</p>
<p>
Mesa 6.x supports the OpenGL 1.5 specification.
Mesa 9.x supports the OpenGL 3.1 specification.
</p>
<h2>1.2 Does Mesa support/use graphics hardware?</h2>
<p>
Yes. Specifically, Mesa serves as the OpenGL core for the open-source DRI
drivers for XFree86/X.org. See the <a href="http://dri.freedesktop.org/">DRI
website</a> for more information.
</p>
<p>
There have been other hardware drivers for Mesa over the years (such as
the 3Dfx Glide/Voodoo driver, an old S3 driver, etc) but the DRI drivers
are the modern ones.
drivers for X.org.
</p>
<ul>
<li>See the <a href="http://dri.freedesktop.org/">DRI website</a>
for more information.</li>
<li>See <a href="http://intellinuxgraphics.org">intellinuxgraphics.org</a>
for more information about Intel drivers.</li>
<li>See <a href="http://nouveau.freedesktop.org">nouveau.freedesktop.org</a>
for more information about Nouveau drivers.</li>
<li>See <a href="http://www.x.org/wiki/RadeonFeature">www.x.org/wiki/RadeonFeature</a>
for more information about Radeon drivers.</li>
</ul>
<h2>1.3 What purpose does Mesa serve today?</h2>
<p>
@@ -61,7 +72,7 @@ operating systems today.
Still, Mesa serves at least these purposes:
</p>
<ul>
<li>Mesa is used as the core of the open-source XFree86/X.org DRI
<li>Mesa is used as the core of the open-source X.org DRI
hardware drivers.
</li>
<li>Mesa is quite portable and allows OpenGL to be used on systems
@@ -83,7 +94,7 @@ Still, Mesa serves at least these purposes:
</ul>
<h2>1.4 What's the difference between"Stand-Alone" Mesa and the DRI drivers?</h2>
<h2>1.4 What's the difference between "Stand-Alone" Mesa and the DRI drivers?</h2>
<p>
<em>Stand-alone Mesa</em> is the original incarnation of Mesa.
On systems running the X Window System it does all its rendering through
@@ -125,8 +136,7 @@ Just follow the Mesa <a href="install.html">compilation instructions</a>.
<h2>1.6 Are there other open-source implementations of OpenGL?</h2>
<p>
Yes, SGI's <a href="http://oss.sgi.com/projects/ogl-sample/index.html"
target="_parent">
Yes, SGI's <a href="http://oss.sgi.com/projects/ogl-sample/index.html">
OpenGL Sample Implementation (SI)</a> is available.
The SI was written during the time that OpenGL was originally designed.
Unfortunately, development of the SI has stagnated.
@@ -134,34 +144,33 @@ Mesa is much more up to date with modern features and extensions.
</p>
<p>
<a href="http://ogl-es.sourceforge.net" target="_parent">Vincent</a> is
<a href="http://sourceforge.net/projects/ogl-es/">Vincent</a> is
an open-source implementation of OpenGL ES for mobile devices.
<p>
<a href="http://www.dsbox.com/minigl.html" target="_parent">miniGL</a>
<a href="http://www.dsbox.com/minigl.html">miniGL</a>
is a subset of OpenGL for PalmOS devices.
<p>
<a href="http://fabrice.bellard.free.fr/TinyGL/"
target="_parent">TinyGL</a> is a subset of OpenGL.
<a href="http://bellard.org/TinyGL/">TinyGL</a>
is a subset of OpenGL.
</p>
<p>
<a href="http://softgl.studierstube.org/" target="_parent">SoftGL</a>
<a href="http://sourceforge.net/projects/softgl/">SoftGL</a>
is an OpenGL subset for mobile devices.
</p>
<p>
<a href="http://chromium.sourceforge.net/" target="_parent">Chromium</a>
<a href="http://chromium.sourceforge.net/">Chromium</a>
isn't a conventional OpenGL implementation (it's layered upon OpenGL),
but it does export the OpenGL API. It allows tiled rendering, sort-last
rendering, etc.
</p>
<p>
<a href="http://www.ticalc.org/archives/files/fileinfo/361/36173.html"
target="_parent">ClosedGL</a> is an OpenGL subset library for TI
graphing calculators.
<a href="http://www.ticalc.org/archives/files/fileinfo/361/36173.html">ClosedGL</a>
is an OpenGL subset library for TI graphing calculators.
</p>
<p>
@@ -211,8 +220,7 @@ GLw (OpenGL widget library) is now available from a separate <a href="http://cgi
<h2>2.5 What's the proper place for the libraries and headers?</h2>
<p>
On Linux-based systems you'll want to follow the
<a href="http://oss.sgi.com/projects/ogl-sample/ABI/index.html"
target="_parent">Linux ABI</a> standard.
<a href="http://oss.sgi.com/projects/ogl-sample/ABI/index.html">Linux ABI</a> standard.
Basically you'll want the following:
</p>
<ul>
@@ -226,21 +234,24 @@ Basically you'll want the following:
</li><li>/usr/lib/libGL.so.1 - a symlink to libGL.so.1.xyz
</li><li>/usr/lib/libGL.so.xyz - the actual OpenGL/Mesa library. xyz denotes the
Mesa version number.
</li><li>/usr/lib/libGLU.so - a symlink to libGLU.so.1
</li><li>/usr/lib/libGLU.so.1 - a symlink to libGLU.so.1.3.xyz
</li><li>/usr/lib/libGLU.so.xyz - the OpenGL Utility library. xyz denotes the Mesa
version number.
</li></ul>
<p>
After installing XFree86/X.org and the DRI drivers, some of these files
may be symlinks into the /usr/X11R6/ tree.
When configuring Mesa, there are three autoconf options that affect the install
location that you should take care with: <code>--prefix</code>,
<code>--libdir</code>, and <code>--with-dri-driverdir</code>. To install Mesa
into the system location where it will be available for all programs to use, set
<code>--prefix=/usr</code>. Set <code>--libdir</code> to where your Linux
distribution installs system libraries, usually either <code>/usr/lib</code> or
<code>/usr/lib64</code>. Set <code>--with-dri-driverdir</code> to the directory
where your Linux distribution installs DRI drivers. To find your system's DRI
driver directory, try executing <code>find /usr -type d -name dri</code>. For
example, if the <code>find</code> command listed <code>/usr/lib64/dri</code>,
then set <code>--with-dri-driverdir=/usr/lib64/dri</code>.
</p>
<p>
The old-style Makefile system doesn't install the Mesa libraries; it's
up to you to copy them (and the headers) to the right place.
</p>
<p>
The GLUT header and library should go in the same directories.
After determining the correct values for the install location, configure Mesa
with <code>./configure --prefix=/usr --libdir=xxx --with-dri-driverdir=xxx</code>
and then install with <code>sudo make install</code>.
</p>
<br>
<br>
@@ -250,24 +261,22 @@ The GLUT header and library should go in the same directories.
<h2>3.1 Rendering is slow / why isn't my graphics hardware being used?</h2>
<p>
Stand-alone Mesa (downloaded as MesaLib-x.y.z.tar.gz) doesn't have any
support for hardware acceleration (with the exception of the 3DFX Voodoo
driver).
</p>
<p>
What you really want is a DRI or NVIDIA (or another vendor's OpenGL) driver
for your particular hardware.
If Mesa can't use its hardware-accelerated drivers, it falls back on one of its software renderers
(e.g. classic swrast, softpipe or llvmpipe).
</p>
<p>
You can run the <code>glxinfo</code> program to learn about your OpenGL
library.
Look for the GL_VENDOR and GL_RENDERER values.
That will identify who's OpenGL library you're using and what sort of
Look for the <code>OpenGL vendor</code> and <code>OpenGL renderer</code> values.
That will identify whose OpenGL library and which driver you're using and what sort of
hardware it has detected.
</p>
<p>
If you're using a hardware accelerated driver you want <code>direct rendering: Yes</code>.
</p>
<p>
If your DRI-based driver isn't working, go to the
<a href="http://dri.sf.net/" target="_parent">DRI website</a> for trouble-shooting information.
<a href="http://dri.freedesktop.org/">DRI website</a> for trouble-shooting information.
</p>
@@ -275,8 +284,8 @@ If your DRI-based driver isn't working, go to the
<p>
Make sure the ratio of the far to near clipping planes isn't too great.
Look
<a href="http://www.opengl.org/resources/faq/technical/depthbuffer.htm#0040"
target="_parent"> here</a> for details.
<a href="http://www.opengl.org/resources/faq/technical/depthbuffer.htm#0040">here</a>
for details.
</p>
<p>
Mesa uses a 16-bit depth buffer by default which is smaller and faster
@@ -339,12 +348,11 @@ may introduce rasterization artifacts; see the leading comments in
<h2>4.1 How can I contribute?</h2>
<p>
First, join the <a href="http://www.mesa3d.org/lists.html">Mesa3d-dev
mailing list</a>.
First, join the <a href="lists.html">mesa-dev mailing list</a>.
That's where Mesa development is discussed.
</p>
<p>
The <a href="http://www.opengl.org/documentation" target="_parent">
The <a href="http://www.opengl.org/documentation">
OpenGL Specification</a> is the bible for OpenGL implementation work.
You should read it.
</p>
@@ -362,8 +370,8 @@ target hardware/operating system.
<p>
The best way to get started is to use an existing driver as your starting
point.
For a software driver, the X11 and OSMesa drivers are good examples.
For a hardware driver, the Radeon and R200 DRI drivers are good examples.
For a classic hardware driver, the i965 driver is a good example.
For a Gallium3D hardware driver, the r300g, r600g and the i915g are good examples.
</p>
<p>The DRI website has more information about writing hardware drivers.
The process isn't well documented because the Mesa driver interface changes
@@ -378,7 +386,7 @@ the archives) is a good way to get information.
<h2>4.3 Why isn't GL_EXT_texture_compression_s3tc implemented in Mesa?</h2>
<p>
The <a href="http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_compression_s3tc.txt" target="_parent">specification for the extension</a>
The <a href="http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_compression_s3tc.txt">specification for the extension</a>
indicates that there are intellectual property (IP) and/or patent issues
to be dealt with.
</p>
@@ -388,10 +396,10 @@ implement the extension (specifically the compression/decompression
algorithms).
</p>
<p>
In the mean time, a 3rd party <a href=
"http://dri.freedesktop.org/wiki/S3TC"
target="_parent">plug-in library</a> is available.
In the mean time, a 3rd party <a href="http://dri.freedesktop.org/wiki/S3TC">
plug-in library</a> is available.
</p>
</div>
</body>
</html>
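
The install-location advice in the FAQ diff above can be sketched as a helper that turns the DRI driver directory reported by `find /usr -type d -name dri` into a configure command line. The helper name `configure_args` is hypothetical, and taking `--libdir` to be the parent of the DRI directory is an assumption that matches the `/usr/lib64/dri` example but may not hold on every distribution.

```python
import posixpath


def configure_args(dri_driver_dir, prefix="/usr"):
    """Build a ./configure argument list from a DRI driver directory.

    Assumption (not stated in the FAQ): the system library directory is
    the parent of the DRI driver directory, e.g. /usr/lib64 for
    /usr/lib64/dri. Check this against your distribution first.
    """
    dri_driver_dir = dri_driver_dir.rstrip("/")
    libdir = posixpath.dirname(dri_driver_dir)
    return [
        "./configure",
        "--prefix=" + prefix,
        "--libdir=" + libdir,
        "--with-dri-driverdir=" + dri_driver_dir,
    ]


# Example using the directory from the FAQ text.
print(" ".join(configure_args("/usr/lib64/dri")))
```

Returning a list instead of a single string keeps each option separate, so the result can be passed to a process launcher without shell quoting.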

View File

@@ -1,64 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Games</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>
<h1>Games</h1>
<ul>
<li><a href="http://www.psc.edu/%7Esmp/a3d/">Asteroids3D</a> - 3D asteroids game
</li><li><a href="http://evlweb.eecs.uic.edu/aej/AndyBattalion.html" target="_parent">Battalion</a>
- battle game
</li><li><a href="http://bzflag.sourceforge.net/" target="_parent">BZFLAG</a> - 3-D tank
battle game
</li><li><a href="http://www.speakeasy.org/%7Emorse/copter-commander" target="_parent">Copter Commander</a> - 2d multiplayer side scroller
</li><li><a href="http://www.crystalspace.org/" target="_parent">CrystalSpace</a> - Free
3d game engine
</li><li><a href="http://www.afn.org/%7Ecthugha/" target="_parent">Cthugha</a> - music-sync'ed
graphical effects
</li><li><a href="http://www.sics.se/dive/" target="_parent">DIVE</a> - Distributed Interactive
Virtual Environment
</li><li><a href="http://www.newdoom.com/doomlegacy/" target="_parent">Doom Legacy</a>
- an OpenGL port of id software's popular game, Doom
</li><li><a href="http://www.asimov.de/intern_dropit.html" target="_parent">DropIt</a> - 3-D tetris game
</li><li><a href="http://www.flightgear.org/" target="_parent">Flight Gear</a> - Flight
simulator
</li><li><a href="http://freetrek.linuxgames.com/" target="_parent">Free Trek</a> - Star
Trek battle simulator
</li><li><a href="http://glchess.sourceforge.net/" target="_parent">GLChess</a> - chess game
</li><li><a href="http://heretic.linuxgames.com/" target="_parent">GLHeretic</a> - Heretic
for Linux
</li><li><a href="http://glider3d.free.fr/" target="_parent">Glider3D</a> - flight simulator
</li><li><a href="http://www.gltron.org/" target="_parent">glTron</a> - Tron lightcycles
game
</li><li><a href="http://gracer.sourceforge.net/" target="_parent">GRacer</a> - 3D Motor
Sports Simulator
</li><li><a href="http://jongl.home.pages.de/" target="_parent">JONGL</a> - Juggling simulator
</li><li><a href="http://samba.anu.edu.au/KnightCap/" target="_parent">KnightCap</a> -
chess game
</li><li><a href="http://www.hackcraft.de/games/linwarrior_3d/" target="_parent">LinWarrior 3D</a> - A Battle Mech Simulator
</li><li><a href="http://www.nada.kth.se/%7Ef96-lfo/lunar/" target="_parent">Lunar Lander
2000</a> - 3D version of the classis lunar lander game
</li><li><a href="http://www.majik3d.org/" target="_parent">Majik 3D</a> - an online role-playing
world
</li><li><a href="http://www.pobox.com/%7Eshankel/opentrek.html" target="_parent">OpenTrek</a>
- Super Star Trek
</li><li><a href="http://www.idsoftware.com/" target="_parent">Quake(2,3)</a> - the popular
games from id software
</li><li><a href="http://torcs.free.fr/indexm.html" target="_parent">TORCS</a> - car racing
simulator
</li><li><a href="http://www.woodsoup.org/projs/tux_aqfh" target="_parent">TUX-AQFH</a>
- Tux the Penguin - a Quest for Herring
</li><li><a href="http://mordred.8m.com/voidrunner/" target="_parent">Void Runner</a>
- freeware arcade style game
</li><li><a href="http://xracer.annexia.org/" target="_parent">XRacer</a> - Free spaceship
racing game, similar to Wipeout
</li>
</ul>
</body>
</html>

Binary file not shown (size before: 1.6 KiB; after: 3.4 KiB).

View File

@@ -1,46 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>SGI GLU</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>
<h1>SGI SI GLU</h1>
(Silicon Graphics, Inc. Sample Implementation of the OpenGL Utility library)
<p>
SGI open-sourced their OpenGL Sample Implementation (SI) in January, 2000.
This includes the GLU library.
</p>
<p>
The SI GLU library implements GLU version 1.3 whereas the original
Mesa GLU library only implemented version 1.2.
We recommend using the SI GLU library instead of Mesa's GLU library
since it's more up-to-date, complete and reliable.
We're no longer developing the original Mesa GLU library.
</p>
<p>
The SI GLU library code is included in the Mesa distribution.
You don't have to download it separately.
</p>
<p>
<b>Olivier Michel</b> has made Linux RPMs of GLU for i386 and PowerPC.
You can download them from the
<a href="http://www.sourceforge.net/project/showfiles.php?group_id=3"
target="_parent">download area</a> under <b>Miscellaneous</b>.
</p>
<p>
Visit the <a href="http://oss.sgi.com/projects/ogl-sample/" target="_parent">
OpenGL Sample Implementation home page</a> for more information about the SI.
</p>
</body>
</html>

View File

@@ -7,6 +7,13 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Help Wanted / To-Do List</h1>
<p>
@@ -17,12 +24,12 @@ Here are some specific ideas and areas where help would be appreciated:
<ol>
<li>
<b>Driver patching and testing.</b>
Patches are often posted to the <a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev" target="_parent">mesa-dev mailing list</a>, but aren't
Patches are often posted to the <a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev mailing list</a>, but aren't
immediately checked into git because not enough people are testing them.
Just applying patches, testing and reporting back is helpful.
<li>
<b>Driver debugging.</b>
There are plenty of open bugs in the <a href="https://bugs.freedesktop.org/describecomponents.cgi?product=Mesa" target="_parent">bug database</a>.
There are plenty of open bugs in the <a href="https://bugs.freedesktop.org/describecomponents.cgi?product=Mesa">bug database</a>.
<li>
<b>Remove aliasing warnings.</b>
Enable gcc -Wstrict-aliasing=2 -fstrict-aliasing and track down aliasing
@@ -31,13 +38,8 @@ issues in the code.
<b>Windows driver building, testing and maintenance.</b>
Fixing MSVC builds.
<li>
<b>Maintenance and testing of lesser-used drivers.</b>
Drivers such as i810, mach64, mga, r128, savage, sis, tdfx, unichrome, etc that aren't being maintained are being
deprecated starting in Mesa 8.0.<br>
They have to be ported to DRI2 to be accepted in mesa master again.
<li>
<b>Contribute more tests to
<a href="http://people.freedesktop.org/~nh/piglit/" target="_parent">Piglit</a>.</b>
<a href="http://piglit.freedesktop.org/">Piglit</a>.</b>
<li>
<b>Automatic testing.
</b>
@@ -46,6 +48,35 @@ the latest Mesa code and run tests (such as piglit) then report issues to
the mailing list.
</ol>
<p>
You can find some further To-do lists here:
</p>
<p>
<b>Common To-Do lists:</b>
</p>
<ul>
<li><a href="http://cgit.freedesktop.org/mesa/mesa/tree/docs/GL3.txt">
<b>GL3.txt</b></a> - Status of OpenGL 3.x / 4.x features in Mesa.</li>
<li><a href="http://dri.freedesktop.org/wiki/MissingFunctionality">
<b>MissingFunctionality</b></a> - Detailed information about missing OpenGL features.</li>
</ul>
<p>
<b>Driver specific To-Do lists:</b>
</p>
<ul>
<li><a href="http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/docs/llvm-todo.txt">
<b>LLVMpipe</b></a> - Software driver using LLVM for runtime code generation.</li>
<li><a href="http://dri.freedesktop.org/wiki/RadeonsiToDo">
<b>radeonsi</b></a> - Driver for AMD Southern Island.</li>
<li><a href="http://dri.freedesktop.org/wiki/R600ToDo">
<b>r600g</b></a> - Driver for ATI/AMD R600 - Northern Island.</li>
<li><a href="http://dri.freedesktop.org/wiki/R300ToDo">
<b>r300g</b></a> - Driver for ATI R300 - R500.</li>
<li><a href="http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/i915/TODO">
<b>i915g</b></a> - Driver for Intel i915/i945.</li>
</ul>
<p>
If you want to do something new in Mesa, first join the Mesa developer's
@@ -69,6 +100,6 @@ Finally:
<li>Test your code thoroughly. Include test programs if appropriate.
</ol>
</div>
</body>
</html>

File diff suppressed because it is too large.

View File

@@ -7,6 +7,13 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Compiling and Installing</h1>
<ol>
@@ -52,9 +59,9 @@ The following are required for DRI-based hardware acceleration with Mesa:
</p>
<ul>
<li><a href="http://xorg.freedesktop.org/releases/individual/proto/"
target="_parent">dri2proto</a> version 2.6 or later
<li><a href="http://dri.freedesktop.org/libdrm/" target="_parent">libDRM</a>
<li><a href="http://xorg.freedesktop.org/releases/individual/proto/">
dri2proto</a> version 2.6 or later
<li><a href="http://dri.freedesktop.org/libdrm/">libDRM</a>
version 2.4.33 or later
<li>Xorg server version 1.5 or later
<li>Linux 2.6.28 or later
@@ -151,9 +158,6 @@ You'll see a set of library files similar to this:
lrwxrwxrwx 1 brian users 10 Mar 26 07:53 libGL.so -> libGL.so.1*
lrwxrwxrwx 1 brian users 19 Mar 26 07:53 libGL.so.1 -> libGL.so.1.5.060100*
-rwxr-xr-x 1 brian users 3375861 Mar 26 07:53 libGL.so.1.5.060100*
lrwxrwxrwx 1 brian users 11 Mar 26 07:53 libGLU.so -> libGLU.so.1*
lrwxrwxrwx 1 brian users 20 Mar 26 07:53 libGLU.so.1 -> libGLU.so.1.3.060100*
-rwxr-xr-x 1 brian users 549269 Mar 26 07:53 libGLU.so.1.3.060100*
lrwxrwxrwx 1 brian users 14 Mar 26 07:53 libOSMesa.so -> libOSMesa.so.6*
lrwxrwxrwx 1 brian users 23 Mar 26 07:53 libOSMesa.so.6 -> libOSMesa.so.6.1.060100*
-rwxr-xr-x 1 brian users 23871 Mar 26 07:53 libOSMesa.so.6.1.060100*
@@ -162,8 +166,6 @@ lrwxrwxrwx 1 brian users 23 Mar 26 07:53 libOSMesa.so.6 -> libOSM
<p>
<b>libGL</b> is the main OpenGL library (i.e. Mesa).
<br>
<b>libGLU</b> is the OpenGL Utility library.
<br>
<b>libOSMesa</b> is the OSMesa (Off-Screen) interface library.
</p>
@@ -174,7 +176,6 @@ If you built the DRI hardware drivers, you'll also see the DRI drivers:
-rwxr-xr-x 1 brian users 16895413 Jul 21 12:11 i915_dri.so
-rwxr-xr-x 1 brian users 16895413 Jul 21 12:11 i965_dri.so
-rwxr-xr-x 1 brian users 11849858 Jul 21 12:12 r200_dri.so
-rwxr-xr-x 1 brian users 16050488 Jul 21 12:11 r300_dri.so
-rwxr-xr-x 1 brian users 11757388 Jul 21 12:12 radeon_dri.so
</pre>
@@ -205,6 +206,6 @@ For example, compiling and linking a GLUT application can be done with:
<br>
</div>
</body>
</html>

View File

@@ -7,11 +7,18 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Introduction</h1>
<p>
Mesa is an open-source implementation of the
<a href="http://www.opengl.org/" target="_parent">OpenGL</a> specification -
<a href="http://www.opengl.org/">OpenGL</a> specification -
a system for rendering interactive 3D graphics.
</p>
@@ -23,8 +30,8 @@ for modern GPUs.
<p>
Mesa ties into several other open-source projects: the
<a href="http://dri.freedesktop.org/" target="_parent">Direct Rendering
Infrastructure</a> and <a href="http://x.org" target="_parent">X.org</a> to
<a href="http://dri.freedesktop.org/">Direct Rendering
Infrastructure</a> and <a href="http://x.org">X.org</a> to
provide OpenGL support to users of X on Linux, FreeBSD and other operating
systems.
</p>
@@ -78,7 +85,7 @@ the OpenGL API, so they didn't feel threatened by the project.
1995-1996: I continue working on Mesa both during my spare time and during
my work hours at the Space Science and Engineering Center at the University
of Wisconsin in Madison. My supervisor, Bill Hibbard, lets me do this because
Mesa is now being used for the <a href="http://www.ssec.wisc.edu/%7Ebillh/vis.html" target="_parent">Vis5D</a> project.
Mesa is now being used for the <a href="http://www.ssec.wisc.edu/%7Ebillh/vis.html">Vis5D</a> project.
</p><p>
October 1996: Mesa 2.0 is released. It implements the OpenGL 1.1 specification.
</p>
@@ -135,7 +142,7 @@ and OpenGL Shading Language.
<p>
2008: Keith Whitwell and other Tungsten Graphics employees develop
<a href="http://en.wikipedia.org/wiki/Gallium3D" target="_parent">Gallium</a>
<a href="http://en.wikipedia.org/wiki/Gallium3D">Gallium</a>
- a new GPU abstraction layer. The latest Mesa drivers are based on
Gallium and other APIs such as OpenVG are implemented on top of Gallium.
</p>
@@ -166,6 +173,17 @@ of the OpenGL specification is implemented.
</p>
<h2>Version 9.x features</h2>
<p>
Version 9.x of Mesa implements the OpenGL 3.1 API.
While the driver for Intel Sandy Bridge and Ivy Bridge is the only
driver to support OpenGL 3.1, many developers across the open-source
community contributed features required for OpenGL 3.1. The primary
features added since the Mesa 8.0 release are
GL_ARB_texture_buffer_object and GL_ARB_uniform_buffer_object.
</p>
<h2>Version 8.x features</h2>
<p>
Version 8.x of Mesa implements the OpenGL 3.0 API.
@@ -216,7 +234,7 @@ GL_SRC2_ALPHA GL_SOURCE2_ALPHA
</pre>
<p>
See the
<a href="http://www.opengl.org/documentation/spec.html" target="_parent">
<a href="http://www.opengl.org/documentation/spec.html">
OpenGL specification</a> for more details.
</p>
@@ -332,6 +350,6 @@ features.
</ul>
</ul>
</div>
</body>
</html>

View File

@@ -1,58 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Libraries and Toolkits</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>
<h1>Libraries and Toolkits</h1>
<ul>
<li><a href="http://mrpowers.com/Apprentice/" target="_parent">Apprentice</a> - free OpenInventor work-alike
<li><a href="http://www.coin3d.org/coin.html" target="_parent">Coin</a> - OSS Open Inventor clone
<li><a href="http://www.softintegration.com/products/toolkit/opengl/" target="_parent">Ch</a> - OpenGL bindings for the Ch C/C++ interpreter
<li><a href="http://www.cfdrc.com/FOX/fox.html" target="_parent">FOX</a> - GUI Library
<li><a href="http://www.jausoft.com/gl4java.html" target="_parent">GL4Java</a> - a Java wrapper for OpenGL
<li><a href="http://www.student.oulu.fi/%7Ejlof/gtkglarea/" target="_parent">GtkGLArea</a> - OpenGL Gtk widget
<li><a href="http://www.ece.ucdavis.edu/%7Ekenelson/gtk-glarea/" target="_parent">GtkGLArea--</a> - OpenGL Gtk-- widget for C++
<li><a href="http://gtkpas.sourceforge.net/" target="_parent">GTKpas</a> - OpenGL Gtk widget for <a href="http://www.freepascal.org/" target="_parent">FreePascal</a>
<li><a href="http://freeglut.sourceforge.net/" target="_parent">FreeGLUT</a> - a GLUT work-alike
<li><a href="http://math.nist.gov/f90gl" target="_parent">Fortran77/90 bindings for OpenGL and Mesa</a> - by William Mitchell
<li><a href="http://glow.sourceforge.net/" target="_parent">GLOW</a> - a GUI toolkit for GLUT and OpenGL
<li><a href="http://www.nigels.com/glt/">Glt</a> - an OpenGL C++ toolkit
<li><a href="http://www.opengl.org/resources/libraries/glut.html" target="_parent">GLUT (GL Utility Toolkit)</a> - by Mark Kilgard
<li><a href="http://atrey.karlin.mff.cuni.cz/%7E0rfelyus/guileGL/" target="_parent">GuileGL</a> - OpenGL and GtkGLArea language bindings for Guile
<li><a href="http://www.rsinc.com/" target="_parent">IDL</a> - Interactive Data Language
<li><a href="http://www.newplanetsoftware.com/jx/" target="_parent">JX</a> - C++ application framework and GUI library
<li><a href="http://www.vrs3d.org/" target="_parent">MAM/VRS</a> - object-oriented toolkit for 3D graphics
<li><a href="http://www.jwdt.com/%7Epaysan/bigforth.html" target="_parent">MINOS</a> - GUI library
<li><a href="http://sourceforge.net/project/?group_id=2795" target="_parent">OglCLib</a> - C++ wrapper for OpenGL
<li><a href="http://oss.sgi.com/projects/inventor" target="_parent"> Open Inventor</a> - the Open Inventor toolkit from SGI
<li><a href="http://www.tgs.com/" target="_parent">Open Inventor</a> - the Open Inventor toolkit from Template Graphics Software, Inc.
<li><a href="http://openrm.sourceforge.net/" target="_parent">OpenRM</a>
- Open Source, multithreaded, parallel scene graph API
<li><a href="http://www.opensg.org/OpenSGPLUS/index.EN.html" target="_parent">
Open SG PLUS</a> - a scene-graph library
<li><a href="http://www.openscenegraph.org/" target="_parent">Open Scene Graph
</a> - a scene-graph library
<li><a href="http://www.openvrml.org/" target="_parent">OpenVRML</a>
- a VRML parsing/display library with "lookat" - an example VRML browser
<li><a href="http://plib.sourceforge.net/" target="_parent">PLIB</a> - A collection of portable games libraries, including an OpenGL GUI and a simple Scene Graph API
<li><a href="ftp://ftp.troll.no/contest/Pryan-1.2.tar.gz" target="_parent">Pryan</a> - an OpenInventor-like toolkit
<li><a href="http://starship.python.net:9673/crew/da/Code/PyOpenGL" target="_parent">PyOpenGL</a> - OpenGL interface for Python
<li><a href="http://www.quesa.org/" target="_parent">Quesa</a> - QuickDraw3D-compatible library based on OpenGL, Mesa or Direct3D
<li><a href="http://www.mesa3d.org/brianp/repgl.txt" target="_parent">repGL</a> - IRIS GL emulated with OpenGL
<li><a href="http://www.scitechsoft.com/dp_mgl.html" target="_parent">SciTech MGL</a> - A multiplatform (Windows, Linux, OS/2, DOS, QNX, SMX, RT-Target &amp; more) graphics library
<li><a href="http://sgl.sourceforge.net/" target="_parent">SGL</a> - a 3D Scene Graph Library
<li><a href="http://www.lal.in2p3.fr/SI/SoFree/" target="_parent">SoFree</a> - a free implementation of Open Inventor
<li><a href="http://togl.sourceforge.net/" target="_parent">Togl</a> - Tcl/Tk widget for OpenGL
<li><a href="http://www.int.com/" target="_parent">View3D Widget</a> - 3-D GUI widget
<li><a href="http://www.vtk.org/" target="_parent">VTK</a> - Visualization Toolkit
<li><a href="http://home.earthlink.net/%7Erzeh/YAJOGLB/doc/YAJOGLB.html" target="_parent">YAJOGL</a> - Yet Another Java GL Binding.
</ul>
</body>
</html>

View File

@@ -2,19 +2,26 @@
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>License / Cppyright Information</title>
<title>License / Copyright Information</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Disclaimer</h1>
<p>
Mesa is a 3-D graphics library with an API which is very similar to
that of <a href="http://www.opengl.org/" target="_parent">OpenGL</a>.*
that of <a href="http://www.opengl.org/">OpenGL</a>.*
To the extent that Mesa utilizes the OpenGL command syntax or state
machine, it is being used with authorization from <a
href="http://www.sgi.com/" target="_parent">Silicon Graphics,
href="http://www.sgi.com/">Silicon Graphics,
Inc.</a>(SGI). However, the author does not possess an OpenGL license
from SGI, and makes no claim that Mesa is in any way a compatible
replacement for OpenGL or associated with SGI. Those who want a
@@ -30,7 +37,7 @@ library</em>. <br>
<p>
* OpenGL is a trademark of <a href="http://www.sgi.com/"
target="_parent">Silicon Graphics Incorporated</a>.
>Silicon Graphics Incorporated</a>.
</p>
@@ -68,9 +75,10 @@ in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
BRIAN PAUL BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN
AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
</pre>
@@ -95,14 +103,12 @@ Device drivers src/mesa/drivers/* MIT, generally
Ext headers include/GL/glext.h Khronos
include/GL/glxext.h
SGI GLU library src/glu/sgi/ SGI Free B
</pre>
<p>
In general, consult the source files for license terms.
</p>
</div>
</body>
</html>

View File

@@ -7,6 +7,13 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Mailing Lists</h1>
@@ -14,26 +21,24 @@
</p>
<ul>
<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/mesa-users"
target="_parent">mesa-users</a> - intended for end-users of Mesa and DRI
drivers. Newbie questions are OK, but please try the general OpenGL
resources and Mesa/DRI documentation first.</p>
<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/mesa-users">mesa-users</a>
- intended for end-users of Mesa and DRI drivers. Newbie questions are OK,
but please try the general OpenGL resources and Mesa/DRI documentation first.</p>
</li>
<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev"
target="_parent">mesa-dev</a> - for Mesa, Gallium and DRI development
<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev</a>
- for Mesa, Gallium and DRI development
discussion. Not for beginners.</p>
</li>
<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/mesa-commit"
target="_parent">mesa-commit</a> - relays git check-in messages
(for developers).
<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/mesa-commit">mesa-commit</a>
- relays git check-in messages (for developers).
In general, people should not post to this list.</p>
</li>
<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/mesa-announce"
target="_parent">mesa-announce</a> - announcements of new Mesa
<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/mesa-announce">mesa-announce</a>
- announcements of new Mesa
versions are sent to this list. Very low traffic.</p>
</li>
<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/piglit"
target="_parent">piglit</a> - for Piglit (OpenGL driver testing framework) discussion.</p>
<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/piglit">piglit</a>
- for Piglit (OpenGL driver testing framework) discussion.</p>
</li>
</ul>
@@ -51,20 +56,24 @@ Follow the links above for list archives.
<p>
The old Mesa lists hosted at SourceForge are no longer in use.
The archives are still available, however:
<a href="http://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-announce" target="_parent">mesa3d-announce</a>,
<a href="http://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-users" target="_parent">mesa3d-users</a>,
<a href="http://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-dev" target="_parent">mesa3d-dev</a>.
<a href="http://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-announce">mesa3d-announce</a>,
<a href="http://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-users">mesa3d-users</a>,
<a href="http://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-dev">mesa3d-dev</a>.
</p>
<p>For mailing lists about Direct Rendering Modules (drm) in Linux/BSD
kernels, see the
<a href="http://dri.freedesktop.org/wiki/MailingLists" target="_parent">
DRI wiki</a>.
<a href="http://dri.freedesktop.org/wiki/MailingLists">DRI wiki</a>.
</p>
<br>
<h1>IRC</h1>
<p>Join the <a href="irc://chat.freenode.net#dri-devel">#dri-devel channel</a>
on <a href="http://webchat.freenode.net/">irc.freenode.net</a>.
</p>
<h1>OpenGL Forums</h1>
@@ -73,8 +82,8 @@ Here are some other OpenGL-related forums you might find useful:
</p>
<ul>
<li><a href="http://www.opengl.org/cgi-bin/ubb/ultimatebb.cgi"
target="_parent">OpenGL discussion forums</A> at www.opengl.org</li>
<li><a href="http://www.opengl.org/cgi-bin/ubb/ultimatebb.cgi">OpenGL discussion forums</a>
at www.opengl.org</li>
<li>Usenet newsgroups:
<ul>
<li>comp.graphics.algorithms</li>
@@ -83,5 +92,6 @@ target="_parent">OpenGL discussion forums</A> at www.opengl.org</li>
</ul>
</ul>
</div>
</body>
</html>

View File

@@ -7,6 +7,13 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Introduction</h1>
<p>
@@ -123,38 +130,38 @@ need to ask, don't even try it.
<h1>Profiling</h1>
To profile llvmpipe you should pass the options
<p>
To profile llvmpipe you should build as
</p>
<pre>
scons build=profile &lt;same-as-before&gt;
</pre>
<p>
This will ensure that frame pointers are used both in C and JIT functions, and
that no tail call optimizations are done by gcc.
</p>
To better profile JIT code you'll need to build LLVM with oprofile integration.
<h2>Linux perf integration</h2>
<p>
On Linux, it is possible to have symbol resolution of JIT code with <a href="http://perf.wiki.kernel.org/">Linux perf</a>:
</p>
<pre>
./configure \
--prefix=$install_dir \
--enable-optimized \
--disable-profiling \
--enable-targets=host-only \
--with-oprofile
make -C "$build_dir"
make -C "$build_dir" install
find "$install_dir/lib" -iname '*.a' -print0 | xargs -0 strip --strip-debug
perf record -g /my/application
perf report
</pre>
The you should define
<p>
When run inside Linux perf, llvmpipe creates a /tmp/perf-XXXXX.map file containing
a symbol address table. It also dumps assembly code to /tmp/perf-XXXXX.map.asm,
which can be used by the bin/perf-annotate-jit script to produce disassembly of
the generated code annotated with the samples.
</p>
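<p>
As a rough illustration of how such a map file can be consumed, the sketch below
parses text in the perf JIT map format (one <code>START SIZE symbolname</code>
entry per line, with START and SIZE in hexadecimal) and resolves a sampled
address to a symbol. The function names <code>parse_perf_map</code> and
<code>resolve</code>, and the symbol names in the usage note, are hypothetical and
are not part of Mesa or of the bin/perf-annotate-jit script.
</p>

```python
import bisect


def parse_perf_map(text):
    """Parse perf map text into a list of (start, size, name), sorted by start."""
    symbols = []
    for line in text.splitlines():
        # The symbol name is the rest of the line, so split at most twice.
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start = int(parts[0], 16)
        size = int(parts[1], 16)
        symbols.append((start, size, parts[2].strip()))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return the name of the symbol whose range covers addr, or None."""
    starts = [entry[0] for entry in symbols]
    # Index of the last symbol starting at or before addr.
    i = bisect.bisect_right(starts, addr) - 1
    if i >= 0:
        start, size, name = symbols[i]
        if start <= addr < start + size:
            return name
    return None
```

<p>
For example, after <code>syms = parse_perf_map(open(path).read())</code> on a map
file (the path is whatever /tmp/perf-XXXXX.map file was produced for the profiled
process), <code>resolve(syms, addr)</code> returns the JIT function covering a
sampled address, or <code>None</code> if the address lies outside every listed
range.
</p>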
<pre>
export LLVM=/path/to/llvm-2.6-profile
</pre>
and rebuild.
<p>You can obtain a call graph via
<a href="http://code.google.com/p/jrfonseca/wiki/Gprof2Dot#linux_perf">Gprof2Dot</a>.</p>
<h1>Unit testing</h1>
@@ -201,5 +208,6 @@ for posterior analysis, e.g.:
</li>
</ul>
</div>
</body>
</html>

View File

@@ -7,6 +7,13 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Function Name Mangling</h1>
<p>
@@ -26,6 +33,6 @@ For example:
CFLAGS += -DUSE_MGL_NAMESPACE
</pre>
</div>
</body>
</html>

View File

@@ -31,3 +31,33 @@ pre {
/*color: black;*/
}
iframe {
width: 19em;
height: 80em;
border: none;
float: left;
}
.content {
position: absolute;
left: 20em;
right: 10px;
overflow: hidden
}
.header {
background: black url('gears.png') 15px no-repeat;
margin:0;
padding: 5px;
clear:both;
}
.header h1 {
background: url('gears.png') right no-repeat;
color: white;
font: x-large sans-serif;
text-align: center;
height: 50px;
margin: 0;
padding-top: 30px;
}

View File

@@ -1,65 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Modelers, Renderers and Viewers</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>
<h1>Modelers, Renderers and Viewers</h1>
<ul>
<li><a href="http://www.aqsis.org/" target="_parent">Aqsis</a> - a RenderMan compatible renderer</li>
<li><a href="http://www.ac3d.org/" target="_parent">AC3D</a> - 3-D modeler
</li><li><a href="http://www.mediascape.com/" target="_parent">Artstream</a> - provides
functionality like Corel Draw and Illustrator
</li><li><a href="http://www.blender.org/" target="_parent">Blender</a> - 3-D animation
software
</li><li><a href="http://www.arq.net/%7Ekasten/demtools/" target="_parent">Demtools</a>
- Map viewer
</li><li><a href="http://www.holometric.de/dimension/" target="_parent">DIMENSION</a>
- freeform surface reconstruction
</li><li><a href="http://www.vectaport.com/vhclmaps/demviewer.html" target="_parent">demviewer</a>
- interactive terrain viewer
</li><li><a href="http://www.crc.ca/FreeWRL" target="_parent">FreeWRL</a> - VRML browser
</li><li><a href="http://www.geomview.org/" target="_parent">Geomview</a> - 3-D geometry
exploration
</li><li><a href="http://innovation3d.sourceforge.net/" target="_parent">Innovation3D</a>
- 3D modeling program
</li><li><a href="http://www.openvrml.org/" target="_parent">LibVRML97/Lookat</a>
- VRML viewer
</li><li><a href="http://aig.cs.man.ac.uk/systems/Maverik/" target="_parent">Maverik</a>
- VR graphics and interaction system
</li><li><a href="http://www.swissquake.ch/chumbalum-soft/md2v" target="_parent">MD2 Viewer</a>
- View .MD2 files
</li><li><a href="http://www.megacads.dlr.de/" target="_parent">MegaCads</a>
- Multiblock-Elliptic-Grid-Generation-And-CAD-System
</li><li><a href="http://www.swissquake.ch/chumbalum-soft/" target="_parent">MilkShape
3D</a> - 3D modeler/animator
</li><li><a href="http://mindseye.sourceforge.net/" target="_parent">Mindseye</a> - Rendering/Modeling
Package
</li><li><a href="http://www.neuralvr.com/" target="_parent">Pansophica</a> - Virtual Reality web organizer
</li><li><a href="http://www.sim.no/reducer.html" target="_parent">Rational Reducer</a>
- polygon reduction tool
</li><li><a href="http://www.cs.kuleuven.ac.be/cwis/research/graphics/RENDERPARK/" target="_parent">RenderPark</a>
- photorealistic rendering
</li><li><a href="http://www.hardgeus.com/revolution" target="_parent">Revolution 3D Engine</a>
- .3ds rendering engine
</li><li><a href="http://www.dgp.toronto.edu/%7Emjmcguff/eversion/" target="_parent">sphereEversion</a>
- inside-out sphere visualization
</li><li><a href="http://www.cs.kuleuven.ac.be/cwis/research/graphics/3DOM/" target="_parent">3Dom</a>
- 3-D modeler
</li><li><a href="http://www.microform.se/" target="_parent">VARKON</a> - product engineering,
design, modeling
</li><li><a href="http://www.sim.no/vrmlview.html" target="_parent">VRMLview</a> - VRML
model viewer
</li><li><a href="http://www.iicm.edu/vrwave/" target="_parent">VRWave</a> - a VRML 2.0
browser
</li><li><a href="http://www.csv.ica.uni-stuttgart.de/vrml/dune/" target="_parent">white_dune</a>
- graphical VRML97 Editor and animation tool
</li></ul>
</body>
</html>

File diff suppressed because it is too large

View File

@@ -7,11 +7,18 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>OpenGL ES</h1>
<p>Mesa implements OpenGL ES 1.1 and OpenGL ES 2.0. More information about
OpenGL ES can be found at <a href="http://www.khronos.org/opengles/"
target="_parent"> http://www.khronos.org/opengles/</a>.</p>
OpenGL ES can be found at <a href="http://www.khronos.org/opengles/">
http://www.khronos.org/opengles/</a>.</p>
<p>OpenGL ES depends on a working EGL implementation. Please refer to
<a href="egl.html">Mesa EGL</a> for more information about EGL.</p>
@@ -58,5 +65,6 @@ EGL drivers for your hardware.</p>
<p>Other than the last case, OpenGL ES uses <code>APIspec.xml</code> to generate functions that check and/or convert the arguments.</p>
</div>
</body>
</html>

View File

@@ -7,6 +7,13 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>OpenVG State Tracker</h1>
<p>
@@ -14,7 +21,7 @@ The current version of the OpenVG state tracker implements OpenVG 1.1.
</p>
<p>
More information about OpenVG can be found at
<a href="http://www.khronos.org/openvg/" target="_parent">
<a href="http://www.khronos.org/openvg/">
http://www.khronos.org/openvg/</a> .
</p>
<p>
@@ -47,5 +54,6 @@ or more EGL drivers.</p>
<p>OpenVG demos can be found in mesa/demos repository.</p>
</div>
</body>
</html>

View File

@@ -7,83 +7,75 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Off-screen Rendering</h1>
<p>
Mesa's off-screen rendering interface is used for rendering into
user-allocated blocks of memory.
Mesa's off-screen interface is used for rendering into user-allocated memory
without any sort of window system or operating system dependencies.
That is, the GL_FRONT colorbuffer is actually a buffer in main memory,
rather than a window on your display.
There are no window system or operating system dependencies.
One potential application is to use Mesa as an off-line, batch-style renderer.
</p>
<p>
The <b>OSMesa</b> API provides three basic functions for making off-screen
The OSMesa API provides three basic functions for making off-screen
renderings: OSMesaCreateContext(), OSMesaMakeCurrent(), and
OSMesaDestroyContext(). See the Mesa/include/GL/osmesa.h header for
more information about the API functions.
</p>
<p>
There are several examples of OSMesa in the <code>progs/osdemos/</code>
directory.
The OSMesa interface may be used with any of three software renderers:
</p>
<ol>
<li>llvmpipe - this is the high-performance Gallium LLVM driver
<li>softpipe - this is the reference Gallium software driver
<li>swrast - this is the legacy Mesa software rasterizer
</ol>
<h2>Deep color channels</h2>
<p>
For some applications 8-bit color channels don't have sufficient
precision.
OSMesa supports 16-bit and 32-bit color channels through the OSMesa interface.
When using 16-bit channels, channels are GLushorts and RGBA pixels occupy
8 bytes.
When using 32-bit channels, channels are GLfloats and RGBA pixels occupy
16 bytes.
There are several examples of OSMesa in the mesa/demos repository.
</p>
<p>
Before version 6.5.1, Mesa had to be recompiled to support exactly
one of 8, 16 or 32-bit channels.
With Mesa 6.5.1, Mesa can be compiled for either 8, 16 or 32-bit channels
and render into any of the smaller size channels.
For example, if Mesa's compiled for 32-bit channels, you can also render
16 and 8-bit channel images.
</p>
<h1>Building OSMesa</h1>
<p>
To build Mesa/OSMesa for 16 and 8-bit color channel support:
Configure and build Mesa with something like:
<pre>
make realclean
make linux-osmesa16
configure --enable-osmesa --disable-driglx-direct --disable-dri --with-gallium-drivers=swrast
make
</pre>
<p>
To build Mesa/OSMesa for 32, 16 and 8-bit color channel support:
Make sure you have LLVM installed first if you want to use the llvmpipe driver.
</p>
<p>
When the build is complete you should find:
</p>
<pre>
make realclean
make linux-osmesa32
lib/libOSMesa.so (swrast-based OSMesa)
lib/gallium/libOSMesa.so (gallium-based OSMesa)
</pre>
<p>
You'll wind up with a library named libOSMesa16.so or libOSMesa32.so.
Otherwise, most Mesa configurations build an 8-bit/channel libOSMesa.so library
by default.
Set your LD_LIBRARY_PATH to point to one directory or the other to select
the library you want to use.
</p>
<p>
If performance is important, compile Mesa for the channel size you're
most interested in.
</p>
<p>
If you need to compile on a non-Linux platform, copy Mesa/configs/linux-osmesa16
to a new config file and edit it as needed. Then, add the new config name to
the top-level Makefile. Send a patch to the Mesa developers too, if you're
inclined.
When you link your application, link with -lOSMesa
</p>
</div>
</body>
</html>

View File

@@ -7,6 +7,13 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Performance Tips</h1>
<p>
@@ -64,6 +71,6 @@ Performance tips for software rendering:
command glEnable(GL_DITHER) will be ignored.
</ol>
</div>
</body>
</html>

View File

@@ -7,6 +7,13 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Gallium Post-processing</h1>
<p>
@@ -52,6 +59,6 @@ Numbers higher than 8 see minimizing gains.
<br>
<br>
</div>
</body>
</html>

View File

@@ -7,6 +7,13 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Precompiled Libraries</h1>
<p>
@@ -17,5 +24,6 @@ However, some Linux distros (such as Ubuntu) seem to closely track
Mesa and often have the latest Mesa release available as an update.
</p>
</div>
</body>
</html>

View File

@@ -1,56 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>
<h1>Mesa 8.1 Release Notes / date TBD</h1>
<p>
Mesa 8.1 is a new development release.
</p>
<p>
Mesa 8.1 implements the OpenGL 3.0 API, but the version reported by
glGetString(GL_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.0.
</p>
<h2>MD5 checksums</h2>
<pre>
tbd
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>GL_ARB_base_instance extension</li>
<li>GL_NV_read_buffer extension for ES 2.0</li>
<li>GL_ARB_shader_bit_encoding</li>
<li>GL_EXT_unpack_subimage for ES 2.0</li>
<li>GL_EXT_read_format_bgra for ES 1.1 and 2.0</li>
<li>GL_ARB_debug_output</li>
</ul>
<h2>Bug fixes</h2>
<p>TBD -- This list is likely incomplete.</p>
<h2>Changes</h2>
<p>TBD</p>
</body>
</html>

View File

@@ -7,6 +7,13 @@
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Release Notes</h1>
<p>
@@ -14,50 +21,69 @@ The release notes summarize what's new or changed in each Mesa release.
</p>
<ul>
<li><a href="relnotes-8.1.html">8.1 release notes</a>
<li><a href="relnotes-8.0.3.html">8.0.3 release notes</a>
<li><a href="relnotes-8.0.2.html">8.0.2 release notes</a>
<li><a href="relnotes-8.0.1.html">8.0.1 release notes</a>
<li><a href="relnotes-8.0.html">8.0 release notes</a>
<li><a href="relnotes-7.11.html">7.11 release notes</a>
<li><a href="relnotes-7.10.3.html">7.10.3 release notes</a>
<li><a href="relnotes-7.10.2.html">7.10.2 release notes</a>
<li><a href="relnotes-7.10.1.html">7.10.1 release notes</a>
<li><a href="relnotes-7.10.html">7.10 release notes</a>
<li><a href="relnotes-7.9.2.html">7.9.2 release notes</a>
<li><a href="relnotes-7.9.1.html">7.9.1 release notes</a>
<li><a href="relnotes-7.9.html">7.9 release notes</a>
<li><a href="relnotes-7.8.3.html">7.8.3 release notes</a>
<li><a href="relnotes-7.8.2.html">7.8.2 release notes</a>
<li><a href="relnotes-7.8.1.html">7.8.1 release notes</a>
<li><a href="relnotes-7.8.html">7.8 release notes</a>
<li><a href="relnotes-7.7.1.html">7.7.1 release notes</a>
<li><a href="relnotes-7.7.html">7.7 release notes</a>
<li><a href="relnotes-7.6.1.html">7.6.1 release notes</a>
<li><a href="relnotes-7.6.html">7.6 release notes</a>
<li><a href="relnotes-7.5.2.html">7.5.2 release notes</a>
<li><a href="relnotes-7.5.1.html">7.5.1 release notes</a>
<li><a href="relnotes-7.5.html">7.5 release notes</a>
<li><a href="relnotes-7.4.4.html">7.4.4 release notes</a>
<li><a href="relnotes-7.4.3.html">7.4.3 release notes</a>
<li><a href="relnotes-7.4.2.html">7.4.2 release notes</a>
<li><a href="relnotes-7.4.1.html">7.4.1 release notes</a>
<li><a href="relnotes-7.4.html">7.4 release notes</a>
<li><a href="relnotes-7.3.html">7.3 release notes</a>
<li><a href="relnotes-7.2.html">7.2 release notes</a>
<li><a href="relnotes-7.1.html">7.1 release notes</a>
<li><a href="relnotes-7.0.4.html">7.0.4 release notes</a>
<li><a href="relnotes-7.0.3.html">7.0.3 release notes</a>
<li><a href="relnotes-7.0.2.html">7.0.2 release notes</a>
<li><a href="relnotes-7.0.1.html">7.0.1 release notes</a>
<li><a href="relnotes-7.0.html">7.0 release notes</a>
<li><a href="relnotes-6.5.3.html">6.5.3 release notes</a>
<li><a href="relnotes-6.5.2.html">6.5.2 release notes</a>
<li><a href="relnotes-6.5.1.html">6.5.1 release notes</a>
<li><a href="relnotes-6.5.html">6.5 release notes</a>
<li><a href="relnotes-6.4.2.html">6.4.2 release notes</a>
<li><a href="relnotes-6.4.1.html">6.4.1 release notes</a>
<li><a href="relnotes-6.4.html">6.4 release notes</a>
<li><a href="relnotes/10.0.html">10.0 release notes</a>
<li><a href="relnotes/9.2.2.html">9.2.2 release notes</a>
<li><a href="relnotes/9.2.1.html">9.2.1 release notes</a>
<li><a href="relnotes/9.2.html">9.2 release notes</a>
<li><a href="relnotes/9.1.7.html">9.1.7 release notes</a>
<li><a href="relnotes/9.1.6.html">9.1.6 release notes</a>
<li><a href="relnotes/9.1.5.html">9.1.5 release notes</a>
<li><a href="relnotes/9.1.4.html">9.1.4 release notes</a>
<li><a href="relnotes/9.1.3.html">9.1.3 release notes</a>
<li><a href="relnotes/9.1.2.html">9.1.2 release notes</a>
<li><a href="relnotes/9.1.1.html">9.1.1 release notes</a>
<li><a href="relnotes/9.1.html">9.1 release notes</a>
<li><a href="relnotes/9.0.3.html">9.0.3 release notes</a>
<li><a href="relnotes/9.0.2.html">9.0.2 release notes</a>
<li><a href="relnotes/9.0.1.html">9.0.1 release notes</a>
<li><a href="relnotes/9.0.html">9.0 release notes</a>
<li><a href="relnotes/8.0.5.html">8.0.5 release notes</a>
<li><a href="relnotes/8.0.4.html">8.0.4 release notes</a>
<li><a href="relnotes/8.0.3.html">8.0.3 release notes</a>
<li><a href="relnotes/8.0.2.html">8.0.2 release notes</a>
<li><a href="relnotes/8.0.1.html">8.0.1 release notes</a>
<li><a href="relnotes/8.0.html">8.0 release notes</a>
<li><a href="relnotes/7.11.2.html">7.11.2 release notes</a>
<li><a href="relnotes/7.11.1.html">7.11.1 release notes</a>
<li><a href="relnotes/7.11.html">7.11 release notes</a>
<li><a href="relnotes/7.10.3.html">7.10.3 release notes</a>
<li><a href="relnotes/7.10.2.html">7.10.2 release notes</a>
<li><a href="relnotes/7.10.1.html">7.10.1 release notes</a>
<li><a href="relnotes/7.10.html">7.10 release notes</a>
<li><a href="relnotes/7.9.2.html">7.9.2 release notes</a>
<li><a href="relnotes/7.9.1.html">7.9.1 release notes</a>
<li><a href="relnotes/7.9.html">7.9 release notes</a>
<li><a href="relnotes/7.8.3.html">7.8.3 release notes</a>
<li><a href="relnotes/7.8.2.html">7.8.2 release notes</a>
<li><a href="relnotes/7.8.1.html">7.8.1 release notes</a>
<li><a href="relnotes/7.8.html">7.8 release notes</a>
<li><a href="relnotes/7.7.1.html">7.7.1 release notes</a>
<li><a href="relnotes/7.7.html">7.7 release notes</a>
<li><a href="relnotes/7.6.1.html">7.6.1 release notes</a>
<li><a href="relnotes/7.6.html">7.6 release notes</a>
<li><a href="relnotes/7.5.2.html">7.5.2 release notes</a>
<li><a href="relnotes/7.5.1.html">7.5.1 release notes</a>
<li><a href="relnotes/7.5.html">7.5 release notes</a>
<li><a href="relnotes/7.4.4.html">7.4.4 release notes</a>
<li><a href="relnotes/7.4.3.html">7.4.3 release notes</a>
<li><a href="relnotes/7.4.2.html">7.4.2 release notes</a>
<li><a href="relnotes/7.4.1.html">7.4.1 release notes</a>
<li><a href="relnotes/7.4.html">7.4 release notes</a>
<li><a href="relnotes/7.3.html">7.3 release notes</a>
<li><a href="relnotes/7.2.html">7.2 release notes</a>
<li><a href="relnotes/7.1.html">7.1 release notes</a>
<li><a href="relnotes/7.0.4.html">7.0.4 release notes</a>
<li><a href="relnotes/7.0.3.html">7.0.3 release notes</a>
<li><a href="relnotes/7.0.2.html">7.0.2 release notes</a>
<li><a href="relnotes/7.0.1.html">7.0.1 release notes</a>
<li><a href="relnotes/7.0.html">7.0 release notes</a>
<li><a href="relnotes/6.5.3.html">6.5.3 release notes</a>
<li><a href="relnotes/6.5.2.html">6.5.2 release notes</a>
<li><a href="relnotes/6.5.1.html">6.5.1 release notes</a>
<li><a href="relnotes/6.5.html">6.5 release notes</a>
<li><a href="relnotes/6.4.2.html">6.4.2 release notes</a>
<li><a href="relnotes/6.4.1.html">6.4.1 release notes</a>
<li><a href="relnotes/6.4.html">6.4 release notes</a>
</ul>
<p>
@@ -66,31 +92,33 @@ Versions of Mesa prior to 6.4 are summarized in the
</p>
<ul>
<li><a href="RELNOTES-6.3.2">RELNOTES-6.3.2</a>
<li><a href="RELNOTES-6.3">RELNOTES-6.3</a>
<li><a href="RELNOTES-6.2.1">RELNOTES-6.2.1</a>
<li><a href="RELNOTES-6.2">RELNOTES-6.2</a>
<li><a href="RELNOTES-6.1">RELNOTES-6.1</a>
<li><a href="RELNOTES-6.0">RELNOTES-6.0</a>
<li><a href="RELNOTES-5.1">RELNOTES-5.1</a>
<li><a href="RELNOTES-5.0.2">RELNOTES-5.0.2</a>
<li><a href="RELNOTES-5.0.1">RELNOTES-5.0.1</a>
<li><a href="RELNOTES-5.0">RELNOTES-5.0</a>
<li><a href="RELNOTES-4.1">RELNOTES-4.1</a>
<li><a href="RELNOTES-4.0.3">RELNOTES-4.0.3</a>
<li><a href="RELNOTES-4.0.2">RELNOTES-4.0.2</a>
<li><a href="RELNOTES-4.0.1">RELNOTES-4.0.1</a>
<li><a href="RELNOTES-4.0">RELNOTES-4.0</a>
<li><a href="RELNOTES-3.5">RELNOTES-3.5</a>
<li><a href="RELNOTES-3.4.2">RELNOTES-3.4.2</a>
<li><a href="RELNOTES-3.4.1">RELNOTES-3.4.1</a>
<li><a href="RELNOTES-3.4">RELNOTES-3.4</a>
<li><a href="RELNOTES-3.3">RELNOTES-3.3</a>
<li><a href="RELNOTES-3.2.1">RELNOTES-3.2.1</a>
<li><a href="RELNOTES-3.2">RELNOTES-3.2</a>
<li><a href="RELNOTES-3.1">RELNOTES-3.1</a>
<li><a href="relnotes/6.3.2">6.3.2 release notes</a>
<li><a href="relnotes/6.3.1">6.3.1 release notes</a>
<li><a href="relnotes/6.3">6.3 release notes</a>
<li><a href="relnotes/6.2.1">6.2.1 release notes</a>
<li><a href="relnotes/6.2">6.2 release notes</a>
<li><a href="relnotes/6.1">6.1 release notes</a>
<li><a href="relnotes/6.0.1">6.0.1 release notes</a>
<li><a href="relnotes/6.0">6.0 release notes</a>
<li><a href="relnotes/5.1">5.1 release notes</a>
<li><a href="relnotes/5.0.2">5.0.2 release notes</a>
<li><a href="relnotes/5.0.1">5.0.1 release notes</a>
<li><a href="relnotes/5.0">5.0 release notes</a>
<li><a href="relnotes/4.1">4.1 release notes</a>
<li><a href="relnotes/4.0.3">4.0.3 release notes</a>
<li><a href="relnotes/4.0.2">4.0.2 release notes</a>
<li><a href="relnotes/4.0.1">4.0.1 release notes</a>
<li><a href="relnotes/4.0">4.0 release notes</a>
<li><a href="relnotes/3.5">3.5 release notes</a>
<li><a href="relnotes/3.4.2">3.4.2 release notes</a>
<li><a href="relnotes/3.4.1">3.4.1 release notes</a>
<li><a href="relnotes/3.4">3.4 release notes</a>
<li><a href="relnotes/3.3">3.3 release notes</a>
<li><a href="relnotes/3.2.1">3.2.1 release notes</a>
<li><a href="relnotes/3.2">3.2 release notes</a>
<li><a href="relnotes/3.1">3.1 release notes</a>
</ul>
</div>
</body>
</html>

150
docs/relnotes/10.0.1.html Normal file
View File

@@ -0,0 +1,150 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.0.1 Release Notes / (December 12, 2013)</h1>
<p>
Mesa 10.0.1 is a bug fix release which fixes bugs found since the 10.0 release.
</p>
<p>
Mesa 10.0.1 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
0a72ca5b36046a658bf6038326ff32ed MesaLib-10.0.1.tar.bz2
01bde35c912e504ba62caf1ef9f7022c MesaLib-10.0.1.tar.gz
59a174a11a89e6b1b8ee9c3f7e3c388c MesaLib-10.0.1.zip
</pre>
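<p>
A downloaded archive can be checked against the sums above with any MD5 tool.
The following is a minimal sketch using Python's standard <code>hashlib</code>
module; the helper names <code>md5_of_file</code> and <code>matches</code> are
hypothetical, and the archive is assumed to be in the current directory.
</p>

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True if the file's MD5 digest equals the published checksum."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

<p>
For instance, <code>matches("MesaLib-10.0.1.tar.bz2", "0a72ca5b36046a658bf6038326ff32ed")</code>
should return <code>True</code> for an intact copy of the bzip2 archive. MD5
only guards against accidental corruption; it is not a defence against
deliberate tampering.
</p>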
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64323">Bug 64323</a> - Severe misrendering in Left 4 Dead 2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68838">Bug 68838</a> - GLSL: struct declarations produce a &quot;empty declaration warning&quot; in 9.2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69155">Bug 69155</a> - [NV50 gallium] [piglit] bin/varying-packing-simple triggers memory corruption/failures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70250">Bug 70250</a> - weston-terminal rendering corrupted with output transform 90 and 270</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70601">Bug 70601</a> - [SNB Bisected]Piglit spec/ARB_texture_float/multisample-formats 2 GL_ARB_texture_float fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72230">Bug 72230</a> - Unable to extract MesaLib-10.0.0.tar.{gz,bz2} with bsdtar</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72325">Bug 72325</a> - [swrast] piglit glean fbo regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72327">Bug 72327</a> - [swrast] piglit glean pointSprite regression</li>
</ul>
<h2>Changes</h2>
<p>The full set of changes can be viewed by using the following git command:</p>
<pre>
git log mesa-10.0..mesa-10.0.1
</pre>
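<p>
The per-author lists that follow resemble a shortlog of that tag range. As a
rough sketch (an assumption about how such a list could be produced, not a
description of the actual release tooling), the range can be queried and
grouped by author as shown below. The function names are hypothetical; the
git format placeholders %an (author name), %x09 (tab) and %s (subject) are
standard.
</p>

```python
import subprocess
from collections import defaultdict


def group_by_author(log_text):
    """Group 'author<TAB>subject' lines into {author: [subjects]}."""
    groups = defaultdict(list)
    for line in log_text.splitlines():
        author, separator, subject = line.partition("\t")
        if separator:  # skip lines without a tab
            groups[author].append(subject)
    # Sort by author name for a stable, readable listing.
    return dict(sorted(groups.items()))


def release_log(previous_tag, new_tag, repo_dir="."):
    """Run git log over previous_tag..new_tag and group commits by author."""
    rev_range = f"{previous_tag}..{new_tag}"
    result = subprocess.run(
        ["git", "log", "--format=%an%x09%s", rev_range],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return group_by_author(result.stdout)


if __name__ == "__main__":
    # Requires a Mesa checkout that contains both tags.
    for author, subjects in release_log("mesa-10.0", "mesa-10.0.1").items():
        print(f"{author} ({len(subjects)}):")
        for subject in subjects:
            print(f"  {subject}")
```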
<p>Axel Davy (2):</p>
<ul>
<li>egl/wayland: Flush the wl_display at the end of SwapBuffers</li>
<li>Enable throttling in SwapBuffers</li>
</ul>
<p>Chad Versace (2):</p>
<ul>
<li>i965/hsw: Apply non-msrt fast color clear w/a to all HSW GTs</li>
<li>i965: Add extra-alignment for non-msrt fast color clear for all hw (v2)</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>swrast: fix readback regression since inversion fix</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>automake: include only one copy VERSION in tarball</li>
</ul>
<p>Ian Romanick (3):</p>
<ul>
<li>docs: Add 10.0 release md5sums</li>
<li>Remove a057b83 from the pick list</li>
<li>glsl: Don't emit empty declaration warning for a struct specifier</li>
</ul>
<p>Ilia Mirkin (8):</p>
<ul>
<li>mesa: don't leak performance monitors on context destroy</li>
<li>nv50: Fix GPU_READING/WRITING bit removal</li>
<li>nouveau: avoid leaking fences while waiting</li>
<li>nv50: wait on the buf's fence before sticking it into pushbuf</li>
<li>nv50: enable h264 and mpeg4 for nv98+ (vp3, vp4.0)</li>
<li>nouveau/video: update h264 picparm field names based on usage</li>
<li>nouveau/video: update a few more h264 picparm field names</li>
<li>nv50: report 15 max inputs for fragment programs</li>
</ul>
<p>Jordan Justen (1):</p>
<ul>
<li>dri megadriver_stub: add compatibility for older DRI loaders</li>
</ul>
<p>Kristian Høgsberg (2):</p>
<ul>
<li>egl/wayland: Damage INT32_MAX x INT32_MAX region for eglSwapBuffers</li>
<li>egl/wayland: Send commit after flushing the driver context</li>
</ul>
<p>Maarten Lankhorst (1):</p>
<ul>
<li>nouveau: Fix compiler warning regression</li>
</ul>
<p>Paul Berry (1):</p>
<ul>
<li>i965/gen6: Fix multisample resolve blits for luminance/intensity 32F formats.</li>
</ul>
<p>Thomas Hellstrom (1):</p>
<ul>
<li>st/xa: Bump major version number to 2</li>
</ul>
<p>Tom Stellard (2):</p>
<ul>
<li>r300/compiler/tests: Fix segfault</li>
<li>r300/compiler/tests: Fix line length check in test parser</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.0.2.html Normal file

@@ -0,0 +1,161 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.0.2 Release Notes / (January 9, 2014)</h1>
<p>
Mesa 10.0.2 is a bug fix release which fixes bugs found since the 10.0.1 release.
</p>
<p>
Mesa 10.0.2 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
de7d14baf0101b697c140d2f47ef27e9 MesaLib-10.0.2.tar.gz
8544c0ab3e438a08b5103421ea15b6d2 MesaLib-10.0.2.tar.bz2
181b0d6c1afca38e98a930d0e564ed90 MesaLib-10.0.2.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70740">Bug 70740</a> - HiZ on SNB causes GPU hang with WebGL web app</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72026">Bug 72026</a> - SIGSEGV in fs_visitor::visit(ir_dereference_variable*)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72264">Bug 72264</a> - GLSL error reporting</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72369">Bug 72369</a> - glitches in serious sam 3 with the sb shader backend</li>
</ul>
<h2>Changes</h2>
<p>The full set of changes can be viewed by using the following git command:</p>
<pre>
git log mesa-10.0.1..mesa-10.0.2
</pre>
<p>Aaron Watry (8):</p>
<ul>
<li>clover: Remove unused variable</li>
<li>pipe_loader/sw: close dev-&gt;lib when initialization fails</li>
<li>radeon/compute: Stop leaking LLVMContexts in radeon_llvm_parse_bitcode</li>
<li>r600/compute: Free compiled kernels when deleting compute state</li>
<li>r600/compute: Use the correct FREE macro when deleting compute state</li>
<li>radeon/llvm: Free target data at end of optimization</li>
<li>st/vdpau: Destroy context when initialization fails</li>
<li>r600/pipe: Stop leaking context-&gt;start_compute_cs_cmd.buf on EG/CM</li>
</ul>
<p>Alex Deucher (1):</p>
<ul>
<li>r600g: fix SUMO2 pci id</li>
</ul>
<p>Alexander von Gluck IV (1):</p>
<ul>
<li>Haiku: Add in public GL kit headers</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>mesa: Fix error code generation in glBeginConditionalRender()</li>
</ul>
<p>Carl Worth (2):</p>
<ul>
<li>docs: Add md5sums for the 10.0.1 release.</li>
<li>Update version to 10.0.2</li>
</ul>
<p>Chad Versace (1):</p>
<ul>
<li>i965/gen6: Fix HiZ hang in WebGL Google Maps</li>
</ul>
<p>Erik Faye-Lund (1):</p>
<ul>
<li>glcpp: error on multiple #else/#elif directives</li>
</ul>
<p>Henri Verbeet (1):</p>
<ul>
<li>i915: Add support for gl_FragData[0] reads.</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nv50: fix a small leak on context destroy</li>
</ul>
<p>Jonathan Liu (2):</p>
<ul>
<li>st/mesa: use pipe_sampler_view_release()</li>
<li>llvmpipe: use pipe_sampler_view_release() to avoid segfault</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Fix 3DSTATE_PUSH_CONSTANT_ALLOC_PS packet creation.</li>
<li>Revert "mesa: Remove GLXContextID typedef from glx.h."</li>
</ul>
<p>Kevin Rogovin (1):</p>
<ul>
<li>Use line number information from entire function expression</li>
</ul>
<p>Kristian Høgsberg (1):</p>
<ul>
<li>dri_util: Don't assume __DRIcontext-&gt;driverPrivate is a gl_context</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>mesa: fix interpretation of glClearBuffer(drawbuffer)</li>
<li>st/mesa: fix glClear with multiple colorbuffers and different formats</li>
</ul>
<p>Paul Berry (2):</p>
<ul>
<li>glsl: Teach ir_variable_refcount about ir_loop::counter variables.</li>
<li>glsl: Fix inconsistent assumptions about ir_loop::counter.</li>
</ul>
<p>Vadim Girlin (1):</p>
<ul>
<li>r600g/sb: fix stack size computation on evergreen</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.0.3.html Normal file

@@ -0,0 +1,206 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.0.3 Release Notes / (February 3, 2014)</h1>
<p>
Mesa 10.0.3 is a bug fix release which fixes bugs found since the 10.0.2 release.
</p>
<p>
Mesa 10.0.3 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
5f9f463ef08129f6762106b434910adb MesaLib-10.0.3.tar.bz2
fb3997b6500e153bc32370cb3fc4ca9e MesaLib-10.0.3.tar.gz
a07b4b6b9eb449b88a6cb5061e51c331 MesaLib-10.0.3.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72708">Bug 72708</a> - Master fails to build with older gcc due to -msse4.1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72926">Bug 72926</a> - [REGRESSION,swrast] Memory-related crash with anti-aliasing enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73096">Bug 73096</a> - Query GL_RGBA_SIGNED_COMPONENTS_EXT missing</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73100">Bug 73100</a> - Please use AC_PATH_TOOL instead of AC_PATH_PROG for llvm-config</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73418">Bug 73418</a> - OpenCL hangs graphics on CAYMAN</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73473">Bug 73473</a> - Potential crash bug in src/gallium/auxiliary/rtasm/rtasm_execmem.c</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73915">Bug 73915</a> - sample shading + centroid broken since f5cfb4a</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73956">Bug 73956</a> - SIGSEGV when passing GL_NONE to glReadBuffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74026">Bug 74026</a> - Compiler rejects chained assignments involving array dereferences</li>
</ul>
<h2>Changes</h2>
<p>The full set of changes can be viewed by using the following git command:</p>
<pre>
git log mesa-10.0.2..mesa-10.0.3
</pre>
<p>Aaron Watry (2):</p>
<ul>
<li>radeon: Move gfx/dma cs cleanup to r600_common_context_cleanup</li>
<li>st/dri: prevent leak of dri option default values</li>
</ul>
<p>Andreas Fänger (1):</p>
<ul>
<li>swrast: fix delayed texel buffer allocation regression for OpenMP</li>
</ul>
<p>Anuj Phogat (3):</p>
<ul>
<li>glsl: Disable ARB_texture_rectangle in shader version 100.</li>
<li>i965: Use sample barycentric coordinates with per sample shading</li>
<li>i965: Ignore 'centroid' interpolation qualifier in case of persample shading</li>
</ul>
<p>Brian Paul (3):</p>
<ul>
<li>mesa: implement missing glGet(GL_RGBA_SIGNED_COMPONENTS_EXT) query</li>
<li>st/mesa: fix glReadBuffer(GL_NONE) segfault</li>
<li>draw: fix incorrect vertex size computation in LLVM drawing code</li>
</ul>
<p>Carl Worth (5):</p>
<ul>
<li>Add md5sums for 10.0.2. release.</li>
<li>cherry-ignore: Ignore several patches not yet ready for the stable branch</li>
<li>Drop another couple of patches.</li>
<li>cherry-ignore: Ignore 4 patches at teh request of the author, (Anuj).</li>
<li>Update version to 10.0.3</li>
</ul>
<p>Chad Versace (1):</p>
<ul>
<li>i965/gen6/blorp: Emit more flushes to workaround hangs</li>
</ul>
<p>Chris Forbes (1):</p>
<ul>
<li>i965: fold offset into coord for textureOffset(gsampler2DRect)</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>mesa: use signed temporary variable to store _ColorDrawBufferIndexes</li>
<li>st/mesa: use signed temporary variable to store _ColorDrawBufferIndexes</li>
<li>nv50: access only the available amount of textures</li>
<li>nv50: access only the available amount of constbuf</li>
<li>gallium/rtasm: handle mmap failures appropriately</li>
</ul>
<p>Eric Anholt (2):</p>
<ul>
<li>i965: Fix handling of MESA_pack_invert in blit (PBO) readpixels.</li>
<li>i965: Don't do the temporary-and-blit-copy for INVALIDATE_RANGE maps.</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>mesa: Add COMPRESSED_RGBA_S3TC_DXT1_EXT to COMPRESSED_TEXTURE_FORMATS for GLES</li>
<li>radeon / r200: Pass the API into _mesa_initialize_context</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>mesa: fix GL_COLOR_SUM enum for drivers without ARB_vertex_program</li>
<li>st/vdpau: don't return a device if the screen doesn't support NPOT</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>mesa: Use IROUND instead of roundf.</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>glsl: Rename "expr" to "lhs_expr" in vector_extract munging code.</li>
<li>glsl: Fix chained assignments of vector channels.</li>
</ul>
<p>Lauri Kasanen (1):</p>
<ul>
<li>mesa: Fix build to properly check for supported compiler flags</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>st/mesa: use sRGB formats for MSAA resolving if destination is sRGB</li>
<li>gallium/util: util_format_srgb should not return FORMAT_NONE for sRGB formats</li>
</ul>
<p>Matt Turner (2):</p>
<ul>
<li>glcpp: Define GL_EXT_shader_integer_mix in both GL and ES.</li>
<li>glx: Update glxext.h to revision 24777.</li>
</ul>
<p>Michał Górny (1):</p>
<ul>
<li>Use AC_PATH_TOOL instead of AC_PATH_PROG for llvm-config.</li>
</ul>
<p>Paul Berry (1):</p>
<ul>
<li>i965: Ensure that all necessary state is re-emitted if we run out of aperture.</li>
</ul>
<p>Paul Seidler (1):</p>
<ul>
<li>build: move ARCH_LIBS definition outside of ASM definition</li>
</ul>
<p>Thomas Sondergaard (4):</p>
<ul>
<li>mesa: Preliminary support for MSVC_VERSION=12.0</li>
<li>mesa: Fix compile error with MSVC 2013</li>
<li>mesa: Work around internal compiler error</li>
<li>mesa: Namespace qualify fma to override ambiguity with fma from math.h</li>
</ul>
<p>Tom Stellard (1):</p>
<ul>
<li>r600g/compute: Emit DEALLOC_STATE on cayman after dispatching a compute shader.</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.0.4.html Normal file

@@ -0,0 +1,191 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.0.4 Release Notes / (March 12, 2014)</h1>
<p>
Mesa 10.0.4 is a bug fix release which fixes bugs found since the 10.0.3 release.
</p>
<p>
Mesa 10.0.4 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
5a3c5b90776ec8a9fcd777c99e0607e2 MesaLib-10.0.4.tar.gz
8b148869d2620b0720c8a8d2b7eb3e38 MesaLib-10.0.4.tar.bz2
da2418d25bfbc273660af7e755fb367e MesaLib-10.0.4.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71870">Bug 71870</a> - Metro: Last Light rendering issues</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72895">Bug 72895</a> - Missing trees in flightgear 2.12.1 with mesa 10.0.1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74251">Bug 74251</a> - Segfault in st_finalize_texture with Texture Buffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74723">Bug 74723</a> - main/shaderapi.c:407: detach_shader: Assertion `shProg-&gt;Shaders[j]-&gt;Type == 0x8B31 || shProg-&gt;Shaders[j]-&gt;Type == 0x8B30' failed.</li>
</ul>
<h2>Changes</h2>
<p>The full set of changes can be viewed by using the following git command:</p>
<pre>
git log mesa-10.0.3..mesa-10.0.4
</pre>
<p>Anuj Phogat (4):</p>
<ul>
<li>mesa: Generate correct error code in glDrawBuffers()</li>
<li>mesa: Add GL_TEXTURE_CUBE_MAP_ARRAY to legal_get_tex_level_parameter_target()</li>
<li>glsl: Fix condition to generate shader link error</li>
<li>i965: Fix the region's pitch condition to use blitter</li>
</ul>
<p>Brian Paul (8):</p>
<ul>
<li>r200: move driContextSetFlags(ctx) call after ctx var is initialized</li>
<li>radeon: move driContextSetFlags(ctx) call after ctx var is initialized</li>
<li>gallium/auxiliary/indices: replace free() with FREE()</li>
<li>draw: fix incorrect color of flat-shaded clipped lines</li>
<li>st/mesa: avoid sw fallback for getting/decompressing textures</li>
<li>mesa: update assertion in detach_shader() for geom shaders</li>
<li>mesa: do depth/stencil format conversion in glGetTexImage</li>
<li>softpipe: use 64-bit arithmetic in softpipe_resource_layout()</li>
</ul>
<p>Carl Worth (4):</p>
<ul>
<li>docs: Add md5sums for 10.0.3 release</li>
<li>main: Avoid double-free of shader Label</li>
<li>get-pick-list: Update to only find patches nominated for the 10.0 branch</li>
<li>Update version to 10.0.4</li>
</ul>
<p>Chris Forbes (1):</p>
<ul>
<li>i965: Validate (and resolve) all the bound textures.</li>
</ul>
<p>Christian König (1):</p>
<ul>
<li>radeon/uvd: fix feedback buffer handling v2</li>
</ul>
<p>Daniel Kurtz (1):</p>
<ul>
<li>glsl: Add locking to builtin_builder singleton</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>dri/nouveau: Pass the API into _mesa_initialize_context</li>
<li>nv50: correctly calculate the number of vertical blocks during transfer map</li>
<li>dri/i9*5: correctly calculate the amount of system memory</li>
</ul>
<p>Fredrik Höglund (3):</p>
<ul>
<li>mesa: Preserve the NewArrays state when copying a VAO</li>
<li>glx: Fix the default values for GLXFBConfig attributes</li>
<li>glx: Fix the GLXFBConfig attrib sort priorities</li>
</ul>
<p>Hans (2):</p>
<ul>
<li>util: don't define isfinite(), isnan() for MSVC &gt;= 1800</li>
<li>mesa: don't define c99 math functions for MSVC &gt;= 1800</li>
</ul>
<p>Ian Romanick (6):</p>
<ul>
<li>meta: Release resources used by decompress_texture_image</li>
<li>meta: Release resources used by _mesa_meta_DrawPixels</li>
<li>meta: Fallback to software for GetTexImage of compressed GL_TEXTURE_CUBE_MAP_ARRAY</li>
<li>meta: Consistenly use non-Apple VAO functions</li>
<li>glcpp: Only warn for macro names containing __</li>
<li>glsl: Only warn for macro names containing __</li>
</ul>
<p>Ilia Mirkin (3):</p>
<ul>
<li>nv30: report 8 maximum inputs</li>
<li>nouveau/video: make sure that firmware is present when checking caps</li>
<li>nouveau: fix chipset checks for nv1a by using the oclass instead</li>
</ul>
<p>Julien Cristau (1):</p>
<ul>
<li>glx/dri2: fix build failure on HURD</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>glsl: Don't lose precision qualifiers when encountering "centroid".</li>
<li>i965: Create a hardware context before initializing state module.</li>
</ul>
<p>Kusanagi Kouichi (1):</p>
<ul>
<li>targets/vdpau: Always use c++ to link</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>st/mesa: fix crash when a shader uses a TBO and it's not bound</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>glsl: Initialize ubo_binding_mask flags to zero.</li>
</ul>
<p>Paul Berry (2):</p>
<ul>
<li>glsl: Make condition_to_hir() callable from outside ast_iteration_statement.</li>
<li>glsl: Fix continue statements in do-while loops.</li>
</ul>
<p>Tom Stellard (1):</p>
<ul>
<li>r600g/compute: PIPE_CAP_COMPUTE should be false for pre-evergreen GPUs</li>
</ul>
<p>Topi Pohjolainen (1):</p>
<ul>
<li>i965/blorp: do not use unnecessary hw-blending support</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.0.5.html Normal file

@@ -0,0 +1,173 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.0.5 Release Notes / (April 18, 2014)</h1>
<p>
Mesa 10.0.5 is a bug fix release which fixes bugs found since the 10.0.4 release.
</p>
<p>
Mesa 10.0.5 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
db606aadd0fe321f3664099677d159bc MesaLib-10.0.5.tar.gz
e6009ccd8898d7104bb325b6af9ec354 MesaLib-10.0.5.tar.bz2
c8ab9e502542bf32299a4df85b0b704d MesaLib-10.0.5.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=58660">Bug 58660</a> - CAYMAN broken with HyperZ on</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64471">Bug 64471</a> - Radeon HD6570 lockup in Brütal Legend with HyperZ</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66352">Bug 66352</a> - GPU lockup in L4D2 on TURKS with HyperZ</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68799">Bug 68799</a> - [APITRACE] Hyper-Z lockup with Falcon BMS 4.32u6 on CAYMAN</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71547">Bug 71547</a> - compilation failure :#error &quot;SSE4.1 instruction set not enabled&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72685">Bug 72685</a> - [radeonsi hyperz] Artifacts in Unigine Sanctuary</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73088">Bug 73088</a> - [HyperZ] Juniper (6770): Gone Home / Unigine Heaven 4.0 lock up system after several minutes of use</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74428">Bug 74428</a> - hyperz causes gpu hang in Counter-strike: Source</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74803">Bug 74803</a> - [r600g] HyperZ broken on RV630 (Cogs shadows are broken)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74863">Bug 74863</a> - [r600g] HyperZ broken on RV770 and CYPRESS (Left 4 Dead 2 trees corruption) bisected!</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74892">Bug 74892</a> - HyperZ GPU lockup with radeonsi 7970M PITCAIRN and Distance Alpha game</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74988">Bug 74988</a> - Buffer overrun (segfault) decompressing ETC2 texture in GLBenchmark 3.0 Manhattan</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75279">Bug 75279</a> - XCloseDisplay() takes one minute around nouveau_dri.so, freezing Firefox startup</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77102">Bug 77102</a> - gallium nouveau has no profile in vdpau and libva</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77207">Bug 77207</a> - [ivb/hsw] batch overwritten with garbage</li>
</ul>
<h2>Changes</h2>
<p>The full set of changes can be viewed by using the following git command:</p>
<pre>
git log mesa-10.0.4..mesa-10.0.5
</pre>
<p>Alex Deucher (1):</p>
<ul>
<li>radeon: reverse DBG_NO_HYPERZ logic</li>
</ul>
<p>Brian Paul (9):</p>
<ul>
<li>mesa: add unpacking code for MESA_FORMAT_Z32_FLOAT_S8X24_UINT</li>
<li>mesa: fix copy &amp; paste bugs in pack_ubyte_SARGB8()</li>
<li>mesa: fix copy &amp; paste bugs in pack_ubyte_SRGB8()</li>
<li>mesa: fix unpack_Z32_FLOAT_X24S8() / unpack_Z32_FLOAT() mix-up</li>
<li>st/mesa: add null pointer checking in query object functions</li>
<li>mesa: fix glMultiDrawArrays inside a display list</li>
<li>cso: fix sampler view count in cso_set_sampler_views()</li>
<li>svga: replace sampler assertion with conditional</li>
<li>svga: move LIST_INITHEAD(dirty_buffers) earlier in svga_context_create()</li>
</ul>
<p>Carl Worth (3):</p>
<ul>
<li>docs: Add md5sums for the 10.0.4 release.</li>
<li>Ignore patches which don't apply.</li>
<li>Update version to 10.0.5</li>
</ul>
<p>Christian König (2):</p>
<ul>
<li>st/mesa: recreate sampler view on context change v3</li>
<li>st/mesa: fix sampler view handling with shared textures v4</li>
</ul>
<p>Courtney Goeltzenleuchter (1):</p>
<ul>
<li>mesa: add bounds checking to eliminate buffer overrun</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>mesa: return v.value_int64 when the requested type is TYPE_INT64</li>
<li>glx: drop obsolete _XUnlock_Mutex in __glXInitialize error path</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>i965: Fix buffer overruns in MSAA MCS buffer clearing.</li>
</ul>
<p>Ilia Mirkin (6):</p>
<ul>
<li>nouveau: fix fence waiting logic in screen destroy</li>
<li>nv50: adjust blit_3d handling of ms output textures</li>
<li>mesa/main: condition GL_DEPTH_STENCIL on ARB_depth_texture</li>
<li>nouveau: add forgotten GL_COMPRESSED_INTENSITY to texture format list</li>
<li>nouveau: there may not have been a texture if the fbo was incomplete</li>
<li>nouveau: fix firmware check on nvd7/nvd9</li>
</ul>
<p>Johannes Nixdorf (1):</p>
<ul>
<li>configure.ac: fix the detection of expat with pkg-config</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>gallium: add endian detection for OpenBSD</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>draw: Duplicate TGSI tokens in draw_pipe_pstipple module.</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>mesa: Wrap SSE4.1 code in #ifdef __SSE4_1__.</li>
</ul>
<p>Paul Berry (1):</p>
<ul>
<li>i965/gen7: Prefer vertical alignment of 4 when possible.</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.0.html Normal file

@@ -0,0 +1,146 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.0 Release Notes / (November 30th, 2013)</h1>
<p>
Mesa 10.0 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for Mesa 10.0.1.
</p>
<p>
Mesa 10.0 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
b38626b96c664db67a534d7859682436 MesaLib-10.0.0.tar.gz
f3fe55d9735bea158bbe97ed9a0da819 MesaLib-10.0.0.tar.bz2
c6ee1ce51e3bf35947d2978b872daf51 MesaLib-10.0.0.zip
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>GL_AMD_seamless_cubemap_per_texture on i965.</li>
<li>GL_ARB_conservative_depth on i965.</li>
<li>GL_ARB_texture_gather on i965.</li>
<li>GL_ARB_texture_query_levels on i965.</li>
<li>GL_ARB_texture_mirror_clamp_to_edge.</li>
<li>GL_ARB_transform_feedback2, GL_ARB_transform_feedback3, and GL_ARB_transform_feedback_instanced on i965/Gen7 (with appropriate kernel support).</li>
<li>GL_ARB_sample_shading on i965.</li>
<li>GL_ARB_shader_atomic_counters on i965.</li>
<li>GL_ARB_vertex_attrib_binding</li>
<li>GL_ARB_vertex_type_10f_11f_11f_rev on i965 and r600g</li>
<li>GL_KHR_debug</li>
<li>GLX_MESA_query_renderer</li>
</ul>
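<p>
Since several of these extensions are driver-specific, applications usually
check for them at run time. The sketch below assumes the space-separated
extension list has already been retrieved through some GL binding; the
function name and the sample list are illustrative assumptions.
</p>

```python
def has_extension(extensions_string, name):
    """True if name appears as a whole token in a space-separated list.

    Matching whole tokens avoids false positives when one extension name
    is a prefix of another.
    """
    return name in extensions_string.split()


if __name__ == "__main__":
    # Hypothetical extension list, not captured from a real driver.
    reported = "GL_ARB_transform_feedback2 GL_KHR_debug GL_ARB_sample_shading"
    for wanted in ("GL_KHR_debug", "GL_ARB_texture_gather"):
        state = "available" if has_extension(reported, wanted) else "missing"
        print(wanted, state)
```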
<h2>Bug fixes</h2>
<p>Attempts have been made to <b>not</b> include bugs fixed in previous 9.2
releases or bugs that were regressions during 10.0 development. This list is
likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=47755">Bug 47755</a> - [glsl-compiler] no error checking when Interpolation qualifier for built-in variable is different in vertex and fragment shader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=52171">Bug 52171</a> - [gallium/r600/clover] Simple benchmarks failed to run</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=53077">Bug 53077</a> - [IVB] Output error with msaa when both of framebuffer and source color's alpha are not 1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54867">Bug 54867</a> - bug in r300 compiler</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60929">Bug 60929</a> - [r600-llvm] mono games with opengl are blocking on start</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62142">Bug 62142</a> - Mesa/demo mipmap_limits upside down with running by SOFTWARE</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62698">Bug 62698</a> - [bisected] WebGL demo &quot;Consumed&quot;: texstate.c:628: update_texture_state: Assertion „__builtin_popcount(enabledTargets) == 1“ failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64225">Bug 64225</a> - bfgminer --scyte generates Segmentation Fault on Northern Island</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64226">Bug 64226</a> - python-opencl package generate segmentation fault at pipe_r600.so</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64261">Bug 64261</a> - [SNB Bisected]Ogles3conform GL3Tests_color_buffer_float_color_buffer_float_clamp_fixed.test fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66213">Bug 66213</a> - Certain Mesa Demos Rendering Inverted (vertically)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66806">Bug 66806</a> - [softpipe] glxgears floating point exception</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=67921">Bug 67921</a> - [bisected commit 883987] crosscompiling fails with util/u_cpu_detect.c:247:4: error: 'asm' undeclared (first use in this function)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68162">Bug 68162</a> - [radeonsi] texture rendering is broken in Source-Engine games</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68451">Bug 68451</a> - Texture flicker in native Dota2 in mesa 9.2.0rc1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68503">Bug 68503</a> - Graphical glitches in Serious Sam 3 when SB is enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68792">Bug 68792</a> - Problems during playback of h264 files using UVD and VLC on AMD E-350 CPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68845">Bug 68845</a> - VDPAU/UVD regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69078">Bug 69078</a> - Modern Warfare (1, 2 and 3) broken in Wine on SNB</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69321">Bug 69321</a> - starting openCL crashes/boots system</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70042">Bug 70042</a> - Major texture flickering in Dota 2 (r600g on HD 6950)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70088">Bug 70088</a> - Glamor on r600g crashes Xserver</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70123">Bug 70123</a> - Freeze caused by 'winsys/radeon: remove cs_queue_empty' commit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70327">Bug 70327</a> - Casting floating point variable to integer not working properly while constant gets converted properly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70891">Bug 70891</a> - CL_INVALID_BUILD_OPTIONS results in CL_INVALID_DEVICE when asking for build log</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70913">Bug 70913</a> - [PIGLIT,radeonsi] crash in &quot;spec/EXT_framebuffer_multisample/sample-alpha-to-coverage 4 depth&quot; (buffer overflow)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71022">Bug 71022</a> - configure: error: Expat required for DRI.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71110">Bug 71110</a> - xorg_driver.c:1030:2: error: too many arguments to function DamageUnregister</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71172">Bug 71172</a> - Segfault when running glxinfo. NV25GL [Quadro4 900 XGL]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71512">Bug 71512</a> - dlopen.h:54: undefined reference to `dlopen'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71870">Bug 71870</a> - Metro: Last Light rendering issues</li>
</ul>
<h2>Changes</h2>
<ul>
<li>Removed X.Org state tracker (unmaintained and broken)</li>
<li>Removed the video-accel r300 targets</li>
<li>Removed the video-accel softpipe targets</li>
</ul>
</div>
</body>
</html>


@@ -106,7 +106,7 @@ Vertex/Fragment program debugger
GL_MESA_program_debug is an experimental extension to support
interactive debugging of vertex and fragment programs. See the
-docs/MESA_program_debug.spec file for details.
+docs/specs/OLD/MESA_program_debug.spec file for details.
The bulk of the vertex/fragment program debugger is implemented
outside of Mesa. The GL_MESA_program_debug extension just has minimal


@@ -3,10 +3,17 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
-<link rel="stylesheet" type="text/css" href="mesa.css">
+<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 6.4.1 / November 29, 2006</h1>
<p>
@@ -63,5 +70,6 @@ Allegro requires updates
D3D requires updates
</pre>
</div>
</body>
</html>


@@ -3,10 +3,17 @@
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
-<link rel="stylesheet" type="text/css" href="mesa.css">
+<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 6.4.2 / February 2, 2006</h1>
<p>
@@ -70,5 +77,6 @@ Allegro requires updates
D3D requires updates
</pre>
</div>
</body>
</html>

Some files were not shown because too many files have changed in this diff.