Compare commits

829 Commits

Author SHA1 Message Date
Carl Worth
8f0742051e Add bin/test-driver to the list of files to be distributed.
Without this, the build fails for me when trying to build from a generated tar
file after running just ./configure. (It's not clear to me why I didn't
encounter similar breakage with previous releases.)
2013-10-18 16:58:32 -07:00
Carl Worth
8eb1046996 docs: Add release notes for 9.2.2 release
With the list of bugs fixed and a full list of changes.
2013-10-18 16:41:15 -07:00
Carl Worth
cc6ad9ce2c Bump version to 9.2.2
In preparation for the 9.2.2 release, of course.
2013-10-18 16:36:31 -07:00
Carl Worth
82d5b5e20f Revert "glx: Generate fewer errors in MakeContextCurrent"
This reverts commit fb3e55f898.

This commit was identified as causing the piglit
glx-create-context-current-no-framebuffer test to crash, (where, previously,
it merely failed without crashing).
2013-10-17 11:30:26 -07:00
Tom Stellard
bf9be81b47 radeonsi: Use 'SI' as the LLVM processor for CIK on LLVM <= 3.3
LLVM 3.3 does not know about CIK processors, and the code paths for SI
and CIK are the same.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9da4021626)
2013-10-16 15:15:05 -07:00
Brian Paul
995dc3782b mesa: consolidate cube width=height error checking
Instead of checking width==height in four places, just do it in
_mesa_legal_texture_dimensions() where we do the other width, height,
depth checks.  Similarly, move the check that cube map array depth is
a multiple of 6.

This change also fixes some missing cube dimension checks for the
glTexStorage[23]D() functions.

Remove width==height assertion in _mesa_get_tex_max_num_levels() since
that's called before the other size checks for glTexStorage.

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fa9c702164)
2013-10-16 15:13:29 -07:00
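The consolidated check described in the commit above can be sketched as a single predicate. This is only an illustration: `cube_dimensions_are_legal` and its parameters are hypothetical names, not the signature of `_mesa_legal_texture_dimensions()`, and the real function also performs the other width, height, and depth checks.

```c
#include <stdbool.h>

/*
 * Hypothetical helper illustrating the two cube-specific rules named in
 * the commit message. It is not Mesa's actual implementation.
 */
bool
cube_dimensions_are_legal(int width, int height, int depth, bool is_cube_array)
{
   /* Every cube face must be square. */
   if (width != height)
      return false;

   /* A cube map array stores six faces per cube, so depth must be a
    * multiple of 6. */
   if (is_cube_array && depth % 6 != 0)
      return false;

   return true;
}
```

Keeping both rules in one place means that every entry point that validates dimensions picks them up, which is how the commit describes the glTexStorage fix.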
Constantin Baranov
cd5ea2788d mesa: Add missing switch break in invalidate_framebuffer_storage()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70411
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 53904c64da)
2013-10-14 14:44:57 -07:00
Eric Anholt
f1257f5fe0 i965: Fix 3D texture layout by more literally copying from the spec.
Fixes 3 texelFetch tests in piglit all.tests on ivb, and cubemap npot on gm45.

v2: Don't forget the gen4 DL=6 cubemap behavior.

Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v1)
(cherry picked from commit 8da15d7544)
2013-10-14 14:34:04 -07:00
Eric Anholt
cde1ff2d7c mesa: Fix compiler warnings when ALIGN's alignment is "1 << value".
We hadn't run into order of operation warnings before, apparently, since
addition is so low on the order.

Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit bfe6e5dda5)
2013-10-14 14:32:59 -07:00
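To illustrate the precedence issue behind the commit above: in C, `+` and `-` bind more tightly than `<<`, so if an alignment macro pastes its argument without parentheses, an argument such as `1 << n` regroups as `(value + 1) << (n - 1)`. The sketch below assumes that was the shape of the problem; `ALIGN_POT` and `align_to_shift` are hypothetical names, not Mesa's macro.

```c
#include <stdint.h>

/*
 * Illustrative macro (hypothetical name). Rounds value up to a
 * power-of-two alignment. Each use of the alignment argument is
 * parenthesized, so an expression such as "1u << shift" is evaluated
 * as a unit before the addition and subtraction are applied.
 */
#define ALIGN_POT(value, alignment) \
   (((value) + (alignment) - 1) & ~((alignment) - 1))

/* Rounds value up to a multiple of (1 << shift). */
uint32_t
align_to_shift(uint32_t value, unsigned shift)
{
   return ALIGN_POT(value, 1u << shift);
}
```

For example, `align_to_shift(5, 3)` rounds 5 up to the next multiple of 8.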
Eric Anholt
eb69e251a8 i965: Don't forget the cube map padding on gen5+.
We had a fixup for gen4's 3d-layout cubemaps (which, iirc, we'd
experimentally found to be necessary!), but while the spec still requires
it on gen5, we'd been missing it in the array-layout cubemaps.

Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 791550aa8e)
2013-10-14 14:31:55 -07:00
Adam Jackson
fb3e55f898 glx: Generate fewer errors in MakeContextCurrent
For a few reasons.

1: In the (current) common case, these conditionals are never true. All
we're doing by checking them is slowing down MakeCurrent.  The server
does these checks already anyway.

2: GLX >= 3.0 contexts may legally be made current without a bound
framebuffer.

This does not fix piglit/glx-create-context-current-no-framebuffer, but
is a prerequisite for fixing it.

Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit e166a58c43)
2013-10-14 14:31:12 -07:00
Francisco Jerez
6d6d8fb073 glsl: Fix usage of the wrong union member in program_resource_visitor::recursion.
In the array-of-struct case, recursion() takes the row_major flag for
each iteration from 't->fields.structure[i]', but 't' is not a record
type.  Inherit the array declaration row_major flag instead.

This mistake was found by running piglit on valgrind.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69449
Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b3c04362b4)
2013-10-14 14:29:55 -07:00
Brian Paul
9eae0c95f8 svga: fix incorrect memcpy src in svga_buffer_upload_piecewise()
As we march over the source buffer we're uploading in pieces, we
need to memcpy from the current offset, not the start of the buffer.
Fixes graphical corruption when drawing very large vertex buffers.

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matthew McClure <mcclurem@vmware.com>
(cherry picked from commit a50c5f8d24)
2013-10-14 14:28:56 -07:00
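The bug class described in the commit above can be shown with a simplified piecewise copy. This is a sketch under assumptions: `upload_piecewise` is a hypothetical stand-in that copies into one destination buffer, whereas the real `svga_buffer_upload_piecewise()` uploads each piece through the driver.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: copy `size` bytes from src to dst in pieces of at
 * most `piece_size` bytes.
 */
void
upload_piecewise(unsigned char *dst, const unsigned char *src,
                 size_t size, size_t piece_size)
{
   size_t offset = 0;

   if (piece_size == 0)
      return;

   while (offset < size) {
      size_t remaining = size - offset;
      size_t len = remaining < piece_size ? remaining : piece_size;

      /* Read from src + offset. Reading from src alone would copy the
       * first piece again on every iteration. */
      memcpy(dst + offset, src + offset, len);
      offset += len;
   }
}
```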
Brian Paul
fac3094fef docs: add missing <pre> tag
2013-10-05 14:18:48 -06:00
Carl Worth
8c4c3d01ee docs: Add md5sums for 9.2.1 release
Which we could only do after creating the tar files, of course.
2013-10-04 20:42:21 -07:00
Carl Worth
2c3aa1b4ee docs: Add release notes for 9.2.1 release
With the list of bugs fixed and a full list of changes.
2013-10-04 17:01:47 -07:00
Carl Worth
5377bc3e40 mesa: Bump version to 9.2.1
In preparation for the 9.2.1 release, of course.
2013-10-04 15:25:44 -07:00
Ian Romanick
486aecac7e mesa: Don't return any data for GL_SHADER_BINARY_FORMATS
We return 0 for GL_NUM_SHADER_BINARY_FORMATS, so
GL_SHADER_BINARY_FORMATS should not write any data to the application
buffer.

Fixes piglit test 'arb_get_program_binary-overrun shader'.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 0667e2c969)
2013-10-04 14:33:28 -07:00
Torsten Duwe
8fc8f38d94 wayland-egl.pc requires wayland-client.pc.
Mesa provides the wayland-egl libs and the pkgconfig file, but the headers
originate from the wayland package. Ensure everything matches, by requiring
application builds to look at the wayland headers as well.

Signed-off-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Johannes Obermayr <johannesobermayr@gmx.de>
(cherry picked from commit 3bc642cbf6)
2013-10-04 14:21:50 -07:00
Johannes Obermayr
dfcc8caf25 st/gbm: Add $(WAYLAND_CFLAGS) for HAVE_EGL_PLATFORM_WAYLAND.
(cherry picked from commit 87ebbe1270)
2013-10-04 14:21:27 -07:00
Kenneth Graunke
c8ae770068 meta: Set correct viewport and projection in decompress_texture_image.
_mesa_meta_begin() sets up an orthographic projection and initializes the
viewport based on the current drawbuffer's width and height.  This is
likely the window size, since it occurs before the meta operation binds
any temporary buffers.

decompress_texture_image needs the viewport to be the size of the image
it's trying to draw.  Otherwise, it may only draw part of the image.

v2: Actually set the projection properly too.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68250
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Mak Nazecic-Andrlon <owlberteinstein@gmail.com>
(cherry picked from commit 62411681da)
2013-10-04 14:13:07 -07:00
Chris Forbes
88513d6485 i965: fix bogus swizzle in brw_cubemap_normalize
When used with a cube array in VS, failed assertion in ir_validate:

   Assignment count of LHS write mask channels enabled not
   matching RHS vector size (3 LHS, 4 RHS).

To fix this, swizzle the RHS correctly for the writemask.

This showed up in the ARB_texture_gather tests, which exercise cube
arrays in the VS.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 0d7fc10bcd)
2013-10-02 22:01:05 -07:00
Brian Paul
2eb55601bb mesa: check for bufSize > 0 in _mesa_GetSynciv()
The spec doesn't say GL_INVALID_VALUE should be raised for bufSize <= 0.
In any case, memcpy(len < 0) will lead to a crash, so don't allow it.

CC: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 6659131be3)
2013-10-01 14:11:38 -07:00
Eric Anholt
26ffbf6f39 i965: Reenable glBitmap() after the sRGB winsys enabling.
The format of the window system framebuffer changed from ARGB8888 to
SARGB8, but we're still supposed to render to it the same as ARGB8888
unless the user flipped the GL_FRAMEBUFFER_SRGB switch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for stable branches.
(cherry picked from commit 48b9720272)
2013-10-01 14:11:10 -07:00
Ian Romanick
421141192f mesa: Remove all traces of GL_OES_matrix_get
I believe this extension was enabled by accident.  As far as I can tell,
there has never been any code in Mesa to actually support it.  Not only
that, this extension is only useful in the common-lite profile, and Mesa
does the common profile.

This "fixes" the piglit test oes_matrix_get-api.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 3e1fdf3899)
2013-10-01 14:10:40 -07:00
Carl Worth
be029eb993 Use -Bsymbolic when linking libEGL.so
For some reason that I don't yet fully understand, Glaze does not work with
libEGL unless libEGL is linked with -Bsymbolic.[*]

Beyond that specific reason, all of the reasons for which libGL.so is linked
with -Bsymbolic, (see the commit history), should also apply here.

[*] The specific behavior I am seeing is that when Glaze calls dlopen for
libEGL.so, ifunc resolvers within Glaze for EGL functions are called before
the dlopen returns. These resolvers cannot succeed, as they need the return
value from dlopen in order to find the functions to resolve to. I don't know
what's causing these resolvers to be called, but I have verified that linking
libEGL with -Bsymbolic causes this problematic behavior to stop.

CC: "9.1 and 9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 9baf35de5c)
2013-10-01 14:09:18 -07:00
Carl Worth
f7fba18e2e cherry-ignore: Ignore a commit which appeared twice on master
In between the two appearances, it was reverted once.

Regardless, the two versions on master are the same, and we've already
cherry-picked one of them, so ignore the second.
2013-10-01 14:08:17 -07:00
Marek Olšák
42b6d94537 r600g: fix texture buffer object cache flushing
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f7d004b9ad)

Conflicts:
	src/gallium/drivers/r600/r600_hw_context.c
2013-10-01 14:03:39 -07:00
Marek Olšák
563c488453 r600g: fix constant buffer cache flushing
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6317a3fb31)

Conflicts:
	src/gallium/drivers/r600/r600_hw_context.c
2013-10-01 14:02:31 -07:00
Chris Forbes
4babf9ba6b i965: Fix cube array coordinate normalization
Hardware requires the magnitude of the largest component to not exceed
1; brw_cubemap_normalize ensures that this is the case.

Unfortunately, we would previously multiply the array index for cube
arrays by the normalization factor. The incorrect array index would then
cause the sampler to attempt to access either the wrong cube, or memory
outside the cube surface entirely, resulting in garbage rendering or in
the worst case, hangs.

Alter the normalization pass to only multiply the .xyz components.

Fixes broken rendering in the arb_texture_cube_map_array-cubemap piglit,
which was recently adjusted to provoke this behavior.

V2: Fix indent.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "9.2" mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit fe2528c0b6)
2013-09-27 15:31:59 -07:00
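The intent of the fix above can be illustrated on the CPU with a four-component coordinate whose last component is the array index. This is a hypothetical sketch (`normalize_cube_array_coord` is an invented name); the real pass rewrites shader IR rather than operating on floats directly.

```c
/* Absolute value without pulling in <math.h>. */
static float
abs_f(float v)
{
   return v < 0.0f ? -v : v;
}

/*
 * Hypothetical illustration: coord[0..2] is the cube direction,
 * coord[3] is the cube array index.
 */
void
normalize_cube_array_coord(float coord[4])
{
   float largest = abs_f(coord[0]);
   float scale;

   if (abs_f(coord[1]) > largest)
      largest = abs_f(coord[1]);
   if (abs_f(coord[2]) > largest)
      largest = abs_f(coord[2]);

   if (largest == 0.0f)
      return;

   /* Scale only .xyz so the largest magnitude becomes 1. */
   scale = 1.0f / largest;
   coord[0] *= scale;
   coord[1] *= scale;
   coord[2] *= scale;

   /* coord[3] is left untouched: scaling it would select the wrong cube. */
}
```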
Eric Anholt
8a9099d4ef i965/gen4: Fix fragment program rectangle texture shadow compares.
The rescale_texcoord(), if it does something, will return just the
GLSL-sized coordinate, leaving out the 3rd and 4th components where we
were storing our projected shadow compare and the texture projector.
Deref the shadow compare before using the shared rescale-the-coordinate
code to fix the problem.

Fixes piglit tex-shadow2drect.shader_test and txp-shadow2drect.shader_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69525
NOTE: This is a candidate for stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 938956ad52)
2013-09-27 15:31:29 -07:00
Ian Romanick
beebb2d9d5 mesa: Support GL_MAX_VERTEX_OUTPUT_COMPONENTS query with ES3
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d38765f3c8)
2013-09-27 15:30:20 -07:00
Kenneth Graunke
e021b50227 i965: Fix brw_vs_prog_data_compare to actually check field members.
&a and &b are the address of the local stack variables, not the actual
structures.  Instead of comparing the fields of a and b, we compared
...some stack memory.

Caught by Valgrind on Piglit's glsl-lod-bias test (among many others).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68233
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 4e4b079916)
2013-09-27 15:30:11 -07:00
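The mistake described in the commit above is easy to reproduce in any comparison callback that receives pointers. The sketch below uses a hypothetical struct and function name, not the i965 types, and shows the corrected form.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the program data; two ints, so no padding. */
struct prog_data_sketch {
   int num_surfaces;
   int urb_entry_size;
};

bool
prog_data_equal(const void *a, const void *b)
{
   /* a and b already point at the structures. Passing &a and &b to
    * memcmp would instead compare the pointer variables themselves,
    * which live on this function's stack. */
   return memcmp(a, b, sizeof(struct prog_data_sketch)) == 0;
}
```

Note that comparing structs with memcmp is only reliable when padding bytes are absent or consistently initialized; that is an assumption of this sketch.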
Dave Airlie
3801e9a87e st/mesa: don't dereference stObj->pt if NULL
It seems a user app can get us into this state. I trigger the failure
by running fbo-maxsize inside virgl: it fails to create the backing
storage for the texture object, but then segfaults here when it
should fail the completeness test.

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 2f508f244e)
2013-09-27 15:30:03 -07:00
Andreas Boll
faec15dc7a os: First check for __GLIBC__ and then for PIPE_OS_BSD
Fixes FTBFS on kfreebsd-*

Debian GNU/kFreeBSD doesn't provide getprogname() since it uses stdlib.h
from glibc. Instead it provides program_invocation_short_name from glibc.

You can find the same order in src/mesa/drivers/dri/common/xmlconfig.c

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Tested-by: Julien Cristau <jcristau@debian.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 32637f56a5)
2013-09-27 15:29:54 -07:00
Kenneth Graunke
5461cc1f00 i965/vec4: Only zero out unused message components when there are any.
Otherwise, coordinates with four components would result in a MOV
with a destination writemask that has no channels enabled:

mov(8) g115<1>.F 0D { align16 WE_normal NoDDChk 1Q };

At best, this is stupid: we emit code that shouldn't do anything.
Worse, it apparently causes GPU hangs (observable with Chris's
textureGather test on CubeArrays.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 6c3db2167c)
2013-09-27 15:28:12 -07:00
Dominik Behr
4fbbf49cc5 glsl: propagate max_array_access through function calls
Fixes a bug where, if a uniform array is passed to a function, the accesses
to the array are not propagated. As a result, all but the first vector of
the uniform array are later removed in parcel_out_uniform_storage, resulting
in broken shaders and out-of-bounds access to arrays in
brw::vec4_visitor::pack_uniform_registers.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Dominik Behr <dbehr@chromium.org>
(cherry picked from commit 0f6fce1585)
2013-09-27 15:27:49 -07:00
Ilia Mirkin
130fda3d3b nv30: fix inconsistent setting of push->user_priv
It's set to &nv30->bufctx everywhere else.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 85f7df81a9)
2013-09-27 15:27:38 -07:00
Rico Schüller
3a2926fdbf glx: Initialize OpenGL version to 1.0
The old code in dri2_glx suffered from a typographical error that caused
the default version to be 2.1 instead of 1.2 (minimum required by the
Linux OpenGL ABI).  drisw_glx had a similar error resulting in a default
version of 0.1.

Some driver/card combinations (r200/RV280, i915/915G) don't support
OpenGL 2.1.  These create in some corner cases an indirect context
instead of a direct context when calling glXCreateContextAttribsARB().
This happens because of a bad default value.  To avoid this, just use
the default value specified by the GLX_ARB_create_context specification:

    "The default values for GLX_CONTEXT_MAJOR_VERSION_ARB and
    GLX_CONTEXT_MINOR_VERSION_ARB are 1 and 0 respectively. In this
    case, implementations will typically return the most recent version
    of OpenGL they support which is backwards compatible with OpenGL 1.0
    (e.g. 3.0, 3.1 + GL_ARB_compatibility, or 3.2 compatibility
    profile)"

Refactor all the default value setting to dri2_convert_glx_attribs, and
make sure the correct defaults are set in that one place.

Signed-off-by: Rico Schüller <kgbricola@web.de>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla http://bugs.winehq.org/show_bug.cgi?id=34238
Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>

(cherry picked from commit 8b302e1635)
2013-09-27 15:27:26 -07:00
Ian Romanick
616da8f818 glsl: Reallow precision qualifiers on structure members
Changes to the grammar for GL_ARB_shading_language_420pack (commit
6eec502) moved precision qualifiers out of the type_specifier production
chain.  This caused declarations such as:

    struct S {
        lowp float f;
    };

to generate parse errors.  Section 4.1.8 (Structures) of both the GLSL
ES 1.00 spec and GLSL 1.30 specs says:

        "Member declarators may contain precision qualifiers, but may not
        contain any other qualifiers."

So, it sure seems like we shouldn't generate a parse error. :)

Instead of type_specifier, use fully_specified_type in struct members.
However, fully_specified_type allows a lot of other qualifiers that are
not allowed on structure members, so explicitly disallow them.

Note, this makes struct_declaration look an awful lot like
member_declaration (used for interface blocks).  We may want to
(somehow) unify these rules to reduce code duplication at some point.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68753
Reported-by: Aras Pranckevicius <aras@unity3d.com>
Cc: Aras Pranckevicius <aras@unity3d.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 87252bf97b)
2013-09-27 15:27:14 -07:00
Maarten Lankhorst
72295c5f67 nvc0: restore viewport after blit
Based on calim's original fix in the nine branch.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ad4dc77231)
2013-09-27 15:27:05 -07:00
Christoph Bumiller
a6a2039a44 nvc0: delete compute object on screen destruction
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7fe159ba74)
2013-09-27 15:26:53 -07:00
Joakim Sindholt
f53b9849a1 nvc0: fix blitctx memory leak
Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2a7762bdb6)
2013-09-27 15:26:37 -07:00
Christoph Bumiller
50ffa8bac5 nvc0/ir: add f32 long immediate cannot saturate
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5399206056)
2013-09-27 15:26:27 -07:00
Tiziano Bacocco
0547f28134 nvc0/ir: fix use after free in texture barrier insertion pass
Fixes crash with Amnesia: The Dark Descent.

Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7086636358)
2013-09-27 15:26:16 -07:00
Emil Velikov
47da22626d nouveau: initialise the nouveau_transfer maps
Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit dc10251d08)
2013-09-27 15:26:04 -07:00
Chris Forbes
c3de1eea7f i965/fs: Gen4: Zero out extra coordinates when using shadow compare
Fixes broken rendering if these MRFs contained anything other than zero.

NOTE: This is a candidate for stable branches.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f35dea05b1)
2013-09-27 15:25:53 -07:00
Maarten Lankhorst
ab9322534c st/dri: do not create a new context for msaa copy
Commit b77316ad75
    st/dri: always copy new DRI front and back buffers to corresponding MSAA buffers

introduced creating a pipe_context for every call to validate, which is not required
because the callers have a context anyway.

Only exception is egl_g3d_create_pbuffer_from_client_buffer, can someone test if it
still works with NULL passed as context for validate? From examining the code I
believe it does, but I didn't thoroughly test it.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: 9.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit b217d48364)
2013-09-26 12:36:26 +02:00
Alex Deucher
2cda3f0e90 radeon/winsys: pad IBs to a multiple of 8 DWs
This aligns the gfx, compute, and dma IBs to 8 DW boundaries.
This aligns the IB to the fetch size of the CP for optimal
performance. Additionally, r6xx hardware requires at least 4
DW alignment to avoid a hw bug.  This also aligns the DMA
IBs to 8 DW which is required for the DMA engine.  This
alignment is already handled in the gallium driver, but that
patch can be removed now that it's done in the winsys.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: "9.2" <mesa-stable@lists.freedesktop.org>
CC: "9.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a81beee37e)
2013-09-17 07:45:42 +10:00
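The padding arithmetic implied by the commit above can be sketched as follows. `ib_padding_dwords` is a hypothetical helper that only computes how many filler dwords are needed; which filler packet the winsys actually emits is not stated in the message and is not modeled here.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of filler dwords needed to bring an
 * indirect buffer of num_dw dwords up to a multiple of 8 dwords.
 */
uint32_t
ib_padding_dwords(uint32_t num_dw)
{
   const uint32_t alignment = 8;
   uint32_t padded = (num_dw + alignment - 1) & ~(alignment - 1);

   return padded - num_dw;
}
```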
Ilia Mirkin
3b852f9d52 nv30: find first unused texcoord rather than bailing if first is used
This fixes shaders produced by supertuxkart.

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 3282697621)
2013-09-17 07:38:56 +10:00
Ian Romanick
fd31f5ee1d mesa: Note that 89a665e should not be picked
See also:

http://lists.freedesktop.org/archives/mesa-stable/2013-September/000251.html
http://lists.freedesktop.org/archives/mesa-stable/2013-September/000252.html

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-09-11 15:58:14 -05:00
Kenneth Graunke
c0253baaa0 i965/fs: Detect GRF sources in split_virtual_grfs send-from-GRF code.
It is incorrect to assume that src[0] of a SEND-from-GRF opcode is the
GRF.  For example, FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD uses src[1] for
the GRF.

To be safe, loop over all the source registers and mark any GRFs.  We
probably won't ever have more than one, but it's simpler to just check
all three rather than attempting to bail early.

Not observed to fix anything yet, but likely to.  Parallels the bug fix
in the previous commit, which actually does fix known failures.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit a35b320250)
2013-09-10 14:36:11 -05:00
Kenneth Graunke
9dd4e1ef85 i965/vs: Detect GRF sources in split_virtual_grfs send-from-GRF code.
It is incorrect to assume that src[0] of a SEND-from-GRF opcode is the GRF.
VS_OPCODE_PULL_CONSTANT_LOAD_GEN7 uses an IMM as src[0], and stores the
GRF as src[1].

To be safe, loop over all the source registers and mark any GRFs.  We
probably won't ever have more than one, but it's simpler to just check
all three rather than attempting to bail early.

Fixes assertion failures in Unigine Sanctuary since we started making
register allocation rely on split_virtual_grfs working.  (The register
classes were actually sufficient, we were just interpreting an IMM as
a virtual GRF number.)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68637
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 4e3d1712a2)
2013-09-10 14:36:04 -05:00
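The approach shared by the two split_virtual_grfs commits above, checking every source instead of assuming src[0], can be sketched like this. All type and function names are hypothetical and much simpler than the i965 backend IR.

```c
#include <stdbool.h>

/* Hypothetical register files: an immediate or a virtual GRF. */
enum reg_file_sketch { FILE_IMM, FILE_GRF };

struct src_reg_sketch {
   enum reg_file_sketch file;
   int reg;   /* virtual GRF number when file == FILE_GRF */
};

struct inst_sketch {
   struct src_reg_sketch src[3];
};

/*
 * Mark every virtual GRF used as a source of a send-from-GRF
 * instruction so that it will not be split.
 */
void
mark_grf_sources(const struct inst_sketch *inst, bool *no_split, int num_vgrfs)
{
   for (int i = 0; i < 3; i++) {
      const struct src_reg_sketch *s = &inst->src[i];

      /* Checking the file first avoids interpreting an immediate's
       * value as a virtual GRF number. */
      if (s->file == FILE_GRF && s->reg >= 0 && s->reg < num_vgrfs)
         no_split[s->reg] = true;
   }
}
```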
Eric Anholt
fbbe25ef26 mesa: Don't choose S3TC for generic compression if we can't compress.
If the app is asking us to do GL_COMPRESSED_RGBA, then the app obviously
doesn't have pre-compressed data to hand us.  So don't choose a storage
format that we won't actually be able to compress and store.

Fixes black screen in warzone2100 when libtxc_dxtn is not present.  Also
66 piglit tests.

NOTE: This is a candidate for the 9.2 branch.
Reported-by: Paul Wise <pabs@debian.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit bdf3f50e9a)
2013-09-10 14:35:59 -05:00
Eric Anholt
e5f788e1e0 mesa: Rip out more extension checking from texformat.c.
You should only be flagging the formats as supported if you support them
anyway.

NOTE: This is a candidate for the 9.2 branch. (required for next commit)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b188467fdf)
2013-09-10 14:35:52 -05:00
Anuj Phogat
26ae6ec1e8 glsl: Allow precision qualifiers for sampler types
GLSL 1.30 doesn't allow precision qualifiers on sampler types,
but in GLSL ES, sampler types are also allowed. This seems like
an oversight (since the intention of including these in GLSL 1.30
is to allow compatibility with ES shaders).

Currently, Mesa allows "default" precision qualifiers to be set for
sampler types in GLSL (commit d5948f2). This patch makes it follow
GLSL ES rules and also allow declaring sampler variables with a
precision qualifier in GLSL 1.30 (and later). e.g.
uniform lowp sampler2D sampler;

This fixes a shader compilation error in Khronos OpenGL conformance
test "depth_texture_mipmap".

V2: Update comments.
Signed-off-by: Ian Romanick <idr@lists.freedesktop.org>

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <idr@lists.freedesktop.org>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9c0b7be964)
2013-09-10 14:35:46 -05:00
Paul Berry
9586f4de71 i965: Initialize inout_offset parameter to brw_search_cache().
Two callers of brw_search_cache() weren't initializing that function's
inout_offset parameter: brw_blorp_const_color_params::get_wm_prog()
and brw_blorp_blit_params::get_wm_prog().

That's a benign problem, since the only effect of not initializing
inout_offset prior to calling brw_search_cache() is that the bit
corresponding to cache_id in brw->state.dirty.cache may not be set
reliably.  This is ok, since the cache_id's used by
brw_blorp_const_color_params::get_wm_prog() and
brw_blorp_blit_params::get_wm_prog() (BRW_BLORP_CONST_COLOR_PROG and
BRW_BLORP_BLIT_PROG, respectively) correspond to dirty bits that are
not used.

However, failing to initialize this parameter causes valgrind to
complain.  So let's go ahead and fix it to reduce valgrind noise.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66779

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b8f13fbb85)
2013-08-28 11:38:25 -07:00
Ian Romanick
fa892ecc04 Add .cherry-ignore file
Somebody forgot -x with git-cherry-pick...

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-28 11:37:08 -07:00
Brian Paul
2377205bcb docs: minor fixes for 9.2 release notes
Fix incorrect </li> tag, fix language.
2013-08-27 18:57:59 -06:00
Ian Romanick
8218eebc80 docs: Add 9.2 release md5sums
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-27 16:33:48 -07:00
Ian Romanick
46273ba256 mesa: Bump version to 9.2 (final)
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-27 15:52:35 -07:00
Ian Romanick
d3f99fb532 docs: Update release notes for 9.2
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-27 15:52:31 -07:00
Matt Turner
6fb2032c35 glsl: Disallow uniform block layout qualifiers on non-uniform block vars.
Cc: 9.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68460
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-26 23:21:24 -07:00
Kristian Lehmann
c0abf6499f Fixed and/or order mistake, resulting in compiling llvmpipe without llvm installed
Cc: 9.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68544
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit cec7b5c5bc)
2013-08-26 22:17:03 -07:00
Tom Stellard
1a9bda1f34 clover: Don't use PIPE_TRANSFER_UNSYNCHRONIZED for blocking copies
CC: "9.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit f3e86d4a68)
2013-08-26 22:16:59 -07:00
Michel Dänzer
59781051eb radeonsi: Also set the depth component mask bit for stencil-only exports
The stencil values come out wrong without this for some reason.

50 more little piglits.

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 46fd81e586)
2013-08-26 10:05:36 -07:00
Kenneth Graunke
f31a1d9f8d mesa: Set query->EverBound in glQueryCounter().
glIsQuery is supposed to return false for names returned by glGenQueries
until their first use.  BeginQuery is a use, but QueryCounter is also a
use.

From the ARB_timer_query spec:
"A timer query object is created with the command

      void QueryCounter(uint id, enum target);

 [...] If <id> is an unused query object name, the
 name is marked as used [...]"

Fixes Piglit's spec/ARB_timer_query/query-lifetime.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 7950315583)
2013-08-26 10:05:25 -07:00
Ilia Mirkin
3370dfdf3e nv30: add forgotten PIPE_CAP_CUBE_MAP_ARRAY cap to list
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bac6efe8e3)
2013-08-26 10:05:20 -07:00
Jon Severinsson
2153557906 gallium/osmesa: Link, not copy, the shared library to the LIB_DIR.
Just like all other mesa libraries...

CC: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b47bde0079)
2013-08-23 16:41:59 -07:00
Jon Severinsson
dda7358377 gallium/osmesa: Always link with the c++ linker.
Just like all other gallium targets...

CC: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit aeb9c9e4b0)
2013-08-23 16:41:55 -07:00
Jon Severinsson
3fd1ca7949 gallium/osmesa: Make and install an osmesa.pc.
As of "2f142d59 build: Add --enable-gallium-osmesa flag." the pkgconfig
file from classic osmesa is no longer installed when building gallium
osmesa, so copy it to gallium osmesa and install the copy instead.

CC: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit c811190430)
2013-08-23 16:41:51 -07:00
Timothy Arceri
4aa9f013d5 mesa: Fix assertion error with glDebugMessageControl
enums were being converted twice resulting in incorrect values.
The extra conversion has been removed and the redundant assert is
removed also.

Cc: 9.2 <mesa-stable@lists.freedesktop.org>

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit f0072e3c6b)
2013-08-23 16:41:46 -07:00
Kenneth Graunke
7aefdab219 mesa: Specify a better GL_MAX_SERVER_WAIT_TIMEOUT limit.
The previous value of (GLuint64) ~0 has some problems:

GL_MAX_SERVER_WAIT_TIMEOUT is supposed to be a GLuint64 value, but has
to be queried via GetInteger64v(), which returns a GLint64.  This means
that some applications are likely to treat it as a signed integer, where
~0 means -1.  Negative values are nonsensical and problematic.

When interpreted correctly, ~0 translates to about 0.58 million years,
which seems rather excessive.

This patch changes it to 0x1fff7fffffff, which is about 1.11 years.
This is still plenty long, and is the same as both an int64 and uint64.
Applications that accidentally store it in a 32-bit int/unsigned also
get a non-negative value, which is again the same as both int and
unsigned.  This value was suggested by Ian Romanick.
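
The signedness argument above can be checked with a small Python sketch.
The helper names are made up for illustration, and NEW_LIMIT is simply the
value as quoted in this message; nothing here is taken from the Mesa
sources.

```python
import struct

OLD_LIMIT = 0xFFFFFFFFFFFFFFFF  # (GLuint64) ~0
NEW_LIMIT = 0x1FFF7FFFFFFF      # value as quoted in the commit message

def as_int64(value):
    """Reinterpret the low 64 bits of value as a signed 64-bit integer."""
    return struct.unpack("<q", struct.pack("<Q", value & 0xFFFFFFFFFFFFFFFF))[0]

def as_uint32(value):
    """Keep only the low 32 bits, as an unsigned store would."""
    return value & 0xFFFFFFFF

def as_int32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))[0]
```

With these helpers, as_int64(OLD_LIMIT) is -1, while as_int64(NEW_LIMIT)
equals NEW_LIMIT, and as_int32(NEW_LIMIT) and as_uint32(NEW_LIMIT) are both
0x7fffffff.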

v2: Add the ULL suffix on the constant (suggested by Ian).

Fixes Piglit's spec/!OpenGL 3.2/get-integer-64v.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit a27180d0d8)
2013-08-23 16:41:38 -07:00
Ian Romanick
fe6526f439 mesa: Bump version to 9.2-rc2
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-22 15:21:55 -07:00
Ian Romanick
08b192d26a glsl: Give a warning, not an error, for UBO qualifiers on non-matrices.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59648
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit dded321f92)
2013-08-22 11:50:24 -07:00
Matt Turner
c1c076dd8d glsl: Remove ubo_qualifiers_allowed variable.
No longer used.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 921ef55a72)
2013-08-22 11:50:18 -07:00
Matt Turner
368fc4f3ec glsl: Drop duplicate error messages.
This same message is printed in the validate_matrix_layout_for_type
function.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 77373e020e)
2013-08-22 11:49:40 -07:00
Matt Turner
e14baf425b glsl: Rename ubo_qualifiers_valid to ubo_qualifiers_allowed.
The variable means that UBO qualifiers are allowed in a particular
context (e.g., not allowed in a struct field declaration), rather than a
particular set of UBO qualifiers are valid.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 1a45db9705)
2013-08-22 11:48:26 -07:00
Chad Versace
dce3865306 i965: Fix misapplication of gles3 srgb workaround
Fixes inconsistent failure of gles2conform/GL2Tests/glUniform/glUniform.test
under gnome-shell. What follows is a description of the bug and its fix.

When intel_update_renderbuffers() allocates a miptree for a winsys
renderbuffer, it propagates the renderbuffer's format to become also the
miptree's format.

If the winsys color buffer format is SARGB, then, in the first call to
eglMakeCurrent, intel_gles3_srgb_workaround() changes the renderbuffer's
format to ARGB. That is, it changes the format from sRGB to non-sRGB.
However, it changes the renderbuffer's format *after*
intel_update_renderbuffers() has allocated the renderbuffer's miptree.
Therefore, when eglMakeCurrent returns, the miptree format (SARGB)
differs from the renderbuffer format (ARGB).

If the X server reallocates the color buffer,
intel_update_renderbuffers() will create a new miptree for the
renderbuffer. The new miptree's format (ARGB) will differ from the old
miptree's format (SARGB). This mismatch between old and new miptrees
causes bugs.

Fix the bug by moving intel_gles3_srgb_workaround() to occur *before*
intel_update_renderbuffers().

CC: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67934
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit ce8639a766)
2013-08-22 11:46:24 -07:00
Michel Dänzer
8efdaedfc2 radeonsi: Fix y/z/w component values of TGSI_SEMANTIC_FOG pixel shader inputs
They are defined as constant 0.0/0.0/1.0.

Three more little piglits.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 237cb074cb)
2013-08-22 11:46:19 -07:00
Matt Turner
c47804d286 build: Add --enable-gallium-osmesa flag.
The Gallium implementation is apparently not ready for regular
consumption, so as much as I hate adding more build-time options, here's
another.

Acked-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 2f142d596f)
2013-08-21 23:09:04 -07:00
Matt Turner
5114ac3f87 i965: Don't copy propagate bitcasts with source modifiers.
Previously, copy propagation would cause bitcast_f2u(abs(float)) to
be performed in a single step, but the application of source modifiers
(abs, neg) happens after type conversion, leading to incorrect results.

That is, for bitcast_f2u(abs(float)) we would in fact generate code to
do abs(bitcast_f2u(float)).
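
The ordering problem can be reproduced outside the compiler. The Python
sketch below is only an illustration: int_abs_u32 models an integer abs as
the magnitude of the signed 32-bit reinterpretation, which is an assumption
made for this example and not a statement about the hardware's exact
behavior.

```python
import struct

def bitcast_f2u(x):
    """Return the IEEE-754 single-precision bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def int_abs_u32(bits):
    """Assumed model: abs applied to the signed 32-bit reinterpretation of bits."""
    signed = bits - (1 << 32) if bits & 0x80000000 else bits
    return abs(signed) & 0xFFFFFFFF

# What the shader asked for: modifier first, then bitcast.
intended = bitcast_f2u(abs(-1.0))            # 0x3F800000

# What results when the modifier is applied after the type conversion.
reordered = int_abs_u32(bitcast_f2u(-1.0))   # 0x40800000 under this model
```

The two values differ, so the modifier and the bitcast cannot be folded
into one operation without changing the result.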

For example, whereas bitcast_f2u(abs(float)) might result in a register
argument such as
   (abs)g2.2<0,1,0>UD

v2: Set interfered = true and break in register_coalesce instead of
    returning false.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
(cherry picked from commit 9c48ae751a)
2013-08-21 21:11:54 -07:00
Matt Turner
f0bc10679e i965: Emit MOVs for neg/abs.
Necessary to avoid combining a bitcast and a modifier into a single
operation. Otherwise if safe, the MOV should be removed by
copy-propagation or register coalescing.

With this and the next patch, there are only four changes in shader-db:
all a single extra instruction. The code does something like
   mov a.w, -b.x
and copy propagation doesn't work because it only handles no-op
swizzles. Seems acceptable, given the known limitation of our copy
propagation.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
(cherry picked from commit 0ae9ca12a8)
2013-08-21 21:11:51 -07:00
Armin K
3f438bfa4c osmesa: Symlink shared library to LIB_DIR
Cc: 9.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Brian Paul <brianp at vmware.com>
Reviewed-by: Brian Paul <brianp at vmware.com>
(cherry picked from commit 63ac68bae3)
2013-08-21 21:10:27 -07:00
Maarten Lankhorst
60e1b03455 glapi/gen: build temporary files in the build directory
Writing to the source directory can cause multiple parallel builds
from the same source to fail. Create the temporary files in the
build directory.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 10aa3677cc)
2013-08-21 15:26:05 -07:00
Ian Romanick
cf537c405b mesa: Never advertise _S3TC compressed formats
The NVIDIA driver doesn't expose them, and piglit's
arb_texture_compression-invalid-formats expects them to not be there.

This, with the previous commit, fixes piglit
arb_texture_compression-invalid-formats.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f53b634807)
2013-08-21 07:54:41 -07:00
Ian Romanick
601926515e mesa: Only advertise GL_ETC1_RGB8_OES in ES contexts
There is no extension for this format in desktop GL, so an application
can't give the format back to glCompressedTexImage2D.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 40550c8ced)
2013-08-21 07:54:37 -07:00
Ian Romanick
c9a7d6950b glsl: Track existence of default float precision in GLSL ES fragment shaders
This is required by the spec, and it's a bit tricky because the default
precision is scoped.  As a result, I'm slightly abusing the symbol
table.

Fixes piglit no-default-float-precision.frag tests and the piglit
default-precision-nested-scope-0[1234].frag tests that are currently on
the piglit mailing list for review.

On IRC I got confirmation from cwabbot that ARM (Mali T6xx and T400)
enforces this requirement and from kusma that NVIDIA (Tegra2) enforces
this requirement.  We should be safe from regressing shipping
applications.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cabd45773b)
2013-08-21 07:54:33 -07:00
Ian Romanick
89aff30f9b glsl: Merge precision qualifiers too
We never noticed this before because we previously didn't enforce GLSL ES
fragment shader requirements that precision be defined.  There may also
have been some interaction here with the addition of
GL_ARB_shading_language_420pack, but it doesn't appear to me that it
added any new bugs (just perhaps uncovered some old ones).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 73e2d69792)
2013-08-21 07:54:29 -07:00
Ian Romanick
3eae076d70 glsl: Pass type to is_valid_default_precision_type instead of name
This is used by the next patch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b15b62c54c)
2013-08-21 07:54:25 -07:00
Ross Burton
3ec07eaaa5 build: fix out-of-tree builds in gallium/auxiliary
The rules were writing files to e.g. util/u_indices_gen.py, but in an
out-of-tree build this directory doesn't exist in the build directory.  So,
create the directories just in case.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Ross Burton <ross.burton@intel.com>
(cherry picked from commit 76feef0823)
2013-08-21 07:54:22 -07:00
Michel Dänzer
35c9345711 radeonsi: Always pre-load separate VGPRs for centroid vs. center interpolation
The LLVM R600 backend currently always uses separate VGPRs for these.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68162
(Centroid interpolation is identical to center interpolation without
multisampling, so the shader hardware was only pre-loading one set of
interpolation coefficients, and the pixel shader code was using
uninitialized values as the centroid interpolation coefficients)

Cc: mesa-stable@lists.freedesktop.org
Tested-by: Laurent Carlier <lordheavym@gmail.com>
(cherry picked from commit be301f707e)
2013-08-21 07:54:18 -07:00
Maarten Lankhorst
74fcc65b58 gallium/osmesa: add same checks to OSMesaMakeCurrent as the other osmesa
Fixes an OpenGL crash in Wine.

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
(cherry picked from commit 86751cbddf)
2013-08-21 07:54:13 -07:00
Maarten Lankhorst
b802f6a124 gallium/osmesa: link against static libglapi library too to get the gl exports
This should fix missing symbols in an osmesa build against shared glapi.
All OpenGL exports that are defined in the static glapi were missing, so
link against both to fix this.

I could swear I've done this before, maybe there was a glitch in the matrix.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47824
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
(cherry picked from commit 603160d4c0)
2013-08-21 07:54:09 -07:00
Andreas Boll
170b952cfe docs: Add md5sums to 9.1.5 release notes
(cherry picked from commit 38903db439)
2013-08-21 07:53:04 -07:00
Andreas Boll
b97305bc21 docs: Fix a typo in the 9.1.6 release notes
(cherry picked from commit 7eaaf62434)
2013-08-21 07:52:57 -07:00
Carl Worth
ecd1d92baf docs: Add md5sums to 9.1.6 release notes
(cherry picked from commit 7f2f63409a)
2013-08-21 07:52:39 -07:00
Carl Worth
c8b5222074 docs: Import 9.1.6 release notes, add news item.
(cherry picked from commit 964b89e42a)
2013-08-21 07:52:18 -07:00
Carl Worth
bd83ff1923 get-pick-list: Allow for non-whitespace between "CC:" and "mesa-stable"
We recently proposed a new syntax for stable-patch nominations such as:

	CC: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>

and this has already appeared in the wild.

So we extend the regular expression to pick this up as well.
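
As a rough illustration of the nomination syntaxes described in this and
the previous get-pick-list change, here is a small Python matcher. The
pattern is hypothetical and written only for this example; the actual
expression used by get-pick-list.sh may differ.

```python
import re

# Hypothetical pattern; the real get-pick-list.sh expression may differ.
# "." does not match a newline, so "CC:" and "mesa-stable" must share a line.
NOMINATION = re.compile(
    r"^(?:CC:.*mesa-stable|NOTE: This patch is a candidate)",
    re.IGNORECASE | re.MULTILINE,
)

def is_nominated(commit_message):
    """Return True if any line of the message nominates the patch for stable."""
    return NOMINATION.search(commit_message) is not None
```

Allowing arbitrary text between "CC:" and "mesa-stable" is what lets a
quoted branch list such as "9.2 and 9.1" sit in front of the address.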
(cherry picked from commit c6f3036179)
2013-08-21 07:51:16 -07:00
Carl Worth
a8b08c5ccd get-pick-list.sh: Include commits mentioning "CC: mesa-stable..." in pick list
We recently adopted a new convention that patches can be nominated for the
stable branch by including a line in the commit message as follows:

	CC: mesa-stable@lists.freedesktop.org

This is a convenient syntax as "git send-email" will notice this line and
automatically copy the resulting patch email to the mesa-stable mailing list.

Here we extend the regular expression in the get-pick-list.sh script to also
notice this pattern (as well as the traditional "NOTE: This patch is a
candidate..." form).
(cherry picked from commit 122d8d2f5a)
2013-08-21 07:51:05 -07:00
Ian Romanick
796b4a7b40 mesa: Bump version to 9.2-rc1
2013-08-19 16:49:02 -07:00
Ian Romanick
d3004acdd1 glsl: Use alignment of container record for its first field
The first field of a record in a UBO has the alignment of the record
itself.
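
A worked example may help here. The Python sketch below applies the std140
rules for a handful of basic types and shows the first field of a nested
record landing on the record's own alignment. It is a simplified
illustration, not the Mesa implementation; the BASIC table and the function
names are made up for this example, and arrays and matrices are not
handled.

```python
def round_up(value, alignment):
    return (value + alignment - 1) // alignment * alignment

# (size, base alignment) in bytes for a few std140 basic types.
BASIC = {"float": (4, 4), "vec2": (8, 8), "vec4": (16, 16)}

def alignment_of(type_):
    """Basic types use the table; a struct is a list of (name, type) pairs."""
    if isinstance(type_, str):
        return BASIC[type_][1]
    # std140: largest member alignment, rounded up to that of a vec4.
    return round_up(max(alignment_of(t) for _, t in type_), 16)

def layout(fields, prefix="", offset=0, out=None):
    """Assign byte offsets to every leaf field; return (offsets, end offset)."""
    if out is None:
        out = {}
    for name, type_ in fields:
        # Align to the member's own alignment; for a struct this is the
        # record alignment, which its first field therefore inherits.
        offset = round_up(offset, alignment_of(type_))
        if isinstance(type_, str):
            out[prefix + name] = offset
            offset += BASIC[type_][0]
        else:
            _, offset = layout(type_, prefix + name + ".", offset, out)
            offset = round_up(offset, alignment_of(type_))
    return out, offset

# uniform Block { float x; struct { float f; vec2 v; } s; };
block = [("x", "float"), ("s", [("f", "float"), ("v", "vec2")])]
offsets, size = layout(block)
```

Here s.f is placed at byte 16 rather than byte 4, because it starts the
record s, whose alignment is 16.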

Fixes piglit vs-struct-pad, fs-struct-pad, and (with the patch posted to
the piglit list that extends the test) layout-std140.

NOTE: The bit of strangeness with the version of visit_field without the
record_type pointer is because that method is pure virtual in the base
class.  The original implementation of the class did this to ensure
derived classes remembered to implement that flavor.  Now they can
implement either flavor but not both.  I don't know a C++ way to enforce
that.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68195
Cc: "9.2 9.1" mesa-stable@lists.freedesktop.org
(cherry picked from commit 574e4843e9)
2013-08-19 16:40:07 -07:00
Ian Romanick
684316512c glsl: Add new overload of program_resource_visitor::visit_field method
The outer-most record is passed into the visit_field method for
the first field.  In other words, in the following structure:

    struct S1 {
        vec4 v;
        float f;
    };

    struct S {
        S1 s1;
        S1 s2;
    };

    uniform Ubo {
        S s;
    };

s.s1.v would get record_type = S (because s1.v is the first non-record
field in S), and s.s2.v would get record_type = S1.  s.s1.f and s.s2.f
would get record_type = NULL because they aren't the first field of
anything.
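
The rule can be traced with a short Python sketch. This is a simplified
model of the traversal described above, not the program_resource_visitor
code; the Struct class and the record_types function are hypothetical names
introduced for the example.

```python
class Struct:
    def __init__(self, name, fields):
        self.name = name
        self.fields = fields  # list of (field_name, type); type is Struct or str

def record_types(name, type_, record_type=None, out=None):
    """Map each leaf field to the name of the record it is the first field of."""
    if out is None:
        out = {}
    if isinstance(type_, Struct):
        # The outermost record wins: only start a new one if none is pending.
        if record_type is None:
            record_type = type_
        for index, (field_name, field_type) in enumerate(type_.fields):
            # Only the first field carries the record down the tree.
            record_types(name + "." + field_name, field_type,
                         record_type if index == 0 else None, out)
    else:
        out[name] = record_type.name if record_type is not None else None
    return out

S1 = Struct("S1", [("v", "vec4"), ("f", "float")])
S = Struct("S", [("s1", S1), ("s2", S1)])
result = record_types("s", S)
```

For the structures above, result maps s.s1.v to "S", s.s2.v to "S1", and
both f fields to None.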

This new overload isn't used yet, but the next patch will add several
uses.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Cc: "9.2 9.1" mesa-stable@lists.freedesktop.org
(cherry picked from commit 5ac884fd9f)
2013-08-19 16:40:03 -07:00
Ian Romanick
9f7f727345 glsl: Disallow embedded structure definitions
Continue to allow them in GLSL 1.10 because the spec allows it.
Generate an error in all other versions because the specs specifically
disallow it.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d9bb8b7b56)
2013-08-19 16:39:59 -07:00
Ian Romanick
1fb22bf143 meta: Add default precision qualifier to all fragment shaders
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5fb1dd51f3)
2013-08-19 16:39:55 -07:00
Ian Romanick
9fa7313e34 glsl: Add default precision qualifiers for ES builtins
Once the compiler properly checks for default precision qualifiers,
these shaders will cease to compile.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5ac247a73e)
2013-08-19 16:39:52 -07:00
Marek Olšák
6296abed15 glsl: don't eliminate texcoords that can be set by GL_COORD_REPLACE
Tested by examining generated TGSI shaders from piglit/glsl-routing.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Henri Verbeet <hverbeet@gmail.com>
Tested-by: Henri Verbeet <hverbeet@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit d13003f544)
2013-08-19 16:39:48 -07:00
Ilia Mirkin
5b8c943eb2 nv50: allow non-nv12 buffers to be created, just pass them through to vl
Since we expose non-NV12 formats as supported when there is no decoder
profile selected, make sure that those formats are actually allowed to
be allocated.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a8346a2f52)
2013-08-19 16:39:43 -07:00
Anuj Phogat
d72f7720a6 meta: Fix blitting a framebuffer with renderbuffer attachment
This patch fixes a case of framebuffer blitting with renderbuffer
as color attachment and GL_LINEAR filter. Meta implementation of
glBlitFrambuffer() converts source color buffer to a texture and
uses it to do the scaled blitting in to destination buffer. Using
the exact source rectangle to create the texture does incorrect
linear filtering along the edges. This patch makes the changes to
extend the texture edges by one pixel in x, y directions. This
ensures correct linear filtering.
It fixes failing piglit fbo-attachments-blit-scaled-linear test.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
CC: "9.2" <mesa-stable@lists.freedesktop.org>
CC: "9.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
(cherry picked from commit d944a6144f)
2013-08-16 12:36:38 -07:00
Ilia Mirkin
b40d9e4f41 nv30: remove no-longer-used formats from table
Commit 14ee790df7 removed the formats from the vtxfmt_table but forgot
to also update the info_table.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c1a6f59b20)
2013-08-16 12:36:34 -07:00
Kenneth Graunke
e2185778e2 i965: Force X-tiling for 128 bpp formats on Sandybridge.
128 bpp formats are not allowed to be Y-tiled on any architectures
except Gen7.

+11 Piglits on Sandybridge (mostly regression fixes since the
switch to Y-tiling).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63867
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64261
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c189840b21)
2013-08-16 12:36:31 -07:00
Laurent Carlier
a98d5f2663 mesa/program: remove useless YYID
This fixes the build with Bison 3.0. Also works with Bison 2.7.1.

CC: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5ffa28df4e)
2013-08-16 12:15:39 -07:00
Ian Romanick
24d1949ddc mesa/vbo: Fix handling of attribute 0 in non-compatibility contexts
It is only in OpenGL compatibility-style contexts where generic
attribute 0 and GL_VERTEX_ARRAY have a bizarre, aliasing relationship.
Moreover, it is only in OpenGL compatibility-style contexts and OpenGL
ES 1.x where one of these attributes provokes the vertex.  In all other
APIs each implicit call to glArrayElement provokes a vertex regardless
of which attributes are enabled.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Robert Bragg <robert@sixbynine.org>
Cc: "9.0 9.1 9.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55503
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66292
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67548
(cherry picked from commit 41eef83cc0)
2013-08-15 15:16:45 -07:00
Vinson Lee
996bc26c87 i915,i965: Fix memory leak in try_pbo_upload (v2)
Fixes "Resource leak" defect reported by Coverity.
Tested on Haswell, no Piglit regressions.

v2: Apply to i965, not just i915. (chadv)

CC: "9.2, 9.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 035bf21983)
2013-08-15 15:16:45 -07:00
Michel Dänzer
b055c8689e radeonsi: Don't leave gaps between position exports from vertex shader
If the vertex shader exports clip distances but not point size, use
position exports 1/2 instead of 2/3 for the clip distances. Fixes
geometry corruption in that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66974

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit b00269aa58)
2013-08-15 15:16:44 -07:00
Roland Scheidegger
d6d0175203 llvmpipe: fix stencil bug if we have both stencil and depth tests
This is a very well hidden bug found by accident (only the fixed glean
tstencil2 test so far seems to hit it).
We must use new mask with combined s_pass values and orig_mask values
for zpass/zfail stencil ops, otherwise both the sfail op and one of
zpass/zfail op are applied (probably not hit in most tests because
some of the ops tend to be KEEP usually).
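
The mask logic can be sketched in Python with one bit per pixel. This is an
illustration of the fix as described above, not the llvmpipe code; the
function name is hypothetical, and s_pass / z_pass stand for the per-pixel
stencil and depth test results.

```python
def stencil_op_masks(orig_mask, s_pass, z_pass):
    """Split live pixels into three mutually exclusive stencil-op groups.

    All arguments are integer bit masks with one bit per pixel.
    """
    sfail = orig_mask & ~s_pass   # stencil test failed: sfail op only
    z_mask = orig_mask & s_pass   # combined mask for the depth-dependent ops
    zfail = z_mask & ~z_pass      # stencil passed, depth failed
    zpass = z_mask & z_pass       # stencil passed, depth passed
    return sfail, zfail, zpass

sfail, zfail, zpass = stencil_op_masks(0b1111, 0b1100, 0b1010)
```

Using orig_mask alone for the zfail and zpass groups would let a pixel that
failed the stencil test receive a second op, which is the bug described
here.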

Note: this is a candidate for the 9.2 branch.

Reviewed-by: Zack Rusin <zackr@vmware.com>
(cherry picked from commit abdd32dcd5)
2013-08-15 15:16:44 -07:00
Ilia Mirkin
4ba5fd1052 nv30: U8_USCALED only works for size 4
See https://bugs.freedesktop.org/show_bug.cgi?id=61635 for a sample
program. Changing it to use a vec4 makes it work. Remove the unsupported
formats.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 14ee790df7)
2013-08-15 15:16:44 -07:00
Ian Romanick
872c09586c glsl: Emit better warnings for things that look like default precision statements
Previously we would emit a warning for empty declarations like

float;

We would also emit the same warning for things like

highp float;

However, this second case is most likely the application trying to set
the default precision.  This makes the compiler generate a stronger
warning with some suggestion of a fix.

It really seems like this should be an error.  I'll bet that 100% of the
time someone writes 'highp float;' they actually meant 'precision highp
float;'.  Alas, both AMD and NVIDIA accept this syntax, and the spec
doesn't explicitly forbid it.

This makes piglit's precision-05.vert generate the following warnings:

0:12(11): warning: empty declaration with precision qualifier, to set the default precision, use `precision lowp float;'
0:13(12): warning: empty declaration with precision qualifier, to set the default precision, use `precision mediump int;'

v2: Add { } around a one-line if body and fix a comment.  Suggested by
Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 830f4df993)
2013-08-15 15:16:44 -07:00
Roland Scheidegger
4f44202aae draw: always call util_cpu_detect() in draw context creation.
Since disabling denorms in draw_vbo() we require the util_cpu_caps to be
initialized there. Hence add another util_cpu_detect() call in
draw_create_context() which should ensure this.
(There is another call in draw_get_option_use_llvm() which only gets called
with x86 (not x86_64) but calling it always there wouldn't help since it most
likely wouldn't get called when compiling without llvm, so leave it alone
there.)
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=66806.
(Because util_cpu_caps wasn't initialized when first calling util_fpstate_get()
hence it returning zero, but it would later get initialized by rtasm translate
code hence when draw call returned it unmasked all exceptions by calling
util_fpstate_set(). This was happening only with DRAW_USE_LLVM=0 or not
compiling with llvm, otherwise the llvm init code was calling it on time too.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2013-08-15 17:38:27 +02:00
Jon Severinsson
33b581f6f6 radeon/llvm: Add missing "%s" format string to fprintf.
This fixes a compilation warning with -Wformat-security.

CC: "9.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit 9298f537a7)
2013-08-14 11:21:30 +02:00
Tapani Pälli
c088c24588 glsl: disable ARB_texture_cube_map_array_enable keywords for glsl es
Patch fixes a crash with WebGL 'shader-with-non-reserved-words'
conformance test by ignoring desktop extension keywords on GLSL ES.

v2: fix reserved and allowed desktop glsl versions (Chris)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64087
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 8c211dd742)
2013-08-13 18:13:35 -07:00
Armin K
7d6dcb61cd gbm: Link to libwayland-drm if Wayland EGL platform is enabled
We were relying on libEGL to pull in libwayland-client symbols, but
commit 2c2e64edab cleaned up the symbol leak.

CC: "9.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67962
Tested-by: Bryce Harrington <b.harrington@samsung.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit f423eba46e)
2013-08-13 18:11:22 -07:00
Ian Romanick
8025bac852 glsl: Require function return type arrays be explicitly sized
Fixes piglit array-function-return-unsized.vert.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1b35e33af4)
2013-08-13 17:57:39 -07:00
Ian Romanick
5d6dc93490 glsl: Move and refine test for unsized arrays in GLSL ES
GLSL ES does not allow unsized arrays, and GLSL ES 1.00 does not allow
array initializers.  However, GLSL ES 3.00 allows array initializers,
and the initializer can explicitly size the array.  The specification
even includes some examples of this:

    float x[] = float[2] (1.0, 2.0);     // declares an array of size 2
    float y[] = float[] (1.0, 2.0, 3.0); // declares an array of size 3

    float a[5];
    float b[] = a;

Move the unsized array check to after the initializer has been
processed.  If the array is still unsized, generate the error.  This
should have no effect in GLSL ES 1.00 because, as previously mentioned,
array initializers are not allowed.

Fixes piglit "glsl-es-3.00 compiler array-sized-by-initializer.vert".

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 42624b1c81)
2013-08-13 17:57:02 -07:00
Ian Romanick
31f582abd4 glx: Generate GLXBadDrawable when drawable is zero
Fixes piglit glx-query-drawable-GLXBadDrawable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d5aee174b8)
2013-08-13 17:54:29 -07:00
Ian Romanick
0b131ae24f mesa: Use _mesa_detach_renderbuffer when deleting a texture
The functional change is that now invalidate_framebuffer is called if
the texture is actually detached from one of the currently bound FBOs.
Previously this was only done for renderbuffers.

The remaining changes make the texture delete path look more similar to
the renderbuffer delete path.  This includes adding relevant spec
quotations to justify the behavior.

Fixes piglit fbo-incomplete "delete texture of bound FBO" test.

v2: Move 'fb->Attachment[i].Texture == att' check from previous patch to
this patch... where it was intended to be in the first place.  Noticed
by Chad.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ef83bd2b95)
2013-08-13 17:54:24 -07:00
Ian Romanick
8ee4a4e417 mesa: Make detach_renderbuffer available outside fbobject.c
Also add a return value indicating whether any work was done.

This will be used by the next patch.

v2: Move 'fb->Attachment[i].Texture == att' check to the next
patch... where it was intended to be in the first place.  Noticed by
Chad.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 438cc6bc49)
2013-08-13 17:54:21 -07:00
Ian Romanick
0c405cd0e8 meta: Don't call _mesa_Ortho with width or height of 0
Fixes failures in oglconform fbo mipmap.manual.color,
mipmap.manual.colorAndDepth, mipmap.automatic, and
mipmap.manualIterateTexTargets subtests.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 341fb93c16)
2013-08-13 17:54:17 -07:00
Vadim Girlin
b76ff3dbcd r600g/sb: use MULADD workaround on R7xx for MULADD_IEEE
Looks like the same issue that was seen with MULADD in trans slot on
R7xx also affects MULADD_IEEE (maybe all OP3 instructions are affected and
MULADD is just the most frequently used?). So the workaround is to not allow affected
instructions to be placed into the trans slot.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=67927

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 17bb96b03d)
2013-08-13 17:54:12 -07:00
Ian Romanick
cb8e109492 glsl: Don't allow const on out or inout function parameters
Fixes piglit tests const-inout-parameter.frag and
const-out-parameter.frag.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5894898148)
2013-08-09 15:26:08 -07:00
Alex Deucher
4006fc4656 r600g: disable GPUVM by default
Cayman and trinity systems still seem to suffer from
stability problems with GPUVM.  This also fixes compute
on these asics.  It can still be enabled for testing
by setting env var RADEON_VA=true.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=65958

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: "9.2" <mesa-stable@lists.freedesktop.org>
CC: "9.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit c88783047e)
2013-08-09 15:26:05 -07:00
Chad Versace
731a08341e egl: Do not export private symbols
libEGL was incorrectly exporting *all* symbols, public and private.
This patch adds -fvisibility=hidden to libEGL's linker flags to ensure
that only symbols annotated with __attribute__((visibility("default")))
get exported.

Sanity-checked with libEGL's builtin DRI2 driver and the i965 DRI driver
by running Piglit on X/EGL and by running weston-gears on Weston as an
X client.

Sanity-checked with libEGL's Gallium driver (which is not built-in) and
the swrast Gallium driver by running es2gears_x11.

Kristian reviewed the symbol diff in `nm libEGL.so`.

CC: "9.2" <mesa-stable@lists.freedesktop.org>
CC: Ian Romanick <idr@freedesktop.org>
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 2c2e64edab)
2013-08-09 15:26:00 -07:00
Kenneth Graunke
3da0c76ec0 i965: Remember to call intel_prepare_render() before blitting.
Otherwise, blits to the window system buffer may cause crashes,
since dst_irb->mt may be NULL.

This code is lifted straight out of brw_blorp_framebuffer()'s
try_blorp_blit() helper.

Fixes crashes in Piglit's fbo-sys-blit on systems without BLORP.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65919
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fb3d62fe3d)
2013-08-09 15:25:56 -07:00
Tom Stellard
10ff10c89e r300g/compiler/tests: Pass the required LDFLAGS when building the test program
CC: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d0c13fba17)
2013-08-07 18:35:28 -07:00
Tom Stellard
12da1bcb3b r300g/compiler/tests: Fix segfault
CC: "9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d691ba4d94)
2013-08-07 18:35:21 -07:00
Emil Velikov
195e995968 nv50: handle pure integer vertex attributes
And as a side effect fix a crash in the following piglit test:
general/attribs GL3

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "9.2 and 9.1" mesa-stable@lists.freedesktop.org
(cherry picked from commit 07c8f7a6f8)
2013-08-06 19:53:33 -07:00
Ian Romanick
6f9b090719 mesa: Generate a renderbuffer wrapper even if the texture has no image
This prevents a segfault in check_begin_texture_render when an FBO is
rebound while in this state.  This fixes the piglit test
fbo-incomplete-invalid-texture.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org
(cherry picked from commit 2f9fe2d80a)
2013-08-06 12:20:12 -07:00
Ian Romanick
70c9e07bd4 mesa: Validate the layer selection of an array texture too
Previously only the slice of a 3D texture was validated in the FBO
completeness check.  This fixes the failure in the 'invalid layer of an
array texture' subtest of piglit's fbo-incomplete test.

v2: 1D_ARRAY textures have Depth == 1.  Instead, compare against Height.

v3: Handle CUBE_MAP_ARRAY textures too.  Noticed by Marek.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org
(cherry picked from commit 25281fef0f)
2013-08-06 12:20:08 -07:00
Ian Romanick
d383ff0843 mesa: Don't call driver RenderTexture for invalid zoffset
This fixes the segfault in the 'invalid slice of 3D texture' and
'invalid layer of an array texture' subtests of piglit's fbo-incomplete
test.

The 'invalid layer of an array texture' subtest still fails.

v2: Fix off-by-one comparison error noticed by Chris Forbes.  Also,
1D_ARRAY textures have Depth == 1.  Instead, compare against Height.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org
(cherry picked from commit 41485fea7c)
2013-08-06 12:20:05 -07:00
Ian Romanick
d1419857d7 mesa: Don't call driver RenderTexture for really broken textures
This fixes the segfault in the '0x0 texture' subtest of piglit's
fbo-incomplete test.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org
(cherry picked from commit fb49713f8e)
2013-08-06 12:20:01 -07:00
Ian Romanick
c15b2d86e2 mesa: Remove stray debug printfs in attachment completeness code
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org
(cherry picked from commit 0c3dbd689b)
2013-08-06 12:19:58 -07:00
Ian Romanick
1e0ad955e7 mesa: Treat glBindFramebuffer and glBindFramebufferEXT more correctly
Allow user-generated names for glBindFramebufferEXT on desktop GL.
Disallow its use altogether for core profiles.

Names bound with glBindFramebuffer in desktop OpenGL are still
(incorrectly) shared across the share group instead of being
per-context.  This gets us a bit closer to being strictly conformant.

v2: Disallow glBindFramebufferEXT in 3.1 by not installing it in the
dispatch table.  Suggested by Jordan.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> [v1]
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 4a9522a5a0)
2013-08-06 12:19:55 -07:00
Ian Romanick
9aeb967e75 mesa: Treat glBindRenderbuffer and glBindRenderbufferEXT correctly
Allow user-generated names for glBindRenderbufferEXT on desktop GL.
Disallow its use altogether for core profiles.

v2: Disallow glBindRenderbufferEXT in 3.1 by not installing it in the
dispatch table.  Suggested by Jordan.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> [v1]
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 97965e87fc)
2013-08-06 12:19:40 -07:00
Ian Romanick
001c29cb18 mesa: Disable GL_EXT_framebuffer_object in core profiles and OpenGL 3.1
GL_EXT_framebuffer_object differs from GL_ARB_framebuffer_object in ways
that we can't and don't implement in core profiles.  Exposing it is a
lie, so we shouldn't do that.

It's possible that some other GL_EXT_framebuffer_* extensions should be
disabled, but it's not quite so clear cut.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b55c1638ad)
2013-08-06 09:44:17 -07:00
Matt Turner
c331562158 Makefile.am: Remove api_exec_es* from EXTRA_FILES.
These files were removed in commits a0102154 and a8ab7e33.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-08-06 09:38:38 -07:00
Marek Olšák
8e1d37161f st/dri: add a new driconf option disable_shader_bit_encoding for Unigine
Now Unigine Heaven 3.0 finally works with r600g.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 7568a89500)
2013-08-06 09:26:08 -07:00
Marek Olšák
78e760c234 mesa,glsl,st/dri: add a new driconf option force_glsl_version for Unigine
See documentation in mtypes.h.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0f6a7cb00c)
2013-08-06 09:26:01 -07:00
Marek Olšák
71891ce017 driconf: enable app-specific workarounds for all drivers
They were only enabled for i965.

Note that drirc must be installed in /etc.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7f2f804c75)
2013-08-06 09:25:54 -07:00
Marek Olšák
a19bc84380 st/dri: remove more unused driconf options
vblank_mode is read by dri_util.c and falls under the "dri2" driver name,
which is not connected to the actual Mesa/Gallium driver in any way.

Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 772070527f)
2013-08-06 09:25:48 -07:00
Marek Olšák
73bde3b8ff st/dri: implement the driconf option force_s3tc_enable properly
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 83dbe61ea4)
2013-08-06 09:25:43 -07:00
Marek Olšák
4a37827752 driconf: remove the unused option allow_large_textures
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit f27f3a4b15)
2013-08-06 09:25:36 -07:00
Marek Olšák
adc87c5e3f st/dri: support the driconf option disable_blend_func_extended
This is needed for Unigine.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 2acc27cc6d)
2013-08-06 09:25:27 -07:00
Marek Olšák
4d7ebeb51e st/osmesa: initialize disable_glsl_line_continuations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 71e0b5d688)
2013-08-06 09:25:16 -07:00
Michel Dänzer
6d8f471640 radeonsi: Number of SGPRs retrieved from LLVM already includes VCC
Fixes spurious 'Assertion `num_sgprs <= 104' failed.' with shaders using
all 104 SGPRs.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 46b6f79fea)
2013-08-06 09:22:28 -07:00
Andreas Boll
687415cf70 docs: Document UVD (2.2 and 3.0) video decoding support in mesa 9.2
Cc: "9.2" mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9d569fed8d)
2013-08-05 17:25:56 -07:00
Andreas Boll
e4f81bdbc4 docs: Document that i965 Gen6+ requires Kernel 3.6 or later
Cc: "9.2" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ec4a6a94b1)
2013-08-05 17:25:53 -07:00
Samuel Pitoiset
6c25c0a0da nvc0: properly align NVE4_COMPUTE_MP_TEMP_SIZE
MP_TEMP_SIZE must be aligned to 0x8000, while TEMP_SIZE on NVE4_3D
must be aligned to 0x20000, so perform both alignments to be sure
we allocate enough space (actually the bo will most likely use 128
KiB pages and not aligning to that would be a waste anyway).
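
The double alignment can be sketched as follows. This is only an illustration of the arithmetic, not the driver code: `example_align` and `example_temp_size` are hypothetical helpers, and only the two constants come from the message above.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of a power-of-two alignment. */
static uint64_t example_align(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Apply both alignments. Because 0x20000 is a multiple of 0x8000,
 * the second step keeps the result aligned to 0x8000 as well. */
static uint64_t example_temp_size(uint64_t requested)
{
    uint64_t size = example_align(requested, 0x8000);
    return example_align(size, 0x20000);
}
```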

Cc: "9.2" mesa-stable@lists.freedesktop.org
(cherry picked from commit ef6d5ee9f3)
2013-08-05 17:25:51 -07:00
Kenneth Graunke
526e71bfcc mesa/program: Switch from the deprecated YYLEX_PARAM to %lex-param.
YYLEX_PARAM is no longer supported as of Bison 3.0.  Instead, the Bison
developers recommend using %lex-param.

%lex-param takes a type and variable name, similar to %parse-param,
so you can't pass an arbitrary expression like state->scanner.  But Flex
insists on passing the actual scanner object, not an arbitrary object
like state.

To solve this, the parser defines a wrapper lex() function which accepts
"state," and calls Flex's lex() function with state->scanner.

Fixes the build with Bison 3.0.  Also works with Bison 2.7.1.
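
The wrapper approach described above can be sketched in plain C. Everything named `example_*` is a hypothetical stand-in: the real parser state, scanner type, and Flex-generated function have different names and signatures, and the %lex-param directive itself lives in the grammar file rather than in C code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the scanner object and the parser state. */
struct example_scanner { int next_token; };
struct example_parse_state { struct example_scanner *scanner; };

/* Stand-in for the Flex-generated lexer, which insists on the scanner. */
static int example_flex_lex(int *lval, struct example_scanner *scanner)
{
    *lval = scanner->next_token;
    return scanner->next_token;
}

/* Wrapper the Bison parser would call. With a directive along the lines of
 *     %lex-param {struct example_parse_state *state}
 * Bison passes the named variable, so the wrapper unpacks state->scanner. */
static int example_yylex(int *lval, struct example_parse_state *state)
{
    assert(state != NULL && state->scanner != NULL);
    return example_flex_lex(lval, state->scanner);
}
```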

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67354
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>
Cc: "9.2" mesa-stable@lists.freedesktop.org
(cherry picked from commit 6d2a9220b8)
2013-08-05 17:25:48 -07:00
Kenneth Graunke
06aee8a56c mesa/program: Change the program parser's namespace.
Bison 3.0 removes the YYLEX_PARAM macro.  In preparation for handling
this using %lex-param, the parser needs a wrapper function for the
actual Flex lex() function.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67354
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>
Cc: "9.2" mesa-stable@lists.freedesktop.org
(cherry picked from commit de917b4c4c)
2013-08-05 17:25:45 -07:00
Kenneth Graunke
b319e3975e glsl: Switch from the deprecated YYLEX_PARAM to %lex-param.
YYLEX_PARAM is no longer supported as of Bison 3.0.  Instead, the Bison
developers recommend using %lex-param.

%lex-param takes a type and variable name, similar to %parse-param,
so you can't pass an arbitrary expression like state->scanner.  But Flex
insists on passing the actual scanner object, not an arbitrary object
like state.

To solve this, the parser defines a wrapper lex() function which accepts
"state," and calls Flex's lex() function with state->scanner.

Fixes the build with Bison 3.0.  Also works with Bison 2.7.1.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67354
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>
Cc: "9.2" mesa-stable@lists.freedesktop.org
(cherry picked from commit f043381334)
2013-08-05 17:25:42 -07:00
Kenneth Graunke
d4c2c5a739 glsl: Change the lexer's namespace.
Bison 3.0 removes the YYLEX_PARAM macro.  In preparation for handling
this using %lex-param, the parser needs a wrapper function for the
actual Flex lex() function.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67354
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>
Cc: "9.2" mesa-stable@lists.freedesktop.org
(cherry picked from commit eb7c8c7fb6)
2013-08-05 17:25:40 -07:00
Eric Anholt
a3f48d97cd egl: Restore "bogus" DRI2 invalidate event code.
I had removed it in commit 1e7776ca2b
because it was obviously wrong -- why do we care whether the server is a
version that emits events, if we're not watching for the server's events,
anyway?  And why would you only invalidate on a server that emits
invalidate events, when the comment said to emit invalidates if the server
*doesn't*?  Only, I missed that we otherwise don't flag that our buffers
might have changed at swap time at all, so the driver was only checking
for new buffers when triggered by the Viewport hack.  Of course you don't
expect Viewport to be called after a swap.

So, this is effectively a revert of the previous commit, except that I
dropped the check for only emitting invalidates on a new server -- we
*always* need to invalidate if we're doing a SwapBuffers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63435
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 and 9.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit eed0a80137)
2013-08-05 17:07:15 -07:00
Mikko Juola
030ada7a50 mesa: fix multisampling proxy textures not being queryable
The code that checks if some texture target is valid for
glGetTexLevelParameter*() was not programmed to check for multisampling
proxy textures.  This made it impossible(?) to use the proxy textures
for their intended purpose as glGetTexLevelParameter*() would just fail
on you.

Reviewed-by: Brian Paul <brianp@vmware.com>

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8624a514c2)
2013-08-05 17:07:11 -07:00
Mikko Juola
b61036fa2d mesa: fix proxy textures becoming immutable and unusable
glTexStorage*() functions make textures immutable.  This carries on to
proxy textures.  Error checking in texture storage functions prevents
proxy textures from working after the first time because, internally, they
become immutable.

This commit makes the error checking ignore the immutability flag when
working with proxy textures.

Reviewed-by: Brian Paul <brianp@vmware.com>

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit e404105e7d)
2013-08-05 17:07:06 -07:00
Mikko Juola
ddf6f591a9 mesa: fix proxy textures not working with default texture binding
When working with the glTexStorage*() functions, the error checking
checks that a non-default (i.e., non-zero) texture is currently bound.
However, this check made glTexStorage*() functions fail with proxy
textures when the default texture is bound. Proxy textures do not care
about the current texture bindings so for them this check should not
be done.

Reviewed-by: Brian Paul <brianp@vmware.com>

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 3f3f66fd94)
2013-08-05 17:07:03 -07:00
Mikko Juola
e3dbfc5769 mesa: fix number of mipmaps calculation for proxy textures
The function _mesa_get_tex_max_num_levels() is supposed to calculate
the number of mipmap levels but it was not written to handle proxy
textures, at best returning a maximum of 1 mipmap level. Because of
this, at least glTexStorage*() calls would incorrectly fail when used
with proxy textures with more than one mipmap level.

Reviewed-by: Brian Paul <brianp@vmware.com>

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit de7e3741eb)
2013-08-05 17:06:59 -07:00
Brian Paul
cdcba2878a mesa: improve free() cleanup in generate_mipmap_compressed()
Free all our temporary buffers in one place at the end of the
function.  Fixes memory leak detected by Coverity.

Note: This is a candidate for the 9.x branches
Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit e5f32a0b3a)
2013-08-05 16:58:05 -07:00
Chris Forbes
8efee44c38 i965/vs: Put lod parameter in the correct place for Gen4
This was never visible before due to the bogus sampler state pointer.
Fixes remaining vertex texturing breakage on Gen4.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit cace82b0cd)
2013-08-05 16:58:01 -07:00
Chris Forbes
3bdd95270d i965/vs: set up sampler state pointer for Gen4/5.
Fixes broken filter and lod selection for vertex texturing.
(txs/txf only worked properly because they ignore the sampler state
completely)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 97676032c2)
2013-08-05 16:57:57 -07:00
Marek Olšák
5476049e38 st/mesa: fix opcode translation for ARB_shader_bit_encoding functions
We treat the opcodes as MOVs, but we should at least change the type
of the expression, which later affects which TGSI opcode is chosen.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 369c829152)
2013-08-05 16:57:53 -07:00
Marek Olšák
a2dbaeb2d8 gallium/postprocessing: convert blits to pipe->blit
PP saves current states to cso_context and then util_blit_pixels does
the same. cso_context doesn't like that and the original state is not
correctly restored.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 4c89ec1f69)
2013-08-05 16:57:49 -07:00
Marek Olšák
ded1695494 gallium/postprocessing: fix shader parsing
tokens was converted to a pointer, which made the Elements macro return 1.

Broken by e87fc11cac.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit c84e8d039e)
2013-08-05 16:57:46 -07:00
Marek Olšák
3213c60d81 mesa: default texture buffer format should be R8 in the core profile
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

v2: Since we don't expose the extension in the compatibility profile,
    the "if (API == CORE) .. else .." statement is removed.
(cherry picked from commit 7db83d8d4b)
2013-08-05 15:41:14 -07:00
Marek Olšák
678ac190a5 mesa: default DEPTH_TEXTURE_MODE should be RED in the core profile
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit a6b1a7c0d2)
2013-08-05 15:41:14 -07:00
Marek Olšák
771b576da6 st/mesa: fix sRGB renderbuffers without EXT_framebuffer_sRGB support
https://bugs.freedesktop.org/show_bug.cgi?id=59322

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 1302c66896)
2013-08-05 15:41:14 -07:00
Marek Olšák
9c66a29358 Revert "r300g: Give CLIP_DISABLE another try"
This reverts commit e866bd1ade.

https://bugs.freedesktop.org/show_bug.cgi?id=57875

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 4dfe1a0df5)
2013-08-05 15:41:14 -07:00
Ian Romanick
5154c93fa8 glsl: Less const for glsl_type convenience accessors
The second 'const' says that the pointer itself is constant.  This is
unenforceable in C++, so GCC emits a warning (see below) for each of
these functions in every file that includes glsl_types.h.  It's a lot of
warning spam.

../../../src/glsl/glsl_types.h:176:58: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
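
A minimal sketch of the pattern, written in C, where the same qualifier placement is likewise ignored. The names `example_type` and `example_get_type` are hypothetical and are not the accessors from glsl_types.h; the warning-producing form appears only in a comment so that the snippet compiles cleanly.

```c
#include <assert.h>
#include <stddef.h>

struct example_type { int base_type; };

static const struct example_type example_types[] = { { 0 }, { 1 }, { 2 } };

/*
 * Form that draws the warning (shown only as a comment):
 *     const struct example_type *const example_get_type(size_t index);
 * The trailing const qualifies the returned pointer value itself. The
 * caller receives a copy of that value, so the qualifier has no effect.
 *
 * Form without the redundant qualifier: the pointee stays const, and the
 * returned pointer is an ordinary value.
 */
static const struct example_type *example_get_type(size_t index)
{
    assert(index < sizeof(example_types) / sizeof(example_types[0]));
    return &example_types[index];
}
```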

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 803f755ede)
2013-08-05 15:41:14 -07:00
Kenneth Graunke
8a27c824ec glsl: Disallow auxiliary storage qualifiers on FS outputs.
This has always been an error; we just forgot to check for it.

Fixes Piglit's no-aux-qual-on-fs-output.frag.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67333
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 17856726c9)
2013-08-05 15:41:14 -07:00
Kenneth Graunke
4953bf3837 glsl: Classify "layout" like other identifiers.
When "layout" isn't being lexed as LAYOUT_TOK, we should treat it like
an ordinary identifier.  This means we need to classify it to determine
whether we should return IDENTIFIER, TYPE_IDENTIFIER, or NEW_IDENTIFIER.

Fixes the WebGL conformance test "shader-with-non-reserved-words."

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64087
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c178ec0d7e)
2013-08-05 15:41:14 -07:00
Chris Forbes
c7bfe87721 i965/vs: Fix flaky texture swizzling
If any component used the ZERO or ONE swizzle, its corresponding member
in the `swizzle` array would never be initialized. We *mostly* got away
with this, except when that memory happened to contain a value that
clobbered another channel when combined using BRW_SWIZZLE4().
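
To illustrate how a stale value can clobber a neighbouring channel, the sketch below packs four channel selectors at two bits each. The macro `EXAMPLE_SWIZZLE4`, the enum values, and `example_pack` are assumptions made for illustration; giving the unused slots a defined value is one way to avoid the problem, not necessarily what this commit does.

```c
#include <assert.h>

enum { EX_X = 0, EX_Y = 1, EX_Z = 2, EX_W = 3, EX_ZERO = 4, EX_ONE = 5 };

/* Packs four selectors at two bits per channel (assumed layout). A value
 * wider than two bits spills into the bits of the next channel. */
#define EXAMPLE_SWIZZLE4(a, b, c, d) \
    (((a) << 0) | ((b) << 2) | ((c) << 4) | ((d) << 6))

static unsigned example_pack(const int requested[4])
{
    int swizzle[4];

    for (int i = 0; i < 4; i++) {
        if (requested[i] == EX_ZERO || requested[i] == EX_ONE) {
            /* ZERO/ONE are handled elsewhere; still store a defined value
             * so no stale bits can leak into neighbouring channels. */
            swizzle[i] = EX_X;
        } else {
            assert(requested[i] >= EX_X && requested[i] <= EX_W);
            swizzle[i] = requested[i];
        }
    }
    return (unsigned)EXAMPLE_SWIZZLE4(swizzle[0], swizzle[1],
                                      swizzle[2], swizzle[3]);
}
```

For instance, packing a garbage value of 7 into the second slot sets bit 4, which belongs to the third channel.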

NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 124f567f1d)
2013-08-05 15:41:14 -07:00
Dave Airlie
c6f6b4e161 gallium/vl: add prime support
This fixes the dri2 opening to check whether DRI_PRIME is set and to
pick the correct drm device path to open. Together with a change to
libvdpau, this at least allows vdpauinfo to work.

Martin Peres tested with nouveau, and there seems to be a further
issue with final displaying: it only works sometimes. This patch is
at least necessary to help debug further.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67283
Tested-by: Armin K. <krejzi@email.com>
(cherry picked from commit 19338157c9)
2013-08-05 15:41:14 -07:00
Kenneth Graunke
4f5e18cb3e Revert "i965: Delete pre-DRI2.3 viewport hacks."
This reverts commit c9db037dc9.

Eric believes that the viewport hacks are still necessary for EGL;
invalidate events aren't hooked up properly.

This commit caused a regression where EFL applications wouldn't show
anything other than window decorations; GLBenchmark also showed issues.

The revert had conflicts due to the intel_context/brw_context merge.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66606
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 0e9549e2bd)
2013-08-05 15:41:13 -07:00
Paul Berry
e108bb07a0 glsl: Handle empty if statement encountered during loop analysis.
The is_loop_terminator() function was asserting that the following
kind of if statement could never occur:

    if (...) { } else { }

(presumably based on the assumption that such an if statement would be
eliminated by previous optimization stages).  But that isn't the
case--it's possible that previous optimization stages might simplify
more complex code down to this empty if statement, in which case it
won't be eliminated until the next time through the optimization loop.

So is_loop_terminator() needs to handle it.  Fortunately it's easy to
handle--it's not a loop terminator because it does nothing.
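
The check can be sketched with a toy IR. The types and the function below are hypothetical simplifications, not the GLSL compiler's real classes; they only show the empty-branch case returning "not a terminator" instead of tripping an assertion.

```c
#include <assert.h>
#include <stddef.h>

/* Toy IR: each branch is represented by its first instruction, or NULL
 * when the branch is empty. All names here are hypothetical. */
enum example_kind { EXAMPLE_BREAK, EXAMPLE_OTHER };
struct example_inst { enum example_kind kind; };
struct example_if {
    const struct example_inst *then_head;
    const struct example_inst *else_head;
};

static int example_is_loop_terminator(const struct example_if *stmt)
{
    assert(stmt != NULL);
    if (stmt->else_head != NULL)
        return 0;               /* a terminator has no else branch */
    if (stmt->then_head == NULL)
        return 0;               /* if (...) { } else { } does nothing */
    return stmt->then_head->kind == EXAMPLE_BREAK;
}
```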

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64330
CC: mesa-stable@lists.freedesktop.org

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit a5eecb246d)
2013-08-05 15:41:13 -07:00
Brian Paul
55241e9958 mesa: implement mipmap generation for compressed 2D array textures
We weren't looping over all the slices in the array.  The updated
code should also correctly handle 3D compressed textures, whenever
we have that feature.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66850

NOTE: This is a candidate for the 9.x branches
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 8a9df7a370)
2013-08-05 15:41:13 -07:00
Brian Paul
6237090330 meta: handle 2D texture arrays in decompress_texture_image()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66850

NOTE: This is a candidate for the 9.x branches.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 484fa87984)
2013-08-05 15:41:13 -07:00
Brian Paul
55ab069e5f mesa: handle 2D texture arrays in get_tex_rgba_compressed()
If we call glGetTexImage() for a compressed 2D texture array we need
to loop over all the slices.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66850

NOTE: This is a candidate for the 9.x branches.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 2931bcb0d2)
2013-08-05 15:41:13 -07:00
Francisco Jerez
925e8a200b clover: Respect kernel argument alignment restrictions.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit df530829f7)
2013-08-05 15:41:13 -07:00
Francisco Jerez
014b9ceb62 clover: Extend kernel arguments for differing host and device data types.
Loosely based on a similar patch by Tom Stellard.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit f64c0ca692)
2013-08-05 15:41:13 -07:00
Francisco Jerez
8f80e55002 clover: Byte-swap kernel arguments when host and device endianness differ.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit 829caf410e)
2013-08-05 15:41:13 -07:00
Francisco Jerez
579eae3012 clover: Add kernel argument fields to allow differing host/target data types.
Loosely based on a similar patch by Tom Stellard.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit 2265b40e37)
2013-08-05 15:41:13 -07:00
Francisco Jerez
cb06c9b2aa clover: Pass corresponding module::argument to kernel::argument::bind().
And remove size information from most kernel::argument derived
classes; it's no longer going to be necessary.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit a3dcab43c6)
2013-08-05 15:41:13 -07:00
Tom Stellard
253a4c3e73 clover: Return correct value for CL_DEVICE_ENDIAN_LITTLE
Query the driver using PIPE_CAP_ENDIANNESS rather than always returning
true.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 8c9d3c62f6)
2013-08-05 15:41:13 -07:00
Tom Stellard
99666d81e4 gallium: Add PIPE_CAP_ENDIANNESS
Cc: mesa-stable@lists.freedesktop.org
[ Francisco Jerez: Fix "PIPE_ENDIAN_SMALL" in the documentation,
  define PIPE_ENDIAN_NATIVE. ]
(cherry picked from commit 4e90bc9a12)
2013-08-05 15:41:12 -07:00
Maarten Lankhorst
e8bc520713 nvc0: force use of correct firmware file
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
(cherry picked from commit e847b5ae06)
2013-08-03 12:54:41 +02:00
Ilia Mirkin
49f40ebefa nv50: fix some h264 interlaced decoding on vp2
Some videos specify mb_adaptive_frame_field_flag instead of
field_pic_flag. This implies that the pic height needs to be halved, and
this field needs to be passed to the VP engine.

Cc: "9.2" mesa-stable@lists.freedesktop.org

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 8edb79f1ef)
2013-08-03 12:54:20 +02:00
Christoph Bumiller
9b8ad64362 nv50,nvc0: s/uint16/uint32 for constant buffer offset
Looks like a thinko, "Hey, constant buffers can be at most 64 KiB
in size, offset can't be larger." But it can, of course.
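
A small sketch of the truncation, using hypothetical structs rather than the driver's own: the size is bounded by 64 KiB, but the offset into the underlying buffer is not, so a 16-bit field silently drops the upper bits.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_MAX_CB_SIZE 0x10000u  /* 64 KiB size limit mentioned above */

/* Hypothetical binding records: one with the too-narrow offset field,
 * one with the widened field. */
struct example_cb_narrow { uint16_t offset; uint32_t size; };
struct example_cb_wide   { uint32_t offset; uint32_t size; };

static struct example_cb_narrow example_bind_narrow(uint32_t offset,
                                                    uint32_t size)
{
    struct example_cb_narrow cb;
    assert(size <= EXAMPLE_MAX_CB_SIZE);
    cb.offset = (uint16_t)offset;   /* drops bits 16 and above */
    cb.size = size;
    return cb;
}

static struct example_cb_wide example_bind_wide(uint32_t offset,
                                                uint32_t size)
{
    struct example_cb_wide cb;
    assert(size <= EXAMPLE_MAX_CB_SIZE);
    cb.offset = offset;             /* full offset preserved */
    cb.size = size;
    return cb;
}
```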

I think piglit lacks a test for UBO and BindBufferRange that
tests if it actually works.
2013-07-25 15:55:21 +02:00
Jeremy Huddleston Sequoia
ee421aec32 Apple: glFlush() is not needed with CGLFlushDrawable()
<rdar://problem/14496373>

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
(cherry picked from commit fa5ed99d8e)
2013-07-20 11:48:44 -07:00
Tomasz Lis
9f07ca11c1 mesa: Dispatch ARB_framebuffer_object and EXT_framebuffer_object differently
Almost all of the functions between the ARB and the EXT share the same
GLX protocol because the functionality is, essentially, identical.
However, there are some differences between the extensions:

- In the ARB extension, names must come from glGenFramebuffers or
  glGenRenderbuffers.

- In the ARB extension, framebuffer objects are not shared (but they are
  in the EXT).

For these reasons, glBindFramebuffer and glBindRenderbuffer have
different GLX protocol opcodes than their EXT counterparts.  Currently
these functions alias each other in the dispatch table.  This makes it
impossible to be truly spec conformant.

This patch enables fixing the conformance issue by splitting
glBindFramebuffer / glBindFramebufferEXT and glBindRenderbuffer /
glBindRenderbufferEXT into separate dispatch table entries.

Patches will be available shortly to:

- Fix the conformance issue.

- Stop advertising the EXT in OpenGL 3.1 (or core profiles).

HOWEVER, this does represent a compatibility break between the loader
(libGL or the Xserver GLX module) and the driver.  Mesa drivers compiled
without this change will request a single dispatch table entry for
glBindFramebuffer and glBindFramebufferEXT.  Since the updated loader
has different entries for each, the request will fail, and the driver
will die in a fire.

Drivers built with the change should continue to load fine on loaders
without the change.  In this case, the driver will separately ask for
entries for glBindFramebuffer and glBindFramebufferEXT, and the loader
will tell it the same location.  Since the loader in the server's GLX
module is not (yet) updated, this should not be a problem.  We also do
not advertise the ARB extension from the server, so, again, this should
not be a problem for the server.

HOWEVER, this means that DRI1 drivers (remember mga_dri.so?) will no
longer load with libGL built hereafter.  That means this patch will need
to be back ported to the 8.0 branch.

v2 (idr): Added missing GLX protocol opcodes for the EXT functions and
corrected the opcodes for the ARB functions.  Updated GLX indirect_api
unit test and dispatch sanity unit test.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Bartosz Zawistowski <bartosz.l.zawistowski@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
2013-07-18 17:42:46 -07:00
Kenneth Graunke
adfd0123c8 st/mesa: Enable the ARB_shading_language_420pack extension for 1.30+.
Any driver that supports GLSL 1.30 should be able to handle this
extension, as it's entirely implemented in the GLSL compiler.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
46d9baf3e3 i965: Enable the GL_ARB_shading_language_420pack extension on Gen6+.
While all the work is in the shared GLSL compiler, this extension
requires GLSL 1.30, which is currently only supported on Gen6+.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
bfcec4618a glsl: Handle the binding qualifier for UBO variables.
layout(binding = N) is equivalent to calling glUniformBlockBinding(_,N).

This currently only handles the GLSL 1.40 case - no interface names, no
arrays of uniform blocks.  This is okay since we don't yet support GLSL
1.50, and don't expose ARB_shading_language_420pack in ES 3.0.

v2: Move into the other function; use binding, not constant_value.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
f25d94084c glsl: Propagate UBO binding qualifier into UBO member variables.
Without an instance name, there is no ir_variable representing the
actual uniform block declaration.  When the linker goes to set uniform
initializers, it only sees the members as ir_variables; never the block.

So, unfortunately, the members need to know about the binding.

There has to be a better way to do this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
34e2ccc9f0 glsl: Handle the binding qualifier for arrays of samplers.
Normally, uniform array variables are initialized by array literals.
That is, val->type->array_elements >= storage->array_elements.

However, samplers are different.  Consider a declaration such as:

   layout(binding = 5) uniform sampler2D[3];

The initializer value is a single integer (5), while the storage has 3
array elements.  The proper behavior here is to increment one for each
element; they should be initialized to 5, 6, and 7.
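
As a rough illustration of that rule (a sketch only; assign_sampler_units
and its parameters are hypothetical names, not Mesa's linker code), each
array element takes the next texture unit after the one named by the
binding:

```c
#include <stddef.h>

/* Hypothetical helper, not Mesa's actual code: give each element of a
 * sampler array a consecutive texture unit, starting at `binding`.
 * A single (non-array) sampler is the array_elements == 1 case. */
void assign_sampler_units(int *units, size_t array_elements, int binding)
{
   for (size_t i = 0; i < array_elements; i++)
      units[i] = binding + (int) i;
}
```

With array_elements = 3 and binding = 5 this fills in units 5, 6, and 7.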

This patch introduces new code for sampler types which handles both
arrays of samplers and single samplers correctly.

v2: Move into the other function; use binding, not constant_value.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
67038c6ba2 glsl: Add plumbing for handling uniform binding qualifiers.
Sampler uniforms and uniform blocks do not have a var->constant_value.
Instead, they have an integer var->binding value.

This makes extending set_uniform_initializer() somewhat problematic: it
assumes that there is an ir_constant * which represents the initializer,
and that it's safe to dereference that without any NULL checks.

Instead, this patch creates an analogous function for binding
qualifiers, and calls one or the other as appropriate.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
0a23ec2b6e glsl: Delete unused code for handling samplers in array-initializers.
There is existing code to handle sampler uniform initializers.  Prior to
GLSL 4.20's "binding" keyword, sampler uniforms don't have initializers
at all, so this is somewhat surprising.

The existing code is broken into two cases: one where both the variable and
initializer are arrays, and a second where the variable and initializer are
scalars.

The first case should never occur, since array-typed initializers do not
exist for sampler uniforms.  Even with the binding keyword, the
initializer is a single integer which represents the texture unit to use
for the first array element.

The second is apparently used for some fixed-function code.

v2: Rewrite the commit message - suggested by Paul.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
9a9a830b44 glsl: Cross-validate explicit binding points.
All compilation units need to agree on the binding point, if they
specify one at all.

v2: Use binding, not constant_value.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
d4375fc016 glsl: Propagate explicit binding information from AST to IR.
Rather than creating a new "binding" field in ir_variable, we reuse
constant_value since the linker code for handling uniform initializers
uses that.

Since UBOs and samplers can't otherwise have initializers/constant
values, there shouldn't be a conflict.

v2: Propagate the new binding variable around too.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
4da1504c0f glsl: Add ir_variable fields for explicit bindings.
These are not used yet, but they exist and are copied appropriately.

v2: Add an explicit "int binding" variable rather than reusing
    constant_value, as suggested by Paul Berry.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:24 -07:00
Kenneth Graunke
5e5e12040b glsl: Add validation for the "binding" qualifier.
The "binding" qualifier only applies to UBO blocks and samplers, along
with arrays of those types.  (It would also apply to images and atomic
counters, but we don't support those yet.)

This also validates sampler bindings against the maximum number of
texture units, and UBO bindings against the number of uniform buffer
binding points.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
0418846a07 glsl: Parse the "binding" keyword and store it in ast_type_qualifier.
Nothing actually uses this yet.

v2: Remove >= 0 checks.  They'll be handled in later validation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
7f6a2d6937 glsl: Have the lexer return LAYOUT_TOK if 420pack is enabled.
GL_ARB_shading_language_420pack also provides layout qualifiers.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
56bcde34b2 glsl: Use has_layout() rather than a partial open coded version.
The idea of this code is to disallow layout(...) sections with the
deprecated "varying" or "attribute" keywords, unless a few select
extensions are enabled which allow a more relaxed check.

In order to detect a layout(...) section, the code checks for a number
of layout qualifiers.  However, it failed to check for all of them,
which could lead to layout(...) not being detected when it should.

By replacing this with has_layout(), we properly check for all layout
qualifiers, and also guarantee that new qualifiers added in the future
will not be forgotten.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
c397ec94e9 glsl: Relax auxiliary storage ordering requirements with 420pack.
These were already semi-relaxed, since the storage qualifier rule
already skipped when 420pack was enabled.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
b5d6c51e2b glsl: Handle centroid qualifier ordering in C code, not the parser.
The GL_ARB_shading_language_420pack extension/GLSL 4.20 split centroid
off into a new category, "auxiliary storage qualifiers," and allow these
to be placed anywhere in the series.  So we have to stop recognizing
"centroid in"/"centroid out"/"centroid varying" in the grammar and get
more creative.

The same approach used before works here, too.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
844307a584 glsl: Allow precision qualifiers to be flexibly ordered with 420pack.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
6eec502e84 glsl: Move precision handling to be part of qualifier handling.
This is necessary for the parser to be able to accept precision
qualifiers not immediately adjacent to the type, such as "const highp
inout float foo".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
308d4c7146 glsl: Change is_precision_statement to default_precision != none.
Currently, we store precision in ast_type_specifier, rather than
ast_type_qualifier.  This works because precision is the last qualifier,
and immediately adjacent to the type.

Default precision statements (such as "precision highp float") are
represented as ast_type_specifier objects, with a boolean to indicate
that it's a default precision statement rather than an ordinary type.

ast_type_specifier::precision will be moving to ast_type_qualifier soon,
in order to support arbitrary qualifier ordering.  However, we still
need to store a "this is a precision statement" flag /and/ the default
precision in ast_type_specifier.

This patch changes the boolean into a new field, default_precision.
If default_precision != ast_precision_none, it's a precision statement
with the specified precision.  Otherwise, it's an ordinary type.
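
The idea can be sketched as follows.  This is a simplified stand-in, not
the real ast_type_specifier; the enum and struct names here are made up
for illustration:

```c
#include <stdbool.h>

/* Illustrative names only; the real definitions live in Mesa's AST code. */
enum precision {
   PRECISION_NONE = 0,
   PRECISION_HIGH,
   PRECISION_MEDIUM,
   PRECISION_LOW
};

struct type_specifier {
   const char *type_name;
   /* PRECISION_NONE: ordinary type.  Anything else: this node is a
    * default precision statement with that precision. */
   enum precision default_precision;
};

/* Replaces a separate boolean flag: the precision field alone says
 * whether this is a "precision highp float;" style statement. */
bool is_precision_statement(const struct type_specifier *spec)
{
   return spec->default_precision != PRECISION_NONE;
}
```

One field carries both the flag and the value, so there is no way for
the two to disagree.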

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:23 -07:00
Kenneth Graunke
7855482138 glsl: Disable ordering checks for const parameters with 420pack.
This makes the compiler accept both "const in" and "in const".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
293dfe5738 glsl: Handle "const" as a parameter qualifier.
This will make it easy to support both "const in" and "in const", as
required by GLSL 4.20/ARB_shading_language_420pack.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
a4d15a3cd9 glsl: Refactor parameter qualifier handling.
"Parameter direction qualifier" is a new term I invented just now; it's
not part of any GLSL specification.

This paves the way for handling multiple parameter qualifiers, in any order,
as required by GLSL 4.20/ARB_shading_language_420pack.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
83fe4f7019 glsl: Use merge_qualifier() when processing qualifier lists.
Most of ast_type_qualifier is simply a bitfield (represented as a
structure of unsigned:1 bits in a union with an unsigned).  However, it
also contains ARB_explicit_attrib_location's location/index fields.

In the past, this has worked by simply returning the layout qualifier's
ast_type_qualifier and merging the other bits into it.  However, that's
not obvious until you break it by switching $1 and $2.

Using merge_qualifier() copies them appropriately, and also properly
overrides layout qualifiers.  It also checks for duplicate qualifiers,
which renders some of the checks in the previous patch unnecessary.
However, those checks provide better error messages, such as "Duplicate
interpolation qualifier", rather than just "duplicate qualifier".
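
A much-reduced sketch of why an explicit merge is order-independent.
The struct, flag names, and merge_qualifier_sketch() below are
hypothetical, and the exact override/duplicate rules are an assumption
rather than a copy of Mesa's merge_qualifier():

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for the real bitfield. */
#define Q_CONST             0x1u
#define Q_SMOOTH            0x2u
#define Q_EXPLICIT_LOCATION 0x4u
#define Q_LAYOUT_MASK       Q_EXPLICIT_LOCATION

struct qualifier {
   unsigned flags;   /* one bit per qualifier keyword */
   int location;     /* only meaningful if Q_EXPLICIT_LOCATION is set */
};

/* Merge `src` into `dst`.  Non-layout bits may appear only once;
 * layout data from `src` overrides whatever `dst` already had.
 * Returns false on a duplicate qualifier. */
bool merge_qualifier_sketch(struct qualifier *dst, const struct qualifier *src)
{
   unsigned duplicates = dst->flags & src->flags & ~Q_LAYOUT_MASK;
   if (duplicates != 0)
      return false;

   dst->flags |= src->flags;
   if (src->flags & Q_EXPLICIT_LOCATION)
      dst->location = src->location;
   return true;
}
```

Because the location is copied explicitly, merging a layout qualifier
into a plain one gives the same result as merging the other way round,
which is the property that silently broke when $1 and $2 were swapped.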

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
0cb90fcfbd glsl: Allow duplicate layout qualifiers with 420pack.
The new 4.20 rules explicitly allow multiple layout(...) sections.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
89f75e7e7b glsl: Disable ordering checks on most qualifiers for 420pack.
This makes the compiler accept invariant, storage, layout, and
interpolation qualifiers in any order when ARB_shading_language_420pack
is enabled.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
48e3bd33dc glsl: Handle most qualifier ordering in C code rather than the grammar.
The GL_ARB_shading_language_420pack extension/GLSL 4.20 allow qualifiers
to be specified in (basically) any order.  In order to support this, we
can't hardcode the ordering restrictions in the grammar.

This patch alters the grammar to accept invariant, storage, layout, and
interpolation qualifiers in any order, but adds C code to enforce the
ordering requirements.  In the 420pack case, we should be able to simply
skip the error checks.

As a bonus, this also lets us generate decent error messages, rather
than Bison's awful "unexpected TOKEN" errors.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
1b719df14d glsl: Add a new ast_type_qualifier::has_auxiliary_storage() method.
"Auxiliary storage qualifiers" is the new term given to "centroid",
"patch", and "sample" by GLSL 4.20/GL_ARB_shading_language_420pack.

Even though we only support "centroid", it's useful to add this now
so that all auxiliary storage qualifiers get handled in the right places
once they're eventually supported.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
eb30af51d6 glsl: Add a new ast_type_qualifier::has_storage() method.
This makes it easy to check if any storage qualifiers are set.

"centroid" is not considered a storage qualifier.  In the old language
rules, you can't specify "centroid" by itself; it's always "centroid
in", "centroid out", or "centroid varying."  So one of the other storage
qualifiers will always be set; there's no need to specifically check for
centroid.

In the new 4.20 rules, centroid is an auxiliary storage qualifier, not a
storage qualifier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:22 -07:00
Kenneth Graunke
7cef2b22b8 glsl: Add a new ast_type_qualifier::has_layout() method.
This makes it easy to check if any layout qualifiers are set.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-18 16:57:21 -07:00
Kenneth Graunke
7ce5c6b214 i965: Combine URB code emission into a single group.
All four URB packets need to be programmed together in order for the GPU
state to be valid.  Putting them in separate BEGIN..ADVANCE blocks is
risky: if we're nearing the end of a batch, the batch could be flushed
in between two of the commands, causing the URB programming to be split
into two batchbuffers.

This -might- be okay with hardware contexts, but it offers no advantages
over keeping them together, and has a potential for hangs.

Putting them into a single BEGIN..ADVANCE block ensures they'll be kept
in the same batch, which seems wise.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-18 16:57:21 -07:00
Chad Versace
30f33deccb i965/hsw: Change L3 MOCS for depth, hiz, and stencil
Change from "not cacheable" to "cacheable" in L3.
Do so for the draw upload path and blorp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 16:18:22 -07:00
Chad Versace
2273b652bb i965/hsw: Change L3 MOCS of 3DSTATE_CONSTANT_VS/PS
Change from "not cacheable" to "cacheable" in L3.
Do so for the draw upload path and blorp.

In blorp, change only the PS packet, because the VS packet is disabled.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 16:18:22 -07:00
Chad Versace
2f346395f5 i965/hsw: Change L3 MOCS of SURFACE_STATE
Change from "not cacheable" to "cacheable" in L3.
Do so for the draw upload path and blorp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 16:18:21 -07:00
Chad Versace
a16d47465e i965/hsw: Change L3 MOCS of 3DSTATE_VERTEX_BUFFERS
Change from "not cacheable" to "cacheable" in L3.
Do so for the draw upload path and blorp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 16:18:21 -07:00
Tomasz Lis
eb83079b35 glx: Enable floating-point fbconfig extensions
Signed-off-by: Tomasz Lis <listom@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Ian Romanick
74cbe6e497 egl: Drop configs with unknown or invalid __DRI_ATTRIB_RENDER_TYPE
Some render types, such as floating-point, aren't valid with EGL.
Return NULL in those cases to drop them.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
c37c367d38 dri: Introduce new flags in __DRI_ATTRIB_RENDER_TYPE
Mark __DRI_ATTRIB_FLOAT_MODE as deprecated, and introduce new flags to
__DRI_ATTRIB_RENDER_TYPE for float modes.  Both signed float
(fbconfig_float) and unsigned (packed_float) are introduced. The old
attribute should be set for both float modes.

v2 (idr): Require that the render mode from the DRI attributes matches the
render mode of the config exactly.  This is the behavior of the old code.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
4473af7aca glx: Require proper drawableType in init_fbconfig_for_chooser
Make sure that init_fbconfig_for_chooser sets correct value of
drawableType for visual configs and fbconfigs.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
2eed9ff2fb glx: Validate the GLX_RENDER_TYPE value
Correctly handle the value of renderType in GLX context.  In case of the
value being incorrect, context creation fails.

v2 (idr): indirect_create_context is just a memory allocator, so don't
validate the GLX_RENDER_TYPE there.  Fixes regressions in several
GLX_ARB_create_context piglit tests.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
27c8aa5cfb glx: Store the RENDER_TYPE in indirect rendering
v2 (idr): Open-code the check for GLX_RENDER_TYPE.
dri2_convert_glx_attribs can't be called from here because that function
only exists in direct-rendering builds.  Also add a stub version of
indirect_create_context_attribs to tests/fake_glx_screen.cpp to prevent
'make check' regressions.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
1c748dff6b glx: Handling RENDER_TYPE in glXCreateContext and init_fbconfig_for_chooser
Set the correct values of renderType in glXCreateContext and
init_fbconfig_for_chooser.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
b8126c7c8a glx: Changes to visual configs initialization.
Correctly handle the value of renderType and drawableType in
fbconfig. Modify glXInitializeVisualConfigFromTags to read the parameter
value, or detect it if it's not there.

v2 (idr): If there was no GLX_RENDER_TYPE property, set the type based
purely on the rgbMode as the previous code did.  It is impossible for
floatMode to be set at this point, so we can't have a float config.  The
previous code regressed a large number of piglit GLX tests because those
tests don't set GLX_RENDER_TYPE in the glXChooseConfig call.  Restoring
the old behavior for that case fixes those regressions.

Also fix handling of GLX_DONT_CARE for GLX_RENDER_TYPE.  Fixes a
regression in glx-dont-care-mask.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
a92cd5b245 glx: Retrieve the value of RENDER_TYPE from GLX attribs array
Make sure that context creation routines are provided with the value of
RENDER_TYPE retrieved from GLX attribs.

v2 (idr): Minor formatting changes.  Change type of
dri2_convert_glx_attribs render_type parameter to uint32_t to silence
some GCC warnings.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Tomasz Lis
36259a16fe glx: Store the value of renderType while creating context
Make sure that renderType property value is stored in GLX context while
it's being created.  Further patches will be provided to make the value
correspond to fbconfig's renderType.

v2 (idr): Move a hunk from the next patch to this patch to prevent a
build break.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-18 16:03:42 -07:00
Kenneth Graunke
7791c9869b i965: Add #defines for Memory Object Control State fields on Gen7-7.5.
The L3 controls are identical on all platforms, but LLC differs:
- Ivybridge has a "cache in LLC" flag
- Baytrail has no LLC, but instead has a snoop bit:
  "data accesses in this page must be snooped in the CPU caches."
- Haswell has writeback/uncached flags for LLC and eLLC (eDRAM).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 16:03:19 -07:00
Fabian Bieler
6368478712 glsl/linker: Use correct array length when linking inter-stage uniforms and varyings.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Fabian Bieler <fabianbieler@fastmail.fm>
2013-07-18 14:12:44 -07:00
Mike Frysinger
73c9b4b0e0 gen_matypes: fix cross-compiling with gcc
The current gen_matypes logic assumes that the host compiler will produce
information that is useful for the target compiler.  Unfortunately, this
is not the case whenever cross-compiling.

When we detect that we're cross-compiling and using GCC, use the target
compiler to produce assembly from the gen_matypes.c source, then process
it with a shell script to create a usable header.  This is similar to how
the linux kernel creates its asm-offsets.c file.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2013-07-18 13:55:48 -07:00
Andreas Oberritter
a48be954ce ax_prog_flex.m4: change grep syntax to accept e.g. flex.real
This is required in case a wrapper or symlink is used. This patch
has also been sent upstream, awaiting moderation.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andreas Oberritter <obi@saftware.de>
2013-07-18 13:54:59 -07:00
Jonathan Liu
2da0bd0526 builtin_compiler/build: Avoid using libtool if cross compiling
Adds the dependencies of builtin_compiler as sources when cross
compiling instead of using libtool to share compilation with src/glsl.
The builtin_compiler executable is built for the host when cross
compiling so it doesn't make sense to share compilation with src/glsl
built for the target in this case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44618
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jonathan Liu <net147@gmail.com>
2013-07-18 13:54:20 -07:00
Kenneth Graunke
2b5b436615 i965: Add MOCS shift and mask for SURFACE_STATE entries.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-18 10:45:49 -07:00
Roland Scheidegger
4ef19f7fec llvmpipe: clamp inputs for srgb render buffers
Usually with fixed point renderbuffers clamping is done as part of conversion.
However, since we blend in float format, we essentially skip all conversion
steps pre-blend but since this is still a fixed point renderbuffer we must
still clamp the inputs in this case. Makes no difference for piglit though.
Obviously we could skip this if fragment color clamping is enabled, but a)
this is deprecated in OpenGL (d3d never had it) and b) we don't support it
natively so it gets baked into the shader.
Also add some comment about logic ops being broken for srgb, luckily no test
tries to do that as there's no easy fix...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-07-18 19:04:20 +02:00
Roland Scheidegger
e57b98bad3 llvmpipe: fix blending with SRC_ALPHA_SATURATE with some formats without alpha
We were fixing up the blend factor to ZERO, however this only works correctly
with fixed point render buffers where the input values are clamped to 0/1
(because src_alpha_saturate is min(As, 1-Ad) so can be negative with unclamped
inputs). Haven't seen any failure anywhere due to that with fixed point SNORM
buffers (which clamp inputs to -1/1) but it should apply there as well (snorm
blending is rare, even opengl 4.3 doesn't require snorm rendertargets at all,
d3d10 requires them but they are not blendable).
Doesn't look like piglit hits this though (some internal testing hits the
float case at least). (With legacy OpenGL we could theoretically still use the
fixup to zero if the fragment color clamp is enabled, but we can't detect that
easily since we don't support native clamping hence it gets baked into the
shader.)
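
A small worked example of the arithmetic, assuming (as is usual for
formats without an alpha channel) that destination alpha reads back as 1.
The function name is made up for illustration and is not llvmpipe code:

```c
/* GL_SRC_ALPHA_SATURATE factor for the color channels: min(As, 1 - Ad). */
float src_alpha_saturate_factor(float src_alpha, float dst_alpha)
{
   float inv_dst = 1.0f - dst_alpha;
   return src_alpha < inv_dst ? src_alpha : inv_dst;
}
```

With Ad = 1 the factor is min(As, 0).  For As clamped to [0, 1] that is
always 0, so replacing the factor with ZERO is safe.  For an unclamped
As such as -0.5 the factor is -0.5, so the ZERO fixup gives the wrong
result.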

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-07-18 19:03:35 +02:00
Marek Olšák
0d7f087483 r600g: use WAIT_3D_IDLE before using CP DMA
I broke this with 7948ed1250 for r700 at least.
2013-07-18 14:27:34 +02:00
Jonathan Gray
0b405f364f r300g: make use of gallium's os_get_process_name()
Lets the code compile on non-Linux systems.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2013-07-18 14:04:48 +02:00
Jean-Sébastien Pédron
148f0deb06 configure.ac: On some systems, "x86-64" is called "amd64"
For instance, this is the case on FreeBSD.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 23:10:23 -07:00
Ilia Mirkin
fbdae1ca41 nv50: H.264/MPEG2 decoding support via VP2, available on NV84-NV96, NVA0
Adds H.264 and MPEG2 codec support via VP2, using firmware from the
blob. Acceleration is supported at the bitstream level for H.264 and
IDCT level for MPEG2.

Known issues:
 - H.264 interlaced doesn't render properly
 - H.264 shows very occasional artifacts on a small fraction of videos
 - MPEG2 + VDPAU shows frequent but small artifacts, which aren't there
   when using XvMC on the same videos

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-07-18 07:52:32 +02:00
Jonathan Gray
f96c07abf6 configure.ac: make grep tests more portable
Use grep -w instead of the empty string escape sequences
which are less portable.  Makes the grep tests
function as intended on OpenBSD.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 22:50:19 -07:00
Jonathan Gray
78fbb41fe3 configure.ac: add OpenBSD
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 21:06:46 -07:00
Vinson Lee
21f97446f4 glsl: Remove comma at end of enumerator list.
Fixes this build error on OpenBSD 5.3.

In file included from ../../src/mesa/main/ff_fragment_shader.cpp:53:
./../glsl/ir_optimization.h:64: error: comma at end of enumerator list

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 20:57:54 -07:00
Vinson Lee
77311dab3a mesa: Remove commas at end of enumerator lists.
Fixes these build errors on OpenBSD 5.3.

In file included from ../../src/mesa/main/errors.h:47,
                 from ../../src/mesa/main/imports.h:41,
                 from ../../src/mesa/main/ff_fragment_shader.cpp:32:
../../src/mesa/main/mtypes.h:3286: error: comma at end of enumerator list
../../src/mesa/main/mtypes.h:3296: error: comma at end of enumerator list
../../src/mesa/main/mtypes.h:3303: error: comma at end of enumerator list
../../src/mesa/main/mtypes.h:3356: error: comma at end of enumerator list

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 20:57:53 -07:00
Carl Worth
ceaf1a74cb docs: Import 9.1.5 release notes
And add news item for the release.
2013-07-17 20:11:02 -07:00
Roland Scheidegger
7fd30a8621 gallivm: (trivial) simplify lp_build_cos/lp_build_sin a tiny bit
Use "or" instead of "add" (this is a classic select sequence, which at
least newer llvm versions can actually recognize (3.2+?), and the "add"
might prevent that - and we really don't want an add instead of an or with
avx if it isn't recognized (even without avx logic ops might be cheaper)).
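
For reference, the select sequence in question looks like this when
written as scalar C instead of LLVM IR (function names are illustrative
only).  The two masked terms never share a set bit, so "or" and "add"
produce the same value; "or" is simply the form a pattern matcher is
more likely to recognize as a select:

```c
#include <stdint.h>

/* Pick bits of `a` where mask is 1 and bits of `b` where mask is 0. */
uint32_t select_or(uint32_t mask, uint32_t a, uint32_t b)
{
   return (a & mask) | (b & ~mask);
}

/* Same result, because (a & mask) and (b & ~mask) have no bits in
 * common, so the addition never carries. */
uint32_t select_add(uint32_t mask, uint32_t a, uint32_t b)
{
   return (a & mask) + (b & ~mask);
}
```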

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-17 18:16:34 +02:00
Roland Scheidegger
f0f9fb59c3 util/u_format_s3tc: handle srgb formats correctly.
Instead of just ignoring the srgb/linear conversions, simply call the
corresponding conversion functions, for all of pack/unpack/fetch,
both for float and unorm8 versions (though some don't make a whole
lot of sense, i.e. unorm8/unorm8 srgb/linear combinations).
Refactored some functions a bit so we don't have to duplicate all the code
(there's a slight change for packing dxt1_rgb, as there will now be
always 4 components initialized and sent to the external compression
function so the same code can be used for all, the quite horrid and
ad-hoc interface (by now) should always have worked with that).

Fixes llvmpipe/softpipe piglit texwrap GL_EXT_texture_sRGB-s3tc.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-17 18:16:27 +02:00
Vadim Girlin
07baf9cfd1 r600g/sb: improve alu packing on cayman
Scheduler/register allocator in r600-sb was developed and optimized
on evergreen (VLIW-5) hardware, so currently it's not optimal for
VLIW-4 chips.
This patch should improve performance on cayman gpus due to better alu
packing, but also it tends to increase register usage, so overall positive
effect on performance has to be proven by real benchmarks yet.

Some results with bfgminer kernel on cayman:
source bytecode:       60 gprs, 3905 alu groups,
sbcl before the patch: 45 gprs, 4088 alu groups,
sbcl with this patch:  55 gprs, 3474 alu groups.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:29:56 +04:00
Vadim Girlin
ba7fa4c4c9 r600g/sb: fix handling of new multislot instructions on cayman
Ex-scalar instructions that became multislot on cayman do replicate result
to all channels - handle them similar to DOT4.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:27:31 +04:00
Vadim Girlin
033eec4145 r600g/sb: fix debug dump code in scheduler
Update the stale debug code for other changes related to debug output.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:27:31 +04:00
Vadim Girlin
44ebe7291c r600g/sb: fix initial register allocation
Mark values that are members of the 'same register' constraint as
preallocated in ra_init pass, this will prevent incorrect
reallocation in scheduler in some cases.

Should fix https://bugs.freedesktop.org/show_bug.cgi?id=66713

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:27:30 +04:00
Vadim Girlin
f0d881106a r600g/sb: move chip & class name functions to sb_context
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:27:30 +04:00
Vadim Girlin
96efa4cdf4 r600g/sb: fix handling of PS in source bytecode on cayman
Actually PS doesn't make sense for cayman and isn't even mentioned in
cayman docs, but llvm backend currently uses it in bytecode and, assuming
that hw seems to be mostly ok with it, this will allow sb to parse such
source bytecode correctly.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-17 18:27:30 +04:00
Vinson Lee
81d3881367 r600g/sb: Initialize ra_checker member variables.
Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-17 18:27:30 +04:00
Emil Velikov
b20e0fb520 gallium/util: use explicitly sized types for {un, }pack_rgba_{s, u}int
Every function but the above four uses explicitly sized types for their
src and dst arguments. Even fetch_rgba_{s,u}int follows the convention.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2013-07-17 13:01:46 +02:00
Kyle McMartin
87c3440567 llvmpipe: use MCJIT on ARM and AArch64
MCJIT is the only supported LLVM JIT on AArch64 and ARM (the regular
JIT has bit-rotted badly on ARM and doesn't exist on AArch64.)

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
2013-07-17 17:29:01 +10:00
Kenneth Graunke
00d32cd5b4 glsl: Fix absurd whitespace conventions in the parser.
Historically, we indented grammar production rules with a single 8-space
tab, but code inside of blocks used Mesa's 3-space indents.

This meant when editing code, you had to use an 8-space tab for the
first level of indentation, and 3-spaces after that.  Unless you
specifically configure your editor to understand this, it will get the
indentation wrong on every single line you touch, which quickly devolves
into a colossal waste of time.

It's also inconsistent with every other file in the entire project.

This patch removes all tabs and moves to a consistent 3-space indent.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-07-16 11:31:58 -07:00
Kenneth Graunke
4ab7fc9ec3 glsl: Fail the build if the grammar contains shift/reduce errors.
When working on a parser, it's very easy to accidentally introduce
new shift/reduce conflicts.  Failing the build guarantees they'll
be noticed and fixed.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-07-16 11:31:58 -07:00
Kenneth Graunke
73620709c9 glsl: Silence the last shift/reduce conflict warning in the grammar.
The single remaining shift/reduce conflict was the classic ELSE problem:

  292 selection_rest_statement: statement . ELSE statement
  293                         | statement .

    ELSE  shift, and go to state 479

    ELSE      [reduce using rule 293 (selection_rest_statement)]
    $default  reduce using rule 293 (selection_rest_statement)

The correct behavior here is to shift, which is what happens by default.
However, resolving it explicitly will make it possible to fail the build
on new errors, making them much easier to detect.

The classic way to solve this is to use right associativity:
http://www.gnu.org/software/bison/manual/html_node/Non-Operators.html

Since there is no THEN token in GLSL, we need to fake one.  %right THEN
creates a new terminal symbol; the %prec directive says to use the
precedence of that terminal.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-07-16 11:31:58 -07:00
Vinson Lee
fa7829c36b glsl: Initialize ast_jump_statement::opt_return_value.
opt_return_value was not initialized if mode != ast_return.

Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-16 09:03:02 -07:00
Vinson Lee
f74acb9835 glapi: Do not use backtrace on OpenBSD.
execinfo.h is not available on OpenBSD.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-16 09:00:38 -07:00
Maarten Lankhorst
b20b2b6dc8 osmesa: link against static libglapi library too to get the gl exports
This should fix missing symbols in an osmesa build against shared glapi.
All OpenGL exports that are defined in the static glapi were missing,
so link against both to fix this.

This is a candidate for the stable series.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47824
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-07-16 10:18:40 +02:00
Chris Forbes
121ea0b38b i965/Gen4: Zero extra coordinates for ir_tex
We always emit U,V,R coordinates for this message, but the sampler gets
very angry if we pass garbage in the R coordinate for at least some
texture formats.

Fill the remaining coordinates with zero instead.

Fixes broken rendering on GM45 in Source games, and in VDrift.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65236

NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-16 19:08:41 +12:00
Kenneth Graunke
e4fdf1b008 i965: Cite the Ivybridge PRM for 3DSTATE_CLEAR_PARAMS notes.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:53 -07:00
Kenneth Graunke
b72a298751 i965: Refer people to brw_tex_layout.c rather than the BSpec.
brw_tex_layout.c sets up the align_w/h fields, and has all the
appropriate spec references already.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:53 -07:00
Kenneth Graunke
4b704424e0 i965: Remove old BSpec reference from BLORP's 3DSTATE_WM/PS packets.
The Sandybridge code had a citation for the range of the "Maximum Number
of Threads" field, and the Ivybridge code just mentioned the "BSpec" in
general.  That's documented in the obvious place, so people can find it
without a spec reference.

The real value of the comment is to say "we tried zero, and it exploded,
so program it to a valid number even if pixel shading is off."

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
ada110716a i965: Cite the Ivybridge PRM for 3DSTATE_URB_* programming.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
90b5a03581 i965: Update workaround flush comments for Gen6 3DSTATE_VS.
Unfortunately, the workaround text never made it into the Sandybridge
PRM, so we still have to refer to the BSpec.

It also wasn't obvious why we needed this workaround at all, since we
don't currently do VS passthrough - but BLORP can turn off the VS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
3b3a440d2b i965: Cite the Ivybridge PRM for VS PIPE_CONTROL workarounds.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
9a86875c6b i965: Cite the Sandybridge PRM for Gen7 stencil pitch requirements.
Sadly, the Ivybridge PRM can't be cited, as it is missing the relevant
text for some reason.  However, the Sandybridge PRM has the text Chad
originally quoted, and the modern BSpec has the same text.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
2e928e2a3f i965: Cite the Ivybridge PRM for multisample surface format notes.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
43ea434225 i965: Delete "the data cache is the sampler cache" comments on Gen7+.
I cut and pasted these comments from the Gen4 code during Ivybridge
enabling, and didn't understand what they meant at the time.

The data cache is NOT the same as the sampler cache on Ivybridge.
The sampler cache has L1 and L2 caches in addition to the L3 cache,
while data port messages to the "data cache" hit L3 directly.

This means that the sampler domain is technically wrong, but we stopped
caring about read/write domains quite a while ago.  The kernel just
flushes all the caches at the end of each batchbuffer, and our render to
texture code flushes the sampler caches when necessary.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:52 -07:00
Kenneth Graunke
3f64cfabfc i965: Cite the 965 PRM for "the data cache is the sampler cache".
Presumably, this comment exists to justify the usage of
I915_GEM_DOMAIN_SAMPLER for this relocation.  At one point, this was
necessary to ensure that the right flushing was done to keep caches
coherent.  These days, the kernel just flushes everything, so I don't
think it matters.

Still, the comment is interesting, so leave it in place.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:51 -07:00
Kenneth Graunke
f254c94204 i965: Cite the Ivybridge PRM for DP message descriptor fields.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:51 -07:00
Kenneth Graunke
a0c8e76202 i965: Cite the Ivybridge PRM for why the fake MRF range is what it is.
The exact text is in the public docs, so we should cite those.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:51 -07:00
Kenneth Graunke
3090d39dde i965: Cite the Ivybridge PRM for SFID enum values.
The Ivybridge PRM adds new SFIDs and lists them in a different volume
than Sandybridge, so it's worth adding a reference.

I also removed the BSpec reference, as the section it referred to
was moved somewhere, and I couldn't find it.  This leaves one Haswell
SFID without a citation, but we can add one once the PRMs are out.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 19:40:51 -07:00
Roland Scheidegger
dc1cc928ed llvmpipe: support sRGB framebuffers
Just use the new conversion functions to do the work. The way it's plugged
into the blend code is quite hacktastic but follows all the same hacks
as used by packed float format already.
Only support 4x8bit srgb formats (rgba/rgbx plus swizzle), 24bit formats never
worked anyway in the blend code and are thus disabled, and I don't think anyone
is interested in L8/L8A8. Would need even more hacks otherwise.
Unless I'm missing something, this is the last feature except MSAA needed for
OpenGL 3.0, and for OpenGL 3.1 as well I believe.

v2: prettify a bit, use separate function for packing.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-16 01:54:51 +02:00
Marek Olšák
a882067d74 Revert "r300g: allow HiZ with a 16-bit zbuffer"
This reverts commit 631c631cbf.

https://bugs.freedesktop.org/show_bug.cgi?id=66921

Cc: mesa-stable@lists.freedesktop.org
2013-07-15 23:46:01 +02:00
Marek Olšák
7969b567bd r300g/swtcl: fix a lockup in MSAA resolve
Cc: mesa-stable@lists.freedesktop.org
2013-07-15 23:45:22 +02:00
Marek Olšák
22427640b2 r300g/swtcl: fix geometry corruption by uploading indices to a buffer
The splitting of a draw call into several draw commands was broken, because
the split sometimes took place in the middle of a primitive. The splitting
was supposed to be dealing with the case when there are more indices than
the maximum size of a CS.

This commit throws that code away and uses a real index buffer instead.

https://bugs.freedesktop.org/show_bug.cgi?id=66558

Cc: mesa-stable@lists.freedesktop.org
2013-07-15 23:45:16 +02:00
Matt Turner
c889df3fbe glsl: Reject C-style initializers with unknown types.
_mesa_ast_set_aggregate_type walks through declarations initialized with
C-style aggregate initializers and stops when it runs out of LHS
declarations or RHS expressions.

In the example

   vec4 v = {{{1, 2, 3, 4}}};

_mesa_ast_set_aggregate_type would not recurse into the subexpressions
(since vec4s do not contain types that can be initialized with an
aggregate initializer) to set their <constructor_type>s. Later in ::hir
we would dereference the NULL pointer and segfault.

If <constructor_type> is NULL in ::hir we know that the LHS and RHS
were unbalanced and the code is illegal.

Arrays, structs, and matrices were unaffected.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-07-15 13:02:36 -07:00
Paul Berry
7706e52b25 glsl: Rework builtin_variables.cpp to reduce code duplication.
Previously, we had a separate function for setting up the built-in
variables for each combination of shader stage and GLSL version
(e.g. generate_110_vs_variables to generate the built-in variables for
GLSL 1.10 vertex shaders).  The functions called each other in ad-hoc
ways, leading to unexpected inconsistencies (for example,
generate_120_fs_variables was called for GLSL versions 1.20 and above,
but generate_130_fs_variables was called only for GLSL version 1.30).
In addition, it led to a lot of code duplication, since many varyings
had to be duplicated in both the FS and VS code paths.  With the
advent of geometry shaders (and later, tessellation control and
tessellation evaluation shaders), this code duplication was going to
get a lot worse.

So this patch reworks things so that instead of having a separate
function for each shader type and GLSL version, we have a function for
constants, one for uniforms, one for varyings, and one for the special
variables that are specific to each shader type.

In addition, we use a class, builtin_variable_generator, to keep track
of the instruction exec_list, the GLSL parse state, commonly-used
types, and a few other variables, so that we don't have to pass them
around as function arguments.  This makes the code a lot more compact.

Where it was feasible to do so without introducing compilation errors,
I've also gone ahead and introduced the variables needed for
{ARB,EXT}_geometry_shader4 style geometry shaders.  This patch takes
care of everything except the GS variable gl_VerticesIn, the FS
variable gl_PrimitiveID, and GLSL 1.50 style geometry shader inputs
(using the gl_in interface block).  Those remaining features will be
added later.

I've also made a slight nomenclature change: previously we used the
word "deprecated" to refer to variables which are marked in GLSL 1.40
as requiring the ARB_compatibility extension, and are marked in GLSL
1.50 onward as requiring the compatibility profile.  This was
misleading, since not all deprecated variables require the
compatibility profile (for example gl_FragData and gl_FragColor, which
have been deprecated since GLSL 1.30, but do not require the
compatibility profile until GLSL 4.20).  We now consistently use the
word "compatibility" to refer to these variables.

This patch doesn't introduce any functional changes (since geometry
shaders haven't been enabled yet).

Reviewed-by: Matt Turner <mattst88@gmail.com>

v2: Rename "typ" -> "type".  Add blank line between inline functions
and declarations in builtin_variable_generator class.  Use the
standard comment "/* FALLTHROUGH */" for compatibility with static
code analysis tools.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 09:35:28 -07:00
Paul Berry
428e030210 glsl: Fix lower_named_interface_blocks to account for dereferences of consts.
In certain rare cases (such as those involving dereference of a
literal constant array of structs),
flatten_named_interface_blocks_declarations's rvalue visitor may be
invoked on an ir_dereference_record whose variable_referenced() method
returns NULL.

Check for this case to avoid a segfault.

Prevents crashes in piglit tests
{vs,fs}-deref-literal-array-of-structs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2013-07-15 07:59:52 -07:00
Paul Berry
b2265db8e7 glsl: Don't allow vertex shader input arrays until GLSL 1.50.
Vertex shader inputs are not allowed to be arrays until GLSL 1.50.  We
were accidentally enabling them for GLSL 1.40 (although we haven't
written any tests for them, so it's not clear whether they actually
work).

NOTE: although this is a simple bug fix, it probably isn't sensible to
cherry-pick it to stable release branches, since its only effect is to
cause incorrectly-written shaders to fail to compile.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-15 07:50:47 -07:00
Chris Forbes
b616d01661 i965: Gen4/5: use IEEE floating point mode for GLSL shaders.
Fixes isinf(), isnan() from GLSL 1.30

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-14 19:58:25 +12:00
Chris Forbes
1ec66f2fb2 i965/vs: Gen4/5: enable front colors if back colors are written
Fixes undefined results if a back color is written, but the
corresponding front color is not, and only backfacing primitives are
drawn. Results are still undefined if a frontfacing primitive is drawn,
but that's OK.

The other reasonable way to fix this would have been to just pick
the one color slot that was populated, but that dilutes the value of
the tests.

On Gen6+, the fixed function clipper and triangle setup already take
care of this.

Fixes 11 piglits:
spec/glsl-1.10/execution/interpolation/interpolation-none-gl_Back*Color-*

NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-14 19:58:11 +12:00
Roland Scheidegger
796b73d1fe gallivm: (trivial) use constant instead of exp2f() function
Some lame compilers can't do exp2f(), and as far as I can tell they can't do
exp2() (with doubles) either. Providing a workaround wouldn't actually be too
bad (just replace it with pow), but since it is used with a constant only,
just use the precalculated constant.
2013-07-14 02:39:33 +02:00
Chia-I Wu
62c546bbf8 ilo: skip 3DSTATE_INDEX_BUFFER when possible
When only the offset to the index buffer is changed, we can skip the
3DSTATE_INDEX_BUFFER if we always use 0 for the offset, and add
(offset / index_size) to Start Vertex Location in 3DPRIMITIVE.
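
The arithmetic described above can be sketched as follows. This is only an illustration: the function name is hypothetical and not part of ilo, and the requirement that the byte offset be a multiple of the index size is an assumption made for the sketch.

```c
#include <assert.h>

/* Hypothetical helper, not ilo code: fold a byte offset into the index
 * buffer into the start location, so the buffer itself can always be
 * programmed with offset 0.
 * Assumption: offset_bytes is a multiple of index_size. */
unsigned fold_index_offset(unsigned start_location,
                           unsigned offset_bytes,
                           unsigned index_size)
{
   assert(index_size != 0);
   assert(offset_bytes % index_size == 0);

   return start_location + offset_bytes / index_size;
}
```

With 16-bit indices (index_size of 2), a 12-byte offset moves the start location forward by 6 indices.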
2013-07-14 05:59:52 +08:00
Roland Scheidegger
6bcbb0dc82 gallivm: handle srgb-to-linear and linear-to-srgb conversions
srgb-to-linear is using 3rd degree polynomial for now which should be _just_
good enough. Reverse is using some rational polynomials and is quite accurate,
though not hooked into llvmpipe's blend code yet and hence unused (untested).
Using a table might also be an option (for srgb-to-linear especially).
This does not enable any new features yet because EXT_texture_srgb was already
supported via util_format fallbacks, but performance was lacking probably due
to the external function call (the table used by the util_format_srgb code may
not be all that much slower on its own).
Some performance figures (taken from modified gloss, replaced both base and
sphere texture to use GL_SRGB instead of GL_RGB, measured on 1GHz Sandy Bridge,
the numbers aren't terribly accurate):

normal gloss, aos, 8-wide: 47 fps
normal gloss, aos, 4-wide: 48 fps

normal gloss, forced to soa, 8-wide: 48 fps
normal gloss, forced to soa, 4-wide: 47 fps

patched gloss, old code, soa, 8-wide: 21 fps
patched gloss, old code, soa, 4-wide: 24 fps

patched gloss, new code, soa, 8-wide: 41 fps
patched gloss, new code, soa, 4-wide: 38 fps

So there's a performance hit but it seems acceptable, certainly better
than using the fallback.
Note the new code only works for 4x8bit srgb formats, others (L8/L8A8) will
continue to use the old util_format fallback, because I can't be bothered
to write code for formats no one uses anyway (as decoding is done as part of
lp_build_unpack_rgba_soa which can only handle block type width of 32).
Compressed srgb formats should get their own path though eventually (it is
going to be expensive in any case, first decompress, then convert).
No piglit regressions.

v2: use lp_build_polynomial instead of ad-hoc polynomial construction, also
since keeping both linear to srgb functions for now make sure both are
compiled (since they share quite some code just integrate into the same
function).

v3: formatting fixes and bugfix in the complicated (disabled) linear-to-srgb
path.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-13 18:42:17 +02:00
Roland Scheidegger
9b8d97e5bf gallivm: better support for fast rsqrt
We had to disable fast rsqrt before because it wasn't precise enough etc.
However in situations when we know we're not going to need more precision
we can still use a fast rsqrt (which can be several times faster than
the quite expensive sqrt). Hence introduce a new helper which does exactly
that. Calling it is probably not useful in some situations if no fast rsqrt
is available, so also make its availability queryable.

v2: use fast_rsqrt consistently instead of rsqrt_fast, fix indentation,
let rsqrt use fast_rsqrt.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-13 18:42:17 +02:00
Klemens Baum
45574ab2e9 configure.ac: better detection of LLVM version
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-12 21:20:59 -07:00
Vinson Lee
b0c3c955ae r600g/sb: Initialize ra_constraint::cost.
Fixes "Uninitialized scalar field" reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-07-13 06:57:26 +04:00
Vinson Lee
be8d787873 glsl: Initialize ast_aggregate_initializer::constructor_type.
Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-12 18:42:46 -07:00
Paul Berry
c6bfe62e21 glsl: Make gl_TexCoord compatibility-only
gl_TexCoord was deprecated in GLSL 1.30.  In GLSL 1.40 it was marked
as ARB_compatibility-only, and in GLSL 1.50 and above it was marked as
only appearing in the compatibility profile.  It has never appeared in
GLSL ES.

However, Mesa erroneously included it in all desktop versions of GLSL,
even versions 1.40 and 1.50 (which do not currently support the
compatibility profile).  This patch makes gl_TexCoord available in the
compatibility profile (and GLSL versions 1.30 and prior) only.

NOTE: although this is a simple bug fix, it probably isn't sensible to
cherry-pick it to stable release branches, since its only effect is to
cause incorrectly-written shaders to fail to compile.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-12 18:18:49 -07:00
Paul Berry
8f51d68f8c glsl ES: Fix magnitude of gl_MaxVertexUniformVectors.
Previously, we set it equal to MaxVertexUniformComponents.  It should
be MaxVertexUniformComponents / 4.
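
A minimal sketch of the corrected relationship, assuming each uniform vector is a vec4 holding four components. The function name is hypothetical and this is not the Mesa code itself:

```c
/* Hypothetical helper, not part of Mesa: a uniform "vector" is assumed to
 * be a vec4, so the vector limit is the component limit divided by 4. */
unsigned max_vertex_uniform_vectors(unsigned max_vertex_uniform_components)
{
   return max_vertex_uniform_components / 4;
}
```

For example, a limit of 1024 components corresponds to 256 vectors.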

NOTE: This is a candidate for the stable branches.

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-12 18:18:48 -07:00
Marek Olšák
06b38dbab2 winsys/radeon: allow a NULL cs pointer in radeon_bo_map to fix a segfault
The original idea was that cs=NULL should be allowed here, but we never used
NULL until 862f69fbe1. This fixes a segfault in CoreBreach.
2013-07-13 02:38:23 +02:00
Chia-I Wu
8d4ac98549 ilo: move a sanity check into its assert()
The compiler does not know that ilo_3d_pipeline_estimate_size() is pure and
can be eliminated in a release build in gen6_pipeline_end().  Move the call
into the assert().
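
The idiom can be sketched in C as below. The names and the returned value are stand-ins invented for illustration, not the ilo functions; the counter exists only to make the call observable.

```c
#include <assert.h>

/* Hypothetical stand-ins for the driver functions; the names and the
 * value 64 are illustrative only. */
int estimate_calls = 0;

int estimate_size(void)
{
   estimate_calls++;   /* lets a caller observe whether the call happened */
   return 64;
}

void pipeline_end(int used)
{
   /* The call lives inside assert(), so when NDEBUG is defined the whole
    * expression, including the call, is removed by the preprocessor. */
   assert(used <= estimate_size());
   (void) used;        /* avoid an unused-parameter warning under NDEBUG */
}
```

In a debug build each pipeline_end() call evaluates estimate_size() once; in a release build (NDEBUG defined) it is never called.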
2013-07-13 07:27:28 +08:00
Chia-I Wu
bf9670270f ilo: mark some states dirty when they are really changed
The checks may seem redundant because cso_context handles them, but
util_blitter does not have access to cso_context.
2013-07-13 06:43:53 +08:00
Chia-I Wu
9047598a8d ilo: clean up ilo_blitter_pipe_begin()
Document why certain states need to be saved, and fix a bug when blitting with
scissor enabled.
2013-07-13 06:43:53 +08:00
Alex Deucher
e0a7565832 r600g: don't use the CB/DB CP COHER logic on r6xx
There are hw bugs.  Flush and inv event is sufficient.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=66837

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-12 18:07:56 -04:00
Jonathan Liu
af16f73051 configure: Avoid use of AC_CHECK_FILE for cross compiling
The AC_CHECK_FILE macro can't be used for cross compiling as it will
result in "error: cannot check for file existence when cross compiling".
Replace it with the AS_IF macro.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jonathan Liu <net147@gmail.com>
2013-07-12 13:21:28 -07:00
Brian Paul
bf86e0e050 nv30: fix KILL_IF breakage
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66858
2013-07-12 10:00:18 -06:00
Zack Rusin
00cd455bd5 gallium: fixup definitions of the rsq and sqrt
GLSL spec says that rsq is undefined for src<=0, but the D3D10
spec says it needs to be a NaN, so let's stop taking an absolute
value of the source which completely breaks that behavior. For
the gl program we can simply insert an extra abs instruction
which produces the desired behavior there.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-11 20:19:04 -04:00
José Fonseca
a171812d27 util/u_format: Comment out half float denormal test case.
So that lp_test_format doesn't fail until we decide what should be done.
2013-07-12 15:48:38 +01:00
José Fonseca
1b0d29b5da gallivm: Eliminate redundant lp_build_select calls.
lp_build_cmp already returns 0 / ~0, so the lp_build_select call is
unnecessary.
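
A scalar C sketch of why the select is redundant, assuming the comparison yields an all-zeros or all-ones mask as stated above. The helper names are made up for illustration and are not the gallivm functions:

```c
#include <stdint.h>

/* Hypothetical scalar model of a compare that returns 0 or ~0. */
uint32_t sketch_cmp_less(float a, float b)
{
   return a < b ? UINT32_MAX : 0u;
}

/* Hypothetical bitwise select: picks bits of t where mask is set,
 * bits of f elsewhere. */
uint32_t sketch_select(uint32_t mask, uint32_t t, uint32_t f)
{
   return (mask & t) | (~mask & f);
}
```

Selecting between ~0 and 0 with such a mask returns the mask unchanged, so the compare result can be used directly.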

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-12 15:40:16 +01:00
Brian Paul
46205ab8cc tgsi: rename the TGSI fragment kill opcodes
TGSI_OPCODE_KIL and KILP had confusing names.  The former was conditional
kill (if any src component < 0).  The latter was unconditional kill.
At one time KILP was supposed to work with NV-style condition
codes/predicates but we never had that in TGSI.

This patch renames both opcodes:
  TGSI_OPCODE_KIL -> KILL_IF   (kill if src.xyzw < 0)
  TGSI_OPCODE_KILP -> KILL     (unconditional kill)
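
The semantics of the two renamed opcodes can be sketched in C as follows. These functions are hypothetical models of the behavior described above, not TGSI interpreter code; each returns true when the fragment would be discarded.

```c
#include <stdbool.h>

/* Hypothetical model of KILL_IF: discard the fragment if any of the
 * four source components is negative. */
bool sketch_kill_if(const float src[4])
{
   for (int i = 0; i < 4; i++) {
      if (src[i] < 0.0f)
         return true;
   }
   return false;
}

/* Hypothetical model of KILL: discard unconditionally. */
bool sketch_kill(void)
{
   return true;
}
```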

Note: I didn't just transpose the opcode names to help ensure that I
didn't miss updating any code anywhere.

I believe I've updated all the relevant code and comments but I'm
not 100% sure that some drivers had this right in the first place.
For example, the radeon driver might have llvm.AMDGPU.kill and
llvm.AMDGPU.kilp mixed up.  Driver authors should review their code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-12 08:32:51 -06:00
Brian Paul
f501baabdb tgsi: fix-up KILP comments
KILP is really unconditional fragment kill.

We've had KIL and KILP transposed forever.  I'll fix that next.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-12 08:32:51 -06:00
Brian Paul
e7c3898725 tgsi: exec TGSI_OPCODE_SQRT as a scalar instruction, not vector
To align with the docs and the state tracker.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-12 08:32:51 -06:00
Brian Paul
f3fad24b62 tgsi: use X component of the second operand in exec_scalar_binary()
The code happened to work in the past since the (scalar) src args
effectively always have a swizzle of .xxxx, .yyyy, .zzzz, or .wwww so
whether you grab the X or Y component doesn't really matter.  Just
fixing the code to make it look right.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-12 08:32:51 -06:00
Brian Paul
cb2de08f27 mesa: update glext.h to version 20130708
This update fixes the problem with duplicated typedefs for
GLclampf and GLclampd in the previous version.

It also changes some parameter types for glDebugMessageCallbackARB()
and glTransformFeedbackVaryingsEXT().

Note we should someday update the glapi-gen code so that it
understands void pointer parameters.  Currently, the Python code
only understands "GLvoid *" but not "void *".  Luckily, the
compilers don't seem to complain about mixing GLvoid and void.
2013-07-12 08:32:51 -06:00
Brian Paul
5749aea255 mesa: fix Address Sanitizer (ASan) issue in _mesa_add_parameter()
If the size argument isn't a multiple of four, we would have read/
copied uninitialized memory.

Fixes an issue reported by Myles C. Maxfield <myles.maxfield@gmail.com>
2013-07-12 08:32:51 -06:00
Brian Paul
9ca026e220 mesa: simplify some _mesa_IsEnabled() queries
No need to test array->Enabled != 0 since the Enabled field can
only be 0 or 1.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-12 08:32:50 -06:00
Brian Paul
9fc532a263 os: add os_get_process_name() function
v2: explicitly test for BSD/APPLE, #warning for unexpected
environments.
2013-07-12 08:32:50 -06:00
Brian Paul
3fb3e1e38c mesa: whitespace, formatting, 80-column wrapping 2013-07-12 08:32:22 -06:00
Brian Paul
919236f3a2 softpipe: silence some MSVC warnings 2013-07-12 08:19:52 -06:00
Brian Paul
76666b9394 hud: silence some MSVC warnings 2013-07-12 08:19:52 -06:00
Brian Paul
d7a852b3a1 util: add casts to silence MSVC warnings in u_blit.c 2013-07-12 08:19:51 -06:00
Brian Paul
c45d8f2e98 tgsi: s/unsigned/int/ to silence MSVC warning 2013-07-12 08:19:50 -06:00
Brian Paul
2cfd768473 mesa: s/unsigned/int/ to fix MSVC warning in uniforms.c 2013-07-12 08:19:50 -06:00
Brian Paul
5b0fbf1b0b mesa: s/GLuint/GLint/ to silence MSVC warning in texstore.c 2013-07-12 08:19:50 -06:00
Brian Paul
721f47227e mesa: add casts to fix MSVC warnings in multisample.c 2013-07-12 08:19:49 -06:00
Brian Paul
528e5b9476 mesa: s/GLint/GLuint/ to fix MSVC warnings in mipmap.c 2013-07-12 08:19:49 -06:00
Brian Paul
738337356b mesa: fix inconsistent function declaration, definitions
To silence MSVC warnings that the declaration and definitions
were different.
2013-07-12 08:19:49 -06:00
Brian Paul
8ba5c79d2c mesa: add cast to silence MSVC warning 2013-07-12 08:19:49 -06:00
Christian König
1681bd7f2b radeon/uvd: fall back to shader based decoding for MPEG2 on UVD 2.x v2
UVD 2.x doesn't support hardware decoding of MPEG2, just use shader
based decoding for those chipsets.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=66450

v2: fix interlacing as well

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-07-12 10:52:27 +02:00
José Fonseca
649ef4da30 glsl: Avoid variable length arrays.
They are a non-standard GCC extension that's not widely supported by
other C/C++ compilers.

Use a dynamic array instead.
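
A generic sketch of the replacement pattern, written in C rather than the C++ of the GLSL compiler. The function is a made-up example and is not the code this commit touches:

```c
#include <stdlib.h>

/* Hypothetical example: sums i*i for i in [0, n) using a heap-allocated
 * scratch array where a VLA (`int tmp[n];`) might otherwise have been used.
 * Returns -1 if the allocation fails. */
int sum_of_squares(size_t n)
{
   if (n == 0)
      return 0;

   int *tmp = malloc(n * sizeof *tmp);
   if (tmp == NULL)
      return -1;

   int sum = 0;
   for (size_t i = 0; i < n; i++) {
      tmp[i] = (int) (i * i);
      sum += tmp[i];
   }

   free(tmp);
   return sum;
}
```

The cost is an explicit free() and an allocation-failure path, in exchange for code that does not depend on VLA support.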

Trivial. Should fix the MSVC build.
2013-07-12 09:28:22 +01:00
Matt Turner
1b0d6aef03 glsl: Add support for C-style initializers.
Required by GL_ARB_shading_language_420pack.

Parts based on work done by Todd Previte and Ken Graunke, implementing
basic support for C-style initializers of arrays.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
ae79e86d4c glsl: Add infrastructure for aggregate initializers.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
8d45caaeba glsl: Add an is_declaration field to ast_struct_specifier.
Will be used in a later commit to differentiate between a structure type
declaration and a variable declaration of a struct type. I.e., the
difference between

   struct S { float x; }; (is_declaration = true)

and

   S s;                   (is_declaration = false)

Also note that is_declaration = true for

   struct S { float x; } s;

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
5df807b06f glsl: Track structs' ast_type_specifiers in symbol table.
Will be used in a future commit. An ast_type_specifier is stored (rather
than an ast_struct_specifier) with the idea that we may have more
general uses for this in the future. struct names are prefixed with
'#ast.' to avoid collisions with the glsl_types in the symbol table.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
e641b5fbee glsl: Add process_vec_mat_constructor() function.
Based largely on process_array_constructor().

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
af2987d5b6 glsl: Separate code into process_record_constructor().
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
a760c73853 glsl: Add copy-constructor for ast_struct_specifier.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:59 -07:00
Matt Turner
43757135b2 glsl: Add a constructor for ast_type_specifier.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
b85f0c5121 glsl: Clean up and clarify comment explaining initializer rules.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
ce2464a8a7 glsl: Change type of is_array to bool.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
361206771c glsl: Add a comment to note what an exec_list is a list of.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
46b74ca7bc glsl: Fix inverted conditional in error message.
The code float a[2] = float[2]( 3.4, 4.2, 5.0 ); previously generated
this:

   error: array constructor must have at least 2 parameters

when in fact it requires exactly two.
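
As an illustration of the corrected wording, a minimal sketch follows. It
is not the Mesa code: format_array_ctor_error() and its convention that a
length of 0 means an unsized array are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper (not part of Mesa): formats the diagnostic for an
 * array constructor that received the wrong number of parameters. */
static int
format_array_ctor_error(char *buf, size_t size, unsigned length)
{
   /* Assumption: length == 0 stands for an unsized array, which needs at
    * least one parameter; a sized array needs exactly `length` of them. */
   const bool is_unsized = (length == 0);
   const unsigned min_param = is_unsized ? 1 : length;

   assert(buf != NULL && size > 0);
   return snprintf(buf, size,
                   "array constructor must have %s %u parameter%s",
                   is_unsized ? "at least" : "exactly",
                   min_param, (min_param <= 1) ? "" : "s");
}
```

Calling it with a length of 2 yields the "exactly 2 parameters" wording
that the float[2] example above should have produced.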

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
9749d96817 glsl: Add missing return error_value(ctx) in error path.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Matt Turner
e117eda251 glsl: Remove unnecessary #include from ast_type.cpp.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 20:58:58 -07:00
Chia-I Wu
93742d9757 glsl/build: build builtin_compiler with VISIBILITY_CFLAGS
libglslcore.la and libglcpp.la that are built with builtin_compiler are also
linked to by drivers not using libdricore.  Since there is no public symbol in
them, it is better to mark all symbols hidden.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-07-12 09:42:25 +08:00
Matt Turner
08c90f651b glsl: Add comment explaining "row_major" parsing.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-11 16:22:07 -07:00
Matt Turner
14ed9018de glsl: Mark "row_major" as not a reserved word in GLSL ES 3.0.
We mark ARB_uniform_buffer_object as enabled under ES 3 since it
contains that functionality, which tricked the compiler into tokenizing
"row_major".

Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 16:22:07 -07:00
Matt Turner
c30948517e glsl: Remove outdated FINISHME comment.
Explicit index support was added by commit 1256a5dc.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-11 16:22:07 -07:00
Alex Deucher
77300bacaf radeon: bump libdrm_radeon requirement for CIK support
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-11 19:11:44 -04:00
Christoph Bumiller
9974593dfb r600g: x/y coordinates must be divided by block dim in dma blit
Note: this is a candidate for the 9.1 branch.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-11 19:11:44 -04:00
Chih-Wei Huang
1d9271a95c r600g/sb: Fix Android build v2
Add the sb CXX files to the Android Makefile and also stop using some
c++11 features.

v2 (Vadim Girlin): use &bc[0] instead of bc.begin()
2013-07-12 01:11:04 +04:00
Vadim Girlin
758ac6f918 r600g/sb: improve math optimizations v2
This patch adds support for some math optimizations that are generally
considered unsafe, which is why they are currently disabled for compute
shaders.

GL requirements are less strict, so they are enabled for GL shaders by
default. In case of any issues with applications that rely on higher
precision than guaranteed by GL, the 'sbsafemath' option in R600_DEBUG
disables them.

v2 - always set proper src vector size for transformed instructions
   - check for clamp modifier in the expr_handler::fold_assoc

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-07-11 23:01:01 +04:00
Jonathan Gray
c451619dde st/xvmc/tests: avoid non portable error.h functions
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-07-11 09:52:00 +02:00
Anuj Phogat
9a1a67b081 i965/blorp: Fix clear rectangle alignment in fast color clear
From BSpec: 3D-Media-GPGPU Engine > 3D Pipeline > Pixel >
Pixel Backend > MCS Buffer for Render Target(s) [DevIVB+]:
[DevHSW:GT3]: Clear rectangle must be aligned to two times
the number of pixels in the table shown below...
Observed no piglit, gles3conform regressions with this patch.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65744
2013-07-10 18:41:16 -07:00
Chia-I Wu
ad244884fc winsys/intel: build with VISIBILITY_CFLAGS
There is no public symbol in this winsys.
2013-07-11 09:03:59 +08:00
Chia-I Wu
79bc245c01 ilo: reduce PIPE_CAP_MAX_TEXTURE_CUBE_LEVELS to 12
So that there are at most (2^22 * 6) texels, lower than the 2^26 limit.
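
The arithmetic can be checked with a short sketch. The function name and
the assumption that only the base level is counted (a cube map with N
levels has faces 2^(N-1) texels on a side) are illustrative, not taken
from the ilo source.

```c
#include <assert.h>
#include <stdint.h>

/* Texels in the base level of a cube map with `levels` mip levels: the
 * largest face is 2^(levels - 1) texels on a side, and there are 6 faces. */
static uint64_t
cube_base_level_texels(unsigned levels)
{
   assert(levels >= 1 && levels <= 16);
   const uint64_t side = UINT64_C(1) << (levels - 1);
   return side * side * 6;
}

/* The 2^26 texel limit quoted in the commit message. */
static const uint64_t texel_limit = UINT64_C(1) << 26;
```

With 12 levels the face side is 2^11, so the count is 2^22 * 6, which is
below texel_limit; one level more gives 2^24 * 6, which is above it.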
2013-07-11 08:03:27 +08:00
Chia-I Wu
29af29b8dc ilo: correctly initialize undefined registers in fs
Initialize all 4 channels of undefined registers (that is, TEMPs that are used
before being assigned) in FS.
2013-07-11 07:01:51 +08:00
Michel Dänzer
a06ee5a09e radeonsi: Handle TGSI_OPCODE_DDX/Y using local memory
16 more little piglits.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-10 18:40:32 +02:00
Michel Dänzer
a6b83c0f23 radeonsi: Handle TGSI_OPCODE_TXD
One more little piglit.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-10 12:16:38 +02:00
José Fonseca
b042aae70d util/u_math: Use xmmintrin.h whenever possible.
It seems __builtin_ia32_ldmxcsr is only available on gcc and only when
-msse is used. xmmintrin.h/pmmintrin.h provide portable intrinsics, but
these too are only available with gcc when -msse/-msse3 are set.

scons build always sets -msse on x86 builds, but autotools doesn't seem
to.

We could try to get this working on gcc x86 without -msse by emitting
assembly, but I believe that in this day and age we really should be
building Mesa with -msse and -msse2.
2013-07-10 07:56:17 +01:00
Chia-I Wu
045bf0db52 ilo: honor surface padding requirements
The PRM specifies several padding requirements that we failed to honor.
2013-07-10 12:40:22 +08:00
Zack Rusin
63386b2f66 util: treat denorm'ed floats like zero
The D3D10 spec is very explicit about the treatment of denormal floats:
they behave exactly as -0 or +0 would. This makes our shading code match
that behavior, since OpenGL doesn't care and on a few CPUs it's faster
(at worst, the same speed). Float16 conversions will likely break, but
we'll fix them in a follow-up commit.
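
The intended semantics can be sketched in software as below. This only
illustrates "denormal behaves like signed zero"; it makes no claim about
how the commit implements it, and flush_denorm_to_zero() is a name
invented for the example.

```c
#include <assert.h>
#include <math.h>

/* Illustrative only: maps subnormal inputs to a zero of the same sign and
 * passes every other value through unchanged. */
static float
flush_denorm_to_zero(float x)
{
   float result = x;

   if (fpclassify(x) == FP_SUBNORMAL)
      result = signbit(x) ? -0.0f : 0.0f;

   /* Postcondition: the result is never a subnormal value. */
   assert(fpclassify(result) != FP_SUBNORMAL);
   return result;
}
```

Normal values, infinities and NaNs are returned as-is, so only the
denormal range changes behavior.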

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-09 23:30:55 -04:00
Matt Turner
80bc14370a mesa: Set ProfileMask properly for core profile.
Fixes MESA_GL_VERSION_OVERRIDE=3.2 egl-create-context-verify-gl-flavor.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-09 14:19:22 -07:00
Kenneth Graunke
8c9a54e7bc i965: Delete intel_context entirely.
This makes brw_context inherit directly from gl_context; that was the
only thing left in intel_context.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:35 -07:00
Kenneth Graunke
53631be4eb i965: Move intel_context::gen and gt fields to brw_context.
Most functions no longer use intel_context, so this patch additionally
removes the local "intel" variables to avoid compiler warnings.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:34 -07:00
Kenneth Graunke
2e26afb37b i965: Move intel_context::has_llc to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:33 -07:00
Kenneth Graunke
794de2f387 i965: Move intel_context::is_<platform> flags to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:31 -07:00
Kenneth Graunke
44fd490067 i965: Move must_use/has_separate_stencil fields to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:30 -07:00
Kenneth Graunke
3b80b147f6 i965: Move intel_context::has_hiz to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:29 -07:00
Kenneth Graunke
351d2add62 i965: Free brw, not intel.
Things worked out in the past because both brw and intel share the same
memory address (by virtue of intel being the first member of brw).

However, brw is what actually gets rzalloc'd (brw_context.c:285), so
freeing that seems safer and more obvious.
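
A minimal sketch of why both pointers compared equal, and why freeing the
outer one is clearer. The struct names are stand-ins for brw_context and
intel_context, and calloc/free stand in for the driver's rzalloc-based
allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the two context structs. */
struct base_context {
   int gen;
};

struct derived_context {
   struct base_context base;   /* first member, like intel inside brw */
   int extra_state;
};

static struct derived_context *
create_context(void)
{
   struct derived_context *ctx = calloc(1, sizeof(*ctx));

   /* C guarantees that a pointer to a struct, suitably converted, also
    * points to its first member, so the two addresses are identical. */
   assert(ctx == NULL || (void *)ctx == (void *)&ctx->base);
   return ctx;
}

static void
destroy_context(struct derived_context *ctx)
{
   /* Free the pointer that was actually allocated, not the embedded base. */
   free(ctx);
}
```

Freeing &ctx->base would happen to work for the same reason, but it stops
working as soon as the base is no longer the first member.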

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:28 -07:00
Kenneth Graunke
e3c2bb1eb4 i965: Shorten context base class dereference chains.
ctx->DrawBuffer is much more sensible than brw->intel.ctx.DrawBuffer.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:26 -07:00
Kenneth Graunke
d5b4a3f5a3 i965: Move intel_context::has_swizzling to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:25 -07:00
Kenneth Graunke
02128c448d i965: Move intel_context::intelScreen to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:24 -07:00
Kenneth Graunke
44a11eab9c i965: Delete unused intel_context::driFd field.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:23 -07:00
Kenneth Graunke
e0858763bc i965: Store brw_context as the DRI driver private, not intel_context.
Right now, they're interchangeable.  In the future, intel_context will
either go away or change purpose.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:21 -07:00
Kenneth Graunke
a1d94cdb00 i965: Move intel_context::driContext to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:20 -07:00
Kenneth Graunke
a9d33dbbdd i965: Move intel_context::NewGLState to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:19 -07:00
Kenneth Graunke
dd54558d31 i965: Move intel_context::upload to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:17 -07:00
Kenneth Graunke
0273e6e23e i965: Move intel_context::max_gtt_map_object_size to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:16 -07:00
Kenneth Graunke
b15f1fc3c6 i965: Move intel_context::perf_debug to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:14 -07:00
Kenneth Graunke
7c3180a4ad i965: Move intel_context::no_batch_wrap to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:13 -07:00
Kenneth Graunke
5314afa27a i965: Move intel_context's framerate throttling fields to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:12 -07:00
Kenneth Graunke
ec995de6fb i965: Move intel_context::stats_wm to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:10 -07:00
Kenneth Graunke
329779a0b4 i965: Move intel_context::batch to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:08 -07:00
Kenneth Graunke
5d8186ac1a i965: Move intel_context::hw_ctx to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:07 -07:00
Kenneth Graunke
eeb75b41f1 i965: Move intel_context::bufmgr to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:05 -07:00
Kenneth Graunke
e33439045d i965: Move intel_context's driconf flags to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:04 -07:00
Kenneth Graunke
fe0a8cb30d i965: Move intel_context::reduced_primitive to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:03 -07:00
Kenneth Graunke
9147b40496 i965: Move front buffer rendering fields from intel_context to brw.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:09:01 -07:00
Kenneth Graunke
e43043c316 i965: Move intel_context::vtbl to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:58 -07:00
Kenneth Graunke
fbdd3891e1 i965: Move intel_context::optionCache to brw_context.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:55 -07:00
Kenneth Graunke
ca437579b3 i965: Pass brw_context to functions rather than intel_context.
This makes brw_context available in every function that used
intel_context.  This makes it possible to start migrating fields from
intel_context to brw_context.

Surprisingly, this actually removes some code, as functions that use
OUT_BATCH don't need to declare "intel"; they just use "brw."

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:53 -07:00
Kenneth Graunke
86f2711722 i965: Remove pointless intel_context parameter from try_copy_propagate.
It's already part of the visitor class.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:51 -07:00
Kenneth Graunke
18a223d323 i965: Add forward declarations of brw_context to a few places.
These files have forward declarations for intel_context.  This makes
brw_context available in the same places without further #include
monkeying.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:50 -07:00
Kenneth Graunke
a69274454b i965: Replace #include "intel_context.h" with brw_context.h.
brw_context.h includes intel_context.h, but additionally makes the
brw_context structure available.  Switching this allows us to start
using brw_context in more places.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:48 -07:00
Kenneth Graunke
99ebf9d07a i965: Move ctx->Const setup from intelInitContext to the new helper.
This also requires moving _mesa_init_point() to after the ctx->Const
initialization.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:47 -07:00
Kenneth Graunke
963d9f78a4 i965: Split code to set ctx->Const values into a helper function.
brwCreateContext() has a lot of random things to do.  Factoring out the
part that initializes ctx->Const values and shader compiler options
makes the main function a bit easier to read.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:45 -07:00
Kenneth Graunke
d13c120573 i915: Remove i965+ chip names.
i965+ chipsets shouldn't ever hit this driver.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:44 -07:00
Kenneth Graunke
e4f3d5cdcf i965: Remove i915 chip names.
i915 chipsets shouldn't ever hit this driver.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:08:42 -07:00
Kenneth Graunke
2921390666 i965: Replace intel_context:needs_ff_sync with intel->gen == 5.
Technically, needs_ff_sync was set on Gen5+, but it was only consulted
in the clipper threads and quad/lineloop decomposition code, which are
both Gen4-5 only.  So in reality it only identified Ironlake.

The named flag doesn't really clarify things, and seems like overkill.
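
The equivalence argued above can be checked with a small sketch. The
function names are invented for the example; only the two predicates
(gen >= 5 and gen == 5) and the Gen4-5 range come from this message.

```c
#include <assert.h>
#include <stdbool.h>

/* The old flag was set on Gen5 and later. */
static bool old_needs_ff_sync(int gen) { return gen >= 5; }

/* The replacement check. */
static bool is_gen5(int gen) { return gen == 5; }

/* Returns true if both predicates agree on every generation in [lo, hi]. */
static bool
predicates_agree(int lo, int hi)
{
   assert(lo <= hi);
   for (int gen = lo; gen <= hi; gen++) {
      if (old_needs_ff_sync(gen) != is_gen5(gen))
         return false;
   }
   return true;
}
```

The predicates agree on the range 4..5, which is all the callers ever
see; they would differ from Gen6 on, where the flag was never consulted.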

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-07-09 14:07:13 -07:00
Kenneth Graunke
968c57782d i965: Add missing newline to blorp color clear perf_debug message.
perf_debug() doesn't add a newline for you; without this, all the
INTEL_DEBUG=perf output was jumbled together.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-09 10:10:46 -07:00
Emil Velikov
f0260f4e3d glsl: Silence unused variable warning in the release build
Resolves the following gcc warning

 opt_flip_matrices.cpp:84:32: warning: unused variable 'deref'

v2: keep the variable, but wrap it in an #ifndef NDEBUG block
    (suggested by Ian)
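
The pattern looks roughly like the sketch below. It is plain C with
made-up names, not the code from opt_flip_matrices.cpp.

```c
#include <assert.h>

/* Hypothetical example: `last_index` exists only to feed the assertion,
 * so a release build (NDEBUG defined) would otherwise warn that it is
 * unused. */
static int
first_element(const int *values, int count)
{
#ifndef NDEBUG
   /* Declared only when assertions are compiled in. */
   const int last_index = count - 1;
   assert(last_index >= 0);
#else
   (void)count;
#endif
   return values[0];
}
```

The variable and the assert() that reads it are compiled in or out
together, so neither build configuration sees an unused variable.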

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-08 19:08:42 -07:00
Emil Velikov
4df6823f21 glsl/ast: Silence uninitialized variable warnings in the release build
Resolves the following gcc warnings

 warning: 'iface_type_name' may be used uninitialized in this function
 warning: 'var_mode' may be used uninitialized in this function

Note: The variables are initialised to UNKNOWN and ir_var_auto

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-08 19:08:30 -07:00
Paul Berry
292368570a i965: Add an assertion to brwProgramStringNotify.
driver->ProgramStringNotify is only called for ARB programs, fixed
function vertex programs, and ir_to_mesa (which isn't used by the i965
back-end).  Therefore, even after geometry shaders are added,
brwProgramStringNotify should only ever be called with a target of
GL_VERTEX_PROGRAM_ARB or GL_FRAGMENT_PROGRAM_ARB.

This patch adds an assertion to clarify that.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-08 14:18:02 -07:00
Matt Turner
ba7b60d3e4 glsl: Allow non-constant expression initializers of const-qualified vars.
Required by ARB_shading_language_420pack.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-07-08 12:46:56 -07:00
Marek Olšák
1faa375573 r600g: improve the mechanism for recognizing an empty CS
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
287b2fa115 r600g: explicitly flush caches for streamout-based buffer copying & clearing
It's done automatically for vertex buffers, but not for constant buffers,
textures, and colorbuffers.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
7948ed1250 r600g: only flush the caches that need to be flushed during CP DMA operations
This should increase performance if constant uploads are done with the CP DMA,
because only the cache that needs to be flushed is flushed.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
1b40398d02 r600g: split INVAL_READ_CACHES into vertex, tex, and const cache flags
also flushing any cache in evergreen_emit_cs_shader seems to be superfluous
(we don't flush caches when changing the other shaders either)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Alex Deucher
098316211c r600g: adjust flush flags (v3)
1. flush SH with read caches
2. add flag for DB flushes
3. add flag for CB flushes

v2: flush all CBs, remove redundant emit_state variable.
v3: Marek: also set the new flags in r600_context_flush, the CP dma functions,
    and texture_barrier, and rename them

Signed-off-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
862f69fbe1 r600g: don't call buffer_wait in buffer_mmap_sync_with_rings
The winsys should do this, because it measures how much time we spend
in buffer_map doing synchronization, which can be viewed with the gallium
HUD.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
94d294137e r600g: don't read back the MSAA depth buffer if the read flag is not set
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
141b892620 r600g: don't flush the context in texture_transfer_map
the winsys does this automatically

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
ae87aae0c4 r600g: fix texture offset computation for mapped MSAA depth buffers
It was wrong, because the offset shouldn't be applied to MSAA depth buffers.
This small cleanup should prevent such issues in the future.

This fixes a lockup in "piglit/fbo-depthstencil default_fb -samples=n".

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
a3263cca59 r600g: fix color resolve for RGBX8 and RGBX16 integer formats
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
b1a061b81e r600g: enable fast MSAA color clear for array/3D/cube textures
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
87669c3654 r600g: implement fast MSAA color clear for integer textures
this also fixes the fast clear with multiple colorbuffers and each having
a different format

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Christian König
085c695488 r600/uvd: fix check for UVD 2.x
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-07-08 19:51:20 +02:00
Chris Forbes
1415a1884c i965: fix alpha test for MRT
Include src0 alpha in the RT write message when using MRT, so it is used
for the alpha test instead of the normal per-RT alpha value.

Fixes broken rendering in Dota2 under Wine [FDO #62647].

No Piglit regressions on Ivybridge.

V2: reuse (and simplify) existing sample_alpha_to_coverage flag in
the FS key, rather than adding another redundant one.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62647
NOTE: This is a candidate for the stable branches.
2013-07-06 12:41:54 +12:00
Roland Scheidegger
9ef49cfd84 gallivm: (trivial) fix using one lod instead of per-quad lod for texel fetch
The logic for choosing number of lods was bogus.
(The code should ultimately handle the case of only one lod even with multiple
quads but currently can't.)
2013-07-05 18:07:51 +02:00
José Fonseca
45f174ce40 gallivm: Remove bogus assert.
It is perfectly valid for the swizzle to be bigger than 2. For example the
texel offsets could be

  SAMPLE ..., IMM[0].zzz

What is not correct is for chan_index to be bigger than 2.

Trivial.
2013-07-05 14:35:54 +01:00
Ben Skeggs
c29c6b2b2e nvc0: enable very initial support for nvf0 (GK110)
Shaders need a lot of work still.  Basic stuff generally works, so this
is basically just fine for gnome-shell, OA etc at this point.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2013-07-05 14:15:04 +10:00
Roland Scheidegger
4dbca8672b gallivm: (trivial) fix bogus assertion for per-element lod with 1d resources
The assertion was always broken but the code unused until enabling the
per-element lod code. Fixes piglit texelFetch vs isampler1D and similar
tests (only run with GL 3.0 version override).
2013-07-05 01:19:23 +02:00
Roland Scheidegger
f3bbf65929 gallivm: do per-pixel lod calculations for explicit lod
d3d10 requires per-pixel lod calculations for explicit lod, lod bias and
explicit derivatives, and we should probably do it for OpenGL too, at least
if they are used from vertex or geometry shaders (so this doesn't apply to
lod bias), since there it doesn't just affect neighboring pixels.
Some code was already there to handle this, so fix it up and enable it.
There will no doubt be a performance hit, unfortunately; we could do better
if we knew we had a real vector shift instruction (with variable shift
count), but this requires AVX2 on x86 (or an AMD Bulldozer family CPU).
Don't do anything for lod bias and explicit derivatives yet, though
no special magic should be needed for them either.
Likewise, the size query is still broken just the same.

v2: Use information if lod is a (broadcast) scalar or not. The idea would be
to base this on the actual value, for now just pretend it's a scalar in fs
and not a scalar otherwise (so, per-pixel lod is only used in gs/vs but same
code is generated for fs as before).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-04 19:42:04 +02:00
Zack Rusin
bbd1e60198 draw: fix overflows in the indexed rendering paths
The semantics for overflow detection are a bit tricky with
indexed rendering. If the base index in the elements array
overflows, then the index of the first element should be used;
if the index with bias overflows, then it should be treated
like a normal overflow. Also, overflows need to be checked for
in all paths that use either the bias or the starting index location.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-03 09:06:30 -04:00
Zack Rusin
09820902d7 draw/llvm: index overflows if it's greater than elt max
The comparison, incorrectly, was greater-than-or-equal to
elt max.
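
The corrected comparison, as a standalone sketch. The function names are
invented for the example; the only fact taken from this message is that
an index equal to elt max is still valid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* An index overflows only when it is strictly greater than elt_max;
 * index == elt_max is still in range. */
static bool
index_overflows(unsigned index, unsigned elt_max)
{
   return index > elt_max;
}

/* Counts how many of the n indices overflow. */
static size_t
count_overflows(const unsigned *indices, size_t n, unsigned elt_max)
{
   size_t overflows = 0;

   assert(indices != NULL || n == 0);
   for (size_t i = 0; i < n; i++) {
      if (index_overflows(indices[i], elt_max))
         overflows++;
   }
   return overflows;
}
```

With the old greater-than-or-equal test, the last valid element would
have been treated as an overflow.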

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-03 09:06:24 -04:00
Kenneth Graunke
764afc48cf i965: Move the rest of intel_tex_layout.c into brw_tex_layout.c.
The texture alignment unit functions are called from brw_tex_layout.c,
so it makes sense to put them there.  Since the only caller of
intel_get_texture_alignment_unit() is in brw_tex_layout.c, it could be
made into a static function.  However, this patch instead simply folds
it into the caller, as it's only two lines anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
466aa712b6 i965: Push intel_get_texture_alignment_unit call into brw_miptree_layout
intel_miptree_create_layout() calls intel_get_texture_alignment_unit()
and then immediately calls brw_miptree_layout().  There are no other
callers.

intel_get_texture_alignment_unit() populates the miptree's alignment
unit fields, which are used by brw_miptree_layout() to determine where
to place each miplevel.  Since brw_miptree_layout() needs those to be
present, it makes sense to have it initialize them as the first step.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
c4c3c0dc94 i965: Declare for-loop counters in the loop in brw_tex_layout.c.
The driver is compiled in C99 mode, so this is not a problem.  It's
slightly tidier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
ccf312fd12 i965: Remove use of GLuint/GLint in brw_tex_layout.c.
Using GL types is silly; this isn't even remotely API-facing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
ed95e396f3 i965: Tidy the brw_tex_layout.c copyright and file header comments.
This uses Doxygen style for the file comments, and generally makes it
more consistent with the rest of the driver.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
2ea87fde31 i965: Move i945_texture_layout_2d to brw_tex_layout.c
This consolidates the miptree layout logic in a single file.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
1920209970 i965: Remove fallthrough for Gen4 cube map layout.
Now that both 2DArray and Cube layouts are taken care of by helper
functions, it's easy to just call the right function for each
generation.  This is a little cleaner than falling through.

This also reworks the comments.  Referencing "Volume 1" of the BSpec
isn't very helpful, since that's only available inside Intel, and it
doesn't even use volume numbers.  Also, "Ironlake...finally" sounds a
bit strange considering that almost all hardware uses the 2D array
approach.  At this point, Gen4 is the only special case.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
7e4007a1b3 i965: Combine GL_TEXTURE_CUBE_MAP_ARRAY case with the other array cases.
These do the exact same thing; combining them is tidier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
bc51f15b32 i965: Pull 3D texture layout code out into a helper function.
A bit cleaner than having it in one giant function.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
abc2bdffd6 i965: Replace maxBatchSize variable with BATCH_SZ define.
maxBatchSize was only ever initialized to BATCH_SZ, and a few places
used BATCH_SZ directly anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
2c602d2adf i965: Move annotate_aub out of the vtable.
brw_annotate_aub() is the only implementation of this function, so it
makes sense to just call it directly.
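
As a rough illustration of the pattern (the struct and function names below
are hypothetical stand-ins, not the driver's real types or signatures), a
vtable slot with a single implementation can be replaced by a direct call:

```c
/* Hypothetical stand-in for the driver context. */
struct context {
   int annotations;
};

/* The single implementation of the hook. */
static void annotate_aub(struct context *ctx)
{
   ctx->annotations++;
}

/* Before: a vtable slot that only ever points at one function. */
struct vtable {
   void (*annotate_aub)(struct context *ctx);
};

static void flush_via_vtable(struct context *ctx, const struct vtable *vtbl)
{
   vtbl->annotate_aub(ctx);
}

/* After: the only implementation is called directly. */
static void flush_direct(struct context *ctx)
{
   annotate_aub(ctx);
}
```

Both paths do the same work; the direct call simply drops an indirection
that never had a second target.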

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
f05f8793c8 i965: Move debug_batch hook out of the vtable.
brw_debug_batch() is the only implementation of this function, so it
makes sense to just call it directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
749160aab3 i965: Remove render_target_supported from the vtable.
brw_render_target_supported() is the only implementation of this
function, so it makes sense to just call it directly.

Rather than adding an #include of brw_wm.h, this patch moves the
prototype to brw_context.h.  Prototypes seem to be in rather arbitrary
places at the moment, and either place seems as good as the other.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
7c5279e554 i965: Move is_hiz_depth_format out of the vtable.
brw_is_hiz_depth_format() is the only implementation of this function,
so it makes sense to just call it directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
607338f1cb i965: Remove the invalidate_state() vtable hook.
The hook was a noop.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
251cdcf059 i965: Replace fprintfs with assertions in GLenum comparison translators.
These functions translate GLenum comparison operations into the hardware
enumerations.  They should never be passed something other than a GL
comparison operator, or something is very broken.

Assertions seem more appropriate than fprintf.
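
A minimal sketch of such a translator, assuming hypothetical HW_COMPARE_*
encodings (the real hardware values live in brw_defines.h and are not
reproduced here); the GL_* values are the standard OpenGL enums:

```c
#include <assert.h>

/* Standard OpenGL comparison enums. */
#define GL_NEVER    0x0200
#define GL_LESS     0x0201
#define GL_EQUAL    0x0202
#define GL_LEQUAL   0x0203
#define GL_GREATER  0x0204
#define GL_NOTEQUAL 0x0205
#define GL_GEQUAL   0x0206
#define GL_ALWAYS   0x0207

/* Hypothetical hardware encodings, for illustration only. */
enum hw_compare {
   HW_COMPARE_ALWAYS,
   HW_COMPARE_NEVER,
   HW_COMPARE_LESS,
   HW_COMPARE_EQUAL,
   HW_COMPARE_LEQUAL,
   HW_COMPARE_GREATER,
   HW_COMPARE_NOTEQUAL,
   HW_COMPARE_GEQUAL
};

static enum hw_compare translate_compare_func(unsigned func)
{
   switch (func) {
   case GL_NEVER:    return HW_COMPARE_NEVER;
   case GL_LESS:     return HW_COMPARE_LESS;
   case GL_EQUAL:    return HW_COMPARE_EQUAL;
   case GL_LEQUAL:   return HW_COMPARE_LEQUAL;
   case GL_GREATER:  return HW_COMPARE_GREATER;
   case GL_NOTEQUAL: return HW_COMPARE_NOTEQUAL;
   case GL_GEQUAL:   return HW_COMPARE_GEQUAL;
   case GL_ALWAYS:   return HW_COMPARE_ALWAYS;
   }

   /* Anything else means a caller is badly broken; fail loudly in debug builds. */
   assert(!"Invalid comparison function.");
   return HW_COMPARE_ALWAYS;
}
```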

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
7ee616f1bf i965: Replace intel_state.c enums with those from brw_defines.h.
Both intel_context.h and brw_defines.h have #defines for comparison
functions, stencil ops, blending logic ops, and blending factors.
They're exactly the same values, so it makes sense to pick one.

brw_defines.h is the logical place for this kind of stuff, so this patch
converts intel_state.c to use the set defined there.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
c9db037dc9 i965: Delete pre-DRI2.3 viewport hacks.
The __DRI_USE_INVALIDATE extension was added on May 11th, 2010 by commit
4258e3a2e1.  At this point, it's unlikely that anyone's using the
right mix of new and old components to hit this path.  Deleting it
removes an untested code path and cleans up the driver a bit.

Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Keith Packard <keithp@keithp.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
cbb37b7586 i965: Remove "There are probably better ways" comment.
There are always better ways to do things.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
7115bee993 i965: Delete brw_print_reg() function.
This wasn't called from anywhere; presumably it was used to examine
brw_regs when debugging shader assembly.  However, it prints registers
in a different notation than brw_disasm.c which everyone is used
to...which means I doubt anyone will want to use it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
bc8b62e3a0 i965: Move contents of intel_clear.h to intel_context.h.
Having a header file for a single prototype seems rather excessive.
Plus, the actual function is in brw_clear.c, not intel_clear.c, so
there isn't even the .c/.h filename symmetry one might expect.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
7d8e70f301 i965: Move contents of intel_extensions.h to intel_context.h.
Having an entire header file for a single prototype seems a bit
excessive.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
7d119880e8 i965: Remove some dead code.
A random smattering of things that just aren't used anymore.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
d245e795cf i965: Delete dead intel_buffer_object::range_map_size field.
Nothing uses this, apparently.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
1f6ebdd43f i965: Remove intel_buffer_object::source.
This was only used for BOs backed by system memory on i915.  With that
gone, there's nothing that even sets source to non-zero, so this is
purely dead code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
6e5b80ee5a i965: Fix buffer object segfault since removal of system memory BOs.
Commit cf31a19300 removed support for BOs
backed by system memory, as it was only useful for i915.  However, it
removed a little too much code: intel_bufferobj_buffer() used to call
intel_bufferobj_alloc_buffer(), and after that commit, it didn't.

This led to NULL pointer dereferences in several test cases, such as
es3conform's transform_feedback_state_variables test.

This commit restores the allocation, preserving the original behavior.
It may not be the cleanest approach, but tidying should come later.
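
The restored behavior amounts to allocate-on-first-use in the accessor. A
simplified sketch, using a hypothetical struct and function names rather
than the real intel_buffer_object code:

```c
#include <stdlib.h>

/* Hypothetical simplified buffer object. */
struct buffer_object {
   void *storage;   /* NULL until the first access */
   size_t size;
};

static void alloc_buffer(struct buffer_object *obj)
{
   obj->storage = calloc(1, obj->size);
}

/* Accessor that makes sure storage exists before handing it out, so
 * callers never dereference a NULL buffer. */
static void *get_buffer(struct buffer_object *obj)
{
   if (obj->storage == NULL)
      alloc_buffer(obj);
   return obj->storage;
}
```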

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66432
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:12 -07:00
Matthew McClure
012ba47076 postprocess: move second temporary assertion into isolated configuration
With this patch we will only assert that the second temporary is allocated,
when there are more than two active filters.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66423

Signed-off-by: Brian Paul <brianp@vmware.com>
2013-07-03 09:19:04 -06:00
José Fonseca
9b6788eb15 glsl: Ensure snprintf is defined on MSVC builds.
Should fix:

  src\glsl\opt_dead_builtin_varyings.cpp(244) : error C3861: 'snprintf': identifier not found
  ...
2013-07-03 08:26:08 +01:00
Ilia Mirkin
4bc8e3c3e4 targets/xvmc-nouveau: add in missing nv30 lib
Currently libXvMCnouveau.so is missing nv30_screen_create. Add it in so
that it may be dlopen'd.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-07-03 09:02:40 +02:00
Marek Olšák
30c3e8718d mesa,glsl,gallium: remove GLSLSkipStrictMaxVaryingLimitCheck and dependencies
Not needed with do_dead_builtin_varyings.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
74edd56927 st/mesa: disable EXT_separate_shader_objects
The extension disallows elimination of set-but-unused varyings.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
b3d8b4c0b4 glsl/linker: eliminate unused and set-but-unused built-in varyings
This eliminates built-in varyings such as gl_Color, gl_SecondaryColor,
gl_TexCoord, and gl_FogFragCoord if they are unused by the next stage or
not written at all (e.g. gl_TexCoord elements). The gl_TexCoord array is
broken down into separate vec4s if needed.

v2: - use a switch statement in varying_info_visitor::visit(ir_variable*)
    - use snprintf
    - disable the optimization for GLES2

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
3c555827c3 glsl/linker: check against varying limit after unused varyings are eliminated
We counted even the varyings which were later eliminated, which was
suboptimal.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
284d954912 glsl/linker: link shaders in the opposite order (from fragment to vertex)
This ensures that inter-shader outputs and inputs are properly eliminated
across 3 or more shader stages. The behavior is unchanged with 2 or fewer
shader stages.

For example, elimination of unused FS inputs causes elimination of matching
GS outputs, which causes elimination of the GS inputs that were needed for
evaluation of the eliminated GS outputs, which causes elimination of
matching VS outputs. An unused FS input is all that's needed to trigger
this chain reaction.
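
A toy model of that chain reaction, not the linker's actual data
structures: each stage is a pair of bitmasks, and output slot i is assumed
(purely for illustration) to depend only on input slot i of the same stage.

```c
#include <stdint.h>

/* Hypothetical model: bit i of each mask is varying slot i. */
struct stage {
   uint32_t inputs;
   uint32_t outputs;
};

/* Walk from the last stage to the first so that one elimination can
 * propagate all the way back in a single pass. */
static void eliminate_unused(struct stage *stages, int count)
{
   for (int i = count - 2; i >= 0; i--) {
      /* Drop outputs that the next stage never reads. */
      stages[i].outputs &= stages[i + 1].inputs;
      /* Under the toy dependency rule, an input is only needed if the
       * matching output survived. */
      stages[i].inputs &= stages[i].outputs;
   }
}
```

With stages ordered VS, GS, FS, an FS that reads only slot 0 leaves only
slot 0 alive in the GS and VS masks after one backward pass.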

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
030ca230e2 mesa: renumber shader indices according to their placement in pipeline
See my explanation in mtypes.h.

v2: don't do this in gallium
v3: also updated the comment at the gl_shader_type definition

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
José Fonseca
84f367e69a gallivm: Simplify intrinsic name construction.
Just noticed this could be slightly shortened when fixing MSVC build.

Trivial.
2013-07-02 13:12:31 +01:00
Kenneth Graunke
15ca0ca1b6 glsl/builtins: Fix ARB_texture_cube_map_array built-in availability.
This patch adds texture() for isamplerCubeArray and usamplerCubeArray,
which were entirely missing.

It also makes texture() with a LOD bias fragment shader specific.  The
main GLSL specification explicitly says that texturing with LOD bias
should not be allowed for vertex shaders.

Affects Piglit's ARB_texture_cube_map_array/compiler/tex_bias-01.vert,
which tries to use bias in a vertex shader.  Currently, it expects this
to pass (so this patch regresses the test), but I've sent a patch to
reverse the expected behavior (so this patch would fix the updated test):
http://lists.freedesktop.org/archives/piglit/2013-June/006123.html

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2013-07-02 01:01:30 -07:00
José Fonseca
4c859901ce gallivm: Fix MSVC build. 2013-07-02 06:41:32 +01:00
José Fonseca
e621ec816d gallivm: Fix indirect immediate registers.
If reg->Register.Indirect is true then the immediate is not truly a
constant LLVM expression.

There is no performance regression in using LLVMBuildBitCast, as it will
fall back to LLVMConstBitCast internally when the argument is a constant.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-07-02 06:30:06 +01:00
Zack Rusin
70bc43acdb gallium/tests: fix the translate test 2013-06-28 09:43:17 -04:00
Anuj Phogat
722721d718 i965: Enable ext_framebuffer_multisample_blit_scaled on intel h/w
This patch enables ext_framebuffer_multisample_blit_scaled extension
on intel h/w >= gen6.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-01 15:21:25 -07:00
Anuj Phogat
6fc3da2da0 i965/blorp: Add bilinear filtering of samples for multisample scaled blits
Current implementation of ext_framebuffer_multisample_blit_scaled in
i965/blorp uses nearest filtering for multisample scaled blits. Using
nearest filtering produces blocky artifacts and negates the benefits
of MSAA. That is the reason why extension was not enabled on i965.

This patch implements the bilinear filtering of samples in blorp engine.
Images generated with this patch are free from blocky artifacts and show
big improvement in visual quality.
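
For reference, bilinear filtering of four neighboring samples can be
sketched in plain C as below. This is only the arithmetic; the blorp
implementation emits shader code and the helper names here are
hypothetical.

```c
/* Linear interpolation between a and b with weight t in [0, 1]. */
static float lerp_f(float a, float b, float t)
{
   return a + (b - a) * t;
}

/* Blend four neighboring samples. fx and fy are the fractional position
 * of the lookup point inside the 2x2 sample footprint. */
static float bilinear(float s00, float s10, float s01, float s11,
                      float fx, float fy)
{
   float top = lerp_f(s00, s10, fx);
   float bottom = lerp_f(s01, s11, fx);
   return lerp_f(top, bottom, fy);
}
```

Nearest filtering would instead return one of the four samples unchanged,
which is where the blocky artifacts come from.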

Observed no piglit and gles3 regressions.

V3:
- Algorithm used for filtering assumes a rectangular grid of samples
  roughly corresponding to sample locations.
- Test the boundary conditions on the edges of texture.

V4:
- Clip texcoords and use conditional MOVs.
- Send texture dimensions as push constants.
- Remove the optimization in case of scaled multisample blits.

V5:
- Move mcs_fetch() inside the 'for' loop after computing pixel coordinates.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-01 15:21:25 -07:00
Ian Romanick
27f2df2507 docs: Import 9.1.4 release notes, add news item.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-01 14:48:58 -07:00
Zack Rusin
1c2e5c223d draw/translate: fix instancing
We were incorrectly computing the buffer offset when using the
instances. The buffer offset is always equal to:
start_instance * stride + (instance_num / instance_divisor) *
stride
We were completely ignoring the start instance quite
often, producing instances that were completely wrong, e.g. if
start instance = 5, instance divisor = 2, then on the first
iteration it should be:
5 * stride, not (5/2) * stride as we'd have currently, and if
start instance = 1, instance divisor = 3, then on the first
iteration it should be:
1 * stride, not 0 as we'd have.
This fixes it and adjusts all the code to the changes.
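
The formula can be written as a small helper (a sketch with a hypothetical
function name, assuming a non-zero instance divisor and ignoring overflow):

```c
/* Byte offset into a per-instance vertex buffer:
 *   start_instance * stride + (instance_num / instance_divisor) * stride
 * instance_divisor is assumed to be non-zero. */
static unsigned instance_offset(unsigned start_instance, unsigned instance_num,
                                unsigned instance_divisor, unsigned stride)
{
   return start_instance * stride +
          (instance_num / instance_divisor) * stride;
}
```

With a stride of 16, start instance 5 and divisor 2 give 5 * 16 = 80 on
the first iteration, and start instance 1 and divisor 3 give 16.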

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 05:21:20 -04:00
Zack Rusin
df4ab7974a draw: fix incorrect clipper invocation statistics
Clipper invocations are computed earlier (before the
emission, of course), so this code was adding bogus
numbers to the already computed clipper invocations.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:29 -04:00
Zack Rusin
34546d61c1 draw/gallivm: export overflow arithmetic to its own file
We'll be reusing this code, so let's put it in a common file
and use it in the draw module.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:24 -04:00
Zack Rusin
88de009cc1 draw: check for integer overflows in instance computation
Integers could easily overflow if the starting instance
was large enough. Instead of letting bogus counts through,
set the instance to max if it overflowed and let our
regular buffer overflow computation handle it.
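
The idea, sketched in plain C rather than in the generated LLVM IR the draw
module actually uses (the helper name is hypothetical):

```c
#include <stdint.h>

/* Add two 32-bit counts. If the sum wraps around, return UINT32_MAX so
 * that a later bounds check rejects the value instead of trusting a
 * small, bogus result. */
static uint32_t add_saturate_u32(uint32_t a, uint32_t b)
{
   uint32_t sum = a + b;   /* unsigned wraparound is well defined in C */
   return sum < a ? UINT32_MAX : sum;
}
```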

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:20 -04:00
Zack Rusin
2f13f28120 draw: check for an integer overflow when computing stride
Our buffer overflow arithmetic was susceptible to integer
overflows, which caused the buffer overflow logic to break.
Let's use the LLVM overflow intrinsics to check for integer
overflows while computing the stride/needed buffer size.
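
A plain C equivalent of a checked multiply, shown only to illustrate the
check; it is not the LLVM intrinsic call the patch emits, and the helper
name is hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Store a * b in *result and return true if the product wrapped. */
static bool mul_overflow_u32(uint32_t a, uint32_t b, uint32_t *result)
{
   *result = a * b;   /* wraps modulo 2^32 on overflow */
   return a != 0 && b > UINT32_MAX / a;
}
```

A caller would treat a true return as "buffer too small" rather than using
the wrapped product as a size.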

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:16 -04:00
Zack Rusin
e742f7788e draw: account for elem size when computing overflow
We weren't taking into account the size of the element
that is to be fetched, which meant that it was possible
to overflow the buffer reads if the stride was very
close to the end of the buffer, e.g. stride = 3, buffer
size = 4, and the element to be read = 4. This should
be properly detected as an overflow.
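
A sketch of the corrected check, reading the example above as a fetch that
starts at byte offset 3 (an assumption) with a 4-byte element in a 4-byte
buffer; the helper name is hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>

/* True if reading elem_size bytes starting at offset would run past the
 * end of a buffer of buffer_size bytes. */
static bool fetch_overflows(size_t offset, size_t elem_size, size_t buffer_size)
{
   /* Avoid computing offset + elem_size, which could itself wrap. */
   return elem_size > buffer_size || offset > buffer_size - elem_size;
}
```

Checking only `offset < buffer_size` would accept offset 3 here even though
the last three bytes of the element lie outside the buffer.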

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:12 -04:00
Vinson Lee
7214fe3cc4 i965: Initialize brw_blorp_const_color_program member variables.
Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-01 10:16:16 -07:00
Ross Burton
2c6186390c eglplatform: use unsigned long instead of 32-bit ints in generic platform
In the generic Unix case use the "unsigned long" type instead of 32-bit
integers so that the type sizes are consistent on 64-bit machines between X11
and not-X11.

Signed-off-by: Ross Burton <ross.burton@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-01 10:06:24 -07:00
Ross Burton
1a7275de9a build: fix EGL build when no X11 headers are present
eglplatform.h defaults to X11 on Unix unless told otherwise, so if we're doing a
build without any X11 support tell it so that we don't try including headers
that don't exist.

Also set GL_PC_FLAGS so that the definition is in egl.pc, so that applications
using EGL don't try to pull in X11 headers on systems where EGL was configured
without X11 support.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64959
Signed-off-by: Ross Burton <ross.burton@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-01 10:06:11 -07:00
José Fonseca
acc6a141b8 tools/trace: Return dummy fence object to silence warnings. 2013-07-01 12:06:58 +01:00
José Fonseca
0fd71ac9eb tools/trace: Don't crash if a trace has no timing information. 2013-07-01 12:05:57 +01:00
José Fonseca
fa3040c117 scons: Fix dependencies of enums.c and api_exec.c. 2013-07-01 12:04:59 +01:00
Maarten Lankhorst
bf95ca7de0 nvc0: allow frame dropping in h264
The only reason the checks existed was paranoia: when I first
wrote the code I wasn't sure it was correct. Now that I am,
and the asserts triggered when XBMC was dropping frames, remove them.

NOTE: This is a candidate for the 9.1 branch.
2013-07-01 08:47:49 +02:00
Tom Stellard
24fa43675f r300g/compiler: Prevent regalloc from swizzling texture operands v2
https://bugs.freedesktop.org/show_bug.cgi?id=63520

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-30 21:38:57 -07:00
Tom Stellard
e2c3640540 r300g/compiler/tests: Add an assembly parser
The assembly parser can be used to load r300 assembly dumps
and run them through any of the r300 compiler passes.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-30 21:38:57 -07:00
Tom Stellard
ab40d8d56f r300g: Fix make check
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-30 21:24:55 -07:00
Grigori Goronzy
30004b20c2 r600g: implement fast color clears for MSAA on evergreen+
Allows MSAA colorbuffers, which have a CMASK automatically and don't
need any further special handling, to be fast cleared. Instead
of clearing the buffer, set the clear color and the CMASK to the
cleared state.

Fast clear is used only when all bound colorbuffers fulfill certain
conditions: a CMASK is required, we have to be able to create a clear
color value for the format and the texture mustn't contain multiple
images. Technically, it should be possible to support array textures
and cubemaps if all images are attached to the framebuffer,
but this does not appear to be common.

v2: fix fast clear check
v3: Marek: - disable fast clear with 128-bit formats, which are unsupported
           - set tex->dirty_level_mask in r600_clear, so that the driver knows
             the resource must be decompressed/expanded
           - return early from r600_clear if there's nothing else to do

Signed-off-by: Marek Olšák <maraeo@gmail.com>
2013-07-01 03:02:43 +02:00
Marek Olšák
b1693194ee r600g/compute: disable unused colorbuffer slots
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-01 03:02:43 +02:00
Marek Olšák
f83e220d36 st/mesa: handle SNORM formats in generic CopyPixels path
v2: check desc->is_mixed in util_format_is_snorm
2013-06-30 22:14:37 +02:00
Matt Turner
adf8afa168 i965: NULL check depth_mt to quiet static analysis.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-29 15:19:08 -07:00
Roland Scheidegger
7d430bfab9 llvmpipe: fix timer query if there's no bins
b04a295a4a removed seemingly unnecessary
code in get_query. Turns out this code could in fact be reached - while
timestamps are always binned, if there are no bins (which happens if fb
size is 0) then the rasterization query code filling this in is still
never executed.
So fix this up by filling in some timestamp, but do it at EndQuery time
not GetQuery time which should be more appropriate.
Makes piglit arb_timer_query-timestamp-get happy again.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-29 16:58:02 +02:00
Tom Stellard
5a925cc550 clover: Don't segfault when compiling a program with no kernel 2013-06-28 15:19:06 -07:00
Eric Anholt
d7361f2943 mesa: Remove unused allow_large_textures driconf from classic drivers.
This option hasn't been used since the introduction of DRI2.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:27 -07:00
Kenneth Graunke
03600660a1 i915: Remove GLES 3.0 sRGB workaround.
Gen3 doesn't support GLES 3.0, so there's no need for it.

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
dc8796506e i965: Remove is_945.
Only relevant on Gen3.

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
a4e31956ac i965: Delete hw_stencil flag.
This was only used by i915.

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
4299e35888 i965: Remove hw_stipple flag.
This was only used by i915.

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
1a5dca38e9 i965: Remove use_early_z option.
This was only used by i965+.

v2: Also remove the option from the driconf list. (change by anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
2cc5724db2 i965: Remove unused SUBPIXEL_* macros.
Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
2e9fe0ca12 i965: Remove redundant Gen3 PCI IDs.
Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
1811f5c43d intel: Remove unused INTEL_MAX_FIXUP macro.
v2: Remove it from i915, too (change by anholt)

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Eric Anholt
0ac0a1b02e i965: Drop i915 register/instruction definitions.
v2: Remove unused DV_PF_* macros, too. (change by Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:26 -07:00
Eric Anholt
1b67cd29a1 i965: Drop code for calling the empty brw_update_draw_buffers() hook.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
7c232189c5 i965: Drop dead i915 blend state code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
d58d0a3754 i965: Drop i915-specific blit clear code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
cf31a19300 i965: Drop the system-memory VBO support for i915.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
814440aadd i965: Drop i915 swtnl code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
bb2e312d4d i965: Drop i915-specific vtbl entries.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
a61d8f6110 i965: Drop swtnl fallback code for i915.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
28e80d7136 i965: Drop i915 code from intel_screen.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
4a08a86f22 i965: Drop #ifdef I915 code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
6fddd375d7 i965: Drop code checking for gen <= 3.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
3c231b8631 i915: Remove a duplicated set of PCI IDs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
8ac1ed92aa i915: Remove various remaining dead code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
934974fba6 i915: Remove dead debug flags.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
39c5fd7f13 i915: Remove state batch emit support.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
a40f9871a0 i915: Drop unused register #defines from the shared reg file.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
173666e2ed i915: Drop 965+ GL version setup.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
f6426509dc i915: Remove gen6+ batchbuffer support.
While i915 hardware does support hardware contexts, we don't expect there
to ever be SW support for it (given that support hasn't even made it back
to gen5 or gen4).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
c25e3c34d6 i915: Drop chipset detection code for 965+ chipsets.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
014251ef42 i915: Drop context fields specific to 965+ chipsets.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
d71b7301ec i915: Drop all has_llc code.
i915 never has llc.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:24 -07:00
Eric Anholt
be63c1c993 i915: Remove the remainder of the batchbuffer caching.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
7f210bf535 i915: Remove miscellaneous uncalled gen4 code from formerly shared files.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
6bdc5ecbba i915: Remove most of the code under gen >= 4 checks.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
18100d415e i915: Remove fake ETC support that only existed on gen4+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
27eedca3e0 i915: Remove separate stencil code.
This was formerly-shared code for supporting gen5+.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
279f0bce47 i915: Remove the I915 macro from the formerly shared code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
f26104eb5b i915: Remove all the MSAA support code.
This hardware doesn't have MSAA support, so this code is all a waste for it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Eric Anholt
0f31e06a2e i915: Remove all the HiZ code from i915.
v2: Remove extra struct forward declaration (change by Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:23 -07:00
Ian Romanick
927f572c27 mesa: GL_EXT_shadow_funcs is not optional with GL_ARB_shadow
Every driver left in Mesa that enables one also enables the other.
There's no reason to let it be optional.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
41853b598c mesa: GL_ARB_texture_storage_multisample is not optional with GL_ARB_texture_multisample
In Mesa, this extension is implemented purely in software.  Drivers may
*optionally* provide optimized paths.  If a driver enables
GL_ARB_texture_multisample, it gets GL_ARB_texture_storage_multisample
for free.

NOTE: This has the side effect of enabling the extension in Gallium
drivers that enable GL_ARB_texture_multisample.

v2 (Ken): Still prevent multisample texture targets in TexParameter for
implementations that don't support multisampling.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
d5b6b7a39b mesa: GL_ARB_texture_storage is not optional
In Mesa, this extension is implemented purely in software.  Drivers may
*optionally* provide optimized paths.

NOTE: This has the side effect of enabling the extension in the radeon,
r200, and nouveau drivers.

v2: Minor whitespace tidying (suggested by Brian).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
70966570f3 mesa: GL_ARB_shading_language_100 is not optional
This extension just provides some of the most basic software framework
for GLSL.  Without GL_ARB_vertex_shader or GL_ARB_fragment_shader,
applications still cannot use GLSL.  There's no value in
conditionalizing support for this extension.

NOTE: This has the side effect of enabling the extension in the radeon,
r200, and nouveau drivers.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
e6ec425d6e mesa: GL_ARB_shader_objects is not optional
This extension just provides some of the most basic software framework
for GLSL.  Without GL_ARB_vertex_shader or GL_ARB_fragment_shader,
applications still cannot use GLSL.  There's no value in
conditionalizing support for this extension.

NOTE: This has the side effect of enabling the extension in the radeon,
r200, and nouveau drivers.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
9bc24b4fc4 mesa: GL_NV_blend_square is not optional
Every driver left in Mesa enables this extension all the time.  There's
no reason to let it be optional.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
338ea2e4d1 mesa: GL_EXT_fog_coord is not optional
Every driver left in Mesa enables this extension all the time.  There's
no reason to let it be optional.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
c139708087 mesa: GL_EXT_secondary_color is not optional
Every driver left in Mesa enables this extension all the time.  There's
no reason to let it be optional.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
b5305a303b mesa: GL_EXT_framebuffer_object is not optional
Every driver left in Mesa enables this extension all the time.  There's
no reason to let it be optional.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:22 -07:00
Ian Romanick
f4571640b8 mesa: Remove GL_MESA_resize_buffers
Commit bab755a made the implementation a no-op, and it was only ever
enabled by software rasterizers.

v2: Move the spec into docs/specs/OLD since it's now obsolete
    (squashed patch from Andreas Boll)

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
34e8905077 mesa: Remove _mesa_{enable, disable}_extension and _mesa_extension_is_enabled
They're not used anywhere.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
e14b486113 mesa: Just set extension flags instead of calling _mesa_enable_extension
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
b0d755f00b mesa: Remove _mesa_enable_._._extensions functions
After the preceding commits, they are not used.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
45099ec175 swrast: Don't call _mesa_enable_._._extensions and _mesa_enable_sw_extensions
_mesa_enable_sw_extensions enables all the extensions (and more) that
the others enable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
a964397fd9 osmesa: Don't call _mesa_enable_._._extensions and _mesa_enable_sw_extensions
_mesa_enable_sw_extensions enables all the extensions (and more) that
the others enable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
c9edd661c4 wmesa: Don't call _mesa_enable_._._extensions and _mesa_enable_sw_extensions
_mesa_enable_sw_extensions enables all the extensions (and more) that
the others enable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
89cf6e6273 x11: Don't call _mesa_enable_._._extensions and _mesa_enable_sw_extensions
_mesa_enable_sw_extensions enables all the extensions (and more) that
the others enable.  Also, don't duplicate the DXTn checks.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 13:35:21 -07:00
Ian Romanick
0b9398c74f i965: Merge the two GEN >= 6 extension enable blocks
There's no reason for these blocks to be separate.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:21 -07:00
Ian Romanick
ae66a656fd i965: Move GEN >= 4 extensions into the "always on" list
This copy of the source file is only used for GEN >= 4, so extensions
that are enabled for GEN >= 4 are always enabled.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:21 -07:00
Ian Romanick
4ed976f6b5 i965: Move GEN >= 3 extensions into the "always on" list
This copy of the source file is only used for GEN >= 4, so extensions
that are enabled for GEN >= 3 are always enabled.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:20 -07:00
Ian Romanick
e621208e29 i915: Remove GEN >= 4 extension support
This copy of the source file is only used for GEN <= 3, so remove the
dead code.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:20 -07:00
Kenneth Graunke
745f6c692c i965: Split surface format code into a new file (brw_surface_formats.c).
brw_wm_surface_state.c has gotten rather large and unwieldy.  At this
point, it consists of two separate portions:

1. Surface format code

   This includes the giant table of surface formats and what features
   they support on each generation, as well as the code to translate
   between Mesa formats and hardware formats.

   This is used across all generations.

2. Binding table (SURFACE_STATE) related code.

   This is the code to generate SURFACE_STATE entries for renderbuffers,
   textures, transform feedback buffers, constant buffers, and so on, as
   well as the code to assemble them into binding tables.

   This is only used on Gen4-6; gen7_surface_state.c has Gen7+ code.

Since the two are logically separate, and one is reused on every
generation while the other is not, it makes a lot of sense to split
them out.  It should also make finding code easier.

No code is changed by this patch.  I simply copied the file then deleted
portions of both.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:11 -07:00
Alex Deucher
c309e64db8 radeonsi: add kabini pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:17:27 -04:00
Alex Deucher
b6b1346691 radeonsi: add bonaire pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:17:18 -04:00
Alex Deucher
d669992e35 radeonsi: disable 2D tiling on CIK for now
Causes GPU hangs.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:17:10 -04:00
Alex Deucher
1357624abc radeonsi: add llvm processor names for CIK
Requires updated llvm.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:17:00 -04:00
Alex Deucher
234d81e6b2 radeonsi: emit PA_SC_RASTER_CONFIG[_1] on cik
Use the golden values for each asic.

Todo: update Kabini and Kaveri.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:53 -04:00
Alex Deucher
9d8ad222c6 radeonsi: PA_CL_ENHANCE is privileged on CIK
Needs to be and is set by the kernel.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:46 -04:00
Alex Deucher
72c10be3a7 radeonsi: update surface sync packet emit for CIK
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:35 -04:00
Alex Deucher
f2a9bd8084 radeonsi: store chip class in the pm4 struct
Will be used for asic specific pm4 behavior.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:27 -04:00
Alex Deucher
3a47f1945f radeonsi: properly handle DB tiling setup on CIK
On CIK, DB switches back to using per-surface tiling
parameters rather than the tile index used on SI.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:17 -04:00
Alex Deucher
8c903f5df9 radeonsi: emit additional shader pgm rsrc registers for CIK
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:10 -04:00
Alex Deucher
59e4fe0b75 radeonsi: emit TA_BC_BASE_ADDR_HI for border color on CIK
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:16:03 -04:00
Alex Deucher
b363a45c54 radeonsi: fix VGT_PRIMITIVE_TYPE emit for CIK
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:15:54 -04:00
Alex Deucher
ecb679a8d3 radeonsi: register updates for CIK
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:15:46 -04:00
Alex Deucher
deb2358243 radeonsi: initial PM4 changes for CIK
Note which packets are removed and add new ones.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:15:36 -04:00
Alex Deucher
f29f206c93 radeonsi: initial support for CIK chips
Add the infrastructure to differentiate them.
Just treat them like SI for now.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:15:28 -04:00
Alex Deucher
5b3f1ea933 radeonsi: rename SI chip class from TAHITI to SI
Covers the entire family.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-28 15:15:20 -04:00
Tom Stellard
47e35eff9d r600g: Fix build
Broken since 2840bec56f when opencl is disabled.
2013-06-28 11:11:43 -07:00
Anuj Phogat
ee723ffabb mesa: Return ZeroVec/dummyReg instead of NULL pointer
Assertions are not sufficient to check for null pointers as they don't
show up in release builds. So, return ZeroVec/dummyReg instead of NULL
pointer in get_{src,dst}_register_pointer(). This should calm down the
warnings from the static analysis tool.

Note: This is a candidate for the 9.1 branch.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 10:53:43 -07:00
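The idea in the commit above (hand back a harmless fallback object rather than NULL, so release builds without assertions stay safe) can be sketched as follows. This is only an illustration: `struct machine`, `NUM_TEMPS`, and `zero_vec` are hypothetical names, not Mesa's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_TEMPS 4

/* Hypothetical register file; the real Mesa structures differ. */
struct machine {
    float temps[NUM_TEMPS][4];
};

/* Shared read-only fallback returned for out-of-range indices. */
static const float zero_vec[4] = { 0.0f, 0.0f, 0.0f, 0.0f };

/* Never returns NULL: an invalid index yields zero_vec instead, so
 * callers stay safe even when assertions are compiled out. */
static const float *
get_src_register_pointer(const struct machine *m, int index)
{
    if (index < 0 || index >= NUM_TEMPS)
        return zero_vec;
    return m->temps[index];
}
```

A caller can then dereference the result unconditionally; an invalid index reads zeros instead of crashing.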
Tom Stellard
bee49cb0ec mesa: Fix build with older gcc since update of glext.h
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-28 08:49:06 -07:00
Tom Stellard
2840bec56f r600g/compute: Accept LDS size from the LLVM backend
And allocate the correct amount before dispatching the kernel.

Tested-by: Aaron Watry <awatry@gmail.com>
2013-06-28 08:33:11 -07:00
Tom Stellard
2639fca1f0 r600g/compute: Move compute_shader_create() function into evergreen_compute.c
Tested-by: Aaron Watry <awatry@gmail.com>
2013-06-28 08:33:11 -07:00
Brian Paul
ba4979810f svga: pass svga_compile_key by reference instead of value
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-28 08:38:00 -06:00
Brian Paul
74e8a7d1dd svga: use switch statement in svga_shader_type()
Safer in case the PIPE_SHADER_x tokens get renumbered (as Marek
wanted to do).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-28 08:37:59 -06:00
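A minimal sketch of why a switch is the safer mapping: it names each token explicitly, so renumbering the tokens cannot silently change the result the way an arithmetic mapping or an array indexed by token value could. The enum names and values below are placeholders, not the real PIPE_SHADER_x or svga definitions.

```c
#include <assert.h>

/* Hypothetical stand-ins for the PIPE_SHADER_x tokens; values are arbitrary. */
enum shader_stage {
    STAGE_VERTEX = 0,
    STAGE_FRAGMENT = 1,
    STAGE_GEOMETRY = 2
};

/* Hypothetical device-side shader type codes. */
enum device_shader_type {
    DEV_SHADER_INVALID = -1,
    DEV_SHADER_VS = 1,
    DEV_SHADER_PS = 2
};

/* Each case is matched by name, so the mapping survives renumbering. */
static enum device_shader_type
shader_type_from_stage(enum shader_stage stage)
{
    switch (stage) {
    case STAGE_VERTEX:
        return DEV_SHADER_VS;
    case STAGE_FRAGMENT:
        return DEV_SHADER_PS;
    default:
        return DEV_SHADER_INVALID;
    }
}
```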
Chia-I Wu
24b05ff158 ilo: clean up states that use ilo_view_surface
Use variable names whose meaning is easier to remember.
2013-06-28 15:01:00 +08:00
Chia-I Wu
2c9b6a2164 ilo: remove ilo_cbuf_state::count
We can derive it from enabled_mask.
2013-06-28 15:01:00 +08:00
Chia-I Wu
7ea3ed81c8 ilo: clean up ilo_set_constant_buffer()
Add loops that will be optimized away.
2013-06-28 15:01:00 +08:00
Chia-I Wu
11d283cde9 ilo: clean up states that take a start_slot
They are similar, so clean them up to make them look similar.
2013-06-28 15:00:42 +08:00
Vinson Lee
def634979d glsl: Initialize member variable is_ubo_var in constructor.
Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-06-27 21:51:32 -07:00
Chia-I Wu
20c691b936 ilo: use shorter names for dirty flags
The new names match those of ilo_context's members respectively, and are
shorter.
2013-06-28 10:44:51 +08:00
Chia-I Wu
cabc7b44c0 ilo: track if primitive restart has changed
Re-emit 3DSTATE_INDEX_BUFFER to enable/disable primitive restart.
2013-06-28 10:44:38 +08:00
Chia-I Wu
e071812e46 ilo: avoid potential dangling pointer dereference
Set pipe_draw_info to NULL after draw_vbo().
2013-06-28 10:11:49 +08:00
Ian Romanick
c74a7eb9c5 mesa: Remove GL_EXT_clip_volume_hint
As far as I can tell, no driver has enabled this extension since c6499a7
back in 2007.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-27 18:14:33 -07:00
Chad Versace
6b676e6634 i965,i915: Return early if miptree allocation fails
If allocation fails in intel_miptree_create_layout(), don't proceed to
dereference the miptree. Return an early NULL.

Fixes static analysis error reported by Klocwork.

Note: This is a candidate for the 9.1 branch.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-27 13:16:47 -07:00
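The early-return pattern described in the commit above looks roughly like this. It is a sketch under assumptions: `create_layout()` and `miptree_create()` are simplified, hypothetical stand-ins and do not reproduce the real intel_miptree code.

```c
#include <assert.h>
#include <stdlib.h>

struct miptree {
    int width, height;
    unsigned char *data;
};

/* Hypothetical layout step; returns NULL on bad input or failed allocation. */
static struct miptree *
create_layout(int width, int height)
{
    if (width <= 0 || height <= 0)
        return NULL;

    struct miptree *mt = calloc(1, sizeof(*mt));
    if (!mt)
        return NULL;

    mt->width = width;
    mt->height = height;
    return mt;
}

static struct miptree *
miptree_create(int width, int height)
{
    struct miptree *mt = create_layout(width, height);
    if (!mt)
        return NULL; /* bail out before dereferencing mt */

    mt->data = calloc((size_t)mt->width * (size_t)mt->height, 1);
    if (!mt->data) {
        free(mt);
        return NULL;
    }
    return mt;
}

static void
miptree_destroy(struct miptree *mt)
{
    if (mt) {
        free(mt->data);
        free(mt);
    }
}
```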
Roland Scheidegger
670f829102 llvmpipe: handle offset_clamp
This was just ignored (unless draw was handling it for some reason, such as
unfilled polys).
I'm not convinced of that code, putting the float for the clamp in the key
isn't really a good idea. Then again the other floats for depth bias are
already in there too anyway (should probably have a jit_context for the
setup function), so this is just a quick fix.
Also, the "minimum resolvable depth difference" used isn't really right as it
should be calculated according to the z values of the current primitive
and not be a constant (of course, this only makes a difference for float
depth buffers), at least for d3d10, so depth biasing is still not quite right.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-27 19:06:40 +02:00
Roland Scheidegger
b04a295a4a llvmpipe: remove never reached code for timestamp queries.
timestamp queries are always binned in an active scene, therefore
always have a result.
2013-06-27 19:06:40 +02:00
Roland Scheidegger
59b8689d37 llvmpipe: fix a bug in opaque optimization
If there are queries active the opaque optimization resetting the bin needs to
be disabled.
(Not really tested since the bug was discovered by code inspection not
an actual test failure.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-27 19:06:40 +02:00
Vinson Lee
f12e551810 radeonsi/compute: Fix memory leak in radeonsi_launch_grid.
Fixes "Resource leak" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-06-27 10:03:33 -07:00
Tom Stellard
0e990736f3 clover: Fix build with LLVM 3.4
Reported on IRC by lordheavy
2013-06-27 10:03:33 -07:00
Bill York
191795eaf1 docs: updated instructions for Mesa on Windows
Signed-off-by: Brian Paul <brianp@vmware.com>
2013-06-27 09:49:41 -06:00
Matthew McClure
e87fc11cac postprocess: handle partial initialization failures.
This patch fixes segfaults observed when enabling the post processing
features. When the format is not supported, or a texture cannot be
created, the code must gracefully handle failure and report the error to
the calling code for proper failure handling.

To accomplish this the following changes were made to the filters.h
prototypes:

- bool return for pp_init_func
- Added pp_free_func for filter specific resource destruction

Fixes segfaults from backtraces:

* util_destroy_blit
  pp_free

* u_transfer_inline_write_vtbl
  pp_jimenezmlaa_init_run
  pp_init

This patch also uses tgsi_alloc_tokens to allocate temporary tokens in
pp_tgsi_to_state, instead of allocating the array on the stack. This
fixes the following stack corruption segfault in pp_run.c:

* _int_free
  aaline_delete_fs_state
  pp_free

Bug Number: 1021843
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-27 09:44:29 -06:00
Brian Paul
482c43a946 glx: return True/False instead of GL_TRUE/GL_FALSE
Just to be consistent with the functions' Bool return type.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-27 07:48:19 -06:00
Brian Paul
d171bc9d19 glx: move declarations before code
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-27 07:48:18 -06:00
Brian Paul
d43548ca37 mesa: move declarations before code
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-27 07:48:18 -06:00
José Fonseca
15085b477b glsl: Use the C99 variadic macro syntax.
MSVC does not support the old GCC syntax.

See also
http://gcc.gnu.org/onlinedocs/gcc/Variadic-Macros.html
2013-06-27 07:44:11 +01:00
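For reference, the two macro spellings differ only in how the variable arguments are named. The sketch below uses the C99 form, with the GCC-only form shown in a comment. `FORMAT` and `describe_swizzle()` are made-up names for illustration and are not taken from the glsl sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* C99 form: "..." in the parameter list, __VA_ARGS__ in the body.
 * The older GCC-only named form would read:
 *   #define FORMAT(buf, size, args...) snprintf((buf), (size), args)
 */
#define FORMAT(buf, size, ...) snprintf((buf), (size), __VA_ARGS__)

/* Formats a two-component description into buf; returns the length
 * snprintf() reports. */
static int
describe_swizzle(char *buf, size_t size, int x, int y)
{
    return FORMAT(buf, size, "swizzle(%d, %d)", x, y);
}
```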
José Fonseca
bcd6f3b23c scons: Add dependencies to all .xml files.
Should prevent stuck builds when only some of the included .xml files
change.
2013-06-27 07:25:10 +01:00
Chia-I Wu
9f3cfe6aaf ilo: plug a potential index buffer leak
This is harmless since st_context and u_vbuf both set index buffer to NULL
before destroying themselves.  But we do not want to rely on that behavior.
2013-06-27 11:46:58 +08:00
Roland Scheidegger
eabe068747 softpipe: honor predication for clear_render_target and clear_depth_stencil
trivial, copied from llvmpipe

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 23:17:53 +02:00
Roland Scheidegger
2e4da1f594 llvmpipe: add support for nested / overlapping queries
OpenGL doesn't support this but d3d10 does.
It is a bit of a pain as it is necessary to keep track of queries
still active at the end of a scene, which is also why I cheat a bit
and limit the amount of simultaneously active queries to (arbitrary)
16 (simplifies things because we don't have to deal with a real list
that way). I can't think of a reason why you'd really want large
numbers of overlapping/nested queries so it is hopefully fine.
(This only affects queries which need to be binned.)

v2: don't copy remainder of array when deleting an entry; simply replace
the deleted entry with the last one (order doesn't matter).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 23:17:53 +02:00
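The v2 note above (replace the deleted entry with the last one, since order does not matter) is the usual swap-with-last removal on a fixed array. A small sketch, with hypothetical structure names and the limit of 16 taken from the commit message:

```c
#include <assert.h>

#define MAX_ACTIVE_QUERIES 16

struct query {
    int id;
};

/* Hypothetical fixed-size list of queries still active in a scene. */
struct active_queries {
    struct query *entries[MAX_ACTIVE_QUERIES];
    unsigned count;
};

/* Returns 0 if the list is already full, 1 on success. */
static int
add_query(struct active_queries *aq, struct query *q)
{
    if (aq->count >= MAX_ACTIVE_QUERIES)
        return 0;
    aq->entries[aq->count++] = q;
    return 1;
}

/* Order does not matter, so the removed slot is overwritten with the
 * last entry instead of shifting the remainder of the array. */
static int
remove_query(struct active_queries *aq, const struct query *q)
{
    for (unsigned i = 0; i < aq->count; i++) {
        if (aq->entries[i] == q) {
            aq->entries[i] = aq->entries[aq->count - 1];
            aq->count--;
            return 1;
        }
    }
    return 0;
}
```

Removal touches a single slot, at the cost of not preserving insertion order.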
Roland Scheidegger
0820342880 llvmpipe: rework query logic
Previously lp_rast_begin_query commands were always inserted into each bin,
and re-issued if the scene was restarted, while lp_rast_end_query commands
were executed for each still active query at the end of tile rasterization.
Also, the ps_invocations and vis_counter were set to zero when the respective
command was encountered.
This however cannot work for multiple queries of the same type (note that
occlusion counter and occlusion predicate while different type were also
affected).
So, change the logic to always set the ps_invocations and vis_counter to zero
at the start of tile rasterization, and then use "start" and "end" per-thread
query values when encountering the begin/end query commands instead, which
should work for multiple queries of the same type. This also means queries do
not have to be reissued in a new scene, however they still need to be finished
at end of tile rasterization, so a list of queries still active at the end of
a scene needs to be maintained.
Also while here don't bin the queries which don't do anything in rasterization.
(This change does not actually handle multiple queries of the same type yet,
as the list of active queries is just a simple fixed array and setup can still
only have one query active per type.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 23:17:53 +02:00
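The reworked counting scheme described above can be sketched like this: the per-thread counter is zeroed once per tile, each query snapshots it at begin, and the difference is accumulated at end. The types and function names are illustrative assumptions, not llvmpipe's actual rasterizer code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread rasterizer state. */
struct thread_state {
    uint64_t vis_counter;
};

struct query {
    uint64_t start;  /* counter value when the query began in this tile */
    uint64_t result; /* accumulated across tiles */
};

/* The counter is reset at the start of tile rasterization, not per query. */
static void
begin_tile(struct thread_state *t)
{
    t->vis_counter = 0;
}

static void
begin_query(const struct thread_state *t, struct query *q)
{
    q->start = t->vis_counter;
}

static void
end_query(const struct thread_state *t, struct query *q)
{
    q->result += t->vis_counter - q->start;
}
```

Because no query ever resets the shared counter, two overlapping queries each see only the samples counted between their own begin and end.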
Eric Anholt
3dbba95b72 i965: Move the remaining intel code to the i965 directory.
Now that i915's forked off, they don't need to live in a shared directory.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Adam Jackson <ajax@redhat.com>
(and I hear second hand that idr is OK with it, too)
2013-06-26 12:28:26 -07:00
Eric Anholt
733d32f376 i915: Fork the shared code from i965.
Of the 15000 lines of code in intel/, we've identified 4000 lines that
are trivially unnecessary for i915, and another 1000 that are pointless for
i965, and expect to find more as time goes on.  Split the i915 driver off,
so that we can continue active development on i965 without worrying about
breaking i915.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Adam Jackson <ajax@redhat.com>
(and I hear second hand that idr is OK with it, too)
2013-06-26 12:28:25 -07:00
Eric Anholt
43a6795a1f i915: Remove dead symlink.
2013-06-26 12:28:25 -07:00
Eric Anholt
fc32d40534 glx: Fix another missed glMultiDrawElementsEXT const change.
The build was broken for me since
b7d9478f36.
2013-06-26 12:28:25 -07:00
Ian Romanick
c170c901d0 glsl: Move all var decls to the front of the IR list in reverse order
This has the (intended!) side effect that vertex shader inputs and
fragment shader outputs will appear in the IR in the same order that
they appeared in the shader code.  This results in the locations being
assigned in the declared order.  Many (arguably buggy) applications
depend on this behavior, and it matches what nearly all other drivers
do.

Fixes the (new) piglit test attrib-assignments.

NOTE: This is a candidate for stable release branches (and requires the
previous commit to prevent a regression in OpenGL ES 2.0 conformance
test stencil_plane_operation).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-26 12:27:23 -07:00
Ian Romanick
329cd6a9b1 i965: Be more careful with the interleaved user array upload optimization
The checks to determine when the data can be uploaded in an interleaved
fashion can be tricked by certain data layouts.  For example,

    float data[...];

    glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 16, &data[0]);
    glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, 16, &data[4]);
    glDrawArrays(GL_POINTS, 0, 1);

will hit the interleaved path with an incorrect size (16 bytes instead
of 32 bytes).  As a result, the data for attribute 1 never gets
uploaded.  The single element draw case is the only sensible case I can
think of for non-interleaved-that-looks-like-interleaved data, but there
may be others as well.

To fix this, make sure that the end of the element in the array being
checked is within the stride "window."  Previously the code would check
that the beginning of the element was within the window.

NOTE: This is a candidate for stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 12:27:23 -07:00
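The difference between the two window checks can be shown with the data layout from the commit message. This is a sketch only: `begins_in_window()` is an assumed reconstruction of the looser test, and neither function is the actual i965 upload code.

```c
#include <assert.h>
#include <stddef.h>

/* Looser test (assumed form): accepts an element whose start lies no
 * further than one stride from the base pointer. */
static int
begins_in_window(const char *base, const char *elem, size_t stride)
{
    return elem >= base && (size_t)(elem - base) <= stride;
}

/* Stricter test: the END of the element must fall inside the stride
 * window that starts at base. */
static int
ends_in_window(const char *base, const char *elem,
               size_t elem_size, size_t stride)
{
    return elem >= base && (size_t)(elem - base) + elem_size <= stride;
}
```

With a stride of 16 and 4-byte floats, attribute 1 at `&data[4]` starts 16 bytes in and ends at byte 32. The looser test accepts it as interleaved, while the stricter one rejects it.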
Brian Paul
b7d9478f36 mesa: add const qualifier to glMultiDrawElementsEXT() indices param
The 20130624 version of glext.h changed this to match the
glMultiDrawElements() function which already had the extra const
qualifier.

Fixes warnings/errors that seem to vary from one compiler to the next.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 13:12:01 -06:00
Brian Paul
15436adab0 mesa: remove const from glDebugMessageCallbackARB() function parameter
The new 20130624 version of glext.h removed the const qualifier on
the 'userParam' parameter.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 13:12:01 -06:00
Kenneth Graunke
dd0b99b0be i965/vs: Combine code generation's inst->opcode switch statements.
vec4_visitor::generate_code() switches on vec4_instruction::opcode and
calls into the brw_eu_emit.c layer to generate code for some of them.
It then has a default case which calls generate_vec4_instruction() to
handle the rest...which switches on opcode and handles the rest of the
cases.

The split apparently is that generate_code() handles the actual hardware
opcodes (BRW_OPCODE_*) while generate_vec4_instruction() handles the
virtual opcodes (SHADER_OPCODE_* and VS_OPCODE_*).  But this looks
fairly arbitrary, and it makes more sense to combine the two switches.

This patch moves the cases from generate_code() into the helper function
so that generate_code() isn't as large.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-26 11:25:13 -07:00
Kenneth Graunke
55272883ac i965: Remove broken source type assertions from brw_alu3().
Commit 526ffdfc03 attempted to generalize
the source register type assertions to allow D and UD.  However, the
src1 and src2 assertions actually checked src0.type against D and UD due
to a copy and paste bug.

It also began setting the source and destination register types based on
dest.type, ignoring src0/src1/src2.type completely.  BFE and BFI2 may
actually pass mixed D/UD types and expect them to be ignored, which is
arguably a bit sloppy, but not too crazy either.

This patch simply removes the source register assertions as those values
aren't used anyway.  It also clarifies the comment above the block that
sets the register types.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-06-26 11:25:13 -07:00
Kenneth Graunke
9321f3257f i965: Add back strict type assertions for MAD and LRP.
Commit 526ffdfc03 relaxed the type
assertions in brw_alu3 to allow D/UD types (required by BFE and BFI2).
This lost us the strict type checking for MAD and LRP, which require
all four types to be float.

This patch adds a new ALU3F wrapper which checks these once again.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2013-06-26 11:25:12 -07:00
Kenneth Graunke
4563dfe23a glsl: Streamline the built-in type handling code.
Over the last few years, the compiler has grown to support 7 different
language versions and 6 extensions that add new built-in types.  With
more and more features being added, some of our core code has devolved
into an unmaintainable spaghetti of sorts.

A few problems with the old code:
1. Built-in types are declared...where exactly?

   The types in builtin_types.h were organized in arrays by the language
   version or extension they were introduced in.  It's factored out to
   avoid duplicates---every type only exists in one array.  But that
   means that sampler1D is declared in 110, sampler2D is in core types,
   sampler3D is a unique global not in a list...and so on.

2. Spaghetti call-chains with weird parameters:

   generate_300ES_types calls generate_130_types which calls
   generate_120_types and generate_EXT_texture_array_types, which calls
   generate_110_types, which calls generate_100ES_types...and more

   Except that ES doesn't want 1D types, so we have a skip_1d parameter.
   add_deprecated also falls into this category.

3. Missing type accessors.

   Common types have convenience pointers (like glsl_type::vec4_type),
   but others may not be accessible at all without a symbol table (for
   example, sampler types).

4. Global variable declarations in a header file?

   #include "builtin_types.h" in two C++ files would break the build.

The new code addresses these problems.  All built-in types are declared
together in a single table, independent of when they were introduced.
The macro that declares a new built-in type also creates a convenience
pointer, so every type is available and it won't get out of sync.

The code to populate a symbol table with the appropriate types for a
particular language version and set of extensions is now a single
table-driven function.  The table lists the type name and GL/ES versions
when it was introduced (similar to how the lexer handles reserved
words).  A single loop adds types based on the language version.
Explicit extension checks then add additional types.  If they were
already added based on the language version, glsl_symbol_table simply
ignores the request to add them a second time, meaning we don't need
to worry about duplicates and can simply list types where they belong.

v2: Mark uvecs and shadow samplers as ES3 only, and 1DArrayShadow as
    unsupported in ES entirely.  Add a touch more doxygen.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-26 11:25:12 -07:00
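A rough C sketch of the table-driven approach described above: one table lists each type with the GL and ES versions that introduced it, a single loop adds the matching types, and adding a name a second time is a no-op. The table contents, `struct symbol_table`, and the function names are simplified assumptions and do not mirror the real glsl_symbol_table API.

```c
#include <assert.h>
#include <string.h>

/* A 0 in a version column means "not available in that API". */
struct builtin_type {
    const char *name;
    unsigned min_gl;
    unsigned min_es;
};

/* Illustrative subset only. */
static const struct builtin_type builtin_types[] = {
    { "float",     110, 100 },
    { "sampler1D", 110, 0   },
    { "sampler3D", 110, 300 },
    { "uvec4",     130, 300 },
};

#define MAX_SYMBOLS 16

struct symbol_table {
    const char *names[MAX_SYMBOLS];
    unsigned count;
};

static int
has_type(const struct symbol_table *st, const char *name)
{
    for (unsigned i = 0; i < st->count; i++)
        if (strcmp(st->names[i], name) == 0)
            return 1;
    return 0;
}

/* Adding a name that is already present is silently ignored. */
static void
add_type(struct symbol_table *st, const char *name)
{
    if (!has_type(st, name) && st->count < MAX_SYMBOLS)
        st->names[st->count++] = name;
}

/* One loop over the table replaces the chain of generate_*_types calls. */
static void
populate_types(struct symbol_table *st, unsigned version, int is_es)
{
    size_t n = sizeof(builtin_types) / sizeof(builtin_types[0]);
    for (size_t i = 0; i < n; i++) {
        unsigned min = is_es ? builtin_types[i].min_es
                             : builtin_types[i].min_gl;
        if (min != 0 && version >= min)
            add_type(st, builtin_types[i].name);
    }
}
```

Extension checks could then call `add_type()` directly without first checking whether the version loop already added the type.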
Kenneth Graunke
818da74af5 glsl: Don't use random pointers as an array of glsl_type objects.
Using a random glsl_type convenience pointer as an array is a really bad
idea, for all the reasons mentioned in the previous commit.

The new glsl_type::bvec() function is simpler anyway.

Prevents breakage in the next commit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-26 11:25:12 -07:00
Kenneth Graunke
4530ed4f26 glsl: Stop being clever with pointer arithmetic when fetching types.
Currently, vector types are linked together closely: the glsl_type
objects for float, vec2, vec3, and vec4 are all elements of the same
array, in that exact order.  This makes it possible to obtain vector
types via pointer arithmetic on the scalar type's convenience pointer.
For example, float_type + (3 - 1) = vec3.

However, relying on this is extremely fragile.  There's no particular
reason the underlying type objects need to be stored in an array.  They
could be individual class members, possibly with padding between them.
Then the pointer arithmetic would break, and we'd get bad pointers to
non-heap allocated data, causing subtle breakage that can't be detected
by valgrind.  Cue insanity.

Or someone could simply reorder the type variables, causing us to get
the wrong type entirely.  Also cue insanity.

Writing this explicitly is much safer.  With the new helper functions,
it's a bit less code even.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-26 11:25:12 -07:00
Kenneth Graunke
d367a1cbdb glsl: Add simple vector type accessor helpers.
This patch introduces new functions to quickly grab a pointer to a
vector type.  For example:

   glsl_type::bvec(4)   returns   glsl_type::bvec4_type
   glsl_type::ivec(3)   returns   glsl_type::ivec3_type
   glsl_type::uvec(2)   returns   glsl_type::uvec2_type
   glsl_type::vec(1)    returns   glsl_type::float_type

This is less wordy than glsl_type::get_instance(GLSL_TYPE_BOOL, 4, 1),
which can help avoid extra word wrapping.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-26 11:25:12 -07:00
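In plain C rather than the project's C++, an explicit accessor of this kind might look like the sketch below. The type objects are deliberately separate variables, so nothing depends on their layout in memory. `struct type` and the NULL return for an invalid size are assumptions of this sketch, not the behaviour of the real glsl_type helpers.

```c
#include <assert.h>
#include <stddef.h>

struct type {
    const char *name;
    unsigned components;
};

/* Separate objects: no assumption that they are adjacent array elements. */
static const struct type float_type = { "float", 1 };
static const struct type vec2_type  = { "vec2",  2 };
static const struct type vec3_type  = { "vec3",  3 };
static const struct type vec4_type  = { "vec4",  4 };

/* Explicit lookup instead of "float_type + (components - 1)". */
static const struct type *
vec(unsigned components)
{
    switch (components) {
    case 1: return &float_type;
    case 2: return &vec2_type;
    case 3: return &vec3_type;
    case 4: return &vec4_type;
    default: return NULL; /* sketch-only choice for invalid sizes */
    }
}
```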
Brian Paul
9a14e412d6 mesa: update glext.h to version 20130624
In glapi_priv.h we always need the typedef for the GLclampx type
since GL_OES_fixed_point is now defined in glext.h but the
GLclampx type is not.  GLclampx is not used by anything in glext.h
but we need it for GL ES dispatch.

This is a huge patch because the structure of the file has been
changed.

The following extensions are new, however:

GL_AMD_interleaved_elements
GL_AMD_shader_trinary_minmax
GL_IBM_static_data
GL_INTEL_map_texture
GL_NV_compute_program5
GL_NV_deep_texture3D
GL_NV_draw_texture
GL_NV_shader_atomic_counters
GL_NV_shader_storage_buffer_object
GL_NVX_conditional_render
GL_OES_byte_coordinates
GL_OES_compressed_paletted_texture
GL_OES_fixed_point
GL_OES_query_matrix
GL_OES_single_precision

And these extensions were removed:

GL_FfdMaskSGIX
GL_INGR_palette_buffer
GL_INTEL_texture_scissor
GL_SGI_depth_pass_instrument
GL_SGIX_fog_scale
GL_SGIX_impact_pixel_texture
GL_SGIX_texture_select

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-06-26 10:43:27 -06:00
Brian Paul
bc6eb8068f st/mesa: add casts to silence MSVC warnings
2013-06-26 10:42:59 -06:00
Brian Paul
202299d16e st/mesa: make rtt_level, face, slice unsigned to silence MSVC warnings
2013-06-26 10:42:59 -06:00
Brian Paul
2285645aa2 hud: add float casts to silence MSVC warnings
2013-06-26 10:42:59 -06:00
Brian Paul
87d5a16927 hud: include stdio.h since we use fprintf(), fscanf(), etc
2013-06-26 10:42:59 -06:00
Brian Paul
61964a9ceb hud: add cast to silence MSVC warning
2013-06-26 10:42:59 -06:00
Brian Paul
f06e60fde4 os: add cast in os_time_sleep() to silence MSVC warning
2013-06-26 10:42:59 -06:00
Brian Paul
21f8729c3d vega: add some casts to silence MSVC warnings
2013-06-26 10:42:59 -06:00
Brian Paul
4d452f1988 util: int/unsigned changes to silence some MSVC warnings
2013-06-26 10:42:59 -06:00
Brian Paul
bbdd7cfb8b util: add some casts to silence some MSVC warnings
2013-06-26 10:42:59 -06:00
Brian Paul
aab8ca8fd1 util: s/int/unsigned/ to silence some MSVC warnings
2013-06-26 10:42:58 -06:00
Maarten Lankhorst
e72cc26518 nvc0: set rsvd_kick correctly
This prevents trampling beyond the end of the command stream during flushes.

NOTE: This is a candidate for the stable branches.

Reported-by: Christoph Bumiller <christoph.bumiller@speed.at>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-06-26 16:50:08 +02:00
Maarten Lankhorst
30c2c34464 nvc0: fix push_space checks for video decoding
2013-06-26 16:18:42 +02:00
Vinson Lee
e6479b4330 ilo: Remove max_threads dead code path.
max_threads cannot be greater than 28. It is either 21 or 28.

Fixes "Logically dead code" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2013-06-26 21:51:07 +08:00
Jean-Sébastien Pédron
c6d52f2290 winsys/intel: fix typo in "ETIMEOUT"
Should be "ETIMEDOUT".

[olv: commit message slightly re-formatted]

Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2013-06-26 21:51:07 +08:00
Chia-I Wu
c610b67972 ilo: use a bitmask for enabled constant buffers
Looping over 4 * 13 constant buffers while in most cases only two are enabled
is stupid.
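
A minimal sketch of the idea, assuming a 32-bit mask with one bit per enabled slot. The helper name and signature are hypothetical and are not taken from the ilo driver; the point is only that the loop runs once per set bit rather than once per possible slot.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from ilo): store the index of every set bit of
 * enabled_mask in slots_out and return how many indices were stored.
 */
static unsigned
collect_enabled_slots(uint32_t enabled_mask, int *slots_out, unsigned max_slots)
{
   unsigned count = 0;

   assert(slots_out);

   while (enabled_mask && count < max_slots) {
      int slot = 0;

      /* Find the index of the lowest set bit. */
      while (!(enabled_mask & (UINT32_C(1) << slot)))
         slot++;

      slots_out[count++] = slot;

      /* Clear the lowest set bit, so each enabled slot is visited once. */
      enabled_mask &= enabled_mask - 1;
   }

   return count;
}
```

With a mask of 0x5 the helper reports slots 0 and 2 and never looks at the other 30 positions individually.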
2013-06-26 21:50:26 +08:00
Maarten Lankhorst
9aebad618c vl/mpeg12: handle mpeg-1 bitstreams more correctly
Add support for D-frames.
Add support for slices ending on a different horizontal row of macroblocks.
2013-06-26 11:40:47 +02:00
Chia-I Wu
95c21f12f3 ilo: support PIPE_CAP_USER_INDEX_BUFFERS
We want to access the user buffer, if available, when primitive restart is
enabled and the restart index/primitive type is not natively supported.

And since we are handling index buffer uploads in the driver with this change,
we can also work around misalignment of index buffer offsets.
2013-06-26 16:42:46 +08:00
Chia-I Wu
5fb5d4f0a6 ilo: make pipe_draw_info a context state
Rename ilo_finalize_states() to ilo_finalize_3d_states(), and bind
pipe_draw_info to the context when it is called.  This saves us from having to
pass pipe_draw_info around in several places.
2013-06-26 16:42:46 +08:00
Chia-I Wu
3eb6754e94 ilo: support PIPE_CAP_USER_CONSTANT_BUFFERS
We need it for HUD support, and will need it for push constants in the future.
2013-06-26 16:42:45 +08:00
Eric Anholt
79385950f3 i915: Drop dead batch dumping code.
Batch dumping is now handled by shared code in libdrm.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:12 -07:00
Eric Anholt
57407bcaf8 intel: Drop little bits of dead code.
I noticed these while building the fork-i915 branch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:12 -07:00
Eric Anholt
88514d922e i965: Stop recomputing the miptree's size from the texture image.
We've already computed what the dimensions of the miptree are, and stored
it in the miptree.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:12 -07:00
Eric Anholt
820325b258 i965: Drop unused argument to translate_tex_format().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:11 -07:00
Eric Anholt
c20f973c4f i965/gen4-5: Stop using bogus polygon_offset_scale field.
The polygon offset math used for triangles by the WM is "OffsetUnits * 2 *
MRD + OffsetFactor * m" where 'MRD' is the minimum resolvable difference
for the depth buffer (~1/(1<<16) or ~1/(1<<24)), 'm' is the approximated
slope from the GL spec, and '2' is this magic number from the original
i965 code dump that we deviate from the GL spec by because "it makes glean
work" (except that it doesn't, because of some hilarity with 0.5 *
approximately 2.0 != 1.0.  go glean!).

This clipper code for unfilled polygons, on the other hand, was doing
"OffsetUnits * garbage + OffsetFactor * m", where garbage was MRD in the
case of a 16-bit depth visual (regardless of the FBO's depth resolution), or
128 * MRD for a 24-bit depth visual.

This change just makes the unfilled polygons behavior match the WM's
filled polygons behavior.
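
The WM formula quoted above can be written out as a small sketch. The function names are hypothetical, and MRD is approximated as exactly 1/(1 << bits), which the message itself only gives as an approximation.

```c
#include <assert.h>

/* Approximate minimum resolvable difference for a 16- or 24-bit depth
 * buffer, taken as 1 / 2^bits (an assumption based on the "~" values above).
 */
static float
approx_mrd(unsigned depth_bits)
{
   assert(depth_bits == 16 || depth_bits == 24);
   return 1.0f / (float)(1u << depth_bits);
}

/* OffsetUnits * 2 * MRD + OffsetFactor * m, where m is the approximated
 * slope from the GL spec.
 */
static float
polygon_offset(float units, float factor, float m, unsigned depth_bits)
{
   return units * 2.0f * approx_mrd(depth_bits) + factor * m;
}
```

Because MRD is derived from the depth buffer's own bit depth, the same units value yields a smaller offset on a 24-bit buffer than on a 16-bit one.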

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:11 -07:00
Eric Anholt
dba46831b0 i915: Use the current drawbuffer's depth for polygon offset scale.
There's no reason to care about the window system visual's depth for
handling polygon offset in an FBO, and it could only lead to pain.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:11 -07:00
Eric Anholt
c31aee99f3 intel: Add perf debug for glCopyPixels() fallback checks.
The separate function for the fallback checks wasn't particularly
clarifying things, so I put the improved checks in the caller.  (Note that
the dropped _mesa_update_state() had already happened once at the start of
the caller)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:11 -07:00
Eric Anholt
a2ca98b211 i965: Add debug to INTEL_DEBUG=blorp describing hiz/blit/clear ops.
I think we've all added instrumentation at one point or another to see
what's being called in blorp.  Now you can quickly get output like:

Testing glCopyPixels(depth).
intel_hiz_exec depth clear to mt 0x16d9160 level 0 layer 0
intel_hiz_exec depth resolve to mt 0x16d9160 level 0 layer 0
intel_hiz_exec hiz ambiguate to mt 0x16d9160 level 0 layer 0
intel_hiz_exec depth resolve to mt 0x16d9160 level 0 layer 0

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-26 01:07:11 -07:00
Eric Anholt
da00782ed8 ra: Fix register spilling.
Commit 551c991606 tried to avoid spilling
registers that were trivially colorable.  But since we do optimistic
coloring, the top of the stack also contains nodes that are not trivially
colorable, so we need to consider them for spilling (since they are some
of our best candidates).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58384
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63674
NOTE: This is a candidate for the 9.1 branch.
2013-06-26 01:07:11 -07:00
Eric Anholt
c6d74a4992 i965/fs: Dump IR when fatally not compiling due to bad register spilling.
It should never happen, but it does, and at this point, you're going to
_mesa_problem() and abort() (unless it's just in precompile).  Give the
developer something to look at.
2013-06-26 01:07:11 -07:00
Naohiro Aota
95e145aaee xmlpool/build: Make sure to set mo properly
Some shells do not set variables sequentially within a statement, i.e. "a=X
b=${a}" won't set "b" to "X" but to an empty value.

This patch introduces ";" to make sure "mo" is set properly before the "lang"
assignment.

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=471302
2013-06-25 21:22:56 -07:00
Eric Anholt
04e03d9645 i965: Remove the rest of brw_update_draw_buffer().
The last piece of code with an effect was flagging _NEW_BUFFERS.  Only,
that is already flagged from everything that calls this function: Mesa GL
state updates flag it before even calling down into the driver, and the
calls from the DRI2 window system framebuffer update path end up flagging
it as part of the ResizeBuffers() hook.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:22 -07:00
Eric Anholt
c39111509d i965: Stop updating FBO state on drawbuffers change.
The computed fields are updated appropriately as part of the normal draw
call path due to _NEW_BUFFERS being set.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:22 -07:00
Eric Anholt
9d523e3372 i965: Stop recomputing drawbuffer bounds on drawbuffer change.
For winsys FBOs, the bounds are appropriately updated immediately upon
_mesa_resize_framebuffer().  For user FBOs, they're updated as part of the
normal draw path state update due to _NEW_BUFFERS having been flagged.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
15c47481ba i965: Remove _NEW_DEPTH state flagging on drawbuffers change.
Of the places noting a _NEW_DEPTH dependency, all were already checking
for _NEW_BUFFERS if appropriate.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
94ecf913b4 intel: Stop doing special _NEW_STENCIL state flagging on drawbuffers.
2/3 packets depending on Stencil._Enabled already checked for
_NEW_BUFFERS, so just add _NEW_BUFFERS to the remaining one.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
3faccc42ad i965: Stop flagging viewport/scissor change on drawbuffers change.
The viewport (ctx->Viewport._WindowMap) doesn't change with drawable size
changes, and we update scissor (ctx->DrawBuffer->_Xmin and friends) on
_NEW_BUFFERS in things like brw_sf_state.c.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
438f85717d i965: Stop flagging _NEW_POLYGON on drawbuffers change.
Things like brw_sf.c that need to know about orientation are already
recomputing on _NEW_BUFFERS.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
b04c718ebd radeon: Remove gratuitous custom framebuffer resize code.
_mesa_resize_framebuffer(), the default value of the ResizeBuffers hook,
already checks for a window system framebuffer and walks the renderbuffers
calling AllocStorage().

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
17bc8fdb1d intel: Remove gratuitous custom framebuffer resize code.
_mesa_resize_framebuffer(), the default value of the ResizeBuffers hook,
already checks for a window system framebuffer and walks the renderbuffers
calling AllocStorage().

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:21 -07:00
Eric Anholt
d7165b383d mesa: Remove the Initialized field from framebuffers.
This existed to tell the core not to call GetBufferSize, except that even
if you didn't set it nothing happened because nobody had a GetBufferSize.

v2: Remove two more instances of setting the field (from Brian)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:20 -07:00
Eric Anholt
bab755ad1b mesa: Remove Driver.GetBufferSize and its callers.
Only the GDI driver set it to non-NULL any more, and that driver has a
Viewport hook that should keep it limping along as well as it ever has.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-25 19:19:20 -07:00
Vinson Lee
61bfed2d09 glsl: Fix gl_shader_program::UniformLocationBaseScale assert.
commit 26d86d26f9 added
gl_shader_program::UniformLocationBaseScale. According to the code
comments in that commit, UniformLocationBaseScale "must be >=1".

UniformLocationBaseScale is of type unsigned. Coverity reported a "Macro
compares unsigned to 0" defect as well.
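
A sketch of why the comparison matters, assuming the original assertion compared the unsigned field against 0. The struct and function below are hypothetical stand-ins, not the real gl_shader_program code.

```c
#include <assert.h>

/* Hypothetical stand-in for the structure that holds the scale. */
struct program_sketch {
   unsigned base_scale;
};

static void
set_base_scale(struct program_sketch *prog, unsigned scale)
{
   /* "scale >= 0" is true for every unsigned value, so it can never fire.
    * Checking ">= 1" is what the "must be >=1" comment actually requires.
    */
   assert(scale >= 1);
   prog->base_scale = scale;
}
```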

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-06-25 18:45:01 -07:00
Brian Paul
0b994961ff svga: allow 3D transfers in svga_texture_transfer_map()
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-25 17:54:24 -06:00
Brian Paul
808da7d8ca svga: use new svga_define_texture_level() helper
To get array bounds checking.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-25 17:54:24 -06:00
Brian Paul
2cc27c3faa svga: fix layer/level mix-up in svga_mark_surface_dirty()
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-25 17:54:24 -06:00
Brian Paul
04e3969597 svga: use new svga_age_texture_view() helper
The function does array bounds checking.  Note, this exposes a
bug in the svga_mark_surface_dirty() function: we're calling
svga_age_texture_view() with a texture slice instead of mipmap
level.  This can lead to a failed assertion.  That'll be fixed next.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-25 17:54:24 -06:00
Brian Paul
a4e4a413e5 svga: add array index assertion in svga_validate_sampler_view() 2013-06-25 17:54:24 -06:00
Brian Paul
82d6a52530 svga: use svga_texture() helper instead of casting 2013-06-25 17:54:23 -06:00
José Fonseca
464c6949cb util/debug: Cleanup/improve debug_symbol_name_dbghelp.
- use mgwhelp -- the successor for bfdhelp which does not have a hard
  dependency on BFD, and works on 64bits.
- use a macro instead of hand-typing to dispatch DbgHelp functions
- dump line numbers
- dump module names when symbols are not available
- support 64bits.
- add comments

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-25 18:41:59 +01:00
José Fonseca
a26f834a39 util/debug: Make debug_backtrace_capture work for 64bit windows.
Rely on Windows' CaptureStackBackTrace to do the grunt work.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-25 18:41:59 +01:00
Zack Rusin
29dacd9803 draw: allow overflows in the llvm paths
Because our code couldn't handle it we were skipping rendering
if we detected overflows. According to the spec we should
still render but with all 0 vertices, which is what the llvm
code already does. So for the llvm paths let's enable processing
even if an overflow condition has been detected.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-25 11:57:01 -04:00
Zack Rusin
f96326b2f6 draw: avoid overflows in the llvm draw loop
Before we could easily overflow if start+count>max integer. To
avoid it we can just iterate over the count. This makes sure
that we never crash, since most of the overflow conditions
are already handled.
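
The difference between the two loop shapes can be sketched in plain C. This is an illustration with hypothetical function names, not the generated LLVM IR; it only shows that a loop bounded by start + count breaks when the sum wraps, while a loop over count always runs count times.

```c
#include <assert.h>
#include <stdint.h>

/* End-bound form: start + count may wrap modulo 2^32, in which case the
 * loop body never runs. Returns the number of iterations executed.
 */
static uint32_t
iterations_with_end_bound(uint32_t start, uint32_t count)
{
   uint32_t end = start + count;
   uint32_t n = 0;

   for (uint32_t i = start; i < end; i++)
      n++;

   return n;
}

/* Count-based form: the trip count is always exactly count. The last index
 * visited is written to *last_index (unsigned wrap is well defined in C).
 */
static uint32_t
iterations_over_count(uint32_t start, uint32_t count, uint32_t *last_index)
{
   uint32_t n = 0;

   assert(last_index);

   for (uint32_t i = 0; i < count; i++) {
      *last_index = start + i;
      n++;
   }

   return n;
}
```

What happens to an index that wraps is a separate question; per the message above, the remaining overflow conditions are handled elsewhere.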

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-25 11:56:41 -04:00
Maarten Lankhorst
e2b02080d8 nvc0: do not set tiled mode on gart bo when fence debugging is used
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-06-25 13:34:15 +02:00
Chia-I Wu
c8240c9dea ilo: honor render condition in blitter
Make pass_render_condition() available for blitter, and check for render
condition in (and only in) clear(), clear_render_target(), and
clear_depth_stencil().
2013-06-25 15:38:07 +08:00
Chia-I Wu
5f4b769127 ilo: remove ilo_shader_internal.h from GEN6 pipeline
Replace direct shader accesses with ilo_shader_get_kernel_param(), etc.
2013-06-25 13:51:59 +08:00
Chia-I Wu
63165df90f ilo: remove ilo_shader_internal.h from GEN7 pipeline
Replace direct shader accesses with ilo_shader_get_kernel_param(), etc.
2013-06-25 13:51:59 +08:00
Chia-I Wu
855b684141 ilo: speed up ilo_shader_select_kernel_routing() a bit
Remember the order of the source attributes and avoid recomputation when it
does not change.
2013-06-25 13:51:59 +08:00
Chia-I Wu
9b18df6e08 ilo: move SBE setup code to ilo_shader.c
Add ilo_shader_select_kernel_routing() to construct 3DSTATE_SBE.  It is called
in ilo_finalize_states(), rather than in create_fs_state(), as it depends on
VS/GS and rasterizer states.

With this change, ilo_shader_internal.h is no longer needed for
ilo_gpe_gen6.c.
2013-06-25 13:51:58 +08:00
Chia-I Wu
c4fa24ff08 ilo: use ilo_shader_state exclusively in GPE
This allows us to remove ilo_shader_internal.h from ilo_gpe_gen7.c.  The
unfinished code in 3DSTATE_DS, 3DSTATE_HS, and INTERFACE_DESCRIPTOR_DATA are
partly or entirely removed.
2013-06-25 13:18:08 +08:00
Chia-I Wu
91cf6c1e92 ilo: map SO registers at shader compile time
The unmodified pipe_stream_output_info describes its outputs as if they are in
TGSI_FILE_OUTPUT.  Remap the register indices to where they appear in the VUE.

TGSI_SEMANTIC_PSIZE needs a little care because it is at the W channel.
2013-06-25 13:18:08 +08:00
Chia-I Wu
68522bf36c ilo: use ilo_shader_cso for FS
Add ilo_gpe_init_fs_cso() to construct 3DSTATE_PS and shader part of
3DSTATE_WM once and early for fragment shaders.
2013-06-25 13:18:08 +08:00
Chia-I Wu
639a2cddc6 ilo: use ilo_rasterizer_state exclusively in GPE
Replace pipe_rasterizer_state by ilo_rasterizer_state for the remaining GPE
functions for consistency.
2013-06-25 13:18:07 +08:00
Chia-I Wu
54ab03523b ilo: convert pipe_rasterizer_state to ilo_rasterizer_wm
Add ilo_gpe_init_rasterizer_wm() to construct fixed-function part of
3DSTATE_WM once in create_rasterizer_state().
2013-06-25 13:17:56 +08:00
Chia-I Wu
851202c319 ilo: use ilo_shader_cso for GS
Add ilo_gpe_init_gs_cso() to construct 3DSTATE_GS once and early for geometry
shaders.
2013-06-25 13:17:21 +08:00
Chia-I Wu
d209da5e33 ilo: introduce ilo_shader_cso for VS
When a new VS kernel is generated, a newly added function,
ilo_gpe_init_vs_cso(), is called to construct 3DSTATE_VS command in
ilo_shader_cso.  When the command needs to be emitted later, we copy the
command from the CSO instead of constructing it dynamically.
2013-06-25 12:42:04 +08:00
Chia-I Wu
5c8db569ab ilo: add functions to query shaders
Add ilo_shader_get_type() to query the type (PIPE_SHADER_x) of the shader.
Add ilo_shader_get_kernel_offset() and ilo_shader_get_kernel_param() to query
the cache offset and various kernel parameters of the selected kernel.
2013-06-25 12:28:54 +08:00
Chia-I Wu
96e2133e72 ilo: clean up finalize_shader_states()
Add ilo_shader_select_kernel() to replace the dependency table,
ilo_shader_variant_init(), and ilo_shader_state_use_variant().

With the changes, we no longer need to include ilo_shader_internal.h in
ilo_state.c.
2013-06-25 12:10:34 +08:00
Chia-I Wu
f0afedeb75 ilo: use multiple entry points for shader creation
Replace ilo_shader_state_create() by

 ilo_shader_create_vs()
 ilo_shader_create_gs()
 ilo_shader_create_fs()
 ilo_shader_create_cs()

Rename ilo_shader_state_destroy() to ilo_shader_destroy().  The old
ilo_shader_destroy() is renamed to ilo_shader_destroy_kernel().
2013-06-25 11:54:14 +08:00
Chia-I Wu
4d789c76dc ilo: move internal shader interface to a new header
Move it to ilo_shader_internal.h.  The goal is to make files not part of the
compiler include only ilo_shader.h eventually.
2013-06-25 11:51:26 +08:00
Brian Paul
e3cbb18321 gallium/hud: do not use free() for the free_query_data hook
That confuses Gallium's memory debugging code where CALLOC/MALLOC
must be matched with FREE, not free().

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-06-24 14:23:54 -06:00
Matthew McClure
e5bf19ac1c draw: check for out-of-memory conditions in the AA line module.
To prevent segfaults in the AA line module, the code will check for a
valid pointer to the aaline_stage in the draw context.

Fixes segfault from backtrace:

* aaline_stage_from_pipe
  aaline_delete_fs_state

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-24 08:36:47 -06:00
José Fonseca
06badea0da tests/graw: Fix typo in shader-leak.c 2013-06-24 15:29:25 +01:00
José Fonseca
a3d75db022 tools/trace: Fix syntax.
Cleaned/commented up the code, but forgot to actually test before
committing...
2013-06-24 15:28:48 +01:00
Richard Sandiford
5a0556f061 st/dri/sw: Fix pitch calculation in drisw_update_tex_buffer
swrastGetImage rounds the pitch up to 4 bytes for compatibility reasons
that are explained in drisw_glx.c:bytes_per_line, so drisw_update_tex_buffer
must do the same.
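
A minimal sketch of the rounding, assuming whole-byte pixel sizes. The helper name is hypothetical and the real bytes_per_line in drisw_glx.c may be written differently.

```c
#include <assert.h>

/* Row pitch in bytes, rounded up to the next multiple of 4. */
static unsigned
padded_bytes_per_line(unsigned width, unsigned bits_per_pixel)
{
   unsigned bytes;

   assert(bits_per_pixel % 8 == 0);

   bytes = width * (bits_per_pixel / 8);
   return (bytes + 3u) & ~3u;
}
```

On a 16-bit screen an odd width such as 101 pixels gives 202 bytes of pixel data but a 204-byte pitch, so a consumer that assumes 202 drifts by two bytes per row, which would show up as skew.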

Fixes window skew seen while running firefox over vnc on a 16-bit screen.

NOTE: This is a candidate for the stable branches.

[ajax: fixed typo in comment]

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-24 09:52:24 -04:00
Adam Jackson
2151d893fb gallium: Fix llvmpipe on big-endian machines
Squashed commit of the following:

commit 0857a7e105bfcbc4d1431b2cc56612094c747ca3
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:07 2013 -0400

    gallivm: Fix lp_build_rgba8_to_fi32_soa for big endian

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 0d65131649a8aa140e2db228ba779d685c4333e3
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:07 2013 -0400

    gallivm: Fix big-endian machines

    This adds a bit-shift count to the format table, and adds the concept of
    vector or bitwise alignment on gathers.

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 9740bda9b7dc894b629ed38be9b51059ce90818f
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:07 2013 -0400

    llvmpipe: Fix convert_to_blend_type on big-endian

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit ae037c2de0f029e4e99371c0de25560484f0d8df
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    util: Convert color pack to packed formats

    This fixes them on big-endian.

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 5b05ac0c89ae092ea8ba5bba9f739708d7396b5c
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    graw-xlib: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 51396e7d098cb6ff794391cf11afe4dbf86dbea0
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    format: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 417b60bc66eb450e68a92ab0e47f76e292b385e6
Author: Adam Jackson <ajax@redhat.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    st/dri: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 0934b2e022a5e0847d312c40734e2b44cac52fd8
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    st/xlib: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit a307ea3c3716a706963acce7966b5e405ba11db9
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    gbm: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 53eebdd253e1960a645ea278f31d7ef6a6cf4aeb
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    tests: Convert to packed formats

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 2f77fe3ee524945eacd546efcac34f7799fb3124
Author: Adam Jackson <ajax@redhat.com>
Date:   Tue Jun 18 13:07:37 2013 -0400

    gallium: Document packed formats

    Signed-off-by: Adam Jackson <ajax@redhat.com>

commit 1f1017159ce951f922210a430de9229f91f62714
Author: Richard Sandiford <r.sandiford@uk.ibm.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    gallium: Introduce 32-bit packed format names

    These are for interacting with buffers natively described in terms of
    bit shifts, like X11 visuals:

        uint32_t xyzw8888 = (x << 0) | (y << 8) | (z << 16) | (w << 24);

    Define these in terms of (endian-dependent) aliases to the array-style
    format names.
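
A small sketch of why the aliases have to be endian-dependent. The helper names are hypothetical; only the shift expression comes from the message above.

```c
#include <stdint.h>
#include <string.h>

/* Pack four 8-bit channels using the bit-shift layout quoted above. */
static uint32_t
pack_xyzw8888(uint8_t x, uint8_t y, uint8_t z, uint8_t w)
{
   return ((uint32_t)x << 0) | ((uint32_t)y << 8) |
          ((uint32_t)z << 16) | ((uint32_t)w << 24);
}

/* Non-zero if the host stores the least significant byte first. */
static int
host_is_little_endian(void)
{
   const uint32_t probe = 1;
   uint8_t first;

   memcpy(&first, &probe, 1);
   return first == 1;
}

/* The byte an array-style view of the packed value would see at offset 0. */
static uint8_t
first_byte_in_memory(uint32_t packed)
{
   uint8_t first;

   memcpy(&first, &packed, 1);
   return first;
}
```

The packed value is the same on every host, but the byte at offset 0 is x on a little-endian machine and w on a big-endian one, so the matching array-style format name differs between the two.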

    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>

commit 6cc7ab1ee66ed668da78c1d951dfd7782b4e786a
Author: Adam Jackson <ajax@redhat.com>
Date:   Mon Jun 3 12:10:32 2013 -0400

    gallium: Document format name conventions

    v2:
    - Fix a channel name thinko (Michel Dänzer)
    - Elaborate on SCALED versus INT
    - Add links to DirectX and FOURCC docs

    Signed-off-by: Adam Jackson <ajax@redhat.com>

commit df4d269e7fb62051a3c029b84147465001e5776e
Author: Adam Jackson <ajax@redhat.com>
Date:   Tue Jun 18 12:25:06 2013 -0400

    gallivm: Remove all notion of byte-swapping

    Signed-off-by: Adam Jackson <ajax@redhat.com>

Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-06-24 09:48:56 -04:00
Roland Scheidegger
d282f4ea9b llvmpipe: fix wrong results for queries not in a scene
The result isn't always 0 in this case (depends on query type),
so instead of special casing this just use the ordinary path (should result
in correct values thanks to initialization in query_begin/end), just
skipping the fence wait.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-22 17:09:37 +02:00
Brian Paul
a415aa9489 gallium/docs: more documentation for pipe_resource::array_size
It should never be zero and for cube/cube_arrays it should be a
multiple of six.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-06-22 08:50:15 -06:00
Brian Paul
cba7939790 svga: minor cleanups, comments in svga_tgsi_insn.c 2013-06-22 08:49:09 -06:00
Brian Paul
b03f394508 svga: add null ptr check in svga_get_tex_sampler_view()
Trivial.
2013-06-22 08:49:09 -06:00
José Fonseca
67bfdea933 tools/trace: Several tweaks/fixes to dump_state 2013-06-22 12:30:39 +01:00
José Fonseca
545d3d32d8 trace: Dump result of create_stream_output_target 2013-06-22 12:30:39 +01:00
Maarten Lankhorst
6aabd9490c vl/mpeg12: fix mpeg-1 bytestream parsing
This fixes the bytestream parsing of mpeg-1 stream, but still leaves
open a number of issues with the interpretation:
- IDCT mismatch control is not correct for MPEG-1.
- Slices do not have to start and end on the same horizontal row of macroblocks.
- picture_coding_type = 4 (D-pictures) is not handled.
- full_pel_*_vector is not handled.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-06-22 09:40:15 +02:00
Rob Clark
efdc6caaf5 freedreno/a3xx/compiler: ensure min # of cycles after bary instr
The results of a bary.f do not appear to be immediately available, but
there is no explicit sync bit.  Instead the compiler must just ensure
that there are a minimum number of instructions following the bary
before use of the result of the bary.  We aren't clever enough for that
so just throw in some nop's.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-21 15:37:05 -04:00
Rob Clark
d4aaa4439a freedreno/a3xx/compiler: add TGSI_OPCODE_ABS
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-21 15:37:05 -04:00
Rob Clark
fe4ae1163d freedreno/a3xx/compiler: add TGSI_OPCODE_DPH
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-21 15:37:05 -04:00
Rob Clark
3f965556b4 freedreno/a3xx/compiler: fix for replicating instructions
If we are accumulating result into tmp.x, and need a mov to final
destination, we want to move the .x component into all of the components
enabled from the real dest's writemask, ie. we want:

  MOV dst.xyzw tmp.xxxx

rather than:

  MOV dst.xyzw tmp.xyzw
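
The effect of the two swizzles can be sketched on plain float vectors. The function and enum names are hypothetical and this is not the freedreno instruction encoding.

```c
#include <assert.h>

enum { COMP_X, COMP_Y, COMP_Z, COMP_W };

/* dst.<writemask> = src.<swizzle>: for each enabled destination component,
 * copy the source component selected by the swizzle.
 */
static void
mov_swizzled(float dst[4], const float src[4], const int swizzle[4],
             unsigned writemask)
{
   for (int i = 0; i < 4; i++) {
      assert(swizzle[i] >= COMP_X && swizzle[i] <= COMP_W);
      if (writemask & (1u << i))
         dst[i] = src[swizzle[i]];
   }
}
```

With an xxxx swizzle every enabled destination component receives the accumulated tmp.x value, whereas an xyzw swizzle would copy whatever happens to be in tmp.y, tmp.z and tmp.w.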

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-06-21 15:37:05 -04:00
Eric Anholt
0343f20e2f mesa: Move the common _mesa_glsl_compile_shader() code to glsl/.
This code had no relation to ir_to_mesa.cpp, since it was also used by
intel and state_tracker, and most of it was duplicated with the standalone
compiler (which has periodically drifted from the Mesa copy).

v2: Split from the ir_to_mesa to shaderapi.c changes.

Acked-by: Paul Berry <stereotype441@gmail.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-21 10:04:30 -07:00
Eric Anholt
10c14d16d2 mesa: Move shader compiler API code to shaderapi.c
There was nothing ir_to_mesa-specific about this code, but it's not
exactly part of the compiler's core turning-source-into-IR job either.

v2: Split from the ir_to_mesa to glsl/ commit, avoid renaming the sh
    variable.

Acked-by: Paul Berry <stereotype441@gmail.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-21 10:04:29 -07:00
Eric Anholt
88398a817c mesa: Fix missing setting of shader->IsES.
I noticed this while trying to merge code with the builtin compiler, which
does set it.

Note that this causes two regressions in piglit in
default-precision-sampler.* which try to link without a vertex or fragment
shader, due to being run under the desktop glslparsertest binary (using
ARB_ES3_compatibility) that doesn't know about this requirement.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-21 10:04:29 -07:00
Eric Anholt
faf3dbad0d mesa: Use shared code for converting shader targets to short strings.
We were duplicating this code all over the place, and they all would need
updating for the next set of shader targets.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-21 10:04:29 -07:00
Eric Anholt
426ca34b7a glsl: Remove ir_print_visitor.h includes and usage
We have ir->print() to do the old declaration of a visitor and having the
IR accept the visitor (yuck!).  And now you can call _mesa_print_ir()
safely anywhere that you know what an ir_instruction is.

A couple of missing printf("\n")s are added in error paths -- when an
expression is handed to the visitor, it doesn't print '\n' (since it might
be a step in printing a whole expression tree).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-21 10:04:29 -07:00
Eric Anholt
2b049aa53e glsl: Make _mesa_print_ir() available from anything including ir.h.
No more forgetting to #include "ir_print_visitor.h" when doing temporary
debug code, or forgetting and leaving it in after removing your temporary
debug code.  Also, available from C code so you don't need to move the
caller to C++ just to call it (see also: ir_to_mesa.cpp).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-21 10:04:29 -07:00
Paul Berry
d0abac22c3 glsl: Make some files safe to include from C
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-21 10:04:28 -07:00
José Fonseca
2d7e837716 tools/trace: Quick instructions/notes.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:20 +01:00
José Fonseca
c14f516e58 tools/trace: Do a better job at comparing multi line strings.
For TGSI diffing.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:20 +01:00
José Fonseca
9b7d21f8f5 tools/trace: Tool to compare json state dumps.
Copied verbatim from apitrace's scripts/jsondiff.py
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:20 +01:00
José Fonseca
cc4ad695ca tools/trace: Tool to dump gallium state at any draw call.
Based on the code from the good old python state tracker.

Extremely handy to diagnose regressions in state trackers.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:20 +01:00
José Fonseca
a7bccb33b9 tools/trace: Defer blob hex-decoding.
To speed up parsing.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:19 +01:00
José Fonseca
a8f7e12d92 trace: Don't dump texture transfers.
Huge trace files with little value.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:19 +01:00
Chia-I Wu
bbd2d575e6 ilo: replace a boolean by bool
bool is used internally.  This is just cosmetic.
2013-06-20 11:40:20 +08:00
Chia-I Wu
8b2cba8f97 ilo: rename cache_seqno to uploaded
It has been used as a bool since shader cache rework.
2013-06-20 11:36:54 +08:00
Roland Scheidegger
ffebefa114 util: (trivial) add has_popcnt field
Not used yet but there's a couple of places in llvmpipe which should use this
(occlusion count is currently very inefficient if there's no cpu popcnt
instruction).
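
For context, a software bit count of the kind a caller would fall back to when has_popcnt is not set might look like the sketch below. The function name is hypothetical and this is not the llvmpipe code.

```c
#include <stdint.h>

/* Count set bits one at a time by clearing the lowest set bit per iteration.
 * A hardware popcnt instruction does the same job in a single operation.
 */
static unsigned
popcount64_fallback(uint64_t value)
{
   unsigned count = 0;

   while (value) {
      value &= value - 1;
      count++;
   }

   return count;
}
```

The loop runs once per set bit, which is the extra cost a coverage mask with many bits set would pay without the instruction.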
2013-06-19 23:47:36 +02:00
Roland Scheidegger
5c9aee111e llvmpipe: use 64bit counter for occlusion queries
Some APIs require 64bit and at least for 64bit archs the overhead
should be minimal.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:36 +02:00
Roland Scheidegger
dc5dc4fd94 llvmpipe: handle more queries
Handle PIPE_QUERY_GPU_FINISHED and PIPE_QUERY_TIMESTAMP_DISJOINT, and
also fill out the ps_invocations and c_primitives from the
PIPE_QUERY_PIPELINE_STATISTICS (the others in there should already
be handled). Note that ps_invocations isn't pixel exact, just 16 pixel
exact but I guess it's better than nothing.
Doesn't really seem to work correctly but there are probably bugs elsewhere.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:36 +02:00
Roland Scheidegger
bf5096303f softpipe: handle all queries, and change for the new disjoint semantics
The driver can do render_condition but wasn't handling the occlusion
and so_overflow predicates (though the latter might not work yet due
to gs support).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:36 +02:00
Roland Scheidegger
cdf89d0b5c gallium: fix PIPE_QUERY_TIMESTAMP_DISJOINT
The semantics didn't really make sense, matching neither d3d9
(though the docs are all broken there) nor d3d10. So make it match d3d10
semantics, which actually gives meaning to the "disjoint" part.
Drivers are fixed up in a very primitive way, I have no idea what could
actually cause the counter to become unreliable so just always return
FALSE for the disjoint part.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:35 +02:00
José Fonseca
a0a40805dd trace: Dump pipe_rasterizer_state::clip_halfz.
Trivial.
2013-06-19 18:16:16 +01:00
Brian Paul
1e16e48f88 svga: add some comments about primitive conversion
And clean up the svga_translate_prim() function with better
variable names.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
8b3d4efed8 indices: add some comments
This is pretty complicated code with few, if any, comments.  Here's a first stab.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
2e8c51c98f svga: reindent svga_tgsi.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
0de01a47dd svga: whitespace, comment, formatting fixes in svga_tgsi_emit.h
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
1f57349e20 svga: move some svga/tgsi functions
Move some functions from the svga_tgsi_insn.h header into the
svga_tgsi_insn.c file since they're only used there.  Plus, add
comments and fix formatting.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
3abd9285be svga: formatting fixes in svga_tgsi_insn.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:13 -06:00
Brian Paul
9e6c29bf12 mesa: wrap comments, code to 78 columns in multisample.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:13 -06:00
Brian Paul
bdd5a0c12b mesa: remove unused BITSET64 macros
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:13 -06:00
Maarten Lankhorst
f1cccd6ca0 nvc0: kill assert in ppp code
It's no longer always true, and the video tiling alignment should
ensure the alignment is handled correctly regardless.
2013-06-19 13:08:51 +02:00
Chia-I Wu
cf41fae96b ilo: rework shader cache
The new code makes the shader cache manage all shaders and lets it upload
all of them to a caller-provided bo as a whole.

Previously, we uploaded only the bound shaders.  When a different set of
shaders is bound, we had to allocate a new kernel bo to upload if the current
one is busy.
2013-06-19 16:46:42 +08:00
Emil Velikov
7f7b05d6b3 nv50: avoid crash on updating RASTERIZE_ENABLE state
When doing blit using the 3D engine, the rasterizer cso may be NULL.

Ported from nvc0 commit 8aa8b0539.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-06-19 00:02:24 +02:00
Kristian Høgsberg
712269d674 wayland: Handle global_remove event as well
We need to set up a handler for the global_remove event that gets sent
out when a global gets removed.  Without the handler we end up calling
a NULL pointer.

https://bugs.freedesktop.org/show_bug.cgi?id=65910

NOTE: This is a candidate for the stable branches.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2013-06-18 17:45:19 -04:00
Jordan Justen
adeda5afd4 gen7: fix GPU hang on WebGL texture-size test
When rendering to a texture with BaseLevel set, the miptree may be laid
out such that BaseLevel is in level 0 of the miptree (to avoid wasting
memory on unused levels between 0 and BaseLevel-1).  In that case, we
have to shift our render target's level down to the appropriate level of
the smaller miptree.

The WebGL test in combination with meta code relating to
glGenerateMipmap also triggered a similar failure scenario.

This GPU hang regression was introduced by c754f7a8.

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=65324
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-18 14:06:46 -07:00
Eric Anholt
248fddecd8 intel: Remove unused IS_POWER_OF_TWO() macro.
The is_power_of_two() inline function has been used instead.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-06-18 12:08:08 -07:00
Zack Rusin
9542131b27 Revert "draw: clear the draw buffers in draw"
This reverts commit 41966fdb3b.
While it's a lot cleaner it causes regressions because
the draw interface is always called from the draw functions
of the drivers (because the buffers need to be mapped) which
means that the stream output buffers end up being cleared on
every draw rather than on setting.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-17 21:43:10 -04:00
Roland Scheidegger
8975dc798d llvmpipe: fixes for conditional rendering
honor render_condition for clear_render_target and clear_depth_stencil.
Also add minimal support for occlusion predicate, though it can't be active
at the same time as an occlusion query yet.
While here also switchify some large if-else (actually just mutually
exclusive if-if-if...) constructs.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-18 18:01:24 +02:00
Roland Scheidegger
793e8e3d7e gallium: add condition parameter to render_condition
For conditional rendering this makes it possible to skip rendering
if either the predicate is true or false, as supported by d3d10
(in fact previously it was sort of implied skip rendering if predicate
is false for occlusion predicate, and true for so_overflow predicate).
There's no cap bit for this as presumably all drivers could do it trivially
(but this patch does not implement it for the drivers using true
hw predicates, nvxx, r600, radeonsi, no change is expected for OpenGL
functionality).
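
A rough sketch of the decision this describes, assuming that rendering is skipped exactly when the predicate result equals the supplied condition. The enum and both function names are hypothetical and are not the gallium interface itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate kinds, named after the two cases in the message. */
enum predicate_kind_sketch {
   PREDICATE_OCCLUSION_SKETCH,
   PREDICATE_SO_OVERFLOW_SKETCH
};

/* The condition that was implied before the parameter existed:
 * skip on false for occlusion, skip on true for so_overflow. */
static bool legacy_condition_sketch(enum predicate_kind_sketch kind)
{
   switch (kind) {
   case PREDICATE_OCCLUSION_SKETCH:
      return false;
   case PREDICATE_SO_OVERFLOW_SKETCH:
      return true;
   }
   assert(!"unknown predicate kind");
   return false;
}

/* Assumption: draw only when the predicate differs from the condition. */
static bool should_render_sketch(bool predicate, bool condition)
{
   return predicate != condition;
}
```

With an explicit condition, a caller can request either polarity for either predicate type instead of relying on the implied default.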

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-18 18:01:24 +02:00
Chia-I Wu
443dc15cf7 ilo: construct depth/stencil command in create_surface()
Add ilo_gpe_init_zs_surface() to construct

 3DSTATE_DEPTH_BUFFER
 3DSTATE_STENCIL_BUFFER
 3DSTATE_HIER_DEPTH_BUFFER

at surface creation time.  This allows fast state emission in draw_vbo().
2013-06-18 16:23:13 +08:00
Eric Anholt
eb20215075 intel: Allow blorp CopyTexSubImage to nonzero destination slices.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-17 15:43:23 -07:00
Eric Anholt
746b57ef0e intel: Allow blit CopyTexSubImage to nonzero destination slices.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-17 15:43:23 -07:00
Eric Anholt
b0e3c3b852 intel: Directly implement blit glBlitFramebuffer instead of awkward reuse.
This gets us support for blitting to attachment types other than
textures.

v2: fix up comments from review by Kenneth.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-06-17 15:43:23 -07:00
Eric Anholt
815dce9282 intel: Move XRGB->ARGB blit logic into intel_miptree_blit().
Now any caller (such as glCopyPixels()) can benefit from it, and it only
changes the correct subset of the destination instead of a whole teximage.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-17 15:43:23 -07:00
Eric Anholt
04a5e940c9 intel: Fix Y tiling support for glCopyTexSubImage's alpha override.
Apparently we don't have any piglit tests for this, because it would have
failed an assertion in a debug build, or just rendered wrong in a non-debug
build if the destination wasn't covering whole tiles.

v2: Use the new macros.

Reviewed-by: Paul Berry <stereotype441@gmail.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-06-17 15:43:23 -07:00
Eric Anholt
78c2fc5925 intel: Make batch macros for doing BCS_SWCTRL setup.
We're going to add more BCS_SWCTRL setup instances soon, and you have to
be careful to have the set and restore atomic with the rendering that's
done, so that our state doesn't leak out to other rendering processes.

v2: Rewrite the patch to have batch begin/advance macros so that magic
    numbers don't get sprinkled around (and so you don't mix up your
    do-I-need-to-reset vs what-do-I-reset-to logic, which I nearly did in
    the next patch when first writing it)

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-17 15:43:13 -07:00
Eric Anholt
b65b1c3148 mesa: Hide weirdness of 1D_ARRAY textures from Driver.CopyTexSubImage().
Intel had brokenness here, and I'd like to continue moving Mesa toward
hiding 1D_ARRAY's ridiculousness inside of the core, like we did with
MapTextureImage.  Fixes copyteximage 1D_ARRAY on intel.

There's still an impedance mismatch in meta when falling back to read and
texsubimage, since texsubimage expects coordinates into 1D_ARRAY as
(width, slice, 0) instead of (width, 0, slice).

v2: Fix offset of scanline reads from the source. (Thanks Brian!), replace
    dd.h comment with Paul's text and replace early exit with an assert.

Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Paul Berry <stereotype441@gmail.com> (v1)
2013-06-17 15:26:20 -07:00
Dave Airlie
9e8400f4c9 tgsi: text parser: fix parsing of array in declaration
I noticed this code didn't work as advertised while doing some passing around
of TGSI shaders and trying to reparse them, and things failing.

This seems to fix it here for at least the small test case I hacked into a
graw test.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-06-18 08:25:12 +10:00
Sven Joachim
0829b893a9 mesa: Fix ieee fp on Alpha
Commit 1f82bf12ed inadvertently broke it, checking for __IEEE_FLOAT on all
Alpha machines instead of only on VMS as before.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Signed-off-by: Sven Joachim <svenjoac@gmx.de>
2013-06-17 10:02:56 -07:00
Richard Sandiford
c132c2978b st/xlib: Fix XImage stride calculation
Fixes window skew seen while running gnome on a 16-bit screen over vnc.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-17 12:15:13 -04:00
Richard Sandiford
876fefe2ff st/xlib: Fix XImage bytes-per-pixel calculation
Fixes a crash seen while running gnome on a 16-bit screen over vnc.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-17 12:14:32 -04:00
Jonathan Gray
ebd68dd029 gallium: replace bswap_32 calls with util_bswap32
byteswap.h and bswap_32 aren't portable, replace them with calls to
gallium's util_bswap32 as suggested by Mark Kettenis.  Lets these files
build on OpenBSD.
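
For illustration only, a portable 32-bit byte swap can be written with shifts and masks, so no platform header such as byteswap.h is needed. The names bswap32_sketch and bswap32_array_sketch are hypothetical; this is not gallium's util_bswap32 implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical portable byte swap: reverse the four bytes of a word. */
static uint32_t bswap32_sketch(uint32_t v)
{
   return (v >> 24) |
          ((v >> 8) & 0x0000ff00u) |
          ((v << 8) & 0x00ff0000u) |
          (v << 24);
}

/* Hypothetical convenience wrapper: swap every word of a buffer in place. */
static void bswap32_array_sketch(uint32_t *words, size_t count)
{
   assert(words != NULL || count == 0);
   for (size_t i = 0; i < count; i++)
      words[i] = bswap32_sketch(words[i]);
}
```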

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-06-17 17:22:28 +02:00
Zack Rusin
7807763dd8 draw: fix a regression in computing max elt
gl can use elts without setting indices, in which case
our eltMax was set to 0 and always invoked the overflow
condition. So by default set eltMax to maximum, it will
be curbed by draw_set_indexes (if it ever comes) and if
not then it will let gl's glVertexPointer/glDrawArrays
work correctly. Fixes piglit's
triangle-rasterization-overdraw test.
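
A simplified sketch of the idea, assuming the overflow check compares an element index against a stored maximum. The struct and function names below are hypothetical and do not mirror the draw module's real data structures.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state: an optional index buffer and the element limit. */
struct draw_elts_sketch {
   const void *indices;
   unsigned elt_max;
};

/* Default to the maximum so draws without indices never "overflow".
 * With a default of 0, every element would trip the check below. */
static void draw_elts_init_sketch(struct draw_elts_sketch *d)
{
   assert(d != NULL);
   d->indices = NULL;
   d->elt_max = UINT_MAX;
}

/* Setting indices narrows the limit to what the buffer actually holds. */
static void draw_elts_set_indexes_sketch(struct draw_elts_sketch *d,
                                         const void *indices,
                                         unsigned count)
{
   assert(d != NULL);
   d->indices = indices;
   d->elt_max = count;
}

static bool draw_elt_overflows_sketch(const struct draw_elts_sketch *d,
                                      unsigned elt)
{
   return elt >= d->elt_max;
}
```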

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-17 11:06:39 -04:00
Zack Rusin
41966fdb3b draw: clear the draw buffers in draw
Moves clearing of the draw so target buffers to the draw
module. They had to be cleared in the drivers before
which was quite messy.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-17 11:06:39 -04:00
Chia-I Wu
98bc4c62a6 ilo: add pipe-based copy method to ilo_blitter
It enables accelerated resource_copy_region() when blt-based method fails.
2013-06-17 18:28:58 +08:00
Chia-I Wu
ebfd7a61c0 ilo: add BLT-based blitting methods to ilo_blitter
Port BLT code in ilo_blit.c to BLT-based blitting methods of ilo_blitter.  Add
BLT-based clears.  The latter is verified with util_clear(), but it is not in
use yet.
2013-06-17 16:36:53 +08:00
Chia-I Wu
b4b3a5c6dc ilo: replace util_blitter by ilo_blitter
ilo_blitter is just a wrapper for util_blitter for now.  We will port BLT code
to ilo_blitter shortly.
2013-06-17 14:37:10 +08:00
Kenneth Graunke
6d7abafdc8 i965: Assume flexible hardware primitive restart exists in the future.
Primitive restart with an arbitrary cut index was first supported as of
Haswell.  It's very doubtful that they'd take that away in future
hardware, so we may as well alter the check now.
2013-06-14 22:58:18 -07:00
Chris Forbes
def84d8014 i965: Shrink Gen5 VUE map layout to be the same as Gen4.
The PRM suggests a larger layout, mostly to support having
gl_ClipDistance[] somewhere predictable for the fixed-function clipper
-- but it didn't actually arrive in Gen5.

Just use the same layout for both Gen4 and Gen5.

No Piglit regressions.

Improves performance in CS:S Video Stress Test by ~3%.

V2: - Remove now-useless function for determining the SF URB read offset
    - Remove now-unused BRW_VARYING_SLOT_POS_DUPLICATE

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-16 01:05:41 +12:00
Kenneth Graunke
1b77d2133c i965: Implement 16-wide math on G45 and Ironlake.
[chrisf:]
Improves performance in CS:S video stress test by about 2%.
No piglit regressions on Ironlake.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2013-06-16 00:47:50 +12:00
Matt Turner
fcaa48d9cc glsl: Disallow return with a void argument from void functions.
NOTE: This is a candidate for the stable branches.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:49 -07:00
Matt Turner
1a1b03e6bc glsl: Allow implicit conversion of return values.
Required by ARB_shading_language_420pack.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:49 -07:00
Matt Turner
876e16562b glsl: Add gl_{Max,Min}ProgramTexelOffset built-in constants.
Required by ARB_shading_language_420pack. Note that the 420pack spec
incorrectly specifies their values as (Min, Max) = (-7, 8) when they
should be (-8, 7) as listed in the GLSL 4.30 and ESSL 3.0 specs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:49 -07:00
Matt Turner
ed455cdb0b glsl: Allow swizzles on scalars.
Required by ARB_shading_language_420pack.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:49 -07:00
Matt Turner
a8492e8fe7 glsl: Allow .length() method on vectors and matrices.
Required by ARB_shading_language_420pack.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:49 -07:00
Todd Previte
cf7f424e18 mesa: Add infrastructure for ARB_shading_language_420pack.
v2 [mattst88]
  - Split infrastructure into separate patch.
  - Add preprocessor #define.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-14 11:25:48 -07:00
Chia-I Wu
bfa8d21759 ilo: fix for half-float vertex arrays
Commit 6fe0453c33 broke half-float vertex
arrays.  This reverts a part of that commit, and explains why.
2013-06-15 01:00:03 +08:00
Chia-I Wu
36ffd08706 ilo: add some assertions to help debugging
Assert that we do not support user vertex/index/constant buffers.  Issue a
warning when a sampler view is created for a resource without
PIPE_BIND_SAMPLER_VIEW.
2013-06-14 16:02:31 +08:00
Chia-I Wu
0d9afaad35 ilo: silence a compiler warning
The path should never be hit.
2013-06-14 15:36:30 +08:00
Vinson Lee
93534873b0 glsl: Fix null check in read_dereference.
Fixes "Logically dead code" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 22:13:34 -07:00
Chia-I Wu
399548b17f st/mesa: fix temp texture bindings in st_CopyPixels()
The temporary texture should have either PIPE_BIND_RENDER_TARGET or
PIPE_BIND_DEPTH_STENCIL set in addition to PIPE_BIND_SAMPLER_VIEW.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-06-14 08:46:04 +08:00
Zack Rusin
5507c11f85 gallium/draw: add limits to the clip and cull distances
There are strict limits on those registers. Define the maximums
and use them instead of magic numbers. Also allows us to add
some extra sanity checks.
Suggested by Brian.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-13 12:13:11 -04:00
Zack Rusin
b63eeaf7b7 draw: cleanup the distance culling code a bit
We don't need the clamped variable, because we can just
return early. We should also do the regular culling after
the distance culling passes.
All spotted by Brian.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-13 12:13:01 -04:00
Chia-I Wu
c7e9b15010 ilo: mapping a resource may make some states dirty
When a resource is busy and is mapped with
PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE, the underlying bo is replaced.  We need
to mark states affected by the resource dirty.

With this change, we no longer have to emit vertex buffers and index buffer
unconditionally.
2013-06-13 23:47:18 +08:00
Chia-I Wu
5f15050dc9 ilo: bump up PIPE_CAP_GLSL_FEATURE_LEVEL to 140
With UBO and TBO support, we are supposedly good to claim GLSL 1.40.
2013-06-13 23:47:18 +08:00
Chia-I Wu
4df85dbc06 ilo: initialize dirty flags in ilo_init_states()
Now that we have a function to initialize states, initialize dirty flags there
too.
2013-06-13 23:47:18 +08:00
Chia-I Wu
6057d7b7b5 ilo: re-emit states that involve resources
Even with hardware contexts, since we do not pin resources, we have to re-emit
the states so that the resources are referenced (by cp->bo) and their offsets
are updated in case they are moved.  This also allows us to eliminate cp flush
in is_bo_busy().
2013-06-13 12:58:47 +08:00
Chia-I Wu
b65bdc61bd ilo: fix for util_blitter_clear() changes
It has been broken since 17350ea979.
2013-06-13 12:58:47 +08:00
Manfred Ernst
bf2c074a2f mesa: Fix bug in unclamped float to ubyte conversion.
Problem: The IEEE float optimized version of UNCLAMPED_FLOAT_TO_UBYTE
in macros.h computed incorrect results for inputs in the range
0x3f7f0000 (=0.99609375) to 0x3f7f7f80 (=0.99803924560546875)
inclusive.  0x3f7f7f80 is the IEEE float value that results in 254.5
when multiplied by 255.  With rounding mode "round to closest even
integer", this is the largest float in the range 0.0-1.0 that is
converted to 254 by the generic implementation of
UNCLAMPED_FLOAT_TO_UBYTE.  The IEEE float optimized version
incorrectly defined the cut-off for mapping to 255 as 0x3f7f0000
(=255.0/256.0). The same bug was present in the function
float_to_ubyte in u_math.h.

Fix: The proposed fix replaces the incorrect cut-off value by
0x3f800000, which is the IEEE float representation of 1.0f. 0x3f7f7f81
(or any value in between) would also work, but 1.0f is probably
cleaner.
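
The effect of the cut-off can be sketched as follows. This is an illustration under assumptions, not the macro from macros.h: the function names are hypothetical, and the magic-number rounding trick (scale by 255/256, add 32768.0f, read the low mantissa byte) is assumed to be how the IEEE-optimized path works.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define OLD_CUTOFF_SKETCH 0x3f7f0000u /* 255.0/256.0, the incorrect cut-off */
#define IEEE_ONE_SKETCH   0x3f800000u /* 1.0f, the proposed cut-off */

/* Hypothetical helper: reinterpret the bits of a float as an integer. */
static int32_t float_bits_sketch(float f)
{
   int32_t i;
   memcpy(&i, &f, sizeof i);
   return i;
}

/* Hypothetical conversion parameterized on the cut-off so both the old
 * and the fixed behavior can be compared side by side. */
static uint8_t float_to_ubyte_sketch(float f, uint32_t cutoff)
{
   assert(cutoff == OLD_CUTOFF_SKETCH || cutoff == IEEE_ONE_SKETCH);
   int32_t bits = float_bits_sketch(f);
   if (bits < 0)                 /* sign bit set: clamp to 0 */
      return 0;
   if (bits >= (int32_t)cutoff)  /* at or above the cut-off: clamp to 255 */
      return 255;
   /* Adding 32768.0f leaves round(f * 255) in the low mantissa byte. */
   f = f * (255.0f / 256.0f) + 32768.0f;
   return (uint8_t)float_bits_sketch(f);
}
```

With the old cut-off, an input of 0.99609375 is clamped straight to 255 even though 0.99609375 * 255 is about 254.004; with 1.0f as the cut-off the same input takes the rounding path and yields 254.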

The patch does not regress piglit on llvmpipe and on i965 on sandy
bridge.

Tested-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-12 20:24:48 -07:00
Marek Olšák
3475b22133 st/dri: if flushing a drawable, don't set reason=SWAPBUFFERS
0 means SWAPBUFFERS.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
a713d7b1b9 st/dri: resolve the back buffer only in SwapBuffers
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
3b525036b9 st/dri: manually swap MSAA front and back buffers in SwapBuffers
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
b77316ad75 st/dri: always copy new DRI front and back buffers to corresponding MSAA buffers
This commit fixes these piglit tests with an MSAA visual forced on:
- read-front
- glx-copy-sub-buffer

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
fdf9d234e2 st/dri: refactor dri_msaa_resolve
The generic blit will be used by the following commit.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
6c6cfc02c9 st/dri: reuse depth-stencil and MSAA resources after DRI2 invalidate event
Page flipping generates an invalidate event every frame, causing reallocations
of all private resources (MSAA and depth-stencil).

Reusing the resources may improve performance (especially under memory
pressure).

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
683b065320 st/dri: fix MSAA resolving of buffers with height > width
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
526ebfa278 st/mesa: make generic CopyPixels path work with MSAA visuals
We have to use pipe->blit, not resource_copy_region, so that the read buffer
is resolved if it's multisampled. I also removed the CPU-based copying,
which just did format conversion (obsoleted by the blit).

Also, the layer/slice/face of the read buffer is taken into account (this was
ignored).

Last but not least, the format choosing is improved to take float and integer
read buffers into account.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:14 +02:00
Marek Olšák
9ef44e6eb7 st/mesa: don't use blit_copy_pixels if an occlusion query is active
CopyPixels, just as DrawPixels, should count the samples that passed
depth test.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
79e421260a st/mesa: rework blit_copy_pixels to use pipe->blit
There were 2 issues with it:
- resource_copy_region doesn't allow different sample counts of both src
  and dst, which can occur if we blit between a window and a FBO, and
  the window has an MSAA colorbuffer and the FBO doesn't.
  (this was the main motivation for using pipe->blit)
- blitting from or to a non-zero layer/slice/face was broken, because
  rtt_face and rtt_slice were ignored.

blit_copy_pixels is now used even if the formats and orientation of
framebuffers don't match.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
4d59258856 r600g: upsample and downsample MSAA resources for transfers
We did downsample (=resolve) MSAA resources to make ReadPixels work with MSAA
GLX visuals, which was enough for read-only color-only transfers.

This commit makes write color transfers and depth-stencil transfers work
in a similar manner. It does downsampling in transfer_map and upsampling
in transfer_unmap.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
72a086b8b2 gallium/u_format: add a new helper for initializing pipe_blit_info::mask
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
d6d4a9a2e8 gallium/u_blitter: make clearing independent of the colorbuffer format
There isn't any difference between 32_FLOAT and 32_*INT in vertex fetching.
Both of them don't do any format conversion.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
17350ea979 gallium/u_blitter: make clearing independent of the number of bound colorbuffers
We can use the fragment shader TGSI property WRITES_ALL_CBUFS.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
de1c38299c gallium/util: make WRITES_ALL_CBUFS optional in the passthrough fragment shader
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-13 03:54:13 +02:00
Marek Olšák
45595d5066 mesa: fix OES_EGL_image_external being partially allowed in the core profile
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-13 03:54:13 +02:00
Ian Romanick
cfa3c5ad82 glsl: Generate smaller values for uniform locations
Previously we would generate uniform locations as (slot << 16) +
array_index.  We do this to handle applications that assume the location
of a[2] will be +1 from the location of a[1].  This resulted in every
uniform location being at least 0x10000.  The OpenGL 4.3 spec was
amended to require this behavior, but previous versions did not require
locations of array (or structure) members be sequential.

We've now encountered two applications that assume uniform values will
be "small."  As far as we can tell, these applications store the GLint
returned by glGetUniformLocation in a int16_t or possibly an int8_t.

THIS BEHAVIOR IS NOT GUARANTEED OR IMPLIED BY ANY VERSION OF OpenGL.

Other implementations happen to have both these behaviors (sequential
array elements and small values) since OpenGL 2.0, so let's just match
their behavior.

Fixes "3D Bowling" on Android.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-12 16:30:29 -07:00
Ian Romanick
26d86d26f9 glsl: Add gl_shader_program::UniformLocationBaseScale
This is used by _mesa_uniform_merge_location_offset and
_mesa_uniform_split_location_offset to determine how the base and offset
are packed.  Previously, this value was hard coded as (1U<<16) in those
functions via the shift and mask contained therein.  The value is still
(1U<<16), but it can be changed in the future.
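
As a sketch of the packing, assuming a location is simply base * scale + offset, the merge and split steps could look like this. The names here are hypothetical; they are not the bodies of _mesa_uniform_merge_location_offset or _mesa_uniform_split_location_offset.

```c
#include <assert.h>

/* Hypothetical result type for splitting a location. */
struct location_parts_sketch {
   unsigned base;   /* uniform slot */
   unsigned offset; /* array element within that slot */
};

/* Pack a slot and an array index into one location value. */
static int merge_location_sketch(unsigned scale, unsigned base, unsigned offset)
{
   assert(scale > 0 && offset < scale);
   return (int)(base * scale + offset);
}

/* Recover the slot and the array index from a location value. */
static struct location_parts_sketch split_location_sketch(unsigned scale,
                                                          int location)
{
   assert(scale > 0 && location >= 0);
   struct location_parts_sketch parts = {
      (unsigned)location / scale,
      (unsigned)location % scale
   };
   return parts;
}
```

With a scale of 1U<<16, slot 1 element 2 becomes 0x10002; with a small scale such as 4, the same pair becomes 6, which still fits in a narrow integer type.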

The next patch dynamically generates this value.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-12 16:30:18 -07:00
Ian Romanick
5097f35841 glsl: Add a gl_shader_program parameter to _mesa_uniform_{merge,split}_location_offset
This will be used in the next commit.

NOTE: This is a candidate for stable release branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-12 16:30:06 -07:00
Roland Scheidegger
4cce4efaa3 util: new util_fill_box helper
Use new util_fill_box helper for util_clear_render_target.
(Also fix off-by-one map error.)

v2: handle non-zero z correctly in new helper

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-13 00:41:43 +02:00
Roland Scheidegger
957c040eb8 gallivm: (trivial) remove duplicated code block (including comment)
2013-06-13 00:41:43 +02:00
Paul Berry
b09a754078 i965/gen7: Enable support for fast color clears.
This patch adds code to place mcs_state into INTEL_MCS_STATE_RESOLVED
for miptrees that are capable of supporting fast color clears.  This
will have no effect on buffers that don't undergo a fast color clear;
however, for buffers that do undergo a fast color clear, an MCS
miptree will be allocated (at the time of the first fast clear), and
will be used thereafter.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:07 -07:00
Paul Berry
ef9142d4a3 i965/gen7+: Disable fast color clears on shared regions.
In certain circumstances the memory region underlying a miptree is
shared with other miptrees, or with other code outside Mesa's control.
This happens, for instance, when an extension like GL_OES_EGL_image or
GLX_EXT_texture_from_pixmap is used to associate a miptree
with an image existing outside of Mesa.

When this happens, we need to disable fast color clears on the miptree
in question, since there's no good synchronization mechanism to ensure
that deferred clear writes get performed by the time the buffer is
examined from the other miptree, or from outside of Mesa.

Fortunately, this should not be a performance hit for most
applications, since most applications that use these extensions use
them for importing textures into Mesa, rather than for exporting
rendered images out of Mesa.  So most of the time the miptrees
involved will never experience a clear.

v2: Rework based on the fact that we have decided not to use an
accessor function to protect access to the region.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:07 -07:00
Paul Berry
67cd0f9703 i965/gen7+: Resolve color buffers when necessary.
Resolve color buffers that have been fast-color cleared:
    1. before texturing from the buffer (brw_predraw_resolve_buffers())
    2. before using the buffer as the source in a blorp blit
       (brw_blorp_blit_miptrees())
    3. before mapping the buffer's miptree (intel_miptree_map_raw(),
       intel_texsubimage_tiled_memcpy())
    4. before accessing the buffer using the hardware blitter
       (intel_miptree_blit(), do_blit_bitmap())

v2: Rework based on the fact that we have decided not to use an
accessor function to protect access to the region.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:07 -07:00
Paul Berry
e9dfcb38e9 i965/gen7+: Ensure that front/back buffers are fast-clear resolved.
We already had code in intel_downsample_for_dri2_flush() for
downsampling front and back buffers when multisampling was in use.
This patch extends that function to perform fast color clear resolves
when necessary.

To account for the additional functionality, the function is renamed
to simply intel_resolve_for_dri2_flush().

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:07 -07:00
Paul Berry
418aecea7d i965/blorp: Write blorp code to do render target resolves.
This patch implements the "render target resolve" blorp operation.
This will be needed when a buffer that has experienced a fast color
clear is later used for a purpose other than as a render target
(texturing, glReadPixels, or swapped to the screen).  It resolves any
remaining deferred clear operation that was not taken care of during
normal rendering.

Fortunately not much work is necessary; all we need to do is scale
down the size of the rectangle primitive being emitted, run the
fragment shader with the "Render Target Resolve Enable" bit set, and
ensure that the fragment shader writes to the render target using the
"replicated color" message.  We already have a fragment shader that
does that (the shader that we use for fast color clears), so for
simplicity we re-use it.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:07 -07:00
Paul Berry
fac32c0bd3 i965/blorp: Expand clear class hierarchy to prepare for RT resolves.
The fragment shaders that do color clears will be re-used to
perform so-called "render target resolves" (the resolves associated
with fast color clears).  To prepare for that, this patch expands the
class hierarchy for blorp params by adding
brw_blorp_const_color_params (which will be used for all blorp
operations where the fragment shader outputs a constant color).

Some other data structures and functions were also renamed to use
"const_color" nomenclature where appropriate.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:06 -07:00
Paul Berry
5e5d4e021f i965/gen7+: Implement fast color clear operation in BLORP.
Since we defer allocation of the MCS miptree until the time of the
fast clear operation, this patch also implements creation of the MCS
miptree.

In addition, this patch adds the field
intel_mipmap_tree::fast_clear_color_value, which holds the most recent
fast color clear value, if any. We use it to set the SURFACE_STATE's
clear color for render targets.

v2: Flag BRW_NEW_SURFACES when allocating the MCS miptree.  Generate a
perf_debug message if clearing to a color that isn't compatible with
fast color clear.  Fix "control reaches end of non-void function"
build warning.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 11:10:06 -07:00
Paul Berry
dd3f950115 i965/gen7+: Create helper functions for single-sample MCS buffers.
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 10:45:42 -07:00
Paul Berry
460b7bc7a1 i965/gen7+: Set up MCS in SURFACE_STATE whenever MCS is present.
On Gen7+, MCS buffers are used both for compressed multisampled color
buffers and for "fast clear" of single-sampled color buffers.

Prior to this patch series, we didn't support fast clear, so we
only used MCS with multisampled color buffers.

As a first step to implementing fast clears, this patch modifies the
code that sets up SURFACE_STATE so that it configures the MCS buffer
whenever it is present, regardless of whether we are multisampling or
not.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 10:45:42 -07:00
Paul Berry
7e5cb4bc4c i965/gen7+: Create an enum for keeping track of fast color clear state.
This patch includes code to update the fast color clear state
appropriately when rendering occurs.  The state will also need to be
updated when a fast clear or a resolve operation is performed; those
state updates will be added when the fast clear and resolve operations
are added.

v2: Create a new function, intel_miptree_used_for_rendering() to
handle updating the fast color clear state when rendering occurs.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 10:45:42 -07:00
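
A minimal sketch of how such a state enum and its rendering-time update might look. The names `mcs_state`, `MCS_STATE_*`, `struct miptree`, and `miptree_used_for_rendering` are illustrative stand-ins, not the identifiers from the patch, and the single transition shown (a cleared buffer becoming unresolved once it is rendered to) is an assumption based on the description above.

```c
/* Hypothetical fast color clear states for a color buffer. */
enum mcs_state {
   MCS_STATE_NONE,        /* no MCS buffer, fast clear not in use */
   MCS_STATE_RESOLVED,    /* color buffer contents are fully valid */
   MCS_STATE_UNRESOLVED,  /* some data may still depend on the clear value */
   MCS_STATE_CLEAR        /* entire buffer is in the fast-cleared state */
};

/* Hypothetical stand-in for the driver's miptree structure. */
struct miptree {
   enum mcs_state mcs_state;
};

/* Called when the buffer is about to be rendered to.  A buffer that was
 * fast-cleared can no longer be assumed clear after rendering, so it moves
 * to the unresolved state.  All other states are left unchanged. */
static void
miptree_used_for_rendering(struct miptree *mt)
{
   if (mt->mcs_state == MCS_STATE_CLEAR)
      mt->mcs_state = MCS_STATE_UNRESOLVED;
}
```

Keeping the update in one helper means every rendering path can call it without duplicating the state logic.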
Paul Berry
8f5147c199 intel: Conditionally compile mcs-related code for i965 only.
This patch ifdefs out intel_mipmap_tree::mcs_mt when building the i915
(pre-Gen4) driver (MCS buffers aren't supported until Gen7, so there
is no need for this field in the i915 driver).  This should make it a
bit easier to implement fast color clears without undue risk to i915.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 10:45:42 -07:00
Paul Berry
a5efdca7b7 intel: Keep region name in intel_miptree_create_for_dri2_buffer().
When processing a buffer received from the X server,
intel_process_dri2_buffer() examines intel_region::name to determine
whether it's received a brand new buffer, or the same buffer it
received from the X server the last time it made a request.

However, this didn't work properly, because in the call to
intel_miptree_create_for_dri2_buffer(), we create a fresh intel_region
object to represent the buffer, and this was causing us to forget the
buffer's previous name.

This patch fixes things by copying over the region name when creating
the fresh intel_region object.

At the moment, this is just a minor performance optimization.
However, when fast color clears are added, it will be necessary to
ensure that the fast color clear state for a buffer doesn't get
discarded the next time we receive that buffer from the X server.

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-12 10:45:42 -07:00
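
The name comparison described above can be sketched as follows. `struct region`, `is_same_buffer`, and `region_for_received_buffer` are hypothetical stand-ins for the driver's real types and functions; the point is only that the freshly created region must carry the buffer's name, or the next comparison will always report a new buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a buffer region.  "name" is the identifier
 * under which the X server hands out the buffer. */
struct region {
   uint32_t name;
};

/* True when the server returned the buffer we already hold. */
static bool
is_same_buffer(const struct region *current, uint32_t received_name)
{
   return current != NULL && current->name == received_name;
}

/* Build the fresh region for a received buffer, copying the name over so
 * that a later is_same_buffer() call can still recognize it. */
static struct region
region_for_received_buffer(uint32_t received_name)
{
   struct region fresh = { .name = received_name };
   return fresh;
}
```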
Chia-I Wu
adf324ad28 winsys/intel: make struct intel_bo alias drm_intel_bo
There is really nothing in struct intel_bo, and having it alias drm_intel_bo
makes the winsys impose almost zero overhead.

We can eliminate the overhead completely by making the functions static
inline, if needed.
2013-06-12 17:46:52 +08:00
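
A sketch of the aliasing idea, assuming it is done with an incomplete struct type and pointer casts. `struct backend_bo` stands in for the libdrm buffer object and `struct winsys_bo` for the winsys handle; both names, and the helpers below, are hypothetical.

```c
/* Hypothetical backend type, standing in for the libdrm buffer object. */
struct backend_bo {
   unsigned long size;
};

/* The winsys handle is declared but never defined: it has no members of
 * its own and exists only as a distinct pointer type. */
struct winsys_bo;

/* Convert between the two pointer types.  No wrapper object is allocated,
 * so the conversion costs nothing at run time. */
static inline struct backend_bo *
to_backend(struct winsys_bo *bo)
{
   return (struct backend_bo *) bo;
}

static inline struct winsys_bo *
to_winsys(struct backend_bo *bo)
{
   return (struct winsys_bo *) bo;
}

/* A winsys entry point simply forwards to the backend object. */
static inline unsigned long
winsys_bo_get_size(struct winsys_bo *bo)
{
   return to_backend(bo)->size;
}
```

Because the helpers are `static inline`, a compiler can reduce each call to a plain field access, which is the zero-overhead outcome the message refers to.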
Chia-I Wu
e7a14eea16 winsys/intel: reorganize functions
Move functions around to match the order of the declarations in the header.
2013-06-12 17:46:52 +08:00
Chia-I Wu
39226705b7 ilo: update winsys interface
The motivation is to kill tiling and pitch in struct intel_bo.  That requires
making tiling and pitch non-queryable and passing them around as function
parameters instead.
2013-06-12 17:46:52 +08:00
Chia-I Wu
cdfb2163c4 ilo: get rid of function tables in winsys
We are moving toward making struct intel_bo alias drm_intel_bo.  As a first
step, we cannot have function tables.
2013-06-12 17:46:52 +08:00
Chia-I Wu
6fe0453c33 ilo: access bo size directly
buf->bo_size is readily available, no need to go via buf->bo->get_size().
2013-06-12 17:46:52 +08:00
Chia-I Wu
3f79188854 ilo: remove unnecessary tex_set_bo/buf_set_bo
Merge their bodies into tex_create_bo/buf_create_bo, respectively.
2013-06-12 17:46:52 +08:00
864 changed files with 57714 additions and 39219 deletions


@@ -35,7 +35,7 @@ LOCAL_C_INCLUDES += \
# define ANDROID_VERSION (e.g., 4.0.x => 0x0400)
LOCAL_CFLAGS += \
-DPACKAGE_VERSION=\"9.2.0-devel\" \
-DPACKAGE_VERSION=\"9.2.2\" \
-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\" \
-DANDROID_VERSION=0x0$(MESA_ANDROID_MAJOR_VERSION)0$(MESA_ANDROID_MINOR_VERSION)


@@ -1,49 +0,0 @@
# Copyright (C) 2013 The Android-x86 Open Source Project
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# If you don't need to do a full clean build but would like to touch
# a file or delete some intermediate files, add a clean step to the end
# of the list. These steps will only be run once, if they haven't been
# run before.
#
# E.g.:
# $(call add-clean-step, touch -c external/sqlite/sqlite3.h)
# $(call add-clean-step, rm -rf $(PRODUCT_OUT)/obj/STATIC_LIBRARIES/libz_intermediates)
#
# Always use "touch -c" and "rm -f" or "rm -rf" to gracefully deal with
# files that are missing or have been moved.
#
# Use $(PRODUCT_OUT) to get to the "out/target/product/blah/" directory.
# Use $(OUT_DIR) to refer to the "out" directory.
#
# If you need to re-do something that's already mentioned, just copy
# the command and add it to the bottom of the list. E.g., if a change
# that you made last week required touching a file and a change you
# made today requires touching the same file, just copy the old
# touch step and add it to the end of the list.
#
# ************************************************
# NEWER CLEAN STEPS MUST BE AT THE END OF THE LIST
# ************************************************
$(call add-clean-step, rm -rf $(PRODUCT_OUT)/obj/STATIC_LIBRARIES/libmesa_*_intermediates)
$(call add-clean-step, rm -rf $(PRODUCT_OUT)/obj/SHARED_LIBRARIES/libdrm_*intermediates)
$(call add-clean-step, rm -rf $(PRODUCT_OUT)/obj/SHARED_LIBRARIES/i9?5_dri_intermediates)
$(call add-clean-step, rm -rf $(PRODUCT_OUT)/obj/SHARED_LIBRARIES/libglapi_intermediates)
$(call add-clean-step, rm -rf $(PRODUCT_OUT)/obj/SHARED_LIBRARIES/gralloc.drm_intermediates)
$(call add-clean-step, rm -rf $(PRODUCT_OUT)/obj/SHARED_LIBRARIES/libgralloc_drm_intermediates)
$(call add-clean-step, rm -rf $(OUT_DIR)/host/$(HOST_OS)-$(HOST_ARCH)/obj/EXECUTABLES/mesa_*_intermediates)
$(call add-clean-step, rm -rf $(OUT_DIR)/host/$(HOST_OS)-$(HOST_ARCH)/obj/EXECUTABLES/glsl_compiler_intermediates)
$(call add-clean-step, rm -rf $(OUT_DIR)/host/$(HOST_OS)-$(HOST_ARCH)/obj/STATIC_LIBRARIES/libmesa_glsl_utils_intermediates)


@@ -50,6 +50,7 @@ EXTRA_FILES = \
bin/install-sh \
bin/ltmain.sh \
bin/missing \
bin/test-driver \
bin/ylwrap \
src/glsl/glsl_parser.cpp \
src/glsl/glsl_parser.h \
@@ -57,12 +58,6 @@ EXTRA_FILES = \
src/glsl/glcpp/glcpp-lex.c \
src/glsl/glcpp/glcpp-parse.c \
src/glsl/glcpp/glcpp-parse.h \
src/mesa/main/api_exec_es1.c \
src/mesa/main/api_exec_es1_dispatch.h \
src/mesa/main/api_exec_es1_remap_helper.h \
src/mesa/main/api_exec_es2.c \
src/mesa/main/api_exec_es2_dispatch.h \
src/mesa/main/api_exec_es2_remap_helper.h \
src/mesa/program/lex.yy.c \
src/mesa/program/program_parse.tab.c \
src/mesa/program/program_parse.tab.h \


@@ -70,7 +70,7 @@ if env['gles']:
# Environment setup
env.Append(CPPDEFINES = [
('PACKAGE_VERSION', '\\"9.2.0-devel\\"'),
('PACKAGE_VERSION', '\\"9.2.2\\"'),
('PACKAGE_BUGREPORT', '\\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\\"'),
])

10
bin/.cherry-ignore Normal file

@@ -0,0 +1,10 @@
# Already cherry picked without -x
d8ac987f6ab228df1a478b36c3d889992754374f glsl: Disallow uniform block layout qualifiers on non-uniform block vars.
# The bug fixed by this patch does not exist in 9.2. Discussed with Marek and
# Brian Paul on the mesa-stable mailing list.
89a665eb5fa176f68223bf54a472d6a0567c3546 draw: fix segfaults with aaline and aapoint stages disabled
# Previously cherry picked (patch originally appeared twice on master with a
# revert in between)
4e5eb8ba25054ede4798fa424e6f32b23aba0f98 i965/vec4: Only zero out unused message components when there are any.


@@ -14,7 +14,7 @@ git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^[[:space:]]*NOTE: .*[Cc]andidate' HEAD..origin/master |\
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)' HEAD..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.


@@ -6,7 +6,7 @@ dnl Tell the user about autoconf.html in the --help output
m4_divert_once([HELP_END], [
See docs/autoconf.html for more details on the options for Mesa.])
AC_INIT([Mesa], [9.2.0-devel],
AC_INIT([Mesa], [9.2.2],
[https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa])
AC_CONFIG_AUX_DIR([bin])
AC_CONFIG_MACRO_DIR([m4])
@@ -31,7 +31,7 @@ AC_SUBST([OSMESA_VERSION])
dnl Versions for external dependencies
LIBDRM_REQUIRED=2.4.24
LIBDRM_RADEON_REQUIRED=2.4.45
LIBDRM_RADEON_REQUIRED=2.4.46
LIBDRM_INTEL_REQUIRED=2.4.38
LIBDRM_NVVIEUX_REQUIRED=2.4.33
LIBDRM_NOUVEAU_REQUIRED="2.4.33 libdrm >= 2.4.41"
@@ -100,6 +100,7 @@ AC_MSG_RESULT([$acv_mesa_CLANG])
dnl If we're using GCC, make sure that it is at least version 3.3.0. Older
dnl versions are explictly not supported.
GEN_ASM_OFFSETS=no
if test "x$GCC" = xyes -a "x$acv_mesa_CLANG" = xno; then
AC_MSG_CHECKING([whether gcc version is sufficient])
major=0
@@ -117,7 +118,12 @@ if test "x$GCC" = xyes -a "x$acv_mesa_CLANG" = xno; then
else
AC_MSG_RESULT([yes])
fi
if test "x$cross_compiling" = xyes; then
GEN_ASM_OFFSETS=yes
fi
fi
AM_CONDITIONAL([GEN_ASM_OFFSETS], test "x$GEN_ASM_OFFSETS" = xyes)
dnl Make sure the pkg-config macros are defined
m4_ifndef([PKG_PROG_PKG_CONFIG],
@@ -438,7 +444,7 @@ test "x$enable_asm" = xno && AC_MSG_RESULT([no])
# disable if cross compiling on x86/x86_64 since we must run gen_matypes
if test "x$enable_asm" = xyes && test "x$cross_compiling" = xyes; then
case "$host_cpu" in
i?86 | x86_64)
i?86 | x86_64 | amd64)
enable_asm=no
AC_MSG_RESULT([no, cross compiling])
;;
@@ -449,7 +455,7 @@ if test "x$enable_asm" = xyes; then
case "$host_cpu" in
i?86)
case "$host_os" in
linux* | *freebsd* | dragonfly* | *netbsd*)
linux* | *freebsd* | dragonfly* | *netbsd* | openbsd*)
test "x$enable_64bit" = xyes && asm_arch=x86_64 || asm_arch=x86
;;
gnu*)
@@ -457,9 +463,9 @@ if test "x$enable_asm" = xyes; then
;;
esac
;;
x86_64)
x86_64|amd64)
case "$host_os" in
linux* | *freebsd* | dragonfly* | *netbsd*)
linux* | *freebsd* | dragonfly* | *netbsd* | openbsd*)
test "x$enable_32bit" = xyes && asm_arch=x86 || asm_arch=x86_64
;;
esac
@@ -478,7 +484,7 @@ if test "x$enable_asm" = xyes; then
DEFINES="$DEFINES -DUSE_X86_ASM -DUSE_MMX_ASM -DUSE_3DNOW_ASM -DUSE_SSE_ASM"
AC_MSG_RESULT([yes, x86])
;;
x86_64)
x86_64|amd64)
DEFINES="$DEFINES -DUSE_X86_64_ASM"
AC_MSG_RESULT([yes, x86_64])
;;
@@ -573,6 +579,11 @@ AC_ARG_ENABLE([osmesa],
[enable OSMesa library @<:@default=disabled@:>@])],
[enable_osmesa="$enableval"],
[enable_osmesa=no])
AC_ARG_ENABLE([gallium-osmesa],
[AS_HELP_STRING([--enable-gallium-osmesa],
[enable Gallium implementation of the OSMesa library @<:@default=disabled@:>@])],
[enable_gallium_osmesa="$enableval"],
[enable_gallium_osmesa=no])
AC_ARG_ENABLE([egl],
[AS_HELP_STRING([--disable-egl],
[disable EGL library @<:@default=enabled@:>@])],
@@ -763,7 +774,13 @@ if test "x$enable_dri" = xyes; then
GALLIUM_STATE_TRACKERS_DIRS="dri $GALLIUM_STATE_TRACKERS_DIRS"
fi
if test "x$enable_osmesa" = xyes; then
if test "x$enable_gallium_osmesa" = xyes; then
if test -z "$with_gallium_drivers"; then
AC_MSG_ERROR([Cannot enable gallium_osmesa without Gallium])
fi
if test "x$enable_osmesa" = xyes; then
AC_MSG_ERROR([Cannot enable both classic and Gallium OSMesa implementations])
fi
GALLIUM_STATE_TRACKERS_DIRS="osmesa $GALLIUM_STATE_TRACKERS_DIRS"
GALLIUM_TARGET_DIRS="$GALLIUM_TARGET_DIRS osmesa"
fi
@@ -966,7 +983,7 @@ if test "x$enable_dri" = xyes; then
DEFINES="$DEFINES -DHAVE_ALIAS"
case "$host_cpu" in
x86_64)
x86_64|amd64)
if test "x$DRI_DIRS" = "xyes"; then
DRI_DIRS="i915 i965 nouveau r200 radeon swrast"
fi
@@ -985,7 +1002,7 @@ if test "x$enable_dri" = xyes; then
;;
esac
;;
freebsd* | dragonfly* | *netbsd*)
freebsd* | dragonfly* | *netbsd* | openbsd*)
DEFINES="$DEFINES -DHAVE_PTHREAD -DUSE_EXTERNAL_DXTN_LIB=1"
DEFINES="$DEFINES -DHAVE_ALIAS"
@@ -1129,7 +1146,7 @@ x16|x32)
;;
esac
if test "x$enable_osmesa" = xyes; then
if test "x$enable_osmesa" = xyes -o "x$enable_gallium_osmesa" = xyes; then
# only link libraries with osmesa if shared
if test "$enable_static" = no; then
OSMESA_LIB_DEPS="-lm $PTHREAD_LIBS $SELINUX_LIBS $DLOPEN_LIBS"
@@ -1490,6 +1507,13 @@ AC_SUBST([EGL_NATIVE_PLATFORM])
AC_SUBST([EGL_PLATFORMS])
AC_SUBST([EGL_CFLAGS])
# If we don't have the X11 platform, set this define so we don't try to include
# the X11 headers.
if ! echo "$egl_platforms" | grep -q 'x11'; then
DEFINES="$DEFINES -DMESA_EGL_NO_X11_HEADERS"
GL_PC_CFLAGS="$GL_PC_CFLAGS -DMESA_EGL_NO_X11_HEADERS"
fi
AC_ARG_WITH([egl-driver-dir],
[AS_HELP_STRING([--with-egl-driver-dir=DIR],
[directory for EGL drivers [[default=${libdir}/egl]]])],
@@ -1566,7 +1590,7 @@ if test "x$with_gallium_drivers" = x; then
fi
if test "x$enable_gallium_llvm" = xauto; then
case "$host_cpu" in
i*86|x86_64) enable_gallium_llvm=yes;;
i*86|x86_64|amd64) enable_gallium_llvm=yes;;
esac
fi
if test "x$enable_gallium_llvm" = xyes; then
@@ -1577,42 +1601,53 @@ if test "x$enable_gallium_llvm" = xyes; then
fi
if test "x$LLVM_CONFIG" != xno; then
LLVM_VERSION=`$LLVM_CONFIG --version | sed 's/svn.*//g'`
LLVM_VERSION_INT=`echo $LLVM_VERSION | sed -e 's/\([[0-9]]\)\.\([[0-9]]\)/\10\2/g'`
LLVM_VERSION=`$LLVM_CONFIG --version | sed 's/svn.*//g'`
LLVM_LDFLAGS=`$LLVM_CONFIG --ldflags`
LLVM_BINDIR=`$LLVM_CONFIG --bindir`
LLVM_CPPFLAGS=`strip_unwanted_llvm_flags "$LLVM_CONFIG --cppflags"`
LLVM_CFLAGS=$LLVM_CPPFLAGS # CPPFLAGS seem to be sufficient
LLVM_CXXFLAGS=`strip_unwanted_llvm_flags "$LLVM_CONFIG --cxxflags"`
LLVM_INCLUDEDIR=`$LLVM_CONFIG --includedir`
LLVM_LIBDIR=`$LLVM_CONFIG --libdir`
AC_COMPUTE_INT([LLVM_VERSION_MAJOR], [LLVM_VERSION_MAJOR],
[#include "${LLVM_INCLUDEDIR}/llvm/Config/llvm-config.h"])
AC_COMPUTE_INT([LLVM_VERSION_MINOR], [LLVM_VERSION_MINOR],
[#include "${LLVM_INCLUDEDIR}/llvm/Config/llvm-config.h"])
if test "x${LLVM_VERSION_MAJOR}" != x; then
LLVM_VERSION_INT="${LLVM_VERSION_MAJOR}0${LLVM_VERSION_MINOR}"
else
LLVM_VERSION_INT=`echo $LLVM_VERSION | sed -e 's/\([[0-9]]\)\.\([[0-9]]\)/\10\2/g'`
fi
LLVM_COMPONENTS="engine bitwriter"
if $LLVM_CONFIG --components | grep -q '\<mcjit\>'; then
if $LLVM_CONFIG --components | grep -qw 'mcjit'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} mcjit"
fi
if test "x$enable_opencl" = xyes; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
# LLVM 3.3 >= 177971 requires IRReader
if $LLVM_CONFIG --components | grep -q '\<irreader\>'; then
if $LLVM_CONFIG --components | grep -qw 'irreader'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} irreader"
fi
fi
LLVM_LDFLAGS=`$LLVM_CONFIG --ldflags`
LLVM_BINDIR=`$LLVM_CONFIG --bindir`
LLVM_CPPFLAGS=`strip_unwanted_llvm_flags "$LLVM_CONFIG --cppflags"`
LLVM_CFLAGS=$LLVM_CPPFLAGS # CPPFLAGS seem to be sufficient
LLVM_CXXFLAGS=`strip_unwanted_llvm_flags "$LLVM_CONFIG --cxxflags"`
LLVM_INCLUDEDIR=`$LLVM_CONFIG --includedir`
LLVM_LIBDIR=`$LLVM_CONFIG --libdir`
DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT"
MESA_LLVM=1
DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT"
MESA_LLVM=1
dnl Check for Clang interanl headers
dnl Check for Clang internal headers
if test "x$enable_opencl" = xyes; then
if test "x$CLANG_LIBDIR" = x; then
CLANG_LIBDIR=${LLVM_LIBDIR}
fi
CLANG_RESOURCE_DIR=$CLANG_LIBDIR/clang/${LLVM_VERSION}
AC_CHECK_FILE("$CLANG_RESOURCE_DIR/include/stddef.h",,
AC_MSG_ERROR([Could not find clang internal header stddef.h in $CLANG_RESOURCE_DIR Use --with-clang-libdir to specify the correct path to the clang libraries.]))
AS_IF([test ! -f "$CLANG_RESOURCE_DIR/include/stddef.h"],
[AC_MSG_ERROR([Could not find clang internal header stddef.h in $CLANG_RESOURCE_DIR Use --with-clang-libdir to specify the correct path to the clang libraries.])])
fi
else
MESA_LLVM=0
LLVM_VERSION_INT=0
MESA_LLVM=0
LLVM_VERSION_INT=0
fi
else
MESA_LLVM=0
@@ -1687,7 +1722,7 @@ gallium_check_st() {
gallium_require_llvm() {
if test "x$MESA_LLVM" = x0; then
case "$host_cpu" in
i*86|x86_64) AC_MSG_ERROR([LLVM is required to build $1 on x86 and x86_64]);;
i*86|x86_64|amd64) AC_MSG_ERROR([LLVM is required to build $1 on x86 and x86_64]);;
esac
fi
}
@@ -1709,7 +1744,7 @@ radeon_llvm_check() {
if test "$LLVM_VERSION_INT" -lt "${LLVM_REQUIRED_VERSION_MAJOR}0${LLVM_REQUIRED_VERSION_MINOR}"; then
AC_MSG_ERROR([LLVM $LLVM_REQUIRED_VERSION_MAJOR.$LLVM_REQUIRED_VERSION_MINOR or newer is required for r600g and radeonsi.])
fi
if test true && $LLVM_CONFIG --targets-built | grep -qv '\<R600\>' ; then
if test true && $LLVM_CONFIG --targets-built | grep -qvw 'R600' ; then
AC_MSG_ERROR([LLVM R600 Target not enabled. You can enable it when building the LLVM
sources with the --enable-experimental-targets=R600
configure flag])
@@ -1846,7 +1881,7 @@ if test "x$MESA_LLVM" != x0; then
if test "x$with_llvm_shared_libs" = xyes; then
dnl We can't use $LLVM_VERSION because it has 'svn' stripped out,
LLVM_SO_NAME=LLVM-`$LLVM_CONFIG --version`
AC_CHECK_FILE("$LLVM_LIBDIR/lib$LLVM_SO_NAME.so", llvm_have_one_so=yes,)
AS_IF([test -f "$LLVM_LIBDIR/lib$LLVM_SO_NAME.so"], [llvm_have_one_so=yes])
if test "x$llvm_have_one_so" = xyes; then
dnl LLVM was built using auto*, so there is only one shared object.
@@ -1854,8 +1889,8 @@ if test "x$MESA_LLVM" != x0; then
else
dnl If LLVM was built with CMake, there will be one shared object per
dnl component.
AC_CHECK_FILE("$LLVM_LIBDIR/libLLVMTarget.so",,
AC_MSG_ERROR([Could not find llvm shared libraries:
AS_IF([test ! -f "$LLVM_LIBDIR/libLLVMTarget.so"],
[AC_MSG_ERROR([Could not find llvm shared libraries:
Please make sure you have built llvm with the --enable-shared option
and that your llvm libraries are installed in $LLVM_LIBDIR
If you have installed your llvm libraries to a different directory you
@@ -1866,7 +1901,7 @@ if test "x$MESA_LLVM" != x0; then
--enable-opencl
If you do not want to build with llvm shared libraries and instead want to
use llvm static libraries then remove these options from your configure
invocation and reconfigure.]))
invocation and reconfigure.])])
dnl We don't need to update LLVM_LIBS in this case because the LLVM
dnl install uses a shared object for each compoenent and we have
@@ -1890,8 +1925,8 @@ AM_CONDITIONAL(NEED_GALLIUM_SOFTPIPE_DRIVER, test "x$HAVE_GALLIUM_SVGA" = xyes -
"x$HAVE_GALLIUM_I915" = xyes -o \
"x$HAVE_GALLIUM_SOFTPIPE" = xyes)
AM_CONDITIONAL(NEED_GALLIUM_LLVMPIPE_DRIVER, test "x$HAVE_GALLIUM_I915" = xyes -o \
"x$HAVE_GALLIUM_SOFTPIPE" = xyes -a \
"x$MESA_LLVM" = x1)
"x$HAVE_GALLIUM_SOFTPIPE" = xyes \
&& test "x$MESA_LLVM" = x1)
if test "x$enable_gallium_loader" = xyes; then
GALLIUM_WINSYS_DIRS="$GALLIUM_WINSYS_DIRS sw/null"
@@ -1938,9 +1973,11 @@ AC_SUBST([ELF_LIB])
AM_CONDITIONAL(NEED_LIBPROGRAM, test "x$with_gallium_drivers" != x -o \
"x$enable_xlib_glx" = xyes -o \
"x$enable_osmesa" = xyes)
"x$enable_osmesa" = xyes -o \
"x$enable_gallium_osmesa" = xyes)
AM_CONDITIONAL(HAVE_X11_DRIVER, test "x$enable_xlib_glx" = xyes)
AM_CONDITIONAL(HAVE_OSMESA, test "x$enable_osmesa" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_OSMESA, test "x$enable_gallium_osmesa" = xyes)
AM_CONDITIONAL(HAVE_X86_ASM, echo "$DEFINES" | grep 'X86_ASM' >/dev/null 2>&1)
AM_CONDITIONAL(HAVE_X86_64_ASM, echo "$DEFINES" | grep 'X86_64_ASM' >/dev/null 2>&1)
@@ -2029,6 +2066,7 @@ AC_CONFIG_FILES([Makefile
src/gallium/targets/gbm/Makefile
src/gallium/targets/opencl/Makefile
src/gallium/targets/osmesa/Makefile
src/gallium/targets/osmesa/osmesa.pc
src/gallium/targets/pipe-loader/Makefile
src/gallium/targets/libgl-xlib/Makefile
src/gallium/targets/vdpau-nouveau/Makefile
@@ -2127,11 +2165,17 @@ echo " OpenVG: $enable_openvg"
dnl Driver info
echo ""
if test "x$enable_osmesa" != xno; then
case "x$enable_osmesa$enable_gallium_osmesa" in
xnoyes)
echo " OSMesa: lib$OSMESA_LIB (Gallium)"
;;
xyesno)
echo " OSMesa: lib$OSMESA_LIB"
else
;;
xnono)
echo " OSMesa: no"
fi
;;
esac
if test "x$enable_dri" != xno; then
# cleanup the drivers var


@@ -1,6 +1,6 @@
File: docs/README.WIN32
Last updated: 23 April 2011
Last updated: 21 June 2013
Quick Start
@@ -30,6 +30,23 @@ At this time, only the gallium GDI driver is known to work.
Source code also exists in the tree for other drivers in
src/mesa/drivers/windows, but the status of this code is unknown.
Recipe
------
Building on windows requires several open-source packages. These are
steps that work as of this writing.
1) install python 2.7
2) install scons (latest)
3) install mingw, flex, and bison
4) install libxml2 from here: http://www.lfd.uci.edu/~gohlke/pythonlibs
get libxml2-python-2.9.1.win-amd64-py2.7.exe
5) install pywin32 from here: http://www.lfd.uci.edu/~gohlke/pythonlibs
get pywin32-218.4.win-amd64-py2.7.exe
6) install git
7) download mesa from git
see http://www.mesa3d.org/repository.html
8) run scons
General
-------


@@ -32,7 +32,7 @@ The specifications follow.
<li><a href="specs/MESA_pixmap_colormap.spec">MESA_pixmap_colormap.spec</a>
<li><a href="specs/OLD/MESA_program_debug.spec">MESA_program_debug.spec</a> (obsolete)
<li><a href="specs/MESA_release_buffers.spec">MESA_release_buffers.spec</a>
<li><a href="specs/MESA_resize_buffers.spec">MESA_resize_buffers.spec</a>
<li><a href="specs/OLD/MESA_resize_buffers.spec">MESA_resize_buffers.spec</a> (obsolete)
<li><a href="specs/MESA_set_3dfx_mode.spec">MESA_set_3dfx_mode.spec</a>
<li><a href="specs/MESA_shader_debug.spec">MESA_shader_debug.spec</a>
<li><a href="specs/OLD/MESA_sprite_point.spec">MESA_sprite_point.spec</a> (obsolete)


@@ -16,6 +16,24 @@
<h1>News</h1>
<h2>August 1, 2013</h2>
<p>
<a href="relnotes/9.1.6.html">Mesa 9.1.6</a> is released.
This is a bug fix release.
</p>
<h2>July 17, 2013</h2>
<p>
<a href="relnotes/9.1.5.html">Mesa 9.1.5</a> is released.
This is a bug fix release.
</p>
<h2>July 1, 2013</h2>
<p>
<a href="relnotes/9.1.4.html">Mesa 9.1.4</a> is released.
This is a bug fix release.
</p>
<h2>May 21, 2013</h2>
<p>
<a href="relnotes/9.1.3.html">Mesa 9.1.3</a> is released.


@@ -22,6 +22,9 @@ The release notes summarize what's new or changed in each Mesa release.
<ul>
<li><a href="relnotes/9.2.html">9.2 release notes</a>
<li><a href="relnotes/9.1.6.html">9.1.6 release notes</a>
<li><a href="relnotes/9.1.5.html">9.1.5 release notes</a>
<li><a href="relnotes/9.1.4.html">9.1.4 release notes</a>
<li><a href="relnotes/9.1.3.html">9.1.3 release notes</a>
<li><a href="relnotes/9.1.2.html">9.1.2 release notes</a>
<li><a href="relnotes/9.1.1.html">9.1.1 release notes</a>

321
docs/relnotes/9.1.4.html Normal file

@@ -0,0 +1,321 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 9.1.4 Release Notes / July 1st, 2013</h1>
<p>
Mesa 9.1.4 is a bug fix release which fixes bugs found since the 9.1.3 release.
</p>
<p>
Mesa 9.1 implements the OpenGL 3.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.1. OpenGL
3.1 is <strong>only</strong> available if requested at context creation
because GL_ARB_compatibility is not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
a2c4e25d0e27918bc67f61bae04d0cb8 MesaLib-9.1.4.tar.bz2
8c7e9ce5b05cb2223f0587396dd9dc08 MesaLib-9.1.4.tar.gz
020459c5793d4279bdcb2daa1f7dd9f6 MesaLib-9.1.4.zip
</pre>
<h2>New features</h2>
<p>None.</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=37871">Bug 37871</a> - [bisected i965] Bus error (core dumped) on oglc texdecaltile</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=42182">Bug 42182</a> - egl/opengles1/tri_x11 renders wrong</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=44958">Bug 44958</a> - [SNB IVB HSW] mesa demo test texleak bus error</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=53494">Bug 53494</a> - [snb] crash in texsubimage to a large atlas in clutter</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60518">Bug 60518</a> - glDrawElements segfault when compiled into display list</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61821">Bug 61821</a> - src/mesa/drivers/dri/common/xmlpool.h:96:29: fatal error: xmlpool/options.h</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=63520">Bug 63520</a> - r300g regression (RV380): Strange rendering of light sources in Penumbra (bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=63701">Bug 63701</a> - [HSW] support new haswell graphics [8086:0a2e]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64727">Bug 64727</a> - [gm45, bisected] some piglit glsl 1.10 built-in-functions tests crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64745">Bug 64745</a> - [llvmpipe] SIGSEGV src/gallium/state_trackers/glx/xlib/glx_api.c:1374</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64934">Bug 64934</a> - [llvmpipe] SIGSEGV src/gallium/state_trackers/glx/xlib/glx_api.c:1363</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=65173">Bug 65173</a> - segfault in _mesa_get_format_datatype and _mesa_get_color_read_type when state dumping with glretrace</li>
</ul>
<h2>Changes</h2>
<p>The full set of changes can be viewed by using the following GIT command:</p>
<pre>
git log mesa-9.1.3..mesa-9.1.4
</pre>
<p>Alan Coopersmith (2):</p>
<ul>
<li>integer overflow in XF86DRIOpenConnection() [CVE-2013-1993 1/2]</li>
<li>integer overflow in XF86DRIGetClientDriverName() [CVE-2013-1993 2/2]</li>
</ul>
<p>Alex Deucher (3):</p>
<ul>
<li>radeonsi: add support for hainan chips</li>
<li>radeonsi: add Hainan pci ids</li>
<li>winsys/radeon: add env var to disable VM on Cayman/Trinity</li>
</ul>
<p>Andreas Boll (1):</p>
<ul>
<li>glapi: Add some missing static_dispatch="false" annotations to es_EXT.xml</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>intel: Add a null pointer check before dereferencing the pointer</li>
</ul>
<p>Armin K (1):</p>
<ul>
<li>gallivm: Fix build with LLVM 3.3</li>
</ul>
<p>Brian Paul (9):</p>
<ul>
<li>mesa: fix the compressed TexSubImage size checking code</li>
<li>st/mesa: generate GL_OUT_OF_MEMORY if we can't create the index buffer</li>
<li>mesa: fix error checking of DXT sRGB formats in _mesa_base_tex_format()</li>
<li>st/glx/xlib: check for null ctx pointer in glXIsDirect()</li>
<li>xlib: check for null ctx pointer in glXIsDirect()</li>
<li>st/glx: add null ctx check in glXDestroyContext()</li>
<li>xlib: add null ctx check in glXDestroyContext()</li>
<li>meta: move vertex array enables for mipmap generation</li>
<li>mesa: handle missing read buffer in _mesa_get_color_read_format/type()</li>
</ul>
<p>Bryan Cain (1):</p>
<ul>
<li>nv50: initialize kick_notify callback in nv50_create</li>
</ul>
<p>Chad Versace (3):</p>
<ul>
<li>egl/android: Fix error condition for EGL_ANDROID_image_native_buffer</li>
<li>i965: Fix glColorPointer(GL_FIXED)</li>
<li>intel: Return early if miptree allocation fails</li>
</ul>
<p>Chia-I Wu (1):</p>
<ul>
<li>u_vbuf: fix index buffer leak</li>
</ul>
<p>Chris Forbes (8):</p>
<ul>
<li>mesa: add accessor for effective stencil ref</li>
<li>intel: Use accessor for stencil reference values</li>
<li>nouveau: Use accessor for stencil reference values</li>
<li>radeon: Use accessor for stencil reference values</li>
<li>st: Use accessor for stencil reference values</li>
<li>swrast: Use accessor for stencil reference values</li>
<li>mesa: Stop clamping stencil reference value at specification time</li>
<li>mesa: Use accessor for stencil reference values in glGet</li>
</ul>
<p>Chí-Thanh Christopher Nguyễn (1):</p>
<ul>
<li>targets/dri-i915: Force c++ linker in all cases</li>
</ul>
<p>Daniel Martin (1):</p>
<ul>
<li>Fix build of swrast only without libdrm</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>i965: fix problem with constant out of bounds access (v3)</li>
</ul>
<p>Eric Anholt (10):</p>
<ul>
<li>mesa: Make core Mesa allocate the texture renderbuffer wrapper.</li>
<li>mesa: Make gl_renderbuffers backed by EGL images use FinishRenderTexture.</li>
<li>i965/fs: Bake regs_written into the IR instead of recomputing it later.</li>
<li>i965/vs: Fix implied_mrf_writes() for integer division pre-gen6.</li>
<li>intel: Add support for writing to our linear-temporary-CPU-map case.</li>
<li>intel: Do temporary CPU maps of textures that are too big to GTT map.</li>
<li>intel: Avoid making tiled miptrees we won't be able to blit.</li>
<li>intel: Fix MRT handling of glBitmap().</li>
<li>intel: Fix format handling of blit glBitmap()</li>
<li>i965: Shut up the last release build warning.</li>
</ul>
<p>Fabian Bieler (2):</p>
<ul>
<li>mesa/st: Don't copy propagate from swizzles.</li>
<li>mesa/program: Don't copy propagate from swizzles.</li>
</ul>
<p>Frank Henigman (1):</p>
<ul>
<li>intel: initialize fs_visitor::params_remap in constructor</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>docs: Add 9.1.3 release md5sums</li>
<li>mesa: Bump version to 9.1.4</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>scons: Fix implicit python dependency discovery on Windows.</li>
</ul>
<p>Kenneth Graunke (17):</p>
<ul>
<li>mesa: Add i965 varying index patches to .cherry-ignore.</li>
<li>i965: Turn brw-&gt;urb.vs_size and gs_size into local variables.</li>
<li>i965: Use a variable for the push constant size in kB.</li>
<li>i965: Update URB partitioning code for Haswell's GT3 variant.</li>
<li>i965: Add chipset limits for the Haswell GT3 variant.</li>
<li>i965: Enable the Bay Trail platform.</li>
<li>mesa: Add a reverted commit to cherry-ignore.</li>
<li>vbo: Ignore PRIMITIVE_RESTART_FIXED_INDEX for glDrawArrays().</li>
<li>mesa: Add a helper function for determining the restart index.</li>
<li>vbo: Use the new primitive restart index helper function.</li>
<li>i965: Use the correct restart index for fixed index mode on Haswell.</li>
<li>mesa: Cherry-ignore a patch that got picked but squashed.</li>
<li>i965: Fix can_cut_index_handle_restart_index() for byte/short types.</li>
<li>st/mesa: Go back to using ctx-&gt;Array.RestartIndex, not _RestartIndex.</li>
<li>mesa: Ignore fixed-index primitive restart in ArrayElement().</li>
<li>mesa: Delete the ctx-&gt;Array._RestartIndex derived state.</li>
<li>glsl: Bail on parsing if the #version directive is bogus.</li>
</ul>
<p>Lauri Kasanen (1):</p>
<ul>
<li>r600g: Correctly initialize the shader key, v2</li>
</ul>
<p>Maarten Lankhorst (4):</p>
<ul>
<li>nvc0: fix up video buffer alignment requirements</li>
<li>nvc0: kill assert in ppp code</li>
<li>nvc0: set rsvd_kick correctly</li>
<li>nvc0: allow frame dropping in h264</li>
</ul>
<p>Marek Olšák (7):</p>
<ul>
<li>radeonsi: increase array size for shader inputs and outputs</li>
<li>vbo: fix possible use-after-free segfault after a VAO is deleted</li>
<li>glsl: fix the value of gl_MaxFragmentUniformVectors</li>
<li>st/mesa: initialize all program constants and UBO limits</li>
<li>st/mesa: initialize Const.MaxColorAttachments</li>
<li>st/mesa: fix a couple of issues in st_bind_ubos</li>
<li>mesa: declare UniformBufferBindings as an array with a static size</li>
</ul>
<p>Matt Turner (3):</p>
<ul>
<li>configure.ac: Remove redundant checks of enable_dri.</li>
<li>configure.ac: Build dricommon for DRI gallium drivers</li>
<li>i965: NULL check depth_mt to quiet static analysis.</li>
</ul>
<p>Michel Dänzer (3):</p>
<ul>
<li>radeonsi: Fix handling of TGSI_SEMANTIC_PSIZE</li>
<li>radeonsi: Fix user clip planes</li>
<li>mesa: Note that two radeonsi fixes cannot be backported after all</li>
</ul>
<p>Mike Stroyan (1):</p>
<ul>
<li>configure.ac: Build dricommon for gallium swrast</li>
</ul>
<p>Naohiro Aota (1):</p>
<ul>
<li>xmlpool/build: Make sure to set mo properly</li>
</ul>
<p>Paul Berry (2):</p>
<ul>
<li>glsl: Fix error checking on "flat" keyword to match GLSL ES 3.00, GLSL 1.50.</li>
<li>i965/gen7.5: Allow HW primitive restart for all primitive types.</li>
</ul>
<p>Paulo Zanoni (1):</p>
<ul>
<li>i965: make GT3 machines work as GT3 instead of GT2</li>
</ul>
<p>Rodrigo Vivi (2):</p>
<ul>
<li>i965: Add missing Haswell GT3 Desktop to IS_HSW_GT3 check.</li>
<li>i965: Adding more reserved PCI IDs for Haswell.</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>gallivm: fix out-of-bounds access with mirror_clamp_to_edge address mode</li>
</ul>
<p>Stéphane Marchesin (2):</p>
<ul>
<li>st/xlib: Fix upside down coordinates for CopySubBuffer</li>
<li>st/xlib: Flush the front buffer before doing CopySubBuffer</li>
</ul>
<p>Sven Joachim (1):</p>
<ul>
<li>mesa: Fix ieee fp on Alpha</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>mesa: fix type comparison errors in sub-texture error checking code</li>
</ul>
<p>Tom Stellard (2):</p>
<ul>
<li>gallivm: Fix build with LLVM &gt;= r180063</li>
<li>r300g/compiler: Prevent regalloc from swizzling texture operands v2</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>radeon: Initialize variables in radeon_llvm_context_init.</li>
</ul>
</div>
</body>
</html>

140
docs/relnotes/9.1.5.html Normal file

@@ -0,0 +1,140 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 9.1.5 Release Notes / July 17, 2013</h1>
<p>
Mesa 9.1.5 is a bug fix release which fixes bugs found since the 9.1.4 release.
</p>
<p>
Mesa 9.1 implements the OpenGL 3.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.1. OpenGL
3.1 is <strong>only</strong> available if requested at context creation
because GL_ARB_compatibility is not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
4ed2af5943141a85a21869053a2fc2eb MesaLib-9.1.5.tar.bz2
47181066acf3231d74e027b2033f9455 MesaLib-9.1.5.tar.gz
4c9c6615bd99215325250f87ed34058f MesaLib-9.1.5.zip
</pre>
<h2>New features</h2>
<p>None.</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=58384">Bug 58384</a> - [i965 Bisected]Oglc max_values(advanced.fragmentProgram.GL_MAX_PROGRAM_ENV_PARAMETERS_ARB) segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62647">Bug 62647</a> - Wrong rendering of Dota 2 on Wine (apitrace attached) - Intel IVB HD4000</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=63674">Bug 63674</a> - [IVB]frozen at the first frame when run Unigine-heaven 4.0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=65910">Bug 65910</a> - Killing weston-launch causes segv in desktop-shell</li>
</ul>
<h2>Changes</h2>
<p>The full set of changes can be viewed by using the following GIT command:</p>
<pre>
git log mesa-9.1.4..mesa-9.1.5
</pre>
<p>Anuj Phogat (1):</p>
<ul>
<li>mesa: Return ZeroVec/dummyReg instead of NULL pointer</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>svga: check for NaN shader immediates</li>
</ul>
<p>Carl Worth (3):</p>
<ul>
<li>cherry-ignore: Ignore previously backported patch</li>
<li>cherry-ignore: Drop two patches which we've decided not to include</li>
<li>mesa: Bump version to 9.1.5</li>
</ul>
<p>Chris Forbes (1):</p>
<ul>
<li>i965: fix alpha test for MRT</li>
</ul>
<p>Christoph Bumiller (1):</p>
<ul>
<li>r600g: x/y coordinates must be divided by block dim in dma blit</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>ra: Fix register spilling.</li>
</ul>
<p>Ian Romanick (6):</p>
<ul>
<li>docs: Add 9.1.4 release md5sums</li>
<li>glsl: Add a gl_shader_program parameter to _mesa_uniform_{merge,split}_location_offset</li>
<li>glsl: Add gl_shader_program::UniformLocationBaseScale</li>
<li>glsl: Generate smaller values for uniform locations</li>
<li>i965: Be more careful with the interleaved user array upload optimization</li>
<li>glsl: Move all var decls to the front of the IR list in reverse order</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>glsl/builtins: Fix ARB_texture_cube_map_array built-in availability.</li>
</ul>
<p>Kristian Høgsberg (1):</p>
<ul>
<li>wayland: Handle global_remove event as well</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>register_allocate: Fix the type of best_benefit.</li>
</ul>
<p>Paul Berry (1):</p>
<ul>
<li>glsl ES: Fix magnitude of gl_MaxVertexUniformVectors.</li>
</ul>
<p>Richard Sandiford (3):</p>
<ul>
<li>st/xlib Fix XIMage bytes-per-pixel calculation</li>
<li>st/xlib: Fix XImage stride calculation</li>
<li>st/dri/sw: Fix pitch calculation in drisw_update_tex_buffer</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>swrast: Fix memory leak.</li>
</ul>
</div>
</body>
</html>

168
docs/relnotes/9.1.6.html Normal file

@@ -0,0 +1,168 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 9.1.6 Release Notes / August 1, 2013</h1>
<p>
Mesa 9.1.6 is a bug fix release which fixes bugs found since the 9.1.5 release.
</p>
<p>
Mesa 9.1 implements the OpenGL 3.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.1. OpenGL
3.1 is <strong>only</strong> available if requested at context creation
because GL_ARB_compatibility is not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
443a2a352667294b53d56cb1a74114e9 MesaLib-9.1.6.tar.bz2
08d3069cccd6821e5f33e0840bca0718 MesaLib-9.1.6.tar.gz
90aa7a6d9878cdbfcb055312f356d6b9 MesaLib-9.1.6.zip
</pre>
<h2>New features</h2>
<p>None.</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=47824">Bug 47824</a> - osmesa using --enable-shared-glapi depends on libgl</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62362">Bug 62362</a> - Crash when using Wayland EGL platform</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=63435">Bug 63435</a> - [Regression since 9.0] Flickering in EGL OpenGL full-screen window with swap interval 1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64087">Bug 64087</a> - Webgl conformance shader-with-non-reserved-words crash when mesa is compiled without --enable-debug</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64330">Bug 64330</a> - WebGL snake demo crash in loop_analysis.cpp:506: bool is_loop_terminator(ir_if*): assertion „inst != __null“ failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=65236">Bug 65236</a> - [i965] Rendering artifacts in VDrift/GL2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66558">Bug 66558</a> - RS690: 3D artifacts when playing SuperTuxKart</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66847">Bug 66847</a> - compilation broken with llvm 3.3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66850">Bug 66850</a> - glGenerateMipmap crashes when using GL_TEXTURE_2D_ARRAY with compressed internal format</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66921">Bug 66921</a> - [r300g] Heroes of Newerth: HiZ related corruption</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=67283">Bug 67283</a> - VDPAU doesn't work on hybrid laptop through DRI_PRIME</li>
</ul>
<h2>Changes</h2>
<p>The full set of changes can be viewed by using the following GIT command:</p>
<pre>
git log mesa-9.1.5..mesa-9.1.6
</pre>
<p>Andreas Boll (1):</p>
<ul>
<li>configure.ac: Require llvm-3.2 for r600g/radeonsi llvm backends</li>
</ul>
<p>Brian Paul (4):</p>
<ul>
<li>mesa: handle 2D texture arrays in get_tex_rgba_compressed()</li>
<li>meta: handle 2D texture arrays in decompress_texture_image()</li>
<li>mesa: implement mipmap generation for compressed 2D array textures</li>
<li>mesa: improve free() cleanup in generate_mipmap_compressed()</li>
</ul>
<p>Carl Worth (7):</p>
<ul>
<li>docs: Add 9.1.5 release md5sums</li>
<li>Merge 'origin/9.1' into stable</li>
<li>cherry-ignore: Drop 13 patches from the pick list</li>
<li>get-pick-list.sh: Include commits mentionining "CC: mesa-stable..." in pick list</li>
<li>get-pick-list: Allow for non-whitespace between "CC:" and "mesa-stable"</li>
<li>get-pick-list: Ignore commits which CC mesa-stable unless they say "9.1"</li>
<li>Bump version to 9.1.6</li>
</ul>
<p>Chris Forbes (5):</p>
<ul>
<li>i965/Gen4: Zero extra coordinates for ir_tex</li>
<li>i965/vs: Fix flaky texture swizzling</li>
<li>i965/vs: set up sampler state pointer for Gen4/5.</li>
<li>i965/vs: Put lod parameter in the correct place for Gen4</li>
<li>i965/vs: Gen4/5: enable front colors if back colors are written</li>
</ul>
<p>Christoph Bumiller (1):</p>
<ul>
<li>nv50,nvc0: s/uint16/uint32 for constant buffer offset</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>gallium/vl: add prime support</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>egl: Restore "bogus" DRI2 invalidate event code.</li>
</ul>
<p>Jeremy Huddleston Sequoia (1):</p>
<ul>
<li>Apple: glFlush() is not needed with CGLFlushDrawable()</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>glsl: Classify "layout" like other identifiers.</li>
</ul>
<p>Kristian Høgsberg (1):</p>
<ul>
<li>egl-wayland: Fix left-over wl_display_roundtrip() usage</li>
</ul>
<p>Maarten Lankhorst (2):</p>
<ul>
<li>osmesa: link against static libglapi library too to get the gl exports</li>
<li>nvc0: force use of correct firmware file</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>r300g/swtcl: fix geometry corruption by uploading indices to a buffer</li>
<li>r300g/swtcl: fix a lockup in MSAA resolve</li>
<li>Revert "r300g: allow HiZ with a 16-bit zbuffer"</li>
<li>r600g: increase array size for shader inputs and outputs</li>
</ul>
<p>Matt Turner (2):</p>
<ul>
<li>i965: NULL check prog on shader compilation failure.</li>
<li>i965/vs: Print error if vertex shader fails to compile.</li>
</ul>
<p>Paul Berry (1):</p>
<ul>
<li>glsl: Handle empty if statement encountered during loop analysis.</li>
</ul>
</div>
</body>
</html>

206
docs/relnotes/9.2.1.html Normal file

@@ -0,0 +1,206 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 9.2.1 Release Notes / (October 4, 2013)</h1>
<p>
Mesa 9.2.1 is a bug fix release which fixes bugs found since the 9.2 release.
</p>
<p>
Mesa 9.2 implements the OpenGL 3.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.1. OpenGL
3.1 is <strong>only</strong> available if requested at context creation
because GL_ARB_compatibility is not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
e6cdfa84dfddd86e3d36ec7ff4b6478a MesaLib-9.2.1.tar.gz
dd4c82667d9c19c28a553b12eba3f8a0 MesaLib-9.2.1.tar.bz2
d9af0f5607f7d275793d293057ca9ac6 MesaLib-9.2.1.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66779">Bug 66779</a> - Use of uninitialized stack variable with brw_search_cache()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68233">Bug 68233</a> - Valgrind errors in mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68250">Bug 68250</a> - Automatic mipmap generation with texture compression produces borders that fade to black</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68637">Bug 68637</a> - [Bisected IVB/HSW]Unigine demo crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68753">Bug 68753</a> - [regression bisected] GLSL ES: structs members can't have precision qualifiers anymore in 9.2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69525">Bug 69525</a> - [GM45, bisected] Piglit tex-shadow2drect fails</li>
</ul>
<h2>Changes</h2>
<p>The full set of changes can be viewed by using the following GIT command:</p>
<pre>
git log mesa-9.2..mesa-9.2.1
</pre>
<p>Alex Deucher (1):</p>
<ul>
<li>radeon/winsys: pad IBs to a multiple of 8 DWs</li>
</ul>
<p>Andreas Boll (1):</p>
<ul>
<li>os: First check for __GLIBC__ and then for PIPE_OS_BSD</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>glsl: Allow precision qualifiers for sampler types</li>
</ul>
<p>Brian Paul (2):</p>
<ul>
<li>docs: minor fixes for 9.2 release notes</li>
<li>mesa: check for bufSize &gt; 0 in _mesa_GetSynciv()</li>
</ul>
<p>Carl Worth (3):</p>
<ul>
<li>cherry-ignore: Ignore a commit which appeared twice on master</li>
<li>Use -Bsymbolic when linking libEGL.so</li>
<li>mesa: Bump version to 9.2.1</li>
</ul>
<p>Chris Forbes (3):</p>
<ul>
<li>i965/fs: Gen4: Zero out extra coordinates when using shadow compare</li>
<li>i965: Fix cube array coordinate normalization</li>
<li>i965: fix bogus swizzle in brw_cubemap_normalize</li>
</ul>
<p>Christoph Bumiller (2):</p>
<ul>
<li>nvc0/ir: add f32 long immediate cannot saturate</li>
<li>nvc0: delete compute object on screen destruction</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>st/mesa: don't dereference stObj-&gt;pt if NULL</li>
</ul>
<p>Dominik Behr (1):</p>
<ul>
<li>glsl: propagate max_array_access through function calls</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>nouveau: initialise the nouveau_transfer maps</li>
</ul>
<p>Eric Anholt (4):</p>
<ul>
<li>mesa: Rip out more extension checking from texformat.c.</li>
<li>mesa: Don't choose S3TC for generic compression if we can't compress.</li>
<li>i965/gen4: Fix fragment program rectangle texture shadow compares.</li>
<li>i965: Reenable glBitmap() after the sRGB winsys enabling.</li>
</ul>
<p>Ian Romanick (7):</p>
<ul>
<li>docs: Add 9.2 release md5sums</li>
<li>Add .cherry-ignore file</li>
<li>mesa: Note that 89a665e should not be picked</li>
<li>glsl: Reallow precision qualifiers on structure members</li>
<li>mesa: Support GL_MAX_VERTEX_OUTPUT_COMPONENTS query with ES3</li>
<li>mesa: Remove all traces of GL_OES_matrix_get</li>
<li>mesa: Don't return any data for GL_SHADER_BINARY_FORMATS</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>nv30: find first unused texcoord rather than bailing if first is used</li>
<li>nv30: fix inconsistent setting of push-&gt;user_priv</li>
</ul>
<p>Joakim Sindholt (1):</p>
<ul>
<li>nvc0: fix blitctx memory leak</li>
</ul>
<p>Johannes Obermayr (1):</p>
<ul>
<li>st/gbm: Add $(WAYLAND_CFLAGS) for HAVE_EGL_PLATFORM_WAYLAND.</li>
</ul>
<p>Kenneth Graunke (5):</p>
<ul>
<li>i965/vs: Detect GRF sources in split_virtual_grfs send-from-GRF code.</li>
<li>i965/fs: Detect GRF sources in split_virtual_grfs send-from-GRF code.</li>
<li>i965/vec4: Only zero out unused message components when there are any.</li>
<li>i965: Fix brw_vs_prog_data_compare to actually check field members.</li>
<li>meta: Set correct viewport and projection in decompress_texture_image.</li>
</ul>
<p>Maarten Lankhorst (2):</p>
<ul>
<li>st/dri: do not create a new context for msaa copy</li>
<li>nvc0: restore viewport after blit</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>r600g: fix constant buffer cache flushing</li>
<li>r600g: fix texture buffer object cache flushing</li>
</ul>
<p>Paul Berry (1):</p>
<ul>
<li>i965: Initialize inout_offset parameter to brw_search_cache().</li>
</ul>
<p>Rico Schüller (1):</p>
<ul>
<li>glx: Initialize OpenGL version to 1.0</li>
</ul>
<p>Tiziano Bacocco (1):</p>
<ul>
<li>nvc0/ir: fix use after free in texture barrier insertion pass</li>
</ul>
<p>Torsten Duwe (1):</p>
<ul>
<li>wayland-egl.pc requires wayland-client.pc.</li>
</ul>
</div>
</body>
</html>

97
docs/relnotes/9.2.2.html Normal file

@@ -0,0 +1,97 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 9.2.2 Release Notes / (October 18, 2013)</h1>
<p>
Mesa 9.2.2 is a bug fix release which fixes bugs found since the 9.2.1 release.
</p>
<p>
Mesa 9.2 implements the OpenGL 3.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.1. OpenGL
3.1 is <strong>only</strong> available if requested at context creation
because GL_ARB_compatibility is not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69449">Bug 69449</a> - Valgrind error in program_resource_visitor::recursion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70411">Bug 70411</a> - glInvalidateFramebuffer fails with GL_INVALID_ENUM</li>
</ul>
<h2>Changes</h2>
<p>The full set of changes can be viewed by using the following GIT command:</p>
<pre>
git log mesa-9.2.1..mesa-9.2.2
</pre>
<p>Brian Paul (3):</p>
<ul>
<li>docs: add missing &lt;pre&gt; tag</li>
<li>svga: fix incorrect memcpy src in svga_buffer_upload_piecewise()</li>
<li>mesa: consolidate cube width=height error checking</li>
</ul>
<p>Carl Worth (3):</p>
<ul>
<li>docs: Add md5sums for 9.2.1 release</li>
<li>Bump version to 9.2.2</li>
</ul>
<p>Constantin Baranov (1):</p>
<ul>
<li>mesa: Add missing switch break in invalidate_framebuffer_storage()</li>
</ul>
<p>Eric Anholt (3):</p>
<ul>
<li>i965: Don't forget the cube map padding on gen5+.</li>
<li>mesa: Fix compiler warnings when ALIGN's alignment is "1 &lt;&lt; value".</li>
<li>i965: Fix 3D texture layout by more literally copying from the spec.</li>
</ul>
<p>Francisco Jerez (1):</p>
<ul>
<li>glsl: Fix usage of the wrong union member in program_resource_visitor::recursion.</li>
</ul>
<p>Tom Stellard (1):</p>
<ul>
<li>radeonsi: Use 'SI' as the LLVM processor for CIK on LLVM &lt;= 3.3</li>
</ul>
</div>
</body>
</html>


@@ -14,7 +14,7 @@
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 9.2 Release Notes / (date TBD)</h1>
<h1>Mesa 9.2 Release Notes / (August 27, 2013)</h1>
<p>
Mesa 9.2 is a new development release.
@@ -33,7 +33,9 @@ because GL_ARB_compatibility is not supported.
<h2>MD5 checksums</h2>
<pre>
tbd
4f93c6475ec656fc1f7b93aeffc9b6c4 MesaLib-9.2.0.tar.gz
4185b6aae890bc62a964f4b24cc1aca8 MesaLib-9.2.0.tar.bz2
3bc5339bc98b9c37777ffd14e3a8eca4 MesaLib-9.2.0.zip
</pre>
@@ -44,25 +46,179 @@ Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>GL_ARB_shading_language_420pack in all drivers that support GLSL 1.30.</li>
<li>GL_ARB_texture_buffer_range</li>
<li>GL_ARB_texture_multisample</li>
<li>GL_ARB_texture_storage_multisample</li>
<li>GL_ARB_texture_query_lod</li>
<li>GL_ARB_texture_storage on radeon, r200, and nouveau</li>
<li>GL_EXT_discard_framebuffer in all OpenGL ES (all versions) drivers</li>
<li>GL_EXT_framebuffer_multisample_blit_scaled on i965</li>
<li>Added new freedreno gallium driver</li>
<li>OSMesa interface for gallium llvmpipe/softpipe drivers</li>
<li>Gallium Heads-Up Display (HUD) feature for performance monitoring</li>
<li>Added support for UVD (2.2 and 3.0) video decoding on r600g and radeonsi through VDPAU (requires Kernel 3.10 or later)</li>
</ul>
<h2>Bug fixes</h2>
<p>TBD -- This list is likely incomplete.</p>
<p>Attempts have been made to <b>not</b> include bugs fixed in previous 9.1
releases or bugs that were regressions during 9.2 development. This list is
likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=41787">Bug 41787</a> - [llvmpipe] stencil broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=44618">Bug 44618</a> - Cross-compilation broken by glsl builtin_compiler</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=46632">Bug 46632</a> - Make the alignment checks for the readpixel blit fastpath a bit more lenient</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=47116">Bug 47116</a> - Enemy territory freezes with rs880 and commit fbebd431ec4e2e461a0cbcd5f3a04a000b8f6bbf</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=47248">Bug 47248</a> - autogen missing dependency on flex and bison, causes infinite loop in glsl build</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=48694">Bug 48694</a> - radeonsi_pipe.c:322:7: error: PIPE_CAP_DUAL_SOURCE_BLEND undeclared</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=50655">Bug 50655</a> - [r600g][RV670 HD3870] Ioquake games causes GPU lockup (waiting for 0x00003039 last fence id 0x00003030)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=51471">Bug 51471</a> - [965gm] Corrupted graphics in corners of screen with pixel shaders enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=51782">Bug 51782</a> - mesa-8.0.3: fails to compile against uclibc</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54240">Bug 54240</a> - [swrast] piglit fbo-generatemipmap-filtering regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=55503">Bug 55503</a> - Constant vertex attributes broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=55783">Bug 55783</a> - glEnable(GL_FRAMEBUFFER_SRGB) has no effect on the backbuffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=55825">Bug 55825</a> - [Bisected i965]Oglc max_values(advanced.fragmentProgram.GL_MAX_PROGRAM_ALU_INSTRUCTIONS_ARB) causes OOM-killer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=56920">Bug 56920</a> - [sandybridge][uxa] graphics very glitchy and always flickering</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=57753">Bug 57753</a> - leak in loop_analysis</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=57875">Bug 57875</a> - Second Life viewer bad rendering with git-ec83535</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=58666">Bug 58666</a> - rv670 + llvm = errors.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=58680">Bug 58680</a> - [IVB] Graphical glitches in 0 A.D</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=58872">Bug 58872</a> - Mac OS X configure: error: Couldn't find clock_gettime</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59322">Bug 59322</a> - r300g MSAA breaks Half-Life 2 in Wine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59364">Bug 59364</a> - [bisected] Mesa build fails: clientattrib.c:33:22: fatal error: indirect.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59439">Bug 59439</a> - glCopyPixels generates no fragments (occlusion_query_meta_fragments test fails)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59440">Bug 59440</a> - glBitmap generates no fragments (occlusion_query_meta_fragments test fails)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59494">Bug 59494</a> - [Bisected]Piglit glean_depthStencil fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59592">Bug 59592</a> - Radeon HD 5670: reproducable GPU lockups with htile enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59648">Bug 59648</a> - [SNB/IVB/HSW Bisected]Piglit spec/ARB_uniform_buffer/object_layout-std140-base-size-and-alignment fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59701">Bug 59701</a> - lp_test_arit fails on non-sse41 capable machines, breaking make check</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59737">Bug 59737</a> - [bisected] 0d108116bd80b757fb01a84a9f1946ef870b57b8 breaks osmesa when cross compiling</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59740">Bug 59740</a> - [i965 Bisected]Oglc api-error(negative.glEvalMesh) fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59851">Bug 59851</a> - AC_ARG_WITH misusage leading to mesa configure failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59873">Bug 59873</a> - [swrast] piglit ext_framebuffer_multisample-interpolation 0 centroid-edges regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=59876">Bug 59876</a> - glGetTexLevelParameteriv broken for indirect rendering</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60038">Bug 60038</a> - [osmesa] [git] building 32-bit mesa on 64 bit fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60047">Bug 60047</a> - [softpipe] piglit masked-clear regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60052">Bug 60052</a> - [Bisected]Piglit glx_extension_string_sanity fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60082">Bug 60082</a> - [ FAILED ] DispatchSanity_test.GL31_CORE</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60086">Bug 60086</a> - Wayland platform backend crashes if there's no back buffer during dri2_swap_buffers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60098">Bug 60098</a> - [softpipe] Unexpected PIPE_CAP 78 query</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60172">Bug 60172</a> - Planeshift: triangles where grass would be</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60200">Bug 60200</a> - radeon_bo with virtual address referencing mismatch</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60212">Bug 60212</a> - [Bisected] Weston black output</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60524">Bug 60524</a> - [softpipe] piglit depthstencil-render-miplevels 146 s=z24_s8 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60527">Bug 60527</a> - [softpipe] fbo-stencil GL_DEPTH24_STENCIL8 clear regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60633">Bug 60633</a> - EXT_texture_sRGB does not work in game The Cave on IvyBridge</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60737">Bug 60737</a> - In GLSL ES, a missing FS precision qualifier does not generate an error</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60866">Bug 60866</a> - GLSL performance issues for uniform buffer objects</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61036">Bug 61036</a> - Shader fails to build in LLVMpipe, aborts program</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61200">Bug 61200</a> - insufficient linking of libxatracker.so</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61635">Bug 61635</a> - glVertexAttribPointer(id, GL_UNSIGNED_BYTE, GL_FALSE,...) does not work</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62466">Bug 62466</a> - r600g hyperz lockups with KSP 0.19</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62669">Bug 62669</a> - HyperZ freeze when playing PrBoom-Plus demo with lots of monsters</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62721">Bug 62721</a> - GPU lockup in Minecraft 1.5.1 with HyperZ</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62830">Bug 62830</a> - [i965 bisected] Wrong Lightning on Freespace 2 SCP (patch attached)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=63124">Bug 63124</a> - [r600g] HyperZ lockup on REDWOOD in Half Life 2 Deathmatch</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=63702">Bug 63702</a> - tiling2d in radeon trash vdpau UVD textures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64935">Bug 64935</a> - [swrast] s_texfetch.c:1335: set_fetch_functions: Assertion `texImage-&gt;FetchTexel' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64959">Bug 64959</a> - Cannot build against EGL without X11</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=65112">Bug 65112</a> - glcpp hangs parsing line continuations</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=65958">Bug 65958</a> - GPU Lockup on Trinity 7500G</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66450">Bug 66450</a> - JUNIPER UVD accelerated playback of MPEG 1/2 streams does not work</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66606">Bug 66606</a> - [i965 bisected]GLBenchmark 2.5.1/2.7.0 sometimes render error with gnome-session enabling SNA</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66713">Bug 66713</a> - Team Fortress 2 crashes with r600-sb on HD4850</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=67354">Bug 67354</a> - glsl_parser.cpp is broken with bison 3.0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=67548">Bug 67548</a> - glGetAttribLocation seems to be broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=67927">Bug 67927</a> - R600_DEBUG=sb: Celestia show 2 earths, one wrongly rendered</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=67934">Bug 67934</a> - [SNB/IVB/HSW 9.2 Bisected]Ogles2conform/GL2Tests/glUniform/glUniform.test fails with gnome-session enable compositing</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68162">Bug 68162</a> - [radeonsi] texture rendering is broken in Source-Engine games</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68195">Bug 68195</a> - piglit tests vs-struct-pad and fs-struct-pad both fail</li>
</ul>
<h2>Changes</h2>
<ul>
<li>Removed d3d1x state tracker (unused, unmaintained and broken)</li>
<li>Removed GL_EXT_clip_volume_hint because no driver had enabled it since
2007.</li>
<li>Removed GL_MESA_resize_buffers because it was only really implemented by
the (unsupported) GDI driver.</li>
<li>GL_EXT_separate_shader_objects has been removed from all Gallium drivers,
because it disallows a critical GLSL shader optimization.
GL_ARB_separate_shader_objects doesn't have this issue.</li>
<li>i965 Gen6+ requires Kernel 3.6 or later. (92d2f5a)</li>
</ul>
</div>

View File

@@ -12,7 +12,7 @@ Contact
Status
Shipping (since Mesa version 2.2)
Obsolete.
Version

View File

@@ -109,8 +109,8 @@ typedef void *EGLNativeDisplayType;
#ifdef MESA_EGL_NO_X11_HEADERS
typedef void *EGLNativeDisplayType;
typedef khronos_uint32_t EGLNativePixmapType;
typedef khronos_uint32_t EGLNativeWindowType;
typedef khronos_uintptr_t EGLNativePixmapType;
typedef khronos_uintptr_t EGLNativeWindowType;
#else

File diff suppressed because it is too large

View File

@@ -552,6 +552,8 @@ struct __DRIuseInvalidateExtensionRec {
#define __DRI_ATTRIB_RGBA_BIT 0x01
#define __DRI_ATTRIB_COLOR_INDEX_BIT 0x02
#define __DRI_ATTRIB_LUMINANCE_BIT 0x04
#define __DRI_ATTRIB_FLOAT_BIT 0x08
#define __DRI_ATTRIB_UNSIGNED_FLOAT_BIT 0x10
/* __DRI_ATTRIB_CONFIG_CAVEAT */
#define __DRI_ATTRIB_SLOW_BIT 0x01
@@ -983,7 +985,6 @@ struct __DRIdri2ExtensionRec {
#define __DRI_IMAGE_FOURCC_YUV410 0x39565559
#define __DRI_IMAGE_FOURCC_YUV411 0x31315559
#define __DRI_IMAGE_FOURCC_YUV420 0x32315559
#define __DRI_IMAGE_FOURCC_YVU420 0x32315659
#define __DRI_IMAGE_FOURCC_YUV422 0x36315559
#define __DRI_IMAGE_FOURCC_YUV444 0x34325559
#define __DRI_IMAGE_FOURCC_NV12 0x3231564e

View File

@@ -70,3 +70,29 @@ CHIPSET(0x6664, HAINAN_6664, HAINAN)
CHIPSET(0x6665, HAINAN_6665, HAINAN)
CHIPSET(0x6667, HAINAN_6667, HAINAN)
CHIPSET(0x666F, HAINAN_666F, HAINAN)
CHIPSET(0x6640, BONAIRE_6640, BONAIRE)
CHIPSET(0x6641, BONAIRE_6641, BONAIRE)
CHIPSET(0x6649, BONAIRE_6649, BONAIRE)
CHIPSET(0x6650, BONAIRE_6650, BONAIRE)
CHIPSET(0x6651, BONAIRE_6651, BONAIRE)
CHIPSET(0x6658, BONAIRE_6658, BONAIRE)
CHIPSET(0x665C, BONAIRE_665C, BONAIRE)
CHIPSET(0x665D, BONAIRE_665D, BONAIRE)
CHIPSET(0x9830, KABINI_9830, KABINI)
CHIPSET(0x9831, KABINI_9831, KABINI)
CHIPSET(0x9832, KABINI_9832, KABINI)
CHIPSET(0x9833, KABINI_9833, KABINI)
CHIPSET(0x9834, KABINI_9834, KABINI)
CHIPSET(0x9835, KABINI_9835, KABINI)
CHIPSET(0x9836, KABINI_9836, KABINI)
CHIPSET(0x9837, KABINI_9837, KABINI)
CHIPSET(0x9838, KABINI_9838, KABINI)
CHIPSET(0x9839, KABINI_9839, KABINI)
CHIPSET(0x983A, KABINI_983A, KABINI)
CHIPSET(0x983B, KABINI_983B, KABINI)
CHIPSET(0x983C, KABINI_983C, KABINI)
CHIPSET(0x983D, KABINI_983D, KABINI)
CHIPSET(0x983E, KABINI_983E, KABINI)
CHIPSET(0x983F, KABINI_983F, KABINI)

View File

@@ -53,7 +53,7 @@ AC_DEFUN([AX_PROG_FLEX], [
AC_REQUIRE([AC_PROG_EGREP])
AC_CACHE_CHECK([if flex is the lexer generator],[ax_cv_prog_flex],[
AS_IF([$LEX --version 2>/dev/null | $EGREP -q '^flex '],
AS_IF([$LEX --version 2>/dev/null | $EGREP -q '^\<flex\>'],
[ax_cv_prog_flex=yes], [ax_cv_prog_flex=no])
])
AS_IF([test "$ax_cv_prog_flex" = "yes"],

View File

@@ -29,6 +29,10 @@ if HAVE_DRI_GLX
SUBDIRS += glx
endif
if HAVE_EGL_PLATFORM_WAYLAND
SUBDIRS += egl/wayland
endif
if HAVE_GBM
SUBDIRS += gbm
endif

View File

@@ -21,8 +21,4 @@
SUBDIRS=
if HAVE_EGL_PLATFORM_WAYLAND
SUBDIRS += wayland
endif
SUBDIRS += drivers main

View File

@@ -28,6 +28,7 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/egl/wayland/wayland-drm \
-I$(top_builddir)/src/egl/wayland/wayland-drm \
$(DEFINES) \
$(VISIBILITY_CFLAGS) \
$(LIBDRM_CFLAGS) \
$(LIBUDEV_CFLAGS) \
$(LIBKMS_CFLAGS) \

View File

@@ -75,7 +75,7 @@ EGLint dri2_to_egl_attribute_map[] = {
0, /* __DRI_ATTRIB_TRANSPARENT_GREEN_VALUE */
0, /* __DRI_ATTRIB_TRANSPARENT_BLUE_VALUE */
0, /* __DRI_ATTRIB_TRANSPARENT_ALPHA_VALUE */
0, /* __DRI_ATTRIB_FLOAT_MODE */
0, /* __DRI_ATTRIB_FLOAT_MODE (deprecated) */
0, /* __DRI_ATTRIB_RED_MASK */
0, /* __DRI_ATTRIB_GREEN_MASK */
0, /* __DRI_ATTRIB_BLUE_MASK */
@@ -141,7 +141,7 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id,
else if (value & __DRI_ATTRIB_LUMINANCE_BIT)
value = EGL_LUMINANCE_BUFFER;
else
/* not valid */;
return NULL;
_eglSetConfigKey(&base, EGL_COLOR_BUFFER_TYPE, value);
break;

View File

@@ -38,7 +38,6 @@
#include <xf86drm.h>
#include <i915_drm.h>
#include <radeon_drm.h>
#include <gralloc_drm.h>
#include "egl_dri2.h"
#include "gralloc_drm.h"
@@ -57,9 +56,9 @@ get_format_bpp(int native)
case HAL_PIXEL_FORMAT_RGB_888:
bpp = 3;
break;
case HAL_PIXEL_FORMAT_DRM_NV12:
case HAL_PIXEL_FORMAT_YV12:
case HAL_PIXEL_FORMAT_RGB_565:
case HAL_PIXEL_FORMAT_RGBA_5551:
case HAL_PIXEL_FORMAT_RGBA_4444:
bpp = 2;
break;
default:
@@ -340,7 +339,6 @@ dri2_create_image_android_native_buffer(_EGLDisplay *disp, _EGLContext *ctx,
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
struct dri2_egl_image *dri2_img;
int name;
uint32_t offsets[3], strides[3], handles[3], tmp;
EGLint format;
if (ctx != NULL) {
@@ -369,12 +367,6 @@ dri2_create_image_android_native_buffer(_EGLDisplay *disp, _EGLContext *ctx,
/* see the table in droid_add_configs_for_visuals */
switch (buf->format) {
case HAL_PIXEL_FORMAT_DRM_NV12:
format = __DRI_IMAGE_FOURCC_NV12;
break;
case HAL_PIXEL_FORMAT_YV12:
format = __DRI_IMAGE_FOURCC_YVU420;
break;
case HAL_PIXEL_FORMAT_BGRA_8888:
format = __DRI_IMAGE_FORMAT_ARGB8888;
break;
@@ -388,6 +380,8 @@ dri2_create_image_android_native_buffer(_EGLDisplay *disp, _EGLContext *ctx,
format = __DRI_IMAGE_FORMAT_XBGR8888;
break;
case HAL_PIXEL_FORMAT_RGB_888:
case HAL_PIXEL_FORMAT_RGBA_5551:
case HAL_PIXEL_FORMAT_RGBA_4444:
/* unsupported */
default:
_eglLog(_EGL_WARNING, "unsupported native buffer format 0x%x", buf->format);
@@ -406,70 +400,14 @@ dri2_create_image_android_native_buffer(_EGLDisplay *disp, _EGLContext *ctx,
return NULL;
}
switch (format) {
case __DRI_IMAGE_FORMAT_ARGB8888:
case __DRI_IMAGE_FORMAT_RGB565:
case __DRI_IMAGE_FORMAT_ABGR8888:
case __DRI_IMAGE_FORMAT_XBGR8888:
dri2_img->dri_image =
dri2_dpy->image->createImageFromName(dri2_dpy->dri_screen,
buf->width,
buf->height,
format,
name,
buf->stride,
dri2_img);
break;
case __DRI_IMAGE_FOURCC_YVU420:
offsets[0] = offsets[1] = offsets[2] = 0;
strides[0] = strides[1] = strides[2] = 0;
gralloc_drm_resolve_format(buf->handle, &strides[0], &offsets[0],
&handles[0]);
/* u anv v are given in wrong order than what we need here thus this:*/
tmp = offsets[1];
offsets[1] = offsets[2];
offsets[2] = tmp;
tmp = strides[1];
strides[1] = strides[2];
strides[2] = tmp;
dri2_img->dri_image =
dri2_dpy->image->createImageFromNames(dri2_dpy->dri_screen,
buf->width,
buf->height,
format,
&name, 1,
(int*)strides,
(int*)offsets,
dri2_img);
break;
case __DRI_IMAGE_FOURCC_NV12:
offsets[0] = offsets[1] = offsets[2] = 0;
strides[0] = strides[1] = strides[2] = 0;
gralloc_drm_resolve_format(buf->handle, &strides[0], &offsets[0],
&handles[0]);
dri2_img->dri_image =
dri2_dpy->image->createImageFromNames(dri2_dpy->dri_screen,
buf->width,
buf->height,
format,
&name, 1,
(int*)strides,
(int*)offsets,
dri2_img);
break;
default:
/* We should never arrive here */
_eglLog(_EGL_WARNING, "unsupported native buffer format 0x%x",
buf->format);
break;
}
dri2_img->dri_image =
dri2_dpy->image->createImageFromName(dri2_dpy->dri_screen,
buf->width,
buf->height,
format,
name,
buf->stride,
dri2_img);
if (!dri2_img->dri_image) {
free(dri2_img);
_eglError(EGL_BAD_ALLOC, "droid_create_image_mesa_drm");

View File

@@ -715,8 +715,15 @@ registry_handle_global(void *data, struct wl_registry *registry, uint32_t name,
}
}
static void
registry_handle_global_remove(void *data, struct wl_registry *registry,
uint32_t name)
{
}
static const struct wl_registry_listener registry_listener = {
registry_handle_global
registry_handle_global,
registry_handle_global_remove
};
EGLBoolean

View File

@@ -212,7 +212,7 @@ dri2_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
dri2_surf->drawable, s.data->root,
dri2_surf->base.Width, dri2_surf->base.Height);
} else {
dri2_surf->drawable = (xcb_drawable_t)window;
dri2_surf->drawable = window;
}
if (dri2_dpy->dri2) {
@@ -743,6 +743,20 @@ dri2_swap_buffers_msc(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw,
free(reply);
}
/* Since we aren't watching for the server's invalidate events like we're
* supposed to (due to XCB providing no mechanism for filtering the events
* the way xlib does), and SwapBuffers is a common cause of invalidate
* events, just shove one down to the driver, even though we haven't told
* the driver that we're the kind of loader that provides reliable
* invalidate events. This causes the driver to request buffers again at
* its next draw, so that we get the correct buffers if a pageflip
* happened. The driver should still be using the viewport hack to catch
* window resizes.
*/
if (dri2_dpy->flush &&
dri2_dpy->flush->base.version >= 3 && dri2_dpy->flush->invalidate)
(*dri2_dpy->flush->invalidate)(dri2_surf->dri_drawable);
return swap_count;
}
@@ -836,10 +850,10 @@ dri2_copy_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf,
(*dri2_dpy->flush->flush)(dri2_surf->dri_drawable);
gc = xcb_generate_id(dri2_dpy->conn);
xcb_create_gc(dri2_dpy->conn, gc, (xcb_drawable_t)target, 0, NULL);
xcb_create_gc(dri2_dpy->conn, gc, target, 0, NULL);
xcb_copy_area(dri2_dpy->conn,
dri2_surf->drawable,
(xcb_drawable_t)target,
target,
gc,
0, 0,
0, 0,

View File

@@ -22,6 +22,7 @@
AM_CFLAGS = \
-I$(top_srcdir)/include \
-I$(top_srcdir)/src/egl/main \
$(VISIBILITY_CFLAGS) \
$(X11_CFLAGS) \
$(DEFINES)

View File

@@ -121,13 +121,11 @@ endif
# r300g/r600g/radeonsi
ifneq ($(filter r300g r600g radeonsi, $(MESA_GPU_DRIVERS)),)
gallium_DRIVERS += libmesa_winsys_radeon
LOCAL_SHARED_LIBRARIES += libdrm_radeon
ifneq ($(filter r300g, $(MESA_GPU_DRIVERS)),)
gallium_DRIVERS += libmesa_pipe_r300
endif
ifneq ($(filter r600g, $(MESA_GPU_DRIVERS)),)
gallium_DRIVERS += libmesa_pipe_r600 libmesa_pipe_radeon
LOCAL_SHARED_LIBRARIES += libstlport
gallium_DRIVERS += libmesa_pipe_r600
endif
ifneq ($(filter radeonsi, $(MESA_GPU_DRIVERS)),)
gallium_DRIVERS += libmesa_pipe_radeonsi

View File

@@ -29,6 +29,7 @@ AM_CFLAGS = \
-I$(top_srcdir)/include \
-I$(top_srcdir)/src/gbm/main \
$(DEFINES) \
$(VISIBILITY_CFLAGS) \
$(EGL_CFLAGS) \
-D_EGL_NATIVE_PLATFORM=$(EGL_NATIVE_PLATFORM) \
-D_EGL_DRIVER_SEARCH_DIR=\"$(EGL_DRIVER_INSTALL_DIR)\" \
@@ -74,7 +75,7 @@ libEGL_la_SOURCES = \
libEGL_la_LIBADD = \
$(EGL_LIB_DEPS)
libEGL_la_LDFLAGS = -version-number 1:0 -no-undefined
libEGL_la_LDFLAGS = -Wl,-Bsymbolic -version-number 1:0 -no-undefined
if HAVE_EGL_PLATFORM_X11
AM_CFLAGS += -DHAVE_X11_PLATFORM

View File

@@ -1,6 +1,7 @@
AM_CFLAGS = -I$(top_srcdir)/src/egl/main \
-I$(top_srcdir)/include \
$(DEFINES) \
$(VISIBILITY_CFLAGS) \
$(WAYLAND_CFLAGS)
noinst_LTLIBRARIES = libwayland-drm.la

View File

@@ -2,6 +2,7 @@ pkgconfigdir = $(libdir)/pkgconfig
pkgconfig_DATA = wayland-egl.pc
AM_CFLAGS = $(DEFINES) \
$(VISIBILITY_CFLAGS) \
$(WAYLAND_CFLAGS)
lib_LTLIBRARIES = libwayland-egl.la

View File

@@ -6,5 +6,6 @@ includedir=@includedir@
Name: wayland-egl
Description: Mesa wayland-egl library
Version: @VERSION@
Requires: wayland-client
Libs: -L${libdir} -lwayland-egl
Cflags: -I${includedir}

View File

@@ -61,7 +61,7 @@ ifneq ($(filter r300g, $(MESA_GPU_DRIVERS)),)
SUBDIRS += drivers/r300
endif
ifneq ($(filter r600g, $(MESA_GPU_DRIVERS)),)
SUBDIRS += drivers/r600 drivers/radeon
SUBDIRS += drivers/r600
endif
ifneq ($(filter radeonsi, $(MESA_GPU_DRIVERS)),)
SUBDIRS += drivers/radeonsi

View File

@@ -38,13 +38,17 @@ libgallium_la_SOURCES += \
endif
indices/u_indices_gen.c: $(srcdir)/indices/u_indices_gen.py
$(MKDIR_P) indices
$(AM_V_GEN) $(PYTHON2) $< > $@
indices/u_unfilled_gen.c: $(srcdir)/indices/u_unfilled_gen.py
$(MKDIR_P) indices
$(AM_V_GEN) $(PYTHON2) $< > $@
util/u_format_srgb.c: $(srcdir)/util/u_format_srgb.py
$(MKDIR_P) util
$(AM_V_GEN) $(PYTHON2) $< > $@
util/u_format_table.c: $(srcdir)/util/u_format_table.py $(srcdir)/util/u_format_pack.py $(srcdir)/util/u_format_parse.py $(srcdir)/util/u_format.csv
$(MKDIR_P) util
$(AM_V_GEN) $(PYTHON2) $(srcdir)/util/u_format_table.py $(srcdir)/util/u_format.csv > $@

View File

@@ -44,6 +44,7 @@ C_SOURCES := \
hud/hud_fps.c \
hud/hud_driver_query.c \
os/os_misc.c \
os/os_process.c \
os/os_time.c \
pipebuffer/pb_buffer_fenced.c \
pipebuffer/pb_buffer_malloc.c \
@@ -163,6 +164,7 @@ GENERATED_SOURCES := \
GALLIVM_SOURCES := \
gallivm/lp_bld_arit.c \
gallivm/lp_bld_arit_overflow.c \
gallivm/lp_bld_assert.c \
gallivm/lp_bld_bitarit.c \
gallivm/lp_bld_const.c \
@@ -171,6 +173,7 @@ GALLIVM_SOURCES := \
gallivm/lp_bld_format_aos.c \
gallivm/lp_bld_format_aos_array.c \
gallivm/lp_bld_format_float.c \
gallivm/lp_bld_format_srgb.c \
gallivm/lp_bld_format_soa.c \
gallivm/lp_bld_format_yuv.c \
gallivm/lp_bld_gather.c \

View File

@@ -111,6 +111,7 @@ struct cso_context {
void *velements, *velements_saved;
struct pipe_query *render_condition, *render_condition_saved;
uint render_condition_mode, render_condition_mode_saved;
boolean render_condition_cond, render_condition_cond_saved;
struct pipe_clip_state clip;
struct pipe_clip_state clip_saved;
@@ -723,13 +724,17 @@ void cso_restore_stencil_ref(struct cso_context *ctx)
}
void cso_set_render_condition(struct cso_context *ctx,
struct pipe_query *query, uint mode)
struct pipe_query *query,
boolean condition, uint mode)
{
struct pipe_context *pipe = ctx->pipe;
if (ctx->render_condition != query || ctx->render_condition_mode != mode) {
pipe->render_condition(pipe, query, mode);
if (ctx->render_condition != query ||
ctx->render_condition_mode != mode ||
ctx->render_condition_cond != condition) {
pipe->render_condition(pipe, query, condition, mode);
ctx->render_condition = query;
ctx->render_condition_cond = condition;
ctx->render_condition_mode = mode;
}
}
@@ -737,12 +742,14 @@ void cso_set_render_condition(struct cso_context *ctx,
void cso_save_render_condition(struct cso_context *ctx)
{
ctx->render_condition_saved = ctx->render_condition;
ctx->render_condition_cond_saved = ctx->render_condition_cond;
ctx->render_condition_mode_saved = ctx->render_condition_mode;
}
void cso_restore_render_condition(struct cso_context *ctx)
{
cso_set_render_condition(ctx, ctx->render_condition_saved,
ctx->render_condition_cond_saved,
ctx->render_condition_mode_saved);
}
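
Reader's note: the hunks above add a third cached field (the condition flag) to the redundant-state filter in cso_set_render_condition(). The pattern can be sketched in isolation as follows. This is only an illustration: struct demo_query, struct demo_cache, demo_set_render_condition() and the driver_calls counter are hypothetical stand-ins, not Gallium types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an opaque query object. */
struct demo_query { int id; };

/* Hypothetical cache of the last state that was forwarded. */
struct demo_cache {
   const struct demo_query *query;
   bool condition;
   unsigned mode;
   unsigned driver_calls;   /* counts how often state was forwarded */
};

/* Forward the state only when at least one of the three cached
 * fields differs from the requested value; otherwise do nothing. */
static void
demo_set_render_condition(struct demo_cache *c,
                          const struct demo_query *query,
                          bool condition, unsigned mode)
{
   assert(c != NULL);
   if (c->query != query ||
       c->mode != mode ||
       c->condition != condition) {
      c->driver_calls++;        /* stands in for the driver callback */
      c->query = query;
      c->condition = condition;
      c->mode = mode;
   }
}
```

Without the comparison on the condition flag, a call that changes only that flag would be filtered out as redundant, which is presumably why the diff adds it to both the test and the saved state.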

View File

@@ -170,7 +170,8 @@ void cso_save_stencil_ref(struct cso_context *cso);
void cso_restore_stencil_ref(struct cso_context *cso);
void cso_set_render_condition(struct cso_context *cso,
struct pipe_query *query, uint mode);
struct pipe_query *query,
boolean condition, uint mode);
void cso_save_render_condition(struct cso_context *cso);
void cso_restore_render_condition(struct cso_context *cso);

View File

@@ -58,7 +58,7 @@ draw_get_option_use_llvm(void)
#ifdef PIPE_ARCH_X86
util_cpu_detect();
/* require SSE2 due to LLVM PR6960. */
/* require SSE2 due to LLVM PR6960. XXX Might be fixed by now? */
if (!util_cpu_caps.has_sse2)
value = FALSE;
#endif
@@ -78,6 +78,9 @@ draw_create_context(struct pipe_context *pipe, boolean try_llvm)
if (draw == NULL)
goto err_out;
/* we need correct cpu caps for disabling denorms in draw_vbo() */
util_cpu_detect();
#if HAVE_LLVM
if (try_llvm && draw_get_option_use_llvm()) {
draw->llvm = draw_llvm_create(draw);
@@ -138,6 +141,7 @@ boolean draw_init(struct draw_context *draw)
draw->clip_z = TRUE;
draw->pt.user.planes = (float (*) [DRAW_TOTAL_CLIP_PLANES][4]) &(draw->plane[0]);
draw->pt.user.eltMax = ~0;
if (!draw_pipeline_init( draw ))
return FALSE;
@@ -738,6 +742,7 @@ draw_current_shader_clipvertex_output(const struct draw_context *draw)
uint
draw_current_shader_clipdistance_output(const struct draw_context *draw, int index)
{
debug_assert(index < PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT);
if (draw->gs.geometry_shader)
return draw->gs.geometry_shader->clipdistance_output[index];
return draw->vs.clipdistance_output[index];
@@ -756,6 +761,7 @@ draw_current_shader_num_written_clipdistances(const struct draw_context *draw)
uint
draw_current_shader_culldistance_output(const struct draw_context *draw, int index)
{
debug_assert(index < PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT);
if (draw->gs.geometry_shader)
return draw->gs.geometry_shader->culldistance_output[index];
return draw->vs.vertex_shader->culldistance_output[index];

View File

@@ -792,13 +792,13 @@ draw_create_geometry_shader(struct draw_context *draw,
if (gs->info.output_semantic_name[i] == TGSI_SEMANTIC_VIEWPORT_INDEX)
gs->viewport_index_output = i;
if (gs->info.output_semantic_name[i] == TGSI_SEMANTIC_CLIPDIST) {
if (gs->info.output_semantic_index[i] == 0)
gs->clipdistance_output[0] = i;
else
gs->clipdistance_output[1] = i;
debug_assert(gs->info.output_semantic_index[i] <
PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT);
gs->clipdistance_output[gs->info.output_semantic_index[i]] = i;
}
if (gs->info.output_semantic_name[i] == TGSI_SEMANTIC_CULLDIST) {
debug_assert(gs->info.output_semantic_index[i] < Elements(gs->culldistance_output));
debug_assert(gs->info.output_semantic_index[i] <
PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT);
gs->culldistance_output[gs->info.output_semantic_index[i]] = i;
}
}

View File

@@ -67,8 +67,8 @@ struct draw_geometry_shader {
struct tgsi_shader_info info;
unsigned position_output;
unsigned viewport_index_output;
unsigned clipdistance_output[2];
unsigned culldistance_output[2];
unsigned clipdistance_output[PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT];
unsigned culldistance_output[PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT];
unsigned max_output_vertices;
unsigned primitive_boundary;

View File

@@ -32,6 +32,7 @@
#include "draw_gs.h"
#include "gallivm/lp_bld_arit.h"
#include "gallivm/lp_bld_arit_overflow.h"
#include "gallivm/lp_bld_logic.h"
#include "gallivm/lp_bld_const.h"
#include "gallivm/lp_bld_swizzle.h"
@@ -673,6 +674,7 @@ generate_vs(struct draw_llvm_variant *variant,
static void
generate_fetch(struct gallivm_state *gallivm,
struct draw_context *draw,
LLVMValueRef vbuffers_ptr,
LLVMValueRef *res,
struct pipe_vertex_element *velem,
@@ -695,35 +697,58 @@ generate_fetch(struct gallivm_state *gallivm,
LLVMValueRef buffer_size = draw_jit_dvbuffer_size(gallivm, vbuffer_ptr);
LLVMValueRef stride;
LLVMValueRef buffer_overflowed;
LLVMValueRef needed_buffer_size;
LLVMValueRef temp_ptr =
lp_build_alloca(gallivm,
lp_build_vec_type(gallivm, lp_float32_vec4_type()), "");
LLVMValueRef ofbit = NULL;
struct lp_build_if_state if_ctx;
if (velem->instance_divisor) {
/* array index = instance_id / instance_divisor */
index = LLVMBuildUDiv(builder, instance_id,
lp_build_const_int32(gallivm, velem->instance_divisor),
"instance_divisor");
/* Index is equal to the start instance plus the number of current
* instance divided by the divisor. In this case we compute it as:
* index = start_instance + ((instance_id - start_instance) / divisor)
*/
LLVMValueRef current_instance;
index = lp_build_const_int32(gallivm, draw->start_instance);
current_instance = LLVMBuildSub(builder, instance_id, index, "");
current_instance = LLVMBuildUDiv(builder, current_instance,
lp_build_const_int32(gallivm, velem->instance_divisor),
"instance_divisor");
index = LLVMBuildAdd(builder, index, current_instance, "instance");
}
stride = LLVMBuildMul(builder, vb_stride, index, "");
stride = lp_build_umul_overflow(gallivm, vb_stride, index, &ofbit);
stride = lp_build_uadd_overflow(gallivm, stride, vb_buffer_offset, &ofbit);
stride = lp_build_uadd_overflow(
gallivm, stride,
lp_build_const_int32(gallivm, velem->src_offset), &ofbit);
needed_buffer_size = lp_build_uadd_overflow(
gallivm, stride,
lp_build_const_int32(gallivm,
util_format_get_blocksize(velem->src_format)),
&ofbit);
stride = LLVMBuildAdd(builder, stride,
vb_buffer_offset,
"");
stride = LLVMBuildAdd(builder, stride,
lp_build_const_int32(gallivm, velem->src_offset),
"");
buffer_overflowed = LLVMBuildICmp(builder, LLVMIntUGE,
stride, buffer_size,
buffer_overflowed = LLVMBuildICmp(builder, LLVMIntUGT,
needed_buffer_size, buffer_size,
"buffer_overflowed");
/*
lp_build_printf(gallivm, "vbuf index = %d, stride is %d\n", indices, stride);
lp_build_print_value(gallivm, " buffer size = ", buffer_size);
buffer_overflowed = LLVMBuildOr(builder, buffer_overflowed, ofbit, "");
#if 0
lp_build_printf(gallivm, "vbuf index = %u, vb_stride is %u\n",
index, vb_stride);
lp_build_printf(gallivm, " vb_buffer_offset = %u, src_offset is %u\n",
vb_buffer_offset,
lp_build_const_int32(gallivm, velem->src_offset));
lp_build_print_value(gallivm, " blocksize = ",
lp_build_const_int32(
gallivm,
util_format_get_blocksize(velem->src_format)));
lp_build_printf(gallivm, " instance_id = %u\n", instance_id);
lp_build_printf(gallivm, " stride = %u\n", stride);
lp_build_printf(gallivm, " buffer size = %u\n", buffer_size);
lp_build_printf(gallivm, " needed_buffer_size = %u\n", needed_buffer_size);
lp_build_print_value(gallivm, " buffer overflowed = ", buffer_overflowed);
*/
#endif
lp_build_if(&if_ctx, gallivm, buffer_overflowed);
{
@@ -1595,6 +1620,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
if (elts) {
start = zero;
end = fetch_count;
count = fetch_count;
}
else {
end = lp_build_add(&bld, start, count);
@@ -1604,7 +1630,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
fetch_max = LLVMBuildSub(builder, end, one, "fetch_max");
lp_build_loop_begin(&lp_loop, gallivm, start);
lp_build_loop_begin(&lp_loop, gallivm, zero);
{
LLVMValueRef inputs[PIPE_MAX_SHADER_INPUTS][TGSI_NUM_CHANNELS];
LLVMValueRef aos_attribs[PIPE_MAX_SHADER_INPUTS][LP_MAX_VECTOR_WIDTH / 32] = { { 0 } };
@@ -1612,10 +1638,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
LLVMValueRef clipmask; /* holds the clipmask value */
const LLVMValueRef (*ptr_aos)[TGSI_NUM_CHANNELS];
if (elts)
io_itr = lp_loop.counter;
else
io_itr = LLVMBuildSub(builder, lp_loop.counter, start, "");
io_itr = lp_loop.counter;
io = LLVMBuildGEP(builder, io_ptr, &io_itr, 1, "");
#if DEBUG_STORE
@@ -1628,6 +1651,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
LLVMBuildAdd(builder,
lp_loop.counter,
lp_build_const_int32(gallivm, i), "");
true_index = LLVMBuildAdd(builder, start, true_index, "");
/* make sure we're not out of bounds which can happen
* if fetch_count % 4 != 0, because on the last iteration
@@ -1647,7 +1671,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
gallivm,
lp_build_vec_type(gallivm, lp_type_int(32)), "");
struct lp_build_if_state if_ctx;
index_overflowed = LLVMBuildICmp(builder, LLVMIntUGE,
index_overflowed = LLVMBuildICmp(builder, LLVMIntUGT,
true_index, fetch_elt_max,
"index_overflowed");
@@ -1681,7 +1705,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
LLVMValueRef vb_index =
lp_build_const_int32(gallivm, velem->vertex_buffer_index);
LLVMValueRef vb = LLVMBuildGEP(builder, vb_ptr, &vb_index, 1, "");
generate_fetch(gallivm, vbuffers_ptr,
generate_fetch(gallivm, draw, vbuffers_ptr,
&aos_attribs[j][i], velem, vb, true_index,
system_values.instance_id);
}
@@ -1744,8 +1768,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
vs_info->num_outputs, vs_type,
have_clipdist);
}
lp_build_loop_end_cond(&lp_loop, end, step, LLVMIntUGE);
lp_build_loop_end_cond(&lp_loop, count, step, LLVMIntUGE);
sampler->destroy(sampler);
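
Reader's note: the generate_fetch() hunks above replace plain multiply/add with overflow-tracking arithmetic and compare the end of the fetched element against the buffer size. A scalar sketch of the same arithmetic, assuming 32-bit unsigned values throughout, might look like this. All demo_* names are hypothetical; the real code emits LLVM IR through lp_build_umul_overflow() and lp_build_uadd_overflow() rather than running C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* index = start_instance + ((instance_id - start_instance) / divisor) */
static uint32_t
demo_instance_index(uint32_t instance_id, uint32_t start_instance,
                    uint32_t divisor)
{
   assert(divisor != 0);
   return start_instance + (instance_id - start_instance) / divisor;
}

/* 32-bit multiply; returns true if the exact product did not fit. */
static bool
demo_umul_overflow(uint32_t a, uint32_t b, uint32_t *res)
{
   uint64_t wide = (uint64_t)a * (uint64_t)b;
   *res = (uint32_t)wide;
   return wide > UINT32_MAX;
}

/* 32-bit add; returns true if the sum wrapped around. */
static bool
demo_uadd_overflow(uint32_t a, uint32_t b, uint32_t *res)
{
   *res = a + b;
   return *res < a;
}

/* True if reading blocksize bytes for this element would run past
 * buffer_size, or if any intermediate computation overflowed. */
static bool
demo_fetch_overflows(uint32_t vb_stride, uint32_t index,
                     uint32_t vb_buffer_offset, uint32_t src_offset,
                     uint32_t blocksize, uint32_t buffer_size)
{
   uint32_t offset, needed;
   bool ofbit = false;

   ofbit |= demo_umul_overflow(vb_stride, index, &offset);
   ofbit |= demo_uadd_overflow(offset, vb_buffer_offset, &offset);
   ofbit |= demo_uadd_overflow(offset, src_offset, &offset);
   ofbit |= demo_uadd_overflow(offset, blocksize, &needed);

   return ofbit || needed > buffer_size;
}
```

Comparing the end of the element (offset plus block size) with a strict greater-than accepts an element that ends exactly at the buffer boundary and rejects one that starts in bounds but extends past the end, which matches the LLVMIntUGE to LLVMIntUGT change in the diff.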

View File

@@ -238,6 +238,7 @@ draw_llvm_sampler_soa_emit_fetch_texel(const struct lp_build_sampler_soa *base,
const struct lp_derivatives *derivs,
LLVMValueRef lod_bias, /* optional */
LLVMValueRef explicit_lod, /* optional */
boolean scalar_lod,
LLVMValueRef *texel)
{
struct draw_llvm_sampler_soa *sampler = (struct draw_llvm_sampler_soa *)base;
@@ -256,7 +257,7 @@ draw_llvm_sampler_soa_emit_fetch_texel(const struct lp_build_sampler_soa *base,
coords,
offsets,
derivs,
lod_bias, explicit_lod,
lod_bias, explicit_lod, scalar_lod,
texel);
}

View File

@@ -831,7 +831,12 @@ static struct aaline_stage *
aaline_stage_from_pipe(struct pipe_context *pipe)
{
struct draw_context *draw = (struct draw_context *) pipe->draw;
return aaline_stage(draw->pipeline.aaline);
if (draw) {
return aaline_stage(draw->pipeline.aaline);
} else {
return NULL;
}
}
@@ -844,7 +849,12 @@ aaline_create_fs_state(struct pipe_context *pipe,
const struct pipe_shader_state *fs)
{
struct aaline_stage *aaline = aaline_stage_from_pipe(pipe);
struct aaline_fragment_shader *aafs = CALLOC_STRUCT(aaline_fragment_shader);
struct aaline_fragment_shader *aafs = NULL;
if (aaline == NULL)
return NULL;
aafs = CALLOC_STRUCT(aaline_fragment_shader);
if (aafs == NULL)
return NULL;
@@ -864,6 +874,10 @@ aaline_bind_fs_state(struct pipe_context *pipe, void *fs)
struct aaline_stage *aaline = aaline_stage_from_pipe(pipe);
struct aaline_fragment_shader *aafs = (struct aaline_fragment_shader *) fs;
if (aaline == NULL) {
return;
}
/* save current */
aaline->fs = aafs;
/* pass-through */
@@ -877,14 +891,19 @@ aaline_delete_fs_state(struct pipe_context *pipe, void *fs)
struct aaline_stage *aaline = aaline_stage_from_pipe(pipe);
struct aaline_fragment_shader *aafs = (struct aaline_fragment_shader *) fs;
/* pass-through */
aaline->driver_delete_fs_state(pipe, aafs->driver_fs);
if (aafs == NULL) {
return;
}
if (aafs->aaline_fs)
aaline->driver_delete_fs_state(pipe, aafs->aaline_fs);
if (aaline != NULL) {
/* pass-through */
aaline->driver_delete_fs_state(pipe, aafs->driver_fs);
if (aafs->aaline_fs)
aaline->driver_delete_fs_state(pipe, aafs->aaline_fs);
}
FREE((void*)aafs->state.tokens);
FREE(aafs);
}
@@ -895,6 +914,10 @@ aaline_bind_sampler_states(struct pipe_context *pipe,
{
struct aaline_stage *aaline = aaline_stage_from_pipe(pipe);
if (aaline == NULL) {
return;
}
/* save current */
memcpy(aaline->state.sampler, sampler, num * sizeof(void *));
aaline->num_samplers = num;
@@ -912,6 +935,10 @@ aaline_set_sampler_views(struct pipe_context *pipe,
struct aaline_stage *aaline = aaline_stage_from_pipe(pipe);
uint i;
if (aaline == NULL) {
return;
}
/* save current */
for (i = 0; i < num; i++) {
pipe_sampler_view_reference(&aaline->state.sampler_views[i], views[i]);

View File

@@ -308,9 +308,9 @@ aa_transform_inst(struct tgsi_transform_context *ctx,
newInst.Src[1].Register.SwizzleY = TGSI_SWIZZLE_W;
ctx->emit_instruction(ctx, &newInst);
/* KIL -tmp0.yyyy; # if -tmp0.y < 0, KILL */
/* KILL_IF -tmp0.yyyy; # if -tmp0.y < 0, KILL */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_KIL;
newInst.Instruction.Opcode = TGSI_OPCODE_KILL_IF;
newInst.Instruction.NumDstRegs = 0;
newInst.Instruction.NumSrcRegs = 1;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;

View File

@@ -1,5 +1,5 @@
/**************************************************************************
*
*
* Copyright 2007 Tungsten Graphics, Inc., Cedar Park, Texas.
* All Rights Reserved.
*
@@ -10,11 +10,11 @@
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
@@ -22,7 +22,7 @@
* ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
* SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
*
**************************************************************************/
/**
@@ -51,10 +51,10 @@ static INLINE struct cull_stage *cull_stage( struct draw_stage *stage )
return (struct cull_stage *)stage;
}
static INLINE
boolean cull_distance_is_out(float dist)
static INLINE boolean
cull_distance_is_out(float dist)
{
return (dist < 0) || util_is_inf_or_nan(dist);
return (dist < 0.0f) || util_is_inf_or_nan(dist);
}
/*
@@ -68,23 +68,21 @@ static void cull_point( struct draw_stage *stage,
{
const unsigned num_written_culldistances =
draw_current_shader_num_written_culldistances(stage->draw);
unsigned i;
if (num_written_culldistances) {
unsigned i;
boolean culled = FALSE;
for (i = 0; i < num_written_culldistances; ++i) {
unsigned cull_idx = i / 4;
unsigned out_idx =
draw_current_shader_culldistance_output(stage->draw, cull_idx);
unsigned idx = i % 4;
float cull1 = header->v[0]->data[out_idx][idx];
boolean vert1_out = cull_distance_is_out(cull1);
if (vert1_out)
culled = TRUE;
}
if (!culled)
stage->next->point( stage->next, header );
debug_assert(num_written_culldistances);
for (i = 0; i < num_written_culldistances; ++i) {
unsigned cull_idx = i / 4;
unsigned out_idx =
draw_current_shader_culldistance_output(stage->draw, cull_idx);
unsigned idx = i % 4;
float cull1 = header->v[0]->data[out_idx][idx];
boolean vert1_out = cull_distance_is_out(cull1);
if (vert1_out)
return;
}
stage->next->point( stage->next, header );
}
/*
@@ -94,29 +92,27 @@ static void cull_point( struct draw_stage *stage,
* on primitives without faces (e.g. points and lines)
*/
static void cull_line( struct draw_stage *stage,
struct prim_header *header )
struct prim_header *header )
{
const unsigned num_written_culldistances =
draw_current_shader_num_written_culldistances(stage->draw);
unsigned i;
if (num_written_culldistances) {
unsigned i;
boolean culled = FALSE;
for (i = 0; i < num_written_culldistances; ++i) {
unsigned cull_idx = i / 4;
unsigned out_idx =
draw_current_shader_culldistance_output(stage->draw, cull_idx);
unsigned idx = i % 4;
float cull1 = header->v[0]->data[out_idx][idx];
float cull2 = header->v[1]->data[out_idx][idx];
boolean vert1_out = cull_distance_is_out(cull1);
boolean vert2_out = cull_distance_is_out(cull2);
if (vert1_out && vert2_out)
culled = TRUE;
}
if (!culled)
stage->next->line( stage->next, header );
debug_assert(num_written_culldistances);
for (i = 0; i < num_written_culldistances; ++i) {
unsigned cull_idx = i / 4;
unsigned out_idx =
draw_current_shader_culldistance_output(stage->draw, cull_idx);
unsigned idx = i % 4;
float cull1 = header->v[0]->data[out_idx][idx];
float cull2 = header->v[1]->data[out_idx][idx];
boolean vert1_out = cull_distance_is_out(cull1);
boolean vert2_out = cull_distance_is_out(cull2);
if (vert1_out && vert2_out)
return;
}
stage->next->line( stage->next, header );
}
/*
@@ -133,7 +129,6 @@ static void cull_tri( struct draw_stage *stage,
/* Do the distance culling */
if (num_written_culldistances) {
unsigned i;
boolean culled = FALSE;
for (i = 0; i < num_written_culldistances; ++i) {
unsigned cull_idx = i / 4;
unsigned out_idx =
@@ -146,10 +141,8 @@ static void cull_tri( struct draw_stage *stage,
boolean vert2_out = cull_distance_is_out(cull2);
boolean vert3_out = cull_distance_is_out(cull3);
if (vert1_out && vert2_out && vert3_out)
culled = TRUE;
return;
}
if (!culled)
stage->next->tri( stage->next, header );
}
/* Do the regular face culling */
@@ -166,7 +159,7 @@ static void cull_tri( struct draw_stage *stage,
const float fx = v1[0] - v2[0];
const float fy = v1[1] - v2[1];
/* det = cross(e,f).z */
header->det = ex * fy - ey * fx;
@@ -217,7 +210,7 @@ static void cull_first_line( struct draw_stage *stage,
}
}
static void cull_first_tri( struct draw_stage *stage,
static void cull_first_tri( struct draw_stage *stage,
struct prim_header *header )
{
struct cull_stage *cull = cull_stage(stage);
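
Reader's note: the cull_point()/cull_line() rewrite above returns early as soon as one cull distance rejects the primitive, instead of tracking a culled flag. A stand-alone sketch of the line case, assuming each vertex's cull distances are given as a flat float array, is shown below. demo_cull_distance_is_out() and demo_line_is_culled() are hypothetical names, and the inf/NaN test is written out by hand rather than calling util_is_inf_or_nan().

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* A distance is "out" if it is negative, infinite, or NaN.
 * Negative infinity is already covered by the first test. */
static bool
demo_cull_distance_is_out(float dist)
{
   return dist < 0.0f        /* negative, including -inf */
       || dist > FLT_MAX     /* +inf */
       || dist != dist;      /* NaN compares unequal to itself */
}

/* A line is discarded if, for any single cull distance, both of its
 * endpoints are out. Returns true when the line should be dropped. */
static bool
demo_line_is_culled(const float *v0_dists, const float *v1_dists,
                    unsigned num_cull_distances)
{
   unsigned i;

   assert(num_cull_distances > 0);
   for (i = 0; i < num_cull_distances; ++i) {
      if (demo_cull_distance_is_out(v0_dists[i]) &&
          demo_cull_distance_is_out(v1_dists[i]))
         return true;   /* early exit, as in the rewritten stage */
   }
   return false;
}
```

The point case is the same loop with a single vertex, and the triangle case requires all three vertices to be out for the same distance.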

View File

@@ -278,7 +278,7 @@ pstip_transform_inst(struct tgsi_transform_context *ctx,
/*
* Insert new MUL/TEX/KILP instructions at start of program
* Insert new MUL/TEX/KILL_IF instructions at start of program
* Take gl_FragCoord, divide by 32 (stipple size), sample the
* texture and kill fragment if needed.
*
@@ -315,9 +315,9 @@ pstip_transform_inst(struct tgsi_transform_context *ctx,
newInst.Src[1].Register.Index = pctx->freeSampler;
ctx->emit_instruction(ctx, &newInst);
/* KIL -texTemp; # if -texTemp < 0, KILL fragment */
/* KILL_IF -texTemp; # if -texTemp < 0, KILL fragment */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_KIL;
newInst.Instruction.Opcode = TGSI_OPCODE_KILL_IF;
newInst.Instruction.NumDstRegs = 0;
newInst.Instruction.NumSrcRegs = 1;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
@@ -402,7 +402,7 @@ pstip_update_texture(struct pstip_stage *pstip)
/*
* Load alpha texture.
* Note: 0 means keep the fragment, 255 means kill it.
* We'll negate the texel value and use KILP which kills if value
* We'll negate the texel value and use KILL_IF which kills if value
* is negative.
*/
for (i = 0; i < 32; i++) {

View File

@@ -138,7 +138,7 @@ emit_vertex( struct vbuf_stage *vbuf,
/* Note: we really do want data[0] here, not data[pos]:
*/
vbuf->translate->set_buffer(vbuf->translate, 0, vertex->data[0], 0, ~0);
vbuf->translate->run(vbuf->translate, 0, 1, 0, vbuf->vertex_ptr);
vbuf->translate->run(vbuf->translate, 0, 1, 0, 0, vbuf->vertex_ptr);
if (0) draw_dump_emitted_vertex(vbuf->vinfo, (uint8_t *)vbuf->vertex_ptr);

View File

@@ -55,6 +55,10 @@ struct gallivm_state;
/** Sum of frustum planes and user-defined planes */
#define DRAW_TOTAL_CLIP_PLANES (6 + PIPE_MAX_CLIP_PLANES)
/**
* The largest possible index of a vertex that can be fetched.
*/
#define DRAW_MAX_FETCH_IDX 0xffffffff
struct pipe_context;
struct draw_vertex_shader;
@@ -306,6 +310,7 @@ struct draw_context
} extra_shader_outputs;
unsigned instance_id;
unsigned start_instance;
#ifdef HAVE_LLVM
struct draw_llvm *llvm;
@@ -467,14 +472,13 @@ void
draw_stats_clipper_primitives(struct draw_context *draw,
const struct draw_prim_info *prim_info);
/**
* Return index i from the index buffer.
* If the index buffer would overflow we return the
* index of the first element in the vb.
* maximum possible index.
*/
#define DRAW_GET_IDX(_elts, _i) \
(((_i) >= draw->pt.user.eltMax) ? 0 : (_elts)[_i])
(((_i) >= draw->pt.user.eltMax) ? DRAW_MAX_FETCH_IDX : (_elts)[_i])
/**
* Return index of the given viewport clamping it
@@ -486,5 +490,20 @@ draw_clamp_viewport_idx(int idx)
return ((PIPE_MAX_VIEWPORTS > idx || idx < 0) ? idx : 0);
}
/**
* Adds two unsigned integers and, if the addition
* overflows, returns the value of the
* overflow_value parameter instead.
*/
static INLINE unsigned
draw_overflow_uadd(unsigned a, unsigned b,
unsigned overflow_value)
{
unsigned res = a + b;
if (res < a || res < b) {
res = overflow_value;
}
return res;
}
#endif /* DRAW_PRIVATE_H */
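Review note: the wraparound test in draw_overflow_uadd() can be checked in isolation. Below is a minimal standalone sketch of the same idea, not the Mesa code itself. The names sat_uadd and SAT_MAX_IDX are made up for illustration, and UINT_MAX stands in for DRAW_MAX_FETCH_IDX on the assumption that unsigned is 32 bits wide.

```c
#include <limits.h>

/* Hypothetical stand-in for DRAW_MAX_FETCH_IDX; not a Mesa name. */
#define SAT_MAX_IDX UINT_MAX

/*
 * Add two unsigned values. Unsigned arithmetic in C wraps modulo 2^N,
 * so a wrapped sum is always smaller than either operand. When that
 * happens, return overflow_value instead of the wrapped result.
 */
unsigned
sat_uadd(unsigned a, unsigned b, unsigned overflow_value)
{
   unsigned res = a + b;
   if (res < a) {   /* on wraparound res < b holds as well */
      res = overflow_value;
   }
   return res;
}
```

Passing the clamp value as a parameter, as the patch does, lets each caller pick its own sentinel (DRAW_MAX_FETCH_IDX, MAX_LOOP_IDX, MAX_ELT_IDX) without duplicating the check.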

View File

@@ -345,7 +345,8 @@ draw_print_arrays(struct draw_context *draw, uint prim, int start, uint count)
/** Helper code for below */
#define PRIM_RESTART_LOOP(elements) \
do { \
for (i = start; i < end; i++) { \
for (j = 0; j < count; j++) { \
i = draw_overflow_uadd(start, j, MAX_LOOP_IDX); \
if (i < elt_max && elements[i] == info->restart_index) { \
if (cur_count > 0) { \
/* draw elts up to prev pos */ \
@@ -377,9 +378,11 @@ draw_pt_arrays_restart(struct draw_context *draw,
const unsigned prim = info->mode;
const unsigned start = info->start;
const unsigned count = info->count;
const unsigned end = start + count;
const unsigned elt_max = draw->pt.user.eltMax;
unsigned i, cur_start, cur_count;
unsigned i, j, cur_start, cur_count;
/* The largest index within a loop using the i variable as the index.
* Used for overflow detection */
const unsigned MAX_LOOP_IDX = 0xffffffff;
assert(info->primitive_restart);
@@ -456,8 +459,14 @@ draw_vbo(struct draw_context *draw,
unsigned instance;
unsigned index_limit;
unsigned count;
unsigned fpstate = util_fpstate_get();
struct pipe_draw_info resolved_info;
/* Make sure that denorms are treated like zeros. This is
* the behavior required by D3D10. OpenGL doesn't care.
*/
util_fpstate_set_denorms_to_zero(fpstate);
resolve_draw_info(info, &resolved_info);
info = &resolved_info;
@@ -508,11 +517,16 @@ draw_vbo(struct draw_context *draw,
draw->pt.vertex_element,
draw->pt.nr_vertex_elements,
info);
if (index_limit == 0) {
#if HAVE_LLVM
if (!draw->llvm)
#endif
{
if (index_limit == 0) {
/* one of the buffers is too small to do any valid drawing */
debug_warning("draw: VBO too small to draw anything\n");
return;
debug_warning("draw: VBO too small to draw anything\n");
util_fpstate_set(fpstate);
return;
}
}
/* If we're collecting stats then make sure we start from scratch */
@@ -529,6 +543,13 @@ draw_vbo(struct draw_context *draw,
for (instance = 0; instance < info->instance_count; instance++) {
draw->instance_id = instance + info->start_instance;
draw->start_instance = info->start_instance;
/* check for overflow */
if (draw->instance_id < instance ||
draw->instance_id < info->start_instance) {
/* if we overflowed, just set the instance id to the max */
draw->instance_id = 0xffffffff;
}
draw_new_instance(draw);
@@ -544,4 +565,5 @@ draw_vbo(struct draw_context *draw,
if (draw->collect_statistics) {
draw->render->pipeline_statistics(draw->render, &draw->statistics);
}
util_fpstate_set(fpstate);
}
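Review note: the PRIM_RESTART_LOOP change replaces an end-bounded loop with a counted loop. The sketch below is a standalone illustration of why that matters when start + count wraps around. It is not the Mesa macro, the function names are hypothetical, and UINT_MAX plays the role of MAX_LOOP_IDX.

```c
#include <limits.h>
#include <stddef.h>

/* Old shape: for (i = start; i < end; i++), with end = start + count. */
unsigned
count_iters_end_bound(unsigned start, unsigned count)
{
   unsigned end = start + count;   /* may wrap around to a small value */
   unsigned i, n = 0;
   for (i = start; i < end; i++)
      n++;
   return n;
}

/*
 * New shape: iterate exactly count times and derive the index from the
 * loop counter, clamping it to UINT_MAX when start + j wraps.
 * If clamped is non-NULL it receives the number of clamped indices.
 */
unsigned
count_iters_counted(unsigned start, unsigned count, unsigned *clamped)
{
   unsigned j, n = 0, nclamped = 0;
   for (j = 0; j < count; j++) {
      unsigned i = start + j;
      if (i < start) {             /* wrapped: clamp instead */
         i = UINT_MAX;
         nclamped++;
      }
      (void) i;                    /* a real loop would use i here */
      n++;
   }
   if (clamped != NULL)
      *clamped = nclamped;
   return n;
}
```

With start near UINT_MAX the end-bounded form runs zero iterations because end wraps below start, while the counted form still visits every element and the clamped index then fails the `i < elt_max` test in the loop body.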

View File

@@ -171,6 +171,7 @@ draw_pt_emit(struct pt_emit *emit,
translate->run(translate,
0,
vertex_count,
draw->start_instance,
draw->instance_id,
hw_verts );
@@ -234,6 +235,7 @@ draw_pt_emit_linear(struct pt_emit *emit,
translate->run(translate,
0,
count,
draw->start_instance,
draw->instance_id,
hw_verts);
@@ -253,12 +255,6 @@ draw_pt_emit_linear(struct pt_emit *emit,
i < prim_info->primitive_count;
start += prim_info->primitive_lengths[i], i++)
{
if (draw->collect_statistics) {
draw->statistics.c_invocations +=
u_decomposed_prims_for_vertices(prim_info->prim,
prim_info->primitive_lengths[i]);
}
render->draw_arrays(render,
start,
prim_info->primitive_lengths[i]);

View File

@@ -168,6 +168,7 @@ draw_pt_fetch_run(struct pt_fetch *fetch,
translate->run_elts( translate,
elts,
count,
draw->start_instance,
draw->instance_id,
verts );
}
@@ -195,6 +196,7 @@ draw_pt_fetch_run_linear(struct pt_fetch *fetch,
translate->run( translate,
start,
count,
draw->start_instance,
draw->instance_id,
verts );
}

View File

@@ -210,6 +210,7 @@ static void fetch_emit_run( struct draw_pt_middle_end *middle,
feme->translate->run_elts( feme->translate,
fetch_elts,
fetch_count,
draw->start_instance,
draw->instance_id,
hw_verts );
@@ -267,6 +268,7 @@ static void fetch_emit_run_linear( struct draw_pt_middle_end *middle,
feme->translate->run( feme->translate,
start,
count,
draw->start_instance,
draw->instance_id,
hw_verts );
@@ -326,6 +328,7 @@ static boolean fetch_emit_run_linear_elts( struct draw_pt_middle_end *middle,
feme->translate->run( feme->translate,
start,
count,
draw->start_instance,
draw->instance_id,
hw_verts );

View File

@@ -182,12 +182,29 @@ static void so_emit_prim(struct pt_so_emit *so,
buffer = (float *)((char *)draw->so.targets[ob]->mapping +
draw->so.targets[ob]->target.buffer_offset +
draw->so.targets[ob]->internal_offset) + state->output[slot].dst_offset;
draw->so.targets[ob]->internal_offset) +
state->output[slot].dst_offset;
if (idx == so->pos_idx && pcp_ptr)
memcpy(buffer, &pre_clip_pos[start_comp], num_comps * sizeof(float));
memcpy(buffer, &pre_clip_pos[start_comp],
num_comps * sizeof(float));
else
memcpy(buffer, &input[idx][start_comp], num_comps * sizeof(float));
memcpy(buffer, &input[idx][start_comp],
num_comps * sizeof(float));
#if 0
{
int j;
debug_printf("VERT[%d], offset = %d, slot[%d] sc = %d, num_c = %d, idx = %d = [",
i + draw->so.targets[ob]->emitted_vertices,
draw->so.targets[ob]->internal_offset,
slot, start_comp, num_comps, idx);
for (j = 0; j < num_comps; ++j) {
unsigned *ubuffer = (unsigned*)buffer;
debug_printf("%d (0x%x), ", ubuffer[j], ubuffer[j]);
}
debug_printf("]\n");
}
#endif
}
for (ob = 0; ob < draw->so.num_targets; ++ob) {
struct draw_so_target *target = draw->so.targets[ob];

View File

@@ -33,6 +33,9 @@
#define SEGMENT_SIZE 1024
#define MAP_SIZE 256
/* The largest possible index within an index buffer */
#define MAX_ELT_IDX 0xffffffff
struct vsplit_frontend {
struct draw_pt_front_end base;
struct draw_context *draw;
@@ -82,16 +85,15 @@ vsplit_flush_cache(struct vsplit_frontend *vsplit, unsigned flags)
* Add a fetch element and add it to the draw elements.
*/
static INLINE void
vsplit_add_cache(struct vsplit_frontend *vsplit, unsigned fetch)
vsplit_add_cache(struct vsplit_frontend *vsplit, unsigned fetch, unsigned ofbias)
{
struct draw_context *draw = vsplit->draw;
unsigned hash;
fetch = MIN2(fetch, draw->pt.max_index);
hash = fetch % MAP_SIZE;
if (vsplit->cache.fetches[hash] != fetch) {
/* If the value isn't in the cache or it's an overflow due to the
* element bias */
if (vsplit->cache.fetches[hash] != fetch || ofbias) {
/* update cache */
vsplit->cache.fetches[hash] = fetch;
vsplit->cache.draws[hash] = vsplit->cache.num_fetch_elts;
@@ -104,22 +106,109 @@ vsplit_add_cache(struct vsplit_frontend *vsplit, unsigned fetch)
vsplit->draw_elts[vsplit->cache.num_draw_elts++] = vsplit->cache.draws[hash];
}
/**
* Returns the base index to the elements array.
* The value is checked for overflows (both integer overflows
* and the elements array overflow).
*/
static INLINE unsigned
vsplit_get_base_idx(struct vsplit_frontend *vsplit,
unsigned start, unsigned fetch, unsigned *ofbit)
{
struct draw_context *draw = vsplit->draw;
unsigned elt_idx = draw_overflow_uadd(start, fetch, MAX_ELT_IDX);
if (ofbit)
*ofbit = 0;
/* Overflown indices need to wrap to the first element
* in the index buffer */
if (elt_idx >= draw->pt.user.eltMax) {
if (ofbit)
*ofbit = 1;
elt_idx = 0;
}
return elt_idx;
}
/**
* Returns the element index adjusted for the element bias.
* The final element index is created from the actual element
* index, plus the element bias, clamped to the maximum element
* index if that addition overflows.
*/
static INLINE unsigned
vsplit_get_bias_idx(struct vsplit_frontend *vsplit,
int idx, int bias, unsigned *ofbias)
{
int res = idx + bias;
if (ofbias)
*ofbias = 0;
if (idx > 0 && bias > 0) {
if (res < idx || res < bias) {
res = DRAW_MAX_FETCH_IDX;
if (ofbias)
*ofbias = 1;
}
} else if (idx < 0 && bias < 0) {
if (res > idx || res > bias) {
res = DRAW_MAX_FETCH_IDX;
if (ofbias)
*ofbias = 1;
}
}
return res;
}
#define VSPLIT_CREATE_IDX(elts, start, fetch, elt_bias) \
unsigned elt_idx; \
unsigned ofbit; \
unsigned ofbias; \
elt_idx = vsplit_get_base_idx(vsplit, start, fetch, &ofbit); \
elt_idx = vsplit_get_bias_idx(vsplit, ofbit ? 0 : DRAW_GET_IDX(elts, elt_idx), elt_bias, &ofbias)
static INLINE void
vsplit_add_cache_ubyte(struct vsplit_frontend *vsplit, const ubyte *elts,
unsigned start, unsigned fetch, int elt_bias)
{
struct draw_context *draw = vsplit->draw;
VSPLIT_CREATE_IDX(elts, start, fetch, elt_bias);
vsplit_add_cache(vsplit, elt_idx, ofbias);
}
static INLINE void
vsplit_add_cache_ushort(struct vsplit_frontend *vsplit, const ushort *elts,
unsigned start, unsigned fetch, int elt_bias)
{
struct draw_context *draw = vsplit->draw;
VSPLIT_CREATE_IDX(elts, start, fetch, elt_bias);
vsplit_add_cache(vsplit, elt_idx, ofbias);
}
/**
* Add a fetch element and add it to the draw elements. The fetch element is
* in full range (uint).
*/
static INLINE void
vsplit_add_cache_uint(struct vsplit_frontend *vsplit, unsigned fetch)
vsplit_add_cache_uint(struct vsplit_frontend *vsplit, const uint *elts,
unsigned start, unsigned fetch, int elt_bias)
{
/* special care for 0xffffffff */
if (fetch == 0xffffffff && !vsplit->cache.has_max_fetch) {
struct draw_context *draw = vsplit->draw;
unsigned raw_elem_idx = start + fetch + elt_bias;
VSPLIT_CREATE_IDX(elts, start, fetch, elt_bias);
/* special care for DRAW_MAX_FETCH_IDX */
if (raw_elem_idx == DRAW_MAX_FETCH_IDX && !vsplit->cache.has_max_fetch) {
unsigned hash = fetch % MAP_SIZE;
vsplit->cache.fetches[hash] = fetch - 1; /* force update */
vsplit->cache.fetches[hash] = raw_elem_idx - 1; /* force update */
vsplit->cache.has_max_fetch = TRUE;
}
vsplit_add_cache(vsplit, fetch);
vsplit_add_cache(vsplit, elt_idx, ofbias);
}
@@ -128,17 +217,17 @@ vsplit_add_cache_uint(struct vsplit_frontend *vsplit, unsigned fetch)
#define FUNC vsplit_run_ubyte
#define ELT_TYPE ubyte
#define ADD_CACHE(vsplit, fetch) vsplit_add_cache(vsplit, fetch)
#define ADD_CACHE(vsplit, ib, start, fetch, bias) vsplit_add_cache_ubyte(vsplit,ib,start,fetch,bias)
#include "draw_pt_vsplit_tmp.h"
#define FUNC vsplit_run_ushort
#define ELT_TYPE ushort
#define ADD_CACHE(vsplit, fetch) vsplit_add_cache(vsplit, fetch)
#define ADD_CACHE(vsplit, ib, start, fetch, bias) vsplit_add_cache_ushort(vsplit,ib,start,fetch, bias)
#include "draw_pt_vsplit_tmp.h"
#define FUNC vsplit_run_uint
#define ELT_TYPE uint
#define ADD_CACHE(vsplit, fetch) vsplit_add_cache_uint(vsplit, fetch)
#define ADD_CACHE(vsplit, ib, start, fetch, bias) vsplit_add_cache_uint(vsplit, ib, start, fetch, bias)
#include "draw_pt_vsplit_tmp.h"
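Review note: vsplit_get_bias_idx() adds idx and bias as signed ints and then inspects the result. Signed overflow is undefined behaviour in C, so a variant that tests the operands before adding may be worth considering. The sketch below is one way to do that, under the assumption that a caller-supplied clamp value is acceptable. checked_bias_add and its parameters are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Add a signed bias to a signed index and report overflow through *of.
 * The range check happens before the addition, so the addition itself
 * never overflows. On overflow, clamp_value is returned.
 */
int
checked_bias_add(int idx, int bias, int clamp_value, int *of)
{
   assert(of != NULL);
   *of = 0;
   if ((bias > 0 && idx > INT_MAX - bias) ||
       (bias < 0 && idx < INT_MIN - bias)) {
      *of = 1;
      return clamp_value;
   }
   return idx + bias;
}
```

The overflow flag has the same role as ofbias in the patch: it tells the cache code that the clamped value must not be treated as a cache hit.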

View File

@@ -47,13 +47,20 @@ CONCAT(vsplit_primitive_, ELT_TYPE)(struct vsplit_frontend *vsplit,
const unsigned start = istart;
const unsigned end = istart + icount;
/* If the index buffer overflows we'll need to run
* through the normal paths */
if (start >= draw->pt.user.eltMax ||
end > draw->pt.user.eltMax ||
end < istart || end < icount)
return FALSE;
/* use the ib directly */
if (min_index == 0 && sizeof(ib[0]) == sizeof(draw_elts[0])) {
if (icount > vsplit->max_vertices)
return FALSE;
for (i = start; i < end; i++) {
ELT_TYPE idx = DRAW_GET_IDX(ib, i);
for (i = 0; i < icount; i++) {
ELT_TYPE idx = DRAW_GET_IDX(ib, start + i);
if (idx < min_index || idx > max_index) {
debug_printf("warning: index out of range\n");
}
@@ -82,25 +89,29 @@ CONCAT(vsplit_primitive_, ELT_TYPE)(struct vsplit_frontend *vsplit,
fetch_start = min_index + elt_bias;
fetch_count = max_index - min_index + 1;
/* Check for overflow in the fetch_start */
if (fetch_start < min_index || fetch_start < elt_bias)
return FALSE;
if (!draw_elts) {
if (min_index == 0) {
for (i = start; i < end; i++) {
ELT_TYPE idx = DRAW_GET_IDX(ib, i);
for (i = 0; i < icount; i++) {
ELT_TYPE idx = DRAW_GET_IDX(ib, i + start);
if (idx < min_index || idx > max_index) {
debug_printf("warning: index out of range\n");
}
vsplit->draw_elts[i - start] = (ushort) idx;
vsplit->draw_elts[i] = (ushort) idx;
}
}
else {
for (i = start; i < end; i++) {
ELT_TYPE idx = DRAW_GET_IDX(ib, i);
for (i = 0; i < icount; i++) {
ELT_TYPE idx = DRAW_GET_IDX(ib, i + start);
if (idx < min_index || idx > max_index) {
debug_printf("warning: index out of range\n");
}
vsplit->draw_elts[i - start] = (ushort) (idx - min_index);
vsplit->draw_elts[i] = (ushort) (idx - min_index);
}
}
@@ -137,41 +148,36 @@ CONCAT(vsplit_segment_cache_, ELT_TYPE)(struct vsplit_frontend *vsplit,
spoken = !!spoken;
if (ibias == 0) {
if (spoken)
ADD_CACHE(vsplit, DRAW_GET_IDX(ib, ispoken));
ADD_CACHE(vsplit, ib, 0, ispoken, 0);
for (i = spoken; i < icount; i++)
ADD_CACHE(vsplit, DRAW_GET_IDX(ib, istart + i));
for (i = spoken; i < icount; i++) {
ADD_CACHE(vsplit, ib, istart, i, 0);
}
if (close)
ADD_CACHE(vsplit, DRAW_GET_IDX(ib, iclose));
ADD_CACHE(vsplit, ib, 0, iclose, 0);
}
else if (ibias > 0) {
if (spoken)
ADD_CACHE(vsplit, (uint) DRAW_GET_IDX(ib, ispoken) + ibias);
ADD_CACHE(vsplit, ib, 0, ispoken, ibias);
for (i = spoken; i < icount; i++)
ADD_CACHE(vsplit, (uint) DRAW_GET_IDX(ib, istart + i) + ibias);
ADD_CACHE(vsplit, ib, istart, i, ibias);
if (close)
ADD_CACHE(vsplit, (uint) DRAW_GET_IDX(ib, iclose) + ibias);
ADD_CACHE(vsplit, ib, 0, iclose, ibias);
}
else {
if (spoken) {
if ((int) ib[ispoken] < -ibias)
return;
ADD_CACHE(vsplit, DRAW_GET_IDX(ib, ispoken) + ibias);
ADD_CACHE(vsplit, ib, 0, ispoken, ibias);
}
for (i = spoken; i < icount; i++) {
if ((int) DRAW_GET_IDX(ib, istart + i) < -ibias)
return;
ADD_CACHE(vsplit, DRAW_GET_IDX(ib, istart + i) + ibias);
ADD_CACHE(vsplit, ib, istart, i, ibias);
}
if (close) {
if ((int) DRAW_GET_IDX(ib, iclose) < -ibias)
return;
ADD_CACHE(vsplit, DRAW_GET_IDX(ib, iclose) + ibias);
ADD_CACHE(vsplit, ib, 0, iclose, ibias);
}
}

View File

@@ -86,12 +86,12 @@ draw_create_vertex_shader(struct draw_context *draw,
found_clipvertex = TRUE;
vs->clipvertex_output = i;
} else if (vs->info.output_semantic_name[i] == TGSI_SEMANTIC_CLIPDIST) {
if (vs->info.output_semantic_index[i] == 0)
vs->clipdistance_output[0] = i;
else
vs->clipdistance_output[1] = i;
debug_assert(vs->info.output_semantic_index[i] <
PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT);
vs->clipdistance_output[vs->info.output_semantic_index[i]] = i;
} else if (vs->info.output_semantic_name[i] == TGSI_SEMANTIC_CULLDIST) {
debug_assert(vs->info.output_semantic_index[i] < Elements(vs->culldistance_output));
debug_assert(vs->info.output_semantic_index[i] <
PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT);
vs->culldistance_output[vs->info.output_semantic_index[i]] = i;
}
}

View File

@@ -112,8 +112,8 @@ struct draw_vertex_shader {
unsigned position_output;
unsigned edgeflag_output;
unsigned clipvertex_output;
unsigned clipdistance_output[2];
unsigned culldistance_output[2];
unsigned clipdistance_output[PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT];
unsigned culldistance_output[PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT];
/* Extracted from shader:
*/
const float (*immediates)[4];

View File

@@ -168,6 +168,7 @@ static void PIPE_CDECL vsvg_run_elts( struct draw_vs_variant *variant,
vsvg->fetch->run_elts( vsvg->fetch,
elts,
count,
vsvg->draw->start_instance,
vsvg->draw->instance_id,
temp_buffer );
@@ -211,6 +212,7 @@ static void PIPE_CDECL vsvg_run_elts( struct draw_vs_variant *variant,
vsvg->emit->run( vsvg->emit,
0, count,
vsvg->draw->start_instance,
vsvg->draw->instance_id,
output_buffer );
@@ -234,6 +236,7 @@ static void PIPE_CDECL vsvg_run_linear( struct draw_vs_variant *variant,
vsvg->fetch->run( vsvg->fetch,
start,
count,
vsvg->draw->start_instance,
vsvg->draw->instance_id,
temp_buffer );
@@ -274,6 +277,7 @@ static void PIPE_CDECL vsvg_run_linear( struct draw_vs_variant *variant,
vsvg->emit->run( vsvg->emit,
0, count,
vsvg->draw->start_instance,
vsvg->draw->instance_id,
output_buffer );

View File

@@ -62,6 +62,7 @@
#include "lp_bld_debug.h"
#include "lp_bld_bitarit.h"
#include "lp_bld_arit.h"
#include "lp_bld_flow.h"
#define EXP_POLY_DEGREE 5
@@ -2305,19 +2306,14 @@ lp_build_rsqrt(struct lp_build_context *bld,
/*
* This should be faster but all denormals will end up as infinity.
*/
if (0 && ((util_cpu_caps.has_sse && type.width == 32 && type.length == 4) ||
(util_cpu_caps.has_avx && type.width == 32 && type.length == 8))) {
if (0 && lp_build_fast_rsqrt_available(type)) {
const unsigned num_iterations = 1;
LLVMValueRef res;
unsigned i;
const char *intrinsic = NULL;
if (type.length == 4) {
intrinsic = "llvm.x86.sse.rsqrt.ps";
}
else {
intrinsic = "llvm.x86.avx.rsqrt.ps.256";
}
/* rsqrt(1.0) != 1.0 here */
res = lp_build_fast_rsqrt(bld, a);
if (num_iterations) {
/*
* Newton-Raphson will result in NaN instead of infinity for zero,
@@ -2337,8 +2333,6 @@ lp_build_rsqrt(struct lp_build_context *bld,
inf = LLVMBuildBitCast(builder, inf, lp_build_vec_type(bld->gallivm, type), "");
res = lp_build_intrinsic_unary(builder, intrinsic, bld->vec_type, a);
for (i = 0; i < num_iterations; ++i) {
res = lp_build_rsqrt_refine(bld, a, res);
}
@@ -2349,11 +2343,6 @@ lp_build_rsqrt(struct lp_build_context *bld,
cmp = lp_build_compare(bld->gallivm, type, PIPE_FUNC_EQUAL, a, bld->one);
res = lp_build_select(bld, cmp, bld->one, res);
}
else {
/* rsqrt(1.0) != 1.0 here */
res = lp_build_intrinsic_unary(builder, intrinsic, bld->vec_type, a);
}
return res;
}
@@ -2361,6 +2350,58 @@ lp_build_rsqrt(struct lp_build_context *bld,
return lp_build_rcp(bld, lp_build_sqrt(bld, a));
}
/**
* Returns whether a fast (inaccurate) rsqrt instruction is available.
* Callers may want to avoid lp_build_fast_rsqrt when it is not: e.g. to
* calculate x^0.5 one may use rsqrt_fast(x) * x, but without the
* instruction that turns into sqrt/div/mul, so it is much better to
* just call sqrt and skip both the div and the mul.
*/
boolean
lp_build_fast_rsqrt_available(struct lp_type type)
{
assert(type.floating);
if ((util_cpu_caps.has_sse && type.width == 32 && type.length == 4) ||
(util_cpu_caps.has_avx && type.width == 32 && type.length == 8)) {
return true;
}
return false;
}
/**
* Generate 1/sqrt(a).
* Result is undefined for values < 0, infinity for +0.
* Precision is limited, only ~10 bits guaranteed
* (rsqrt 1.0 may not be 1.0, denorms may be flushed to 0).
*/
LLVMValueRef
lp_build_fast_rsqrt(struct lp_build_context *bld,
LLVMValueRef a)
{
LLVMBuilderRef builder = bld->gallivm->builder;
const struct lp_type type = bld->type;
assert(lp_check_value(type, a));
if (lp_build_fast_rsqrt_available(type)) {
const char *intrinsic = NULL;
if (type.length == 4) {
intrinsic = "llvm.x86.sse.rsqrt.ps";
}
else {
intrinsic = "llvm.x86.avx.rsqrt.ps.256";
}
return lp_build_intrinsic_unary(builder, intrinsic, bld->vec_type, a);
}
else {
debug_printf("%s: emulating fast rsqrt with rcp/sqrt\n", __FUNCTION__);
}
return lp_build_rcp(bld, lp_build_sqrt(bld, a));
}
/**
* Generate sin(a) using SSE2
@@ -2561,15 +2602,14 @@ lp_build_sin(struct lp_build_context *bld,
* xmm3 = poly_mask;
* y2 = _mm_and_ps(xmm3, y2); //, xmm3);
* y = _mm_andnot_ps(xmm3, y);
* y = _mm_add_ps(y,y2);
* y = _mm_or_ps(y,y2);
*/
LLVMValueRef y2_i = LLVMBuildBitCast(b, y2_9, bld->int_vec_type, "y2_i");
LLVMValueRef y_i = LLVMBuildBitCast(b, y_10, bld->int_vec_type, "y_i");
LLVMValueRef y2_and = LLVMBuildAnd(b, y2_i, poly_mask, "y2_and");
LLVMValueRef inv = lp_build_const_int_vec(gallivm, bld->type, ~0);
LLVMValueRef poly_mask_inv = LLVMBuildXor(b, poly_mask, inv, "poly_mask_inv");
LLVMValueRef poly_mask_inv = LLVMBuildNot(b, poly_mask, "poly_mask_inv");
LLVMValueRef y_and = LLVMBuildAnd(b, y_i, poly_mask_inv, "y_and");
LLVMValueRef y_combine = LLVMBuildAdd(b, y_and, y2_and, "y_combine");
LLVMValueRef y_combine = LLVMBuildOr(b, y_and, y2_and, "y_combine");
/*
* update the sign
@@ -2779,14 +2819,14 @@ lp_build_cos(struct lp_build_context *bld,
* xmm3 = poly_mask;
* y2 = _mm_and_ps(xmm3, y2); //, xmm3);
* y = _mm_andnot_ps(xmm3, y);
* y = _mm_add_ps(y,y2);
* y = _mm_or_ps(y,y2);
*/
LLVMValueRef y2_i = LLVMBuildBitCast(b, y2_9, bld->int_vec_type, "y2_i");
LLVMValueRef y_i = LLVMBuildBitCast(b, y_10, bld->int_vec_type, "y_i");
LLVMValueRef y2_and = LLVMBuildAnd(b, y2_i, poly_mask, "y2_and");
LLVMValueRef poly_mask_inv = LLVMBuildXor(b, poly_mask, inv, "poly_mask_inv");
LLVMValueRef poly_mask_inv = LLVMBuildNot(b, poly_mask, "poly_mask_inv");
LLVMValueRef y_and = LLVMBuildAnd(b, y_i, poly_mask_inv, "y_and");
LLVMValueRef y_combine = LLVMBuildAdd(b, y_and, y2_and, "y_combine");
LLVMValueRef y_combine = LLVMBuildOr(b, y_and, y2_and, "y_combine");
/*
* update the sign
@@ -2855,7 +2895,7 @@ lp_build_log(struct lp_build_context *bld,
* Generate polynomial.
* Ex: coeffs[0] + x * coeffs[1] + x^2 * coeffs[2].
*/
static LLVMValueRef
LLVMValueRef
lp_build_polynomial(struct lp_build_context *bld,
LLVMValueRef x,
const double *coeffs,
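Review note: the lp_build_sin/lp_build_cos hunks above switch the final combine from an add to an or and build the inverted mask with a not. The scalar sketch below shows the select pattern those comments describe (and / andnot / or). It is an illustration on plain 32-bit integers with a hypothetical function name, not the LLVM IR the patch emits.

```c
#include <stdint.h>

/*
 * Per-bit select: take bits from if_set where mask is 1 and from
 * if_clear where mask is 0. The two masked terms never share a set
 * bit, so OR-ing them gives the same value as adding them, but OR
 * states the intent directly.
 */
uint32_t
bit_select(uint32_t mask, uint32_t if_set, uint32_t if_clear)
{
   uint32_t mask_inv = ~mask;               /* LLVMBuildNot equivalent */
   return (if_set & mask) | (if_clear & mask_inv);
}
```

Because the masked terms are disjoint, the change from add to or should not alter results. It matches the _mm_or_ps reference sequence quoted in the comment.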

View File

@@ -231,6 +231,19 @@ LLVMValueRef
lp_build_rsqrt(struct lp_build_context *bld,
LLVMValueRef a);
boolean
lp_build_fast_rsqrt_available(struct lp_type type);
LLVMValueRef
lp_build_fast_rsqrt(struct lp_build_context *bld,
LLVMValueRef a);
LLVMValueRef
lp_build_polynomial(struct lp_build_context *bld,
LLVMValueRef x,
const double *coeffs,
unsigned num_coeffs);
LLVMValueRef
lp_build_cos(struct lp_build_context *bld,
LLVMValueRef a);

View File

@@ -0,0 +1,151 @@
/**************************************************************************
*
* Copyright 2013
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
* IN NO EVENT SHALL VMWARE AND/OR ITS SUPPLIERS BE LIABLE FOR
* ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
* SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
**************************************************************************/
/**
* @file
* Helper
*
* The functions in this file implement arithmetic operations with support
* for overflow detection and reporting.
*
*/
#include "lp_bld_arit_overflow.h"
#include "lp_bld_type.h"
#include "lp_bld_const.h"
#include "lp_bld_init.h"
#include "lp_bld_intr.h"
#include "lp_bld_logic.h"
#include "lp_bld_pack.h"
#include "lp_bld_debug.h"
#include "lp_bld_bitarit.h"
#include "util/u_memory.h"
#include "util/u_debug.h"
#include "util/u_math.h"
#include "util/u_string.h"
#include "util/u_cpu_detect.h"
#include <float.h>
static LLVMValueRef
build_binary_int_overflow(struct gallivm_state *gallivm,
const char *intr_prefix,
LLVMValueRef a,
LLVMValueRef b,
LLVMValueRef *ofbit)
{
LLVMBuilderRef builder = gallivm->builder;
char intr_str[256];
LLVMTypeRef type_ref;
LLVMTypeKind type_kind;
unsigned type_width;
LLVMTypeRef oelems[2];
LLVMValueRef oresult;
LLVMTypeRef otype;
debug_assert(LLVMTypeOf(a) == LLVMTypeOf(b));
type_ref = LLVMTypeOf(a);
type_kind = LLVMGetTypeKind(type_ref);
debug_assert(type_kind == LLVMIntegerTypeKind);
type_width = LLVMGetIntTypeWidth(type_ref);
debug_assert(type_width == 16 || type_width == 32 || type_width == 64);
util_snprintf(intr_str, sizeof intr_str, "%s.i%u",
intr_prefix, type_width);
oelems[0] = type_ref;
oelems[1] = LLVMInt1TypeInContext(gallivm->context);
otype = LLVMStructTypeInContext(gallivm->context, oelems, 2, FALSE);
oresult = lp_build_intrinsic_binary(builder, intr_str,
otype, a, b);
if (ofbit) {
if (*ofbit) {
*ofbit = LLVMBuildOr(
builder, *ofbit,
LLVMBuildExtractValue(builder, oresult, 1, ""), "");
} else {
*ofbit = LLVMBuildExtractValue(builder, oresult, 1, "");
}
}
return LLVMBuildExtractValue(builder, oresult, 0, "");
}
/**
* Performs unsigned addition of two integers and reports
* overflow if detected.
*
* The values @a and @b must be of the same integer type. If
* an overflow is detected the IN/OUT @ofbit parameter is used:
* - if it's pointing to a null value, the overflow bit is simply
* stored inside the variable it's pointing to,
* - if it's pointing to a valid value, then that variable,
* which must be of i1 type, is ORed with the newly detected
* overflow bit. This is done to allow chaining of a number of
* overflow functions together without having to test the
* overflow bit after every single one.
*/
LLVMValueRef
lp_build_uadd_overflow(struct gallivm_state *gallivm,
LLVMValueRef a,
LLVMValueRef b,
LLVMValueRef *ofbit)
{
return build_binary_int_overflow(gallivm, "llvm.uadd.with.overflow",
a, b, ofbit);
}
/**
* Performs unsigned multiplication of two integers and
* reports overflow if detected.
*
* The values @a and @b must be of the same integer type. If
* an overflow is detected the IN/OUT @ofbit parameter is used:
* - if it's pointing to a null value, the overflow bit is simply
* stored inside the variable it's pointing to,
* - if it's pointing to a valid value, then that variable,
* which must be of i1 type, is ORed with the newly detected
* overflow bit. This is done to allow chaining of a number of
* overflow functions together without having to test the
* overflow bit after every single one.
*/
LLVMValueRef
lp_build_umul_overflow(struct gallivm_state *gallivm,
LLVMValueRef a,
LLVMValueRef b,
LLVMValueRef *ofbit)
{
return build_binary_int_overflow(gallivm, "llvm.umul.with.overflow",
a, b, ofbit);
}
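Review note: the doc comments above describe an IN/OUT overflow bit that is ORed across calls so the caller only tests it once. The scalar sketch below mimics that chaining on uint32_t values. It is an analogy in plain C, not the gallivm API. The function names are hypothetical, and the flag starts at 0 here where the LLVM version starts from a null LLVMValueRef.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Wrapping add; ORs the overflow condition into *ofbit if given. */
uint32_t
uadd_overflow(uint32_t a, uint32_t b, int *ofbit)
{
   uint32_t res = a + b;
   if (ofbit != NULL)
      *ofbit |= (res < a);
   return res;
}

/* Wrapping multiply; ORs the overflow condition into *ofbit if given. */
uint32_t
umul_overflow(uint32_t a, uint32_t b, int *ofbit)
{
   uint64_t wide = (uint64_t) a * b;
   if (ofbit != NULL)
      *ofbit |= (wide > UINT32_MAX);
   return (uint32_t) wide;
}

/*
 * offset = start + index * stride, with a single overflow flag
 * accumulated over both operations and reported once at the end.
 */
uint32_t
element_offset(uint32_t start, uint32_t index, uint32_t stride,
               int *overflowed)
{
   int ofbit = 0;
   uint32_t scaled = umul_overflow(index, stride, &ofbit);
   uint32_t offset = uadd_overflow(start, scaled, &ofbit);
   assert(overflowed != NULL);
   *overflowed = ofbit;
   return offset;
}
```

As with the llvm.*.with.overflow intrinsics, the returned value is the wrapped result, so callers must consult the flag before trusting it.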

View File

@@ -0,0 +1,57 @@
/**************************************************************************
*
* Copyright 2013 VMware, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
* IN NO EVENT SHALL VMWARE AND/OR ITS SUPPLIERS BE LIABLE FOR
* ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
* SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
**************************************************************************/
/**
* @file
* Helper arithmetic functions with support for overflow detection
* and reporting.
*
* @author Zack Rusin <zackr@vmware.com>
*/
#ifndef LP_BLD_ARIT_OVERFLOW_H
#define LP_BLD_ARIT_OVERFLOW_H
#include "gallivm/lp_bld.h"
struct gallivm_state;
LLVMValueRef
lp_build_uadd_overflow(struct gallivm_state *gallivm,
LLVMValueRef a,
LLVMValueRef b,
LLVMValueRef *ofbit);
LLVMValueRef
lp_build_umul_overflow(struct gallivm_state *gallivm,
LLVMValueRef a,
LLVMValueRef b,
LLVMValueRef *ofbit);
#endif /* !LP_BLD_ARIT_OVERFLOW_H */

View File

@@ -79,82 +79,6 @@
/**
* Byte swap on element. It will construct a call to intrinsic llvm.bswap
* based on the type.
*
* @param res element to byte swap.
* @param type int16_t, int32_t, int64_t, float or double
* @param
*/
LLVMValueRef
lp_build_bswap(struct gallivm_state *gallivm,
LLVMValueRef res,
struct lp_type type)
{
LLVMTypeRef int_type = LLVMIntTypeInContext(gallivm->context,
type.width);
const char *intrinsic = NULL;
if (type.width == 8)
return res;
if (type.width == 16)
intrinsic = "llvm.bswap.i16";
else if (type.width == 32)
intrinsic = "llvm.bswap.i32";
else if (type.width == 64)
intrinsic = "llvm.bswap.i64";
assert (intrinsic != NULL);
/* In case of a floating-point type cast to a int of same size and then
* cast back to fp type.
*/
if (type.floating)
res = LLVMBuildBitCast(gallivm->builder, res, int_type, "");
res = lp_build_intrinsic_unary(gallivm->builder, intrinsic, int_type, res);
if (type.floating)
res = LLVMBuildBitCast(gallivm->builder, res,
lp_build_elem_type(gallivm, type), "");
return res;
}
/**
* Byte swap every element in the vector.
*
* @param packed <vector> to convert
* @param src_type <vector> type of int16_t, int32_t, int64_t, float or
* double
* @param dst_type <vector> type to return
*/
LLVMValueRef
lp_build_bswap_vec(struct gallivm_state *gallivm,
LLVMValueRef packed,
struct lp_type src_type_vec,
struct lp_type dst_type_vec)
{
LLVMBuilderRef builder = gallivm->builder;
LLVMTypeRef dst_type = lp_build_elem_type(gallivm, dst_type_vec);
LLVMValueRef res;
if (src_type_vec.length == 1) {
res = lp_build_bswap(gallivm, packed, src_type_vec);
res = LLVMBuildBitCast(gallivm->builder, res, dst_type, "");
} else {
unsigned i;
res = LLVMGetUndef(lp_build_vec_type(gallivm, dst_type_vec));
for (i = 0; i < src_type_vec.length; ++i) {
LLVMValueRef index = lp_build_const_int32(gallivm, i);
LLVMValueRef elem = LLVMBuildExtractElement(builder, packed, index, "");
elem = lp_build_bswap(gallivm, elem, src_type_vec);
elem = LLVMBuildBitCast(gallivm->builder, elem, dst_type, "");
res = LLVMBuildInsertElement(gallivm->builder, res, elem, index, "");
}
}
return res;
}
/**
* Converts int16 half-float to float32
* Note this can be performed in 1 instruction if vcvtph2ps exists (f16c/cvt16)

View File

@@ -42,17 +42,6 @@
struct lp_type;
LLVMValueRef
lp_build_bswap(struct gallivm_state *gallivm,
LLVMValueRef res,
struct lp_type type);
LLVMValueRef
lp_build_bswap_vec(struct gallivm_state *gallivm,
LLVMValueRef packed,
struct lp_type src_type,
struct lp_type dst_type);
LLVMValueRef
lp_build_half_to_float(struct gallivm_state *gallivm,
LLVMValueRef src);

@@ -188,7 +188,7 @@ lp_build_mask_value(struct lp_build_mask_context *mask)
/**
* Update boolean mask with given value (bitwise AND).
* Typically used to update the quad's pixel alive/killed mask
* after depth testing, alpha testing, TGSI_OPCODE_KIL, etc.
* after depth testing, alpha testing, TGSI_OPCODE_KILL_IF, etc.
*/
void
lp_build_mask_update(struct lp_build_mask_context *mask,

@@ -158,4 +158,16 @@ lp_build_rgb9e5_to_float(struct gallivm_state *gallivm,
LLVMValueRef src,
LLVMValueRef *dst);
LLVMValueRef
lp_build_float_to_srgb_packed(struct gallivm_state *gallivm,
const struct util_format_description *dst_fmt,
struct lp_type src_type,
LLVMValueRef *src);
LLVMValueRef
lp_build_srgb_to_linear(struct gallivm_state *gallivm,
struct lp_type src_type,
LLVMValueRef src);
#endif /* !LP_BLD_FORMAT_H */

@@ -139,12 +139,12 @@ format_matches_type(const struct util_format_description *desc,
/**
* Unpack a single pixel into its RGBA components.
* Unpack a single pixel into its XYZW components.
*
* @param desc the pixel format for the packed pixel value
* @param packed integer pixel in a format such as PIPE_FORMAT_B8G8R8A8_UNORM
*
* @return RGBA in a float[4] or ubyte[4] or ushort[4] vector.
* @return XYZW in a float[4] or ubyte[4] or ushort[4] vector.
*/
static INLINE LLVMValueRef
lp_build_unpack_arith_rgba_aos(struct gallivm_state *gallivm,
@@ -159,7 +159,6 @@ lp_build_unpack_arith_rgba_aos(struct gallivm_state *gallivm,
boolean normalized;
boolean needs_uitofp;
unsigned shift;
unsigned i;
/* TODO: Support more formats */
@@ -172,10 +171,6 @@ lp_build_unpack_arith_rgba_aos(struct gallivm_state *gallivm,
* matches floating point size */
assert (LLVMTypeOf(packed) == LLVMInt32TypeInContext(gallivm->context));
#ifdef PIPE_ARCH_BIG_ENDIAN
packed = lp_build_bswap(gallivm, packed, lp_type_uint(32));
#endif
/* Broadcast the packed value to all four channels
* before: packed = BGRA
* after: packed = {BGRA, BGRA, BGRA, BGRA}
@@ -194,11 +189,11 @@ lp_build_unpack_arith_rgba_aos(struct gallivm_state *gallivm,
/* Initialize vector constants */
normalized = FALSE;
needs_uitofp = FALSE;
shift = 0;
/* Loop over 4 color components */
for (i = 0; i < 4; ++i) {
unsigned bits = desc->channel[i].size;
unsigned shift = desc->channel[i].shift;
if (desc->channel[i].type == UTIL_FORMAT_TYPE_VOID) {
shifts[i] = LLVMGetUndef(LLVMInt32TypeInContext(gallivm->context));
@@ -224,12 +219,10 @@ lp_build_unpack_arith_rgba_aos(struct gallivm_state *gallivm,
else
scales[i] = lp_build_const_float(gallivm, 1.0);
}
shift += bits;
}
/* Ex: convert packed = {BGRA, BGRA, BGRA, BGRA}
* into masked = {B, G, R, A}
/* Ex: convert packed = {XYZW, XYZW, XYZW, XYZW}
* into masked = {X, Y, Z, W}
*/
shifted = LLVMBuildLShr(builder, packed, LLVMConstVector(shifts, 4), "");
masked = LLVMBuildAnd(builder, shifted, LLVMConstVector(masks, 4), "");
@@ -276,7 +269,6 @@ lp_build_pack_rgba_aos(struct gallivm_state *gallivm,
LLVMValueRef shifts[4];
LLVMValueRef scales[4];
boolean normalized;
unsigned shift;
unsigned i, j;
assert(desc->layout == UTIL_FORMAT_LAYOUT_PLAIN);
@@ -302,9 +294,9 @@ lp_build_pack_rgba_aos(struct gallivm_state *gallivm,
LLVMConstVector(swizzles, 4), "");
normalized = FALSE;
shift = 0;
for (i = 0; i < 4; ++i) {
unsigned bits = desc->channel[i].size;
unsigned shift = desc->channel[i].shift;
if (desc->channel[i].type == UTIL_FORMAT_TYPE_VOID) {
shifts[i] = LLVMGetUndef(LLVMInt32TypeInContext(gallivm->context));
@@ -325,8 +317,6 @@ lp_build_pack_rgba_aos(struct gallivm_state *gallivm,
else
scales[i] = lp_build_const_float(gallivm, 1.0);
}
shift += bits;
}
if (normalized)
@@ -410,16 +400,11 @@ lp_build_fetch_rgba_aos(struct gallivm_state *gallivm,
packed = lp_build_gather(gallivm, type.length/4,
format_desc->block.bits, type.width*4,
base_ptr, offset);
base_ptr, offset, TRUE);
assert(format_desc->block.bits <= vec_len);
packed = LLVMBuildBitCast(gallivm->builder, packed, dst_vec_type, "");
#ifdef PIPE_ARCH_BIG_ENDIAN
if (type.floating)
packed = lp_build_bswap_vec(gallivm, packed, type,
lp_type_float_vec(type.width, vec_len));
#endif
return lp_build_format_swizzle_aos(format_desc, &bld, packed);
}
@@ -453,7 +438,7 @@ lp_build_fetch_rgba_aos(struct gallivm_state *gallivm,
packed = lp_build_gather_elem(gallivm, num_pixels,
format_desc->block.bits, 32,
base_ptr, offset, k);
base_ptr, offset, k, FALSE);
tmps[k] = lp_build_unpack_arith_rgba_aos(gallivm,
format_desc,
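
The hunks above drop the running "shift += bits" counter and read each channel's bit position from the format description instead, so the same unpack loop can serve layouts whose channels are not stored in ascending bit order. A minimal scalar sketch of that idea follows. struct channel_desc and unpack_unorm are hypothetical, simplified stand-ins for illustration, not the util_format_description API, and only unsigned normalized channels are handled.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for one channel of a format description. */
struct channel_desc {
   unsigned shift;   /* bit position of the channel's least significant bit */
   unsigned size;    /* channel width in bits, 0 if the channel is unused */
};

/* Unpack four unsigned normalized channels from one packed 32-bit pixel. */
static void
unpack_unorm(uint32_t packed, const struct channel_desc chan[4], float out[4])
{
   unsigned i;

   for (i = 0; i < 4; ++i) {
      uint32_t mask, bits;

      if (chan[i].size == 0) {
         out[i] = 0.0f;
         continue;
      }
      /* Shift the channel down to bit 0, then mask off its neighbours. */
      mask = chan[i].size >= 32 ? 0xffffffffu : (1u << chan[i].size) - 1u;
      bits = (packed >> chan[i].shift) & mask;
      /* Normalize to [0, 1]. */
      out[i] = (float)bits / (float)mask;
   }
}
```

With shifts {0, 8, 16, 24} the first channel sits in the low byte; with shifts {24, 16, 8, 0} the same loop reads it from the high byte, without any byte swap of the packed value.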

@@ -40,58 +40,6 @@
#include "pipe/p_state.h"
#ifdef PIPE_ARCH_BIG_ENDIAN
static LLVMValueRef
lp_build_read_int_bswap(struct gallivm_state *gallivm,
LLVMValueRef base_ptr,
unsigned src_width,
LLVMTypeRef src_type,
unsigned i,
LLVMTypeRef dst_type)
{
LLVMBuilderRef builder = gallivm->builder;
LLVMValueRef index = lp_build_const_int32(gallivm, i);
LLVMValueRef ptr = LLVMBuildGEP(builder, base_ptr, &index, 1, "");
LLVMValueRef res = LLVMBuildLoad(builder, ptr, "");
res = lp_build_bswap(gallivm, res, lp_type_uint(src_width));
return LLVMBuildBitCast(builder, res, dst_type, "");
}
static LLVMValueRef
lp_build_fetch_read_big_endian(struct gallivm_state *gallivm,
struct lp_type src_type,
LLVMValueRef base_ptr)
{
LLVMBuilderRef builder = gallivm->builder;
unsigned src_width = src_type.width;
unsigned length = src_type.length;
LLVMTypeRef src_elem_type = LLVMIntTypeInContext(gallivm->context, src_width);
LLVMTypeRef dst_elem_type = lp_build_elem_type (gallivm, src_type);
LLVMTypeRef src_ptr_type = LLVMPointerType(src_elem_type, 0);
LLVMValueRef res;
base_ptr = LLVMBuildPointerCast(builder, base_ptr, src_ptr_type, "");
if (length == 1) {
/* Scalar */
res = lp_build_read_int_bswap(gallivm, base_ptr, src_width, src_elem_type,
0, dst_elem_type);
} else {
/* Vector */
LLVMTypeRef dst_vec_type = LLVMVectorType(dst_elem_type, length);
unsigned i;
res = LLVMGetUndef(dst_vec_type);
for (i = 0; i < length; ++i) {
LLVMValueRef index = lp_build_const_int32(gallivm, i);
LLVMValueRef elem = lp_build_read_int_bswap(gallivm, base_ptr, src_width,
src_elem_type, i, dst_elem_type);
res = LLVMBuildInsertElement(builder, res, elem, index, "");
}
}
return res;
}
#endif
/**
* @brief lp_build_fetch_rgba_aos_array
@@ -124,13 +72,9 @@ lp_build_fetch_rgba_aos_array(struct gallivm_state *gallivm,
/* Read whole vector from memory, unaligned */
ptr = LLVMBuildGEP(builder, base_ptr, &offset, 1, "");
#ifdef PIPE_ARCH_BIG_ENDIAN
res = lp_build_fetch_read_big_endian(gallivm, src_type, ptr);
#else
ptr = LLVMBuildPointerCast(builder, ptr, LLVMPointerType(src_vec_type, 0), "");
res = LLVMBuildLoad(builder, ptr, "");
lp_set_load_alignment(res, src_type.width / 8);
#endif
/* Truncate doubles to float */
if (src_type.floating && src_type.width == 64) {

@@ -115,7 +115,6 @@ lp_build_unpack_rgba_soa(struct gallivm_state *gallivm,
LLVMBuilderRef builder = gallivm->builder;
struct lp_build_context bld;
LLVMValueRef inputs[4];
unsigned start;
unsigned chan;
assert(format_desc->layout == UTIL_FORMAT_LAYOUT_PLAIN);
@@ -128,9 +127,9 @@ lp_build_unpack_rgba_soa(struct gallivm_state *gallivm,
lp_build_context_init(&bld, gallivm, type);
/* Decode the input vector components */
start = 0;
for (chan = 0; chan < format_desc->nr_channels; ++chan) {
const unsigned width = format_desc->channel[chan].size;
const unsigned start = format_desc->channel[chan].shift;
const unsigned stop = start + width;
LLVMValueRef input;
@@ -164,11 +163,23 @@ lp_build_unpack_rgba_soa(struct gallivm_state *gallivm,
*/
if (type.floating) {
if(format_desc->channel[chan].normalized)
input = lp_build_unsigned_norm_to_float(gallivm, width, type, input);
else
input = LLVMBuildSIToFP(builder, input,
lp_build_vec_type(gallivm, type), "");
if (format_desc->colorspace == UTIL_FORMAT_COLORSPACE_SRGB) {
assert(width == 8);
if (format_desc->swizzle[3] == chan) {
input = lp_build_unsigned_norm_to_float(gallivm, width, type, input);
}
else {
struct lp_type conv_type = lp_uint_type(type);
input = lp_build_srgb_to_linear(gallivm, conv_type, input);
}
}
else {
if(format_desc->channel[chan].normalized)
input = lp_build_unsigned_norm_to_float(gallivm, width, type, input);
else
input = LLVMBuildSIToFP(builder, input,
lp_build_vec_type(gallivm, type), "");
}
}
else if (format_desc->channel[chan].pure_integer) {
/* Nothing to do */
@@ -256,8 +267,6 @@ lp_build_unpack_rgba_soa(struct gallivm_state *gallivm,
}
inputs[chan] = input;
start = stop;
}
lp_build_format_swizzle_soa(format_desc, &bld, inputs, rgba_out);
@@ -291,7 +300,11 @@ lp_build_rgba8_to_fi32_soa(struct gallivm_state *gallivm,
/* Decode the input vector components */
for (chan = 0; chan < 4; ++chan) {
#ifdef PIPE_ARCH_LITTLE_ENDIAN
unsigned start = chan*8;
#else
unsigned start = (3-chan)*8;
#endif
unsigned stop = start + 8;
LLVMValueRef input;
@@ -343,6 +356,7 @@ lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,
if (format_desc->layout == UTIL_FORMAT_LAYOUT_PLAIN &&
(format_desc->colorspace == UTIL_FORMAT_COLORSPACE_RGB ||
format_desc->colorspace == UTIL_FORMAT_COLORSPACE_SRGB ||
format_desc->colorspace == UTIL_FORMAT_COLORSPACE_ZS) &&
format_desc->block.width == 1 &&
format_desc->block.height == 1 &&
@@ -360,13 +374,14 @@ lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,
/*
* gather the texels from the texture
* Ex: packed = {BGRA, BGRA, BGRA, BGRA}.
* Ex: packed = {XYZW, XYZW, XYZW, XYZW}
*/
assert(format_desc->block.bits <= type.width);
packed = lp_build_gather(gallivm,
type.length,
format_desc->block.bits,
type.width,
base_ptr, offset);
base_ptr, offset, FALSE);
/*
* convert texels to float rgba
@@ -391,7 +406,8 @@ lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,
packed = lp_build_gather(gallivm, type.length,
format_desc->block.bits,
type.width, base_ptr, offset);
type.width, base_ptr, offset,
FALSE);
if (format_desc->format == PIPE_FORMAT_R11G11B10_FLOAT) {
lp_build_r11g11b10_to_float(gallivm, packed, rgba_out);
}
@@ -418,14 +434,14 @@ lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,
LLVMValueRef s_offset = lp_build_const_int_vec(gallivm, type, 4);
offset = LLVMBuildAdd(builder, offset, s_offset, "");
packed = lp_build_gather(gallivm, type.length,
32, type.width, base_ptr, offset);
32, type.width, base_ptr, offset, FALSE);
packed = LLVMBuildAnd(builder, packed,
lp_build_const_int_vec(gallivm, type, mask), "");
}
else {
assert (format_desc->format == PIPE_FORMAT_Z32_FLOAT_S8X24_UINT);
packed = lp_build_gather(gallivm, type.length,
32, type.width, base_ptr, offset);
32, type.width, base_ptr, offset, TRUE);
packed = LLVMBuildBitCast(builder, packed,
lp_build_vec_type(gallivm, type), "");
}
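
The lp_build_rgba8_to_fi32_soa hunk above picks the start bit of each 8-bit channel according to host endianness: chan*8 on little endian, (3-chan)*8 on big endian. The sketch below shows why, assuming the four bytes R, G, B, A are loaded from memory as a single 32-bit integer. rgba8_chan_start and rgba8_extract are hypothetical names, and endianness is passed as a runtime flag here only so both cases can be exercised on one machine.

```c
#include <stdint.h>

/* Start bit of channel `chan` (0..3) inside a 32-bit word that was loaded
 * from four consecutive bytes in memory. */
static unsigned
rgba8_chan_start(unsigned chan, int little_endian)
{
   return little_endian ? chan * 8 : (3 - chan) * 8;
}

/* Extract one 8-bit channel from the packed word. */
static uint8_t
rgba8_extract(uint32_t packed, unsigned chan, int little_endian)
{
   unsigned start = rgba8_chan_start(chan, little_endian);

   return (uint8_t)((packed >> start) & 0xffu);
}
```

For the bytes 0x11 0x22 0x33 0x44 in memory, a little-endian load yields 0x44332211 and a big-endian load yields 0x11223344. In both cases channel 0 comes out as 0x11.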

@@ -0,0 +1,344 @@
/**************************************************************************
*
* Copyright 2013 VMware, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
* IN NO EVENT SHALL VMWARE AND/OR ITS SUPPLIERS BE LIABLE FOR
* ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
* SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
**************************************************************************/
/**
* @file
* Format conversion code for srgb formats.
*
* Functions for converting from srgb to linear and vice versa.
* From http://www.opengl.org/registry/specs/EXT/texture_sRGB.txt:
*
* srgb->linear:
* cl = cs / 12.92, cs <= 0.04045
* cl = ((cs + 0.055)/1.055)^2.4, cs > 0.04045
*
* linear->srgb:
* if (isnan(cl)) {
* Map IEEE-754 Not-a-number to zero.
* cs = 0.0;
* } else if (cl > 1.0) {
* cs = 1.0;
* } else if (cl < 0.0) {
* cs = 0.0;
* } else if (cl < 0.0031308) {
* cs = 12.92 * cl;
* } else {
* cs = 1.055 * pow(cl, 0.41666) - 0.055;
* }
*
* This does not need to be accurate, however at least for d3d10
* (http://msdn.microsoft.com/en-us/library/windows/desktop/dd607323%28v=vs.85%29.aspx):
* 1) For srgb->linear, it is required that the error on the srgb side is
* not larger than 0.5f, which I interpret to mean that if you map the value
* back to srgb from linear using the ideal conversion, it would not be off
* by more than 0.5f (that is, it would map to the same 8-bit integer value
* as it was before conversion to linear).
* 2) linear->srgb is permitted an error of 0.6f, which luckily is quite a
* large allowance.
* 3) Additionally, all srgb values converted to linear and back must result
* in the same value as they were originally.
*
* @author Roland Scheidegger <sroland@vmware.com>
*/
#include "util/u_debug.h"
#include "lp_bld_type.h"
#include "lp_bld_const.h"
#include "lp_bld_arit.h"
#include "lp_bld_bitarit.h"
#include "lp_bld_logic.h"
#include "lp_bld_format.h"
/**
* Convert srgb int values to linear float values.
* Several possibilities how to do this, e.g.
* - table
* - doing the pow() with int-to-float and float-to-int tricks
* (http://stackoverflow.com/questions/6475373/optimizations-for-pow-with-const-non-integer-exponent)
* - just using standard polynomial approximation
* (3rd order polynomial is required for crappy but just sufficient accuracy)
*
* @param src integer (vector) value(s) to convert
* (8 bit values unpacked to 32 bit already).
*/
LLVMValueRef
lp_build_srgb_to_linear(struct gallivm_state *gallivm,
struct lp_type src_type,
LLVMValueRef src)
{
struct lp_type f32_type = lp_type_float_vec(32, src_type.length * 32);
struct lp_build_context f32_bld;
LLVMValueRef srcf, part_lin, part_pow, is_linear, lin_const, lin_thresh;
double coeffs[4] = {0.0023f,
0.0030f / 255.0f,
0.6935f / (255.0f * 255.0f),
0.3012f / (255.0f * 255.0f * 255.0f)
};
assert(src_type.width == 32);
lp_build_context_init(&f32_bld, gallivm, f32_type);
/*
* using polynomial: (src * (src * (src * 0.3012 + 0.6935) + 0.0030) + 0.0023)
* ( poly = 0.3012*x^3 + 0.6935*x^2 + 0.0030*x + 0.0023)
* (found with octave polyfit and some magic as I couldn't get the error
* function right). Using the above mentioned error function, the values stay
* within +-0.35, except for the lowest values - hence tweaking linear segment
* to cover the first 16 instead of the first 11 values (the error stays
* just about acceptable there too).
* Hence: lin = src > 15 ? poly : src / 12.6
* This function really only makes sense for vectors, should use LUT otherwise.
* All in all (including float conversion) 11 instructions (with sse4.1),
* 6 constants (polynomial could be done with 1 instruction less at the cost
* of slightly worse dependency chain, fma should also help).
*/
/* doing the 1/255 mul as part of the approximation */
srcf = lp_build_int_to_float(&f32_bld, src);
lin_const = lp_build_const_vec(gallivm, f32_type, 1.0f / (12.6f * 255.0f));
part_lin = lp_build_mul(&f32_bld, srcf, lin_const);
part_pow = lp_build_polynomial(&f32_bld, srcf, coeffs, 4);
lin_thresh = lp_build_const_vec(gallivm, f32_type, 15.0f);
is_linear = lp_build_compare(gallivm, f32_type, PIPE_FUNC_LEQUAL, srcf, lin_thresh);
return lp_build_select(&f32_bld, is_linear, part_lin, part_pow);
}
/**
* Convert linear float values to srgb int values.
* Several possibilities how to do this, e.g.
* - use table (based on exponent/highest order mantissa bits) and do
* linear interpolation (https://gist.github.com/rygorous/2203834)
* - Chebyshev polynomial
* - Approximation using reciprocals
* - using int-to-float and float-to-int tricks for pow()
* (http://stackoverflow.com/questions/6475373/optimizations-for-pow-with-const-non-integer-exponent)
*
* @param src float (vector) value(s) to convert.
*/
static LLVMValueRef
lp_build_linear_to_srgb(struct gallivm_state *gallivm,
struct lp_type src_type,
LLVMValueRef src)
{
LLVMBuilderRef builder = gallivm->builder;
struct lp_build_context f32_bld;
LLVMValueRef lin_thresh, lin, lin_const, is_linear, tmp, pow_final;
lp_build_context_init(&f32_bld, gallivm, src_type);
src = lp_build_clamp(&f32_bld, src, f32_bld.zero, f32_bld.one);
if (0) {
/*
* using int-to-float and float-to-int trick for pow().
* This is much more accurate than necessary thanks to the correction,
* but it most certainly makes no sense without rsqrt available.
* Bonus points if you understand how this works...
* All in all (including min/max clamp, conversion) 19 instructions.
*/
float exp_f = 2.0f / 3.0f;
/* some compilers can't do exp2f, so this is exp2f(127.0f/exp_f - 127.0f) */
float exp2f_c = 1.30438178253e+19f;
float coeff_f = 0.62996f;
LLVMValueRef pow_approx, coeff, x2, exponent, pow_1, pow_2;
struct lp_type int_type = lp_int_type(src_type);
/*
* First calculate approx x^(8/12)
*/
exponent = lp_build_const_vec(gallivm, src_type, exp_f);
coeff = lp_build_const_vec(gallivm, src_type,
exp2f_c * powf(coeff_f, 1.0f / exp_f));
/* premultiply src */
tmp = lp_build_mul(&f32_bld, coeff, src);
/* "log2" */
tmp = LLVMBuildBitCast(builder, tmp, lp_build_vec_type(gallivm, int_type), "");
tmp = lp_build_int_to_float(&f32_bld, tmp);
/* multiply for pow */
tmp = lp_build_mul(&f32_bld, tmp, exponent);
/* "exp2" */
pow_approx = lp_build_itrunc(&f32_bld, tmp);
pow_approx = LLVMBuildBitCast(builder, pow_approx,
lp_build_vec_type(gallivm, src_type), "");
/*
* Since that pow was inaccurate (like 3 bits, though each sqrt step would
* give another bit), compensate the error (which is why we chose another
* exponent in the first place).
*/
/* x * x^(8/12) = x^(20/12) */
pow_1 = lp_build_mul(&f32_bld, pow_approx, src);
/* x * x * x^(-4/12) = x^(20/12) */
/* Should avoid using rsqrt if it's not available, but
* using x * x^(4/12) * x^(4/12) instead will change error weight */
tmp = lp_build_fast_rsqrt(&f32_bld, pow_approx);
x2 = lp_build_mul(&f32_bld, src, src);
pow_2 = lp_build_mul(&f32_bld, x2, tmp);
/* average the values so the errors cancel out, compensate bias,
* we also squeeze the 1.055 mul of the srgb conversion plus the 255.0 mul
* for conversion to int in here */
tmp = lp_build_add(&f32_bld, pow_1, pow_2);
coeff = lp_build_const_vec(gallivm, src_type,
1.0f / (3.0f * coeff_f) * 0.999852f *
powf(1.055f * 255.0f, 4.0f));
pow_final = lp_build_mul(&f32_bld, tmp, coeff);
/* x^(5/12) = rsqrt(rsqrt(x^(20/12))) */
if (lp_build_fast_rsqrt_available(src_type)) {
pow_final = lp_build_fast_rsqrt(&f32_bld,
lp_build_fast_rsqrt(&f32_bld, pow_final));
}
else {
pow_final = lp_build_sqrt(&f32_bld, lp_build_sqrt(&f32_bld, pow_final));
}
pow_final = lp_build_add(&f32_bld, pow_final,
lp_build_const_vec(gallivm, src_type, -0.055f * 255.0f));
}
else {
/*
* using "rational polynomial" approximation here.
* Essentially y = a*x^0.375 + b*x^0.5 + c, with also
* factoring in the 255.0 mul and the scaling mul.
* (a is closer to actual value so has higher weight than b.)
* Note: the constants are magic values. They were found empirically,
* possibly could be improved but good enough (be VERY careful with
* error metric if you'd want to tweak them, they also MUST fit with
* the crappy polynomial above for srgb->linear since it is required
* that each srgb value maps back to the same value).
* This function has an error of max +-0.17 (and we'd only require +-0.6),
* for the approximated srgb->linear values the error is naturally larger
* (+-0.42) but still accurate enough (required +-0.5 essentially).
* All in all (including min/max clamp, conversion) 15 instructions.
* FMA would help (minus 2 instructions).
*/
LLVMValueRef x05, x0375, a_const, b_const, c_const, tmp2;
if (lp_build_fast_rsqrt_available(src_type)) {
tmp = lp_build_fast_rsqrt(&f32_bld, src);
x05 = lp_build_mul(&f32_bld, src, tmp);
}
else {
/*
* I don't really expect this to be practical without rsqrt
* but there's no reason for triple punishment so at least
* save the otherwise resulting division and unnecessary mul...
*/
x05 = lp_build_sqrt(&f32_bld, src);
}
tmp = lp_build_mul(&f32_bld, x05, src);
if (lp_build_fast_rsqrt_available(src_type)) {
x0375 = lp_build_fast_rsqrt(&f32_bld, lp_build_fast_rsqrt(&f32_bld, tmp));
}
else {
x0375 = lp_build_sqrt(&f32_bld, lp_build_sqrt(&f32_bld, tmp));
}
a_const = lp_build_const_vec(gallivm, src_type, 0.675f * 1.0622 * 255.0f);
b_const = lp_build_const_vec(gallivm, src_type, 0.325f * 1.0622 * 255.0f);
c_const = lp_build_const_vec(gallivm, src_type, -0.0620f * 255.0f);
tmp = lp_build_mul(&f32_bld, a_const, x0375);
tmp2 = lp_build_mul(&f32_bld, b_const, x05);
tmp2 = lp_build_add(&f32_bld, tmp2, c_const);
pow_final = lp_build_add(&f32_bld, tmp, tmp2);
}
/* linear part is easy */
lin_const = lp_build_const_vec(gallivm, src_type, 12.92f * 255.0f);
lin = lp_build_mul(&f32_bld, src, lin_const);
lin_thresh = lp_build_const_vec(gallivm, src_type, 0.0031308f);
is_linear = lp_build_compare(gallivm, src_type, PIPE_FUNC_LEQUAL, src, lin_thresh);
tmp = lp_build_select(&f32_bld, is_linear, lin, pow_final);
f32_bld.type.sign = 0;
return lp_build_iround(&f32_bld, tmp);
}
/**
* Convert linear float soa values to packed srgb AoS values.
* This only handles packed formats which are 4x8bit in size
* (rgba and rgbx plus swizzles).
*
* @param src float SoA (vector) values to convert.
*/
LLVMValueRef
lp_build_float_to_srgb_packed(struct gallivm_state *gallivm,
const struct util_format_description *dst_fmt,
struct lp_type src_type,
LLVMValueRef *src)
{
LLVMBuilderRef builder = gallivm->builder;
unsigned chan;
struct lp_build_context f32_bld;
struct lp_type int32_type = lp_int_type(src_type);
LLVMValueRef tmpsrgb[4], alpha, dst;
lp_build_context_init(&f32_bld, gallivm, src_type);
/* rgb is subject to linear->srgb conversion, alpha is not */
for (chan = 0; chan < 3; chan++) {
tmpsrgb[chan] = lp_build_linear_to_srgb(gallivm, src_type, src[chan]);
}
/*
* can't use lp_build_conv since we want to keep values as 32bit
* here so we can interleave with rgb to go from SoA->AoS.
*/
alpha = lp_build_clamp(&f32_bld, src[3], f32_bld.zero, f32_bld.one);
alpha = lp_build_mul(&f32_bld, alpha,
lp_build_const_vec(gallivm, src_type, 255.0f));
tmpsrgb[3] = lp_build_iround(&f32_bld, alpha);
dst = lp_build_zero(gallivm, int32_type);
for (chan = 0; chan < dst_fmt->nr_channels; chan++) {
if (dst_fmt->swizzle[chan] <= UTIL_FORMAT_SWIZZLE_W) {
unsigned ls;
LLVMValueRef shifted, shift_val;
ls = dst_fmt->channel[dst_fmt->swizzle[chan]].shift;
shift_val = lp_build_const_int_vec(gallivm, int32_type, ls);
shifted = LLVMBuildShl(builder, tmpsrgb[chan], shift_val, "");
dst = LLVMBuildOr(builder, dst, shifted, "");
}
}
return dst;
}

@@ -497,7 +497,7 @@ lp_build_fetch_subsampled_rgba_aos(struct gallivm_state *gallivm,
assert(format_desc->block.width == 2);
assert(format_desc->block.height == 1);
packed = lp_build_gather(gallivm, n, 32, 32, base_ptr, offset);
packed = lp_build_gather(gallivm, n, 32, 32, base_ptr, offset, FALSE);
(void)j;

@@ -78,7 +78,8 @@ lp_build_gather_elem(struct gallivm_state *gallivm,
unsigned dst_width,
LLVMValueRef base_ptr,
LLVMValueRef offsets,
unsigned i)
unsigned i,
boolean vector_justify)
{
LLVMTypeRef src_type = LLVMIntTypeInContext(gallivm->context, src_width);
LLVMTypeRef src_ptr_type = LLVMPointerType(src_type, 0);
@@ -97,10 +98,12 @@ lp_build_gather_elem(struct gallivm_state *gallivm,
res = LLVMBuildTrunc(gallivm->builder, res, dst_elem_type, "");
} else if (src_width < dst_width) {
res = LLVMBuildZExt(gallivm->builder, res, dst_elem_type, "");
if (vector_justify) {
#ifdef PIPE_ARCH_BIG_ENDIAN
res = LLVMBuildShl(gallivm->builder, res,
LLVMConstInt(dst_elem_type, dst_width - src_width, 0), "");
res = LLVMBuildShl(gallivm->builder, res,
LLVMConstInt(dst_elem_type, dst_width - src_width, 0), "");
#endif
}
}
return res;
@@ -112,11 +115,20 @@ lp_build_gather_elem(struct gallivm_state *gallivm,
* Use for fetching texels from a texture.
* For SSE, typical values are length=4, src_width=32, dst_width=32.
*
* When src_width < dst_width, the return value can be justified in
* one of two ways:
* "integer justification" is used when the caller treats the destination
* as a packed integer bitmask, as described by the channels' "shift" and
* "width" fields;
* "vector justification" is used when the caller casts the destination
* to a vector and needs channel X to be in vector element 0.
*
* @param length length of the offsets
* @param src_width src element width in bits
* @param dst_width result element width in bits (src will be expanded to fit)
* @param base_ptr base pointer, should be a i8 pointer type.
* @param offsets vector with offsets
* @param vector_justify select vector rather than integer justification
*/
LLVMValueRef
lp_build_gather(struct gallivm_state *gallivm,
@@ -124,7 +136,8 @@ lp_build_gather(struct gallivm_state *gallivm,
unsigned src_width,
unsigned dst_width,
LLVMValueRef base_ptr,
LLVMValueRef offsets)
LLVMValueRef offsets,
boolean vector_justify)
{
LLVMValueRef res;
@@ -132,7 +145,7 @@ lp_build_gather(struct gallivm_state *gallivm,
/* Scalar */
return lp_build_gather_elem(gallivm, length,
src_width, dst_width,
base_ptr, offsets, 0);
base_ptr, offsets, 0, vector_justify);
} else {
/* Vector */
@@ -146,7 +159,7 @@ lp_build_gather(struct gallivm_state *gallivm,
LLVMValueRef elem;
elem = lp_build_gather_elem(gallivm, length,
src_width, dst_width,
base_ptr, offsets, i);
base_ptr, offsets, i, vector_justify);
res = LLVMBuildInsertElement(gallivm->builder, res, elem, index, "");
}
}
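
The two justification modes described in the lp_build_gather comment differ only when src_width < dst_width on a big-endian host. Integer justification leaves the zero-extended value in the low bits, while vector justification shifts it to the top so that the first source byte ends up in vector element 0 after a cast. A scalar sketch follows, assuming dst_width <= 32. gather_elem_justify is a hypothetical name, and endianness is a runtime flag here (rather than PIPE_ARCH_BIG_ENDIAN) purely so both cases can be shown.

```c
#include <stdint.h>

/* Widen a src_width-bit value to dst_width bits (dst_width <= 32 assumed). */
static uint32_t
gather_elem_justify(uint32_t src, unsigned src_width, unsigned dst_width,
                    int vector_justify, int big_endian)
{
   uint32_t res = src;   /* zero extension: value stays in the low bits */

   /* Only vector justification on big endian moves the value to the top. */
   if (src_width < dst_width && vector_justify && big_endian)
      res <<= dst_width - src_width;

   return res;
}
```

For an 8-bit source 0xAB widened to 32 bits, integer justification gives 0x000000AB on either endianness, and vector justification on big endian gives 0xAB000000.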

@@ -47,7 +47,8 @@ lp_build_gather_elem(struct gallivm_state *gallivm,
unsigned dst_width,
LLVMValueRef base_ptr,
LLVMValueRef offsets,
unsigned i);
unsigned i,
boolean vector_justify);
LLVMValueRef
lp_build_gather(struct gallivm_state *gallivm,
@@ -55,7 +56,8 @@ lp_build_gather(struct gallivm_state *gallivm,
unsigned src_width,
unsigned dst_width,
LLVMValueRef base_ptr,
LLVMValueRef offsets);
LLVMValueRef offsets,
boolean vector_justify);
LLVMValueRef
lp_build_gather_values(struct gallivm_state * gallivm,

@@ -49,7 +49,7 @@
* - MC-JIT supports limited OSes (MacOSX and Linux)
* - standard JIT in LLVM 3.1, with backports
*/
#if defined(PIPE_ARCH_PPC_64) || defined(PIPE_ARCH_S390)
#if defined(PIPE_ARCH_PPC_64) || defined(PIPE_ARCH_S390) || defined(PIPE_ARCH_ARM) || defined(PIPE_ARCH_AARCH64)
# define USE_MCJIT 1
# define HAVE_AVX 0
#elif HAVE_LLVM >= 0x0302 || (HAVE_LLVM == 0x0301 && defined(HAVE_JIT_AVX_SUPPORT))

@@ -215,7 +215,7 @@ lp_build_rho(struct lp_build_sample_context *bld,
struct lp_build_context *float_size_bld = &bld->float_size_in_bld;
struct lp_build_context *float_bld = &bld->float_bld;
struct lp_build_context *coord_bld = &bld->coord_bld;
struct lp_build_context *perquadf_bld = &bld->perquadf_bld;
struct lp_build_context *levelf_bld = &bld->levelf_bld;
const unsigned dims = bld->dims;
LLVMValueRef ddx_ddy[2];
LLVMBuilderRef builder = bld->gallivm->builder;
@@ -235,6 +235,8 @@ lp_build_rho(struct lp_build_sample_context *bld,
/* Note that all simplified calculations will only work for isotropic filtering */
assert(bld->num_lods != length);
first_level = bld->dynamic_state->first_level(bld->dynamic_state,
bld->gallivm, texture_unit);
first_level_vec = lp_build_broadcast_scalar(int_size_bld, first_level);
@@ -248,14 +250,14 @@ lp_build_rho(struct lp_build_sample_context *bld,
* Cube map code did already everything except size mul and per-quad extraction.
*/
rho = lp_build_pack_aos_scalars(bld->gallivm, coord_bld->type,
perquadf_bld->type, cube_rho, 0);
levelf_bld->type, cube_rho, 0);
if (gallivm_debug & GALLIVM_DEBUG_NO_RHO_APPROX) {
rho = lp_build_sqrt(perquadf_bld, rho);
rho = lp_build_sqrt(levelf_bld, rho);
}
/* Could optimize this for single quad just skip the broadcast */
cubesize = lp_build_extract_broadcast(gallivm, bld->float_size_in_type,
perquadf_bld->type, float_size, index0);
rho = lp_build_mul(perquadf_bld, cubesize, rho);
levelf_bld->type, float_size, index0);
rho = lp_build_mul(levelf_bld, cubesize, rho);
}
else if (derivs && !(bld->static_texture_state->target == PIPE_TEXTURE_CUBE)) {
LLVMValueRef ddmax[3], ddx[3], ddy[3];
@@ -289,12 +291,12 @@ lp_build_rho(struct lp_build_sample_context *bld,
}
rho_vec = lp_build_max(coord_bld, rho_xvec, rho_yvec);
rho = lp_build_pack_aos_scalars(bld->gallivm, coord_bld->type,
perquadf_bld->type, rho_vec, 0);
levelf_bld->type, rho_vec, 0);
/*
* note that as long as we don't care about per-pixel lod could reduce math
* more (at some shuffle cost), but for now only do sqrt after packing.
*/
rho = lp_build_sqrt(perquadf_bld, rho);
rho = lp_build_sqrt(levelf_bld, rho);
}
else {
rho_vec = ddmax[0];
@@ -309,7 +311,7 @@ lp_build_rho(struct lp_build_sample_context *bld,
* since we can't handle per-pixel rho/lod from now on (TODO).
*/
rho = lp_build_pack_aos_scalars(bld->gallivm, coord_bld->type,
perquadf_bld->type, rho_vec, 0);
levelf_bld->type, rho_vec, 0);
}
}
else {
@@ -381,8 +383,8 @@ lp_build_rho(struct lp_build_sample_context *bld,
rho_vec = lp_build_max(coord_bld, rho_xvec, rho_yvec);
rho = lp_build_pack_aos_scalars(bld->gallivm, coord_bld->type,
perquadf_bld->type, rho_vec, 0);
rho = lp_build_sqrt(perquadf_bld, rho);
levelf_bld->type, rho_vec, 0);
rho = lp_build_sqrt(levelf_bld, rho);
}
else {
ddx_ddy[0] = lp_build_abs(coord_bld, ddx_ddy[0]);
@@ -462,7 +464,7 @@ lp_build_rho(struct lp_build_sample_context *bld,
}
}
rho = lp_build_pack_aos_scalars(bld->gallivm, coord_bld->type,
perquadf_bld->type, rho, 0);
levelf_bld->type, rho, 0);
}
else {
if (dims <= 1) {
@@ -652,11 +654,11 @@ lp_build_lod_selector(struct lp_build_sample_context *bld,
{
LLVMBuilderRef builder = bld->gallivm->builder;
struct lp_build_context *perquadf_bld = &bld->perquadf_bld;
struct lp_build_context *levelf_bld = &bld->levelf_bld;
LLVMValueRef lod;
*out_lod_ipart = bld->perquadi_bld.zero;
*out_lod_fpart = perquadf_bld->zero;
*out_lod_ipart = bld->leveli_bld.zero;
*out_lod_fpart = levelf_bld->zero;
if (bld->static_sampler_state->min_max_lod_equal) {
/* User is forcing sampling from a particular mipmap level.
@@ -666,12 +668,15 @@ lp_build_lod_selector(struct lp_build_sample_context *bld,
bld->dynamic_state->min_lod(bld->dynamic_state,
bld->gallivm, sampler_unit);
lod = lp_build_broadcast_scalar(perquadf_bld, min_lod);
lod = lp_build_broadcast_scalar(levelf_bld, min_lod);
}
else {
if (explicit_lod) {
lod = lp_build_pack_aos_scalars(bld->gallivm, bld->coord_bld.type,
perquadf_bld->type, explicit_lod, 0);
if (bld->num_lods != bld->coord_type.length)
lod = lp_build_pack_aos_scalars(bld->gallivm, bld->coord_bld.type,
levelf_bld->type, explicit_lod, 0);
else
lod = explicit_lod;
}
else {
LLVMValueRef rho;
@@ -694,29 +699,29 @@ lp_build_lod_selector(struct lp_build_sample_context *bld,
if (mip_filter == PIPE_TEX_MIPFILTER_NONE ||
mip_filter == PIPE_TEX_MIPFILTER_NEAREST) {
*out_lod_ipart = lp_build_ilog2(perquadf_bld, rho);
*out_lod_fpart = perquadf_bld->zero;
*out_lod_ipart = lp_build_ilog2(levelf_bld, rho);
*out_lod_fpart = levelf_bld->zero;
return;
}
if (mip_filter == PIPE_TEX_MIPFILTER_LINEAR &&
!(gallivm_debug & GALLIVM_DEBUG_NO_BRILINEAR)) {
lp_build_brilinear_rho(perquadf_bld, rho, BRILINEAR_FACTOR,
lp_build_brilinear_rho(levelf_bld, rho, BRILINEAR_FACTOR,
out_lod_ipart, out_lod_fpart);
return;
}
}
if (0) {
lod = lp_build_log2(perquadf_bld, rho);
lod = lp_build_log2(levelf_bld, rho);
}
else {
lod = lp_build_fast_log2(perquadf_bld, rho);
lod = lp_build_fast_log2(levelf_bld, rho);
}
/* add shader lod bias */
if (lod_bias) {
lod_bias = lp_build_pack_aos_scalars(bld->gallivm, bld->coord_bld.type,
perquadf_bld->type, lod_bias, 0);
levelf_bld->type, lod_bias, 0);
lod = LLVMBuildFAdd(builder, lod, lod_bias, "shader_lod_bias");
}
}
@@ -726,7 +731,7 @@ lp_build_lod_selector(struct lp_build_sample_context *bld,
LLVMValueRef sampler_lod_bias =
bld->dynamic_state->lod_bias(bld->dynamic_state,
bld->gallivm, sampler_unit);
sampler_lod_bias = lp_build_broadcast_scalar(perquadf_bld,
sampler_lod_bias = lp_build_broadcast_scalar(levelf_bld,
sampler_lod_bias);
lod = LLVMBuildFAdd(builder, lod, sampler_lod_bias, "sampler_lod_bias");
}
@@ -736,33 +741,33 @@ lp_build_lod_selector(struct lp_build_sample_context *bld,
LLVMValueRef max_lod =
bld->dynamic_state->max_lod(bld->dynamic_state,
bld->gallivm, sampler_unit);
max_lod = lp_build_broadcast_scalar(perquadf_bld, max_lod);
max_lod = lp_build_broadcast_scalar(levelf_bld, max_lod);
lod = lp_build_min(perquadf_bld, lod, max_lod);
lod = lp_build_min(levelf_bld, lod, max_lod);
}
if (bld->static_sampler_state->apply_min_lod) {
LLVMValueRef min_lod =
bld->dynamic_state->min_lod(bld->dynamic_state,
bld->gallivm, sampler_unit);
min_lod = lp_build_broadcast_scalar(perquadf_bld, min_lod);
min_lod = lp_build_broadcast_scalar(levelf_bld, min_lod);
lod = lp_build_max(perquadf_bld, lod, min_lod);
lod = lp_build_max(levelf_bld, lod, min_lod);
}
}
if (mip_filter == PIPE_TEX_MIPFILTER_LINEAR) {
if (!(gallivm_debug & GALLIVM_DEBUG_NO_BRILINEAR)) {
lp_build_brilinear_lod(perquadf_bld, lod, BRILINEAR_FACTOR,
lp_build_brilinear_lod(levelf_bld, lod, BRILINEAR_FACTOR,
out_lod_ipart, out_lod_fpart);
}
else {
lp_build_ifloor_fract(perquadf_bld, lod, out_lod_ipart, out_lod_fpart);
lp_build_ifloor_fract(levelf_bld, lod, out_lod_ipart, out_lod_fpart);
}
lp_build_name(*out_lod_fpart, "lod_fpart");
}
else {
*out_lod_ipart = lp_build_iround(perquadf_bld, lod);
*out_lod_ipart = lp_build_iround(levelf_bld, lod);
}
lp_build_name(*out_lod_ipart, "lod_ipart");
@@ -784,20 +789,20 @@ lp_build_nearest_mip_level(struct lp_build_sample_context *bld,
LLVMValueRef lod_ipart,
LLVMValueRef *level_out)
{
struct lp_build_context *perquadi_bld = &bld->perquadi_bld;
struct lp_build_context *leveli_bld = &bld->leveli_bld;
LLVMValueRef first_level, last_level, level;
first_level = bld->dynamic_state->first_level(bld->dynamic_state,
bld->gallivm, texture_unit);
last_level = bld->dynamic_state->last_level(bld->dynamic_state,
bld->gallivm, texture_unit);
first_level = lp_build_broadcast_scalar(perquadi_bld, first_level);
last_level = lp_build_broadcast_scalar(perquadi_bld, last_level);
first_level = lp_build_broadcast_scalar(leveli_bld, first_level);
last_level = lp_build_broadcast_scalar(leveli_bld, last_level);
level = lp_build_add(perquadi_bld, lod_ipart, first_level);
level = lp_build_add(leveli_bld, lod_ipart, first_level);
/* clamp level to legal range of levels */
*level_out = lp_build_clamp(perquadi_bld, level, first_level, last_level);
*level_out = lp_build_clamp(leveli_bld, level, first_level, last_level);
}
@@ -815,8 +820,8 @@ lp_build_linear_mip_levels(struct lp_build_sample_context *bld,
LLVMValueRef *level1_out)
{
LLVMBuilderRef builder = bld->gallivm->builder;
struct lp_build_context *perquadi_bld = &bld->perquadi_bld;
struct lp_build_context *perquadf_bld = &bld->perquadf_bld;
struct lp_build_context *leveli_bld = &bld->leveli_bld;
struct lp_build_context *levelf_bld = &bld->levelf_bld;
LLVMValueRef first_level, last_level;
LLVMValueRef clamp_min;
LLVMValueRef clamp_max;
@@ -825,11 +830,11 @@ lp_build_linear_mip_levels(struct lp_build_sample_context *bld,
bld->gallivm, texture_unit);
last_level = bld->dynamic_state->last_level(bld->dynamic_state,
bld->gallivm, texture_unit);
first_level = lp_build_broadcast_scalar(perquadi_bld, first_level);
last_level = lp_build_broadcast_scalar(perquadi_bld, last_level);
first_level = lp_build_broadcast_scalar(leveli_bld, first_level);
last_level = lp_build_broadcast_scalar(leveli_bld, last_level);
*level0_out = lp_build_add(perquadi_bld, lod_ipart, first_level);
*level1_out = lp_build_add(perquadi_bld, *level0_out, perquadi_bld->one);
*level0_out = lp_build_add(leveli_bld, lod_ipart, first_level);
*level1_out = lp_build_add(leveli_bld, *level0_out, leveli_bld->one);
/*
* Clamp both *level0_out and *level1_out to [first_level, last_level], with
@@ -843,7 +848,7 @@ lp_build_linear_mip_levels(struct lp_build_sample_context *bld,
* converting to our lp_bld_logic helpers.
*/
#if HAVE_LLVM < 0x0301
assert(perquadi_bld->type.length == 1);
assert(leveli_bld->type.length == 1);
#endif
/* *level0_out < first_level */
@@ -858,7 +863,7 @@ lp_build_linear_mip_levels(struct lp_build_sample_context *bld,
first_level, *level1_out, "");
*lod_fpart_inout = LLVMBuildSelect(builder, clamp_min,
perquadf_bld->zero, *lod_fpart_inout, "");
levelf_bld->zero, *lod_fpart_inout, "");
/* *level0_out >= last_level */
clamp_max = LLVMBuildICmp(builder, LLVMIntSGE,
@@ -872,7 +877,7 @@ lp_build_linear_mip_levels(struct lp_build_sample_context *bld,
last_level, *level1_out, "");
*lod_fpart_inout = LLVMBuildSelect(builder, clamp_max,
perquadf_bld->zero, *lod_fpart_inout, "");
levelf_bld->zero, *lod_fpart_inout, "");
lp_build_name(*level0_out, "texture%u_miplevel0", texture_unit);
lp_build_name(*level1_out, "texture%u_miplevel1", texture_unit);
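The level selection that lp_build_linear_mip_levels() emits as LLVM IR in the hunks above can be sketched on plain scalars. This is only an illustration: the function name linear_mip_levels and its scalar signature are hypothetical, and the bodies of the two clamp steps are inferred from the visible LLVMBuildSelect fragments and comments rather than copied from the full source.

```c
#include <assert.h>

/*
 * Scalar sketch (hypothetical helper, not part of gallivm).
 * level0 = lod_ipart + first_level, level1 = level0 + 1, then both are
 * clamped to [first_level, last_level]. Whenever a clamp applies, the
 * fractional lod weight is forced to zero so no blending takes place.
 */
static void
linear_mip_levels(int lod_ipart, int first_level, int last_level,
                  float *lod_fpart_inout, int *level0_out, int *level1_out)
{
   assert(first_level <= last_level);

   *level0_out = lod_ipart + first_level;
   *level1_out = *level0_out + 1;

   /* *level0_out < first_level */
   if (*level0_out < first_level) {
      *level0_out = first_level;
      *level1_out = first_level;
      *lod_fpart_inout = 0.0f;
   }

   /* *level0_out >= last_level */
   if (*level0_out >= last_level) {
      *level0_out = last_level;
      *level1_out = last_level;
      *lod_fpart_inout = 0.0f;
   }
}
```

In the real code every value is a vector with one element per lod (the leveli_bld / levelf_bld types), so the two branches become vector compares followed by selects.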
@@ -1087,7 +1092,7 @@ lp_build_mipmap_level_sizes(struct lp_build_sample_context *bld,
LLVMValueRef indexi = lp_build_const_int32(bld->gallivm, i);
ileveli = lp_build_extract_broadcast(bld->gallivm,
bld->perquadi_bld.type,
bld->leveli_bld.type,
bld4.type,
ilevel,
indexi);
@@ -1116,7 +1121,7 @@ lp_build_mipmap_level_sizes(struct lp_build_sample_context *bld,
*/
assert(bld->num_lods == bld->coord_bld.type.length);
if (bld->dims == 1) {
assert(bld->int_size_bld.type.length == 1);
assert(bld->int_size_in_bld.type.length == 1);
int_size_vec = lp_build_broadcast_scalar(&bld->int_coord_bld,
bld->int_size);
/* vector shift with variable shift count alert... */
@@ -1131,10 +1136,9 @@ lp_build_mipmap_level_sizes(struct lp_build_sample_context *bld,
tmp[i] = bld->int_size;
tmp[i] = lp_build_minify(&bld->int_size_in_bld, tmp[i], ilevel1);
}
int_size_vec = lp_build_concat(bld->gallivm,
tmp,
bld->int_size_in_bld.type,
bld->num_lods);
*out_size = lp_build_concat(bld->gallivm, tmp,
bld->int_size_in_bld.type,
bld->num_lods);
}
}
}
@@ -1218,10 +1222,10 @@ lp_build_extract_image_sizes(struct lp_build_sample_context *bld,
*out_width = lp_build_pack_aos_scalars(bld->gallivm, size_type,
coord_type, size, 0);
if (dims >= 2) {
*out_width = lp_build_pack_aos_scalars(bld->gallivm, size_type,
coord_type, size, 1);
*out_height = lp_build_pack_aos_scalars(bld->gallivm, size_type,
coord_type, size, 1);
if (dims == 3) {
*out_width = lp_build_pack_aos_scalars(bld->gallivm, size_type,
*out_depth = lp_build_pack_aos_scalars(bld->gallivm, size_type,
coord_type, size, 2);
}
}


@@ -268,13 +268,13 @@ struct lp_build_sample_context
struct lp_type texel_type;
struct lp_build_context texel_bld;
/** Float per-quad type */
struct lp_type perquadf_type;
struct lp_build_context perquadf_bld;
/** Float level type */
struct lp_type levelf_type;
struct lp_build_context levelf_bld;
/** Int per-quad type */
struct lp_type perquadi_type;
struct lp_build_context perquadi_bld;
/** Int level type */
struct lp_type leveli_type;
struct lp_build_context leveli_bld;
/* Common dynamic state values */
LLVMValueRef row_stride_array;
@@ -477,6 +477,7 @@ lp_build_sample_soa(struct gallivm_state *gallivm,
const struct lp_derivatives *derivs,
LLVMValueRef lod_bias,
LLVMValueRef explicit_lod,
boolean scalar_lod,
LLVMValueRef texel_out[4]);


@@ -531,7 +531,7 @@ lp_build_sample_fetch_image_nearest(struct lp_build_sample_context *bld,
bld->texel_type.length,
bld->format_desc->block.bits,
bld->texel_type.width,
data_ptr, offset);
data_ptr, offset, TRUE);
rgba8 = LLVMBuildBitCast(builder, rgba8, u8n_vec_type, "");
}
@@ -893,7 +893,7 @@ lp_build_sample_fetch_image_linear(struct lp_build_sample_context *bld,
bld->texel_type.length,
bld->format_desc->block.bits,
bld->texel_type.width,
data_ptr, offset[k][j][i]);
data_ptr, offset[k][j][i], TRUE);
rgba8 = LLVMBuildBitCast(builder, rgba8, u8n_vec_type, "");
}
@@ -1422,8 +1422,8 @@ lp_build_sample_mipmap(struct lp_build_sample_context *bld,
if (mip_filter == PIPE_TEX_MIPFILTER_LINEAR) {
LLVMValueRef h16vec_scale = lp_build_const_vec(bld->gallivm,
bld->perquadf_bld.type, 256.0);
LLVMTypeRef i32vec_type = lp_build_vec_type(bld->gallivm, bld->perquadi_bld.type);
bld->levelf_bld.type, 256.0);
LLVMTypeRef i32vec_type = bld->leveli_bld.vec_type;
struct lp_build_if_state if_ctx;
LLVMValueRef need_lerp;
unsigned num_quads = bld->coord_bld.type.length / 4;
@@ -1433,9 +1433,9 @@ lp_build_sample_mipmap(struct lp_build_sample_context *bld,
lod_fpart = LLVMBuildFPToSI(builder, lod_fpart, i32vec_type, "lod_fpart.fixed16");
/* need_lerp = lod_fpart > 0 */
if (num_quads == 1) {
if (bld->num_lods == 1) {
need_lerp = LLVMBuildICmp(builder, LLVMIntSGT,
lod_fpart, bld->perquadi_bld.zero,
lod_fpart, bld->leveli_bld.zero,
"need_lerp");
}
else {
@@ -1450,9 +1450,9 @@ lp_build_sample_mipmap(struct lp_build_sample_context *bld,
* lod_fpart values have same sign.
* We can however then skip the greater than comparison.
*/
lod_fpart = lp_build_max(&bld->perquadi_bld, lod_fpart,
bld->perquadi_bld.zero);
need_lerp = lp_build_any_true_range(&bld->perquadi_bld, num_quads, lod_fpart);
lod_fpart = lp_build_max(&bld->leveli_bld, lod_fpart,
bld->leveli_bld.zero);
need_lerp = lp_build_any_true_range(&bld->leveli_bld, bld->num_lods, lod_fpart);
}
lp_build_if(&if_ctx, bld->gallivm, need_lerp);
@@ -1462,9 +1462,6 @@ lp_build_sample_mipmap(struct lp_build_sample_context *bld,
lp_build_context_init(&u8n_bld, bld->gallivm, lp_type_unorm(8, bld->vector_width));
/* sample the second mipmap level */
lp_build_mipmap_level_sizes(bld, ilevel1,
&size1,
&row_stride1_vec, &img_stride1_vec);
lp_build_mipmap_level_sizes(bld, ilevel1,
&size1,
&row_stride1_vec, &img_stride1_vec);
@@ -1511,7 +1508,7 @@ lp_build_sample_mipmap(struct lp_build_sample_context *bld,
/* interpolate samples from the two mipmap levels */
if (num_quads == 1) {
if (num_quads == 1 && bld->num_lods == 1) {
lod_fpart = LLVMBuildTrunc(builder, lod_fpart, u8n_bld.elem_type, "");
lod_fpart = lp_build_broadcast_scalar(&u8n_bld, lod_fpart);
@@ -1526,17 +1523,16 @@ lp_build_sample_mipmap(struct lp_build_sample_context *bld,
#endif
}
else {
const unsigned num_chans_per_quad = 4 * 4;
LLVMTypeRef tmp_vec_type = LLVMVectorType(u8n_bld.elem_type, bld->perquadi_bld.type.length);
unsigned num_chans_per_lod = 4 * bld->coord_type.length / bld->num_lods;
LLVMTypeRef tmp_vec_type = LLVMVectorType(u8n_bld.elem_type, bld->leveli_bld.type.length);
LLVMValueRef shuffle[LP_MAX_VECTOR_LENGTH];
/* Take the LSB of lod_fpart */
lod_fpart = LLVMBuildTrunc(builder, lod_fpart, tmp_vec_type, "");
/* Broadcast each lod weight into their respective channels */
assert(u8n_bld.type.length == num_quads * num_chans_per_quad);
for (i = 0; i < u8n_bld.type.length; ++i) {
shuffle[i] = lp_build_const_int32(bld->gallivm, i / num_chans_per_quad);
shuffle[i] = lp_build_const_int32(bld->gallivm, i / num_chans_per_lod);
}
lod_fpart = LLVMBuildShuffleVector(builder, lod_fpart, LLVMGetUndef(tmp_vec_type),
LLVMConstVector(shuffle, u8n_bld.type.length), "");
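The shuffle built in the last hunk above broadcasts each lod weight to all u8 channels that share that lod. A minimal sketch of the index computation follows, assuming (as the removed assert suggests) that the u8 vector holds 4 channels per coordinate element. The helper name build_lod_weight_shuffle and its capacity parameter are hypothetical.

```c
#include <assert.h>

/*
 * Fill shuffle[] so that entry i names the lod weight used by u8 channel i.
 * Hypothetical helper: coord_length is the number of pixels in the vector,
 * num_lods is 1, the number of quads, or coord_length (per-pixel lod).
 * Returns the number of shuffle entries written.
 */
static unsigned
build_lod_weight_shuffle(unsigned coord_length, unsigned num_lods,
                         unsigned *shuffle, unsigned capacity)
{
   const unsigned total = 4 * coord_length;   /* 4 channels per pixel */
   unsigned num_chans_per_lod;
   unsigned i;

   assert(num_lods > 0 && coord_length % num_lods == 0);
   assert(total <= capacity);

   num_chans_per_lod = total / num_lods;
   for (i = 0; i < total; ++i) {
      shuffle[i] = i / num_chans_per_lod;
   }
   return total;
}
```

With 8 pixels and per-quad lod (num_lods == 2) each weight covers 16 channels, which matches the old constant num_chans_per_quad. With per-pixel lod each weight covers only 4 channels.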


@@ -979,17 +979,17 @@ lp_build_sample_mipmap(struct lp_build_sample_context *bld,
if (mip_filter == PIPE_TEX_MIPFILTER_LINEAR) {
struct lp_build_if_state if_ctx;
LLVMValueRef need_lerp;
unsigned num_quads = bld->coord_bld.type.length / 4;
/* need_lerp = lod_fpart > 0 */
if (num_quads == 1) {
if (bld->num_lods == 1) {
need_lerp = LLVMBuildFCmp(builder, LLVMRealUGT,
lod_fpart, bld->perquadf_bld.zero,
lod_fpart, bld->levelf_bld.zero,
"need_lerp");
}
else {
/*
* We'll do mip filtering if any of the quads need it.
* We'll do mip filtering if any of the quads (or individual
* pixel in case of per-pixel lod) need it.
* It might be better to split the vectors here and only fetch/filter
* quads which need it.
*/
@@ -998,13 +998,13 @@ lp_build_sample_mipmap(struct lp_build_sample_context *bld,
* negative values which would screw up filtering if not all
* lod_fpart values have same sign.
*/
lod_fpart = lp_build_max(&bld->perquadf_bld, lod_fpart,
bld->perquadf_bld.zero);
need_lerp = lp_build_compare(bld->gallivm, bld->perquadf_bld.type,
lod_fpart = lp_build_max(&bld->levelf_bld, lod_fpart,
bld->levelf_bld.zero);
need_lerp = lp_build_compare(bld->gallivm, bld->levelf_bld.type,
PIPE_FUNC_GREATER,
lod_fpart, bld->perquadf_bld.zero);
need_lerp = lp_build_any_true_range(&bld->perquadi_bld, num_quads, need_lerp);
}
lod_fpart, bld->levelf_bld.zero);
need_lerp = lp_build_any_true_range(&bld->leveli_bld, bld->num_lods, need_lerp);
}
lp_build_if(&if_ctx, bld->gallivm, need_lerp);
{
@@ -1036,10 +1036,11 @@ lp_build_sample_mipmap(struct lp_build_sample_context *bld,
/* interpolate samples from the two mipmap levels */
lod_fpart = lp_build_unpack_broadcast_aos_scalars(bld->gallivm,
bld->perquadf_bld.type,
bld->texel_bld.type,
lod_fpart);
if (bld->num_lods != bld->coord_type.length)
lod_fpart = lp_build_unpack_broadcast_aos_scalars(bld->gallivm,
bld->levelf_bld.type,
bld->texel_bld.type,
lod_fpart);
for (chan = 0; chan < 4; chan++) {
colors0[chan] = lp_build_lerp(&bld->texel_bld, lod_fpart,
@@ -1143,7 +1144,7 @@ lp_build_sample_common(struct lp_build_sample_context *bld,
mip_filter,
lod_ipart, lod_fpart);
} else {
*lod_ipart = bld->perquadi_bld.zero;
*lod_ipart = bld->leveli_bld.zero;
}
/*
@@ -1166,7 +1167,7 @@ lp_build_sample_common(struct lp_build_sample_context *bld,
else {
first_level = bld->dynamic_state->first_level(bld->dynamic_state,
bld->gallivm, texture_index);
first_level = lp_build_broadcast_scalar(&bld->perquadi_bld, first_level);
first_level = lp_build_broadcast_scalar(&bld->leveli_bld, first_level);
*ilevel0 = first_level;
}
break;
@@ -1295,7 +1296,7 @@ lp_build_fetch_texel(struct lp_build_sample_context *bld,
const LLVMValueRef *offsets,
LLVMValueRef *colors_out)
{
struct lp_build_context *perquadi_bld = &bld->perquadi_bld;
struct lp_build_context *perquadi_bld = &bld->leveli_bld;
struct lp_build_context *int_coord_bld = &bld->int_coord_bld;
unsigned dims = bld->dims, chan;
unsigned target = bld->static_texture_state->target;
@@ -1305,10 +1306,14 @@ lp_build_fetch_texel(struct lp_build_sample_context *bld,
LLVMValueRef width, height, depth, i, j;
LLVMValueRef offset, out_of_bounds, out1;
/* XXX just like ordinary sampling, we don't handle per-pixel lod (yet). */
if (explicit_lod && bld->static_texture_state->target != PIPE_BUFFER) {
ilevel = lp_build_pack_aos_scalars(bld->gallivm, int_coord_bld->type,
perquadi_bld->type, explicit_lod, 0);
if (bld->num_lods != int_coord_bld->type.length) {
ilevel = lp_build_pack_aos_scalars(bld->gallivm, int_coord_bld->type,
perquadi_bld->type, explicit_lod, 0);
}
else {
ilevel = explicit_lod;
}
lp_build_nearest_mip_level(bld, texture_unit, ilevel, &ilevel);
}
else {
@@ -1489,6 +1494,7 @@ lp_build_sample_soa(struct gallivm_state *gallivm,
const struct lp_derivatives *derivs, /* optional */
LLVMValueRef lod_bias, /* optional */
LLVMValueRef explicit_lod, /* optional */
boolean scalar_lod,
LLVMValueRef texel_out[4])
{
unsigned dims = texture_dims(static_texture_state->target);
@@ -1529,10 +1535,6 @@ lp_build_sample_soa(struct gallivm_state *gallivm,
bld.float_size_in_type.length = dims > 1 ? 4 : 1;
bld.int_size_in_type = lp_int_type(bld.float_size_in_type);
bld.texel_type = type;
bld.perquadf_type = type;
/* we want native vector size to be able to use our intrinsics */
bld.perquadf_type.length = type.length > 4 ? ((type.length + 15) / 16) * 4 : 1;
bld.perquadi_type = lp_int_type(bld.perquadf_type);
/* always using the first channel hopefully should be safe,
* if not things WILL break in other places anyway.
@@ -1563,21 +1565,51 @@ lp_build_sample_soa(struct gallivm_state *gallivm,
debug_printf(" .min_mip_filter = %u\n", derived_sampler_state.min_mip_filter);
}
/*
* This is all a bit complicated; different paths are chosen for performance
* reasons.
* Essentially, there can be 1 lod per element, 1 lod per quad or 1 lod for
* everything (the last two options are equivalent for 4-wide case).
* If there's per-quad lod but we split to 4-wide so we can use AoS, per-quad
* lod is calculated then the lod value extracted afterwards so making this
* case basically the same as far as lod handling is concerned for the
* further sample/filter code as the 1 lod for everything case.
* Different lod handling mostly shows up when building mipmap sizes
* (lp_build_mipmap_level_sizes() and friends) and also in filtering
* (getting the fractional part of the lod to the right texels).
*/
/*
* There are other situations where at least the multiple int lods could be
* avoided like min and max lod being equal.
*/
if ((is_fetch && explicit_lod && bld.static_texture_state->target != PIPE_BUFFER) ||
(!is_fetch && mip_filter != PIPE_TEX_MIPFILTER_NONE)) {
if (explicit_lod && !scalar_lod &&
((is_fetch && bld.static_texture_state->target != PIPE_BUFFER) ||
(!is_fetch && mip_filter != PIPE_TEX_MIPFILTER_NONE)))
bld.num_lods = type.length;
/* TODO: for true scalar_lod should only use 1 lod value */
else if ((is_fetch && explicit_lod && bld.static_texture_state->target != PIPE_BUFFER ) ||
(!is_fetch && mip_filter != PIPE_TEX_MIPFILTER_NONE)) {
bld.num_lods = num_quads;
}
else {
bld.num_lods = 1;
}
bld.levelf_type = type;
/* we want native vector size to be able to use our intrinsics */
if (bld.num_lods != type.length) {
bld.levelf_type.length = type.length > 4 ? ((type.length + 15) / 16) * 4 : 1;
}
bld.leveli_type = lp_int_type(bld.levelf_type);
bld.float_size_type = bld.float_size_in_type;
bld.float_size_type.length = bld.num_lods > 1 ? type.length :
bld.float_size_in_type.length;
/* Note: size vectors may not be native. They contain minified w/h/d/_ values,
* with per-element lod that is w0/h0/d0/_/w1/h1/d1_/... so up to 8x4f32 */
if (bld.num_lods > 1) {
bld.float_size_type.length = bld.num_lods == type.length ?
bld.num_lods * bld.float_size_in_type.length :
type.length;
}
bld.int_size_type = lp_int_type(bld.float_size_type);
lp_build_context_init(&bld.float_bld, gallivm, bld.float_type);
@@ -1590,8 +1622,8 @@ lp_build_sample_soa(struct gallivm_state *gallivm,
lp_build_context_init(&bld.int_size_bld, gallivm, bld.int_size_type);
lp_build_context_init(&bld.float_size_bld, gallivm, bld.float_size_type);
lp_build_context_init(&bld.texel_bld, gallivm, bld.texel_type);
lp_build_context_init(&bld.perquadf_bld, gallivm, bld.perquadf_type);
lp_build_context_init(&bld.perquadi_bld, gallivm, bld.perquadi_type);
lp_build_context_init(&bld.levelf_bld, gallivm, bld.levelf_type);
lp_build_context_init(&bld.leveli_bld, gallivm, bld.leveli_type);
/* Get the dynamic state */
tex_width = dynamic_state->width(dynamic_state, gallivm, texture_index);
@@ -1735,14 +1767,31 @@ lp_build_sample_soa(struct gallivm_state *gallivm,
bld4.int_size_in_type = lp_int_type(bld4.float_size_in_type);
bld4.texel_type = bld.texel_type;
bld4.texel_type.length = 4;
bld4.perquadf_type = type4;
bld4.levelf_type = type4;
/* we want native vector size to be able to use our intrinsics */
bld4.perquadf_type.length = 1;
bld4.perquadi_type = lp_int_type(bld4.perquadf_type);
bld4.levelf_type.length = 1;
bld4.leveli_type = lp_int_type(bld4.levelf_type);
bld4.num_lods = 1;
bld4.int_size_type = bld4.int_size_in_type;
if (explicit_lod && !scalar_lod &&
((is_fetch && bld.static_texture_state->target != PIPE_BUFFER) ||
(!is_fetch && mip_filter != PIPE_TEX_MIPFILTER_NONE)))
bld4.num_lods = type4.length;
else
bld4.num_lods = 1;
bld4.levelf_type = type4;
/* we want native vector size to be able to use our intrinsics */
if (bld4.num_lods != type4.length) {
bld4.levelf_type.length = 1;
}
bld4.leveli_type = lp_int_type(bld4.levelf_type);
bld4.float_size_type = bld4.float_size_in_type;
if (bld4.num_lods > 1) {
bld4.float_size_type.length = bld4.num_lods == type4.length ?
bld4.num_lods * bld4.float_size_in_type.length :
type4.length;
}
bld4.int_size_type = lp_int_type(bld4.float_size_type);
lp_build_context_init(&bld4.float_bld, gallivm, bld4.float_type);
lp_build_context_init(&bld4.float_vec_bld, gallivm, type4);
@@ -1754,15 +1803,15 @@ lp_build_sample_soa(struct gallivm_state *gallivm,
lp_build_context_init(&bld4.int_size_bld, gallivm, bld4.int_size_type);
lp_build_context_init(&bld4.float_size_bld, gallivm, bld4.float_size_type);
lp_build_context_init(&bld4.texel_bld, gallivm, bld4.texel_type);
lp_build_context_init(&bld4.perquadf_bld, gallivm, bld4.perquadf_type);
lp_build_context_init(&bld4.perquadi_bld, gallivm, bld4.perquadi_type);
lp_build_context_init(&bld4.levelf_bld, gallivm, bld4.levelf_type);
lp_build_context_init(&bld4.leveli_bld, gallivm, bld4.leveli_type);
for (i = 0; i < num_quads; i++) {
LLVMValueRef s4, t4, r4;
LLVMValueRef lod_iparts, lod_fparts = NULL;
LLVMValueRef ilevel0s, ilevel1s = NULL;
LLVMValueRef indexi = lp_build_const_int32(gallivm, i);
LLVMValueRef lod_ipart4, lod_fpart4 = NULL;
LLVMValueRef ilevel04, ilevel14 = NULL;
LLVMValueRef offsets4[4] = { NULL };
unsigned num_lods = bld4.num_lods;
s4 = lp_build_extract_range(gallivm, s, 4*i, 4);
t4 = lp_build_extract_range(gallivm, t, 4*i, 4);
@@ -1777,27 +1826,27 @@ lp_build_sample_soa(struct gallivm_state *gallivm,
}
}
}
lod_iparts = LLVMBuildExtractElement(builder, lod_ipart, indexi, "");
ilevel0s = LLVMBuildExtractElement(builder, ilevel0, indexi, "");
lod_ipart4 = lp_build_extract_range(gallivm, lod_ipart, num_lods * i, num_lods);
ilevel04 = lp_build_extract_range(gallivm, ilevel0, num_lods * i, num_lods);
if (mip_filter == PIPE_TEX_MIPFILTER_LINEAR) {
ilevel1s = LLVMBuildExtractElement(builder, ilevel1, indexi, "");
lod_fparts = LLVMBuildExtractElement(builder, lod_fpart, indexi, "");
ilevel14 = lp_build_extract_range(gallivm, ilevel1, num_lods * i, num_lods);
lod_fpart4 = lp_build_extract_range(gallivm, lod_fpart, num_lods * i, num_lods);
}
if (use_aos) {
/* do sampling/filtering with fixed pt arithmetic */
lp_build_sample_aos(&bld4, sampler_index,
s4, t4, r4, offsets4,
lod_iparts, lod_fparts,
ilevel0s, ilevel1s,
lod_ipart4, lod_fpart4,
ilevel04, ilevel14,
texelout4);
}
else {
lp_build_sample_general(&bld4, sampler_index,
s4, t4, r4, offsets4,
lod_iparts, lod_fparts,
ilevel0s, ilevel1s,
lod_ipart4, lod_fpart4,
ilevel04, ilevel14,
texelout4);
}
for (j = 0; j < 4; j++) {
@@ -1864,6 +1913,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm,
lp_build_context_init(&bld_int_vec, gallivm, lp_type_int_vec(32, 128));
if (explicit_lod) {
/* FIXME: this needs to honor per-element lod */
lod = LLVMBuildExtractElement(gallivm->builder, explicit_lod, lp_build_const_int32(gallivm, 0), "");
first_level = dynamic_state->first_level(dynamic_state, gallivm, texture_unit);
lod = lp_build_broadcast_scalar(&bld_int_vec,
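The choice between one lod per element, one lod per quad and one lod for everything, as made in lp_build_sample_soa() above, can be restated as two small scalar functions. This is a sketch under assumptions: the function names and boolean parameters are hypothetical stand-ins for the sampler state tests in the diff, and num_quads is assumed to be type.length / 4.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of lod values carried through sampling (hypothetical helper). */
static unsigned
choose_num_lods(unsigned type_length, bool is_fetch, bool has_explicit_lod,
                bool scalar_lod, bool is_buffer, bool mip_filter_none)
{
   /* Assumption: vectors hold whole quads of 4 pixels. */
   assert(type_length >= 4 && type_length % 4 == 0);

   const unsigned num_quads = type_length / 4;
   const bool needs_lod = (is_fetch && !is_buffer) ||
                          (!is_fetch && !mip_filter_none);

   if (has_explicit_lod && !scalar_lod && needs_lod)
      return type_length;               /* one lod per element */
   if ((is_fetch && has_explicit_lod && !is_buffer) ||
       (!is_fetch && !mip_filter_none))
      return num_quads;                 /* one lod per quad */
   return 1;                            /* one lod for everything */
}

/* Length of the levelf/leveli vector type for a given lod count. */
static unsigned
level_vector_length(unsigned type_length, unsigned num_lods)
{
   if (num_lods == type_length)
      return type_length;
   /* Otherwise round up to a native 4-wide multiple, or use a scalar. */
   return type_length > 4 ? ((type_length + 15) / 16) * 4 : 1;
}
```

For an 8-wide vector this gives 8 lods in an 8-wide level type for a non-scalar explicit lod, and 2 per-quad lods stored in a 4-wide level type otherwise.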


@@ -217,6 +217,20 @@ lp_build_swizzle_scalar_aos(struct lp_build_context *bld,
a = LLVMBuildBitCast(builder, a, lp_build_vec_type(bld->gallivm, type2), "");
/*
* Vector element 0 is always channel X.
*
* 76 54 32 10 (array numbering)
* Little endian reg in: YX YX YX YX
* Little endian reg out: YY YY YY YY if shift right (shift == -1)
* XX XX XX XX if shift left (shift == 1)
*
* 01 23 45 67 (array numbering)
* Big endian reg in: XY XY XY XY
* Big endian reg out: YY YY YY YY if shift left (shift == 1)
* XX XX XX XX if shift right (shift == -1)
*
*/
#ifdef PIPE_ARCH_LITTLE_ENDIAN
shift = channel == 0 ? 1 : -1;
#else
@@ -240,10 +254,23 @@ lp_build_swizzle_scalar_aos(struct lp_build_context *bld,
/*
* Bit mask and recursive shifts
*
* Little-endian registers:
*
* 7654 3210
* WZYX WZYX .... WZYX <= input
* 00Y0 00Y0 .... 00Y0 <= mask
* 00YY 00YY .... 00YY <= shift right 1 (shift amount -1)
* YYYY YYYY .... YYYY <= shift left 2 (shift amount 2)
*
* Big-endian registers:
*
* 0123 4567
* XYZW XYZW .... XYZW <= input
* 0Y00 0Y00 .... 0Y00
* YY00 YY00 .... YY00
* YYYY YYYY .... YYYY <= output
* 0Y00 0Y00 .... 0Y00 <= mask
* YY00 YY00 .... YY00 <= shift left 1 (shift amount 1)
* YYYY YYYY .... YYYY <= shift right 2 (shift amount -2)
*
* shifts[] gives little-endian shift amounts; we need to negate for big-endian.
*/
struct lp_type type4;
const int shifts[4][2] = {
@@ -274,14 +301,15 @@ lp_build_swizzle_scalar_aos(struct lp_build_context *bld,
LLVMValueRef tmp = NULL;
int shift = shifts[channel][i];
#ifdef PIPE_ARCH_LITTLE_ENDIAN
/* See endianness diagram above */
#ifdef PIPE_ARCH_BIG_ENDIAN
shift = -shift;
#endif
if(shift > 0)
tmp = LLVMBuildLShr(builder, a, lp_build_const_int_vec(bld->gallivm, type4, shift*type.width), "");
tmp = LLVMBuildShl(builder, a, lp_build_const_int_vec(bld->gallivm, type4, shift*type.width), "");
if(shift < 0)
tmp = LLVMBuildShl(builder, a, lp_build_const_int_vec(bld->gallivm, type4, -shift*type.width), "");
tmp = LLVMBuildLShr(builder, a, lp_build_const_int_vec(bld->gallivm, type4, -shift*type.width), "");
assert(tmp);
if(tmp)
@@ -474,21 +502,39 @@ lp_build_swizzle_aos(struct lp_build_context *bld,
/*
* Mask and shift the channels, trying to group as many channels in the
* same shift as possible
* same shift as possible. The shift amount is positive for shifts left
* and negative for shifts right.
*/
for (shift = -3; shift <= 3; ++shift) {
uint64_t mask = 0;
assert(type4.width <= sizeof(mask)*8);
/*
* Vector element numbers follow the XYZW order, so 0 is always X, etc.
* After widening 4 times we have:
*
* 3210
* Little-endian register layout: WZYX
*
* 0123
* Big-endian register layout: XYZW
*
* For little-endian, higher-numbered channels are obtained by a shift right
* (negative shift amount) and lower-numbered channels by a shift left
* (positive shift amount). The opposite is true for big-endian.
*/
for (chan = 0; chan < 4; ++chan) {
/* FIXME: big endian */
if (swizzles[chan] < 4 &&
chan - swizzles[chan] == shift) {
if (swizzles[chan] < 4) {
/* We need to move channel swizzles[chan] into channel chan */
#ifdef PIPE_ARCH_LITTLE_ENDIAN
mask |= ((1ULL << type.width) - 1) << (swizzles[chan] * type.width);
if (swizzles[chan] - chan == -shift) {
mask |= ((1ULL << type.width) - 1) << (swizzles[chan] * type.width);
}
#else
mask |= ((1ULL << type.width) - 1) << (type4.width - type.width) >> (swizzles[chan] * type.width);
if (swizzles[chan] - chan == shift) {
mask |= ((1ULL << type.width) - 1) << (type4.width - type.width) >> (swizzles[chan] * type.width);
}
#endif
}
}
@@ -502,21 +548,11 @@ lp_build_swizzle_aos(struct lp_build_context *bld,
masked = LLVMBuildAnd(builder, a,
lp_build_const_int_vec(bld->gallivm, type4, mask), "");
if (shift > 0) {
#ifdef PIPE_ARCH_LITTLE_ENDIAN
shifted = LLVMBuildShl(builder, masked,
lp_build_const_int_vec(bld->gallivm, type4, shift*type.width), "");
#else
shifted = LLVMBuildLShr(builder, masked,
lp_build_const_int_vec(bld->gallivm, type4, shift*type.width), "");
#endif
} else if (shift < 0) {
#ifdef PIPE_ARCH_LITTLE_ENDIAN
shifted = LLVMBuildLShr(builder, masked,
lp_build_const_int_vec(bld->gallivm, type4, -shift*type.width), "");
#else
shifted = LLVMBuildShl(builder, masked,
lp_build_const_int_vec(bld->gallivm, type4, -shift*type.width), "");
#endif
} else {
shifted = masked;
}
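The mask-and-shift loop of lp_build_swizzle_aos() above can be tried out on an ordinary 32-bit word. The sketch below covers only the little-endian register layout (channel X in the lowest bits) with 8-bit channels. The function name swizzle_rgba8 and its four-argument swizzle interface are hypothetical, chosen to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Swizzle four 8-bit channels packed as WZYX (X in bits 0..7).
 * sx..sw give the source channel (0..3) for destination X, Y, Z, W.
 * A positive shift amount means shift left, a negative one shift right.
 */
static uint32_t
swizzle_rgba8(uint32_t a, unsigned sx, unsigned sy, unsigned sz, unsigned sw)
{
   const unsigned swizzles[4] = { sx, sy, sz, sw };
   const unsigned width = 8;
   uint32_t res = 0;
   unsigned chan;
   int shift;

   for (shift = -3; shift <= 3; ++shift) {
      uint32_t mask = 0;

      /* Group every channel that needs this particular shift. */
      for (chan = 0; chan < 4; ++chan) {
         assert(swizzles[chan] < 4);
         if ((int)swizzles[chan] - (int)chan == -shift) {
            mask |= (uint32_t)0xff << (swizzles[chan] * width);
         }
      }

      if (mask) {
         uint32_t masked = a & mask;
         if (shift > 0)
            masked <<= (unsigned)shift * width;
         else if (shift < 0)
            masked >>= (unsigned)(-shift) * width;
         res |= masked;
      }
   }
   return res;
}
```

Broadcasting Y, for instance, selects channel 1 four times: Y moves right by one channel into X, stays in place for Y, and moves left by one and two channels into Z and W. A big-endian variant would keep the same loop but mirror the mask position and the test on shift, as the diff does.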


@@ -390,11 +390,8 @@ lp_build_emit_fetch_texoffset(
if (chan_index == LP_CHAN_ALL) {
swizzle = ~0;
} else {
assert(chan_index < TGSI_SWIZZLE_W);
swizzle = tgsi_util_get_src_register_swizzle(&reg.Register, chan_index);
if (swizzle > 2) {
assert(0 && "invalid swizzle in emit_fetch_texoffset()");
return bld_base->base.undef;
}
}
assert(off->Index <= bld_base->info->file_max[off->File]);


@@ -184,6 +184,7 @@ struct lp_build_sampler_soa
const struct lp_derivatives *derivs,
LLVMValueRef lod_bias, /* optional */
LLVMValueRef explicit_lod, /* optional */
boolean scalar_lod,
LLVMValueRef *texel);
void


@@ -396,7 +396,7 @@ frc_emit(
TGSI_OPCODE_SUB, emit_data->args[0], tmp);
}
/* TGSI_OPCODE_KIL */
/* TGSI_OPCODE_KILL_IF */
static void
kil_fetch_args(
@@ -419,7 +419,7 @@ kil_fetch_args(
emit_data->dst_type = LLVMVoidTypeInContext(bld_base->base.gallivm->context);
}
/* TGSI_OPCODE_KILP */
/* TGSI_OPCODE_KILL */
static void
kilp_fetch_args(
@@ -633,8 +633,6 @@ rsq_emit(
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
{
emit_data->args[0] = lp_build_emit_llvm_unary(bld_base, TGSI_OPCODE_ABS,
emit_data->args[0]);
if (bld_base->rsq_action.emit) {
bld_base->rsq_action.emit(&bld_base->rsq_action, bld_base, emit_data);
} else {
@@ -871,8 +869,8 @@ lp_set_default_actions(struct lp_build_tgsi_context * bld_base)
bld_base->op_actions[TGSI_OPCODE_EX2].fetch_args = scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_IF].fetch_args = scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_UIF].fetch_args = scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_KIL].fetch_args = kil_fetch_args;
bld_base->op_actions[TGSI_OPCODE_KILP].fetch_args = kilp_fetch_args;
bld_base->op_actions[TGSI_OPCODE_KILL_IF].fetch_args = kil_fetch_args;
bld_base->op_actions[TGSI_OPCODE_KILL].fetch_args = kilp_fetch_args;
bld_base->op_actions[TGSI_OPCODE_RCP].fetch_args = scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_SIN].fetch_args = scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_LG2].fetch_args = scalar_unary_fetch_args;
@@ -1161,14 +1159,9 @@ iset_emit_cpu(
struct lp_build_emit_data * emit_data,
unsigned pipe_func)
{
LLVMValueRef nz = lp_build_const_vec(bld_base->base.gallivm,
bld_base->int_bld.type, ~0U);
LLVMValueRef cond = lp_build_cmp(&bld_base->int_bld, pipe_func,
emit_data->args[0], emit_data->args[1]);
emit_data->output[emit_data->chan] = lp_build_select(&bld_base->int_bld,
cond,
nz,
bld_base->int_bld.zero);
emit_data->output[emit_data->chan] = cond;
}
/* TGSI_OPCODE_IMAX (CPU Only) */
@@ -1354,9 +1347,6 @@ rcp_emit_cpu(
}
/* Reciprocal square root (CPU Only) */
/* This is not the same as TGSI_OPCODE_RSQ, which requres the argument to be
* greater than or equal to 0 */
static void
recip_sqrt_emit_cpu(
const struct lp_build_tgsi_action * action,
@@ -1620,14 +1610,9 @@ uset_emit_cpu(
struct lp_build_emit_data * emit_data,
unsigned pipe_func)
{
LLVMValueRef nz = lp_build_const_vec(bld_base->base.gallivm,
bld_base->uint_bld.type, ~0U);
LLVMValueRef cond = lp_build_cmp(&bld_base->uint_bld, pipe_func,
emit_data->args[0], emit_data->args[1]);
emit_data->output[emit_data->chan] = lp_build_select(&bld_base->uint_bld,
cond,
nz,
bld_base->uint_bld.zero);
emit_data->output[emit_data->chan] = cond;
}


@@ -657,12 +657,10 @@ lp_emit_instruction_aos(
case TGSI_OPCODE_DDY:
return FALSE;
case TGSI_OPCODE_KILP:
/* predicated kill */
case TGSI_OPCODE_KILL:
return FALSE;
case TGSI_OPCODE_KIL:
/* conditional kill */
case TGSI_OPCODE_KILL_IF:
return FALSE;
case TGSI_OPCODE_PK2H:


@@ -1026,9 +1026,9 @@ emit_fetch_immediate(
}
if (stype == TGSI_TYPE_UNSIGNED) {
res = LLVMConstBitCast(res, bld_base->uint_bld.vec_type);
res = LLVMBuildBitCast(builder, res, bld_base->uint_bld.vec_type, "");
} else if (stype == TGSI_TYPE_SIGNED) {
res = LLVMConstBitCast(res, bld_base->int_bld.vec_type);
res = LLVMBuildBitCast(builder, res, bld_base->int_bld.vec_type, "");
}
return res;
}
@@ -1576,6 +1576,7 @@ emit_tex( struct lp_build_tgsi_soa_context *bld,
LLVMValueRef offsets[3] = { NULL };
struct lp_derivatives derivs;
struct lp_derivatives *deriv_ptr = NULL;
boolean scalar_lod;
unsigned num_coords, num_derivs, num_offsets;
unsigned i;
@@ -1693,6 +1694,9 @@ emit_tex( struct lp_build_tgsi_soa_context *bld,
}
}
/* TODO: use scalar lod if explicit_lod, lod_bias or derivs are broadcasted scalars */
scalar_lod = bld->bld_base.info->processor == TGSI_PROCESSOR_FRAGMENT;
bld->sampler->emit_fetch_texel(bld->sampler,
bld->bld_base.base.gallivm,
bld->bld_base.base.type,
@@ -1701,7 +1705,7 @@ emit_tex( struct lp_build_tgsi_soa_context *bld,
coords,
offsets,
deriv_ptr,
lod_bias, explicit_lod,
lod_bias, explicit_lod, scalar_lod,
texel);
}
@@ -1719,6 +1723,7 @@ emit_sample(struct lp_build_tgsi_soa_context *bld,
LLVMValueRef offsets[3] = { NULL };
struct lp_derivatives derivs;
struct lp_derivatives *deriv_ptr = NULL;
boolean scalar_lod;
unsigned num_coords, num_offsets, num_derivs;
unsigned i;
@@ -1784,13 +1789,6 @@ emit_sample(struct lp_build_tgsi_soa_context *bld,
return;
}
/*
* unlike old-style tex opcodes the texture/sampler indices
* always come from src1 and src2 respectively.
*/
texture_unit = inst->Src[1].Register.Index;
sampler_unit = inst->Src[2].Register.Index;
if (modifier == LP_BLD_TEX_MODIFIER_LOD_BIAS) {
lod_bias = lp_build_emit_fetch( &bld->bld_base, inst, 3, 0 );
explicit_lod = NULL;
@@ -1843,6 +1841,9 @@ emit_sample(struct lp_build_tgsi_soa_context *bld,
}
}
/* TODO: use scalar lod if explicit_lod, lod_bias or derivs are broadcasted scalars */
scalar_lod = bld->bld_base.info->processor == TGSI_PROCESSOR_FRAGMENT;
bld->sampler->emit_fetch_texel(bld->sampler,
bld->bld_base.base.gallivm,
bld->bld_base.base.type,
@@ -1851,7 +1852,7 @@ emit_sample(struct lp_build_tgsi_soa_context *bld,
coords,
offsets,
deriv_ptr,
lod_bias, explicit_lod,
lod_bias, explicit_lod, scalar_lod,
texel);
}
@@ -1866,6 +1867,7 @@ emit_fetch_texels( struct lp_build_tgsi_soa_context *bld,
LLVMValueRef explicit_lod = NULL;
LLVMValueRef coords[3];
LLVMValueRef offsets[3] = { NULL };
boolean scalar_lod;
unsigned num_coords;
unsigned dims;
unsigned i;
@@ -1934,6 +1936,9 @@ emit_fetch_texels( struct lp_build_tgsi_soa_context *bld,
}
}
/* TODO: use scalar lod if explicit_lod is broadcasted scalar */
scalar_lod = bld->bld_base.info->processor == TGSI_PROCESSOR_FRAGMENT;
bld->sampler->emit_fetch_texel(bld->sampler,
bld->bld_base.base.gallivm,
bld->bld_base.base.type,
@@ -1942,7 +1947,7 @@ emit_fetch_texels( struct lp_build_tgsi_soa_context *bld,
coords,
offsets,
NULL,
NULL, explicit_lod,
NULL, explicit_lod, scalar_lod,
texel);
}
@@ -2038,7 +2043,7 @@ near_end_of_shader(struct lp_build_tgsi_soa_context *bld,
* Kill fragment if any of the src register values are negative.
*/
static void
emit_kil(
emit_kill_if(
struct lp_build_tgsi_soa_context *bld,
const struct tgsi_full_instruction *inst,
int pc)
@@ -2091,13 +2096,12 @@ emit_kil(
/**
* Predicated fragment kill.
* XXX Actually, we do an unconditional kill (as in tgsi_exec.c).
* Unconditional fragment kill.
* The only predication is the execution mask which will apply if
* we're inside a loop or conditional.
*/
static void
emit_kilp(struct lp_build_tgsi_soa_context *bld,
emit_kill(struct lp_build_tgsi_soa_context *bld,
int pc)
{
LLVMBuilderRef builder = bld->bld_base.base.gallivm->builder;
@@ -2315,25 +2319,25 @@ ddy_emit(
}
static void
kilp_emit(
kill_emit(
const struct lp_build_tgsi_action * action,
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
{
struct lp_build_tgsi_soa_context * bld = lp_soa_context(bld_base);
emit_kilp(bld, bld_base->pc - 1);
emit_kill(bld, bld_base->pc - 1);
}
static void
kil_emit(
kill_if_emit(
const struct lp_build_tgsi_action * action,
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
{
struct lp_build_tgsi_soa_context * bld = lp_soa_context(bld_base);
emit_kil(bld, emit_data->inst, bld_base->pc - 1);
emit_kill_if(bld, emit_data->inst, bld_base->pc - 1);
}
static void
@@ -3164,8 +3168,8 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
bld.bld_base.op_actions[TGSI_OPCODE_ENDSWITCH].emit = endswitch_emit;
bld.bld_base.op_actions[TGSI_OPCODE_IF].emit = if_emit;
bld.bld_base.op_actions[TGSI_OPCODE_UIF].emit = uif_emit;
bld.bld_base.op_actions[TGSI_OPCODE_KIL].emit = kil_emit;
bld.bld_base.op_actions[TGSI_OPCODE_KILP].emit = kilp_emit;
bld.bld_base.op_actions[TGSI_OPCODE_KILL_IF].emit = kill_if_emit;
bld.bld_base.op_actions[TGSI_OPCODE_KILL].emit = kill_emit;
bld.bld_base.op_actions[TGSI_OPCODE_NRM].emit = nrm_emit;
bld.bld_base.op_actions[TGSI_OPCODE_NRM4].emit = nrm_emit;
bld.bld_base.op_actions[TGSI_OPCODE_RET].emit = ret_emit;

View File

@@ -33,6 +33,8 @@
* Set GALLIUM_HUD=help for more info.
*/
#include <stdio.h>
#include "hud/hud_context.h"
#include "hud/hud_private.h"
#include "hud/font.h"
@@ -106,8 +108,8 @@ hud_draw_colored_prims(struct hud_context *hud, unsigned prim,
hud->constants.color[1] = g;
hud->constants.color[2] = b;
hud->constants.color[3] = a;
hud->constants.translate[0] = xoffset;
hud->constants.translate[1] = yoffset;
hud->constants.translate[0] = (float) xoffset;
hud->constants.translate[1] = (float) yoffset;
hud->constants.scale[0] = 1;
hud->constants.scale[1] = yscale;
cso_set_constant_buffer(cso, PIPE_SHADER_VERTEX, 0, &hud->constbuf);
@@ -127,10 +129,10 @@ hud_draw_colored_quad(struct hud_context *hud, unsigned prim,
float r, float g, float b, float a)
{
float buffer[] = {
x1, y1,
x1, y2,
x2, y2,
x2, y1,
(float) x1, (float) y1,
(float) x1, (float) y2,
(float) x2, (float) y2,
(float) x2, (float) y1,
};
hud_draw_colored_prims(hud, prim, buffer, 4, r, g, b, a, 0, 0, 1);
@@ -145,17 +147,17 @@ hud_draw_background_quad(struct hud_context *hud,
assert(hud->bg.num_vertices + 4 <= hud->bg.max_num_vertices);
vertices[num++] = x1;
vertices[num++] = y1;
vertices[num++] = (float) x1;
vertices[num++] = (float) y1;
vertices[num++] = x1;
vertices[num++] = y2;
vertices[num++] = (float) x1;
vertices[num++] = (float) y2;
vertices[num++] = x2;
vertices[num++] = y2;
vertices[num++] = (float) x2;
vertices[num++] = (float) y2;
vertices[num++] = x2;
vertices[num++] = y1;
vertices[num++] = (float) x2;
vertices[num++] = (float) y1;
hud->bg.num_vertices += num/2;
}
@@ -200,25 +202,25 @@ hud_draw_string(struct hud_context *hud, unsigned x, unsigned y,
assert(hud->text.num_vertices + num/4 + 4 <= hud->text.max_num_vertices);
vertices[num++] = x1;
vertices[num++] = y1;
vertices[num++] = tx1;
vertices[num++] = ty1;
vertices[num++] = (float) x1;
vertices[num++] = (float) y1;
vertices[num++] = (float) tx1;
vertices[num++] = (float) ty1;
vertices[num++] = x1;
vertices[num++] = y2;
vertices[num++] = tx1;
vertices[num++] = ty2;
vertices[num++] = (float) x1;
vertices[num++] = (float) y2;
vertices[num++] = (float) tx1;
vertices[num++] = (float) ty2;
vertices[num++] = x2;
vertices[num++] = y2;
vertices[num++] = tx2;
vertices[num++] = ty2;
vertices[num++] = (float) x2;
vertices[num++] = (float) y2;
vertices[num++] = (float) tx2;
vertices[num++] = (float) ty2;
vertices[num++] = x2;
vertices[num++] = y1;
vertices[num++] = tx2;
vertices[num++] = ty1;
vertices[num++] = (float) x2;
vertices[num++] = (float) y1;
vertices[num++] = (float) tx2;
vertices[num++] = (float) ty1;
x += hud->font.glyph_width;
s++;
@@ -316,25 +318,25 @@ hud_pane_accumulate_vertices(struct hud_context *hud,
/* draw border */
assert(hud->whitelines.num_vertices + num/2 + 8 <= hud->whitelines.max_num_vertices);
line_verts[num++] = pane->x1;
line_verts[num++] = pane->y1;
line_verts[num++] = pane->x2;
line_verts[num++] = pane->y1;
line_verts[num++] = (float) pane->x1;
line_verts[num++] = (float) pane->y1;
line_verts[num++] = (float) pane->x2;
line_verts[num++] = (float) pane->y1;
line_verts[num++] = pane->x2;
line_verts[num++] = pane->y1;
line_verts[num++] = pane->x2;
line_verts[num++] = pane->y2;
line_verts[num++] = (float) pane->x2;
line_verts[num++] = (float) pane->y1;
line_verts[num++] = (float) pane->x2;
line_verts[num++] = (float) pane->y2;
line_verts[num++] = pane->x1;
line_verts[num++] = pane->y2;
line_verts[num++] = pane->x2;
line_verts[num++] = pane->y2;
line_verts[num++] = (float) pane->x1;
line_verts[num++] = (float) pane->y2;
line_verts[num++] = (float) pane->x2;
line_verts[num++] = (float) pane->y2;
line_verts[num++] = pane->x1;
line_verts[num++] = pane->y1;
line_verts[num++] = pane->x1;
line_verts[num++] = pane->y2;
line_verts[num++] = (float) pane->x1;
line_verts[num++] = (float) pane->y1;
line_verts[num++] = (float) pane->x1;
line_verts[num++] = (float) pane->y2;
/* draw horizontal lines inside the graph */
for (i = 0; i <= 5; i++) {
@@ -405,8 +407,8 @@ hud_draw(struct hud_context *hud, struct pipe_resource *tex)
hud->fb_width = tex->width0;
hud->fb_height = tex->height0;
hud->constants.two_div_fb_width = 2.0 / hud->fb_width;
hud->constants.two_div_fb_height = 2.0 / hud->fb_height;
hud->constants.two_div_fb_width = 2.0f / hud->fb_width;
hud->constants.two_div_fb_height = 2.0f / hud->fb_height;
cso_save_framebuffer(cso);
cso_save_sample_mask(cso);
@@ -456,7 +458,7 @@ hud_draw(struct hud_context *hud, struct pipe_resource *tex)
cso_set_geometry_shader_handle(cso, NULL);
cso_set_vertex_shader_handle(cso, hud->vs);
cso_set_vertex_elements(cso, 2, hud->velems);
cso_set_render_condition(cso, NULL, 0);
cso_set_render_condition(cso, NULL, FALSE, 0);
cso_set_sampler_views(cso, PIPE_SHADER_FRAGMENT, 1,
&hud->font_sampler_view);
cso_set_samplers(cso, PIPE_SHADER_FRAGMENT, 1, sampler_states);
@@ -486,7 +488,7 @@ hud_draw(struct hud_context *hud, struct pipe_resource *tex)
hud->constants.color[0] = 0;
hud->constants.color[1] = 0;
hud->constants.color[2] = 0;
hud->constants.color[3] = 0.666;
hud->constants.color[3] = 0.666f;
hud->constants.translate[0] = 0;
hud->constants.translate[1] = 0;
hud->constants.scale[0] = 1;
@@ -562,7 +564,7 @@ void
hud_pane_set_max_value(struct hud_pane *pane, uint64_t value)
{
pane->max_value = value;
pane->yscale = -(int)pane->inner_height / (double)pane->max_value;
pane->yscale = -(int)pane->inner_height / (float)pane->max_value;
}
static struct hud_pane *
@@ -634,8 +636,8 @@ hud_graph_add_value(struct hud_graph *gr, uint64_t value)
gr->vertices[1] = gr->vertices[(gr->index-1)*2+1];
gr->index = 1;
}
gr->vertices[(gr->index)*2+0] = gr->index*2;
gr->vertices[(gr->index)*2+1] = value;
gr->vertices[(gr->index)*2+0] = (float) (gr->index * 2);
gr->vertices[(gr->index)*2+1] = (float) value;
gr->index++;
if (gr->num_vertices < gr->pane->max_num_vertices) {
@@ -715,8 +717,8 @@ hud_parse_env_var(struct hud_context *hud, const char *env)
*/
period_env = getenv("GALLIUM_HUD_PERIOD");
if (period_env) {
float p = atof(period_env);
if (p >= 0.0) {
float p = (float) atof(period_env);
if (p >= 0.0f) {
period = (unsigned) (p * 1000 * 1000);
}
}
@@ -959,7 +961,8 @@ hud_create(struct pipe_context *pipe, struct cso_context *cso)
hud->fs_color =
util_make_fragment_passthrough_shader(pipe,
TGSI_SEMANTIC_COLOR,
TGSI_INTERPOLATE_CONSTANT);
TGSI_INTERPOLATE_CONSTANT,
TRUE);
{
/* Read a texture and do .xxxx swizzling. */

View File

@@ -116,6 +116,12 @@ query_cpu_load(struct hud_graph *gr)
}
}
static void
free_query_data(void *p)
{
FREE(p);
}
void
hud_cpu_graph_install(struct hud_pane *pane, unsigned cpu_index)
{
@@ -144,7 +150,11 @@ hud_cpu_graph_install(struct hud_pane *pane, unsigned cpu_index)
}
gr->query_new_value = query_cpu_load;
gr->free_query_data = free;
/* Don't use free() as our callback as that messes up Gallium's
* memory debugger. Use simple free_query_data() wrapper.
*/
gr->free_query_data = free_query_data;
info = gr->query_data;
info->cpu_index = cpu_index;

View File

@@ -52,7 +52,7 @@ query_fps(struct hud_graph *gr)
info->frames = 0;
info->last_time = now;
hud_graph_add_value(gr, fps);
hud_graph_add_value(gr, (uint64_t) fps);
}
}
else {
@@ -60,6 +60,12 @@ query_fps(struct hud_graph *gr)
}
}
static void
free_query_data(void *p)
{
FREE(p);
}
void
hud_fps_graph_install(struct hud_pane *pane)
{
@@ -76,7 +82,11 @@ hud_fps_graph_install(struct hud_pane *pane)
}
gr->query_new_value = query_fps;
gr->free_query_data = free;
/* Don't use free() as our callback as that messes up Gallium's
* memory debugger. Use simple free_query_data() wrapper.
*/
gr->free_query_data = free_query_data;
hud_pane_add_graph(pane, gr);
}
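
Note on the `free_query_data()` wrapper: the comment in the hunk says that passing libc `free()` as the callback upsets Gallium's memory debugger. A plausible reading, stated here as an assumption, is that `FREE()` is a macro routed through a tracking allocator, so memory must be released through the same path it was allocated with. The sketch below illustrates that pattern with a hypothetical tracking allocator. `EX_MALLOC`, `EX_FREE`, `ex_graph` and the counter are invented for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tracking allocator standing in for a memory debugger. */
static size_t ex_live_blocks = 0;

static void *
ex_tracked_malloc(size_t size)
{
   void *p = malloc(size);
   if (p)
      ex_live_blocks++;
   return p;
}

static void
ex_tracked_free(void *p)
{
   if (p) {
      ex_live_blocks--;
      free(p);
   }
}

/* Allocation macros in the style of MALLOC()/FREE(). A function-like
 * macro has no address, so it cannot be stored in a callback slot. */
#define EX_MALLOC(size) ex_tracked_malloc(size)
#define EX_FREE(ptr)    ex_tracked_free(ptr)

/* Wrapper with a real address that forwards to the macro. */
static void
free_query_data(void *p)
{
   EX_FREE(p);
}

/* Hypothetical, cut-down version of a graph object with a free callback. */
struct ex_graph {
   void *query_data;
   void (*free_query_data)(void *ptr);
};

static void
ex_graph_destroy(struct ex_graph *gr)
{
   gr->free_query_data(gr->query_data);
   gr->query_data = NULL;
}
```

Installing plain `free` in the callback slot would release the block without updating the tracker, which is the kind of mismatch the wrapper avoids.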

View File

@@ -42,7 +42,7 @@ struct hud_graph {
char name[128];
void *query_data;
void (*query_new_value)(struct hud_graph *gr);
void (*free_query_data)(void *ptr);
void (*free_query_data)(void *ptr); /**< do not use ordinary free() */
/* mutable variables */
unsigned num_vertices;

View File

@@ -150,9 +150,26 @@ int u_index_translator( unsigned hw_mask,
}
/**
* If a driver does not support a particular gallium primitive type
* (such as PIPE_PRIM_QUAD_STRIP) this function can be used to help
* convert the primitive into a simpler type (like PIPE_PRIM_TRIANGLES).
*
* The generator function generates a number of ushort or uint indexes
* for drawing the new type of primitive.
*
* \param hw_mask a bitmask of (1 << PIPE_PRIM_x) values that indicates
* which kinds of primitives are supported by the driver.
* \param prim the PIPE_PRIM_x that the user wants to draw
* \param start index of first vertex to draw
* \param nr number of vertices to draw
* \param in_pv user's provoking vertex (PV_FIRST/LAST)
* \param out_pv desired provoking vertex for the hardware (PV_FIRST/LAST)
* \param out_prim returns the new primitive type for the driver
* \param out_index_size returns OUT_USHORT or OUT_UINT
* \param out_nr returns new number of vertices to draw
* \param out_generate returns pointer to the generator function
*/
int u_index_generator( unsigned hw_mask,
unsigned prim,
unsigned start,
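
Note on `hw_mask`: the doc comment above describes it as a bitmask of `(1 << PIPE_PRIM_x)` values. A minimal sketch of how such a mask can be tested is shown below. The enum and its values are hypothetical stand-ins, and the unconditional fallback to triangles is a simplification rather than what `u_index_generator` actually does.

```c
#include <stdbool.h>

/* Hypothetical primitive enum; the values are illustrative only. */
enum ex_prim {
   EX_PRIM_POINTS,
   EX_PRIM_LINES,
   EX_PRIM_TRIANGLES,
   EX_PRIM_QUAD_STRIP
};

/* True if the bit for `prim` is set in the driver's support mask. */
static bool
ex_prim_supported(unsigned hw_mask, enum ex_prim prim)
{
   return (hw_mask & (1u << prim)) != 0;
}

/* Return `prim` when the hardware supports it, otherwise fall back to
 * triangles (an assumption made for this sketch). */
static enum ex_prim
ex_choose_prim(unsigned hw_mask, enum ex_prim prim)
{
   return ex_prim_supported(hw_mask, prim) ? prim : EX_PRIM_TRIANGLES;
}
```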

View File

@@ -151,7 +151,14 @@ int u_unfilled_translator( unsigned prim,
}
/**
* Utility for converting unfilled polygons into points, lines, triangles.
* Few drivers have direct support for OpenGL's glPolygonMode.
* This function helps with converting triangles into points or lines
* when the front and back fill modes are the same. When there's
* different front/back fill modes, that can be handled with the
* 'draw' module.
*/
int u_unfilled_generator( unsigned prim,
unsigned start,
unsigned nr,

View File

@@ -0,0 +1,92 @@
/**************************************************************************
*
* Copyright 2013 VMware, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
* IN NO EVENT SHALL THE AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR
* ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
* SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
**************************************************************************/
#include "pipe/p_config.h"
#include "os/os_process.h"
#include "util/u_memory.h"
#if defined(PIPE_SUBSYSTEM_WINDOWS_USER)
# include <windows.h>
#elif defined(__GLIBC__)
# include <errno.h>
#elif defined(PIPE_OS_BSD) || defined(PIPE_OS_APPLE)
# include <stdlib.h>
#else
#warning unexpected platform in os_process.c
#endif
/**
* Return the name of the current process.
* \param procname returns the process name
* \param size size of the procname buffer
* \return TRUE or FALSE for success, failure
*/
boolean
os_get_process_name(char *procname, size_t size)
{
const char *name;
#if defined(PIPE_SUBSYSTEM_WINDOWS_USER)
char szProcessPath[MAX_PATH];
char *lpProcessName;
char *lpProcessExt;
GetModuleFileNameA(NULL, szProcessPath, Elements(szProcessPath));
lpProcessName = strrchr(szProcessPath, '\\');
lpProcessName = lpProcessName ? lpProcessName + 1 : szProcessPath;
lpProcessExt = strrchr(lpProcessName, '.');
if (lpProcessExt) {
*lpProcessExt = '\0';
}
name = lpProcessName;
#elif defined(__GLIBC__)
name = program_invocation_short_name;
#elif defined(PIPE_OS_BSD) || defined(PIPE_OS_APPLE)
/* *BSD and OS X */
name = getprogname();
#else
#warning unexpected platform in os_process.c
return FALSE;
#endif
assert(size > 0);
assert(procname);
if (name && procname && size > 0) {
strncpy(procname, name, size);
procname[size - 1] = '\0';
return TRUE;
}
else {
return FALSE;
}
}
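
Note on the copy at the end of `os_get_process_name()`: `strncpy()` does not write a terminator when the source is at least `size` bytes long, which is why the function sets `procname[size - 1]` explicitly. The same bounded-copy pattern is sketched below in isolation. `ex_copy_name` is a hypothetical helper and not part of the Gallium `os` module.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copy `name` into `dst`, truncating to fit, and always terminate.
 * Returns false when any argument is unusable. */
static bool
ex_copy_name(char *dst, size_t size, const char *name)
{
   if (!name || !dst || size == 0)
      return false;

   strncpy(dst, name, size);
   dst[size - 1] = '\0';   /* strncpy may leave dst unterminated */
   return true;
}
```

A name longer than the buffer is silently truncated, so callers that need the full name have to pass a large enough buffer.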

View File

@@ -0,0 +1,40 @@
/**************************************************************************
*
* Copyright 2013 VMware, Inc.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sub license, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice (including the
* next paragraph) shall be included in all copies or substantial portions
* of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
* IN NO EVENT SHALL THE AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR
* ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
* SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*
**************************************************************************/
#ifndef OS_PROCESS_H
#define OS_PROCESS_H
#include "pipe/p_compiler.h"
extern boolean
os_get_process_name(char *str, size_t size);
#endif /* OS_PROCESS_H */

View File

@@ -84,7 +84,7 @@ os_time_get_nano(void)
void
os_time_sleep(int64_t usecs)
{
DWORD dwMilliseconds = (usecs + 999) / 1000;
DWORD dwMilliseconds = (DWORD) ((usecs + 999) / 1000);
/* Avoid Sleep(0) as that would cause the thread to sleep for an undetermined duration */
if (dwMilliseconds) {
Sleep(dwMilliseconds);
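
Note on the conversion in `os_time_sleep()`: `(usecs + 999) / 1000` rounds a microsecond count up to whole milliseconds, so any non-zero request sleeps for at least one millisecond. The arithmetic is sketched below without the Windows types. `ex_usecs_to_msecs` is a hypothetical helper, and the clamp for non-positive input is an addition of this sketch that is not present in the hunk.

```c
#include <stdint.h>

/* Round a microsecond count up to whole milliseconds. */
static uint32_t
ex_usecs_to_msecs(int64_t usecs)
{
   if (usecs <= 0)
      return 0;   /* clamp added for this sketch only */

   /* The explicit cast mirrors the (DWORD) cast in the hunk; very
    * large inputs would still be truncated to 32 bits. */
   return (uint32_t) ((usecs + 999) / 1000);
}
```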

View File

@@ -30,8 +30,9 @@
#include "postprocess/postprocess.h"
typedef void (*pp_init_func) (struct pp_queue_t *, unsigned int,
typedef bool (*pp_init_func) (struct pp_queue_t *, unsigned int,
unsigned int);
typedef void (*pp_free_func) (struct pp_queue_t *, unsigned int);
struct pp_filter_t
{
@@ -41,18 +42,19 @@ struct pp_filter_t
unsigned int verts; /* How many are vertex shaders */
pp_init_func init; /* Init function */
pp_func main; /* Run function */
pp_free_func free; /* Free function */
};
/* Order matters. Put new filters in a suitable place. */
static const struct pp_filter_t pp_filters[PP_FILTERS] = {
/* name inner shaders verts init run */
{ "pp_noblue", 0, 2, 1, pp_noblue_init, pp_nocolor },
{ "pp_nogreen", 0, 2, 1, pp_nogreen_init, pp_nocolor },
{ "pp_nored", 0, 2, 1, pp_nored_init, pp_nocolor },
{ "pp_celshade", 0, 2, 1, pp_celshade_init, pp_nocolor },
{ "pp_jimenezmlaa", 2, 5, 2, pp_jimenezmlaa_init, pp_jimenezmlaa },
{ "pp_jimenezmlaa_color", 2, 5, 2, pp_jimenezmlaa_init_color, pp_jimenezmlaa_color },
/* name inner shaders verts init run free */
{ "pp_noblue", 0, 2, 1, pp_noblue_init, pp_nocolor, pp_nocolor_free },
{ "pp_nogreen", 0, 2, 1, pp_nogreen_init, pp_nocolor, pp_nocolor_free },
{ "pp_nored", 0, 2, 1, pp_nored_init, pp_nocolor, pp_nocolor_free },
{ "pp_celshade", 0, 2, 1, pp_celshade_init, pp_nocolor, pp_celshade_free },
{ "pp_jimenezmlaa", 2, 5, 2, pp_jimenezmlaa_init, pp_jimenezmlaa, pp_jimenezmlaa_free },
{ "pp_jimenezmlaa_color", 2, 5, 2, pp_jimenezmlaa_init_color, pp_jimenezmlaa_color, pp_jimenezmlaa_free },
};
#endif
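
Note on the new `init`/`free` columns: with `pp_init_func` returning `bool` and every filter gaining a `free` hook, a caller can stop on the first failed init and undo the filters that were already set up. The diff shown here does not include the caller, so the loop below is only a sketch of one way to use such a table. All `ex_` names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down filter descriptor with init and free hooks. */
struct ex_filter {
   const char *name;
   bool (*init)(void *state);
   void (*free)(void *state);
};

/* Initialise filters[0..count). On the first failure, free the filters
 * already initialised, in reverse order, and report failure. */
static bool
ex_init_filters(const struct ex_filter *filters, size_t count, void *state)
{
   size_t i;

   for (i = 0; i < count; i++) {
      if (!filters[i].init(state)) {
         while (i > 0) {
            i--;
            filters[i].free(state);
         }
         return false;
      }
   }
   return true;
}

/* Sample hooks: `state` points at an int counting live filters. */
static bool
ex_init_ok(void *state)
{
   (*(int *) state)++;
   return true;
}

static bool
ex_init_fail(void *state)
{
   (void) state;
   return false;
}

static void
ex_free_one(void *state)
{
   (*(int *) state)--;
}
```

Freeing in reverse order is a design choice of the sketch; it matters only when later filters depend on resources created by earlier ones.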

View File

@@ -53,11 +53,13 @@ struct pp_queue_t
struct pipe_resource *depth; /* depth of original input */
struct pipe_resource *stencil; /* stencil shared by inner_tmps */
struct pipe_resource *constbuf; /* MLAA constant buffer */
struct pipe_resource *areamaptex; /* MLAA area map texture */
struct pipe_surface *tmps[2], *inner_tmps[3], *stencils;
void ***shaders; /* Shaders in TGSI form */
unsigned int *verts;
unsigned int *filters; /* Active filter to filters.h mapping. */
struct program *p;
bool fbos_init;
@@ -75,6 +77,14 @@ void pp_debug(const char *, ...);
struct program *pp_init_prog(struct pp_queue_t *, struct pipe_context *pipe,
struct cso_context *);
void pp_init_fbos(struct pp_queue_t *, unsigned int, unsigned int);
void pp_blit(struct pipe_context *pipe,
struct pipe_resource *src_tex,
int srcX0, int srcY0,
int srcX1, int srcY1,
int srcZ0,
struct pipe_surface *dst,
int dstX0, int dstY0,
int dstX1, int dstY1);
/* The filters */
@@ -88,14 +98,20 @@ void pp_jimenezmlaa_color(struct pp_queue_t *, struct pipe_resource *,
/* The filter init functions */
void pp_celshade_init(struct pp_queue_t *, unsigned int, unsigned int);
bool pp_celshade_init(struct pp_queue_t *, unsigned int, unsigned int);
void pp_nored_init(struct pp_queue_t *, unsigned int, unsigned int);
void pp_nogreen_init(struct pp_queue_t *, unsigned int, unsigned int);
void pp_noblue_init(struct pp_queue_t *, unsigned int, unsigned int);
bool pp_nored_init(struct pp_queue_t *, unsigned int, unsigned int);
bool pp_nogreen_init(struct pp_queue_t *, unsigned int, unsigned int);
bool pp_noblue_init(struct pp_queue_t *, unsigned int, unsigned int);
void pp_jimenezmlaa_init(struct pp_queue_t *, unsigned int, unsigned int);
void pp_jimenezmlaa_init_color(struct pp_queue_t *, unsigned int,
bool pp_jimenezmlaa_init(struct pp_queue_t *, unsigned int, unsigned int);
bool pp_jimenezmlaa_init_color(struct pp_queue_t *, unsigned int,
unsigned int);
/* The filter free functions */
void pp_celshade_free(struct pp_queue_t *, unsigned int);
void pp_nocolor_free(struct pp_queue_t *, unsigned int);
void pp_jimenezmlaa_free(struct pp_queue_t *, unsigned int);
#endif

View File

@@ -30,9 +30,17 @@
#include "postprocess/pp_filters.h"
/** Init function */
void
bool
pp_celshade_init(struct pp_queue_t *ppq, unsigned int n, unsigned int val)
{
ppq->shaders[n][1] =
pp_tgsi_to_state(ppq->p->pipe, celshade, false, "celshade");
return (ppq->shaders[n][1] != NULL) ? TRUE : FALSE;
}
/** Free function */
void
pp_celshade_free(struct pp_queue_t *ppq, unsigned int n)
{
}

Some files were not shown because too many files have changed in this diff.