Compare commits

...

127 Commits

Author SHA1 Message Date
Emil Velikov
088d350178 Add release notes for the 10.3.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-13 00:16:59 +01:00
Emil Velikov
85421100fb Update VERSION to 10.3.1
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-12 21:44:45 +01:00
Tomasz Figa
c90cd077bd st/mesa: Fix paths used in Android builds
With current makefiles the build fails because source and build paths
are generated incorrectly. With Android build system the top_srcdir and
top_builddir variables are undefined and all paths are relative to where
Android.mk is located. This ends up with paths like
external/mesa/src/mesa/src/mesa/ for both source and build paths, which
are obviously wrong.

This patch fixes this by overriding resulting SRCDIR and BUILDDIR
variables with empty string, so that paths end up being relative to
Android.mk file again. Appending correct build path to generated files
is already done in Android.gen.mk.

Signed-off-by: Tomasz Figa <tomasz.figa@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit b4ffd19e6c)
2014-10-03 01:28:02 +01:00
Tomasz Figa
dffbee6668 st/mesa: Generate format_info.c in Android builds
Current Android makefiles lack generation of format_info.c, which is
a dependency of main/format.c. This patch adds necessary code to
Android.gen.mk.

Signed-off-by: Tomasz Figa <tomasz.figa@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 98445fd25e)
2014-10-03 01:27:56 +01:00
Tomasz Figa
58ba481e8e util: Include in Android builds
This patch fixes Android build failures by including src/util directory
in compilation. Files inside of this directory are compiled into
libmesa_util static library and linked with resulting libGLES_mesa.

Signed-off-by: Tomasz Figa <tomasz.figa@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit d703abf735)
2014-10-03 01:27:50 +01:00
Keith Packard
ccf908e382 glx/dri3: Provide error diagnostics when DRI3 allocation fails
Instead of just segfaulting in the driver when a buffer allocation fails,
report error messages indicating what went wrong so that we can debug things.

As a simple example, chromium wraps Mesa in a sandbox which doesn't allow
access to most syscalls, including the ability to create shared memory
segments for fences. Before, you'd get a simple segfault in mesa and your 3D
acceleration would fail. Now you get:

$ chromium --disable-gpu-blacklist
[10618:10643:0930/200525:ERROR:nss_util.cc(856)] After loading Root Certs, loaded==false: NSS error code: -8018
libGL: pci id for fd 12: 8086:0a16, driver i965
libGL: OpenDriver: trying /local-miki/src/mesa/mesa/lib/i965_dri.so
libGL: Can't open configuration file /home/keithp/.drirc: Operation not permitted.
libGL: Can't open configuration file /home/keithp/.drirc: Operation not permitted.
libGL error: DRI3 Fence object allocation failure Operation not permitted
[10618:10618:0930/200525:ERROR:command_buffer_proxy_impl.cc(153)] Could not send GpuCommandBufferMsg_Initialize.
[10618:10618:0930/200525:ERROR:webgraphicscontext3d_command_buffer_impl.cc(236)] CommandBufferProxy::Initialize failed.
[10618:10618:0930/200525:ERROR:webgraphicscontext3d_command_buffer_impl.cc(256)] Failed to initialize command buffer.

This made it pretty easy to diagnose the problem in the referenced bug report.

Bugzilla: https://code.google.com/p/chromium/issues/detail?id=415681
Signed-off-by: Keith Packard <keithp@keithp.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 3202926746)
2014-10-03 01:27:42 +01:00
Thomas Hellstrom
ed440234d4 st/xa: Fix regression in xa_yuv_planar_blit()
Commit "st/xa: scissor to help tilers" broke xa_yuv_planar_blit() and vmwgfx
textured video. Fix this by implementing scissors also in the yuv draw path.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Cc: Rob Clark <robclark@freedesktop.org>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 46537f1d03)
2014-10-03 01:27:34 +01:00
Marek Olšák
d95520d297 st/dri: remove GALLIUM_MSAA and __GL_FSAA_MODE environment variables
Some users don't understand that these variables can break OpenGL.
The general rule is that if an app supports MSAA, you mustn't use
GALLIUM_MSAA.

For example, if an app has an 8xMSAA FBO and GALLIUM_MSAA=4
is set, resolving the FBO to the back buffer will be rejected which will look
like this on all gallium drivers:

http://www.phoronix.com/scan.php?page=article&item=amd_radeonsi_msaa

The environment variables also have no effect on modern apps like TF2, but
there is still a performance hit due to wasted bandwidth and VRAM.

In a nutshell, it does more harm than good.

Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 8449121971)
2014-09-28 20:52:02 +01:00
Tom Stellard
3e980357c5 configure.ac: Compute LLVM_VERSION_PATCH using llvm-config
This is the only guaranteed way to get the patch level for llvm,
since the define cannot always be found in config.h depending
on the version of llvm or the build system used.

CC: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jonathan Gray <jsg@jsg.id.au>
(cherry picked from commit ec566e0f16)
2014-09-27 18:56:40 +01:00
Ian Romanick
384816c6db glsl: Strip arrayness from ir_type_dereference_variable too
If the thing being dereferenced is a record or an array of records, it
should be treated as row-major.  The ir_type_dereference_record path
already does this, and I think I intended to do the same for this path
in b17a4d5d.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83741
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c3f17bb18f)
2014-09-27 18:56:39 +01:00
Ian Romanick
d556ed889d glsl: Round struct size up to at least 16 bytes
Per rule #9, the size of the structure is vec4 aligned.  The MAX2 in the
loop ensures that sizes >= 16 bytes are vec4 aligned.  The new MAX2
after the loop ensures that sizes < 16 bytes are vec4 aligned.
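
The net effect of the two MAX2 calls can be sketched as follows. This is only an illustration of the rounding rule, not the code from the patch; align_up() and std140_struct_size() are hypothetical names, and "rule #9" is assumed to be the std140 structure rule.

```c
#include <assert.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Round value up to the next multiple of alignment (a power of two). */
unsigned align_up(unsigned value, unsigned alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: size of a struct given the offset just past its
 * last member and the largest base alignment among its members. */
unsigned std140_struct_size(unsigned end_offset, unsigned max_member_align)
{
    /* The struct alignment is never smaller than a vec4 (16 bytes), so
     * even a struct holding a single float occupies 16 bytes. */
    unsigned struct_align = MAX2(max_member_align, 16u);

    return align_up(end_offset, struct_align);
}
```

With these definitions a struct containing one float (end offset 4, member alignment 4) has size 16, and one that ends at offset 20 has size 32.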

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82932
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 2ab71e1486)
2014-09-27 18:56:39 +01:00
Ian Romanick
d9444533aa glsl: Make sure row-major array-of-structure get correct layout
Whether or not the field is row-major (because it might be a bvec2 or
something) does not affect the array itself.  We need to know whether an
array element in its entirety is row-major.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83506
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 5c75270c34)
2014-09-27 18:56:39 +01:00
Ian Romanick
9328440ef7 glsl: Make sure fields after small structs have correct padding
Previously the linker would correctly calculate the layout, but the
lower_ubo_reference pass would not apply correct alignment to fields
following small (less than 16-byte) nested structures.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83533
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8e01c66da6)
2014-09-27 18:56:39 +01:00
Michel Dänzer
1ac204121b st/mesa: Use PIPE_USAGE_STAGING for GL_STATIC/DYNAMIC/STREAM_READ buffers
Such buffers can only be useful by reading from them with the CPU, so we
need to make sure CPU reads are fast.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84178
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 7e55c3b352)
2014-09-27 18:56:39 +01:00
Ilia Mirkin
fef6059a81 gm107/ir: take relative pfetch offset into account
There is no dedicated instruction for this, so just combine it with the
constant offset.

Acked-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a5bbfeda97)
2014-09-27 18:56:38 +01:00
Ilia Mirkin
34809f8eef gm107/ir: add support for indirect const buffer selection
This was missed in the commit that enabled it for fermi/kepler as part
of ARB_gpu_shader5

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cdc4de1215)
2014-09-27 18:56:38 +01:00
Ilia Mirkin
9a79018840 gm107/ir: fix texture argument order
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0532a5fd00)
2014-09-27 18:56:38 +01:00
Ilia Mirkin
5aff846a60 gm107/ir: fix manual TXD for array targets
This parallels the fixes in commit afea9bae.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d3c3bba6d0)
2014-09-27 18:56:38 +01:00
Ilia Mirkin
fb4e23626f nv50/ir: avoid deleting pseudo instructions too early
What happens is that a SPLIT operation is part of the spill node, and as
a pseudo op, the instruction gets erased after processing its first def.
However the later defs still need to refer to it, so instead delay
deleting until after that whole RA node is done processing.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79462
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0147c10c5f)
2014-09-27 18:56:38 +01:00
Kenneth Graunke
607d0b9578 mesa: Set correct array element in vbo_exec_vtx_init.
I'm not familiar with this code, but this sure appears to be a typo.
It looks like the intent is to set each array element, not arrays[0]
each time.  Notably, the loop just below uses "array", not "arrays".
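
The shape of the typo can be sketched like this. The struct and function names are hypothetical and the real vbo code differs; the sketch only shows the difference between writing slot 0 on every iteration and writing slot i.

```c
#include <stddef.h>

#define N_ATTRS 4  /* hypothetical attribute count for this sketch */

struct client_array { int tag; };

/* Buggy shape: every iteration overwrites arrays[0], so the remaining
 * slots are never set. */
void init_arrays_buggy(const struct client_array **arrays,
                       const struct client_array *src)
{
    for (size_t i = 0; i < N_ATTRS; i++)
        arrays[0] = &src[i];
}

/* Intended shape: each slot points at its own element. */
void init_arrays_fixed(const struct client_array **arrays,
                       const struct client_array *src)
{
    for (size_t i = 0; i < N_ATTRS; i++)
        arrays[i] = &src[i];
}
```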

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit f81052dc9b)
2014-09-27 18:56:38 +01:00
Kenneth Graunke
4fce87bcee mesa: Use proper structure for glGet*(GL_TEXTURE_COORD_ARRAY*).
The code in get.c that handles this uses ctx->Array.VAO->VertexAttrib,
which is a gl_vertex_attrib_array structure, not a gl_client_array.

The offsets of all fields happened to be the same in both structures, at
least on x86_64.  "Size," "Type," and "Stride" are obviously the same:
both structures start with the same fields, in the same order.

"Enabled" is dicier: there are different fields before it in both
structures, including pointer sized values which might need special
alignment.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d0ec6e8509)
2014-09-27 18:56:37 +01:00
Marek Olšák
8e2d0f59f7 radeonsi: properly destroy the GS copy shader and scratch_bo for compute
Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit dc05a9e4e0)
[Emil Velikov: remove unref scratch_bo, s/si_shader/si_pipe_shader/]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-27 18:55:52 +01:00
Marek Olšák
4748d2f065 radeonsi: release GS rings at context destruction
Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 711623f7c8)
[Emil Velikov: s/ring/ring.buffer/]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-27 18:55:07 +01:00
Andreas Pokorny
f74bca93b4 i915: Fix black buffers when importing prime fds
Width and height of the imported image were never initialized from the
imported bo.

Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Pokorny <andreas.pokorny@canonical.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit df341320c9)
2014-09-27 18:12:57 +01:00
Andreas Pokorny
ceebec140b egl/drm: expose KHR_image_pixmap extension
This change enables EGL_KHR_image_pixmap in the egl drm platform, which is implemented
there but has not been advertised yet.

Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Pokorny <andreas.pokorny@canonical.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 53b614bfd3)
2014-09-27 18:12:51 +01:00
Roland Scheidegger
095a6a0af1 gallivm: fix idiv
ffeb77c7b0 had a typo which turned all signed
integer divisions into unsigned ones. Oops.
This gets us back the 51 little piglits
(all from glsl built-in-functions, fs/vs/gs-op-div-int-ivec2 and similar).

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 5e1fcc6258)
2014-09-27 18:12:44 +01:00
rconde
04a9d7d44a gallivm,tgsi: fix idiv by zero crash
While the result of signed integer division by zero is undefined by glsl
(and doesn't exist with d3d10), we must not crash, so need to make sure we
don't get sigfpe much like udiv already does.
Unlike udiv where we return 0xffffffff (as required by d3d10) there is
no requirement right now to return anything specific so we use zero.
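
As a scalar sketch of the behaviour described above (gallivm generates vector IR, so this is not the actual implementation; safe_udiv() and safe_idiv() are hypothetical names):

```c
#include <stdint.h>

/* Unsigned division: 0xffffffff on division by zero, as d3d10 requires. */
uint32_t safe_udiv(uint32_t a, uint32_t b)
{
    return b == 0 ? UINT32_MAX : a / b;
}

/* Signed division: GLSL leaves the result undefined, so return zero
 * instead of trapping. Only the zero divisor is handled here;
 * INT32_MIN / -1 is a separate overflow case this sketch ignores. */
int32_t safe_idiv(int32_t a, int32_t b)
{
    return b == 0 ? 0 : a / b;
}
```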

(cherry picked from commit ffeb77c7b0)
Nominated-by: Roland Scheidegger <sroland@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83570
2014-09-23 00:52:51 +01:00
Tom Stellard
d4289fc37b clover: Add support to mem objects for multiple destructor callbacks v2
The spec says that mem objects should maintain a stack of callbacks
not just one.
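
A minimal C sketch of such a callback stack is shown below. clover itself is C++ and its data structures differ; every name here (mem_object, mem_add_destroy_cb, trace_cb, MAX_CALLBACKS) is hypothetical, and the fixed capacity is an assumption made to keep the example short.

```c
#include <stddef.h>

#define MAX_CALLBACKS 8   /* hypothetical fixed capacity for this sketch */

typedef void (*destroy_cb)(void *user_data);

struct mem_object {
    destroy_cb cb[MAX_CALLBACKS];
    void *user_data[MAX_CALLBACKS];
    size_t count;
};

/* Push a callback; returns 0 on success, -1 if the stack is full. */
int mem_add_destroy_cb(struct mem_object *mem, destroy_cb cb, void *user_data)
{
    if (mem->count == MAX_CALLBACKS)
        return -1;
    mem->cb[mem->count] = cb;
    mem->user_data[mem->count] = user_data;
    mem->count++;
    return 0;
}

/* On destruction, pop and call every callback, newest first. */
void mem_run_destroy_cbs(struct mem_object *mem)
{
    while (mem->count > 0) {
        mem->count--;
        mem->cb[mem->count](mem->user_data[mem->count]);
    }
}

/* Example callback: appends a digit to a shared trace so the call
 * order can be observed. */
struct trace_entry { int *trace; int digit; };

void trace_cb(void *user_data)
{
    struct trace_entry *e = user_data;
    *e->trace = *e->trace * 10 + e->digit;
}
```

Registering entries with digits 1 and then 2 and running the callbacks leaves the trace at 21, since the most recently registered callback runs first.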

v2:
  - Remove stray printf.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

CC: "10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c6d9801409)
2014-09-23 00:46:00 +01:00
Brian Paul
9599cd6a2f mesa: fix prog_optimize.c assertions triggered by SWZ opcode
The SWZ instruction can have swizzle terms >4 (SWIZZLE_ZERO, SWIZZLE_ONE).
These swizzle terms caused a few assertions to fail.
This started happening after the commit "mesa: Actually use the Mesa IR
optimizer for ARB programs." when replaying some apitrace files.

A new piglit test (tests/asmparsertest/shaders/ARBfp1.0/swz-08.txt)
exercises this.

Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit 7b2c703244)
2014-09-23 00:45:21 +01:00
Richard Sandiford
27f70a9273 swrast: Fix handling of MESA_FORMAT_L8A8_SRGB for big-endian
Luminance is the least-significant byte of the uint16, rather than the
lowest byte in memory.  Other parts of mesa already handle this correctly
for big-endian, and swrast already handles other MESA_FORMAT_x8y8 formats
correctly.  This case was just an odd-one-out.
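
The distinction can be illustrated with a small sketch. unpack_l8a8() is a hypothetical helper, not the swrast function, and the sRGB decode is left out; the point is only that the channels are taken from the uint16 value rather than from byte positions in memory.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical unpack of one L8A8 texel stored as a native-endian
 * uint16: luminance is the least-significant byte of the value and
 * alpha the most-significant one, whichever byte comes first in memory. */
void unpack_l8a8(const void *texel, uint8_t *l, uint8_t *a)
{
    uint16_t v;

    memcpy(&v, texel, sizeof v);   /* native-endian load */
    *l = (uint8_t)(v & 0xff);
    *a = (uint8_t)(v >> 8);
}
```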

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ecc48f83c8)
2014-09-23 00:45:01 +01:00
Richard Sandiford
0a6e33ea74 mesa: Fix alpha component in unpack_R8G8B8X8_SRGB.
The function was using the "X" component as the alpha channel,
rather than setting alpha to 1.0.

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 3ff5c6a6c4)
2014-09-23 00:44:30 +01:00
Emil Velikov
18571edea8 docs: Add 10.3 sha256 sums, news item and link release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-19 20:01:04 +01:00
Emil Velikov
1b12af300d docs: Update 10.3 release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-19 19:43:01 +01:00
Emil Velikov
4c4846b588 Bump version to 10.3 (final)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-19 19:27:45 +01:00
Connor Abbott
e471841048 r300g: set register classes before interferences
In commit 567e2769b8 ("ra: make the p, q
test more efficient") I unknowingly introduced a new requirement to the
register allocator API: the user must set the register class of all
nodes before setting up their interferences, because
ra_add_conflict_list() now uses the classes of the two interfering
nodes. i965 already did this, but r300g was setting up register classes
interleaved with setting up the interference graph. This led to us
calculating the wrong q total, and in certain cases
e78a01d5e6 ("ra: optimistically color
only one node at a time") made it so that this bug caused a segfault. In
particular, the error occurred if the q total was decremented to 1 below
0 for the last node to be pushed onto the stack.  Since q_total is an
unsigned integer, it overflowed to 0xffffffff, which is what
lowest_q_total happens to be initialized to. This means that we would
fail the "new_q_total < lowest_q_total" check on line 476 of
register_allocate.c, and so the node would never be pushed onto the
stack, which led to segfaults in ra_select() when we failed to ever give
it a register.
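
The wraparound can be reproduced in isolation. The function below is a hypothetical reduction of the comparison, not the code in register_allocate.c, and it assumes lowest_q_total starts at the maximum unsigned value as described above.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical reduction of the failing check: returns whether a node
 * with the given q total would be selected for pushing. */
bool node_would_be_pushed(unsigned q_total, unsigned decrement)
{
    unsigned lowest_q_total = UINT_MAX;          /* "no candidate yet" */
    unsigned new_q_total = q_total - decrement;  /* wraps modulo 2^N */

    return new_q_total < lowest_q_total;
}
```

Calling node_would_be_pushed(0, 1) yields false because 0 - 1 wraps to UINT_MAX, which is not less than lowest_q_total.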

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82828
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Tested-by: Pavel Ondračka <pavel.ondracka@email.cz>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit afd82dcad1)
2014-09-16 22:18:34 +01:00
Gwenole Beauchesne
f86efb4285 i965: add support for RGBA dma_buf imports.
This allows for importing foreign buffers in RGB32 native endian
byte order, i.e. DRM_FORMAT_XBGR8888, and DRM_FORMAT_ABGR8888.

Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e1c50abf8a)
2014-09-16 22:17:58 +01:00
Kenneth Graunke
84a58f462a i965: Mark delta_x/y as BAD_FILE if remapped away completely.
Commit afe3d1556f (i965: Stop doing
remapping of "special" regs.) stopped remapping delta_x/delta_y, and
additionally stopped considering them always-live.  We later realized
delta_x was used in register allocation, so we actually needed to remap
it, which was fixed in commit 23d782067a
(i965/fs: Keep track of the register that hold delta_x/delta_y.).

However, that commit didn't restore the "always consider it live" part.
If all the code using delta_x was eliminated, fs_visitor::delta_x would
be left pointing at its old register number.  Later code in register
allocation would handle that register number specially...even though it
wasn't actually delta_x.

To combat this, set delta_x/y to BAD_FILE if they're eliminated, and
check for that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83127
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 78bd126194)
2014-09-16 22:17:40 +01:00
Richard Sandiford
605734780e gallivm: Fix uses of 2^24
Fallback cases in lp_bld_arit.c used 2^24 to mean "2 to the power 24",
but in C it's "2 xor 24", i.e. 26.  Fixed by using 1<< instead.
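
A minimal illustration of the pitfall follows. It is not the code from lp_bld_arit.c, and the function names are hypothetical; operands are passed as parameters because some compilers warn about a literal 2^24.

```c
#include <assert.h>
#include <stdint.h>

/* What "base ^ exponent" computes in C: bitwise XOR, not a power. */
uint32_t xor_instead_of_pow(uint32_t base, uint32_t exponent)
{
    return base ^ exponent;
}

/* The intended value, 2 to the power exponent, written as a shift. */
uint32_t pow2(unsigned exponent)
{
    assert(exponent < 32);
    return UINT32_C(1) << exponent;
}
```

Here xor_instead_of_pow(2, 24) is 26, while pow2(24) is 16777216.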

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1a65629ccc)
2014-09-16 22:16:58 +01:00
Ilia Mirkin
efe8fc687d nouveau: change internal variables to avoid conflicts with macro args
Reported by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b13a4ca3f7)
2014-09-16 22:16:16 +01:00
Brian Paul
051543962f mesa: fix _mesa_free_pipeline_data() use-after-free bug
Unreference the ctx->_Shader object before we delete all the pipeline
objects in the hash table.  Before, ctx->_Shader could point to freed
memory when _mesa_reference_pipeline_object(ctx, &ctx->_Shader, NULL)
was called.

Fixes crash when exiting the piglit rendezvous_by_location test on
Windows.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0d73ac6b02)
2014-09-16 22:15:29 +01:00
Andreas Boll
b92ea2a10d gallium/util: add missing u_debug include
Needed for assert.
Fixes build on BE archs with -Werror=implicit-function-declaration.

In file included from
../../../../../src/gallium/auxiliary/draw/draw_fs.c:30:0:
../../../../../src/gallium/auxiliary/util/u_math.h: In function
'util_memcpy_cpu_to_le32':
../../../../../src/gallium/auxiliary/util/u_math.h:810:4: error:
implicit declaration of function 'assert'
[-Werror=implicit-function-declaration]
    assert(n % 4 == 0);
        ^

Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 2a13ff954d)
2014-09-16 22:14:03 +01:00
Ilia Mirkin
b0131d951b nouveau: only enable stencil func if the visual has stencil bits
The _Enabled property already has the relevant information.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3c81de5851)
2014-09-16 22:13:45 +01:00
Ilia Mirkin
0c1f24b46c nouveau: only enable the depth test if there actually is a depth buffer
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 79959e5de5)
2014-09-16 22:13:00 +01:00
Maarten Lankhorst
a4d4ab929e nouveau: remove unneeded assert
No idea why it was added, but the code runs fine even on videos
where it triggers.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8ab85bfcd5)
2014-09-16 22:08:48 +01:00
Maarten Lankhorst
2b43d48509 nouveau: rework reference frame handling
Fixes a regression from "nouveau/vdec: small fixes to h264 handling"

New picking order for frames:
 1. Vidbuf pointer matches.
 2. Take the first kicked ref.
 3. If that fails, take a ref that has a different last_used.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a41aad8431)
2014-09-16 22:08:27 +01:00
Maarten Lankhorst
62f56a08af nouveau: fix MPEG4 hw decoding
Reorder some fields to make I-frame decoding work correctly.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 121ceb38f4)
2014-09-16 22:07:58 +01:00
Maarten Lankhorst
a3c52ce0b4 nouveau: re-allocate bo's on overflow
The BSP bo might be too small to contain all of the bsp data,
bump its size on overflow. Also bump inter_bo when this happens,
it might be too small otherwise.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f6afed7076)
2014-09-16 22:07:23 +01:00
Ian Romanick
6c562f3d1a i965/vec4: Only examine virtual_grf_end for GRF sources
If the source is not a GRF, it could have a register >= virtual_grf_count.
Accessing virtual_grf_end with such a register would lead to
out-of-bounds access.  Make sure the source is a GRF before accessing
virtual_grf_end.
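
The guard can be sketched as below. The types and the function are hypothetical stand-ins for the vec4 backend's structures; the sketch only shows checking the register file before indexing.

```c
#include <assert.h>

enum reg_file { BAD_FILE, GRF, UNIFORM };  /* hypothetical subset */

struct src_reg {
    enum reg_file file;
    int reg;
};

/* Hypothetical lookup: returns the end of the live range for a GRF
 * source, or -1 for any other register file. */
int src_live_end(const struct src_reg *src,
                 const int *virtual_grf_end, int virtual_grf_count)
{
    /* Register numbers of other files are not indices into
     * virtual_grf_end and may exceed virtual_grf_count. */
    if (src->file != GRF)
        return -1;

    assert(src->reg >= 0 && src->reg < virtual_grf_count);
    return virtual_grf_end[src->reg];
}
```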

Fixes Valgrind complaints while compiling some shaders.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 7aeb853c90)
2014-09-16 22:06:03 +01:00
Iago Toral Quiroga
6240628e05 i965: Implement GL_PRIMITIVES_GENERATED with non-zero streams.
So far we have been using CL_INVOCATION_COUNT to resolve this query but this
is no good with streams, as only stream 0 reaches the clipping stage.

From ARB_transform_feedback3:

"When a generated primitive query for a vertex stream is active, the
 primitives-generated count is incremented every time a primitive emitted to
 that stream reaches the Discarding Rasterization stage (see Section 3.x)
 right before rasterization. This counter is incremented whether or not
 transform feedback is active."

Unfortunately, we don't have any registers that provide the number of primitives
written to a specific stream other than the ones that track the number of
primitives written to transform feedback in the SOL stage, so we can't
implement this exactly as specified.

In the past we implemented this feature by activating the SOL unit even if
transform feedback was disabled, but making it so that all buffers were
disabled and it only recorded statistics, which gave us the right semantics
(see 3178d2474a). Unfortunately, this came with
a significant performance impact and had to be reverted.

This new take does not intend to implement the exact semantics required by
the spec, but improves what we have now, since now we return the primitive
count for stream 0 in all cases. With this patch we use
GEN7_SO_PRIM_STORAGE_NEEDED to resolve GL_PRIMITIVES_GENERATED queries
for non-zero streams. This would return the number of primitives written
to transform feedback for each stream instead. Since non-zero streams are
only useful in combination with transform feedback this should not be too
bad, and the only case that I think we would not be supporting would be
the one in which we want to use both GL_PRIMITIVES_GENERATED and
GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN on the same non-zero stream to
detect buffer overflow.

This patch also fixes the following piglit test:
arb_gpu_shader5-xfb-streams-without-invocations

This test uses both GL_PRIMITIVES_GENERATED and
GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN queries on non-zero streams, but it
never hits the overflow case, so both queries are always expected to return
the same value.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f976b4c1bf)
Nominated-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-16 22:01:53 +01:00
Kenneth Graunke
0f4dc09807 glsl: Speed up constant folding for swizzles.
ir_rvalue::constant_expression_value() recursively walks down an IR
tree, attempting to reduce it to a single constant value.  This is
useful when you want to know whether a variable has a constant
expression value at all, and if so, what it is.

The constant folding optimization pass attempts to replace rvalues with
their constant expression value from the bottom up.  That way, we can
optimize subexpressions, and ideally stop as soon as we find a
non-constant subexpression.

In order to obtain the actual value of an expression, the optimization
pass calls constant_expression_value().  But it should only do so if it
knows the value can be combined into a constant.  Otherwise, at each
step of walking back up the tree, it will walk down the tree again, only
to discover what it already knew: it isn't constant.

We properly avoided this call for ir_expression nodes, but not for
ir_swizzle nodes.  This patch fixes that, drastically reducing compile
times on certain shaders where tree grafting has given us huge
expression trees.  It also fixes SuperTuxKart.

Thanks to Iago and Mike for help in tracking this down.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78468
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 84a40ce86b)
2014-09-12 16:51:52 -07:00
Kenneth Graunke
eeba3c94b1 i965/vec4: Make type_size() return 0 for samplers.
The FS backend has always used 0, and the VS backend has always used 1.
I think 1 is just working around other problems, and is incorrect.
Samplers are baked in; nothing uses the UNIFORM register we would
create, and we shouldn't upload any constant values for them.

Fixes ES3-CTS.shaders.struct.uniform.sampler_array_vertex.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7865026c04)
2014-09-12 16:51:52 -07:00
Kenneth Graunke
0eeec2871d i965: Skip allocating UNIFORM file storage for uniforms of size 0.
Samplers take up zero slots and therefore don't exist in the params
array, nor are they included in stage_prog_data->nr_params.  There's no
need to store their size in param_size, as it's only used for dealing
with arrays of "real" uniforms (ones uploaded as shader constants).

We run into all kinds of problems trying to refer to the uniform storage
for variables that don't have uniform storage.  For one, we may use some
other variable's index, or access out of bounds in arrays.  In the FS
backend, our extra 2 * MaxSamplerImageUnits params for texture rectangle
rescaling paper over a lot of problems.  In the VS backend, we claim
samplers take up a slot, which also papers over problems.

Instead, just skip allocating storage for variables that don't have any.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 2408f166db)
2014-09-12 16:51:52 -07:00
Kenneth Graunke
8f1ccf3577 i965: Disable guardband clipping in the smaller-than-viewport case.
Apparently guardband clipping doesn't work like we thought: objects
entirely outside the guardband are trivially rejected, regardless of
their relation to the viewport.  Normally, the guardband is larger than
the viewport, so this is not a problem.  However, when the viewport is
larger than the guardband, this means that we would discard primitives
which were wholly outside of the guardband, but still visible.

We always program the guardband to 8K x 8K to enforce the restriction
that the screenspace bounding box of a single triangle must be no more
than 8K x 8K.  So, if the viewport is larger than that, we need to
disable guardband clipping.
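
The resulting decision can be sketched as a simple predicate. The function name and the way the viewport size is passed are assumptions; only the 8K x 8K limit comes from the description above.

```c
#include <stdbool.h>

#define GUARDBAND_SIZE 8192u  /* 8K x 8K, per the description above */

/* Hypothetical predicate: guardband clipping stays enabled only while
 * the viewport fits inside the guardband in both dimensions. */
bool use_guardband_clip(unsigned viewport_width, unsigned viewport_height)
{
    return viewport_width <= GUARDBAND_SIZE &&
           viewport_height <= GUARDBAND_SIZE;
}
```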

Fixes ES3 conformance tests:
- framebuffer_blit_functionality_negative_height_blit
- framebuffer_blit_functionality_negative_width_blit
- framebuffer_blit_functionality_negative_dimensions_blit
- framebuffer_blit_functionality_magnifying_blit
- framebuffer_blit_functionality_multisampled_to_singlesampled_blit

v2: Mention the acronym expansion for TA/TR/MC in the comments.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 0bac2551e4)
2014-09-12 16:51:52 -07:00
Kenneth Graunke
8e05b2bfae i965: Separate gl_InstanceID and gl_VertexID uploading.
We always uploaded them together, mostly out of laziness - both required
an additional vertex element.  However, gl_VertexID now also requires an
additional vertex buffer for storing gl_BaseVertex; for non-indirect
draws this also means uploading (a small amount of) data.  This is extra
overhead we don't need if the shader only uses gl_InstanceID.

In particular, our clear shaders currently use gl_InstanceID for doing
layered clears, but don't need gl_VertexID.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 6b6145204d)
2014-09-12 16:51:51 -07:00
Kenneth Graunke
997f634c33 i965: Fix reference counting in new basevertex upload code.
In the non-indirect draw case, we call intel_upload_data to upload
gl_BaseVertex.  It makes brw->draw.draw_params_bo point to the upload
buffer, and increments the upload BO reference count.

So, we need to unreference it when making brw->draw.draw_params_bo point
at something else, or else we'll retain a reference to stale upload
buffers and hold on to them forever.

This also means that the indirect case should increment the reference
count on the indirect draw buffer when making brw->draw.draw_params_bo
point at it.  That way, both paths increment the reference count, so
we can safely unreference it every time.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit e980fe6071)
2014-09-12 16:51:51 -07:00
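The reference-counting invariant from 997f634c33 (every path that repoints the pointer holds a reference, so the old target can always be unreferenced) might look like this in miniature. The struct and all function names are hypothetical stand-ins, not the driver's buffer-object API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver buffer object with a reference count. */
struct buffer {
   int refcount;
};

static void
buffer_reference(struct buffer *bo)
{
   if (bo)
      bo->refcount++;
}

static void
buffer_unreference(struct buffer *bo)
{
   if (bo) {
      assert(bo->refcount > 0);
      bo->refcount--;
   }
}

/* Point *slot at new_bo.  The new buffer is referenced before the old one
 * is released, which keeps the *slot == new_bo case safe. */
static void
set_params_bo(struct buffer **slot, struct buffer *new_bo)
{
   assert(slot != NULL);
   buffer_reference(new_bo);
   buffer_unreference(*slot);
   *slot = new_bo;
}
```

Because both the upload case and the indirect case go through the same setter in this sketch, no stale reference survives when the pointer moves to another buffer.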
Ian Romanick
a58ae20536 i965: Request lowering gl_VertexID
Fixes the (new) piglit tests gles-3.0-drawarrays-vertexid,
gl-3.0-multidrawarrays-vertexid, and gl-3.2-basevertex-vertexid.

Fixes gles3conform failure in:

ES3-CTS.gtf.GL3Tests.transform_feedback.transform_feedback_vertex_id

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80247
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 927f5db461)
2014-09-12 16:51:51 -07:00
Kenneth Graunke
80f93d6937 i965: Expose gl_BaseVertex via a vertex attribute.
Now that we have the data available, we need to expose it to the
shaders.  We can reuse the same vertex element that we use for
gl_VertexID, but we need to back it by an actual vertex buffer.

A hardware restriction requires that vertex attributes coming from a
buffer (STORE_SRC) must come before any other types (i.e. STORE_0).
So, we have to make gl_BaseVertex be the .x component of the vertex
attribute.  This means moving gl_VertexID to a different component.

I chose to move gl_VertexID and gl_InstanceID to the .z and .w
components, respectively, to make room for gl_BaseInstance in the .y
component (which would also come from a buffer, and therefore be
STORE_SRC).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit fbb353bc13)
2014-09-12 16:51:51 -07:00
Kenneth Graunke
860af662fa i965: Refactor Gen4-7 VERTEX_BUFFER_STATE emission into a helper.
We'll need to emit another VERTEX_BUFFER_STATE for gl_BaseVertex;
pulling this into a helper function will save us from having to deal
with cross-generation differences in that code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 87b10c4a71)
2014-09-12 16:51:51 -07:00
Kenneth Graunke
10aee701ae i965: Make gl_BaseVertex available in a buffer object.
This will be used for GL_ARB_shader_draw_parameters, as well as fixing
gl_VertexID, which is supposed to include gl_BaseVertex's value.

For indirect draws, we simply point at the indirect buffer; for normal
draws, we upload the value via the upload buffer.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit fdbabf22e1)
2014-09-12 16:51:51 -07:00
Kenneth Graunke
afe5db3293 i965: Calculate start/base_vertex_location after preparing vertices.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit c89306983c)
2014-09-12 16:51:51 -07:00
Ian Romanick
d9df31cc6e i965: Handle SYSTEM_VALUE_VERTEX_ID_ZERO_BASE
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 9975792abd)
2014-09-12 16:51:51 -07:00
Kenneth Graunke
f009cb080e mesa: Fix glGetActiveAttribute for gl_VertexID when lowered.
The lower_vertex_id pass converts uses of the gl_VertexID system value
to the gl_BaseVertex and gl_VertexIDMESA system values.  Since
gl_VertexID is no longer accessed, it would not be considered active.

Of course, it should be, since the shader uses gl_VertexID.

v2: Move the var->name dereference past the var != NULL check.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 26e949b26e)
2014-09-12 16:51:50 -07:00
Kenneth Graunke
09a763bea5 mesa: Replace string comparisons with SYSTEM_VALUE enum checks.
This is more efficient.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 26c9514155)
2014-09-12 16:51:50 -07:00
Ian Romanick
9c5ffa7f7a glsl: Add a lowering pass for gl_VertexID
Converts gl_VertexID to (gl_VertexIDMESA + gl_BaseVertex). gl_VertexIDMESA
is backed by SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, and gl_BaseVertex is backed
by SYSTEM_VALUE_BASE_VERTEX.

v2: Put the enum in struct gl_constants and properly resolve the scope
in C++ code.  Fix suggested by Marek.

v3: Rebase on Matt's foreach_in_list changes (was using foreach_list).

v4 (Ken): Use a systemvalue instead of a uniform because
STATE_BASE_VERTEX has been removed.

v5: Use a boolean to select lowering, and only allow one lowering
method.  Suggested by Ken.

v6 (Ken): Replace strcmp against literal "gl_BaseVertex"/"gl_VertexID"
with SYSTEM_VALUE enum checks, for efficiency.

v7: Rebase on context constant initialization work.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ec08b5e768)
2014-09-12 16:51:50 -07:00
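The arithmetic behind the lowering in 9c5ffa7f7a is small enough to model directly. The function below is a hypothetical illustration in plain C, not the GLSL IR pass itself: it only shows the value a shader would observe for gl_VertexID once it is rebuilt from the zero-based ID and the base vertex.

```c
#include <assert.h>

/* Illustrative model only: after lowering, gl_VertexID is computed as
 * gl_VertexIDMESA (the zero-based ID) plus gl_BaseVertex. */
static int
lowered_vertex_id(int vertex_id_zero_base, int base_vertex)
{
   assert(vertex_id_zero_base >= 0);
   return vertex_id_zero_base + base_vertex;
}
```

For example, the first vertex of a draw with a base vertex of 100 would see 100 rather than 0.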
Ian Romanick
31414ada14 glsl/linker: Make get_main_function_signature public
The next patch will use this function in a different file.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 04d3323d4b)
2014-09-12 16:51:50 -07:00
Ian Romanick
002c284fb4 mesa: Add SYSTEM_VALUE_BASE_VERTEX
This system value represents the basevertex value passed to
glDrawElementsBaseVertex and related functions.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 1e87fbd78f)
2014-09-12 16:51:50 -07:00
Ian Romanick
73192345c3 mesa: Add SYSTEM_VALUE_VERTEX_ID_ZERO_BASE
There exists hardware, such as i965, that does not implement the OpenGL
semantic for gl_VertexID.  Instead, that hardware does not include the
value of basevertex in the gl_VertexID value.
SYSTEM_VALUE_VERTEX_ID_ZERO_BASE is the system value that represents
this semantic.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 5964a4f344)
2014-09-12 16:46:28 -07:00
Ian Romanick
6bc4331c8e mesa: Document SYSTEM_VALUE_VERTEX_ID and SYSTEM_VALUE_INSTANCE_ID
v2: Additions to the documentation for SYSTEM_VALUE_VERTEX_ID.  Quote
the GL_ARB_shader_draw_parameters spec and mention DirectX SV_VertexID.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 9afb5ae8ca)
2014-09-12 16:46:25 -07:00
Matt Turner
72d8ebb7fb i965/vec4: Reswizzle sources when necessary.
Despite the comment above the function claiming otherwise, the function
did not reswizzle sources, which would lead to bad code generation since
commit 04895f5c, which began claiming we could do such swizzling when we
could not.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82932
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 1ee1d8ab46)
2014-09-10 10:58:46 -07:00
Jonathan Gray
9f67c26d1b configure.ac: strip _GNU_SOURCE from llvm-config output
Mesa already defines _GNU_SOURCE for glibc based systems and defining
_GNU_SOURCE will break the Mesa build on other systems such as OpenBSD.

_GNU_SOURCE only seems to be included in llvm-config output when
LLVM is built via autoconf and not when it is built by cmake.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
(cherry picked from commit c68073e65f)
2014-09-09 21:39:00 +01:00
Emil Velikov
07426ad102 configure: enable the gallium loader only when needed
With the gallium megadrivers we've converted most ST to optionally
use either statically linked in or shared pipe-drivers.

The hardcoded switch forgot to conditionally enable the build of the
shared pipe-drivers, which resulted in them always being built.

Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Cc: James Ausmus <james.ausmus@intel.com>
Reported-by: James Ausmus <james.ausmus@intel.com>
Tested-by: James Ausmus <james.ausmus@intel.com>
Bugzilla: https://code.google.com/p/chromium/issues/detail?id=412089
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 44ec468e80)
2014-09-09 21:38:53 +01:00
Emil Velikov
414de21449 configure: bail out if building svga without libdrm
With a recent commit we removed the NEED_NONNULL_WINSYS checks when
selecting the hardware (inc svga) winsys. svga has only one winsys
that explicitly requires libdrm (via its bundled version of
vmwgfx_drm.h) but configure.ac never really checks for it.

Add the check early to prevent people from shooting themselves when
they select the driver but lack libdrm.

$ ./autogen.sh --disable-dri --disable-egl --disable-gallium-llvm
--with-dri-drivers=swrast --with-gallium-drivers=svga,swrast

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82539
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 40bb6f9313)
2014-09-09 21:38:47 +01:00
Ilia Mirkin
31adc40680 nv50/ir: avoid array overrun when checking for supported mods
Reported by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 874a9396c5)
2014-09-09 21:38:40 +01:00
Kenneth Graunke
a318e2f383 i965: Handle ir_binop_ubo_load in boolean expression code.
UBO loads can be boolean-valued expressions, too, so we need to handle
them in emit_bool_to_cond_code() and emit_if_gen6().

However, unlike most expressions, it doesn't make sense to evaluate
their operands, then do something with the results.  We just want to
evaluate the UBO load as a whole---which performs the read from
memory---then load the boolean result into the flag register.

Instead of adding code to handle it, we can simply bypass the
ir_expression handling, and fall through to the default code, which will
do exactly that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83468
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit a20cc2796f)
2014-09-09 21:38:33 +01:00
Kenneth Graunke
3a49ccc134 i965: Handle ir_triop_csel in emit_if_gen6().
ir_triop_csel can return a boolean expression, so we need to handle it
here; we simply forgot when we added ir_triop_csel, and forgot again
when adding it to emit_bool_to_cond_code.

Fixes Piglit's EXT_shader_integer_mix/{vs,fs}-mix-if-bool on Sandybridge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 6272e60ca3)
2014-09-09 21:38:07 +01:00
Ulrich Weigand
b148cd6586 gallivm: Fix Altivec pack intrinsics for little-endian
This patch fixes use of Altivec pack intrinsics on little-endian PowerPC
systems.  Since little-endian operation only affects the load and store
instructions, the semantics of pack (and other) instructions that take
two input vectors implicitly change: the pack instructions still fill
a register placing values from the first operand into the "high" parts
of the register, and values from the second operand into the "low" parts
of the register, but since vector loads and stores perform an endian swap,
the high parts end up at high memory addresses.

To still achieve the desired effect, we have to swap the two inputs to
the pack instruction on little-endian systems.  This is done automatically
by the back-end for instructions generated by LLVM, but needs to be done
manually when emitting intrinsics (which still result in that instruction
being emitted directly).

Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Signed-off-by: Maarten Lankhorst <dev@mblankhorst.nl>
(cherry picked from commit 0feb977bbf)
Nominated-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2014-09-08 17:14:44 +01:00
Christian König
7fb0fed989 mesa/st: don't advertise NV_vdpau_interop if it doesn't work.
As long as we don't have a workaround for frame based
decoding in VDPAU we should not advertise NV_vdpau_interop.

v2: fix commit message, check if get_video_param is present

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 12fb74fe89)
2014-09-08 17:05:44 +01:00
Kristian Høgsberg
8e551f4220 i965: Adjust fast-clear resolve rect for BDW
The scale factors for the resolve rectangle change for BDW and we have
to look at brw->gen now to figure out how big it should be.

Fixes: https://bugs.freedesktop.org/attachment.cgi?id=105777
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 2d6d3461d3)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83046
2014-09-08 17:05:26 +01:00
Christoph Bumiller
bb06f2cd93 nvc0/ir: clarify recursion fix to finding first tex uses
This is a simple shader for reproducing the case mentioned:

FRAG
DCL IN[0], GENERIC[0], PERSPECTIVE
DCL OUT[0], COLOR
DCL SAMP[0]
DCL CONST[0]
DCL TEMP[0..1], LOCAL
IMM[0] FLT32 {    0.0000,    -1.0000,     1.0000,     0.0000}
  0: MOV TEMP[0].x, CONST[0].wwww
  1: MOV TEMP[1].x, CONST[0].wwww
  2: BGNLOOP
  3:   IF TEMP[0].xxxx
  4:     BRK
  5:   ENDIF
  6:   ADD TEMP[0].x, TEMP[0], IMM[0].zzzz
  7:   IF CONST[0].xxxx
  8:     TEX TEMP[1].x, CONST[0], SAMP[0], 2D
  9:   ENDIF
 10:   IF CONST[0].zzzz
 11:     MOV TEMP[1].x, CONST[0].zzzz
 12:   ENDIF
 13: ENDLOOP
 14: MOV OUT[0], TEMP[1].xxxx
 15: END

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit ca9ab05d45)
2014-09-08 17:03:21 +01:00
Christoph Bumiller
d3745890c6 nv50/ir/util: fix BitSet issues
BitSet::allocate() is being used with the expectation that it would
leave the bitfield untouched if its size hasn't changed, however,
the function always zeroed the last word, which led to obscure bugs
with live set computation.

This also fixes BitSet::resize(), which was broken, but luckily not
being used.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit b9f9e3ce03)
2014-09-08 17:03:16 +01:00
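The expectation stated in d3745890c6, that allocating at an unchanged size leaves existing bits alone, can be sketched in C as below. This is not the nv50 BitSet class (which is C++); the struct, the function names, and the choice to discard contents when growing are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical C model of a bit set; size is counted in bits. */
struct bitset {
   uint32_t *data;
   unsigned size;
};

static unsigned
words_for(unsigned nbits)
{
   return (nbits + 31) / 32;
}

/* Make room for nbits bits.  When the storage is already large enough and
 * zero is false, the existing words are left untouched. */
static bool
bitset_allocate(struct bitset *set, unsigned nbits, bool zero)
{
   assert(set != NULL && nbits > 0);
   size_t bytes = words_for(nbits) * sizeof(uint32_t);

   if (set->data == NULL || set->size < nbits) {
      /* First use or growth: start from fresh, zeroed storage. */
      free(set->data);
      set->data = calloc(1, bytes);
      if (set->data == NULL) {
         set->size = 0;
         return false;
      }
   } else if (zero) {
      memset(set->data, 0, bytes);
   }
   set->size = nbits;
   return true;
}
```

A caller that sets a bit and then calls bitset_allocate() again with the same size and zero set to false should still find that bit set afterwards.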
Jason Ekstrand
7a2018b968 i965/blorp: Pass image formats separately from the miptree
When a texture is wrapped in a texture view, we can't trust the format in
the miptree itself.  This patch allows us to pass the format separately
through blorp so we can properly handle wrapped textures.

It's worth noting here that we can use the miptree format directly for
depth/stencil formats because they cannot be reinterpreted by a texture
view.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
CC: "10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
(cherry picked from commit 7599886b26)
2014-09-08 17:00:54 +01:00
Emil Velikov
4e1ca4a190 Increment version to 10.3.0-rc3
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-05 17:00:40 +01:00
Marek Olšák
06f1f1ea81 st/mesa: use 1.0f as boolean true on drivers without integer support
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82882

Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 1a00f24751)
2014-09-05 16:32:48 +01:00
Marek Olšák
e842a02df3 mesa: set UniformBooleanTrue = 1.0f by default
because NativeIntegers is 0 by default.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82882

Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit d67db73458)
2014-09-05 16:31:58 +01:00
Rob Clark
96bca3617c freedreno/ir3: fix potential null ptr deref
Fix potential segfault in debug code.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit c06afcede2)
2014-09-05 16:28:51 +01:00
Rob Clark
c221e96a13 freedreno/a2xx: fix segfault
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 306e421887)
2014-09-05 16:28:20 +01:00
Rob Clark
640ddefd96 freedreno/a3xx: handle first/last level properly
Fixes some assumptions about first_level being zero.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit bd3b096467)
2014-09-05 16:28:04 +01:00
Rob Clark
7cd0fa023e freedreno: implement pipe_flush_resource()
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit b40a6c2b17)
2014-09-05 16:27:55 +01:00
Rob Clark
cd94c64421 freedreno: don't ignore src/dst level
Don't ignore src/dst_level in pipe_copy_region.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 478a08ebd2)
2014-09-05 16:06:36 +01:00
Jonathan Gray
e9923b2194 automake: check if the linker supports --dynamic-list
Older versions of GNU ld did not support --dynamic-list, so check
whether it is supported before using it.  Non-GNU linkers such as the
Apple one likely lack this option as well.

Fixes the build on OpenBSD which has binutils 2.15 and 2.17.
The --dynamic-list option seems to have been introduced sometime after
binutils 2.17 was released as it is present in 2.18.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 635477dc4b)
2014-09-05 15:45:46 +01:00
Andreas Pokorny
2e56334a2a kms-swrast: Support Prime fd handling
Allows using prime fds as display target and from display target.
Test for PRIME capability after initializing kms_swrast screen.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Andreas Pokorny <andreas.pokorny@canonical.com>
(cherry picked from commit 8bcd57a46c)
2014-09-05 15:45:46 +01:00
Marek Olšák
ead7f72a2c r600g,radeonsi: make sure there's enough CS space before resuming queries
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83432

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 3dbf55c1be)
2014-09-05 15:45:45 +01:00
Marek Olšák
139d176f54 mesa: invalidate draw state in glPopClientAttrib
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82538

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 374f3e9e19)
2014-09-05 15:45:45 +01:00
Thomas Hellstrom
941b2ae35f winsys/svga: Fix incorrect type usage in IOCTL v2
While similar in layout, the size of the SVGA3dSize type may be smaller than
the struct drm_vmw_size type that is part of the ioctl interface. The kernel
driver could accordingly overwrite a memory area following the size variable
on the stack. Typically that would be another local variable, causing
breakage in, for example, Ubuntu 12.04.5 where the handle local variable
becomes overwritten.

v2: Fix whitespace errors

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Cc: "10.1 10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2d6206140a)
2014-09-05 15:45:45 +01:00
Kenneth Graunke
4b38838ef4 i965: Handle ir_triop_csel in emit_bool_to_cond_code().
ir_triop_csel can return a boolean expression, so we need to handle it
here; we simply forgot when we added it.

Fixes Piglit's EXT_shader_integer_mix/{vs,fs}-mix-if-bool.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8270b048cf)
2014-09-05 15:43:08 +01:00
tiffany
3fdd08c9b4 glsl: fix assertion which fails for unsigned array indices.
According to the GLSL 1.40 spec, section 5.7 Structure and Array Operations:

"Array elements are accessed using an expression whose type is int or uint."

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit cfc42db592)
2014-09-05 14:44:22 +01:00
Jason Ekstrand
f8ff31e528 i965/copy_image: Divide the x offsets by block width when using the blitter
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82804
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 11ee9a4d99)
2014-09-05 14:43:53 +01:00
Jason Ekstrand
ab53a29892 i965/copy_image: Use the correct block dimension
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82804
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 499acf6e4a)
2014-09-05 14:42:56 +01:00
Jason Ekstrand
4073e96a3b meta/copy_image: Use the correct texture level when creating views
Previously, we were accidentally assuming that the level of both textures
was 0.  Now we actually use the correct level in our hacked texture view.
This doesn't 100% fix the meta path because the texture type is getting
lost somewhere in the pipeline.  However, it actually copies to/from the
correct layer now.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82804
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b608cd7fbf)
2014-09-05 14:42:36 +01:00
Jason Ekstrand
4eed41b967 i965/copy_image: Use the correct texture level
Previously, we were using the source image's level for both source and
destination.  Also, we weren't taking the MinLevel from a potential texture
view into account.  This commit fixes both problems.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82804
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit fcb6d5b9ef)
2014-09-05 14:41:47 +01:00
Marek Olšák
c546523b4d r600g: fix alpha-test with HyperZ enabled, fixing L4D2 tree corruption
*_update_db_shader_control depends on the alpha test state. The problem was
it was in a block which is only entered if the pixel shader is changed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74863

Cc: mesa-stable@lists.freedesktop.org
Tested-by: Benjamin Bellec <b.bellec@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 8abdc3c4a9)
2014-09-05 14:40:45 +01:00
Kristian Høgsberg
282a3098e6 meta: Make MESA_META_DRAW_BUFFERS restore properly
A meta begin/end pair with MESA_META_DRAW_BUFFERS will change visible GL
state.  We recreate the draw buffer enums from the buffer bitfield, which
changes GL_BACK to GL_BACK_LEFT (and GL_FRONT to GL_FRONT_LEFT).

This commit modifies the save/restore logic to instead copy the buffer enums
from the gl_framebuffer and then set them on restore using
_mesa_drawbuffers().

It's not clear how this breaks the benchmark in 82796, but fixing meta to not
leak the state change fixes the regression.

No piglit regressions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=82796
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8f55174fbd)
2014-09-05 14:36:43 +01:00
Emil Velikov
ec4a333c37 Revert "mesa: fix make tarballs"
This reverts commit 0fbb9a599d.

Rather than adding hacks around the issue drop the sources from the
final tarball, and re-add them back with 'make dist'. This fixes a
problem when running parallel 'make install' fails as it recreates
sources and triggers partial recompilation.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83355
Reported-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
(cherry picked from commit 5a4e0f3873)
2014-09-05 14:04:52 +01:00
Dave Airlie
35bb6b058c i965: add missing parens in vec4 visitor
Coverity reported this; Matt said it looks like missing parens,
not bad indenting, so let's try that.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 94a909ec2d)
2014-09-05 14:04:48 +01:00
Ilia Mirkin
24e226d0f5 nv50: attach the buffer bo to the miptree structures
The current code... makes no sense. Use nouveau_bo_ref to attach the bo
to the exposed resource so as to have the proper lifetime guarantees.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2c44043313)
2014-09-05 14:04:48 +01:00
Ilia Mirkin
39ad62ce51 nv50: mt address may not be the underlying bo's start address
With VP2, nv50_miptree is faked because the underlying bo's have to be
laid out in a certain way. This is done by adjusting the address. Make
sure that blits (and everything else for consistency) use the mt address
rather than the bo address as a base.

This fixes retrieving chroma plane with VDPAU.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82255
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9d52e551a5)
2014-09-05 14:04:48 +01:00
Ilia Mirkin
f2b2309281 nv50: set the miptree address when clearing bo's in vp2 init
The mt address is about to be used more, make sure it's set
appropriately.

Reported-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2528d402b9)
2014-09-05 14:04:47 +01:00
Ilia Mirkin
a4b3c4e3ec nv50/ir: avoid creating instructions that can't be emitted
When constant folding a MAD operation, we first fold the multiply and
generate an ADD. However we do so without making sure that the immediate
can be handled in the saturate case. If it can't, load the immediate in
a separate instruction.

Reported-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6c2b079231)
2014-09-05 14:04:47 +01:00
Ilia Mirkin
01dda9d0bd nvc0: don't make 1d staging textures linear
Experimentally, the sampler doesn't appear to like these, neither as
buffer nor as rect textures. So remove 1D from the list of texture types
to make linear when used for staging.

This fixes the OSD in mplayer for VDPAU.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 115d9a5525)
2014-09-05 14:04:47 +01:00
Ilia Mirkin
49cd42aab1 nv50: zero out unbound samplers
Samplers are only defined up to num_samplers, so set all samplers above
nr to NULL so that we don't try to read them again later.

Tested-by: Christian Ruppert <idl0r@qasl.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 362cd26960)
2014-09-05 14:04:47 +01:00
Ilia Mirkin
eaa9e14ce5 nvc0/ir: avoid infinite recursion when finding first uses of tex
In certain circumstances, findFirstUses could end up doubling back on
instructions it had already processed, resulting in an infinite
recursion. Avoid this by keeping track of already-visited instructions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83079
Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c4bb436f76)
2014-09-05 14:04:46 +01:00
Marek Olšák
58be4ab741 r600g: fix layered clear
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit d159c5e3e0)
2014-09-05 14:04:46 +01:00
Marek Olšák
447785af9d glsl_to_tgsi: allocate and enlarge arrays for temporaries on demand
This fixes crashes if the number of temporaries is greater than 4096.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66184

v2: added fail paths for realloc failures

Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 482def592f)
2014-09-05 14:04:46 +01:00
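Growing an array on demand with a failure path, as 447785af9d describes for the temporaries arrays, follows a common C pattern. The sketch below is a generic version with hypothetical names and an assumed starting capacity of 16; it is not the glsl_to_tgsi code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical growable array of ints. */
struct int_array {
   int *data;
   size_t capacity;
};

/* Make sure 'index' is a valid slot, doubling the capacity as needed and
 * zero-filling the new tail.  On allocation failure the array is left
 * unchanged and false is returned. */
static bool
ensure_index(struct int_array *arr, size_t index)
{
   assert(arr != NULL);
   if (index < arr->capacity)
      return true;

   size_t new_cap = arr->capacity ? arr->capacity : 16;
   while (new_cap <= index)
      new_cap *= 2;

   int *p = realloc(arr->data, new_cap * sizeof(int));
   if (p == NULL)
      return false;

   for (size_t i = arr->capacity; i < new_cap; i++)
      p[i] = 0;
   arr->data = p;
   arr->capacity = new_cap;
   return true;
}
```

Assigning the realloc result to a temporary first is what makes the failure path safe: the original pointer stays valid if the allocation fails.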
Emil Velikov
390a9f6cb7 Increment version to 10.3.0-rc2
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-01 00:23:50 +01:00
Emil Velikov
0fbb9a599d mesa: fix make tarballs
Current method of generating distribution tar-balls involves manually
invoking make + target name in the appropriate places. This temporary
solution is used until we get 'make dist' working.

Currently it does not work, as in order to have the target (which is
also a filename) available in the final Makefile we need to add a PHONY
target + use the correct target name.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 88cbe3908f)
2014-09-01 00:23:45 +01:00
Matt Turner
2310a4b4cf i965/vec4: Update register coalescing test.
In commit 04895f5c I added support for reswizzling writemasks. This test
was checking that we didn't support this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82881
(cherry picked from commit 8b5ac1df17)
2014-08-31 19:12:42 +01:00
Kenneth Graunke
8ef3d4fe03 i965: Add 2x MSAA support to Broadwell fast clear code.
According to the cited documentation section (but in the newer docs),
x_scaledown is the same for 2x and 4x MSAA.

+47 piglits.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83081
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e34a363a78)
2014-08-31 19:07:04 +01:00
Christian König
0c67167370 radeon/uvd: fix field handling on R6XX style UVD
The first UVD generation can only do frame based output.

Signed-off-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 80771e47b6)
Nominated-by: Alex Deucher <alexdeucher@gmail.com>
2014-08-28 23:01:44 +01:00
Christian König
60f136eed9 vl/compositor: set the scissor before clearing the render target
Otherwise we clear areas that shouldn't be cleared.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 03a99ba9e4)
2014-08-26 21:04:00 +01:00
Christian König
d2fb1da46d st/vdpau: fix vlVdpOutputSurfaceRender(Output|Bitmap)Surface
Correctly handle the fact that the source_surface is optional.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80561

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit b73c20759f)
2014-08-26 21:03:47 +01:00
Carl Worth
627d31dc36 glcpp: Don't use alternation in the lookahead for empty pragmas.
We've found that there's a buffer overrun bug in flex that's triggered by
using alternation in a lookahead pattern.

Fortunately, we don't need to match the exact {NEWLINE} expression to
detect an empty pragma. It suffices to verify that there are no non-space
characters before any newline character. So we can use a simple [\r\n] to
get the desired behavior while avoiding the flex bug.

Fixes the regression of piglit's 17000-consecutive-chars-identifier test
(which has been crashing since commit 04e40fd337).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82472
Signed-off-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 23163df24c)
2014-08-25 22:32:10 +01:00
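The simplified check described in the commit above can be illustrated outside of flex. The following is a minimal shell sketch, not the glcpp lexer rule itself: the function name `is_empty_pragma` is hypothetical, and it assumes a pragma counts as empty when nothing but spaces or tabs follows the `pragma` keyword before the first `\r` or `\n`.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mesa): succeed if the first line of the
# argument is a '#pragma' directive with nothing but blanks after the keyword.
is_empty_pragma() {
    # Treat CR like LF, then keep only the text before the first line break.
    first=$(printf '%s' "$1" | tr '\r' '\n' | sed -n '1p')
    # Optional blanks, '#', optional blanks, 'pragma', optional blanks, end.
    printf '%s\n' "$first" |
        grep -Eq '^[[:blank:]]*#[[:blank:]]*pragma[[:blank:]]*$'
}
```

For example, `is_empty_pragma "#pragma   " && echo empty` takes the success branch, while a directive such as `#pragma once` does not. Only a single-character class (`\r` or `\n`) decides where the line ends, which is the point of the commit: no alternation is needed to detect the empty case.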
Carl Worth
e4f54d8b47 Makefile: Switch from md5sums to sha256sums
We switched to these several stable releases ago (since the MD5 algorithm has
been broken for some time), but only now did I get around to fixing this in
the Makefile rather than just performing this step manually.

CC: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 46d03d37bf)
2014-08-25 22:31:44 +01:00
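The manual step that the commit above automates can be sketched as a small shell function. This is an illustration only: the function name `print_checksums` and any archive names passed to it are hypothetical, and it assumes the GNU coreutils `sha256sum` tool is installed.

```shell
#!/bin/sh
# Hypothetical helper: print one SHA-256 checksum line per archive.
# It mirrors the spirit of the Makefile's '@-sha256sum' recipe lines, where
# the leading '-' tells make to carry on even if one archive is missing.
print_checksums() {
    for archive in "$@"; do
        if [ -f "$archive" ]; then
            sha256sum "$archive"
        else
            echo "skipping missing archive: $archive" >&2
        fi
    done
}
```

A call such as `print_checksums MesaLib-10.3.0.tar.gz MesaLib-10.3.0.tar.bz2 MesaLib-10.3.0.zip` would print lines in the same `<hash> <file>` layout that the release notes use for their checksum section.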
Alex Deucher
2edc941e75 radeonsi: add new SI pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 153df68834)
2014-08-25 22:31:19 +01:00
Alex Deucher
eb96819386 radeonsi: add new CIK pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit f50b6b4895)
2014-08-25 22:31:04 +01:00
Kenneth Graunke
f2a1b7d508 i965: Disable try_emit_b2f_of_compare on Gen4-6.
The optimization relies on CMP setting the destination to 0, which is
equivalent to 0.0f.  However, early platforms only set the least
significant byte, leaving the other bits undefined.  So, we must disable
the optimization on those platforms.

Oddly, Sandybridge wasn't reported as broken.  The PRM states that it
only sets the LSB, but the internal documentation says that it follows
the IVB behavior.  Since it wasn't reported as broken, we believe it
really does follow the IVB behavior.

v2: Allow the optimization on Sandybridge (requested by Matt).

+32 piglits on Ironlake.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79963
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 97d03b9366)
2014-08-22 11:43:25 -07:00
Matt Turner
53728f60aa i965: Fix JIP/UIP calculations.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82846
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82929
(cherry picked from commit d77f5603a5)
2014-08-22 09:31:22 -07:00
Carl Worth
04c3c03682 Increment version to 10.3.0-rc1
2014-08-21 08:36:46 -07:00
134 changed files with 2194 additions and 539 deletions

@@ -81,6 +81,7 @@ SUBDIRS := \
src/mapi \
src/glsl \
src/mesa \
src/util \
src/egl/main
ifeq ($(strip $(MESA_BUILD_CLASSIC)),true)

@@ -64,14 +64,13 @@ IGNORE_FILES = \
parsers: configure
$(MAKE) -C src/glsl glsl_parser.cpp glsl_parser.h glsl_lexer.cpp glcpp/glcpp-lex.c glcpp/glcpp-parse.c glcpp/glcpp-parse.h
$(MAKE) -C src/mesa program/lex.yy.c program/program_parse.tab.c program/program_parse.tab.h
# Everything for a new Mesa release:
ARCHIVES = $(PACKAGE_NAME).tar.gz \
$(PACKAGE_NAME).tar.bz2 \
$(PACKAGE_NAME).zip
tarballs: md5
tarballs: checksums
rm -f ../$(PACKAGE_DIR) $(PACKAGE_NAME).tar
manifest.txt: .git
@@ -98,9 +97,9 @@ $(PACKAGE_NAME).zip: parsers ../$(PACKAGE_DIR) manifest.txt
zip -q -@ $(PACKAGE_NAME).zip < $(PACKAGE_DIR)/manifest.txt ; \
mv $(PACKAGE_NAME).zip $(PACKAGE_DIR)
md5: $(ARCHIVES)
@-md5sum $(PACKAGE_NAME).tar.gz
@-md5sum $(PACKAGE_NAME).tar.bz2
@-md5sum $(PACKAGE_NAME).zip
checksums: $(ARCHIVES)
@-sha256sum $(PACKAGE_NAME).tar.gz
@-sha256sum $(PACKAGE_NAME).tar.bz2
@-sha256sum $(PACKAGE_NAME).zip
.PHONY: tarballs md5

@@ -1 +1 @@
10.3.0-devel
10.3.1

@@ -355,6 +355,24 @@ AC_LINK_IFELSE(
LDFLAGS=$save_LDFLAGS
AM_CONDITIONAL(HAVE_LD_VERSION_SCRIPT, test "$have_ld_version_script" = "yes")
dnl
dnl Check if linker supports dynamic list files
dnl
AC_MSG_CHECKING([if the linker supports --dynamic-list])
save_LDFLAGS=$LDFLAGS
LDFLAGS="$LDFLAGS -Wl,--dynamic-list=conftest.dyn"
cat > conftest.dyn <<EOF
{
radeon_drm_winsys_create;
};
EOF
AC_LINK_IFELSE(
[AC_LANG_SOURCE([int main() { return 0;}])],
[have_ld_dynamic_list=yes;AC_MSG_RESULT(yes)],
[have_ld_dynamic_list=no; AC_MSG_RESULT(no)])
LDFLAGS=$save_LDFLAGS
AM_CONDITIONAL(HAVE_LD_DYNAMIC_LIST, test "$have_ld_dynamic_list" = "yes")
dnl
dnl compatibility symlinks
dnl
@@ -802,6 +820,11 @@ fi
AM_CONDITIONAL(HAVE_SHARED_GLAPI, test "x$enable_shared_glapi" = xyes)
# Build the pipe-drivers as separate libraries/modules.
# Do not touch this unless you know what you are doing.
# XXX: Expose via configure option ?
enable_shared_pipe_drivers=no
dnl
dnl Driver specific build directories
dnl
@@ -822,7 +845,7 @@ esac
if test "x$enable_dri" = xyes; then
GALLIUM_WINSYS_DIRS="$GALLIUM_WINSYS_DIRS sw/dri"
GALLIUM_STATE_TRACKERS_DIRS="dri $GALLIUM_STATE_TRACKERS_DIRS"
enable_gallium_loader=yes
enable_gallium_loader="$enable_shared_pipe_drivers"
fi
if test "x$enable_gallium_osmesa" = xyes; then
@@ -1295,7 +1318,8 @@ if test "x$enable_gallium_egl" = xyes; then
GALLIUM_STATE_TRACKERS_DIRS="egl $GALLIUM_STATE_TRACKERS_DIRS"
GALLIUM_TARGET_DIRS="$GALLIUM_TARGET_DIRS egl-static"
# enable_gallium_loader=yes
# XXX: Uncomment once converted to use static/shared pipe-drivers
# enable_gallium_loader=$enable_shared_pipe_drivers
fi
AM_CONDITIONAL(HAVE_GALLIUM_EGL, test "x$enable_gallium_egl" = xyes)
@@ -1324,7 +1348,7 @@ if test "x$enable_gallium_gbm" = xyes; then
GALLIUM_STATE_TRACKERS_DIRS="gbm $GALLIUM_STATE_TRACKERS_DIRS"
GALLIUM_TARGET_DIRS="$GALLIUM_TARGET_DIRS gbm"
enable_gallium_loader=yes
enable_gallium_loader=$enable_shared_pipe_drivers
fi
AM_CONDITIONAL(HAVE_GALLIUM_GBM, test "x$enable_gallium_gbm" = xyes)
@@ -1341,7 +1365,7 @@ if test "x$enable_xa" = xyes; then
Example: ./configure --enable-xa --with-gallium-drivers=svga...])
fi
GALLIUM_STATE_TRACKERS_DIRS="xa $GALLIUM_STATE_TRACKERS_DIRS"
enable_gallium_loader=yes
enable_gallium_loader=$enable_shared_pipe_drivers
fi
AM_CONDITIONAL(HAVE_ST_XA, test "x$enable_xa" = xyes)
@@ -1389,7 +1413,7 @@ fi
if test "x$enable_xvmc" = xyes; then
PKG_CHECK_MODULES([XVMC], [xvmc >= $XVMC_REQUIRED x11-xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
GALLIUM_STATE_TRACKERS_DIRS="$GALLIUM_STATE_TRACKERS_DIRS xvmc"
enable_gallium_loader=yes
enable_gallium_loader=$enable_shared_pipe_drivers
fi
AM_CONDITIONAL(HAVE_ST_XVMC, test "x$enable_xvmc" = xyes)
@@ -1397,14 +1421,14 @@ if test "x$enable_vdpau" = xyes; then
PKG_CHECK_MODULES([VDPAU], [vdpau >= $VDPAU_REQUIRED x11-xcb xcb-dri2 >= $XCBDRI2_REQUIRED],
[VDPAU_LIBS="`$PKG_CONFIG --libs x11-xcb xcb-dri2`"])
GALLIUM_STATE_TRACKERS_DIRS="$GALLIUM_STATE_TRACKERS_DIRS vdpau"
enable_gallium_loader=yes
enable_gallium_loader=$enable_shared_pipe_drivers
fi
AM_CONDITIONAL(HAVE_ST_VDPAU, test "x$enable_vdpau" = xyes)
if test "x$enable_omx" = xyes; then
PKG_CHECK_MODULES([OMX], [libomxil-bellagio >= $LIBOMXIL_BELLAGIO_REQUIRED x11-xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
GALLIUM_STATE_TRACKERS_DIRS="$GALLIUM_STATE_TRACKERS_DIRS omx"
enable_gallium_loader=yes
enable_gallium_loader=$enable_shared_pipe_drivers
fi
AM_CONDITIONAL(HAVE_ST_OMX, test "x$enable_omx" = xyes)
@@ -1456,6 +1480,7 @@ if test "x$enable_opencl" = xyes; then
GALLIUM_STATE_TRACKERS_DIRS="$GALLIUM_STATE_TRACKERS_DIRS clover"
GALLIUM_TARGET_DIRS="$GALLIUM_TARGET_DIRS opencl"
# XXX: Use $enable_shared_pipe_drivers once converted to use static/shared pipe-drivers
enable_gallium_loader=yes
if test "x$enable_opencl_icd" = xyes; then
@@ -1630,6 +1655,7 @@ strip_unwanted_llvm_flags() {
# Use \> (marks the end of the word)
echo `$1` | sed \
-e 's/-DNDEBUG\>//g' \
-e 's/-D_GNU_SOURCE\>//g' \
-e 's/-pedantic\>//g' \
-e 's/-Wcovered-switch-default\>//g' \
-e 's/-O.\>//g' \
@@ -1678,11 +1704,10 @@ if test "x$enable_gallium_llvm" = xyes; then
AC_COMPUTE_INT([LLVM_VERSION_MINOR], [LLVM_VERSION_MINOR],
[#include "${LLVM_INCLUDEDIR}/llvm/Config/llvm-config.h"])
dnl In LLVM 3.4.1 patch level was defined in config.h and not
dnl llvm-config.h
AC_COMPUTE_INT([LLVM_VERSION_PATCH], [LLVM_VERSION_PATCH],
[#include "${LLVM_INCLUDEDIR}/llvm/Config/config.h"],
LLVM_VERSION_PATCH=0) dnl Default if LLVM_VERSION_PATCH not found
LLVM_VERSION_PATCH=`echo $LLVM_VERSION | cut -d. -f3 | egrep -o '^[[0-9]]+'`
if test -z "$LLVM_VERSION_PATCH"; then
LLVM_VERSION_PATCH=0
fi
if test -n "${LLVM_VERSION_MAJOR}"; then
LLVM_VERSION_INT="${LLVM_VERSION_MAJOR}0${LLVM_VERSION_MINOR}"
@@ -1756,6 +1781,7 @@ dnl
dnl Gallium Tests
dnl
if test "x$enable_gallium_tests" = xyes; then
# XXX: Use $enable_shared_pipe_drivers once converted to use static/shared pipe-drivers
enable_gallium_loader=yes
fi
AM_CONDITIONAL(HAVE_GALLIUM_TESTS, test "x$enable_gallium_tests" = xyes)
@@ -1889,6 +1915,9 @@ if test -n "$with_gallium_drivers"; then
case "x$driver" in
xsvga)
HAVE_GALLIUM_SVGA=yes
if test "x$have_libdrm" != xyes; then
AC_MSG_ERROR([Building svga requires libdrm >= $LIBDRM_REQUIRED])
fi
GALLIUM_DRIVERS_DIRS="$GALLIUM_DRIVERS_DIRS svga softpipe"
gallium_require_drm_loader
gallium_check_st "svga/drm" "dri/vmwgfx" "xa/vmwgfx"
@@ -2051,9 +2080,7 @@ AM_CONDITIONAL(NEED_GALLIUM_SOFTPIPE_DRIVER, test "x$HAVE_GALLIUM_SVGA" = xyes -
AM_CONDITIONAL(NEED_GALLIUM_LLVMPIPE_DRIVER, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes \
&& test "x$MESA_LLVM" = x1)
# Enable static gallium targets for now.
# Do not touch this unless you know what you are doing.
AM_CONDITIONAL(HAVE_GALLIUM_STATIC_TARGETS, test "xyes" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_STATIC_TARGETS, test "x$enable_shared_pipe_drivers" = xno)
# NOTE: anything using xcb or other client side libs ends up in separate
# _CLIENT variables. The pipe loader is built in two variants,
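The new `LLVM_VERSION_PATCH` computation in the configure.ac hunk above can be tried in isolation. The following is a minimal sketch under stated assumptions: the function name `llvm_patch_level` is hypothetical, the version string is passed in as an argument instead of coming from `llvm-config`, and `grep -Eo` stands in for the `egrep -o` call in the diff (the doubled brackets there are m4 quoting for a plain `[0-9]` class).

```shell
#!/bin/sh
# Hypothetical helper: extract the patch level from a version string such as
# "3.4.1" or "3.5.0svn", defaulting to 0 when there is no third component.
llvm_patch_level() {
    # Third dot-separated field, reduced to its leading digits.
    # '|| true' keeps a non-matching grep from aborting under 'set -e'.
    patch=$(printf '%s\n' "$1" | cut -d. -f3 | grep -Eo '^[0-9]+' || true)
    if [ -z "$patch" ]; then
        patch=0
    fi
    printf '%s\n' "$patch"
}
```

With this sketch, `llvm_patch_level 3.4.1` prints `1`, and both `llvm_patch_level 3.5.0svn` and `llvm_patch_level 3.4` print `0`. The fallback branch plays the role of the `test -z` default in the diff.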

@@ -16,6 +16,17 @@
<h1>News</h1>
<h2>September 19, 2014</h2>
<p>
<a href="relnotes/10.3.html">Mesa 10.3</a> is released. This is a new
development release. See the release notes for more information about
the release.
</p>
<p>
Also, <a href="relnotes/10.2.8.html">Mesa 10.2.8</a> is released.
This is a bug fix release from the 10.2 branch.
</p>
<h2>August 19, 2014</h2>
<p>
<a href="relnotes/10.2.6.html">Mesa 10.2.6</a> is released.

@@ -21,6 +21,8 @@ The release notes summarize what's new or changed in each Mesa release.
</p>
<ul>
<li><a href="relnotes/10.3.html">10.3 release notes</a>
<li><a href="relnotes/10.2.8.html">10.2.8 release notes</a>
<li><a href="relnotes/10.2.6.html">10.2.6 release notes</a>
<li><a href="relnotes/10.2.5.html">10.2.5 release notes</a>
<li><a href="relnotes/10.2.4.html">10.2.4 release notes</a>

156
docs/relnotes/10.3.1.html Normal file
@@ -0,0 +1,156 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.3.1 Release Notes / October 12, 2014</h1>
<p>
Mesa 10.3.1 is a bug fix release which fixes bugs found since the 10.3 release.
</p>
<p>
Mesa 10.3.1 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBA
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79462">Bug 79462</a> - [NVC0/Codegen] Shader compilation fails in spill logic</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82932">Bug 82932</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.shaders.indexing.vector_subscript.vec3_static_loop_subscript_write_direct_read_vertex fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83506">Bug 83506</a> - [UBO] row_major layout ignored inside structures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83533">Bug 83533</a> - [UBO] nested structures don't get appropriate padding</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83570">Bug 83570</a> - Glyphy demo throws unhandled Integer division by zero exception</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83741">Bug 83741</a> - [UBO] row_major layout partially ignored for arrays of structures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84178">Bug 84178</a> - Big glamor regression in Xorg server 1.6.99.1 GIT: x11perf 1.5 Test: PutImage XY 500x500 Square</li>
</ul>
<h2>Changes</h2>
<p>Andreas Pokorny (2):</p>
<ul>
<li>egl/drm: expose KHR_image_pixmap extension</li>
<li>i915: Fix black buffers when importing prime fds</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: fix prog_optimize.c assertions triggered by SWZ opcode</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add 10.3 sha256 sums, news item and link release notes</li>
<li>Update VERSION to 10.3.1</li>
</ul>
<p>Ian Romanick (4):</p>
<ul>
<li>glsl: Make sure fields after small structs have correct padding</li>
<li>glsl: Make sure row-major array-of-structure get correct layout</li>
<li>glsl: Round struct size up to at least 16 bytes</li>
<li>glsl: Strip arrayness from ir_type_dereference_variable too</li>
</ul>
<p>Ilia Mirkin (5):</p>
<ul>
<li>nv50/ir: avoid deleting pseudo instructions too early</li>
<li>gm107/ir: fix manual TXD for array targets</li>
<li>gm107/ir: fix texture argument order</li>
<li>gm107/ir: add support for indirect const buffer selection</li>
<li>gm107/ir: take relative pfetch offset into account</li>
</ul>
<p>Keith Packard (1):</p>
<ul>
<li>glx/dri3: Provide error diagnostics when DRI3 allocation fails</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>mesa: Use proper structure for glGet*(GL_TEXTURE_COORD_ARRAY*).</li>
<li>mesa: Set correct array element in vbo_exec_vtx_init.</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>radeonsi: release GS rings at context destruction</li>
<li>radeonsi: properly destroy the GS copy shader and scratch_bo for compute</li>
<li>st/dri: remove GALLIUM_MSAA and __GL_FSAA_MODE environment variables</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>st/mesa: Use PIPE_USAGE_STAGING for GL_STATIC/DYNAMIC/STREAM_READ buffers</li>
</ul>
<p>Richard Sandiford (2):</p>
<ul>
<li>mesa: Fix alpha component in unpack_R8G8B8X8_SRGB.</li>
<li>swrast: Fix handling of MESA_FORMAT_L8A8_SRGB for big-endian</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>gallivm: fix idiv</li>
</ul>
<p>Thomas Hellstrom (1):</p>
<ul>
<li>st/xa: Fix regression in xa_yuv_planar_blit()</li>
</ul>
<p>Tom Stellard (2):</p>
<ul>
<li>clover: Add support to mem objects for multiple destructor callbacks v2</li>
<li>configure.ac: Compute LLVM_VERSION_PATCH using llvm-config</li>
</ul>
<p>Tomasz Figa (3):</p>
<ul>
<li>util: Include in Android builds</li>
<li>st/mesa: Generate format_info.c in Android builds</li>
<li>st/mesa: Fix paths used in Android builds</li>
</ul>
<p>rconde (1):</p>
<ul>
<li>gallivm,tgsi: fix idiv by zero crash</li>
</ul>
</div>
</body>
</html>

@@ -14,7 +14,7 @@
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.3 Release Notes / TBD</h1>
<h1>Mesa 10.3 Release Notes / September 19, 2014</h1>
<p>
Mesa 10.3 is a new development release.
@@ -31,9 +31,11 @@ because compatibility contexts are not supported.
</p>
<h2>MD5 checksums</h2>
<h2>SHA256 checksums</h2>
<pre>
TBD.
9a1bf52040fc3dda81e83a35f944f1c3f532847dbe9fdf57161265cf71ea1bae MesaLib-10.3.0.tar.gz
0283bfe710fa449ed82e465cfa09612a269e19abb7e0382082608062ce7960b5 MesaLib-10.3.0.tar.bz2
221420763c2c3a244836a736e735612c4a6a0377b4e5223fca1e612f49906789 MesaLib-10.3.0.zip
</pre>
@@ -75,7 +77,249 @@ DRM drivers that don't have a full-fledged GEM (such as qxl or simpledrm)</li>
<h2>Bug fixes</h2>
TBD.
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=50754">Bug 50754</a> - Building 32 bit mesa on 64 bit OS fails since change for automake</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=53617">Bug 53617</a> - [llvmpipe] piglit fbo-depthtex regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54372">Bug 54372</a> - GLX_INTEL_swap_event crashes driver when swapping window buffers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=56127">Bug 56127</a> - [ILK bisected]unigine-sanctuary performance reduced by 98%</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66184">Bug 66184</a> - src/mesa/state_tracker/st_glsl_to_tgsi.cpp:3216:simplify_cmp: Assertion `inst-&gt;dst.index &lt; 4096' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66452">Bug 66452</a> - JUNIPER UVD accelerated playback of WMV3 streams does not work</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68365">Bug 68365</a> - [SNB Bisected]Piglit spec_ARB_framebuffer_object_fbo-blit-stretch fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70441">Bug 70441</a> - [Gen4-5 clip] Piglit spec_OpenGL_1.1_polygon-offset hits (execsize &gt;= width) assertion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73846">Bug 73846</a> - [llvmpipe] lp_test_format fails with llvm-3.5svn &gt;= r199602</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74005">Bug 74005</a> - [i965 Bisected]Piglit/glx_glx-make-glxdrawable-current fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74863">Bug 74863</a> - [r600g] HyperZ broken on RV770 and CYPRESS (Left 4 Dead 2 trees corruption) bisected!</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75010">Bug 75010</a> - clang: error: unknown argument: '-fstack-protector-strong'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75478">Bug 75478</a> - [BDW]Some Piglit and Ogles2conform cases cause GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75664">Bug 75664</a> - Unigine Valley &amp; Heaven &quot;error: syntax error, unexpected EXTENSION, expecting $end&quot; IVB HD4000</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75878">Bug 75878</a> - [BDW] GPU hang running Raytracer WebGL demo</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76188">Bug 76188</a> - EGL_EXT_image_dma_buf_import fd ownership is incorrect</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76223">Bug 76223</a> - [radeonsi] luxmark segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76939">Bug 76939</a> - [BDW] GPU hang when running “Metro: Last Light” / “Crusader Kings II”</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77245">Bug 77245</a> - Bogus GL_ARB_explicit_attrib_location layout identifier warnings</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77493">Bug 77493</a> - lp_test_arit fails with llvm &gt;= llvm-3.5svn r206094</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77703">Bug 77703</a> - [ILK Bisected]Piglit glean_texCombine4 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77704">Bug 77704</a> - [IVB/HSW Bisected]Ogles3conform GL3Tests_shadow_shadow_execution_frag.test fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77705">Bug 77705</a> - [SNB/IVB/HSW/BYT/BDW Bisected]Ogles3conform GL3Tests/packed_pixels/packed_pixels_pixelstore.test segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77707">Bug 77707</a> - [ILK Bisected]Ogles2conform GL_sin_sin_float_frag_xvary.test fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77740">Bug 77740</a> - i965: Relax accumulator dependency scheduling on Gen &lt; 6</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77852">Bug 77852</a> - [BDW]Piglit spec_ARB_framebuffer_object_fbo-drawbuffers-none_glBlitFramebuffer fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77856">Bug 77856</a> - [BDW]Piglit spec_OpenGL_3.0_clearbuffer-mixed-format fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77865">Bug 77865</a> - [BDW] Many Ogles3conform framebuffer_blit cases fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78225">Bug 78225</a> - Compile error due to undefined reference to `gbm_dri_backend', fix attached</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78258">Bug 78258</a> - make check link_varyings.gl_ClipDistance failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78403">Bug 78403</a> - query_renderer_implementation_unittest.cpp:144:4: error: expected primary-expression before . token</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78468">Bug 78468</a> - Compiling of shader gets stuck in infinite loop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78537">Bug 78537</a> - no anisotropic filtering in a native Half-Life 2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78546">Bug 78546</a> - [swrast] piglit copyteximage-border regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78581">Bug 78581</a> - OpenCL: clBuildProgram prints error messages directly rather than storing them</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78648">Bug 78648</a> - Texture artifacts in Kerbal Space Program</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78665">Bug 78665</a> - macros in builtin_functions.cpp make invalid assumptions about M_PI definitions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78679">Bug 78679</a> - Gen4-5 code lost: runtime_check_aads_emit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78691">Bug 78691</a> - [G45 - Tesseract] Mesa 10.1.2 implementation error: Unsupported opcode 169872468 in FS</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78692">Bug 78692</a> - Football Manager 2014, gameplay rendered black &amp; white</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78716">Bug 78716</a> - Fix Mesa bugs for running Unreal Engine 4.1 Cave effects demo compiled for Linux</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78803">Bug 78803</a> - gallivm/lp_bld_debug.cpp:42:28: fatal error: llvm/IR/Module.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78842">Bug 78842</a> - [swrast] piglit fcc-read-after-clear copy rb regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78843">Bug 78843</a> - [swrast] piglit copyteximage 1D regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78872">Bug 78872</a> - [ILK Bisected]Piglit spec_ARB_depth_buffer_float_fbo-depthstencil-GL_DEPTH32F_STENCIL8-blit Aborted</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78875">Bug 78875</a> - [ILK Bisected]Webglc conformance/uniforms/uniform-default-values.html fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78888">Bug 78888</a> - test_eu_compact.c:54:3: error: implicit declaration of function brw_disasm [-Werror=implicit-function-declaration]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79029">Bug 79029</a> - INTEL_DEBUG=shader_time is full of lies</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79095">Bug 79095</a> - x86/common_x86.c:348:14: error: use of undeclared identifier 'bit_SSE4_1'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79115">Bug 79115</a> - glFramebufferRenderbuffer(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_RENDERBUFFER, 0) doesn't unbind stencil buffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79263">Bug 79263</a> - Linking error in egl_gallium.la when compiling 32 bit on multiarch</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79294">Bug 79294</a> - Xlib-based build broken on non x86/x86-64 architectures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79373">Bug 79373</a> - Non-const initializers for matrix and vector constructors</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79382">Bug 79382</a> - build error: multiple definition of `loader_get_pci_id_for_fd'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79421">Bug 79421</a> - [llvmpipe] SIGSEGV src/gallium/drivers/llvmpipe/lp_rast_priv.h:218</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79440">Bug 79440</a> - prog_hash_table.c:146: undefined reference to `_mesa_error_no_memory'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79469">Bug 79469</a> - Commit e3cc0d90e14e62a0a787b6c07a6df0f5c84039be breaks unigine heaven</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79534">Bug 79534</a> - gen&lt;7 renders garbage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79616">Bug 79616</a> - L4D2 crash on startup</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79724">Bug 79724</a> - switch statement type check</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79729">Bug 79729</a> - [i965] glClear on a multisample texture doesn't work</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79809">Bug 79809</a> - radeonsi: mouse cursor corruption using weston on AMD Kaveri</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79823">Bug 79823</a> - [NV30/gallium] Mozilla apps freeze on startup with nouveau-dri-10.2.1 libs on dual-screen</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79885">Bug 79885</a> - commit b52a530 (gallium/egl: st_profiles are build time decision, treat them as such) broke egl</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79903">Bug 79903</a> - [HSW Bisected]Some Piglit and Ogles2conform cases fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79907">Bug 79907</a> - Mesa 10.2.1 --enable-vdpau default=auto broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79948">Bug 79948</a> - [i965] Incorrect pixels when using discard and uniform loads</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80015">Bug 80015</a> - Transparency glitches in native Civilization 5 (Civ5) port</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80115">Bug 80115</a> - MESA_META_DRAW_BUFFERS induced GL_INVALID_VALUE errors</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80211">Bug 80211</a> - [ILK/SNB Bisected]Piglit shaders_glsl-fs-copy-propagation-texcoords-1 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80247">Bug 80247</a> - Khronos conformance test ES3-CTS.gtf.GL3Tests.transform_feedback.transform_feedback_vertex_id fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80254">Bug 80254</a> - pipe_loader_sw.c:90: undefined reference to `dri_create_sw_winsys'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80541">Bug 80541</a> - [softpipe] piglit levelclamp regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80561">Bug 80561</a> - Incorrect implementation of some VDPAU APIs.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80614">Bug 80614</a> - [regression] Error in `omxregister-bellagio': munmap_chunk(): invalid pointer: 0x00007f5f76626dab</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80778">Bug 80778</a> - [bisected regression] piglit spec/glsl-1.50/compiler/incorrect-in-layout-qualifier-repeated-prim.geom</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80827">Bug 80827</a> - [radeonsi,R9 270X] Corruptions in window menus in KDE</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80880">Bug 80880</a> - Unreal Engine 4 demos fail GLSL compiler assertion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80991">Bug 80991</a> - [BDW]Piglit spec_ARB_sample_shading_builtin-gl-sample-mask_2 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81020">Bug 81020</a> - [radeonsi][regression] Wireframe of background rendered through objects in Half-Life 2: Episode 2 with MSAA enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81150">Bug 81150</a> - [SNB]Piglit spec_arb_shading_language_packing_execution_built-in-functions_fs-packSnorm4x8 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81157">Bug 81157</a> - [BDW]Piglit some spec_glsl-1.50_execution_built-in-functions* cases fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81450">Bug 81450</a> - [BDW]Piglit spec_glsl-1.30_execution_tex-miplevel-selection_textureGrad_1DArray cases intel_do_flush_locked failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81828">Bug 81828</a> - [BDW Bisected]Ogles3conform GL3Tests_packed_pixels_packed_pixels_pbo.test fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81834">Bug 81834</a> - TGSI constant buffer overrun causes assertion failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81857">Bug 81857</a> - [SNB+]Piglit spec_glsl-1.30_execution_switch_fs-default_last sporadically fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81967">Bug 81967</a> - [regression] Selections in Blender renders wrong</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82139">Bug 82139</a> - [r600g, bisected] multiple ubo piglit regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82159">Bug 82159</a> - No rule to make target `../../../../src/mesa/libmesa.la', needed by `collision'.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82255">Bug 82255</a> - [VP2] Chroma planes are vertically stretched during VDPAU playback</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82268">Bug 82268</a> - Add support for the OpenRISC architecture (or1k)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82428">Bug 82428</a> - [radeonsi,R9 270X] System lockup when using mplayer/mpv with VDPAU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82472">Bug 82472</a> - piglit 16385-consecutive-chars regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82483">Bug 82483</a> - format_srgb.h:145: undefined reference to `util_format_srgb_to_linear_8unorm_table'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82517">Bug 82517</a> - [RADEONSI,VDPAU] SIGSEGV in map_msg_fb_buf called from ruvd_destroy, when closing a Tab with accelerated video player</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82534">Bug 82534</a> - src\egl\main\eglapi.h : fatal error LNK1107: invalid or corrupt file: cannot read at 0x2E02</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82536">Bug 82536</a> - u_current.h:72: undefined reference to `__imp__glapi_Dispatch'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82538">Bug 82538</a> - Super Maryo Chronicles fails with st/mesa assertion failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82539">Bug 82539</a> - vmw_screen_dri.lo In file included from vmw_screen_dri.c:41: vmwgfx_drm.h:32:17: error: drm.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82546">Bug 82546</a> - [regression] libOSMesa build failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82574">Bug 82574</a> - GLSL: opt_vectorize goes wrong on texture lookups</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82628">Bug 82628</a> - bisected: GALLIUM_HUD hangs radeon 7970M (PRIME)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82671">Bug 82671</a> - [r600g-evergreen][compute]Empty kernel execution causes crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82709">Bug 82709</a> - OpenCL not working on radeon hainan</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82796">Bug 82796</a> - [IVB/BYT-M/HSW/BDW Bisected]Synmark2_v6.0_OglTerrainFlyInst/OglTerrainPanInst cannot run as image validation failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82804">Bug 82804</a> - unreal engine 4 rendering errors</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82814">Bug 82814</a> - glDrawBuffers(0, NULL) segfaults in _mesa_drawbuffers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82828">Bug 82828</a> - Regression: Crash in 3Dmark2001</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82846">Bug 82846</a> - [BDW Bisected] Gpu hang when running Lightsmark v2008/Warsow v1.0/Xonotic v0.7/unigine-demos</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82881">Bug 82881</a> - test_vec4_register_coalesce regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82882">Bug 82882</a> - [swrast] piglit glsl-fs-uniform-bool-1 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82929">Bug 82929</a> - [BDW Bisected]glxgears causes X hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82932">Bug 82932</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.shaders.indexing.vector_subscript.vec3_static_loop_subscript_write_direct_read_vertex fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83046">Bug 83046</a> - [BDW bisected] Warsow v1.0/Xonotic v0.7/Gputest v0.5_triangle_fullscreen/synmark2_v6/GLBenchmark v2.5.0/GLBenchmark v2.7.0/Unigine-demos performance reduced 30%~60%</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83079">Bug 83079</a> - [NVC0] Dota 2 (Linux native and Wine) crash with Nouveau Drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83081">Bug 83081</a> - [BDW Bisected]Piglit spec_ARB_sample_shading_builtin-gl-sample-mask_2 is core dumped</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83127">Bug 83127</a> - [ILK Bisected]Piglit glean_texCombine fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83355">Bug 83355</a> - FTBFS: src/mesa/program/program_lexer.l:122:64: error: unknown type name 'YYSTYPE'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83432">Bug 83432</a> - r600_query.c:269:r600_emit_query_end: Assertion `ctx-&gt;num_pipelinestat_queries &gt; 0' failed [Gallium HUD]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83468">Bug 83468</a> - [UBO] Using bool from UBO as if-statement condition asserts</li>
</ul>
<h2>Changes</h2>

View File

@@ -38,6 +38,7 @@ CHIPSET(0x6828, VERDE_6828, VERDE)
CHIPSET(0x6829, VERDE_6829, VERDE)
CHIPSET(0x682A, VERDE_682A, VERDE)
CHIPSET(0x682B, VERDE_682B, VERDE)
CHIPSET(0x682C, VERDE_682C, VERDE)
CHIPSET(0x682D, VERDE_682D, VERDE)
CHIPSET(0x682F, VERDE_682F, VERDE)
CHIPSET(0x6830, VERDE_6830, VERDE)
@@ -54,8 +55,11 @@ CHIPSET(0x6600, OLAND_6600, OLAND)
CHIPSET(0x6601, OLAND_6601, OLAND)
CHIPSET(0x6602, OLAND_6602, OLAND)
CHIPSET(0x6603, OLAND_6603, OLAND)
CHIPSET(0x6604, OLAND_6604, OLAND)
CHIPSET(0x6605, OLAND_6605, OLAND)
CHIPSET(0x6606, OLAND_6606, OLAND)
CHIPSET(0x6607, OLAND_6607, OLAND)
CHIPSET(0x6608, OLAND_6608, OLAND)
CHIPSET(0x6610, OLAND_6610, OLAND)
CHIPSET(0x6611, OLAND_6611, OLAND)
CHIPSET(0x6613, OLAND_6613, OLAND)
@@ -73,6 +77,8 @@ CHIPSET(0x666F, HAINAN_666F, HAINAN)
CHIPSET(0x6640, BONAIRE_6640, BONAIRE)
CHIPSET(0x6641, BONAIRE_6641, BONAIRE)
CHIPSET(0x6646, BONAIRE_6646, BONAIRE)
CHIPSET(0x6647, BONAIRE_6647, BONAIRE)
CHIPSET(0x6649, BONAIRE_6649, BONAIRE)
CHIPSET(0x6650, BONAIRE_6650, BONAIRE)
CHIPSET(0x6651, BONAIRE_6651, BONAIRE)
@@ -132,6 +138,7 @@ CHIPSET(0x1313, KAVERI_1313, KAVERI)
CHIPSET(0x1315, KAVERI_1315, KAVERI)
CHIPSET(0x1316, KAVERI_1316, KAVERI)
CHIPSET(0x1317, KAVERI_1317, KAVERI)
CHIPSET(0x1318, KAVERI_1318, KAVERI)
CHIPSET(0x131B, KAVERI_131B, KAVERI)
CHIPSET(0x131C, KAVERI_131C, KAVERI)
CHIPSET(0x131D, KAVERI_131D, KAVERI)

View File

@@ -681,6 +681,7 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp)
i + 1, EGL_WINDOW_BIT, attr_list, NULL);
}
disp->Extensions.KHR_image_pixmap = EGL_TRUE;
if (dri2_dpy->dri2)
disp->Extensions.EXT_buffer_age = EGL_TRUE;

View File

@@ -143,6 +143,7 @@ LOCAL_STATIC_LIBRARIES := \
libmesa_st_egl \
$(gallium_DRIVERS) \
libmesa_st_mesa \
libmesa_util \
libmesa_glsl \
libmesa_glsl_utils \
libmesa_gallium \

View File

@@ -30,7 +30,9 @@ include $(CLEAR_VARS)
LOCAL_SRC_FILES := $(C_SOURCES)
LOCAL_C_INCLUDES := $(GALLIUM_TOP)/auxiliary/util
LOCAL_C_INCLUDES := \
$(GALLIUM_TOP)/auxiliary/util \
$(MESA_TOP)/src
LOCAL_MODULE := libmesa_gallium

View File

@@ -1850,7 +1850,7 @@ lp_build_trunc(struct lp_build_context *bld,
const struct lp_type type = bld->type;
struct lp_type inttype;
struct lp_build_context intbld;
LLVMValueRef cmpval = lp_build_const_vec(bld->gallivm, type, 2^24);
LLVMValueRef cmpval = lp_build_const_vec(bld->gallivm, type, 1<<24);
LLVMValueRef trunc, res, anosign, mask;
LLVMTypeRef int_vec_type = bld->int_vec_type;
LLVMTypeRef vec_type = bld->vec_type;
@@ -1905,7 +1905,7 @@ lp_build_round(struct lp_build_context *bld,
const struct lp_type type = bld->type;
struct lp_type inttype;
struct lp_build_context intbld;
LLVMValueRef cmpval = lp_build_const_vec(bld->gallivm, type, 2^24);
LLVMValueRef cmpval = lp_build_const_vec(bld->gallivm, type, 1<<24);
LLVMValueRef res, anosign, mask;
LLVMTypeRef int_vec_type = bld->int_vec_type;
LLVMTypeRef vec_type = bld->vec_type;
@@ -1958,7 +1958,7 @@ lp_build_floor(struct lp_build_context *bld,
const struct lp_type type = bld->type;
struct lp_type inttype;
struct lp_build_context intbld;
LLVMValueRef cmpval = lp_build_const_vec(bld->gallivm, type, 2^24);
LLVMValueRef cmpval = lp_build_const_vec(bld->gallivm, type, 1<<24);
LLVMValueRef trunc, res, anosign, mask;
LLVMTypeRef int_vec_type = bld->int_vec_type;
LLVMTypeRef vec_type = bld->vec_type;
@@ -2027,7 +2027,7 @@ lp_build_ceil(struct lp_build_context *bld,
const struct lp_type type = bld->type;
struct lp_type inttype;
struct lp_build_context intbld;
LLVMValueRef cmpval = lp_build_const_vec(bld->gallivm, type, 2^24);
LLVMValueRef cmpval = lp_build_const_vec(bld->gallivm, type, 1<<24);
LLVMValueRef trunc, res, anosign, mask, tmp;
LLVMTypeRef int_vec_type = bld->int_vec_type;
LLVMTypeRef vec_type = bld->vec_type;
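
Note on the four hunks above: each replaces `2^24` with `1<<24`. In C, `^` is bitwise XOR rather than exponentiation, so the old constant evaluated to 26 instead of two to the 24th power. A minimal sketch of the difference follows; the function names are hypothetical and not part of Mesa.

```c
#include <assert.h>

/* '^' is bitwise XOR in C: 2 (0b00010) XOR 24 (0b11000) is 26 (0b11010).
 * The operands are passed as parameters so the compiler does not flag
 * the expression as a likely mistaken power. */
static int xor_of(int a, int b)
{
   return a ^ b;
}

/* A left shift of 1 by n yields 2 to the n-th power (for n < 31). */
static int pow2(int n)
{
   return 1 << n;
}
```

Here `xor_of(2, 24)` yields 26, while `pow2(24)` yields 16777216, which is the comparison value the rounding helpers were meant to use.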

View File

@@ -464,6 +464,7 @@ lp_build_pack2(struct gallivm_state *gallivm,
if((util_cpu_caps.has_sse2 || util_cpu_caps.has_altivec) &&
src_type.width * src_type.length >= 128) {
const char *intrinsic = NULL;
boolean swap_intrinsic_operands = FALSE;
switch(src_type.width) {
case 32:
@@ -482,6 +483,9 @@ lp_build_pack2(struct gallivm_state *gallivm,
} else {
intrinsic = "llvm.ppc.altivec.vpkuwus";
}
#ifdef PIPE_ARCH_LITTLE_ENDIAN
swap_intrinsic_operands = TRUE;
#endif
}
break;
case 16:
@@ -490,12 +494,18 @@ lp_build_pack2(struct gallivm_state *gallivm,
intrinsic = "llvm.x86.sse2.packsswb.128";
} else if (util_cpu_caps.has_altivec) {
intrinsic = "llvm.ppc.altivec.vpkshss";
#ifdef PIPE_ARCH_LITTLE_ENDIAN
swap_intrinsic_operands = TRUE;
#endif
}
} else {
if (util_cpu_caps.has_sse2) {
intrinsic = "llvm.x86.sse2.packuswb.128";
} else if (util_cpu_caps.has_altivec) {
intrinsic = "llvm.ppc.altivec.vpkshus";
#ifdef PIPE_ARCH_LITTLE_ENDIAN
swap_intrinsic_operands = TRUE;
#endif
}
}
break;
@@ -504,7 +514,11 @@ lp_build_pack2(struct gallivm_state *gallivm,
if (intrinsic) {
if (src_type.width * src_type.length == 128) {
LLVMTypeRef intr_vec_type = lp_build_vec_type(gallivm, intr_type);
res = lp_build_intrinsic_binary(builder, intrinsic, intr_vec_type, lo, hi);
if (swap_intrinsic_operands) {
res = lp_build_intrinsic_binary(builder, intrinsic, intr_vec_type, hi, lo);
} else {
res = lp_build_intrinsic_binary(builder, intrinsic, intr_vec_type, lo, hi);
}
if (dst_vec_type != intr_vec_type) {
res = LLVMBuildBitCast(builder, res, dst_vec_type, "");
}
@@ -513,6 +527,8 @@ lp_build_pack2(struct gallivm_state *gallivm,
int num_split = src_type.width * src_type.length / 128;
int i;
int nlen = 128 / src_type.width;
int lo_off = swap_intrinsic_operands ? nlen : 0;
int hi_off = swap_intrinsic_operands ? 0 : nlen;
struct lp_type ndst_type = lp_type_unorm(dst_type.width, 128);
struct lp_type nintr_type = lp_type_unorm(intr_type.width, 128);
LLVMValueRef tmpres[LP_MAX_VECTOR_WIDTH / 128];
@@ -524,9 +540,9 @@ lp_build_pack2(struct gallivm_state *gallivm,
for (i = 0; i < num_split / 2; i++) {
tmplo = lp_build_extract_range(gallivm,
lo, i*nlen*2, nlen);
lo, i*nlen*2 + lo_off, nlen);
tmphi = lp_build_extract_range(gallivm,
lo, i*nlen*2 + nlen, nlen);
lo, i*nlen*2 + hi_off, nlen);
tmpres[i] = lp_build_intrinsic_binary(builder, intrinsic,
nintr_vec_type, tmplo, tmphi);
if (ndst_vec_type != nintr_vec_type) {
@@ -535,9 +551,9 @@ lp_build_pack2(struct gallivm_state *gallivm,
}
for (i = 0; i < num_split / 2; i++) {
tmplo = lp_build_extract_range(gallivm,
hi, i*nlen*2, nlen);
hi, i*nlen*2 + lo_off, nlen);
tmphi = lp_build_extract_range(gallivm,
hi, i*nlen*2 + nlen, nlen);
hi, i*nlen*2 + hi_off, nlen);
tmpres[i+num_split/2] = lp_build_intrinsic_binary(builder, intrinsic,
nintr_vec_type,
tmplo, tmphi);

View File

@@ -1248,8 +1248,24 @@ idiv_emit_cpu(
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
{
emit_data->output[emit_data->chan] = lp_build_div(&bld_base->int_bld,
emit_data->args[0], emit_data->args[1]);
LLVMBuilderRef builder = bld_base->base.gallivm->builder;
LLVMValueRef div_mask = lp_build_cmp(&bld_base->uint_bld,
PIPE_FUNC_EQUAL, emit_data->args[1],
bld_base->uint_bld.zero);
/* We want to make sure that we never divide/mod by zero to not
* generate sigfpe. We don't want to crash just because the
* shader is doing something weird. */
LLVMValueRef divisor = LLVMBuildOr(builder,
div_mask,
emit_data->args[1], "");
LLVMValueRef result = lp_build_div(&bld_base->int_bld,
emit_data->args[0], divisor);
LLVMValueRef not_div_mask = LLVMBuildNot(builder,
div_mask,"");
/* idiv by zero doesn't have a guaranteed return value; choose 0 for now. */
emit_data->output[emit_data->chan] = LLVMBuildAnd(builder,
not_div_mask,
result, "");
}
/* TGSI_OPCODE_INEG (CPU Only) */
@@ -1675,15 +1691,15 @@ udiv_emit_cpu(
LLVMValueRef div_mask = lp_build_cmp(&bld_base->uint_bld,
PIPE_FUNC_EQUAL, emit_data->args[1],
bld_base->uint_bld.zero);
/* We want to make sure that we never divide/mod by zero to not
* generate sigfpe. We don't want to crash just because the
/* We want to make sure that we never divide/mod by zero to not
* generate sigfpe. We don't want to crash just because the
* shader is doing something weird. */
LLVMValueRef divisor = LLVMBuildOr(builder,
div_mask,
emit_data->args[1], "");
LLVMValueRef result = lp_build_div(&bld_base->uint_bld,
emit_data->args[0], divisor);
/* udiv by zero is guaranteed to return 0xffffffff */
/* udiv by zero is guaranteed to return 0xffffffff at least with d3d10 */
emit_data->output[emit_data->chan] = LLVMBuildOr(builder,
div_mask,
result, "");
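
Note on the masking used in the hunks above: the divisor is OR-ed with an all-ones mask whenever it is zero, so the divide itself never sees a zero divisor, and the mask is then applied to the result. The following is a scalar sketch of the unsigned variant only, assuming 32-bit operands; `masked_udiv` is a hypothetical name, and the real code builds LLVM vector IR rather than running scalar C.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of the guarded unsigned divide.
 * div_mask is all ones when the divisor is zero, otherwise zero. */
static uint32_t masked_udiv(uint32_t a, uint32_t b)
{
   uint32_t div_mask = (b == 0) ? 0xffffffffu : 0u;
   /* A zero divisor becomes 0xffffffff, so the division cannot trap. */
   uint32_t divisor = b | div_mask;
   uint32_t result = a / divisor;
   /* OR-ing the mask back in forces 0xffffffff for a zero divisor. */
   return result | div_mask;
}
```

The signed path in the patch follows the same pattern but ANDs the result with the inverted mask, so a zero divisor produces 0 instead of 0xffffffff.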

View File

@@ -3340,10 +3340,10 @@ micro_idiv(union tgsi_exec_channel *dst,
const union tgsi_exec_channel *src0,
const union tgsi_exec_channel *src1)
{
dst->i[0] = src0->i[0] / src1->i[0];
dst->i[1] = src0->i[1] / src1->i[1];
dst->i[2] = src0->i[2] / src1->i[2];
dst->i[3] = src0->i[3] / src1->i[3];
dst->i[0] = src1->i[0] ? src0->i[0] / src1->i[0] : 0;
dst->i[1] = src1->i[1] ? src0->i[1] / src1->i[1] : 0;
dst->i[2] = src1->i[2] ? src0->i[2] / src1->i[2] : 0;
dst->i[3] = src1->i[3] ? src0->i[3] / src1->i[3] : 0;
}
static void

View File

@@ -40,6 +40,7 @@
#include "pipe/p_compiler.h"
#include "util/u_debug.h"
#ifdef __cplusplus

View File

@@ -1060,6 +1060,7 @@ vl_compositor_render(struct vl_compositor_state *s,
s->scissor.maxx = dst_surface->width;
s->scissor.maxy = dst_surface->height;
}
c->pipe->set_scissor_states(c->pipe, 0, 1, &s->scissor);
gen_vertex_data(c, s, dirty_area);
@@ -1072,7 +1073,6 @@ vl_compositor_render(struct vl_compositor_state *s,
dirty_area->x1 = dirty_area->y1 = MIN_DIRTY;
}
c->pipe->set_scissor_states(c->pipe, 0, 1, &s->scissor);
c->pipe->set_framebuffer_state(c->pipe, &c->fb_state);
c->pipe->bind_vs_state(c->pipe, c->vs);
c->pipe->set_vertex_buffers(c->pipe, 0, 1, &c->vertex_buf);

View File

@@ -98,6 +98,7 @@ fd2_context_create(struct pipe_screen *pscreen, void *priv)
pctx = &fd2_ctx->base.base;
fd2_ctx->base.dev = fd_device_ref(screen->dev);
fd2_ctx->base.screen = fd_screen(pscreen);
pctx->destroy = fd2_context_destroy;
pctx->create_blend_state = fd2_blend_state_create;

View File

@@ -215,14 +215,19 @@ emit_textures(struct fd_ringbuffer *ring,
OUT_RING(ring, CP_LOAD_STATE_1_STATE_TYPE(ST_CONSTANTS) |
CP_LOAD_STATE_1_EXT_SRC_ADDR(0));
for (i = 0; i < tex->num_textures; i++) {
static const struct fd3_pipe_sampler_view dummy_view = {};
static const struct fd3_pipe_sampler_view dummy_view = {
.base.u.tex.first_level = 1,
};
const struct fd3_pipe_sampler_view *view = tex->textures[i] ?
fd3_pipe_sampler_view(tex->textures[i]) :
&dummy_view;
struct fd_resource *rsc = view->tex_resource;
unsigned start = view->base.u.tex.first_level;
unsigned end = view->base.u.tex.last_level;
for (j = 0; j < view->mipaddrs; j++) {
struct fd_resource_slice *slice = fd_resource_slice(rsc, j);
for (j = 0; j < (end - start + 1); j++) {
struct fd_resource_slice *slice =
fd_resource_slice(rsc, j + start);
OUT_RELOC(ring, rsc->bo, slice->offset, 0, 0);
}

View File

@@ -144,7 +144,8 @@ fd3_sampler_view_create(struct pipe_context *pctx, struct pipe_resource *prsc,
{
struct fd3_pipe_sampler_view *so = CALLOC_STRUCT(fd3_pipe_sampler_view);
struct fd_resource *rsc = fd_resource(prsc);
unsigned miplevels = cso->u.tex.last_level - cso->u.tex.first_level;
unsigned lvl = cso->u.tex.first_level;
unsigned miplevels = cso->u.tex.last_level - lvl;
if (!so)
return NULL;
@@ -156,7 +157,6 @@ fd3_sampler_view_create(struct pipe_context *pctx, struct pipe_resource *prsc,
so->base.context = pctx;
so->tex_resource = rsc;
so->mipaddrs = 1 + miplevels;
so->texconst0 =
A3XX_TEX_CONST_0_TYPE(tex_type(prsc->target)) |
@@ -170,11 +170,11 @@ fd3_sampler_view_create(struct pipe_context *pctx, struct pipe_resource *prsc,
so->texconst1 =
A3XX_TEX_CONST_1_FETCHSIZE(fd3_pipe2fetchsize(cso->format)) |
A3XX_TEX_CONST_1_WIDTH(prsc->width0) |
A3XX_TEX_CONST_1_HEIGHT(prsc->height0);
A3XX_TEX_CONST_1_WIDTH(u_minify(prsc->width0, lvl)) |
A3XX_TEX_CONST_1_HEIGHT(u_minify(prsc->height0, lvl));
/* when emitted, A3XX_TEX_CONST_2_INDX() must be OR'd in: */
so->texconst2 =
A3XX_TEX_CONST_2_PITCH(rsc->slices[0].pitch * rsc->cpp);
A3XX_TEX_CONST_2_PITCH(rsc->slices[lvl].pitch * rsc->cpp);
so->texconst3 = 0x00000000; /* ??? */
return &so->base;
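
Note on the hunks above: the texture constants now describe the view's first mip level instead of level 0, so width and height are reduced by `lvl` levels. A minimal sketch of that reduction, assuming `u_minify` halves the size once per level and clamps at 1; `minify` below is a hypothetical stand-in, not the Mesa helper itself.

```c
#include <assert.h>

/* Size of a mip level: shift the base size right once per level,
 * but never report a dimension smaller than 1. */
static unsigned minify(unsigned base_size, unsigned level)
{
   unsigned size = base_size >> level;
   return size ? size : 1;
}
```

For a 256-texel-wide texture, `minify(256, 2)` gives 64, and levels past the end of the chain stay at 1.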

View File

@@ -51,7 +51,6 @@ fd3_sampler_stateobj(struct pipe_sampler_state *samp)
struct fd3_pipe_sampler_view {
struct pipe_sampler_view base;
struct fd_resource *tex_resource;
uint32_t mipaddrs;
uint32_t texconst0, texconst1, texconst2, texconst3;
};

View File

@@ -304,7 +304,36 @@ fail:
return NULL;
}
static bool render_blit(struct pipe_context *pctx, struct pipe_blit_info *info);
static void fd_blitter_pipe_begin(struct fd_context *ctx);
static void fd_blitter_pipe_end(struct fd_context *ctx);
/**
* _copy_region using pipe (3d engine)
*/
static bool
fd_blitter_pipe_copy_region(struct fd_context *ctx,
struct pipe_resource *dst,
unsigned dst_level,
unsigned dstx, unsigned dsty, unsigned dstz,
struct pipe_resource *src,
unsigned src_level,
const struct pipe_box *src_box)
{
/* not until we allow rendertargets to be buffers */
if (dst->target == PIPE_BUFFER || src->target == PIPE_BUFFER)
return false;
if (!util_blitter_is_copy_supported(ctx->blitter, dst, src))
return false;
fd_blitter_pipe_begin(ctx);
util_blitter_copy_texture(ctx->blitter,
dst, dst_level, dstx, dsty, dstz,
src, src_level, src_box);
fd_blitter_pipe_end(ctx);
return true;
}
/**
* Copy a block of pixels from one resource to another.
@@ -320,40 +349,33 @@ fd_resource_copy_region(struct pipe_context *pctx,
unsigned src_level,
const struct pipe_box *src_box)
{
struct fd_context *ctx = fd_context(pctx);
/* TODO if we have 2d core, or other DMA engine that could be used
* for simple copies and reasonably easily synchronized with the 3d
* core, this is where we'd plug it in..
*/
struct pipe_blit_info info = {
.dst = {
.resource = dst,
.box = {
.x = dstx,
.y = dsty,
.z = dstz,
.width = src_box->width,
.height = src_box->height,
.depth = src_box->depth,
},
.format = util_format_linear(dst->format),
},
.src = {
.resource = src,
.box = *src_box,
.format = util_format_linear(src->format),
},
.mask = PIPE_MASK_RGBA,
.filter = PIPE_TEX_FILTER_NEAREST,
};
render_blit(pctx, &info);
/* try blit on 3d pipe: */
if (fd_blitter_pipe_copy_region(ctx,
dst, dst_level, dstx, dsty, dstz,
src, src_level, src_box))
return;
/* else fallback to pure sw: */
util_resource_copy_region(pctx,
dst, dst_level, dstx, dsty, dstz,
src, src_level, src_box);
}
/* Optimal hardware path for blitting pixels.
/**
* Optimal hardware path for blitting pixels.
* Scaling, format conversion, up- and downsampling (resolve) are allowed.
*/
static void
fd_blit(struct pipe_context *pctx, const struct pipe_blit_info *blit_info)
{
struct fd_context *ctx = fd_context(pctx);
struct pipe_blit_info info = *blit_info;
if (info.src.resource->nr_samples > 1 &&
@@ -373,21 +395,21 @@ fd_blit(struct pipe_context *pctx, const struct pipe_blit_info *blit_info)
info.mask &= ~PIPE_MASK_S;
}
render_blit(pctx, &info);
}
static bool
render_blit(struct pipe_context *pctx, struct pipe_blit_info *info)
{
struct fd_context *ctx = fd_context(pctx);
if (!util_blitter_is_blit_supported(ctx->blitter, info)) {
if (!util_blitter_is_blit_supported(ctx->blitter, &info)) {
DBG("blit unsupported %s -> %s",
util_format_short_name(info->src.resource->format),
util_format_short_name(info->dst.resource->format));
return false;
util_format_short_name(info.src.resource->format),
util_format_short_name(info.dst.resource->format));
return;
}
fd_blitter_pipe_begin(ctx);
util_blitter_blit(ctx->blitter, &info);
fd_blitter_pipe_end(ctx);
}
static void
fd_blitter_pipe_begin(struct fd_context *ctx)
{
util_blitter_save_vertex_buffer_slot(ctx->blitter, ctx->vertexbuf.vb);
util_blitter_save_vertex_elements(ctx->blitter, ctx->vtx);
util_blitter_save_vertex_shader(ctx->blitter, ctx->prog.vp);
@@ -407,15 +429,21 @@ render_blit(struct pipe_context *pctx, struct pipe_blit_info *info)
ctx->fragtex.num_textures, ctx->fragtex.textures);
fd_hw_query_set_stage(ctx, ctx->ring, FD_STAGE_BLIT);
util_blitter_blit(ctx->blitter, info);
fd_hw_query_set_stage(ctx, ctx->ring, FD_STAGE_NULL);
return true;
}
static void
fd_flush_resource(struct pipe_context *ctx, struct pipe_resource *resource)
fd_blitter_pipe_end(struct fd_context *ctx)
{
fd_hw_query_set_stage(ctx, ctx->ring, FD_STAGE_NULL);
}
static void
fd_flush_resource(struct pipe_context *pctx, struct pipe_resource *prsc)
{
struct fd_resource *rsc = fd_resource(prsc);
if (rsc->dirty)
fd_context_render(pctx);
}
void

View File

@@ -322,7 +322,8 @@ static void ir3_block_dump(struct ir3_dump_ctx *ctx,
/* draw instruction graph: */
for (i = 0; i < block->noutputs; i++)
dump_instr(ctx, block->outputs[i]);
if (block->outputs[i])
dump_instr(ctx, block->outputs[i]);
/* draw outputs: */
fprintf(ctx->f, "output%lx [shape=record,label=\"outputs", PTRID(block));

View File

@@ -58,6 +58,7 @@ GM107LoweringPass::handleManualTXD(TexInstruction *i)
Value *zero = bld.loadImm(bld.getSSA(), 0);
int l, c;
const int dim = i->tex.target.getDim();
const int array = i->tex.target.isArray();
i->op = OP_TEX; // no need to clone dPdx/dPdy later
@@ -69,7 +70,7 @@ GM107LoweringPass::handleManualTXD(TexInstruction *i)
// mov coordinates from lane l to all lanes
bld.mkOp(OP_QUADON, TYPE_NONE, NULL);
for (c = 0; c < dim; ++c) {
bld.mkOp2(OP_SHFL, TYPE_F32, crd[c], i->getSrc(c), bld.mkImm(l));
bld.mkOp2(OP_SHFL, TYPE_F32, crd[c], i->getSrc(c + array), bld.mkImm(l));
add = bld.mkOp2(OP_QUADOP, TYPE_F32, crd[c], crd[c], zero);
add->subOp = 0x00;
add->lanes = 1; /* abused for .ndv */
@@ -94,7 +95,7 @@ GM107LoweringPass::handleManualTXD(TexInstruction *i)
// texture
bld.insert(tex = cloneForward(func, i));
for (c = 0; c < dim; ++c)
tex->setSrc(c, crd[c]);
tex->setSrc(c + array, crd[c]);
bld.mkOp(OP_QUADPOP, TYPE_NONE, NULL);
// save results
@@ -158,7 +159,10 @@ GM107LoweringPass::handlePFETCH(Instruction *i)
bld.mkOp2(OP_SHR , TYPE_U32, tmp1, tmp0, bld.mkImm(16));
bld.mkOp2(OP_AND , TYPE_U32, tmp0, tmp0, bld.mkImm(0xff));
bld.mkOp2(OP_AND , TYPE_U32, tmp1, tmp1, bld.mkImm(0xff));
bld.mkOp1(OP_MOV , TYPE_U32, tmp2, bld.mkImm(i->getSrc(0)->reg.data.u32));
if (i->getSrc(1))
bld.mkOp2(OP_ADD , TYPE_U32, tmp2, i->getSrc(0), i->getSrc(1));
else
bld.mkOp1(OP_MOV , TYPE_U32, tmp2, i->getSrc(0));
bld.mkOp3(OP_MAD , TYPE_U32, tmp0, tmp0, tmp1, tmp2);
i->setSrc(0, tmp0);
i->setSrc(1, NULL);
@@ -240,6 +244,20 @@ GM107LoweringPass::visit(Instruction *i)
i->op = OP_VFETCH;
assert(prog->getType() != Program::TYPE_FRAGMENT); // INTERP
}
} else if (i->src(0).getFile() == FILE_MEMORY_CONST) {
if (i->src(0).isIndirect(1)) {
Value *ptr;
if (i->src(0).isIndirect(0))
ptr = bld.mkOp3v(OP_INSBF, TYPE_U32, bld.getSSA(),
i->getIndirect(0, 1), bld.mkImm(0x1010),
i->getIndirect(0, 0));
else
ptr = bld.mkOp2v(OP_SHL, TYPE_U32, bld.getSSA(),
i->getIndirect(0, 1), bld.mkImm(16));
i->setIndirect(0, 1, NULL);
i->setIndirect(0, 0, ptr);
i->subOp = NV50_IR_SUBOP_LDC_IS;
}
}
break;
case OP_ATOM:

View File

@@ -174,15 +174,29 @@ NVC0LegalizePostRA::findOverwritingDefs(const Instruction *texi,
}
void
NVC0LegalizePostRA::findFirstUses(const Instruction *texi,
const Instruction *insn,
std::list<TexUse> &uses)
NVC0LegalizePostRA::findFirstUses(
const Instruction *texi,
const Instruction *insn,
std::list<TexUse> &uses,
std::tr1::unordered_set<const Instruction *>& visited)
{
for (int d = 0; insn->defExists(d); ++d) {
Value *v = insn->getDef(d);
for (Value::UseIterator u = v->uses.begin(); u != v->uses.end(); ++u) {
Instruction *usei = (*u)->getInsn();
// NOTE: In case of a loop that overwrites a value but never uses
// it, it can happen that we have a cycle of uses that consists only
// of phis and no-op moves and will thus cause an infinite loop here
// since these are not considered actual uses.
// The most obvious (and perhaps the only) way to prevent this is to
// remember which instructions we've already visited.
if (visited.find(usei) != visited.end())
continue;
visited.insert(usei);
if (usei->op == OP_PHI || usei->op == OP_UNION) {
// need a barrier before WAW cases
for (int s = 0; usei->srcExists(s); ++s) {
@@ -197,11 +211,11 @@ NVC0LegalizePostRA::findFirstUses(const Instruction *texi,
usei->op == OP_PHI ||
usei->op == OP_UNION) {
// these uses don't manifest in the machine code
findFirstUses(texi, usei, uses);
findFirstUses(texi, usei, uses, visited);
} else
if (usei->op == OP_MOV && usei->getDef(0)->equals(usei->getSrc(0)) &&
usei->subOp != NV50_IR_SUBOP_MOV_FINAL) {
findFirstUses(texi, usei, uses);
findFirstUses(texi, usei, uses, visited);
} else {
addTexUse(uses, usei, insn);
}
@@ -257,8 +271,10 @@ NVC0LegalizePostRA::insertTextureBarriers(Function *fn)
uses = new std::list<TexUse>[texes.size()];
if (!uses)
return false;
for (size_t i = 0; i < texes.size(); ++i)
findFirstUses(texes[i], texes[i], uses[i]);
for (size_t i = 0; i < texes.size(); ++i) {
std::tr1::unordered_set<const Instruction *> visited;
findFirstUses(texes[i], texes[i], uses[i], visited);
}
// determine the barrier level at each use
for (size_t i = 0; i < texes.size(); ++i) {
@@ -591,6 +607,21 @@ NVC0LoweringPass::handleTEX(TexInstruction *i)
// lod bias
// depth compare
// offsets (same as fermi, except txd which takes it with array)
//
// Maxwell (tex):
// array
// coords
// indirect handle
// sample
// lod bias
// depth compare
// offsets
//
// Maxwell (txd):
// indirect handle
// coords
// array + offsets
// derivatives
if (chipset >= NVISA_GK104_CHIPSET) {
if (i->tex.rIndirectSrc >= 0 || i->tex.sIndirectSrc >= 0) {
@@ -624,12 +655,17 @@ NVC0LoweringPass::handleTEX(TexInstruction *i)
const int sat = (i->op == OP_TXF) ? 1 : 0;
DataType sTy = (i->op == OP_TXF) ? TYPE_U32 : TYPE_F32;
bld.mkCvt(OP_CVT, TYPE_U16, layer, sTy, src)->saturate = sat;
for (int s = dim; s >= 1; --s)
i->setSrc(s, i->getSrc(s - 1));
i->setSrc(0, layer);
if (i->op != OP_TXD || chipset < NVISA_GM107_CHIPSET) {
for (int s = dim; s >= 1; --s)
i->setSrc(s, i->getSrc(s - 1));
i->setSrc(0, layer);
} else {
i->setSrc(dim, layer);
}
}
// Move the indirect reference to the first place
if (i->tex.rIndirectSrc >= 0) {
if (i->tex.rIndirectSrc >= 0 && (
i->op == OP_TXD || chipset < NVISA_GM107_CHIPSET)) {
Value *hnd = i->getIndirectR();
i->setIndirectR(NULL);
@@ -732,8 +768,10 @@ NVC0LoweringPass::handleTEX(TexInstruction *i)
// create it if it's not already there, and INSBF it if it already
// is.
s = (i->tex.rIndirectSrc >= 0) ? 1 : 0;
if (chipset >= NVISA_GM107_CHIPSET)
s += dim;
if (i->tex.target.isArray()) {
bld.mkOp3(OP_INSBF, TYPE_U32, i->getSrc(0),
bld.mkOp3(OP_INSBF, TYPE_U32, i->getSrc(s),
bld.loadImm(NULL, imm), bld.mkImm(0xc10),
i->getSrc(s));
} else {

View File

@@ -20,6 +20,8 @@
* OTHER DEALINGS IN THE SOFTWARE.
*/
#include <tr1/unordered_set>
#include "codegen/nv50_ir.h"
#include "codegen/nv50_ir_build_util.h"
@@ -69,7 +71,8 @@ private:
bool insertTextureBarriers(Function *);
inline bool insnDominatedBy(const Instruction *, const Instruction *) const;
void findFirstUses(const Instruction *tex, const Instruction *def,
std::list<TexUse>&);
std::list<TexUse>&,
std::tr1::unordered_set<const Instruction *>&);
void findOverwritingDefs(const Instruction *tex, Instruction *insn,
const BasicBlock *term,
std::list<TexUse>&);
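
Note on the `visited` set added to `findFirstUses`: it stops the recursion from following a cycle of phis and no-op moves forever. The sketch below shows the same idea on a deliberately simplified graph in which every node has at most one user; the types and names are hypothetical and much simpler than the nv50_ir data structures.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8

/* Hypothetical use graph: next_use[n] is the node consuming n, or -1. */
struct use_graph {
   int next_use[MAX_NODES];
   bool is_passthrough[MAX_NODES]; /* phi or no-op move: not a real use */
};

/* Follow pass-through nodes until a real use is found; -1 if there is none. */
static int find_first_real_use(const struct use_graph *g, int node,
                               bool *visited)
{
   int user = g->next_use[node];
   if (user < 0 || visited[user])
      return -1; /* no user, or already seen: stop instead of looping */
   visited[user] = true;
   if (g->is_passthrough[user])
      return find_first_real_use(g, user, visited);
   return user;
}

/* Two pass-through nodes that only feed each other: 0 -> 1 -> 0. */
static int cyclic_example(void)
{
   struct use_graph g = { .next_use = {1, 0},
                          .is_passthrough = {true, true} };
   bool visited[MAX_NODES] = { false };
   return find_first_real_use(&g, 0, visited);
}

/* A pass-through node followed by a real use: 0 -> 1 -> 2. */
static int chain_example(void)
{
   struct use_graph g = { .next_use = {1, 2, -1},
                          .is_passthrough = {false, true, false} };
   bool visited[MAX_NODES] = { false };
   return find_first_real_use(&g, 0, visited);
}
```

Without the `visited` check, `cyclic_example` would recurse without end; with it, the walk terminates and reports that no real use exists.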

View File

@@ -567,6 +567,10 @@ ConstantFolding::expr(Instruction *i,
ImmediateValue src0;
if (i->src(0).getImmediate(src0))
expr(i, src0, *i->getSrc(1)->asImm());
if (i->saturate && !prog->getTarget()->isSatSupported(i)) {
bld.setPosition(i, false);
i->setSrc(1, bld.loadImm(NULL, res.data.u32));
}
} else {
i->op = i->saturate ? OP_SAT : OP_MOV; /* SAT handled by unary() */
}

View File

@@ -25,6 +25,7 @@
#include <stack>
#include <limits>
#include <tr1/unordered_set>
namespace nv50_ir {
@@ -1547,6 +1548,11 @@ SpillCodeInserter::run(const std::list<ValuePair>& lst)
LValue *lval = it->first->asLValue();
Symbol *mem = it->second ? it->second->asSym() : NULL;
// Keep track of which instructions to delete later. Deleting them
// inside the loop is unsafe since a single instruction may have
// multiple destinations that all need to be spilled (like OP_SPLIT).
std::tr1::unordered_set<Instruction *> to_del;
for (Value::DefIterator d = lval->defs.begin(); d != lval->defs.end();
++d) {
Value *slot = mem ?
@@ -1579,7 +1585,7 @@ SpillCodeInserter::run(const std::list<ValuePair>& lst)
d = lval->defs.erase(d);
--d;
if (slot->reg.file == FILE_MEMORY_LOCAL)
delete_Instruction(func->getProgram(), defi);
to_del.insert(defi);
else
defi->setDef(0, slot);
} else {
@@ -1587,6 +1593,9 @@ SpillCodeInserter::run(const std::list<ValuePair>& lst)
}
}
for (std::tr1::unordered_set<Instruction *>::const_iterator it = to_del.begin();
it != to_del.end(); ++it)
delete_Instruction(func->getProgram(), *it);
}
// TODO: We're not trying to reuse old slots in a potential next iteration.
@@ -1657,6 +1666,10 @@ RegAlloc::execFunc()
ret && i <= func->loopNestingBound;
sequence = func->cfg.nextSequence(), ++i)
ret = buildLiveSets(BasicBlock::get(func->cfg.getRoot()));
// reset marker
for (ArrayList::Iterator bi = func->allBBlocks.iterator();
!bi.end(); bi.next())
BasicBlock::get(bi)->liveSet.marker = false;
if (!ret)
break;
func->orderInstructions(this->insns);
@@ -1908,6 +1921,13 @@ RegAlloc::InsertConstraintsPass::texConstraintGM107(TexInstruction *tex)
if (isTextureOp(tex->op)) {
if (tex->op != OP_TXQ) {
s = tex->tex.target.getArgCount() - tex->tex.target.isMS();
if (tex->op == OP_TXD) {
// Indirect handle belongs in the first arg
if (tex->tex.rIndirectSrc >= 0)
s++;
if (!tex->tex.target.isArray() && tex->tex.useOffsets)
s++;
}
n = tex->srcCount(0xff) - s;
} else {
s = tex->srcCount(0xff);

View File

@@ -449,7 +449,7 @@ TargetNV50::isModSupported(const Instruction *insn, int s, Modifier mod) const
return false;
}
}
if (s > 3)
if (s >= 3)
return false;
return (mod & Modifier(opInfo[insn->op].srcMods[s])) == mod;
}

View File

@@ -423,7 +423,7 @@ TargetNVC0::isModSupported(const Instruction *insn, int s, Modifier mod) const
return false;
}
}
if (s > 3)
if (s >= 3)
return false;
return (mod & Modifier(opInfo[insn->op].srcMods[s])) == mod;
}

View File

@@ -254,7 +254,9 @@ bool BitSet::resize(unsigned int nBits)
return false;
}
if (n > p)
memset(&data[4 * p + 4], 0, (n - p) * 4);
memset(&data[p], 0, (n - p) * 4);
if (nBits < size && (nBits % 32))
data[(nBits + 31) / 32 - 1] &= (1 << (nBits % 32)) - 1;
size = nBits;
return true;
@@ -274,8 +276,8 @@ bool BitSet::allocate(unsigned int nBits, bool zero)
if (zero)
memset(data, 0, (size + 7) / 8);
else
if (nBits)
data[(size + 31) / 32 - 1] = 0; // clear unused bits (e.g. for popCount)
if (size % 32) // clear unused bits (e.g. for popCount)
data[(size + 31) / 32 - 1] &= (1 << (size % 32)) - 1;
return data;
}
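
Note on the `BitSet` hunks above: the old code zeroed the entire last word, while the new code clears only the bits beyond the set's size and leaves the valid low bits intact. A minimal sketch of that masking for 32-bit words; `mask_last_word` is a hypothetical helper, not part of the nv50_ir sources.

```c
#include <assert.h>
#include <stdint.h>

/* Return the last word of an nbits-wide bitset with its unused high bits
 * cleared. When nbits is a multiple of 32 the word is fully used and is
 * returned unchanged, which also avoids an invalid 32-bit shift. */
static uint32_t mask_last_word(uint32_t word, unsigned nbits)
{
   if (nbits % 32)
      word &= (1u << (nbits % 32)) - 1;
   return word;
}
```

With 40 bits in the set, only the low 8 bits of the second word are meaningful, so `mask_last_word(0xffffffff, 40)` keeps `0xff`.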

View File

@@ -484,6 +484,7 @@ public:
FREE(data);
}
// allocate will keep old data iff size is unchanged
bool allocate(unsigned int nBits, bool zero);
bool resize(unsigned int nBits); // keep old data, zero additional bits

View File

@@ -39,6 +39,8 @@ struct nouveau_vp3_video_buffer {
#define VP_OFFSET 0x200
#define COMM_OFFSET 0x500
#define NOUVEAU_VP3_BSP_RESERVED_SIZE 0x700
#define NOUVEAU_VP3_DEBUG_FENCE 0
#if NOUVEAU_VP3_DEBUG_FENCE

View File

@@ -78,10 +78,10 @@ struct mpeg4_picparm_vp {
uint8_t top_field_first; // bool, written to vuc
uint8_t pad4[3]; // 59, 5a, 5b, contains garbage on blob
uint32_t pad5[0x10]; // 5c...9c non-inclusive, but WHY?
uint32_t intra[0x10]; // 9c
uint32_t non_intra[0x10]; // bc
uint32_t intra[0x10]; // 5c
uint32_t non_intra[0x10]; // 9c
uint32_t pad5[0x10]; // bc what does this do?
// udc..uff pad?
};
@@ -196,11 +196,15 @@ nouveau_vp3_handle_references(struct nouveau_vp3_decoder *dec, struct nouveau_vp
/* Try to find a real empty spot first, there should be one..
*/
for (i = 0; i < dec->base.max_references + 1; ++i) {
if (dec->refs[i].last_used != seq) {
if (dec->refs[i].vidbuf == target) {
empty_spot = i;
break;
}
} else if (!dec->refs[i].last_used) {
empty_spot = i;
} else if (empty_spot == ~0U && dec->refs[i].last_used != seq)
empty_spot = i;
}
assert(empty_spot < dec->base.max_references+1);
dec->refs[empty_spot].last_used = seq;
// debug_printf("Kicked %p to add %p to slot %i\n", dec->refs[empty_spot].vidbuf, target, empty_spot);
@@ -267,7 +271,6 @@ nouveau_vp3_fill_picparm_mpeg4_vp(struct nouveau_vp3_decoder *dec,
{
struct mpeg4_picparm_vp pic_vp_stub = {}, *pic_vp = &pic_vp_stub;
uint32_t ring, ret = 0x01014; // !async_shutdown << 16 | watchdog << 12 | irq_record << 4 | unk;
assert(!(dec->base.width & 0xf));
*is_ref = desc->vop_coding_type <= 1;
pic_vp->width = dec->base.width;
@@ -463,14 +466,45 @@ void nouveau_vp3_vp_caps(struct nouveau_vp3_decoder *dec, union pipe_desc desc,
case PIPE_VIDEO_FORMAT_MPEG12:
*caps = nouveau_vp3_fill_picparm_mpeg12_vp(dec, desc.mpeg12, refs, is_ref, vp);
nouveau_vp3_handle_references(dec, refs, dec->fence_seq, target);
switch (desc.mpeg12->picture_structure) {
case PIPE_MPEG12_PICTURE_STRUCTURE_FIELD_TOP:
dec->refs[target->valid_ref].decoded_top = 1;
break;
case PIPE_MPEG12_PICTURE_STRUCTURE_FIELD_BOTTOM:
dec->refs[target->valid_ref].decoded_bottom = 1;
break;
default:
dec->refs[target->valid_ref].decoded_top = 1;
dec->refs[target->valid_ref].decoded_bottom = 1;
break;
}
return;
case PIPE_VIDEO_FORMAT_MPEG4:
*caps = nouveau_vp3_fill_picparm_mpeg4_vp(dec, desc.mpeg4, refs, is_ref, vp);
nouveau_vp3_handle_references(dec, refs, dec->fence_seq, target);
// XXX: Correct?
if (!desc.mpeg4->interlaced) {
dec->refs[target->valid_ref].decoded_top = 1;
dec->refs[target->valid_ref].decoded_bottom = 1;
} else if (desc.mpeg4->top_field_first) {
if (!dec->refs[target->valid_ref].decoded_top)
dec->refs[target->valid_ref].decoded_top = 1;
else
dec->refs[target->valid_ref].decoded_bottom = 1;
} else {
if (!dec->refs[target->valid_ref].decoded_bottom)
dec->refs[target->valid_ref].decoded_bottom = 1;
else
dec->refs[target->valid_ref].decoded_top = 1;
}
return;
case PIPE_VIDEO_FORMAT_VC1: {
*caps = nouveau_vp3_fill_picparm_vc1_vp(dec, desc.vc1, refs, is_ref, vp);
nouveau_vp3_handle_references(dec, refs, dec->fence_seq, target);
if (desc.vc1->frame_coding_mode == 3)
debug_printf("Field-Interlaced possibly incorrectly handled\n");
dec->refs[target->valid_ref].decoded_top = 1;
dec->refs[target->valid_ref].decoded_bottom = 1;
return;
}
case PIPE_VIDEO_FORMAT_MPEG4_AVC: {

@@ -585,9 +585,12 @@ nv50_stage_sampler_states_bind(struct nv50_context *nv50, int s,
nv50_screen_tsc_unlock(nv50->screen, old);
}
assert(nv50->num_samplers[s] <= PIPE_MAX_SAMPLERS);
for (; i < nv50->num_samplers[s]; ++i)
if (nv50->samplers[s][i])
for (; i < nv50->num_samplers[s]; ++i) {
if (nv50->samplers[s][i]) {
nv50_screen_tsc_unlock(nv50->screen, nv50->samplers[s][i]);
nv50->samplers[s][i] = NULL;
}
}
nv50->num_samplers[s] = nr;

@@ -54,8 +54,8 @@ nv50_validate_fb(struct nv50_context *nv50)
assert(mt->layout_3d || !array_mode || array_size == 1);
BEGIN_NV04(push, NV50_3D(RT_ADDRESS_HIGH(i)), 5);
PUSH_DATAh(push, bo->offset + sf->offset);
PUSH_DATA (push, bo->offset + sf->offset);
PUSH_DATAh(push, mt->base.address + sf->offset);
PUSH_DATA (push, mt->base.address + sf->offset);
PUSH_DATA (push, nv50_format_table[sf->base.format].rt);
if (likely(nouveau_bo_memtype(bo))) {
PUSH_DATA (push, mt->level[sf->base.u.tex.level].tile_mode);
@@ -97,8 +97,8 @@ nv50_validate_fb(struct nv50_context *nv50)
int unk = mt->base.base.target == PIPE_TEXTURE_3D || sf->depth == 1;
BEGIN_NV04(push, NV50_3D(ZETA_ADDRESS_HIGH), 5);
PUSH_DATAh(push, bo->offset + sf->offset);
PUSH_DATA (push, bo->offset + sf->offset);
PUSH_DATAh(push, mt->base.address + sf->offset);
PUSH_DATA (push, mt->base.address + sf->offset);
PUSH_DATA (push, nv50_format_table[fb->zsbuf->format].rt);
PUSH_DATA (push, mt->level[sf->base.u.tex.level].tile_mode);
PUSH_DATA (push, mt->layer_stride >> 2);
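
Note on the two hunks above: both switch the base of the emitted address from `bo->offset` to `mt->base.address` and emit the sum as a `PUSH_DATAh`/`PUSH_DATA` pair. Assuming, as in nouveau's pushbuf helpers, that `PUSH_DATAh` writes the upper 32 bits of a 64-bit value and `PUSH_DATA` the lower 32 bits, the arithmetic can be sketched as follows. The three function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: absolute address of a surface that starts
 * `offset` bytes into a resource whose own address is `base`. */
uint64_t surface_address(uint64_t base, uint64_t offset)
{
    assert(offset <= UINT64_MAX - base); /* the sum must not wrap around */
    return base + offset;
}

/* Upper 32 bits of the address: the first word of the pair. */
uint32_t address_high(uint64_t addr)
{
    return (uint32_t)(addr >> 32);
}

/* Lower 32 bits of the address: the second word of the pair. */
uint32_t address_low(uint64_t addr)
{
    return (uint32_t)(addr & 0xffffffffu);
}
```

The base matters when a resource does not start at the beginning of its buffer object: in that case the resource address and `bo->offset` differ, and only the former points at the surface.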

@@ -114,8 +114,8 @@ nv50_2d_texture_set(struct nouveau_pushbuf *push, int dst,
PUSH_DATA (push, mt->level[level].pitch);
PUSH_DATA (push, width);
PUSH_DATA (push, height);
PUSH_DATAh(push, bo->offset + offset);
PUSH_DATA (push, bo->offset + offset);
PUSH_DATAh(push, mt->base.address + offset);
PUSH_DATA (push, mt->base.address + offset);
} else {
BEGIN_NV04(push, SUBC_2D(mthd), 5);
PUSH_DATA (push, format);
@@ -126,8 +126,8 @@ nv50_2d_texture_set(struct nouveau_pushbuf *push, int dst,
BEGIN_NV04(push, SUBC_2D(mthd + 0x18), 4);
PUSH_DATA (push, width);
PUSH_DATA (push, height);
PUSH_DATAh(push, bo->offset + offset);
PUSH_DATA (push, bo->offset + offset);
PUSH_DATAh(push, mt->base.address + offset);
PUSH_DATA (push, mt->base.address + offset);
}
#if 0
@@ -299,8 +299,8 @@ nv50_clear_render_target(struct pipe_context *pipe,
BEGIN_NV04(push, NV50_3D(RT_CONTROL), 1);
PUSH_DATA (push, 1);
BEGIN_NV04(push, NV50_3D(RT_ADDRESS_HIGH(0)), 5);
PUSH_DATAh(push, bo->offset + sf->offset);
PUSH_DATA (push, bo->offset + sf->offset);
PUSH_DATAh(push, mt->base.address + sf->offset);
PUSH_DATA (push, mt->base.address + sf->offset);
PUSH_DATA (push, nv50_format_table[dst->format].rt);
PUSH_DATA (push, mt->level[sf->base.u.tex.level].tile_mode);
PUSH_DATA (push, mt->layer_stride >> 2);
@@ -381,8 +381,8 @@ nv50_clear_depth_stencil(struct pipe_context *pipe,
nv50->scissors_dirty |= 1;
BEGIN_NV04(push, NV50_3D(ZETA_ADDRESS_HIGH), 5);
PUSH_DATAh(push, bo->offset + sf->offset);
PUSH_DATA (push, bo->offset + sf->offset);
PUSH_DATAh(push, mt->base.address + sf->offset);
PUSH_DATA (push, mt->base.address + sf->offset);
PUSH_DATA (push, nv50_format_table[dst->format].rt);
PUSH_DATA (push, mt->level[sf->base.u.tex.level].tile_mode);
PUSH_DATA (push, mt->layer_stride >> 2);

@@ -24,6 +24,8 @@ nv50_m2mf_rect_setup(struct nv50_m2mf_rect *rect,
rect->bo = mt->base.bo;
rect->domain = mt->base.domain;
rect->base = mt->level[l].offset;
if (mt->base.bo->offset != mt->base.address)
rect->base += mt->base.address - mt->base.bo->offset;
rect->pitch = mt->level[l].pitch;
if (util_format_is_plain(res->format)) {
rect->width = w << mt->ms_x;

@@ -482,12 +482,14 @@ nv84_create_decoder(struct pipe_context *context,
mip.level[0].pitch = surf.width * 4;
mip.base.domain = NOUVEAU_BO_VRAM;
mip.base.bo = dec->mbring;
mip.base.address = dec->mbring->offset;
context->clear_render_target(context, &surf.base, &color, 0, 0, 64, 4760);
surf.offset = dec->vpring->size / 2 - 0x1000;
surf.width = 1024;
surf.height = 1;
mip.level[0].pitch = surf.width * 4;
mip.base.bo = dec->vpring;
mip.base.address = dec->vpring->offset;
context->clear_render_target(context, &surf.base, &color, 0, 0, 1024, 1);
surf.offset = dec->vpring->size - 0x1000;
context->clear_render_target(context, &surf.base, &color, 0, 0, 1024, 1);
@@ -683,17 +685,14 @@ nv84_video_buffer_create(struct pipe_context *pipe,
bo_size, &cfg, &buffer->full))
goto error;
mt0->base.bo = buffer->interlaced;
nouveau_bo_ref(buffer->interlaced, &mt0->base.bo);
mt0->base.domain = NOUVEAU_BO_VRAM;
mt0->base.offset = 0;
mt0->base.address = buffer->interlaced->offset + mt0->base.offset;
nouveau_bo_ref(buffer->interlaced, &empty);
mt0->base.address = buffer->interlaced->offset;
mt1->base.bo = buffer->interlaced;
nouveau_bo_ref(buffer->interlaced, &mt1->base.bo);
mt1->base.domain = NOUVEAU_BO_VRAM;
mt1->base.offset = mt0->layer_stride * 2;
mt1->base.address = buffer->interlaced->offset + mt1->base.offset;
nouveau_bo_ref(buffer->interlaced, &empty);
mt1->base.offset = mt0->total_size;
mt1->base.address = buffer->interlaced->offset + mt0->total_size;
memset(&sv_templ, 0, sizeof(sv_templ));
for (component = 0, i = 0; i < 2; ++i ) {

@@ -42,8 +42,8 @@ nv98_decoder_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc,
struct nouveau_pushbuf *push = dec->pushbuf[0];
enum pipe_video_format codec = u_reduce_video_profile(dec->base.profile);
uint32_t bsp_addr, comm_addr, inter_addr;
uint32_t slice_size, bucket_size, ring_size;
uint32_t caps;
uint32_t slice_size, bucket_size, ring_size, bsp_size;
uint32_t caps, i;
int ret;
struct nouveau_bo *bsp_bo = dec->bsp_bo[comm_seq % NOUVEAU_VP3_VIDEO_QDEPTH];
struct nouveau_bo *inter_bo = dec->inter_bo[comm_seq & 1];
@@ -65,6 +65,41 @@ nv98_decoder_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc,
fence_extra = 4;
#endif
bsp_size = NOUVEAU_VP3_BSP_RESERVED_SIZE;
for (i = 0; i < num_buffers; i++)
bsp_size += num_bytes[i];
bsp_size += 256; /* the 4 end markers */
if (!bsp_bo || bsp_size > bsp_bo->size) {
struct nouveau_bo *tmp_bo = NULL;
/* round up to the nearest mb */
bsp_size += (1 << 20) - 1;
bsp_size &= ~((1 << 20) - 1);
ret = nouveau_bo_new(dec->bitplane_bo->device, NOUVEAU_BO_VRAM, 0, bsp_size, NULL, &tmp_bo);
if (ret) {
debug_printf("reallocating bsp %u -> %u failed with %i\n",
bsp_bo ? (unsigned)bsp_bo->size : 0, bsp_size, ret);
return -1;
}
nouveau_bo_ref(NULL, &bsp_bo);
bo_refs[0].bo = dec->bsp_bo[comm_seq % NOUVEAU_VP3_VIDEO_QDEPTH] = bsp_bo = tmp_bo;
}
if (!inter_bo || bsp_bo->size * 4 > inter_bo->size) {
struct nouveau_bo *tmp_bo = NULL;
ret = nouveau_bo_new(dec->bitplane_bo->device, NOUVEAU_BO_VRAM, 0, bsp_bo->size * 4, NULL, &tmp_bo);
if (ret) {
debug_printf("reallocating inter %u -> %u failed with %i\n",
inter_bo ? (unsigned)inter_bo->size : 0, (unsigned)bsp_bo->size * 4, ret);
return -1;
}
nouveau_bo_ref(NULL, &inter_bo);
bo_refs[1].bo = dec->inter_bo[comm_seq & 1] = inter_bo = tmp_bo;
}
ret = nouveau_bo_map(bsp_bo, NOUVEAU_BO_WR, dec->client);
if (ret) {
debug_printf("map failed: %i %s\n", ret, strerror(-ret));
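
Note on the reallocation path in the hunk above: the required size is the reserved header plus all bitstream chunks plus 256 bytes for the end markers, and it is rounded up to a whole megabyte with the usual add-then-mask idiom. A minimal sketch of that rounding, with the hypothetical name `round_up_pow2`:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round size up to the next multiple of align.
 * align must be a power of two, as 1 << 20 is in the hunk. */
uint32_t round_up_pow2(uint32_t size, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* Adding align - 1 pushes any partial block into the next one;
     * the mask then drops the remainder. */
    return (size + align - 1) & ~(align - 1);
}
```

A size that is already a multiple of the alignment is returned unchanged, so a buffer that fits exactly is not grown by an extra megabyte.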

@@ -59,7 +59,6 @@ static void dump_comm_vp(struct nouveau_vp3_decoder *dec, struct comm *comm, u32
static void
nv98_decoder_kick_ref(struct nouveau_vp3_decoder *dec, struct nouveau_vp3_video_buffer *target)
{
dec->refs[target->valid_ref].vidbuf = NULL;
dec->refs[target->valid_ref].last_used = 0;
// debug_printf("Unreffed %p\n", target);
}

@@ -261,7 +261,6 @@ nvc0_miptree_create(struct pipe_screen *pscreen,
if (pt->usage == PIPE_USAGE_STAGING) {
switch (pt->target) {
case PIPE_TEXTURE_1D:
case PIPE_TEXTURE_2D:
case PIPE_TEXTURE_RECT:
if (pt->last_level == 0 &&

@@ -173,16 +173,12 @@ nvc0_create_decoder(struct pipe_context *context,
ret = nouveau_bo_new(screen->device, NOUVEAU_BO_VRAM,
0x100, 4 << 20, &cfg, &dec->inter_bo[0]);
if (!ret) {
if (!kepler)
nouveau_bo_ref(dec->inter_bo[0], &dec->inter_bo[1]);
else
ret = nouveau_bo_new(screen->device, NOUVEAU_BO_VRAM,
0x100, dec->inter_bo[0]->size, &cfg,
&dec->inter_bo[1]);
ret = nouveau_bo_new(screen->device, NOUVEAU_BO_VRAM,
0x100, dec->inter_bo[0]->size, &cfg,
&dec->inter_bo[1]);
}
if (ret)
goto fail;
switch (u_reduce_video_profile(templ->profile)) {
case PIPE_VIDEO_FORMAT_MPEG12: {
codec = 1;

@@ -42,8 +42,8 @@ nvc0_decoder_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc,
struct nouveau_pushbuf *push = dec->pushbuf[0];
enum pipe_video_format codec = u_reduce_video_profile(dec->base.profile);
uint32_t bsp_addr, comm_addr, inter_addr;
uint32_t slice_size, bucket_size, ring_size;
uint32_t caps;
uint32_t slice_size, bucket_size, ring_size, bsp_size;
uint32_t caps, i;
int ret;
struct nouveau_bo *bsp_bo = dec->bsp_bo[comm_seq % NOUVEAU_VP3_VIDEO_QDEPTH];
struct nouveau_bo *inter_bo = dec->inter_bo[comm_seq & 1];
@@ -65,6 +65,49 @@ nvc0_decoder_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc,
fence_extra = 4;
#endif
bsp_size = NOUVEAU_VP3_BSP_RESERVED_SIZE;
for (i = 0; i < num_buffers; i++)
bsp_size += num_bytes[i];
bsp_size += 256; /* the 4 end markers */
if (!bsp_bo || bsp_size > bsp_bo->size) {
union nouveau_bo_config cfg;
struct nouveau_bo *tmp_bo = NULL;
cfg.nvc0.tile_mode = 0x10;
cfg.nvc0.memtype = 0xfe;
/* round up to the nearest mb */
bsp_size += (1 << 20) - 1;
bsp_size &= ~((1 << 20) - 1);
ret = nouveau_bo_new(dec->bitplane_bo->device, NOUVEAU_BO_VRAM, 0, bsp_size, &cfg, &tmp_bo);
if (ret) {
debug_printf("reallocating bsp %u -> %u failed with %i\n",
bsp_bo ? (unsigned)bsp_bo->size : 0, bsp_size, ret);
return -1;
}
nouveau_bo_ref(NULL, &bsp_bo);
bo_refs[0].bo = dec->bsp_bo[comm_seq % NOUVEAU_VP3_VIDEO_QDEPTH] = bsp_bo = tmp_bo;
}
if (!inter_bo || bsp_bo->size * 4 > inter_bo->size) {
union nouveau_bo_config cfg;
struct nouveau_bo *tmp_bo = NULL;
cfg.nvc0.tile_mode = 0x10;
cfg.nvc0.memtype = 0xfe;
ret = nouveau_bo_new(dec->bitplane_bo->device, NOUVEAU_BO_VRAM, 0, bsp_bo->size * 4, &cfg, &tmp_bo);
if (ret) {
debug_printf("reallocating inter %u -> %u failed with %i\n",
inter_bo ? (unsigned)inter_bo->size : 0, (unsigned)bsp_bo->size * 4, ret);
return -1;
}
nouveau_bo_ref(NULL, &inter_bo);
bo_refs[1].bo = dec->inter_bo[comm_seq & 1] = inter_bo = tmp_bo;
}
ret = nouveau_bo_map(bsp_bo, NOUVEAU_BO_WR, dec->client);
if (ret) {
debug_printf("map failed: %i %s\n", ret, strerror(-ret));

@@ -59,7 +59,6 @@ static void dump_comm_vp(struct nouveau_vp3_decoder *dec, struct comm *comm, u32
static void
nvc0_decoder_kick_ref(struct nouveau_vp3_decoder *dec, struct nouveau_vp3_video_buffer *target)
{
dec->refs[target->valid_ref].vidbuf = NULL;
dec->refs[target->valid_ref].last_used = 0;
// debug_printf("Unreffed %p\n", target);
}

@@ -572,14 +572,16 @@ static void do_advanced_regalloc(struct regalloc_state * s)
graph = ra_alloc_interference_graph(ra_state->regs,
node_count + s->NumInputs);
for (node_index = 0; node_index < node_count; node_index++) {
ra_set_node_class(graph, node_index, node_classes[node_index]);
}
/* Build the interference graph */
for (var_ptr = variables, node_index = 0; var_ptr;
var_ptr = var_ptr->Next,node_index++) {
struct rc_list * a, * b;
unsigned int b_index;
ra_set_node_class(graph, node_index, node_classes[node_index]);
for (a = var_ptr, b = var_ptr->Next, b_index = node_index + 1;
b; b = b->Next, b_index++) {
struct rc_variable * var_a = a->Item;

@@ -440,7 +440,8 @@ static void r600_clear(struct pipe_context *ctx, unsigned buffers,
}
r600_blitter_begin(ctx, R600_CLEAR);
util_blitter_clear(rctx->blitter, fb->width, fb->height, 1,
util_blitter_clear(rctx->blitter, fb->width, fb->height,
util_framebuffer_get_num_layers(fb),
buffers, color, depth, stencil);
r600_blitter_end(ctx);

@@ -1245,12 +1245,6 @@ static bool r600_update_derived_state(struct r600_context *rctx)
}
}
if (rctx->b.chip_class >= EVERGREEN) {
evergreen_update_db_shader_control(rctx);
} else {
r600_update_db_shader_control(rctx);
}
if (unlikely(!ps_dirty && rctx->ps_shader && rctx->rasterizer &&
((rctx->rasterizer->sprite_coord_enable != rctx->ps_shader->current->sprite_coord_enable) ||
(rctx->rasterizer->flatshade != rctx->ps_shader->current->flatshade)))) {
@@ -1264,6 +1258,12 @@ static bool r600_update_derived_state(struct r600_context *rctx)
update_shader_atom(ctx, &rctx->pixel_shader, rctx->ps_shader->current);
}
if (rctx->b.chip_class >= EVERGREEN) {
evergreen_update_db_shader_control(rctx);
} else {
r600_update_db_shader_control(rctx);
}
/* on R600 we stuff masks + txq info into one constant buffer */
/* on evergreen we only need a txq info one */
if (rctx->b.chip_class < EVERGREEN) {

@@ -807,12 +807,40 @@ void r600_suspend_nontimer_queries(struct r600_common_context *ctx)
assert(ctx->num_cs_dw_nontimer_queries_suspend == 0);
}
static unsigned r600_queries_num_cs_dw_for_resuming(struct r600_common_context *ctx)
{
struct r600_query *query;
unsigned num_dw = 0;
LIST_FOR_EACH_ENTRY(query, &ctx->active_nontimer_queries, list) {
/* begin + end */
num_dw += query->num_cs_dw * 2;
/* Workaround for the fact that
* num_cs_dw_nontimer_queries_suspend is incremented for every
* resumed query, which raises the bar in need_cs_space for
* queries about to be resumed.
*/
num_dw += query->num_cs_dw;
}
/* primitives generated query */
num_dw += ctx->streamout.enable_atom.num_dw;
/* guess for ZPASS enable or PERFECT_ZPASS_COUNT enable updates */
num_dw += 13;
return num_dw;
}
void r600_resume_nontimer_queries(struct r600_common_context *ctx)
{
struct r600_query *query;
assert(ctx->num_cs_dw_nontimer_queries_suspend == 0);
/* Check CS space here. Resuming must not be interrupted by flushes. */
ctx->need_gfx_cs_space(&ctx->b,
r600_queries_num_cs_dw_for_resuming(ctx), TRUE);
LIST_FOR_EACH_ENTRY(query, &ctx->active_nontimer_queries, list) {
r600_emit_query_begin(ctx, query);
}
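
Note on `r600_queries_num_cs_dw_for_resuming` above: the estimate is three times each active query's dword count (begin, end, and the headroom described in the workaround comment), plus the streamout enable atom, plus a fixed guess of 13. A sketch of the same sum over a plain array; `resume_dw_estimate` and its parameters are hypothetical and replace the driver's list walk for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the estimate in the hunk: query_dw[i] plays the
 * role of query->num_cs_dw and streamout_dw that of
 * ctx->streamout.enable_atom.num_dw. */
unsigned resume_dw_estimate(const unsigned *query_dw, size_t count,
                            unsigned streamout_dw)
{
    unsigned num_dw = 0;
    size_t i;

    assert(query_dw != NULL || count == 0);
    for (i = 0; i < count; ++i) {
        num_dw += query_dw[i] * 2; /* begin + end */
        num_dw += query_dw[i];     /* headroom for the suspend counter */
    }
    num_dw += streamout_dw;        /* primitives generated query */
    num_dw += 13;                  /* ZPASS-related state updates (a guess) */
    return num_dw;
}
```

Reserving the whole amount before the first `r600_emit_query_begin` is what keeps a flush from landing in the middle of the resume loop.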

@@ -251,8 +251,11 @@ int rvid_get_video_param(struct pipe_screen *screen,
profile != PIPE_VIDEO_PROFILE_VC1_MAIN;
case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
case PIPE_VIDEO_CAP_SUPPORTS_INTERLACED:
/* and MPEG2 only with shaders */
return codec != PIPE_VIDEO_FORMAT_MPEG12;
/* MPEG2 only with shaders and no support for
interlacing on R6xx style UVD */
return codec != PIPE_VIDEO_FORMAT_MPEG12 &&
/* TODO: RV770 might actually work */
rscreen->family > CHIP_RV770;
default:
break;
}

@@ -39,6 +39,8 @@ static void si_destroy_context(struct pipe_context *context)
si_release_all_descriptors(sctx);
pipe_resource_reference(&sctx->esgs_ring.buffer, NULL);
pipe_resource_reference(&sctx->gsvs_ring.buffer, NULL);
pipe_resource_reference(&sctx->null_const_buf.buffer, NULL);
r600_resource_reference(&sctx->border_color_table, NULL);

@@ -2894,5 +2894,9 @@ out:
void si_pipe_shader_destroy(struct pipe_context *ctx, struct si_pipe_shader *shader)
{
if (shader->gs_copy_shader)
si_pipe_shader_destroy(ctx, shader->gs_copy_shader);
r600_resource_reference(&shader->bo, NULL);
r600_resource_reference(&shader->scratch_bo, NULL);
}

@@ -2316,9 +2316,10 @@ static void si_delete_shader_selector(struct pipe_context *ctx,
while (p) {
c = p->next_variant;
if (sel->type == PIPE_SHADER_GEOMETRY)
if (sel->type == PIPE_SHADER_GEOMETRY) {
si_pm4_delete_state(sctx, gs, p->pm4);
else if (sel->type == PIPE_SHADER_FRAGMENT)
si_pm4_delete_state(sctx, vs, p->gs_copy_shader->pm4);
} else if (sel->type == PIPE_SHADER_FRAGMENT)
si_pm4_delete_state(sctx, ps, p->pm4);
else if (p->key.vs.as_es)
si_pm4_delete_state(sctx, es, p->pm4);
@@ -2331,7 +2332,7 @@ static void si_delete_shader_selector(struct pipe_context *ctx,
free(sel->tokens);
free(sel);
}
}
static void si_delete_vs_shader(struct pipe_context *ctx, void *state)
{

@@ -29,14 +29,16 @@ using namespace clover;
memory_obj::memory_obj(clover::context &ctx, cl_mem_flags flags,
size_t size, void *host_ptr) :
context(ctx), _flags(flags),
_size(size), _host_ptr(host_ptr),
_destroy_notify([]{}) {
_size(size), _host_ptr(host_ptr) {
if (flags & (CL_MEM_COPY_HOST_PTR | CL_MEM_USE_HOST_PTR))
data.append((char *)host_ptr, size);
}
memory_obj::~memory_obj() {
_destroy_notify();
while (_destroy_notify.size()) {
_destroy_notify.top()();
_destroy_notify.pop();
}
}
bool
@@ -46,7 +48,7 @@ memory_obj::operator==(const memory_obj &obj) const {
void
memory_obj::destroy_notify(std::function<void ()> f) {
_destroy_notify = f;
_destroy_notify.push(f);
}
cl_mem_flags
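
Note on the clover hunks above: a single `std::function` is replaced by a `std::stack`, so every registered callback is kept and the destructor runs them last-registered first. The sketch below shows the same last-in, first-out behaviour in plain C rather than clover's C++; the fixed capacity and all names (`notify_stack`, `notify_push`, `notify_run_all`, `record_call`) are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CALLBACKS 8  /* arbitrary capacity for this sketch */

typedef void (*notify_fn)(void *user_data);

struct notify_stack {
    notify_fn fn[MAX_CALLBACKS];
    void *user_data[MAX_CALLBACKS];
    size_t count;
};

/* Register a callback; returns 0 on success, -1 when the stack is full. */
int notify_push(struct notify_stack *s, notify_fn fn, void *user_data)
{
    assert(s != NULL && fn != NULL);
    if (s->count == MAX_CALLBACKS)
        return -1;
    s->fn[s->count] = fn;
    s->user_data[s->count] = user_data;
    s->count++;
    return 0;
}

/* Run every callback, most recently registered first, emptying the stack. */
void notify_run_all(struct notify_stack *s)
{
    assert(s != NULL);
    while (s->count) {
        s->count--;
        s->fn[s->count](s->user_data[s->count]);
    }
}

/* Example callback: log the integer that user_data points to. */
int call_log[MAX_CALLBACKS];
size_t call_log_len;

void record_call(void *user_data)
{
    call_log[call_log_len++] = *(int *)user_data;
}
```

Registering two callbacks and then calling `notify_run_all` invokes the second one first, which mirrors the `top()`/`pop()` loop in the destructor.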

@@ -26,6 +26,7 @@
#include <functional>
#include <map>
#include <memory>
#include <stack>
#include "core/object.hpp"
#include "core/queue.hpp"
@@ -61,7 +62,7 @@ namespace clover {
cl_mem_flags _flags;
size_t _size;
void *_host_ptr;
std::function<void ()> _destroy_notify;
std::stack<std::function<void ()>> _destroy_notify;
protected:
std::string data;

@@ -1328,6 +1328,7 @@ dri_kms_init_screen(__DRIscreen * sPriv)
const __DRIconfig **configs;
struct dri_screen *screen;
struct pipe_screen *pscreen = NULL;
uint64_t cap;
screen = CALLOC_STRUCT(dri_screen);
if (!screen)
@@ -1339,6 +1340,13 @@ dri_kms_init_screen(__DRIscreen * sPriv)
sPriv->driverPrivate = (void *)screen;
pscreen = kms_swrast_create_screen(screen->fd);
if (drmGetCap(sPriv->fd, DRM_CAP_PRIME, &cap) == 0 &&
(cap & DRM_PRIME_CAP_IMPORT)) {
dri2ImageExtension.createImageFromFds = dri2_from_fds;
dri2ImageExtension.createImageFromDmaBufs = dri2_from_dma_bufs;
}
sPriv->extensions = dri_screen_extensions;
/* dri_init_screen_helper checks pscreen for us */

@@ -227,37 +227,6 @@ dri_fill_in_modes(struct dri_screen *screen)
return (const __DRIconfig **)configs;
}
/* The Gallium way to force MSAA. */
DEBUG_GET_ONCE_NUM_OPTION(msaa, "GALLIUM_MSAA", 0);
/* The NVIDIA way to force MSAA. The same variable is used by the NVIDIA
* driver. */
DEBUG_GET_ONCE_NUM_OPTION(msaa_nv, "__GL_FSAA_MODE", 0);
static void
dri_force_msaa_visual(struct st_visual *stvis,
struct pipe_screen *screen)
{
int i;
int samples = debug_get_option_msaa();
if (!samples)
samples = debug_get_option_msaa_nv();
if (samples <= 1)
return; /* nothing to do */
/* Choose a supported sample count greater than or equal to samples. */
for (i = samples; i <= MSAA_VISUAL_MAX_SAMPLES; i++) {
if (screen->is_format_supported(screen, stvis->color_format,
PIPE_TEXTURE_2D, i,
PIPE_BIND_RENDER_TARGET)) {
stvis->samples = i;
break;
}
}
}
/**
* Roughly the converse of dri_fill_in_modes.
*/
@@ -282,10 +251,6 @@ dri_fill_st_visual(struct st_visual *stvis, struct dri_screen *screen,
if (mode->sampleBuffers) {
stvis->samples = mode->samples;
}
else {
/* This must be done after stvis->color_format is set. */
dri_force_msaa_visual(stvis, screen->base.screen);
}
switch (mode->depthBits) {
default:

@@ -42,6 +42,8 @@ vdp_imp_device_create_x11(Display *display, int screen, VdpDevice *device,
VdpGetProcAddress **get_proc_address)
{
struct pipe_screen *pscreen;
struct pipe_resource *res, res_tmpl;
struct pipe_sampler_view sv_tmpl;
vlVdpDevice *dev = NULL;
VdpStatus ret;
@@ -79,6 +81,43 @@ vdp_imp_device_create_x11(Display *display, int screen, VdpDevice *device,
goto no_context;
}
memset(&res_tmpl, 0, sizeof(res_tmpl));
res_tmpl.target = PIPE_TEXTURE_2D;
res_tmpl.format = PIPE_FORMAT_R8G8B8A8_UNORM;
res_tmpl.width0 = 1;
res_tmpl.height0 = 1;
res_tmpl.depth0 = 1;
res_tmpl.array_size = 1;
res_tmpl.bind = PIPE_BIND_SAMPLER_VIEW;
res_tmpl.usage = PIPE_USAGE_DEFAULT;
if (!CheckSurfaceParams(pscreen, &res_tmpl)) {
ret = VDP_STATUS_NO_IMPLEMENTATION;
goto no_resource;
}
res = pscreen->resource_create(pscreen, &res_tmpl);
if (!res) {
ret = VDP_STATUS_RESOURCES;
goto no_resource;
}
memset(&sv_tmpl, 0, sizeof(sv_tmpl));
u_sampler_view_default_template(&sv_tmpl, res, res->format);
sv_tmpl.swizzle_r = PIPE_SWIZZLE_ONE;
sv_tmpl.swizzle_g = PIPE_SWIZZLE_ONE;
sv_tmpl.swizzle_b = PIPE_SWIZZLE_ONE;
sv_tmpl.swizzle_a = PIPE_SWIZZLE_ONE;
dev->dummy_sv = dev->context->create_sampler_view(dev->context, res, &sv_tmpl);
pipe_resource_reference(&res, NULL);
if (!dev->dummy_sv) {
ret = VDP_STATUS_RESOURCES;
goto no_resource;
}
*device = vlAddDataHTAB(dev);
if (*device == 0) {
ret = VDP_STATUS_ERROR;
@@ -93,8 +132,9 @@ vdp_imp_device_create_x11(Display *display, int screen, VdpDevice *device,
return VDP_STATUS_OK;
no_handle:
pipe_sampler_view_reference(&dev->dummy_sv, NULL);
no_resource:
dev->context->destroy(dev->context);
/* Destroy vscreen */
no_context:
vl_screen_destroy(dev->vscreen);
no_vscreen:
@@ -185,6 +225,7 @@ vlVdpDeviceFree(vlVdpDevice *dev)
{
pipe_mutex_destroy(dev->mutex);
vl_compositor_cleanup(&dev->compositor);
pipe_sampler_view_reference(&dev->dummy_sv, NULL);
dev->context->destroy(dev->context);
vl_screen_destroy(dev->vscreen);
FREE(dev);

@@ -624,9 +624,9 @@ vlVdpOutputSurfaceRenderOutputSurface(VdpOutputSurface destination_surface,
uint32_t flags)
{
vlVdpOutputSurface *dst_vlsurface;
vlVdpOutputSurface *src_vlsurface;
struct pipe_context *context;
struct pipe_sampler_view *src_sv;
struct vl_compositor *compositor;
struct vl_compositor_state *cstate;
@@ -639,12 +639,19 @@ vlVdpOutputSurfaceRenderOutputSurface(VdpOutputSurface destination_surface,
if (!dst_vlsurface)
return VDP_STATUS_INVALID_HANDLE;
src_vlsurface = vlGetDataHTAB(source_surface);
if (!src_vlsurface)
return VDP_STATUS_INVALID_HANDLE;
if (source_surface == VDP_INVALID_HANDLE) {
src_sv = dst_vlsurface->device->dummy_sv;
if (dst_vlsurface->device != src_vlsurface->device)
return VDP_STATUS_HANDLE_DEVICE_MISMATCH;
} else {
vlVdpOutputSurface *src_vlsurface = vlGetDataHTAB(source_surface);
if (!src_vlsurface)
return VDP_STATUS_INVALID_HANDLE;
if (dst_vlsurface->device != src_vlsurface->device)
return VDP_STATUS_HANDLE_DEVICE_MISMATCH;
src_sv = src_vlsurface->sampler_view;
}
pipe_mutex_lock(dst_vlsurface->device->mutex);
vlVdpResolveDelayedRendering(dst_vlsurface->device, NULL, NULL);
@@ -657,7 +664,7 @@ vlVdpOutputSurfaceRenderOutputSurface(VdpOutputSurface destination_surface,
vl_compositor_clear_layers(cstate);
vl_compositor_set_layer_blend(cstate, 0, blend, false);
vl_compositor_set_rgba_layer(cstate, compositor, 0, src_vlsurface->sampler_view,
vl_compositor_set_rgba_layer(cstate, compositor, 0, src_sv,
RectToPipe(source_rect, &src_rect), NULL,
ColorsToPipe(colors, flags, vlcolors));
STATIC_ASSERT(VL_COMPOSITOR_ROTATE_0 == VDP_OUTPUT_SURFACE_RENDER_ROTATE_0);
@@ -688,9 +695,9 @@ vlVdpOutputSurfaceRenderBitmapSurface(VdpOutputSurface destination_surface,
uint32_t flags)
{
vlVdpOutputSurface *dst_vlsurface;
vlVdpBitmapSurface *src_vlsurface;
struct pipe_context *context;
struct pipe_sampler_view *src_sv;
struct vl_compositor *compositor;
struct vl_compositor_state *cstate;
@@ -703,12 +710,19 @@ vlVdpOutputSurfaceRenderBitmapSurface(VdpOutputSurface destination_surface,
if (!dst_vlsurface)
return VDP_STATUS_INVALID_HANDLE;
src_vlsurface = vlGetDataHTAB(source_surface);
if (!src_vlsurface)
return VDP_STATUS_INVALID_HANDLE;
if (source_surface == VDP_INVALID_HANDLE) {
src_sv = dst_vlsurface->device->dummy_sv;
if (dst_vlsurface->device != src_vlsurface->device)
return VDP_STATUS_HANDLE_DEVICE_MISMATCH;
} else {
vlVdpBitmapSurface *src_vlsurface = vlGetDataHTAB(source_surface);
if (!src_vlsurface)
return VDP_STATUS_INVALID_HANDLE;
if (dst_vlsurface->device != src_vlsurface->device)
return VDP_STATUS_HANDLE_DEVICE_MISMATCH;
src_sv = src_vlsurface->sampler_view;
}
context = dst_vlsurface->device->context;
compositor = &dst_vlsurface->device->compositor;
@@ -721,7 +735,7 @@ vlVdpOutputSurfaceRenderBitmapSurface(VdpOutputSurface destination_surface,
vl_compositor_clear_layers(cstate);
vl_compositor_set_layer_blend(cstate, 0, blend, false);
vl_compositor_set_rgba_layer(cstate, compositor, 0, src_vlsurface->sampler_view,
vl_compositor_set_rgba_layer(cstate, compositor, 0, src_sv,
RectToPipe(source_rect, &src_rect), NULL,
ColorsToPipe(colors, flags, vlcolors));
vl_compositor_set_layer_rotation(cstate, 0, flags & 3);

@@ -348,6 +348,7 @@ typedef struct
struct vl_screen *vscreen;
struct pipe_context *context;
struct vl_compositor compositor;
struct pipe_sampler_view *dummy_sv;
pipe_mutex mutex;
struct {

@@ -530,11 +530,22 @@ renderer_draw_yuv(struct xa_context *r,
src_x, src_y, src_w, src_h,
dst_x, dst_y, dst_w, dst_h, srf);
if (!r->scissor_valid) {
r->scissor.minx = 0;
r->scissor.miny = 0;
r->scissor.maxx = r->dst->tex->width0;
r->scissor.maxy = r->dst->tex->height0;
}
r->pipe->set_scissor_states(r->pipe, 0, 1, &r->scissor);
cso_set_vertex_elements(r->cso, num_attribs, r->velems);
util_draw_user_vertex_buffer(r->cso, r->buffer, PIPE_PRIM_QUADS,
4, /* verts */
num_attribs); /* attribs/vert */
r->buffer_size = 0;
xa_scissor_reset(r);
}
void

@@ -146,6 +146,7 @@ xa_yuv_planar_blit(struct xa_context *r,
int w = box->x2 - box->x1;
int h = box->y2 - box->y1;
xa_scissor_update(r, x, y, box->x2, box->y2);
renderer_draw_yuv(r,
(float)src_x + scale_x * (x - dst_x),
(float)src_y + scale_y * (y - dst_y),

@@ -26,7 +26,6 @@ gallium_dri_la_LDFLAGS = \
-shrext .so \
-module \
-avoid-version \
-Wl,--dynamic-list=$(top_srcdir)/src/gallium/targets/dri-vdpau.dyn \
$(GC_SECTIONS)
if HAVE_LD_VERSION_SCRIPT
@@ -34,6 +33,11 @@ gallium_dri_la_LDFLAGS += \
-Wl,--version-script=$(top_srcdir)/src/gallium/targets/dri/dri.sym
endif # HAVE_LD_VERSION_SCRIPT
if HAVE_LD_DYNAMIC_LIST
gallium_dri_la_LDFLAGS += \
-Wl,--dynamic-list=$(top_srcdir)/src/gallium/targets/dri-vdpau.dyn
endif # HAVE_LD_DYNAMIC_LIST
gallium_dri_la_LIBADD = \
$(top_builddir)/src/mesa/libmesagallium.la \
$(top_builddir)/src/mesa/drivers/dri/common/libdricommon.la \

@@ -15,7 +15,6 @@ libvdpau_gallium_la_LDFLAGS = \
-module \
-no-undefined \
-version-number $(VDPAU_MAJOR):$(VDPAU_MINOR) \
-Wl,--dynamic-list=$(top_srcdir)/src/gallium/targets/dri-vdpau.dyn \
$(GC_SECTIONS) \
$(LD_NO_UNDEFINED)
@@ -24,6 +23,11 @@ libvdpau_gallium_la_LDFLAGS += \
-Wl,--version-script=$(top_srcdir)/src/gallium/targets/vdpau/vdpau.sym
endif # HAVE_LD_VERSION_SCRIPT
if HAVE_LD_DYNAMIC_LIST
libvdpau_gallium_la_LDFLAGS += \
-Wl,--dynamic-list=$(top_srcdir)/src/gallium/targets/dri-vdpau.dyn
endif # HAVE_LD_DYNAMIC_LIST
libvdpau_gallium_la_LIBADD = \
$(top_builddir)/src/gallium/state_trackers/vdpau/libvdpautracker.la \
$(top_builddir)/src/gallium/auxiliary/libgallium.la \

@@ -238,7 +238,7 @@ out_mip:
static struct svga_winsys_surface *
vmw_drm_surface_from_handle(struct svga_winsys_screen *sws,
struct winsys_handle *whandle,
struct winsys_handle *whandle,
SVGA3dSurfaceFormat *format)
{
struct vmw_svga_winsys_surface *vsrf;
@@ -248,7 +248,8 @@ vmw_drm_surface_from_handle(struct svga_winsys_screen *sws,
struct drm_vmw_surface_arg *req = &arg.req;
struct drm_vmw_surface_create_req *rep = &arg.rep;
uint32_t handle = 0;
SVGA3dSize size;
struct drm_vmw_size size;
SVGA3dSize base_size;
int ret;
int i;
@@ -274,7 +275,7 @@ vmw_drm_surface_from_handle(struct svga_winsys_screen *sws,
memset(&arg, 0, sizeof(arg));
req->sid = handle;
rep->size_addr = (size_t)&size;
rep->size_addr = (unsigned long)&size;
ret = drmCommandWriteRead(vws->ioctl.drm_fd, DRM_VMW_REF_SURFACE,
&arg, sizeof(arg));
@@ -324,7 +325,11 @@ vmw_drm_surface_from_handle(struct svga_winsys_screen *sws,
*format = rep->format;
/* Estimate usage, for early flushing. */
vsrf->size = svga3dsurface_get_serialized_size(rep->format, size,
base_size.width = size.width;
base_size.height = size.height;
base_size.depth = size.depth;
vsrf->size = svga3dsurface_get_serialized_size(rep->format, base_size,
rep->mip_levels[0],
FALSE);

@@ -38,6 +38,7 @@
#include <sys/mman.h>
#include <unistd.h>
#include <dlfcn.h>
#include <fcntl.h>
#include <xf86drm.h>
#include "pipe/p_compiler.h"
@@ -121,7 +122,7 @@ kms_sw_displaytarget_create(struct sw_winsys *ws,
int ret;
kms_sw_dt = CALLOC_STRUCT(kms_sw_displaytarget);
if(!kms_sw_dt)
if (!kms_sw_dt)
goto no_dt;
kms_sw_dt->ref_count = 1;
@@ -210,6 +211,38 @@ kms_sw_displaytarget_map(struct sw_winsys *ws,
return kms_sw_dt->mapped;
}
static struct kms_sw_displaytarget *
kms_sw_displaytarget_add_from_prime(struct kms_sw_winsys *kms_sw, int fd)
{
uint32_t handle = -1;
struct kms_sw_displaytarget * kms_sw_dt;
int ret;
ret = drmPrimeFDToHandle(kms_sw->fd, fd, &handle);
if (ret)
return NULL;
kms_sw_dt = CALLOC_STRUCT(kms_sw_displaytarget);
if (!kms_sw_dt)
return NULL;
kms_sw_dt->ref_count = 1;
kms_sw_dt->handle = handle;
kms_sw_dt->size = lseek(fd, 0, SEEK_END);
if (kms_sw_dt->size == (off_t)-1) {
FREE(kms_sw_dt);
return NULL;
}
lseek(fd, 0, SEEK_SET);
list_add(&kms_sw_dt->link, &kms_sw->bo_list);
return kms_sw_dt;
}
static void
kms_sw_displaytarget_unmap(struct sw_winsys *ws,
struct sw_displaytarget *dt)
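
Note on `kms_sw_displaytarget_add_from_prime` in the hunk above: the buffer size is obtained by seeking to the end of the imported file descriptor and then rewinding. The same pattern in isolation, as a sketch; `fd_size` is a hypothetical name and the function works on any seekable descriptor.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the size of the object behind fd, or
 * (off_t)-1 if the descriptor cannot be seeked. On success the file
 * offset is reset to 0, as in the hunk. */
off_t fd_size(int fd)
{
    off_t size;

    assert(fd >= 0);
    size = lseek(fd, 0, SEEK_END);
    if (size == (off_t)-1)
        return (off_t)-1;
    lseek(fd, 0, SEEK_SET);
    return size;
}
```

A descriptor that does not support seeking, such as a pipe, makes `lseek` fail, which is the case the hunk handles by freeing the new display target and returning `NULL`.
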
@@ -231,17 +264,34 @@ kms_sw_displaytarget_from_handle(struct sw_winsys *ws,
struct kms_sw_winsys *kms_sw = kms_sw_winsys(ws);
struct kms_sw_displaytarget *kms_sw_dt;
assert(whandle->type == DRM_API_HANDLE_TYPE_KMS);
assert(whandle->type == DRM_API_HANDLE_TYPE_KMS ||
whandle->type == DRM_API_HANDLE_TYPE_FD);
LIST_FOR_EACH_ENTRY(kms_sw_dt, &kms_sw->bo_list, link) {
if (kms_sw_dt->handle == whandle->handle) {
switch(whandle->type) {
case DRM_API_HANDLE_TYPE_FD:
kms_sw_dt = kms_sw_displaytarget_add_from_prime(kms_sw, whandle->handle);
if (kms_sw_dt) {
kms_sw_dt->ref_count++;
DEBUG("KMS-DEBUG: imported buffer %u (size %u)\n", kms_sw_dt->handle, kms_sw_dt->size);
kms_sw_dt->width = templ->width0;
kms_sw_dt->height = templ->height0;
kms_sw_dt->stride = whandle->stride;
*stride = kms_sw_dt->stride;
return (struct sw_displaytarget *)kms_sw_dt;
}
return (struct sw_displaytarget *)kms_sw_dt;
case DRM_API_HANDLE_TYPE_KMS:
LIST_FOR_EACH_ENTRY(kms_sw_dt, &kms_sw->bo_list, link) {
if (kms_sw_dt->handle == whandle->handle) {
kms_sw_dt->ref_count++;
DEBUG("KMS-DEBUG: imported buffer %u (size %u)\n", kms_sw_dt->handle, kms_sw_dt->size);
*stride = kms_sw_dt->stride;
return (struct sw_displaytarget *)kms_sw_dt;
}
}
/* fallthrough */
default:
break;
}
assert(0);
@@ -253,16 +303,26 @@ kms_sw_displaytarget_get_handle(struct sw_winsys *winsys,
struct sw_displaytarget *dt,
struct winsys_handle *whandle)
{
struct kms_sw_winsys *kms_sw = kms_sw_winsys(winsys);
struct kms_sw_displaytarget *kms_sw_dt = kms_sw_displaytarget(dt);
if (whandle->type == DRM_API_HANDLE_TYPE_KMS) {
switch(whandle->type) {
case DRM_API_HANDLE_TYPE_KMS:
whandle->handle = kms_sw_dt->handle;
whandle->stride = kms_sw_dt->stride;
} else {
return TRUE;
case DRM_API_HANDLE_TYPE_FD:
if (!drmPrimeHandleToFD(kms_sw->fd, kms_sw_dt->handle,
DRM_CLOEXEC, &whandle->handle)) {
whandle->stride = kms_sw_dt->stride;
return TRUE;
}
/* fallthrough */
default:
whandle->handle = 0;
whandle->stride = 0;
return FALSE;
}
return TRUE;
}
static void
@@ -315,4 +375,4 @@ kms_dri_create_winsys(int fd)
return &ws->base;
}
/* vim: set sw=3 ts=8 sts=3 expandtab: */
/* vim: set sw=3 ts=8 sts=3 expandtab: */
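
The prime-import path in this file sizes the buffer by seeking to the end of the file descriptor and then rewinding. A minimal sketch of that idiom follows, assuming an ordinary seekable descriptor; `fd_size_sketch` is a hypothetical name and is not part of the winsys code.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the size of a seekable fd, or (off_t)-1
 * on failure. Mirrors the lseek(SEEK_END) / lseek(SEEK_SET) pair above. */
off_t fd_size_sketch(int fd)
{
   off_t size = lseek(fd, 0, SEEK_END);
   if (size == (off_t)-1) {
      perror("lseek(SEEK_END)");
      return (off_t)-1;
   }

   /* Rewind so later reads start at the beginning of the buffer. */
   if (lseek(fd, 0, SEEK_SET) == (off_t)-1) {
      perror("lseek(SEEK_SET)");
      return (off_t)-1;
   }
   return size;
}
```

Unlike the diff, the sketch also checks the rewind; whether that matters for a prime fd is an open question here.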


@@ -39,6 +39,7 @@ LOCAL_SRC_FILES := \
$(LIBGLSL_FILES)
LOCAL_C_INCLUDES := \
$(MESA_TOP)/src \
$(MESA_TOP)/src/mapi \
$(MESA_TOP)/src/mesa
@@ -59,10 +60,11 @@ LOCAL_SRC_FILES := \
$(GLSL_COMPILER_CXX_FILES)
LOCAL_C_INCLUDES := \
$(MESA_TOP)/src \
$(MESA_TOP)/src/mapi \
$(MESA_TOP)/src/mesa
LOCAL_STATIC_LIBRARIES := libmesa_glsl libmesa_glsl_utils
LOCAL_STATIC_LIBRARIES := libmesa_glsl libmesa_glsl_utils libmesa_util
LOCAL_MODULE_TAGS := eng
LOCAL_MODULE := glsl_compiler


@@ -76,6 +76,7 @@ LIBGLSL_FILES = \
$(GLSL_SRCDIR)/lower_vec_index_to_swizzle.cpp \
$(GLSL_SRCDIR)/lower_vector.cpp \
$(GLSL_SRCDIR)/lower_vector_insert.cpp \
$(GLSL_SRCDIR)/lower_vertex_id.cpp \
$(GLSL_SRCDIR)/lower_output_reads.cpp \
$(GLSL_SRCDIR)/lower_ubo_reference.cpp \
$(GLSL_SRCDIR)/opt_algebraic.cpp \


@@ -289,8 +289,14 @@ HEXADECIMAL_INTEGER 0[xX][0-9a-fA-F]+[uU]?
}
/* Swallow empty #pragma directives, (to avoid confusing the
* downstream compiler). */
<HASH>pragma{HSPACE}*/{NEWLINE} {
* downstream compiler).
*
* Note: We use a simple regular expression for the lookahead
* here. Specifically, we cannot use the complete {NEWLINE} expression
* since it uses alternation and we've found that there's a flex bug
* where using alternation in the lookahead portion of a pattern
* triggers a buffer overrun. */
<HASH>pragma{HSPACE}*/[\r\n] {
BEGIN INITIAL;
}


@@ -965,7 +965,7 @@ glsl_type::std140_size(bool row_major) const
if (field_type->is_record() && (i + 1 < this->length))
size = glsl_align(size, 16);
}
size = glsl_align(size, max_align);
size = glsl_align(size, MAX2(max_align, 16));
return size;
}
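
The hunk above pads a structure's std140 size to a multiple of at least 16 bytes, even when every member has a smaller alignment. A minimal sketch of that arithmetic, assuming `glsl_align` rounds up to a power-of-two multiple; `align_up` and `struct_size_sketch` are hypothetical stand-ins, not Mesa functions.

```c
#include <assert.h>

/* Hypothetical stand-in for glsl_align(): round value up to a multiple
 * of a power-of-two alignment. */
static unsigned align_up(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Final struct size: padded to max(largest member alignment, 16). */
unsigned struct_size_sketch(unsigned raw_size, unsigned max_align)
{
   unsigned alignment = max_align > 16 ? max_align : 16;
   return align_up(raw_size, alignment);
}
```

With these assumptions, a structure holding a single 4-byte float occupies 16 bytes rather than 4.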


@@ -125,6 +125,8 @@ bool optimize_redundant_jumps(exec_list *instructions);
bool optimize_split_arrays(exec_list *instructions, bool linked);
bool lower_offset_arrays(exec_list *instructions);
bool lower_vertex_id(gl_shader *shader);
ir_rvalue *
compare_index_block(exec_list *instructions, ir_variable *index,
unsigned base, unsigned components, void *mem_ctx);


@@ -1115,8 +1115,8 @@ move_non_declarations(exec_list *instructions, exec_node *last,
/**
* Get the function signature for main from a shader
*/
static ir_function_signature *
get_main_function_signature(gl_shader *sh)
ir_function_signature *
link_get_main_function_signature(gl_shader *sh)
{
ir_function *const f = sh->symbols->get_function("main");
if (f != NULL) {
@@ -1644,7 +1644,7 @@ link_intrastage_shaders(void *mem_ctx,
*/
gl_shader *main = NULL;
for (unsigned i = 0; i < num_shaders; i++) {
if (get_main_function_signature(shader_list[i]) != NULL) {
if (link_get_main_function_signature(shader_list[i]) != NULL) {
main = shader_list[i];
break;
}
@@ -1673,7 +1673,8 @@ link_intrastage_shaders(void *mem_ctx,
/* The pointer to the main function in the final linked shader (i.e., the
* copy of the original shader that contained the main function).
*/
ir_function_signature *const main_sig = get_main_function_signature(linked);
ir_function_signature *const main_sig =
link_get_main_function_signature(linked);
/* Move any instructions other than variable declarations or function
* declarations into main.
@@ -1736,6 +1737,9 @@ link_intrastage_shaders(void *mem_ctx,
}
}
if (ctx->Const.VertexID_is_zero_based)
lower_vertex_id(linked);
/* Make a pass over all variable declarations to ensure that arrays with
* unspecified sizes have a size specified. The size is inferred from the
* max_array_access field.


@@ -26,6 +26,9 @@
#ifndef GLSL_LINKER_H
#define GLSL_LINKER_H
ir_function_signature *
link_get_main_function_signature(gl_shader *sh);
extern bool
link_function_calls(gl_shader_program *prog, gl_shader *main,
gl_shader **shader_list, unsigned num_shaders);


@@ -111,7 +111,7 @@ is_dereferenced_thing_row_major(const ir_dereference *deref)
case GLSL_MATRIX_LAYOUT_COLUMN_MAJOR:
return false;
case GLSL_MATRIX_LAYOUT_ROW_MAJOR:
return matrix || deref->type->is_record();
return matrix || deref->type->without_array()->is_record();
}
unreachable("invalid matrix layout");
@@ -301,7 +301,14 @@ lower_ubo_reference_visitor::handle_rvalue(ir_rvalue **rvalue)
deref = deref_array->array->as_dereference();
break;
} else {
array_stride = deref_array->type->std140_size(row_major);
/* Whether or not the field is row-major (because it might be a
* bvec2 or something) does not affect the array itself. We need
* to know whether an array element in its entirety is row-major.
*/
const bool array_row_major =
is_dereferenced_thing_row_major(deref_array);
array_stride = deref_array->type->std140_size(array_row_major);
array_stride = glsl_align(array_stride, 16);
}
@@ -327,6 +334,15 @@ lower_ubo_reference_visitor::handle_rvalue(ir_rvalue **rvalue)
const glsl_type *struct_type = deref_record->record->type;
unsigned intra_struct_offset = 0;
/* glsl_type::std140_base_alignment doesn't grok interfaces. Use
* 16-bytes for the alignment because that is the general minimum of
* std140.
*/
const unsigned struct_alignment = struct_type->is_interface()
? 16
: struct_type->std140_base_alignment(row_major);
for (unsigned int i = 0; i < struct_type->length; i++) {
const glsl_type *type = struct_type->fields.structure[i].type;
@@ -346,6 +362,19 @@ lower_ubo_reference_visitor::handle_rvalue(ir_rvalue **rvalue)
deref_record->field) == 0)
break;
intra_struct_offset += type->std140_size(field_row_major);
/* If the field just examined was itself a structure, apply rule
* #9:
*
* "The structure may have padding at the end; the base offset
* of the member following the sub-structure is rounded up to
* the next multiple of the base alignment of the structure."
*/
if (type->without_array()->is_record()) {
intra_struct_offset = glsl_align(intra_struct_offset,
struct_alignment);
}
}
const_offset += intra_struct_offset;
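
The rule #9 handling above can be read as a running-offset update: add the field's size, then round up if the field was a structure. A minimal sketch under that reading follows. It mirrors the diff in using the alignment computed for the enclosing structure, and all names in it are hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-in for glsl_align(). */
static unsigned align_up(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Offset at which the next member starts, after skipping one field. */
unsigned next_member_offset_sketch(unsigned offset, unsigned field_size,
                                   int field_is_struct,
                                   unsigned struct_alignment)
{
   offset += field_size;

   /* Rule #9: padding after a sub-structure. */
   if (field_is_struct)
      offset = align_up(offset, struct_alignment);

   return offset;
}
```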


@@ -76,7 +76,7 @@ compare_index_block(exec_list *instructions, ir_variable *index,
ir_rvalue *broadcast_index = new(mem_ctx) ir_dereference_variable(index);
assert(index->type->is_scalar());
assert(index->type->base_type == GLSL_TYPE_INT);
assert(index->type->base_type == GLSL_TYPE_INT || index->type->base_type == GLSL_TYPE_UINT);
assert(components >= 1 && components <= 4);
if (components > 1) {


@@ -0,0 +1,144 @@
/*
* Copyright © 2014 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/**
* \file lower_vertex_id.cpp
*
* There exists hardware, such as i965, that does not implement the OpenGL
* semantic for gl_VertexID. Instead, that hardware does not include the
* value of basevertex in the gl_VertexID value. To implement the OpenGL
* semantic, we'll have to convert gl_VertexID to
* gl_VertexIDMESA+gl_BaseVertexMESA.
*/
#include "glsl_symbol_table.h"
#include "ir_hierarchical_visitor.h"
#include "ir.h"
#include "ir_builder.h"
#include "linker.h"
#include "program/prog_statevars.h"
namespace {
class lower_vertex_id_visitor : public ir_hierarchical_visitor {
public:
explicit lower_vertex_id_visitor(ir_function_signature *main_sig,
exec_list *ir_list)
: progress(false), VertexID(NULL), gl_VertexID(NULL),
gl_BaseVertex(NULL), main_sig(main_sig), ir_list(ir_list)
{
foreach_in_list(ir_instruction, ir, ir_list) {
ir_variable *const var = ir->as_variable();
if (var != NULL && var->data.mode == ir_var_system_value &&
var->data.location == SYSTEM_VALUE_BASE_VERTEX) {
gl_BaseVertex = var;
break;
}
}
}
virtual ir_visitor_status visit(ir_dereference_variable *);
bool progress;
private:
ir_variable *VertexID;
ir_variable *gl_VertexID;
ir_variable *gl_BaseVertex;
ir_function_signature *main_sig;
exec_list *ir_list;
};
} /* anonymous namespace */
ir_visitor_status
lower_vertex_id_visitor::visit(ir_dereference_variable *ir)
{
if (ir->var->data.mode != ir_var_system_value ||
ir->var->data.location != SYSTEM_VALUE_VERTEX_ID)
return visit_continue;
if (VertexID == NULL) {
const glsl_type *const int_t = glsl_type::int_type;
void *const mem_ctx = ralloc_parent(ir);
VertexID = new(mem_ctx) ir_variable(int_t, "__VertexID",
ir_var_temporary);
ir_list->push_head(VertexID);
gl_VertexID = new(mem_ctx) ir_variable(int_t, "gl_VertexIDMESA",
ir_var_system_value);
gl_VertexID->data.how_declared = ir_var_declared_implicitly;
gl_VertexID->data.read_only = true;
gl_VertexID->data.location = SYSTEM_VALUE_VERTEX_ID_ZERO_BASE;
gl_VertexID->data.explicit_location = true;
gl_VertexID->data.explicit_index = 0;
ir_list->push_head(gl_VertexID);
if (gl_BaseVertex == NULL) {
gl_BaseVertex = new(mem_ctx) ir_variable(int_t, "gl_BaseVertex",
ir_var_system_value);
gl_BaseVertex->data.how_declared = ir_var_declared_implicitly;
gl_BaseVertex->data.read_only = true;
gl_BaseVertex->data.location = SYSTEM_VALUE_BASE_VERTEX;
gl_BaseVertex->data.explicit_location = true;
gl_BaseVertex->data.explicit_index = 0;
ir_list->push_head(gl_BaseVertex);
}
ir_instruction *const inst =
ir_builder::assign(VertexID,
ir_builder::add(gl_VertexID, gl_BaseVertex));
main_sig->body.push_head(inst);
}
ir->var = VertexID;
progress = true;
return visit_continue;
}
bool
lower_vertex_id(gl_shader *shader)
{
/* gl_VertexID only exists in the vertex shader.
*/
if (shader->Stage != MESA_SHADER_VERTEX)
return false;
ir_function_signature *const main_sig =
link_get_main_function_signature(shader);
if (main_sig == NULL) {
assert(main_sig != NULL);
return false;
}
lower_vertex_id_visitor v(main_sig, shader->ir);
v.run(shader->ir);
return v.progress;
}
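
The lowering pass above replaces each read of `gl_VertexID` with a temporary, assigned once at the top of `main`, that holds the zero-based ID plus the base vertex. The value it reconstructs can be sketched in plain C; `gl_vertex_id_sketch` is a hypothetical name used only for illustration.

```c
#include <assert.h>

/* Value the shader sees after lowering: the hardware's zero-based
 * vertex ID plus the draw call's base vertex. */
int gl_vertex_id_sketch(int zero_based_id, int base_vertex)
{
   assert(zero_based_id >= 0);
   return zero_based_id + base_vertex;
}
```

For example, a draw with base vertex 100 whose hardware ID is 5 yields 105, which is what the OpenGL semantic for `gl_VertexID` requires according to the file comment.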


@@ -79,6 +79,11 @@ ir_constant_folding_visitor::handle_rvalue(ir_rvalue **rvalue)
}
}
/* Ditto for swizzles. */
ir_swizzle *swiz = (*rvalue)->as_swizzle();
if (swiz && !swiz->val->as_constant())
return;
ir_constant *constant = (*rvalue)->constant_expression_value();
if (constant) {
*rvalue = constant;


@@ -813,11 +813,15 @@ dri3_alloc_render_buffer(struct glx_screen *glx_screen, Drawable draw,
*/
fence_fd = xshmfence_alloc_shm();
if (fence_fd < 0)
if (fence_fd < 0) {
ErrorMessageF("DRI3 Fence object allocation failure %s\n", strerror(errno));
return NULL;
}
shm_fence = xshmfence_map_shm(fence_fd);
if (shm_fence == NULL)
if (shm_fence == NULL) {
ErrorMessageF("DRI3 Fence object map failure %s\n", strerror(errno));
goto no_shm_fence;
}
/* Allocate the image from the driver
*/
@@ -826,8 +830,10 @@ dri3_alloc_render_buffer(struct glx_screen *glx_screen, Drawable draw,
goto no_buffer;
buffer->cpp = dri3_cpp_for_format(format);
if (!buffer->cpp)
if (!buffer->cpp) {
ErrorMessageF("DRI3 buffer format %d invalid\n", format);
goto no_image;
}
if (!psc->is_different_gpu) {
buffer->image = (*psc->image->createImage) (psc->driScreen,
@@ -838,8 +844,10 @@ dri3_alloc_render_buffer(struct glx_screen *glx_screen, Drawable draw,
buffer);
pixmap_buffer = buffer->image;
if (!buffer->image)
if (!buffer->image) {
ErrorMessageF("DRI3 gpu image creation failure\n");
goto no_image;
}
} else {
buffer->image = (*psc->image->createImage) (psc->driScreen,
width, height,
@@ -847,8 +855,10 @@ dri3_alloc_render_buffer(struct glx_screen *glx_screen, Drawable draw,
0,
buffer);
if (!buffer->image)
if (!buffer->image) {
ErrorMessageF("DRI3 other gpu image creation failure\n");
goto no_image;
}
buffer->linear_buffer = (*psc->image->createImage) (psc->driScreen,
width, height,
@@ -858,19 +868,25 @@ dri3_alloc_render_buffer(struct glx_screen *glx_screen, Drawable draw,
buffer);
pixmap_buffer = buffer->linear_buffer;
if (!buffer->linear_buffer)
if (!buffer->linear_buffer) {
ErrorMessageF("DRI3 gpu linear image creation failure\n");
goto no_linear_buffer;
}
}
/* X wants the stride, so ask the image for it
*/
if (!(*psc->image->queryImage)(pixmap_buffer, __DRI_IMAGE_ATTRIB_STRIDE, &stride))
if (!(*psc->image->queryImage)(pixmap_buffer, __DRI_IMAGE_ATTRIB_STRIDE, &stride)) {
ErrorMessageF("DRI3 get image stride failed\n");
goto no_buffer_attrib;
}
buffer->pitch = stride;
if (!(*psc->image->queryImage)(pixmap_buffer, __DRI_IMAGE_ATTRIB_FD, &buffer_fd))
if (!(*psc->image->queryImage)(pixmap_buffer, __DRI_IMAGE_ATTRIB_FD, &buffer_fd)) {
ErrorMessageF("DRI3 get image FD failed\n");
goto no_buffer_attrib;
}
xcb_dri3_pixmap_from_buffer(c,
(pixmap = xcb_generate_id(c)),
@@ -910,6 +926,7 @@ no_buffer:
xshmfence_unmap_shm(shm_fence);
no_shm_fence:
close(fence_fd);
ErrorMessageF("DRI3 alloc_render_buffer failed\n");
return NULL;
}
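
The hunks above keep the existing goto-based unwinding and add one message per failure site plus a summary message at the common exit. A minimal sketch of that shape follows, using `malloc` in place of the DRI3 resources; the function name, labels, and messages are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* fail_step: 0 = succeed, 1 = simulate the first allocation failing,
 * 2 = simulate the second allocation failing. */
int alloc_pair_sketch(int fail_step)
{
   char *fence = NULL;
   char *image = NULL;

   fence = (fail_step == 1) ? NULL : malloc(16);
   if (fence == NULL) {
      fprintf(stderr, "sketch: fence allocation failure\n");
      goto no_fence;
   }

   image = (fail_step == 2) ? NULL : malloc(64);
   if (image == NULL) {
      fprintf(stderr, "sketch: image creation failure\n");
      goto no_image;
   }

   /* Success: a real caller would keep both; the sketch releases them. */
   free(image);
   free(fence);
   return 0;

no_image:
   free(fence);
no_fence:
   /* Summary message at the shared exit, as in the last hunk above. */
   fprintf(stderr, "sketch: alloc_pair_sketch failed\n");
   return -1;
}
```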


@@ -115,3 +115,12 @@ GET_HASH_GEN := $(LOCAL_PATH)/main/get_hash_generator.py
$(intermediates)/main/get_hash.h: $(glapi)/gl_and_es_API.xml \
$(LOCAL_PATH)/main/get_hash_params.py $(GET_HASH_GEN)
@$(MESA_PYTHON2) $(GET_HASH_GEN) -f $< > $@
FORMAT_INFO := $(LOCAL_PATH)/main/format_info.py
format_info_deps := \
$(LOCAL_PATH)/main/formats.csv \
$(LOCAL_PATH)/main/format_parser.py \
$(FORMAT_INFO)
$(intermediates)/main/format_info.c: $(format_info_deps)
@$(MESA_PYTHON2) $(FORMAT_INFO) $< > $@


@@ -32,6 +32,8 @@ LOCAL_PATH := $(call my-dir)
# MESA_FILES
# X86_FILES
include $(LOCAL_PATH)/Makefile.sources
SRCDIR :=
BUILDDIR :=
include $(CLEAR_VARS)
@@ -55,6 +57,7 @@ endif
LOCAL_C_INCLUDES := \
$(call intermediates-dir-for STATIC_LIBRARIES,libmesa_program,,) \
$(MESA_TOP)/src \
$(MESA_TOP)/src/mapi \
$(MESA_TOP)/src/glsl \
$(MESA_TOP)/src/gallium/auxiliary


@@ -36,11 +36,11 @@ include $(CLEAR_VARS)
LOCAL_MODULE := libmesa_glsl_utils
LOCAL_C_INCLUDES := \
$(MESA_TOP)/src \
$(MESA_TOP)/src/glsl \
$(MESA_TOP)/src/mapi
LOCAL_SRC_FILES := \
main/hash_table.c \
main/imports.c \
program/prog_hash_table.c \
program/symbol_table.c
@@ -59,11 +59,11 @@ LOCAL_IS_HOST_MODULE := true
LOCAL_CFLAGS := -D_POSIX_C_SOURCE=199309L
LOCAL_C_INCLUDES := \
$(MESA_TOP)/src \
$(MESA_TOP)/src/glsl \
$(MESA_TOP)/src/mapi
LOCAL_SRC_FILES := \
main/hash_table.c \
main/imports.c \
program/prog_hash_table.c \
program/symbol_table.c


@@ -32,6 +32,8 @@ LOCAL_PATH := $(call my-dir)
# MESA_GALLIUM_FILES.
# X86_FILES
include $(LOCAL_PATH)/Makefile.sources
SRCDIR :=
BUILDDIR :=
include $(CLEAR_VARS)
@@ -50,6 +52,7 @@ LOCAL_C_INCLUDES := \
$(call intermediates-dir-for STATIC_LIBRARIES,libmesa_program,,) \
$(MESA_TOP)/src/gallium/auxiliary \
$(MESA_TOP)/src/gallium/include \
$(MESA_TOP)/src \
$(MESA_TOP)/src/glsl \
$(MESA_TOP)/src/mapi


@@ -396,25 +396,6 @@ _mesa_meta_init(struct gl_context *ctx)
ctx->Meta = CALLOC_STRUCT(gl_meta_state);
}
static GLenum
gl_buffer_index_to_drawbuffers_enum(gl_buffer_index bufindex)
{
assert(bufindex < BUFFER_COUNT);
if (bufindex >= BUFFER_COLOR0)
return GL_COLOR_ATTACHMENT0 + bufindex - BUFFER_COLOR0;
else if (bufindex == BUFFER_FRONT_LEFT)
return GL_FRONT_LEFT;
else if (bufindex == BUFFER_FRONT_RIGHT)
return GL_FRONT_RIGHT;
else if (bufindex == BUFFER_BACK_LEFT)
return GL_BACK_LEFT;
else if (bufindex == BUFFER_BACK_RIGHT)
return GL_BACK_RIGHT;
return GL_NONE;
}
/**
* Free context meta-op state.
* To be called once during context destruction.
@@ -806,20 +787,9 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state)
}
if (state & MESA_META_DRAW_BUFFERS) {
int buf, real_color_buffers = 0;
memset(save->ColorDrawBuffers, 0, sizeof(save->ColorDrawBuffers));
for (buf = 0; buf < ctx->Const.MaxDrawBuffers; buf++) {
int buf_index = ctx->DrawBuffer->_ColorDrawBufferIndexes[buf];
if (buf_index == -1)
continue;
save->ColorDrawBuffers[buf] =
gl_buffer_index_to_drawbuffers_enum(buf_index);
if (++real_color_buffers >= ctx->DrawBuffer->_NumColorDrawBuffers)
break;
}
struct gl_framebuffer *fb = ctx->DrawBuffer;
memcpy(save->ColorDrawBuffers, fb->ColorDrawBuffer,
sizeof(save->ColorDrawBuffers));
}
/* misc */
@@ -1224,7 +1194,7 @@ _mesa_meta_end(struct gl_context *ctx)
_mesa_BindRenderbuffer(GL_RENDERBUFFER, save->RenderbufferName);
if (state & MESA_META_DRAW_BUFFERS) {
_mesa_DrawBuffers(ctx->Const.MaxDrawBuffers, save->ColorDrawBuffers);
_mesa_drawbuffers(ctx, ctx->Const.MaxDrawBuffers, save->ColorDrawBuffers, NULL);
}
ctx->Meta->SaveStackDepth--;


@@ -74,7 +74,7 @@ make_view(struct gl_context *ctx, struct gl_texture_image *tex_image,
tex_image->Depth,
0, internal_format, tex_format);
view_tex_obj->MinLevel = 0;
view_tex_obj->MinLevel = tex_image->Level;
view_tex_obj->NumLevels = 1;
view_tex_obj->MinLayer = tex_obj->MinLayer;
view_tex_obj->NumLayers = tex_obj->NumLayers;


@@ -616,6 +616,8 @@ intel_create_image_from_fds(__DRIscreen *screen,
return NULL;
}
intel_setup_image_from_dimensions(image);
image->planar_format = f;
for (i = 0; i < f->nplanes; i++) {
index = f->planes[i].buffer_index;


@@ -78,7 +78,7 @@ void
brw_blorp_surface_info::set(struct brw_context *brw,
struct intel_mipmap_tree *mt,
unsigned int level, unsigned int layer,
bool is_render_target)
mesa_format format, bool is_render_target)
{
brw_blorp_mip_info::set(mt, level, layer);
this->num_samples = mt->num_samples;
@@ -86,7 +86,10 @@ brw_blorp_surface_info::set(struct brw_context *brw,
this->map_stencil_as_y_tiled = false;
this->msaa_layout = mt->msaa_layout;
switch (mt->format) {
if (format == MESA_FORMAT_NONE)
format = mt->format;
switch (format) {
case MESA_FORMAT_S_UINT8:
/* The miptree is a W-tiled stencil buffer. Surface states can't be set
* up for W tiling, so we'll need to use Y tiling and have the WM
@@ -115,7 +118,7 @@ brw_blorp_surface_info::set(struct brw_context *brw,
this->brw_surfaceformat = BRW_SURFACEFORMAT_R16_UNORM;
break;
default: {
mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format);
mesa_format linear_format = _mesa_get_srgb_format_linear(format);
if (is_render_target) {
assert(brw->format_supported_as_render_target[linear_format]);
this->brw_surfaceformat = brw->render_target_format[linear_format];


@@ -39,8 +39,10 @@ void
brw_blorp_blit_miptrees(struct brw_context *brw,
struct intel_mipmap_tree *src_mt,
unsigned src_level, unsigned src_layer,
mesa_format src_format,
struct intel_mipmap_tree *dst_mt,
unsigned dst_level, unsigned dst_layer,
mesa_format dst_format,
float src_x0, float src_y0,
float src_x1, float src_y1,
float dst_x0, float dst_y0,
@@ -121,7 +123,7 @@ public:
void set(struct brw_context *brw,
struct intel_mipmap_tree *mt,
unsigned int level, unsigned int layer,
bool is_render_target);
mesa_format format, bool is_render_target);
uint32_t compute_tile_offsets(uint32_t *tile_x, uint32_t *tile_y) const;
@@ -346,8 +348,10 @@ public:
brw_blorp_blit_params(struct brw_context *brw,
struct intel_mipmap_tree *src_mt,
unsigned src_level, unsigned src_layer,
mesa_format src_format,
struct intel_mipmap_tree *dst_mt,
unsigned dst_level, unsigned dst_layer,
mesa_format dst_format,
GLfloat src_x0, GLfloat src_y0,
GLfloat src_x1, GLfloat src_y1,
GLfloat dst_x0, GLfloat dst_y0,


@@ -56,8 +56,10 @@ void
brw_blorp_blit_miptrees(struct brw_context *brw,
struct intel_mipmap_tree *src_mt,
unsigned src_level, unsigned src_layer,
mesa_format src_format,
struct intel_mipmap_tree *dst_mt,
unsigned dst_level, unsigned dst_layer,
mesa_format dst_format,
float src_x0, float src_y0,
float src_x1, float src_y1,
float dst_x0, float dst_y0,
@@ -84,8 +86,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
mirror_x, mirror_y);
brw_blorp_blit_params params(brw,
src_mt, src_level, src_layer,
dst_mt, dst_level, dst_layer,
src_mt, src_level, src_layer, src_format,
dst_mt, dst_level, dst_layer, dst_format,
src_x0, src_y0,
src_x1, src_y1,
dst_x0, dst_y0,
@@ -98,8 +100,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
static void
do_blorp_blit(struct brw_context *brw, GLbitfield buffer_bit,
struct intel_renderbuffer *src_irb,
struct intel_renderbuffer *dst_irb,
struct intel_renderbuffer *src_irb, mesa_format src_format,
struct intel_renderbuffer *dst_irb, mesa_format dst_format,
GLfloat srcX0, GLfloat srcY0, GLfloat srcX1, GLfloat srcY1,
GLfloat dstX0, GLfloat dstY0, GLfloat dstX1, GLfloat dstY1,
GLenum filter, bool mirror_x, bool mirror_y)
@@ -111,7 +113,9 @@ do_blorp_blit(struct brw_context *brw, GLbitfield buffer_bit,
/* Do the blit */
brw_blorp_blit_miptrees(brw,
src_mt, src_irb->mt_level, src_irb->mt_layer,
src_format,
dst_mt, dst_irb->mt_level, dst_irb->mt_layer,
dst_format,
srcX0, srcY0, srcX1, srcY1,
dstX0, dstY0, dstX1, dstY1,
filter, mirror_x, mirror_y);
@@ -153,8 +157,11 @@ try_blorp_blit(struct brw_context *brw,
for (unsigned i = 0; i < ctx->DrawBuffer->_NumColorDrawBuffers; ++i) {
dst_irb = intel_renderbuffer(ctx->DrawBuffer->_ColorDrawBuffers[i]);
if (dst_irb)
do_blorp_blit(brw, buffer_bit, src_irb, dst_irb, srcX0, srcY0,
srcX1, srcY1, dstX0, dstY0, dstX1, dstY1,
do_blorp_blit(brw, buffer_bit,
src_irb, src_irb->Base.Base.Format,
dst_irb, dst_irb->Base.Base.Format,
srcX0, srcY0, srcX1, srcY1,
dstX0, dstY0, dstX1, dstY1,
filter, mirror_x, mirror_y);
}
break;
@@ -174,7 +181,8 @@ try_blorp_blit(struct brw_context *brw,
(dst_mt->format == MESA_FORMAT_Z24_UNORM_X8_UINT))
return false;
do_blorp_blit(brw, buffer_bit, src_irb, dst_irb, srcX0, srcY0,
do_blorp_blit(brw, buffer_bit, src_irb, MESA_FORMAT_NONE,
dst_irb, MESA_FORMAT_NONE, srcX0, srcY0,
srcX1, srcY1, dstX0, dstY0, dstX1, dstY1,
filter, mirror_x, mirror_y);
break;
@@ -183,7 +191,8 @@ try_blorp_blit(struct brw_context *brw,
intel_renderbuffer(read_fb->Attachment[BUFFER_STENCIL].Renderbuffer);
dst_irb =
intel_renderbuffer(draw_fb->Attachment[BUFFER_STENCIL].Renderbuffer);
do_blorp_blit(brw, buffer_bit, src_irb, dst_irb, srcX0, srcY0,
do_blorp_blit(brw, buffer_bit, src_irb, MESA_FORMAT_NONE,
dst_irb, MESA_FORMAT_NONE, srcX0, srcY0,
srcX1, srcY1, dstX0, dstY0, dstX1, dstY1,
filter, mirror_x, mirror_y);
break;
@@ -219,8 +228,8 @@ brw_blorp_copytexsubimage(struct brw_context *brw,
if (brw->gen < 6 || brw->gen >= 8)
return false;
if (_mesa_get_format_base_format(src_mt->format) !=
_mesa_get_format_base_format(dst_mt->format)) {
if (_mesa_get_format_base_format(src_rb->Format) !=
_mesa_get_format_base_format(dst_image->TexFormat)) {
return false;
}
@@ -233,7 +242,7 @@ brw_blorp_copytexsubimage(struct brw_context *brw,
return false;
}
if (!brw->format_supported_as_render_target[dst_mt->format])
if (!brw->format_supported_as_render_target[dst_image->TexFormat])
return false;
/* Source clipping shouldn't be necessary, since copytexsubimage (in
@@ -268,7 +277,9 @@ brw_blorp_copytexsubimage(struct brw_context *brw,
brw_blorp_blit_miptrees(brw,
src_mt, src_irb->mt_level, src_irb->mt_layer,
src_rb->Format,
dst_mt, dst_level, dst_slice,
dst_image->TexFormat,
srcX0, srcY0, srcX1, srcY1,
dstX0, dstY0, dstX1, dstY1,
GL_NEAREST, false, mirror_y);
@@ -291,7 +302,9 @@ brw_blorp_copytexsubimage(struct brw_context *brw,
if (src_mt != dst_mt) {
brw_blorp_blit_miptrees(brw,
src_mt, src_irb->mt_level, src_irb->mt_layer,
src_mt->format,
dst_mt, dst_level, dst_slice,
dst_mt->format,
srcX0, srcY0, srcX1, srcY1,
dstX0, dstY0, dstX1, dstY1,
GL_NEAREST, false, mirror_y);
@@ -1822,8 +1835,10 @@ compute_msaa_layout_for_pipeline(struct brw_context *brw, unsigned num_samples,
brw_blorp_blit_params::brw_blorp_blit_params(struct brw_context *brw,
struct intel_mipmap_tree *src_mt,
unsigned src_level, unsigned src_layer,
mesa_format src_format,
struct intel_mipmap_tree *dst_mt,
unsigned dst_level, unsigned dst_layer,
mesa_format dst_format,
GLfloat src_x0, GLfloat src_y0,
GLfloat src_x1, GLfloat src_y1,
GLfloat dst_x0, GLfloat dst_y0,
@@ -1831,8 +1846,8 @@ brw_blorp_blit_params::brw_blorp_blit_params(struct brw_context *brw,
GLenum filter,
bool mirror_x, bool mirror_y)
{
src.set(brw, src_mt, src_level, src_layer, false);
dst.set(brw, dst_mt, dst_level, dst_layer, true);
src.set(brw, src_mt, src_level, src_layer, src_format, false);
dst.set(brw, dst_mt, dst_level, dst_layer, dst_format, true);
/* Even though we do multisample resolves at the time of the blit, OpenGL
* specification defines them as if they happen at the time of rendering,


@@ -483,6 +483,7 @@ brw_initialize_context_constants(struct brw_context *brw)
ctx->Const.QuadsFollowProvokingVertexConvention = false;
ctx->Const.NativeIntegers = true;
ctx->Const.VertexID_is_zero_based = true;
/* Regarding the CMP instruction, the Ivybridge PRM says:
*


@@ -553,6 +553,7 @@ struct brw_vs_prog_data {
GLbitfield64 inputs_read;
bool uses_vertexid;
bool uses_instanceid;
};
@@ -1061,6 +1062,21 @@ struct brw_context
/* Whether the last depth/stencil packets were both NULL. */
bool no_depth_or_stencil;
struct {
/** Does the current draw use the index buffer? */
bool indexed;
int start_vertex_location;
int base_vertex_location;
/**
* Buffer and offset used for GL_ARB_shader_draw_parameters
* (for now, only gl_BaseVertex).
*/
drm_intel_bo *draw_params_bo;
uint32_t draw_params_offset;
} draw;
struct {
struct brw_vertex_element inputs[VERT_ATTRIB_MAX];
struct brw_vertex_buffer buffers[VERT_ATTRIB_MAX];

View File

@@ -176,26 +176,19 @@ static void brw_emit_prim(struct brw_context *brw,
{
int verts_per_instance;
int vertex_access_type;
int start_vertex_location;
int base_vertex_location;
int indirect_flag;
DBG("PRIM: %s %d %d\n", _mesa_lookup_enum_by_nr(prim->mode),
prim->start, prim->count);
start_vertex_location = prim->start;
base_vertex_location = prim->basevertex;
if (prim->indexed) {
vertex_access_type = brw->gen >= 7 ?
GEN7_3DPRIM_VERTEXBUFFER_ACCESS_RANDOM :
GEN4_3DPRIM_VERTEXBUFFER_ACCESS_RANDOM;
start_vertex_location += brw->ib.start_vertex_offset;
base_vertex_location += brw->vb.start_vertex_bias;
} else {
vertex_access_type = brw->gen >= 7 ?
GEN7_3DPRIM_VERTEXBUFFER_ACCESS_SEQUENTIAL :
GEN4_3DPRIM_VERTEXBUFFER_ACCESS_SEQUENTIAL;
start_vertex_location += brw->vb.start_vertex_bias;
}
/* We only need to trim the primitive count on pre-Gen6. */
@@ -270,10 +263,10 @@ static void brw_emit_prim(struct brw_context *brw,
vertex_access_type);
}
OUT_BATCH(verts_per_instance);
OUT_BATCH(start_vertex_location);
OUT_BATCH(brw->draw.start_vertex_location);
OUT_BATCH(prim->num_instances);
OUT_BATCH(prim->base_instance);
OUT_BATCH(base_vertex_location);
OUT_BATCH(brw->draw.base_vertex_location);
ADVANCE_BATCH();
/* Only used on Sandybridge; harmless to set elsewhere. */
@@ -436,12 +429,35 @@ static bool brw_try_draw_prims( struct gl_context *ctx,
brw_merge_inputs(brw, arrays);
}
}
brw->draw.indexed = prims[i].indexed;
brw->draw.start_vertex_location = prims[i].start;
brw->draw.base_vertex_location = prims[i].basevertex;
drm_intel_bo_unreference(brw->draw.draw_params_bo);
if (prims[i].is_indirect) {
/* Point draw_params_bo at the indirect buffer. */
brw->draw.draw_params_bo =
intel_buffer_object(ctx->DrawIndirectBuffer)->buffer;
drm_intel_bo_reference(brw->draw.draw_params_bo);
brw->draw.draw_params_offset =
prims[i].indirect_offset + (prims[i].indexed ? 12 : 8);
} else {
/* Set draw_params_bo to NULL so brw_prepare_vertices knows it
* has to upload gl_BaseVertex and such if they're needed.
*/
brw->draw.draw_params_bo = NULL;
brw->draw.draw_params_offset = 0;
}
if (brw->gen < 6)
brw_set_prim(brw, &prims[i]);
else
gen6_set_prim(brw, &prims[i]);
retry:
/* Note that before the loop, brw->state.dirty.brw was set to != 0, and
* that the state updated in the loop outside of this block is that in
* *_set_prim or intel_batchbuffer_flush(), which only impacts
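
The `12 : 8` constants in the indirect branch above match the byte offsets of `baseVertex` and `first` in the OpenGL indirect draw command layouts. A minimal sketch that derives them with `offsetof` follows; the struct definitions follow the OpenGL specification's field order, and `draw_params_offset_sketch` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Indirect command layouts, in the field order given by the OpenGL spec. */
typedef struct {
   uint32_t count;
   uint32_t instanceCount;
   uint32_t first;
   uint32_t baseInstance;
} DrawArraysIndirectCommand;

typedef struct {
   uint32_t count;
   uint32_t instanceCount;
   uint32_t firstIndex;
   int32_t  baseVertex;
   uint32_t baseInstance;
} DrawElementsIndirectCommand;

/* Byte offset, within the indirect buffer, of the value that the shader
 * reads as the base vertex. */
uint32_t draw_params_offset_sketch(uint32_t indirect_offset, int indexed)
{
   /* Indirect offsets are expected to be 4-byte aligned. */
   assert(indirect_offset % 4 == 0);

   size_t field = indexed
      ? offsetof(DrawElementsIndirectCommand, baseVertex)
      : offsetof(DrawArraysIndirectCommand, first);

   return indirect_offset + (uint32_t)field;
}
```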


@@ -47,6 +47,8 @@ void brw_draw_prims( struct gl_context *ctx,
void brw_draw_init( struct brw_context *brw );
void brw_draw_destroy( struct brw_context *brw );
void brw_prepare_shader_draw_parameters(struct brw_context *);
/* brw_primitive_restart.c */
GLboolean
brw_handle_primitive_restart(struct gl_context *ctx,


@@ -604,16 +604,83 @@ brw_prepare_vertices(struct brw_context *brw)
brw->vb.nr_buffers = j;
}
static void brw_emit_vertices(struct brw_context *brw)
void
brw_prepare_shader_draw_parameters(struct brw_context *brw)
{
int *gl_basevertex_value;
if (brw->draw.indexed) {
brw->draw.start_vertex_location += brw->ib.start_vertex_offset;
brw->draw.base_vertex_location += brw->vb.start_vertex_bias;
gl_basevertex_value = &brw->draw.base_vertex_location;
} else {
brw->draw.start_vertex_location += brw->vb.start_vertex_bias;
gl_basevertex_value = &brw->draw.start_vertex_location;
}
/* For non-indirect draws, upload gl_BaseVertex. */
if (brw->vs.prog_data->uses_vertexid && brw->draw.draw_params_bo == NULL) {
intel_upload_data(brw, gl_basevertex_value, 4, 4,
&brw->draw.draw_params_bo,
&brw->draw.draw_params_offset);
}
}
/**
* Emit a VERTEX_BUFFER_STATE entry (part of 3DSTATE_VERTEX_BUFFERS).
*/
static void
emit_vertex_buffer_state(struct brw_context *brw,
unsigned buffer_nr,
drm_intel_bo *bo,
unsigned bo_ending_address,
unsigned bo_offset,
unsigned stride,
unsigned step_rate)
{
struct gl_context *ctx = &brw->ctx;
GLuint i, nr_elements;
uint32_t dw0;
if (brw->gen >= 6) {
dw0 = (buffer_nr << GEN6_VB0_INDEX_SHIFT) |
(step_rate ? GEN6_VB0_ACCESS_INSTANCEDATA
: GEN6_VB0_ACCESS_VERTEXDATA);
} else {
dw0 = (buffer_nr << BRW_VB0_INDEX_SHIFT) |
(step_rate ? BRW_VB0_ACCESS_INSTANCEDATA
: BRW_VB0_ACCESS_VERTEXDATA);
}
if (brw->gen >= 7)
dw0 |= GEN7_VB0_ADDRESS_MODIFYENABLE;
if (brw->gen == 7)
dw0 |= GEN7_MOCS_L3 << 16;
WARN_ONCE(stride >= (brw->gen >= 5 ? 2048 : 2047),
"VBO stride %d too large, bad rendering may occur\n",
stride);
OUT_BATCH(dw0 | (stride << BRW_VB0_PITCH_SHIFT));
OUT_RELOC(bo, I915_GEM_DOMAIN_VERTEX, 0, bo_offset);
if (brw->gen >= 5) {
OUT_RELOC(bo, I915_GEM_DOMAIN_VERTEX, 0, bo_ending_address);
} else {
OUT_BATCH(0);
}
OUT_BATCH(step_rate);
}
static void brw_emit_vertices(struct brw_context *brw)
{
GLuint i;
brw_prepare_vertices(brw);
brw_prepare_shader_draw_parameters(brw);
brw_emit_query_begin(brw);
nr_elements = brw->vb.nr_enabled + brw->vs.prog_data->uses_vertexid;
unsigned nr_elements = brw->vb.nr_enabled;
if (brw->vs.prog_data->uses_vertexid || brw->vs.prog_data->uses_instanceid)
++nr_elements;
/* If the VS doesn't read any inputs (calculating vertex position from
* a state variable for some reason, for example), emit a single pad
@@ -647,47 +714,33 @@ static void brw_emit_vertices(struct brw_context *brw)
 /* Now emit VB and VEP state packets.
 */
-if (brw->vb.nr_buffers) {
+unsigned nr_buffers =
+brw->vb.nr_buffers + brw->vs.prog_data->uses_vertexid;
+if (nr_buffers) {
 if (brw->gen >= 6) {
-assert(brw->vb.nr_buffers <= 33);
+assert(nr_buffers <= 33);
 } else {
-assert(brw->vb.nr_buffers <= 17);
+assert(nr_buffers <= 17);
 }
-BEGIN_BATCH(1 + 4*brw->vb.nr_buffers);
-OUT_BATCH((_3DSTATE_VERTEX_BUFFERS << 16) | (4*brw->vb.nr_buffers - 1));
+BEGIN_BATCH(1 + 4 * nr_buffers);
+OUT_BATCH((_3DSTATE_VERTEX_BUFFERS << 16) | (4 * nr_buffers - 1));
 for (i = 0; i < brw->vb.nr_buffers; i++) {
 struct brw_vertex_buffer *buffer = &brw->vb.buffers[i];
-uint32_t dw0;
+emit_vertex_buffer_state(brw, i, buffer->bo, buffer->bo->size - 1,
+buffer->offset, buffer->stride,
+buffer->step_rate);
-if (brw->gen >= 6) {
-dw0 = buffer->step_rate
-? GEN6_VB0_ACCESS_INSTANCEDATA
-: GEN6_VB0_ACCESS_VERTEXDATA;
-dw0 |= i << GEN6_VB0_INDEX_SHIFT;
-} else {
-dw0 = buffer->step_rate
-? BRW_VB0_ACCESS_INSTANCEDATA
-: BRW_VB0_ACCESS_VERTEXDATA;
-dw0 |= i << BRW_VB0_INDEX_SHIFT;
-}
+}
-if (brw->gen >= 7)
-dw0 |= GEN7_VB0_ADDRESS_MODIFYENABLE;
-if (brw->gen == 7)
-dw0 |= GEN7_MOCS_L3 << 16;
-WARN_ONCE(buffer->stride >= (brw->gen >= 5 ? 2048 : 2047),
-"VBO stride %d too large, bad rendering may occur\n",
-buffer->stride);
-OUT_BATCH(dw0 | (buffer->stride << BRW_VB0_PITCH_SHIFT));
-OUT_RELOC(buffer->bo, I915_GEM_DOMAIN_VERTEX, 0, buffer->offset);
-if (brw->gen >= 5) {
-OUT_RELOC(buffer->bo, I915_GEM_DOMAIN_VERTEX, 0, buffer->bo->size - 1);
-} else
-OUT_BATCH(0);
-OUT_BATCH(buffer->step_rate);
+if (brw->vs.prog_data->uses_vertexid) {
+emit_vertex_buffer_state(brw, brw->vb.nr_buffers,
+brw->draw.draw_params_bo,
+brw->draw.draw_params_bo->size - 1,
+brw->draw.draw_params_offset,
+0, /* stride */
+0); /* step rate */
 }
 ADVANCE_BATCH();
 }
@@ -773,18 +826,35 @@ static void brw_emit_vertices(struct brw_context *brw)
 (BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_3_SHIFT));
 }
-if (brw->vs.prog_data->uses_vertexid) {
+if (brw->vs.prog_data->uses_vertexid || brw->vs.prog_data->uses_instanceid) {
 uint32_t dw0 = 0, dw1 = 0;
+uint32_t comp0 = BRW_VE1_COMPONENT_STORE_0;
+uint32_t comp1 = BRW_VE1_COMPONENT_STORE_0;
+uint32_t comp2 = BRW_VE1_COMPONENT_STORE_0;
+uint32_t comp3 = BRW_VE1_COMPONENT_STORE_0;
-dw1 = ((BRW_VE1_COMPONENT_STORE_VID << BRW_VE1_COMPONENT_0_SHIFT) |
-(BRW_VE1_COMPONENT_STORE_IID << BRW_VE1_COMPONENT_1_SHIFT) |
-(BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_2_SHIFT) |
-(BRW_VE1_COMPONENT_STORE_0 << BRW_VE1_COMPONENT_3_SHIFT));
+if (brw->vs.prog_data->uses_vertexid) {
+comp0 = BRW_VE1_COMPONENT_STORE_SRC;
+comp2 = BRW_VE1_COMPONENT_STORE_VID;
+}
+if (brw->vs.prog_data->uses_instanceid) {
+comp3 = BRW_VE1_COMPONENT_STORE_IID;
+}
+dw1 = (comp0 << BRW_VE1_COMPONENT_0_SHIFT) |
+(comp1 << BRW_VE1_COMPONENT_1_SHIFT) |
+(comp2 << BRW_VE1_COMPONENT_2_SHIFT) |
+(comp3 << BRW_VE1_COMPONENT_3_SHIFT);
 if (brw->gen >= 6) {
-dw0 |= GEN6_VE0_VALID;
+dw0 |= GEN6_VE0_VALID |
+brw->vb.nr_buffers << GEN6_VE0_INDEX_SHIFT |
+BRW_SURFACEFORMAT_R32_UINT << BRW_VE0_FORMAT_SHIFT;
 } else {
-dw0 |= BRW_VE0_VALID;
+dw0 |= BRW_VE0_VALID |
+brw->vb.nr_buffers << BRW_VE0_INDEX_SHIFT |
+BRW_SURFACEFORMAT_R32_UINT << BRW_VE0_FORMAT_SHIFT;
 dw1 |= (i * 4) << BRW_VE1_DST_OFFSET_SHIFT;
 }


@@ -1029,19 +1029,17 @@ update_uip_jip(struct brw_context *brw, brw_inst *insn,
 {
 int scale = brw->gen >= 8 ? sizeof(brw_compact_inst) : 1;
-int32_t jip = brw_inst_jip(brw, insn);
-jip -= scale *
-compacted_between(this_old_ip, this_old_ip + jip, compacted_counts);
-brw_inst_set_jip(brw, insn, jip);
+int32_t jip = brw_inst_jip(brw, insn) / scale;
+jip -= compacted_between(this_old_ip, this_old_ip + jip, compacted_counts);
+brw_inst_set_jip(brw, insn, jip * scale);
 if (brw_inst_opcode(brw, insn) == BRW_OPCODE_ENDIF ||
 brw_inst_opcode(brw, insn) == BRW_OPCODE_WHILE)
 return;
-int32_t uip = brw_inst_uip(brw, insn);
-uip -= scale *
-compacted_between(this_old_ip, this_old_ip + uip, compacted_counts);
-brw_inst_set_uip(brw, insn, uip);
+int32_t uip = brw_inst_uip(brw, insn) / scale;
+uip -= compacted_between(this_old_ip, this_old_ip + uip, compacted_counts);
+brw_inst_set_uip(brw, insn, uip * scale);
 }
 void
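
The change above converts the jump distance into instruction-slot units before using it as an index, and converts it back only when storing it. A minimal standalone sketch of that arithmetic follows. It assumes `compacted_counts` is a prefix-count table (entry `i` holds the number of compacted instructions before old position `i`); `rescale_jump` and `example_counts` are hypothetical names for illustration and are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical prefix-count table: example_counts[i] is the number of
 * compacted instructions located before old position i. */
static const int example_counts[6] = {0, 1, 1, 2, 2, 3};

/* Hypothetical helper mirroring the fixed arithmetic: divide the stored
 * jump by scale to get slot units, subtract the compacted instructions
 * that lie between source and target, then scale back for storage. */
static int32_t rescale_jump(int32_t jump, int scale, int this_old_ip,
                            const int *compacted_counts)
{
    assert(scale > 0 && jump % scale == 0);

    int32_t units = jump / scale;
    units -= compacted_counts[this_old_ip + units] -
             compacted_counts[this_old_ip];
    return units * scale;
}
```

With `scale` equal to 8, a stored jump of 32 from old position 1 covers four slots; two of them are compacted in the example table, so the rescaled jump is 16. Using the unscaled value 32 as an index, as the removed lines did, would read far past the intended target.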


@@ -1759,16 +1759,25 @@ fs_visitor::compact_virtual_grfs()
 }
 /* Patch all the references to delta_x/delta_y, since they're used in
-* register allocation.
+* register allocation. If they're unused, switch them to BAD_FILE so
+* we don't think some random VGRF is delta_x/delta_y.
 */
 for (unsigned i = 0; i < ARRAY_SIZE(delta_x); i++) {
-if (delta_x[i].file == GRF && remap_table[delta_x[i].reg] != -1) {
-delta_x[i].reg = remap_table[delta_x[i].reg];
+if (delta_x[i].file == GRF) {
+if (remap_table[delta_x[i].reg] != -1) {
+delta_x[i].reg = remap_table[delta_x[i].reg];
+} else {
+delta_x[i].file = BAD_FILE;
+}
 }
 }
 for (unsigned i = 0; i < ARRAY_SIZE(delta_y); i++) {
-if (delta_y[i].file == GRF && remap_table[delta_y[i].reg] != -1) {
-delta_y[i].reg = remap_table[delta_y[i].reg];
+if (delta_y[i].file == GRF) {
+if (remap_table[delta_y[i].reg] != -1) {
+delta_y[i].reg = remap_table[delta_y[i].reg];
+} else {
+delta_y[i].file = BAD_FILE;
+}
 }
 }
 }
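
The remap-or-invalidate pattern in this hunk can be sketched in isolation. The types below are simplified stand-ins (the real code operates on `fs_reg`), and `remap_or_invalidate` is a hypothetical helper name; the sketch only illustrates why an entry whose register was dropped by compaction is marked `BAD_FILE` instead of being left with a stale index.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the register file enum and register type. */
enum reg_file { BAD_FILE, GRF };

struct reg {
    enum reg_file file;
    int reg;
};

/* Hypothetical helper: remap_table[old] is the new index, or -1 if the
 * old register was unused and removed by compaction. */
static void remap_or_invalidate(struct reg *regs, size_t count,
                                const int *remap_table)
{
    assert(regs != NULL && remap_table != NULL);

    for (size_t i = 0; i < count; i++) {
        if (regs[i].file != GRF)
            continue;

        int new_index = remap_table[regs[i].reg];
        if (new_index != -1) {
            regs[i].reg = new_index;
        } else {
            /* Stale index: a later pass must not mistake whichever
             * register now owns this number for the original one. */
            regs[i].file = BAD_FILE;
        }
    }
}
```

Without the `else` branch, an entry pointing at a removed register keeps its old number, which after compaction may belong to an unrelated register.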


@@ -458,6 +458,7 @@ fs_visitor::assign_regs(bool allow_spilling)
 * that register and set it to the appropriate class.
 */
 if (screen->wm_reg_sets[rsi].aligned_pairs_class >= 0 &&
+this->delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC].file == GRF &&
 this->delta_x[BRW_WM_PERSPECTIVE_PIXEL_BARYCENTRIC].reg == i) {
 c = screen->wm_reg_sets[rsi].aligned_pairs_class;
 }


@@ -109,10 +109,10 @@ fs_visitor::visit(ir_variable *ir)
 * ir_binop_ubo_load expressions and not ir_dereference_variable for UBO
 * variables, so no need for them to be in variable_ht.
 *
-* Atomic counters take no uniform storage, no need to do
-* anything here.
+* Some uniforms, such as samplers and atomic counters, have no actual
+* storage, so we should ignore them.
 */
-if (ir->is_in_uniform_block() || ir->type->contains_atomic())
+if (ir->is_in_uniform_block() || type_size(ir->type) == 0)
 return;
 if (dispatch_width == 16) {
@@ -2238,7 +2238,7 @@ fs_visitor::emit_bool_to_cond_code(ir_rvalue *ir)
 {
 ir_expression *expr = ir->as_expression();
-if (!expr) {
+if (!expr || expr->operation == ir_binop_ubo_load) {
 ir->accept(this);
 fs_inst *inst = emit(AND(reg_null_d, this->result, fs_reg(1)));
@@ -2246,10 +2246,10 @@ fs_visitor::emit_bool_to_cond_code(ir_rvalue *ir)
 return;
 }
-fs_reg op[2];
+fs_reg op[3];
 fs_inst *inst;
-assert(expr->get_num_operands() <= 2);
+assert(expr->get_num_operands() <= 3);
 for (unsigned int i = 0; i < expr->get_num_operands(); i++) {
 assert(expr->operands[i]->type->is_scalar());
@@ -2336,6 +2336,22 @@ fs_visitor::emit_bool_to_cond_code(ir_rvalue *ir)
 brw_conditional_for_comparison(expr->operation)));
 break;
+case ir_triop_csel: {
+/* Expand the boolean condition into the flag register. */
+inst = emit(MOV(reg_null_d, op[0]));
+inst->conditional_mod = BRW_CONDITIONAL_NZ;
+/* Select which boolean to return. */
+fs_reg temp(this, expr->operands[1]->type);
+inst = emit(SEL(temp, op[1], op[2]));
+inst->predicate = BRW_PREDICATE_NORMAL;
+/* Expand the result to a condition code. */
+inst = emit(MOV(reg_null_d, temp));
+inst->conditional_mod = BRW_CONDITIONAL_NZ;
+break;
+}
 default:
 unreachable("not reached");
 }
@@ -2350,12 +2366,12 @@ fs_visitor::emit_if_gen6(ir_if *ir)
 {
 ir_expression *expr = ir->condition->as_expression();
-if (expr) {
-fs_reg op[2];
+if (expr && expr->operation != ir_binop_ubo_load) {
+fs_reg op[3];
 fs_inst *inst;
 fs_reg temp;
-assert(expr->get_num_operands() <= 2);
+assert(expr->get_num_operands() <= 3);
 for (unsigned int i = 0; i < expr->get_num_operands(); i++) {
 assert(expr->operands[i]->type->is_scalar());
@@ -2399,6 +2415,21 @@ fs_visitor::emit_if_gen6(ir_if *ir)
 emit(IF(op[0], op[1],
 brw_conditional_for_comparison(expr->operation)));
 return;
+case ir_triop_csel: {
+/* Expand the boolean condition into the flag register. */
+fs_inst *inst = emit(MOV(reg_null_d, op[0]));
+inst->conditional_mod = BRW_CONDITIONAL_NZ;
+/* Select which boolean to use as the result. */
+fs_reg temp(this, expr->operands[1]->type);
+inst = emit(SEL(temp, op[1], op[2]));
+inst->predicate = BRW_PREDICATE_NORMAL;
+emit(IF(temp, fs_reg(0), BRW_CONDITIONAL_NZ));
+return;
+}
 default:
 unreachable("not reached");
 }


@@ -282,6 +282,7 @@ get_fast_clear_rect(struct brw_context *brw, struct gl_framebuffer *fb,
 * factor is 2 vertically and either 2 or 8 horizontally.
 */
 switch (irb->mt->num_samples) {
+case 2:
 case 4:
 x_scaledown = 8;
 break;
@@ -641,13 +642,19 @@ get_resolve_rect(struct brw_context *brw,
 * with respect to render target being resolved.
 *
 * The scaledown factors in the table that follows are related to the
-* alignment size returned by intel_get_non_msrt_mcs_alignment(), but with
-* X and Y alignment each divided by 2.
+* alignment size returned by intel_get_non_msrt_mcs_alignment() by a
+* multiplier. For IVB and HSW, we divide by two, for BDW we multiply
+* by 8 and 16.
 */
 intel_get_non_msrt_mcs_alignment(brw, mt, &x_align, &y_align);
-x_scaledown = x_align / 2;
-y_scaledown = y_align / 2;
+if (brw->gen >= 8) {
+x_scaledown = x_align * 8;
+y_scaledown = y_align * 16;
+} else {
+x_scaledown = x_align / 2;
+y_scaledown = y_align / 2;
+}
 rect->x0 = rect->y0 = 0;
 rect->x1 = ALIGN(mt->logical_width0, x_scaledown) / x_scaledown;
 rect->y1 = ALIGN(mt->logical_height0, y_scaledown) / y_scaledown;
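
As a worked illustration of the rectangle computation above, here is a minimal standalone sketch. The multipliers follow the hunk (divide by 2 before gen 8, multiply by 8 and 16 from gen 8 on); the alignment values used in the usage note are made-up inputs, not values that `intel_get_non_msrt_mcs_alignment()` is known to return, and `ALIGN_UP`, `struct rect` and `resolve_rect` are hypothetical names.

```c
#include <assert.h>

/* Round value up to the next multiple of alignment (any positive value;
 * this is an illustrative macro, not the driver's ALIGN()). */
#define ALIGN_UP(value, alignment) \
    ((((value) + (alignment) - 1) / (alignment)) * (alignment))

struct rect {
    unsigned x0, y0, x1, y1;
};

/* Hypothetical helper mirroring the hunk: derive the scaledown factors
 * from the MCS alignment, then express the surface size in those units,
 * rounding up so partial blocks are covered. */
static struct rect resolve_rect(int gen, unsigned x_align, unsigned y_align,
                                unsigned width, unsigned height)
{
    unsigned x_scaledown, y_scaledown;

    if (gen >= 8) {
        x_scaledown = x_align * 8;
        y_scaledown = y_align * 16;
    } else {
        x_scaledown = x_align / 2;
        y_scaledown = y_align / 2;
    }
    assert(x_scaledown > 0 && y_scaledown > 0);

    struct rect r = {
        0, 0,
        ALIGN_UP(width, x_scaledown) / x_scaledown,
        ALIGN_UP(height, y_scaledown) / y_scaledown,
    };
    return r;
}
```

For a 1920x1080 surface with assumed alignments of 128x64 on gen 7, the scaledown factors are 64 and 32, giving a 30x34 rectangle. With assumed alignments of 16x4 on gen 8, the factors are 128 and 64, giving 15x17.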


@@ -98,7 +98,7 @@ static const struct brw_tracked_state *gen4_atoms[] =
 &brw_psp_urb_cbs,
 &brw_drawing_rect,
-&brw_indices,
+&brw_indices, /* must come before brw_vertices */
 &brw_index_buffer,
 &brw_vertices,
@@ -169,7 +169,7 @@ static const struct brw_tracked_state *gen6_atoms[] =
 &brw_drawing_rect,
-&brw_indices,
+&brw_indices, /* must come before brw_vertices */
 &brw_index_buffer,
 &brw_vertices,
 };
@@ -244,7 +244,7 @@ static const struct brw_tracked_state *gen7_atoms[] =
 &brw_drawing_rect,
-&brw_indices,
+&brw_indices, /* must come before brw_vertices */
 &brw_index_buffer,
 &brw_vertices,

Some files were not shown because too many files have changed in this diff