Compare commits

1794 Commits

Author SHA1 Message Date
Emil Velikov
d26f3c1f86 Add release notes for the 10.4.7 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-21 00:26:27 +00:00
Emil Velikov
b7b218f3f6 Update version to 10.4.7
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-21 00:19:39 +00:00
Marek Olšák
832c94a55c radeonsi: increase coords array size for radeon_llvm_emit_prepare_cube_coords
radeon_llvm_emit_prepare_cube_coords uses coords[4] in some cases (TXB2 etc.)

Discovered by Coverity. Reported by Ilia Mirkin.

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit a984abdad3)
2015-03-18 21:49:33 +00:00
Mario Kleiner
70832be2f1 glx: Handle out-of-sequence swap completion events correctly. (v2)
The code for emitting INTEL_swap_events swap completion
events needs to translate from 32-Bit sbc on the wire to
64-Bit sbc for the events and handle wraparound accordingly.

It assumed that events would be sent by the server in the
order their corresponding swap requests were emitted from
the client, iow. sbc count should be always increasing. This
was correct for DRI2.

This is not always the case under the DRI3/Present backend,
where the Present extension can execute presents and send out
completion events in a different order than the submission
order of the present requests, due to client code specifying
targetMSC target vblank counts which are not strictly
monotonically increasing. This confused the wraparound
handling. This patch fixes the problem by handling 32-Bit
wraparound in both directions. As long as successive swap
completion events' real 64-Bit sbc's don't differ by more
than 2^30, this should be able to do the right thing.

How this is supposed to work:

awire->sbc contains the low 32-Bits of the true 64-Bit sbc
of the current swap event, transmitted over the wire.

glxDraw->lastEventSbc contains the low 32-Bits of the 64-Bit
sbc of the most recently processed swap event.

glxDraw->eventSbcWrap is a 64-Bit offset which tracks the upper
32-Bits of the current sbc. The final 64-Bit output sbc
aevent->sbc is computed from the sum of awire->sbc and
glxDraw->eventSbcWrap.

Under DRI3/Present, swap completion events can be received
slightly out of order due to non-monotonic targetMsc specified
by client code, e.g., present request submission:

Submission sbc:   1   2   3
targetMsc:        10  11  9

Reception of completion events:
Completion sbc:   3   1   2

The completion sequence 3, 1, 2 would confuse the old wraparound
handling made for DRI2 as 1 < 3 --> Assumes a 32-Bit wraparound
has happened when it hasn't.

The client can queue multiple present requests, in the case of
Mesa up to n requests for n-buffered rendering, e.g., n = 2-4 in
the current Mesa GLX DRI3/Present implementation. In the case of
direct Pixmap presents via xcb_present_pixmap() the number n is
limited by the amount of memory available.

We reasonably assume that the number of outstanding requests n is
much less than 2 billion due to memory constraints and common sense.
Therefore while the order of received sbc's can be a bit scrambled,
successive 64-Bit sbc's won't deviate by much, a given sbc may be
a few counts lower or higher than the previous received sbc.

Therefore any large difference between the incoming awire->sbc and
the last recorded glxDraw->lastEventSbc will be due to 32-Bit
wraparound and we need to adapt glxDraw->eventSbcWrap accordingly
to adjust the upper 32-Bits of the sbc.

Two cases, corresponding to the two if-statements in the patch:

a) Previous sbc event was below the last 2^32 boundary, in the previous
glxDraw->eventSbcWrap epoch, the new sbc event is in the next 2^32
epoch, therefore the low 32-Bit awire->sbc wrapped around to zero,
or close to zero --> awire->sbc is apparently much lower than the
glxDraw->lastEventSbc recorded for the previous epoch

--> We need to increment glxDraw->eventSbcWrap by 2^32 to adjust
the current epoch to be one higher than the previous one.

--> Case a) also handles the old DRI2 behaviour.

b) Previous sbc event was above closest 2^32 boundary, but now a
late event from the previous 2^32 epoch arrives, with a true sbc
that belongs to the previous 2^32 segment, so the awire->sbc of
this late event has a high count close to 2^32, whereas
glxDraw->lastEventSbc is closer to zero --> awire->sbc is much
greater than glxDraw->lastEventSbc.

--> We need to decrement glxDraw->eventSbcWrap by 2^32 to adjust
the current epoch back to the previous lower epoch of this late
completion event.

We assume such a wraparound to a higher (a) epoch or lower (b)
epoch has happened if awire->sbc and glxDraw->lastEventSbc differ
by more than 2^30 counts, as such a difference can only happen
on wraparound, or if somehow 2^30 present requests would be pending
for a given drawable inside the server, which is rather unlikely.
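
The two cases above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the patch itself: struct sbc_state and extend_sbc() are hypothetical names, and only the fields lastEventSbc and eventSbcWrap and the 2^30 threshold come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-drawable state described above. */
struct sbc_state {
    uint32_t lastEventSbc;  /* low 32 bits of the most recently processed sbc */
    int64_t  eventSbcWrap;  /* epoch offset, always a multiple of 2^32 */
};

#define SBC_WRAP_THRESHOLD (UINT32_C(1) << 30)
#define SBC_EPOCH          INT64_C(0x100000000)

/* Extend a 32-bit wire sbc to 64 bits, moving the epoch up or down. */
int64_t
extend_sbc(struct sbc_state *st, uint32_t wire_sbc)
{
    assert(st != NULL);

    if (wire_sbc < st->lastEventSbc &&
        st->lastEventSbc - wire_sbc > SBC_WRAP_THRESHOLD) {
        /* Case a): the wire value wrapped to a low count, next epoch. */
        st->eventSbcWrap += SBC_EPOCH;
    } else if (wire_sbc > st->lastEventSbc &&
               wire_sbc - st->lastEventSbc > SBC_WRAP_THRESHOLD) {
        /* Case b): a late event from the previous epoch arrived. */
        st->eventSbcWrap -= SBC_EPOCH;
    }

    st->lastEventSbc = wire_sbc;
    return (int64_t)wire_sbc + st->eventSbcWrap;
}
```

With this sketch the completion sequence 3, 1, 2 yields 3, 1, 2 unchanged, because none of the differences comes close to 2^30, while a jump from a wire value near 2^32 to one near zero moves the epoch up by one.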

v2: Explain the reason for this patch and the new wraparound handling
    much more extensively in the commit message; no code change wrt. the
    initial version.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit cc5ddd584d)
2015-03-18 21:49:25 +00:00
Emil Velikov
ad259df2e0 auxiliary/os: fix the android build - s/drm_munmap/os_munmap/
Squash this silly typo introduced with commit c63eb5dd5ec (auxiliary/os: get
the mmap/munmap wrappers working with android)

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 55f0c0a29f)
2015-03-18 21:49:18 +00:00
Emil Velikov
df2db2a55f loader: include <sys/stat.h> for non-sysfs builds
Required by fstat(), otherwise we'll error out due to implicit function
declaration.

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89530
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reported-by: Vadim Rutkovsky <vrutkovs@redhat.com>
Tested-by: Vadim Rutkovsky <vrutkovs@redhat.com>
(cherry picked from commit 771cd266b9)
2015-03-18 21:49:05 +00:00
Rob Clark
0506f69f08 freedreno: update generated headers
Fix a3xx texture layer-size.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e92bc6b38e)
[Emil Velikov: squash trivial conflicts, drop the a4xx.xml.h changes]

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/gallium/drivers/freedreno/a2xx/a2xx.xml.h
	src/gallium/drivers/freedreno/a3xx/a3xx.xml.h
	src/gallium/drivers/freedreno/a4xx/a4xx.xml.h
	src/gallium/drivers/freedreno/adreno_common.xml.h
	src/gallium/drivers/freedreno/adreno_pm4.xml.h
2015-03-18 21:48:40 +00:00
Ilia Mirkin
a563045009 freedreno: fix slice pitch calculations
For example if width were 65, the first slice would get 96 while the
second would get 32. However the hardware appears to expect the second
pitch to be 64, based on halving the 96 (and aligning up to 32).

This fixes texelFetch piglit tests on a3xx below a certain size. Going
higher they break again, but most likely due to unrelated reasons.
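
The arithmetic in the width-65 example can be reproduced with a small sketch. The helper names below are hypothetical and the 32-texel alignment is taken from the numbers in this message; the driver's actual layout code may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to a multiple of a, where a is a power of two. */
uint32_t
align_up(uint32_t v, uint32_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/* Old behaviour: minify the original width, then align each level. */
uint32_t
pitch_from_width(uint32_t width, unsigned level)
{
    uint32_t w = width >> level;
    if (w == 0)
        w = 1;
    return align_up(w, 32);
}

/* New behaviour: halve the previous level's aligned pitch, then align. */
uint32_t
pitch_from_prev_pitch(uint32_t width, unsigned level)
{
    uint32_t pitch = align_up(width, 32);
    for (unsigned i = 0; i < level; i++) {
        uint32_t half = pitch > 1 ? pitch / 2 : 1;
        pitch = align_up(half, 32);
    }
    return pitch;
}
```

For a width of 65, pitch_from_width() gives 96 and then 32, whereas pitch_from_prev_pitch() gives 96 and then 64, which is the value the hardware appears to expect.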

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 620e29b748)
2015-03-18 21:32:21 +00:00
Samuel Iglesias Gonsalvez
b2e243f70c glsl: optimize (0 cmp x + y) into (-x cmp y).
The optimization done by commit 34ec1a24d did not take it into account.

Fixes:

dEQP-GLES3.functional.shaders.random.all_features.fragment.20

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b43bbfa90a)
2015-03-18 21:15:35 +00:00
Iago Toral Quiroga
8c25b0f2d1 i965: Fix out-of-bounds accesses into pull_constant_loc array
The piglit test glsl-fs-uniform-array-loop-unroll.shader_test was designed
to do an out of bounds access into a uniform array to make sure that we
handle that situation gracefully inside the driver, however, as Ken describes
in bug 79202, Valgrind reports that this is leading to an out-of-bounds access
in fs_visitor::demote_pull_constants().

Before accessing the pull_constant_loc array we should make sure that
the uniform we are trying to access is valid.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79202
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 6ac1bc90c4)
Nominated-by: Matt Turner <mattst88@gmail.com>
2015-03-11 17:46:03 +00:00
Rob Clark
a91ee1e187 freedreno/ir3: fix silly typo for binning pass shaders
Was resulting in gl_PointSize write being optimized out, causing
particle system type shaders to hang if hw binning enabled.

Fixes neverball, OGLES2ParticleSystem, etc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 60096ed906)
2015-03-11 17:44:38 +00:00
Marek Olšák
977626f10a r300g: fix sRGB->sRGB blits
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c939231e72)
2015-03-11 17:42:52 +00:00
Marek Olšák
b451a2ffbf r300g: fix a crash when resolving into an sRGB texture
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9953586af2)
2015-03-11 17:42:38 +00:00
Marek Olšák
a561eee82c r300g: fix RGTC1 and LATC1 SNORM formats
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 74a757f92f)
2015-03-11 17:42:07 +00:00
Stefan Dösinger
80ef80d087 r300g: Fix the ATI1N swizzle (RGTC1 and LATC1)
This fixes the GL_COMPRESSED_RED_RGTC1 part of piglit's rgtc-teximage-01
test as well as the precision part of Wine's 3dc format test (fd.o bug
89156).

The Z component seems to contain a lower precision version of the
result, probably a temporary value from the decompression computation.
The Y and W component contain different data that depends on the input
values as well, but I could not make sense of them (Not that I tried
very hard).

GL_COMPRESSED_SIGNED_RED_RGTC1 still seems to have precision problems in
piglit, and both formats are affected by a compiler bug if they're
sampled by the shader with a swizzle other than .xyzw. Wine uses .xxxx,
which returns random garbage.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89156
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f710b99071)
2015-03-11 17:41:43 +00:00
Ilia Mirkin
fa8bfb3ed1 freedreno/ir3: get the # of miplevels from getinfo
This fixes ARB_texture_query_levels to actually return the desired
value.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cb3eb43ad6)
2015-03-11 17:41:32 +00:00
Ilia Mirkin
025cf8cb3f freedreno/ir3: fix array count returned by TXQ
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8ac957a51c)
2015-03-11 17:41:20 +00:00
Ilia Mirkin
4db4f70546 freedreno: move fb state copy after checking for size change
Fixes: 1f3ca56b ("freedreno: use util_copy_framebuffer_state()")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f3dfe6513c)
2015-03-11 17:40:59 +00:00
Andrey Sudnik
d4a95ffcda i965/vec4: Don't lose the saturate modifier in copy propagation.
Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89224
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0dfec59a27)
2015-03-07 16:41:16 +00:00
Emil Velikov
97b0219ed5 mesa: rename format_info.c to format_info.h
The file is auto-generated, and #included by formats.c. Let's rename it
to reflect the latter. This will also help us fix the dependency
tracking by adding it to the _SOURCES variable, without the side effect
of it being compiled (twice).

v2: Update .gitignore to reflect the rename.

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 3f6c28f2a9)

Conflicts:
	src/mesa/Makefile.am
	src/mesa/main/.gitignore
2015-03-07 16:40:27 +00:00
Matt Turner
93273f16af r300g: Check return value of snprintf().
Would have at least prevented the crash the previous patch fixed.
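
As an illustration of the kind of check meant here, the sketch below treats both an encoding error and a truncated result as failure. build_path() is a hypothetical helper, not the function touched by this patch; the only library behaviour relied on is that snprintf() returns the length the full string would have had.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "dir/file" into buf.
 * Returns false on error or if the result did not fit in size bytes. */
bool
build_path(char *buf, size_t size, const char *dir, const char *file)
{
    assert(buf != NULL && size > 0);

    int n = snprintf(buf, size, "%s/%s", dir, file);

    /* n < 0 signals an error; n >= size means the output was truncated. */
    return n >= 0 && (size_t)n < size;
}
```

A caller that sees false can report the over-long path instead of silently opening a truncated one.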

Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540970
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit ade0b580e7)
2015-03-07 16:37:22 +00:00
Matt Turner
8e8d215cae r300g: Use PATH_MAX instead of limiting ourselves to 100 chars.
When built with Gentoo's package manager, the Mesa source directory
exists seven directories deep. The path to the .test file is too long
and is silently truncated, leading to a crash. Just use PATH_MAX.

Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540970
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit f5e2aa1324)
2015-03-07 16:37:15 +00:00
Daniel Stone
1a929baa0b egl: Take alpha bits into account when selecting GBM formats
This fixes piglit when using PIGLIT_PLATFORM=gbm

Tom Stellard:
  - Fix ARGB2101010 format

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
(cherry picked from commit 65c8965d03)
2015-03-07 16:37:04 +00:00
Marc-Andre Lureau
3a625d0b3f gallium/auxiliary/indices: fix start param
Since commit 28f3f8d, index generators take a start parameter. However, some
index values have been left to start at 0.

This fixes the glean/fbo test with the virgl driver, and copytexsubimage
with freedreno.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 073a5d2e84)
2015-03-07 16:36:47 +00:00
Emil Velikov
944ef59b2f cherry-ignore: add not applicable/rejected commits
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-07 16:36:05 +00:00
Emil Velikov
fc9dd495b2 docs: Add sha256 sums for the 10.4.6 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 19:44:55 +00:00
Emil Velikov
542a754524 Add release notes for the 10.4.6 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 19:23:34 +00:00
Emil Velikov
e559d126f9 Update version to 10.4.6
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 19:16:58 +00:00
Emil Velikov
fc5881ad73 Revert "gallivm: Update for RTDyldMemoryManager becoming an unique_ptr."
This reverts commit 66a3f104a5.

The commit is likely insufficient for normal work with LLVM 3.6.
The full discussion and reason can be found at
http://lists.freedesktop.org/archives/mesa-dev/2015-March/078795.html
2015-03-06 19:16:28 +00:00
Emil Velikov
9508ca24f1 mesa: cherry-pick the second half of commit 2aa71e9485
Missed out by commit 39ae85732d2 (mesa: Fix error validating args for
TexSubImage3D)

Reported-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 19:16:19 +00:00
Matt Turner
644bbf88ec mesa: Correct backwards NULL check.
Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 491d42135a)
2015-03-06 18:45:13 +00:00
Ian Romanick
a369361f9e mesa: Always generate GL_INVALID_OPERATION in _mesa_GetProgramBinary
There are no binary formats supported, so what are you doing?  At least
this gives the application developer some feedback about what's going
on.  The spec gives no guidance about what to do in this scenario.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Leith Bade <leith@mapbox.com>
(cherry picked from commit f591712efe)
2015-03-06 18:44:52 +00:00
Ian Romanick
f1663a5236 mesa: Ensure that length is set to zero in _mesa_GetProgramBinary
v2: Fix assignment of length.  Noticed by Julien Cristau.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Leith Bade <leith@mapbox.com>
(cherry picked from commit 4fd8b30123)
2015-03-06 18:44:37 +00:00
Ian Romanick
e1b5bc9330 mesa: Add missing error checks in _mesa_ProgramBinary
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Leith Bade <leith@mapbox.com>
(cherry picked from commit 201b9c1818)

Conflicts:
	src/mesa/main/shaderapi.c
2015-03-06 18:42:51 +00:00
Emil Velikov
93edf3e7dc Revert "mesa: Correct backwards NULL check."
This reverts commit a598a9bdfe.

The patch was applied without the required dependencies.
2015-03-06 18:40:09 +00:00
José Fonseca
66a3f104a5 gallivm: Update for RTDyldMemoryManager becoming an unique_ptr.
Trivial.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=86958

(cherry picked from commit ef7e0b39a2)
Nominated-by: Sedat Dilek <sedat.dilek@gmail.com>
2015-03-04 01:51:36 +00:00
Abdiel Janulgue
afa7a851da st/mesa: For vertex shaders, don't emit saturate when SM 3.0 is unsupported
There is a bug in the current lowering pass implementation where we lower saturate
to clamp only for vertex shaders on drivers supporting SM 3.0. The correct behavior
is to actually lower to clamp only when we don't support saturate which happens
on drivers that don't support SM 3.0

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit 49e0431211)
Nominated-by: Matt Turner <mattst88@gmail.com>
2015-03-04 01:51:36 +00:00
Abdiel Janulgue
d880aa573c glsl: Don't optimize min/max into saturate when EmitNoSat is set
v3: Fix multi-line comment format (Ian)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit 4ea8c8d56c)
2015-03-04 01:51:36 +00:00
Matt Turner
741aeba26f i965/fs: Don't use backend_visitor::instructions after creating the CFG.
This is a fix for a regression introduced in commit a9f8296d ("i965/fs:
Preserve the CFG in a few more places.").

The errata this code works around is described in a comment before the function:

   "[DevBW, DevCL] Errata: A destination register from a send can not be
    used as a destination register until after it has been sourced by an
    instruction with a different destination register."

The framebuffer write's sources must be in message registers, which SEND
instructions cannot have as a destination. There's no way for this
errata to affect anything at the end of the program. Just remove the
code.

Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84613
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit e214000f25)
2015-03-04 01:51:36 +00:00
Matt Turner
a598a9bdfe mesa: Correct backwards NULL check.
Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 491d42135a)
[Emil Velikov: the patch hunk has a different offset.]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/mesa/main/shaderapi.c
2015-03-04 01:51:36 +00:00
Chris Forbes
0c46d850d9 i965/gs: Check newly-generated GS-out VUE map against correct stage
Previously, we compared our new GS-out VUE map to the existing *VS*-out
VUE map, which is bogus.

This would mostly manifest as redundant dirty flagging where the GS is
in use but the VS and GS output layouts differ; but there is a scary
case where we would fail to flag a GS-out layout change if it happened
to match the VS-out layout.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.5, 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88885
(cherry picked from commit b51ff50a76)
2015-03-04 01:51:36 +00:00
Jonathan Gray
da46b1b160 auxiliary/os: correct sysctl use in os_get_total_physical_memory()
The length argument passed to sysctl was the size of the pointer
not the type.  The result of this is sysctl calls would fail on
32 bit BSD/Mac OS X.

Additionally the wrong pointer was passed as an argument to store
the result of the sysctl call.

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 7983a3d2e0)
2015-03-04 01:51:36 +00:00
Matt Turner
7e723c98ce glsl: Rewrite and fix min/max to saturate optimization.
There were some bugs, and the code was really difficult to follow. We
would optimize

   min(max(x, b), 1.0) into max(sat(x), b)

but not pay attention to the order of min/max and also do

   max(min(x, b), 1.0) into max(sat(x), b)

Corrects four shaders from Champions of Regnum that do

   min(max(x, 1), 10)

and corrects rendering of Mass Effect under VMware Workstation.
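
The order sensitivity can be checked numerically. The sketch below uses hypothetical helper names and plain C floats rather than GLSL IR: the first function is the pattern that may be rewritten when 0 <= b <= 1, the second is the swapped pattern that must not be, and the third is the rewritten form.

```c
#include <assert.h>

float min_f(float a, float b) { return a < b ? a : b; }
float max_f(float a, float b) { return a > b ? a : b; }

/* Saturate: clamp x to the range [0, 1]. */
float sat(float x) { return min_f(max_f(x, 0.0f), 1.0f); }

/* min(max(x, b), 1.0): equals max(sat(x), b) when 0 <= b <= 1. */
float
clamp_lo_then_cap(float x, float b)
{
    assert(b >= 0.0f && b <= 1.0f);
    return min_f(max_f(x, b), 1.0f);
}

/* max(min(x, b), 1.0): never below 1.0, so it is not a saturate. */
float
swapped_order(float x, float b)
{
    return max_f(min_f(x, b), 1.0f);
}

/* The rewritten form max(sat(x), b). */
float
sat_rewrite(float x, float b)
{
    return max_f(sat(x), b);
}
```

With x = 0.5 and b = 0.25, swapped_order() returns 1.0 while sat_rewrite() returns 0.5, so applying the rewrite to the swapped pattern changes the result.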

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89180
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit cb25087c7b)
2015-03-04 01:51:36 +00:00
Andreas Boll
0a51529a28 glx: Fix returned values of GLX_RENDERER_PREFERRED_PROFILE_MESA
If the renderer supports the core profile, the query incorrectly returned
0x8 as the value, because it was using (1U << __DRI_API_OPENGL_CORE) for
the returned value.

The same happened with the compatibility profile. It returned 0x1
(1U << __DRI_API_OPENGL) instead of 0x2.

Internal DRI defines:
   dri_interface.h: #define __DRI_API_OPENGL       0
   dri_interface.h: #define __DRI_API_OPENGL_CORE  3

Those two bits are intended for internal use only and should be
translated to GLX_CONTEXT_CORE_PROFILE_BIT_ARB (0x1) for a preferred
core context profile and GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB (0x2)
for a preferred compatibility context profile.

This patch implements the above translation in the glx module.
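
A minimal sketch of that translation, using the numeric values quoted above. The macro and function names are stand-ins chosen for illustration, and returning 0 for an unrecognised mask is an assumption of this sketch rather than documented behaviour.

```c
#include <assert.h>

/* Stand-ins for the internal DRI defines quoted above. */
#define DRI_API_OPENGL       0
#define DRI_API_OPENGL_CORE  3

/* Stand-ins for GLX_CONTEXT_CORE_PROFILE_BIT_ARB and
 * GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB. */
#define CORE_PROFILE_BIT     0x1u
#define COMPAT_PROFILE_BIT   0x2u

/* Translate the internal DRI API mask into the GLX profile bit. */
unsigned
preferred_profile_to_glx(unsigned dri_mask)
{
    if (dri_mask == (1u << DRI_API_OPENGL_CORE))
        return CORE_PROFILE_BIT;
    if (dri_mask == (1u << DRI_API_OPENGL))
        return COMPAT_PROFILE_BIT;
    return 0; /* assumption: unknown masks map to no profile */
}

/* Usage check mirroring the values in this message. */
void
check_profile_translation(void)
{
    assert(preferred_profile_to_glx(0x8u) == CORE_PROFILE_BIT);
    assert(preferred_profile_to_glx(0x1u) == COMPAT_PROFILE_BIT);
}
```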

v2: Fix the incorrect behavior in the glx module

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 6d164f65c5)
2015-03-04 01:51:36 +00:00
Leo Liu
2a9e9b5aeb st/omx/dec/h264: fix picture out-of-order with poc type 0 v2
poc counter should be reset with IDR frame,
otherwise there would be a re-order issue with
frames before and after IDR

v2: add commit message

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9c7b343bc0)
2015-03-04 01:51:36 +00:00
Emil Velikov
120792fa04 install-lib-links: remove the .install-lib-links file
With earlier commit (install-lib-links: don't depend on .libs directory)
we moved the location of the file from .libs/ to the current dir.
However, we overlooked that in the former case autotools was doing us a
favour by removing the file. Explicitly remove the file at
clean-local time, otherwise we'll end up with dangling files.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit fece147be5)
2015-03-04 01:51:35 +00:00
Eduardo Lima Mitev
39ae85732d mesa: Fix error validating args for TexSubImage3D
The zoffset and depth values were not being considered when calling
error_check_subtexture_dimensions().

Fixes 2 dEQP tests:
* dEQP-GLES3.functional.negative_api.texture.texsubimage3d_neg_offset
* dEQP-GLES3.functional.negative_api.texture.texsubimage3d_invalid_offset

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2aa71e9485)
[Emil Velikov: Resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/mesa/main/teximage.c
2015-03-04 01:51:35 +00:00
Marek Olšák
61c1aabb9f radeonsi: fix point sprites
Broken by a27b74819a.

This fix is critical and should be ported to stable ASAP.

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7820a11e3d)

Squashed with commit

radeonsi: fix a warning caused by previous commit

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 050bf75c8b)

[Emil Velikov: The file was renamed si_state_{shaders,draw}.c]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_state_shader.c
2015-03-04 01:51:16 +00:00
Marek Olšák
6da4e66d4e vbo: fix an uninitialized-variable warning
It looks like a bug to me.

Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 0feb0b7373)
2015-03-04 00:39:01 +00:00
Brian Paul
7e57411b9a st/mesa: fix sampler view reference counting bug in glDraw/CopyPixels
Use pipe_sampler_view_reference() instead of ordinary assignment.
Also add a new sanity check assertion.

Fixes piglit gl-1.0-drawpixels-color-index test crash.  But note
that the test still fails.

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 62a8883f32)
2015-03-04 00:38:31 +00:00
Brian Paul
1e6735ead1 swrast: fix multiple color buffer writing
If a fragment program wrote to more than one color buffer, the
first fragment color got replicated to all dest buffers.  This
fixes 5 piglit FBO tests, including fbo-drawbuffers-arbfp.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45348
Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 89c96afe3c)
2015-03-04 00:38:23 +00:00
Lucas Stach
deea686c71 install-lib-links: don't depend on .libs directory
This snippet can be included in Makefiles that may, depending on the
project configuration, not actually build any installable libraries.

In that case we don't have anything to depend on and this part of
the makefile may be executed before the .libs directory is created,
so do not depend on it being there.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
(cherry picked from commit 5c1aac17ad)
2015-03-04 00:38:11 +00:00
Emil Velikov
41bdeda102 docs: Add sha256 sums for the 10.4.5 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-21 12:31:51 +00:00
Emil Velikov
a5c608e951 Add release notes for the 10.4.5 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-21 12:22:08 +00:00
Emil Velikov
e0276bc297 Update version to 10.4.5
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-21 12:17:35 +00:00
Michel Dänzer
dc16fb1969 Revert "radeon/llvm: enable unsafe math for graphics shaders"
This reverts commit 0e9cdedd2e.

It caused the grass to disappear in The Talos Principle.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89069
Cc: "10.5 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 4db985a5fa)
2015-02-18 12:17:44 +00:00
Kenneth Graunke
aaa823569b glsl: Reduce memory consumption of copy propagation passes.
opt_copy_propagation and opt_copy_propagation_elements create new ACP
and Kill sets each time they enter a new control flow block.  For if
blocks, they also copy the entire existing ACP set contents into the
new set.

When we exit the control flow block, we discard the new sets.  However,
we weren't freeing them - so they lived on until the pass finished.
This can waste a lot of memory (57MB on one pessimal shader).

This patch makes the pass allocate ACP entries using this->acp as the
memory context, and Kill entries out of this->kill.  It also steals
kill entries when moving them from the inner kill list to the parent.

It then frees the lists, including their contents.

v2: Move ralloc_free(this->acp) just before this->acp = orig_acp
    (suggested by Eric Anholt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.5 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 76960a55e6)
2015-02-18 12:17:44 +00:00
Laura Ekstrand
f57b41758d main: Fixed _mesa_GetCompressedTexImage_sw to copy slices correctly.
Previously array textures were not working with GetCompressedTextureImage,
leading to failures in the test
arb_direct_state_access/getcompressedtextureimage.c.

Tested-by: Laura Ekstrand <laura@jlekstrand.net>
Reviewed-by: Brian Paul <brianp@vmware.com>

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 92163482bd)
2015-02-18 12:17:44 +00:00
Marek Olšák
67ac6a3951 radeonsi: fix a crash if a stencil ref state is set before a DSA state
+ minor indentation fixes

Discovered by Axel Davy.

This can't be reproduced with any app, because all state trackers set a DSA
state first.

Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 2ead74888a)
2015-02-18 12:17:44 +00:00
Marek Olšák
5d04b9eeed mesa: fix AtomicBuffer typo in _mesa_DeleteBuffers
Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit e8625a29fe)
2015-02-18 12:17:43 +00:00
Marek Olšák
53041aecef radeonsi: small fix in SPI state
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

(cherry picked from commit a27b74819a)
[Emil Velikov: The file was renamed si_state_{shaders,draw}.c]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
        src/gallium/drivers/radeonsi/si_state_shader.c
2015-02-18 12:14:04 +00:00
Ilia Mirkin
f76bcbb4cd nvc0: allow holes in xfb target lists
Tested with a modified xfb-streams test which outputs to streams 0, 2,
and 3.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 854eb06bee)
2015-02-18 12:09:55 +00:00
Ilia Mirkin
89289934fc st/mesa: treat resource-less xfb buffers as if they weren't there
If a transform feedback buffer's size is 0, st_bufferobj_data doesn't
end up creating a buffer for it. There's no point in trying to write to
such a buffer, so just pretend as if it's not really there.

This fixes arb_gpu_shader5-xfb-streams-without-invocations on nvc0.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 80d373ed5b)
2015-02-18 12:09:54 +00:00
Ilia Mirkin
dbf82d753b nvc0: bail out of 2d blits with non-A8_UNORM alpha formats
This fixes the teximage-colors uploads with GL_ALPHA format and
non-GL_UNSIGNED_BYTE type.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 68e4f3f572)
2015-02-18 12:09:54 +00:00
Emil Velikov
b786e6332b get-pick-list.sh: Require explicit "10.4" for nominating stable patches
A nomination unadorned with a specific version is now interpreted as
being aimed at the 10.5 branch, which was recently opened.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-18 12:09:54 +00:00
Carl Worth
c0ce908a90 Revert use of Mesa IR optimizer for ARB_fragment_programs
Commit f82f2fb3dc added use of the Mesa
IR optimizer for both ARB_fragment_program and ARB_vertex_program, but
only justified the vertex-program portions with measured performance
improvements.

Meanwhile, the optimizer was seen to generate hundreds of unused
immediates without discarding them, causing failures.

Discard the use of the optimizer for now to fix the regression. (In
the future, we anticipate things moving from Mesa IR to NIR for better
optimization anyway.)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82477

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

CC: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 55a57834bf)
2015-02-18 12:09:54 +00:00
Kenneth Graunke
c83c5f4b69 i965: Fix integer border color on Haswell.
+82 Piglits - 100% of border color tests now pass on Haswell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 08a06b6b89)
2015-02-18 12:09:54 +00:00
Kenneth Graunke
f2663112f6 i965: Use a gl_color_union for sampler border color.
This should have no effect, but will make it easier to implement other
bug fixes.

v2: Eliminate "unsigned one" local; just use the value where necessary.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit e1e73443c5)
2015-02-18 12:09:54 +00:00
Kenneth Graunke
2ad93851ff i965: Override swizzles for integer luminance formats.
The hardware's integer luminance formats are completely unusable;
currently we fall back to RGBA.  This means we need to override
the texture swizzle to obtain the XXX1 values expected for luminance
formats.

Fixes spec/EXT_texture_integer/texwrap formats bordercolor [swizzled]
on Broadwell - 100% of border color tests now pass on Broadwell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8cb18760cc)
2015-02-18 12:09:54 +00:00
Michel Dänzer
e35e6773c2 st/mesa: Don't use PIPE_USAGE_STREAM for GL_PIXEL_UNPACK_BUFFER_ARB
The latter currently implies CPU read access, so only PIPE_USAGE_STAGING
can be expected to be fast.

Mesa demos src/tests/streaming_rect on Kaveri (radeonsi):

Unpatched:  42 frames in  1.023 seconds = 41.056 FPS
Patched:   615 frames in  1.000 seconds = 615.000 FPS

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88658
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a338dc0186)
2015-02-18 12:09:54 +00:00
Marek Olšák
51bdd19c97 radeonsi: fix instanced arrays with non-zero start instance
Fixes piglit ARB_base_instance/arb_base_instance-drawarrays.

Cc: 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 50908a8918)
2015-02-18 12:09:54 +00:00
Marek Olšák
5c623ff071 r600g,radeonsi: don't append to streamout buffers that haven't been used yet
The FILLED_SIZE counter is uninitialized at the beginning, so we can't use it.
Instead, use offset = 0, which is what we always do when not appending.

This unexpectedly fixes spec/ARB_texture_multisample/sample-position/*.
Yes, the test does use transform feedback.

Cc: 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 658f1d4cfe)
2015-02-18 12:09:53 +00:00
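A minimal sketch of the offset choice described in 5c623ff071 above, assuming a hypothetical per-target flag that records whether streamout has written the buffer before. The struct, field, and function names are illustrative only and are not the r600g or radeonsi code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical streamout target; not the driver's real structure. */
struct sketch_so_target {
    bool     filled_size_valid; /* set once streamout has written the buffer */
    uint32_t filled_size;       /* stand-in for the FILLED_SIZE counter */
};

/* Pick the write offset for a streamout target.
 * Appending only makes sense when the counter was initialized by an
 * earlier use; otherwise start at offset 0, as for a non-append bind. */
static uint32_t sketch_so_start_offset(const struct sketch_so_target *t,
                                       bool append)
{
    assert(t != NULL);
    if (append && t->filled_size_valid)
        return t->filled_size;
    return 0;
}
```

With this shape, a never-used target whose counter holds garbage still starts at offset 0 even when the caller asks to append.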
Jeremy Huddleston Sequoia
654f197f19 darwin: build fix
xfont.c:237:14: error: implicit declaration of function 'GetGLXDRIDrawable' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
   glxdraw = GetGLXDRIDrawable(CC->currentDpy, CC->currentDrawable);
             ^
Fixes regression from 291be28476

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
(cherry picked from commit e68b67b53f)
2015-02-11 00:24:04 -08:00
Jeremy Huddleston Sequoia
162cee83ba darwin: build fix
../../../src/mesa/main/compiler.h:47:10: fatal error: 'util/macros.h' file not found

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
(cherry picked from commit 1c67a5687a)
2015-02-10 20:35:33 -08:00
Emil Velikov
54da987bae docs: Add sha256 sums for the 10.4.4 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-07 00:47:18 +00:00
Emil Velikov
62eb27ac8b Add release notes for the 10.4.4 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-07 00:17:09 +00:00
Emil Velikov
a824179af5 Update version to 10.4.4
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-07 00:12:04 +00:00
Park, Jeongmin
fecedb6c43 st/osmesa: Fix osbuffer->textures indexing
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88930
Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 6fd4a61ad6)
2015-02-04 01:37:33 +00:00
Matt Turner
9d1d1f46c7 gallium/util: Don't use __builtin_clrsb in util_last_bit().
Unclear circumstances lead to undefined symbols on x86.

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=536916
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 32e98e8ef0)
2015-02-04 01:37:20 +00:00
José Fonseca
b51d369690 egl: Pass the correct X visual depth to xcb_put_image().
The dri2_x11_add_configs_for_visuals() function happily matches a 32
bits EGLconfig with a 24 bits X visual.  However it was passing 32bits
depth to xcb_put_image(), making X server unhappy:

  https://github.com/apitrace/apitrace/issues/313#issuecomment-70571911

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 11a955aef4)
2015-02-02 00:12:04 +00:00
Niels Ole Salscheider
eab8dc28ed configure: Link against all LLVM targets when building clover
Since 8e7df519bd, we initialise all targets in
clover. This fixes bug 85380.

v2: Mention correct bug in commit message

Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4b94c3fc31)
2015-02-02 00:12:04 +00:00
Ville Syrjälä
cc580045a8 i965: Fix max_wm_threads for CHV
Change max_wm_threads to match the spec on CHV. The max number of
threads in 3DSTATE_PS is always programmed to 64 and the hardware
internally scales that depending on the GT SKU. So this doesn't
change the max number of threads actually used, but it does affect
the scratch space calculation.

On CHV the old value was too small, so the amount of scratch space
allocated wasn't sufficient to satisfy the actual max number of
threads used.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit 99754446ab)
2015-02-02 00:12:04 +00:00
Mario Kleiner
0d721fa1d6 glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)
Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>

v2: Add Frank Binns' Signed-off-by for his original earlier
patch from April 2014, which is identical to this one, and
Chris Wilson's Reviewed-by tag from May 2014 for that patch, ergo
also for this one.

v3: Incorporate comment about triple buffering as suggested
by Axel Davy, and reference to relevant spec provided by
Eric Anholt.

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 455d3036fa)
2015-02-02 00:12:04 +00:00
Brian Paul
c96ed76b3d mesa: fix display list 8-byte alignment issue
The _mesa_dlist_alloc() function is only guaranteed to return a pointer
with 4-byte alignment.  On 64-bit systems which don't support unaligned
loads (e.g. SPARC or MIPS) this could lead to a bus error in the VBO code.

The solution is to add a new _mesa_dlist_alloc_aligned() function which
will return a pointer to an 8-byte aligned address on 64-bit systems.
This is accomplished by inserting a 4-byte NOP instruction in the display
list when needed.

The only place this actually matters is the VBO code where we need to
allocate a 'struct vbo_save_vertex_list' which needs to be 8-byte
aligned (just as if it were malloc'd).

The gears demo and others hit this bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88662
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 53b01938ed)
2015-01-30 08:51:51 -07:00
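The NOP-padding idea described in c96ed76b3d above can be sketched as follows. This is a simplified model under stated assumptions: the display list is a flat array of 4-byte words, and every name here (sketch_dlist, sketch_alloc, sketch_alloc_aligned) is hypothetical rather than Mesa's actual _mesa_dlist_alloc() machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes and capacity for the sketch. */
enum { SKETCH_OP_NOP = 0, SKETCH_OP_PAYLOAD = 1, SKETCH_CAPACITY = 64 };

/* A flat list of 4-byte words: one opcode word followed by payload words. */
typedef struct {
    union { uint64_t align8; uint32_t words[SKETCH_CAPACITY]; } storage;
    size_t used; /* number of words consumed so far */
} sketch_dlist;

/* Reserve one opcode word plus room for `bytes` of payload.
 * The returned payload pointer is only guaranteed 4-byte alignment. */
static void *sketch_alloc(sketch_dlist *dl, uint32_t opcode, size_t bytes)
{
    size_t payload_words = (bytes + 3) / 4;
    assert(dl != NULL);
    if (dl->used + 1 + payload_words > SKETCH_CAPACITY)
        return NULL;
    dl->storage.words[dl->used] = opcode;
    void *payload = &dl->storage.words[dl->used + 1];
    dl->used += 1 + payload_words;
    return payload;
}

/* Like sketch_alloc(), but first emits a 4-byte NOP whenever the payload
 * would otherwise start at an address that is not a multiple of 8. */
static void *sketch_alloc_aligned(sketch_dlist *dl, uint32_t opcode,
                                  size_t bytes)
{
    assert(dl != NULL);
    if (dl->used >= SKETCH_CAPACITY)
        return NULL;
    uintptr_t payload_addr = (uintptr_t)&dl->storage.words[dl->used + 1];
    if (payload_addr % 8 != 0)
        dl->storage.words[dl->used++] = SKETCH_OP_NOP;
    return sketch_alloc(dl, opcode, bytes);
}
```

Because the opcode word is 4 bytes, at most one NOP is ever needed to move the payload onto an 8-byte boundary.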
Emil Velikov
49a5bce780 docs: Add sha256 sums for the 10.4.3 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-24 12:54:33 +00:00
Emil Velikov
e92bfa3f95 Add release notes for the 10.4.3 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-24 12:49:17 +00:00
Emil Velikov
f70e4d4afd Update version to 10.4.3
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-24 12:44:46 +00:00
Axel Davy
42806f12a9 st/nine: Allocate vs constbuf buffer for indirect addressing once.
When the shader does indirect addressing on the constants,
we allocate a temporary constant buffer to which we copy
the constants from the app given user constants and
the constants filled in the shader.

This patch makes this buffer be allocated once.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f8a74410f1)
2015-01-23 00:47:26 +00:00
Axel Davy
4c9b64fc44 st/nine: Allocate the correct size for the user constant buffer
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e0f75044c8)
2015-01-23 00:47:26 +00:00
Axel Davy
69c7cf70e7 st/nine: Add variables containing the size of the constant buffers
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b9cbea9dbc)
2015-01-23 00:47:26 +00:00
Axel Davy
4d04fd0871 st/nine: Fix sm3 relative addressing for non-debug build
Relative addressing needs the constant buffer to get all
the correct constants, even those defined by the shader.

The code to copy the shader constants to the constant buffer
was enabled only for debug build. Enable it always.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit a721987077)
2015-01-23 00:47:25 +00:00
Axel Davy
0727ab961c st/nine: Remove unused code for ps
Since constant indirect addressing is not allowed for ps,
we can remove our code to handle that.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4b7a9cfddb)
2015-01-23 00:47:25 +00:00
Axel Davy
7280ddea9d st/nine: Correct rules for relative addressing and constants.
Relative addressing for constants is possible only for vs float
constants.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9690bf33d7)
2015-01-23 00:47:25 +00:00
Axel Davy
425bc89720 st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGB
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bce94ce831)
2015-01-23 00:47:25 +00:00
Axel Davy
0b3f8c72f7 st/nine: Implement TEXDP3TEX
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9e23b64c15)
2015-01-23 00:47:25 +00:00
Axel Davy
63e668eb18 st/nine: Implement TEXDP3
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 09eb1e901f)
2015-01-23 00:47:24 +00:00
Axel Davy
2b4c577730 st/nine: Implement TEXDEPTH
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f19e699368)
2015-01-23 00:47:24 +00:00
Axel Davy
e3a393b4c3 st/nine: Implement TEXM3x3SPEC
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3676ab02fb)
2015-01-23 00:47:24 +00:00
Axel Davy
7ecd0f9528 st/nine: Implement TEXM3x2TEX
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2b9f079ae3)
2015-01-23 00:47:24 +00:00
Axel Davy
336887bca1 st/nine: implement TEXM3x2DEPTH
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fdff111dc8)
2015-01-23 00:47:24 +00:00
Axel Davy
8e08ba6f96 st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC
The fix is that this line:
"src[s] = tx->regs.vT[s];" is wrong if s doesn't start from 0.
Instead access tx->regs.vT directly when needed.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7865210670)

Conflicts:
	src/gallium/state_trackers/nine/nine_shader.c
2015-01-23 00:47:09 +00:00
Axel Davy
77e1136f44 st/nine: Fill missing dst and src number for some instructions.
Not filling them correctly results in bad padding and later crash.

Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b1259544e3)
2015-01-23 00:44:42 +00:00
Axel Davy
22c75f9f5a st/nine: Implement TEXCOORD special behaviours
texcoord for ps < 1_4 should clamp the values between 0 and 1.

texcrd (texcoord ps 1_4) does not clamp and can be used with
two modifiers, _dw and _dz, which mean the channels are divided
by w or z.
Implement those in shared code, since the same modifiers can be used
for texld ps 1_4.

v2: replace DIV by RCP + MUL
v3: Remove a useless MOV

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5399119fb1)

Conflicts:
	src/gallium/state_trackers/nine/nine_shader.c
2015-01-23 00:43:57 +00:00
Axel Davy
4b65be8860 st/nine: Fix some fixed function pipeline operation
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6378d74937)
2015-01-22 23:43:28 +00:00
Axel Davy
9ea8e7f0df st/nine: Clamp ps 1.X constants
This is wine (and windows) behaviour.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 018407b5d8)
2015-01-22 23:43:28 +00:00
Axel Davy
d0d09a4eee st/nine: Fix CND implementation
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3ca67f8810)
2015-01-22 23:43:27 +00:00
Axel Davy
75f39e45f0 st/nine: Rewrite LOOP implementation, and a0 aL handling
Previous implementation didn't work well with nested loops.

Instead of using several address registers, put a0 and aL
into normal registers, and copy them to one address register when
we need to use them.

Wine tests loop_index_test() and nested_loop_test() now pass correctly.

Fixes r600g crash while loading Bioshock -
bug https://bugs.freedesktop.org/show_bug.cgi?id=85696

Tested-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6a8e5e48be)
2015-01-22 23:43:27 +00:00
Axel Davy
553089093f st/nine: Correct LOG on negative values
We should take the absolute value of the input.

Also return -FLT_MAX instead of -Inf for an input of 0.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c9aa9a0add)
2015-01-22 23:43:27 +00:00
Axel Davy
add30f01ef st/nine: Handle NRM with input of null norm
When the input's xyz are 0.0, the output
should be 0.0. This is due to the fact that
Inf * 0 = 0 for dx9. To handle this case,
cap the result of RSQ to FLT_MAX. We have
FLT_MAX * 0 = 0.

Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f5e8e3fb80)
2015-01-22 23:43:27 +00:00
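The capping trick from add30f01ef above can be illustrated with a small sketch. It assumes the reciprocal square root of the squared length has already been computed elsewhere; the helper names are hypothetical and this is not the st/nine TGSI emission code.

```c
#include <assert.h>
#include <float.h>
#include <math.h> /* only for the INFINITY macro a caller may pass in */

/* Cap a reciprocal square root result so that it is never +Inf.
 * Under IEEE rules Inf * 0 is NaN, whereas FLT_MAX * 0 is 0. */
static float sketch_cap_rsq(float rsq)
{
    return rsq > FLT_MAX ? FLT_MAX : rsq;
}

/* Scale one vector component by the capped reciprocal length. */
static float sketch_nrm_component(float component, float rsq_len)
{
    assert(rsq_len >= 0.0f);
    return component * sketch_cap_rsq(rsq_len);
}
```

For a null vector the caller would pass rsq_len = INFINITY, and each zero component then normalizes to 0 instead of NaN.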
Axel Davy
0dfb9c9e86 st/nine: Handle RSQ special cases
We should use the absolute value of the input as input to ureg_RSQ.

Moreover, an input of 0.0 should return FLT_MAX.

Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2487f73574)
2015-01-22 23:43:27 +00:00
Axel Davy
7e26cf83ba st/nine: Fix POW implementation
POW doesn't match directly TGSI, since we should
take the absolute value of src0.

Fixes black textures in some games

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c12f8c2088)
2015-01-22 23:43:27 +00:00
Axel Davy
00d22ce0fa st/nine: Fix typo for M4x4
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit e0dd9ca985)
2015-01-22 23:43:26 +00:00
Axel Davy
7f700cc35b st/nine: Correctly declare NineTranslateInstruction_Mkxn inputs
Let's say we have c1 and c2 declared in the shader and c0 given by the app

Then here we would have read c0, c1 and c2 given by the app, instead
of the correct c0, c1, c2.

This correction fixes several issues in some games.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 53dc992f20)
2015-01-22 23:43:26 +00:00
Axel Davy
e6167e749c st/nine: Saturate oFog and oPts vs outputs
According to docs and Wine, these two vs outputs have
to be saturated.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9fb58a74a0)
2015-01-22 23:43:26 +00:00
Axel Davy
bce0058333 st/nine: Remove some shader unused code
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a214838181)
2015-01-22 23:43:26 +00:00
Axel Davy
9a0647ba7f st/nine: Convert integer constants to floats before storing them when cards don't support integers
The shader code already behaves as if they are floats when the card doesn't support integers.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d08c7b0b88)
2015-01-22 23:43:26 +00:00
Axel Davy
669c5d6d44 st/nine: Rework of boolean constants
Convert them to shader booleans at an earlier stage.
The previous code is fine, but a later patch will convert
integers at an earlier stage, so do
the same for booleans.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d9d18fe39f)
2015-01-22 23:43:26 +00:00
Axel Davy
87ac37074f st/nine: Add ATI1 and ATI2 support
Adds ATI1 and ATI2 support to nine.

They map to PIPE_FORMAT_RGTC1_UNORM and PIPE_FORMAT_RGTC2_UNORM,
but need special handling.

Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 77f0ecf9ce)
2015-01-22 23:43:25 +00:00
Axel Davy
e1bcca4f13 st/nine: Check if srgb format is supported before trying to use it.
According to MSDN, we must act as if the user didn't ask for sRGB if we don't
support it.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b0b5430322)
2015-01-22 23:43:25 +00:00
Stanislaw Halik
50ea1c1f5f st/nine: Hack to generate resource if it doesn't exist when getting view
Buffers in the MANAGED pool are supposed to have their content in a RAM buffer,
and a copy in VRAM if there is enough memory (the driver manages memory and decides
when to delete the buffer in VRAM).

This is not implemented properly in nine, and a VRAM copy is going to be created
when the RAM memory is filled, and the VRAM copy will get synced with the RAM
memory updates.

Due to some issues (in the implementation or in app logic), it can happen
that we try to create a sampler view of the resource while we haven't created the
VRAM resource. This hack creates the resource when we hit this case, which prevents
crashing, but doesn't help with the resource content.

This fixes several games crashing at launch.

Acked-by: Axel Davy <axel.davy@ens.fr>
Acked-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Stanislaw Halik <sthalik@misaki.pl>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 82810d3b66)
2015-01-22 23:43:25 +00:00
Axel Davy
3ca8b93476 st/nine: NineBaseTexture9: update sampler view creation
While previous code was having the correct behaviour in general,
this new code is more readable (without checking all gallium formats
manually) and has a more defined behaviour for depth stencil resources.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 47280d777d)
2015-01-22 23:43:25 +00:00
Axel Davy
d06b403377 st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of bad format
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 0abfb80dac)
2015-01-22 23:43:12 +00:00
Axel Davy
481af42f28 st/nine: Fix crash when deleting non-implicit swapchain
The implicit swapchains are destroyed when the device instance is
destroyed. However for non-implicit swapchains, it is not the case,
and the application may have kept a reference to the swapchain
buffers to reuse them.

Fixes problems with battle.net launcher.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 0d2c22e648)
2015-01-22 23:41:09 +00:00
Axel Davy
393fffd07d st/nine: CubeTexture: fix GetLevelDesc
This->surfaces contains the surfaces associated to the levels
and faces. This->surfaces[6*Level] is what we want here,
since it gives us a face descriptor for the level 'Level'.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9232161178)
2015-01-22 23:41:08 +00:00
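The indexing described in 393fffd07d above can be sketched as below. The source only states that surfaces[6*Level] yields a face of the given level; the level-major layout with the six faces stored contiguously, and the function names, are assumptions made for illustration.

```c
#include <assert.h>

enum { SKETCH_CUBE_FACES = 6 };

/* Index into a flat surface array, assuming faces are stored contiguously
 * per mip level (assumption, see lead-in). */
static unsigned sketch_cube_surface_index(unsigned level, unsigned face)
{
    assert(face < SKETCH_CUBE_FACES);
    return SKETCH_CUBE_FACES * level + face;
}

/* All faces of a level share one descriptor, so the first face suffices. */
static unsigned sketch_level_desc_index(unsigned level)
{
    return sketch_cube_surface_index(level, 0);
}
```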
Axel Davy
c159b4095c st/nine: NineBaseTexture9: fix setting of last_layer
Use settings similar to u_sampler_view_default_template

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 18c7e70226)
2015-01-22 23:41:08 +00:00
Axel Davy
b80b5b35a3 st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS
The cap means D3DFVF_XYZRHW vertices will see clipping.
This is not the case when
PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION is supported, since
it'll disable clipping.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 05e20e1045)
2015-01-22 23:41:08 +00:00
Xavier Bouchoux
41ca03a7b4 st/nine: Fix D3DRS_POINTSPRITE support
It's done by testing the existence of the point sprite output register *after* parsing the vertex shader.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dc88989189)
2015-01-22 23:41:08 +00:00
Axel Davy
18ac34825b st/nine: Add new texture format strings
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d2f2a550cf)
2015-01-22 23:41:07 +00:00
Xavier Bouchoux
15ef84ccfb st/nine: Add missing c++ declaration for IDirect3DVolumeTexture9
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 072e2ba8e1)
2015-01-22 23:41:07 +00:00
Xavier Bouchoux
44ee59d300 st/nine: Additional defines to d3dtypes.h
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8bb550b958)
2015-01-22 23:41:07 +00:00
Jose Fonseca
1e0ab5b826 nine: Drop use of TGSI_OPCODE_CND.
This was the only state tracker emitting it, and hardware was just having
to lower it anyway (or failing to lower it at all).

v2: Extracted from a larger patch by Jose (which also dropped DP2A), fixed
    to actually not reference TGSI_OPCODE_CND.  Change by anholt.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
(cherry picked from commit 925cb75f89)
2015-01-22 23:40:09 +00:00
Jonathan Gray
a3381286d8 glsl: Link glsl_test with pthreads library.
Otherwise pthread_mutex_lock will be an undefined reference
on OpenBSD.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88219
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c5be9c126d)
2015-01-22 22:27:12 +00:00
Kenneth Graunke
882f702441 i965: Work around mysterious Gen4 GPU hangs with minimal state changes.
Gen4 hardware appears to GPU hang frequently when using Chromium, and
also when running 'glmark2 -b ideas'.  Most of the error states contain
3DPRIMITIVE commands in quick succession, with very few state packets
between them - usually VERTEX_BUFFERS/ELEMENTS and CONSTANT_BUFFER.

I trimmed an apitrace of the glmark2 hang down to two draw calls with a
glUniformMatrix4fv call between the two.  Either draw by itself works
fine, but together, they hang the GPU.  Removing the glUniform call
makes the hangs disappear.  In the hardware state, this translates to
removing the CONSTANT_BUFFER packet between the two 3DPRIMITIVE packets.

Flushing before emitting CONSTANT_BUFFER packets also appears to make
the hangs disappear.  I observed a slowdown in glxgears by doing it all
the time, so I've chosen to only do it when BRW_NEW_BATCH and
BRW_NEW_PSP are unset (i.e. we haven't done a CS_URB_STATE change or
already flushed the whole pipeline).

I'd much rather understand the problem, but at this point, I don't see
how we'd ever be able to track it down further.  We have no real tools,
and the hardware people moved on years ago.  I've analyzed 20+ error
states and read every scrap of documentation I could find.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80568
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85367
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c4fd0c9052)
2015-01-22 16:11:03 +00:00
Jason Ekstrand
a25e26f67f mesa: Fix clamping to -1.0 in snorm_to_float
This patch fixes the return of a wrong value when x is lower than
-MAX_INT(src_bits), as the result would not be in the range [-1.0, 1.0].

v2 by Samuel Iglesias <siglesias@igalia.com>:
    - Modify snorm_to_float() to avoid doing the division when
      x == -MAX_INT(src_bits)

Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 7d1b08ac44)
2015-01-17 14:59:56 +00:00
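A hedged sketch of the clamping described in a25e26f67f above. The helper names are hypothetical, and src_bits is assumed to be between 2 and 32; this illustrates the technique rather than reproducing Mesa's source.

```c
#include <assert.h>
#include <stdint.h>

/* Largest positive value of a signed integer that is `bits` wide. */
static int32_t sketch_max_int(unsigned bits)
{
    assert(bits >= 2 && bits <= 32);
    return (int32_t)((1u << (bits - 1)) - 1u);
}

/* Convert a signed normalized integer to a float in [-1.0, 1.0].
 * The most negative code (for example -128 with 8 bits) would map to
 * -128/127 without the clamp, so anything at or below -MAX_INT returns
 * -1.0 directly and skips the division. */
static float sketch_snorm_to_float(int32_t x, unsigned src_bits)
{
    int32_t max = sketch_max_int(src_bits);
    if (x <= -max)
        return -1.0f;
    return (float)x * (1.0f / (float)max);
}
```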
Kenneth Graunke
021d71b848 i965: Respect the no_8 flag on Gen6, not just Gen7+.
When doing repclears, we only want to use the SIMD16 program, not the
SIMD8 one.  Kristian added this to the Gen7+ code, but apparently we
missed it in the Gen6 code.  This patch copies that code over.

Approximately doubles the performance in a clear microbenchmark from
mesa-demos (clearspd -width 500 -height 500 +color) on Sandybridge.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
References: https://code.google.com/p/chrome-os-partner/issues/detail?id=34681
(cherry picked from commit f95733ddb7)

Conflicts:
	src/mesa/drivers/dri/i965/gen6_wm_state.c
2015-01-17 14:59:08 +00:00
Emil Velikov
14f1659b43 docs: Add sha256 sums for the 10.4.2 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-12 10:37:09 +00:00
Emil Velikov
02f2e97c3e Add release notes for the 10.4.2 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-12 10:30:28 +00:00
Emil Velikov
5906dd6c99 Update version to 10.4.2
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-12 10:24:59 +00:00
Dave Airlie
2d05942b74 r600g/sb: implement r600 gpr index workaround. (v3.1)
r600, rv610 and rv630 all have a bug in their GPR indexing
and how the hw inserts access to PV.

If the base index for the src is the same as the dst gpr
in a previous group, then it will use PV instead of using
the indexed gpr correctly.

The workaround is to insert a NOP when you detect this.

v2: add second part of fix detecting DST rel writes followed
by same src base index reads.

v3: forget adding stuff to structs, just iterate over the
previous node group again, makes it more obvious.
v3.1: drop local_nop.

Fixes ~200 piglit regressions on rv635 since SB was introduced.

Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 3c8ef3a74b)
2015-01-07 17:39:52 +00:00
Dave Airlie
099ed78a04 r600g: fix regression since UCMP change
Since d8da6decea where the
state tracker started using UCMP on cayman a number of tests
regressed.

This seems to be because r600g is doing CNDGE_INT for UCMP, which tests
for >= 0; we should be doing CNDE_INT with reversed arguments.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0d4272cd8e)
2015-01-07 17:35:39 +00:00
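The mismatch described in 099ed78a04 can be modelled in plain C. The functions below are a sketch of the opcode semantics as commonly documented for TGSI and the R600 ISA (an assumption here, not code from r600g), and the function names are hypothetical.

```c
#include <stdint.h>

/* TGSI UCMP: select src1 when src0 is non-zero, otherwise src2. */
static int32_t
tgsi_ucmp(int32_t src0, int32_t src1, int32_t src2)
{
   return src0 != 0 ? src1 : src2;
}

/* CNDGE_INT: select the first value when src0 >= 0. */
static int32_t
cndge_int(int32_t src0, int32_t a, int32_t b)
{
   return src0 >= 0 ? a : b;
}

/* CNDE_INT: select the first value when src0 == 0. */
static int32_t
cnde_int(int32_t src0, int32_t a, int32_t b)
{
   return src0 == 0 ? a : b;
}

/* UCMP expressed through CNDE_INT with the two value operands
 * swapped, as the commit message suggests. */
static int32_t
ucmp_via_cnde(int32_t src0, int32_t src1, int32_t src2)
{
   return cnde_int(src0, src2, src1);
}
```

CNDGE_INT treats 0 and 1 identically, while UCMP must distinguish them, so no operand order makes a sign test equivalent to a non-zero test.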
Vadim Girlin
91c5770ba1 r600g/sb: fix issues with loops created for switch
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit de0fd375f6)
2015-01-07 17:31:12 +00:00
Dave Airlie
3306ed6fd7 Revert "r600g/sb: fix issues cause by GLSL switching to loops for switch"
This reverts commit 7b0067d23a.

Vadim's patch fixes this a lot better.

(cherry picked from commit 34e512d9ea)
2015-01-07 17:29:01 +00:00
Marek Olšák
81f8006f7d radeonsi: fix VertexID for OpenGL
This fixes all failing piglit VertexID tests.

Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit d7c6f397f4)
2015-01-07 17:25:06 +00:00
Marek Olšák
1b498cf5b7 st/mesa: fix GL_PRIMITIVE_RESTART_FIXED_INDEX
Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit eaae92a349)
2015-01-07 17:04:21 +00:00
Marek Olšák
8c77be7ef9 vbo: ignore primitive restart if FixedIndex is enabled in DrawArrays
From GL 4.4 Core profile:

  If both PRIMITIVE_RESTART and PRIMITIVE_RESTART_FIXED_INDEX are
  enabled, the index value determined by PRIMITIVE_RESTART_FIXED_INDEX is
  used. If PRIMITIVE_RESTART_FIXED_INDEX is enabled, primitive restart is not
  performed for array elements transferred by any drawing command not taking a
  type parameter, including all of the *Draw* commands other than *DrawEle-
  ments*.

Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 8f5d309521)
2015-01-07 16:51:02 +00:00
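The spec text quoted in 8c77be7ef9 can be read as the following decision, given here as a hedged sketch. The function name and parameters are hypothetical; the fixed restart index of 2^N - 1 for an N-bit index type comes from the OpenGL specification rather than from the commit message, and index_size_bytes == 0 is used to stand for a draw command without a type parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true and stores the restart index in *out when primitive
 * restart applies to the draw, false when it must be ignored.
 * index_size_bytes is 1, 2 or 4 for indexed draws and 0 for draws
 * that take no type parameter (e.g. DrawArrays). */
static bool
effective_restart_index(bool restart_enabled, bool fixed_index_enabled,
                        unsigned index_size_bytes,
                        uint32_t user_restart_index, uint32_t *out)
{
   if (fixed_index_enabled) {
      /* FIXED_INDEX wins over PRIMITIVE_RESTART, and it never
       * applies to non-indexed draws. */
      if (index_size_bytes == 0)
         return false;

      /* 2^N - 1 for an N-bit index type. */
      *out = UINT32_C(0xffffffff) >> (32 - 8 * index_size_bytes);
      return true;
   }

   if (restart_enabled) {
      *out = user_restart_index;
      return true;
   }

   return false;
}
```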
Leonid Shatz
ef43d21bbc gallium/util: make sure cache line size is not zero
The "normal" detection (querying clflush size) already made sure it is
non-zero, however another method did not. This led to crashes if this
value happened to be zero (apparently can happen in virtualized environments
at least).
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=87913

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5fea39ace3)
2015-01-06 16:21:03 +00:00
Roland Scheidegger
ac3ca98a1b gallium/util: fix crash with daz detection on x86
The code used PIPE_ALIGN_VAR for the variable used by fxsave, however this
does not work if the stack isn't aligned. Hence use PIPE_ALIGN_STACK function
decoration to fix the segfault which can happen if stack alignment is only
4 bytes.
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=87658.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b59c7ed0ab)
2015-01-06 16:02:10 +00:00
Ilia Mirkin
af1a690075 nv50/ir: fix texture offsets in release builds
asserts get compiled out in release builds, so they can't be relied
upon to perform logic.

Reported-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Roy Spliet <rspliet@eclipso.eu>
Cc: "10.2 10.3 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fb1afd1ea5)
2015-01-06 15:52:12 +00:00
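The pitfall fixed by af1a690075 is generic C behaviour: when NDEBUG is defined, assert(expr) expands to nothing, so any work done inside expr is skipped. A minimal sketch follows; read_offset() and get_offset() are hypothetical helpers, not nv50/ir code.

```c
#include <assert.h>
#include <stdbool.h>

static int calls; /* counts how often read_offset() really runs */

/* Hypothetical helper with a side effect: it validates a raw texture
 * offset and stores it in *out. */
static bool
read_offset(int raw, int *out)
{
   calls++;
   if (raw < -8 || raw > 7)
      return false;
   *out = raw;
   return true;
}

/* Fragile pattern:  assert(read_offset(raw, &off));
 * In a release build the whole call disappears and `off` is never
 * written.  The safe pattern performs the call unconditionally and
 * only asserts on its result. */
static int
get_offset(int raw)
{
   int off = 0;
   bool ok = read_offset(raw, &off);

   assert(ok);
   (void)ok; /* silence the unused-variable warning under NDEBUG */
   return off;
}
```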
Chad Versace
fffe533f08 i965: Use safer pointer arithmetic in gather_oa_results()
This patch reduces the likelihood of pointer arithmetic overflow bugs in
gather_oa_results(), like the one fixed by b69c7c5dac.

I haven't yet encountered any overflow bugs in the wild along this
patch's codepath. But I get nervous when I see code patterns like this:

   (void*) + (int) * (int)

I smell 32-bit overflow all over this code.

This patch retypes 'snapshot_size' to 'ptrdiff_t', which should fix any
potential overflow.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 414be86c96)
2015-01-04 21:39:10 +00:00
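The retyping described in fffe533f08 can be sketched as follows. The helper names are hypothetical and this is not the gather_oa_results() code; it only shows why making one operand ptrdiff_t moves the multiplication out of 32-bit int arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of snapshot number `index`.  Because snapshot_size is
 * a ptrdiff_t, `index` is converted to ptrdiff_t before the multiply,
 * so on 64-bit targets the product is computed in 64 bits instead of
 * wrapping in int. */
static ptrdiff_t
snapshot_offset(int index, ptrdiff_t snapshot_size)
{
   return index * snapshot_size;
}

/* Address of a snapshot inside a mapped buffer. */
static uint8_t *
snapshot_ptr(void *base, int index, ptrdiff_t snapshot_size)
{
   return (uint8_t *)base + snapshot_offset(index, snapshot_size);
}
```

The same effect can be had with casts at the point of use, which is the approach 4d5e0f78b7 describes for intel_texsubimage_tiled_memcpy().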
Chad Versace
4d5e0f78b7 i965: Use safer pointer arithmetic in intel_texsubimage_tiled_memcpy()
This patch reduces the likelihood of pointer arithmetic overflow bugs in
intel_texsubimage_tiled_memcpy(), like the one fixed by b69c7c5dac.

I haven't yet encountered any overflow bugs in the wild along this
patch's codepath. But I recently solved, in commit b69c7c5dac, an overflow
bug in a line of code that looks very similar to pointer arithmetic in
this function.

This patch conceptually applies the same fix as in b69c7c5dac. Instead
of retyping the variables, though, this patch adds some casts. (I tried
to retype the variables as ptrdiff_t, but it quickly got very messy. The
casts are cleaner).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 225a09790d)
2015-01-04 21:39:00 +00:00
Marek Olšák
b9e56ea151 glsl_to_tgsi: fix a bug in copy propagation
This fixes the new piglit test: arb_uniform_buffer_object/2-buffers-bug

Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 48094d0e65)
2015-01-04 21:38:26 +00:00
Kenneth Graunke
e05c595acd i965: Fix start/base_vertex_location for >1 prims but !BRW_NEW_VERTICES.
This is a partial revert of c89306983c.
It split the {start,base}_vertex_location handling into several steps:

1. Set brw->draw.start_vertex_location = prim[i].start
   and brw->draw.base_vertex_location = prim[i].basevertex.
   (This happened once per _mesa_prim, in the main drawing loop.)
2. Add brw->vb.start_vertex_bias and brw->ib.start_vertex_offset
   appropriately.  (This happened in brw_prepare_shader_draw_parameters,
   which was called just after brw_prepare_vertices, as part of state
   upload, and only happened when BRW_NEW_VERTICES was flagged.)
3. Use those values when emitting 3DPRIMITIVE (once per _mesa_prim).

If we drew multiple _mesa_prims, but didn't flag BRW_NEW_VERTICES on
the second (or later) primitives, we would do step #1, but not #2.
The first _mesa_prim would get correct values, but subsequent ones
would only get the first half of the summation.

The reason I originally did this was because I needed the value of
gl_BaseVertexARB to exist in a buffer object prior to uploading
3DSTATE_VERTEX_BUFFERS.  I believed I wanted to upload the value
of 3DPRIMITIVE's "Base Vertex Location" field, which was computed
as: (prims[i].indexed ? prims[i].start : prims[i].basevertex) +
brw->vb.start_vertex_bias.  The latter value wasn't available until
after brw_prepare_vertices, and the former weren't available in the
state upload code at all.  Hence the awkward split.

However, I believe that including brw->vb.start_vertex_bias was a
mistake.  It's an extra bias we apply when uploading vertex data into
VBOs, to move [min_index, max_index] to [0, max_index - min_index].

From the GL_ARB_shader_draw_parameters specification:
"<gl_BaseVertexARB> holds the integer value passed to the <baseVertex>
 parameter to the command that resulted in the current shader
 invocation.  In the case where the command has no <baseVertex>
 parameter, the value of <gl_BaseVertexARB> is zero."

I conclude that gl_BaseVertexARB should only include the baseVertex
parameter from glDraw*Elements*, not any internal biases we add for
optimization purposes.

With that in mind, gl_BaseVertexARB only needs prim[i].start or
prim[i].basevertex.  We can simply store that, and go back to computing
start_vertex_location and base_vertex_location in brw_emit_prim(), like
we used to.  This is much simpler, and should actually fix two bugs.

Fixes missing geometry in Unvanquished.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85529
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit c633528cba)
2015-01-04 21:38:16 +00:00
Ilia Mirkin
c48d0d8dd2 nv50,nvc0: set vertex id base to index_bias
Fixes the piglits which check that gl_VertexID includes the base vertex
offset:
  arb_draw_indirect-vertexid elements
  gl-3.2-basevertex-vertexid

Note that this leaves out the original G80, for which this will continue
to fail. It could be fixed by passing a driver constbuf value in, but
that's beyond the scope of this change.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit be0311c962)
2015-01-04 21:37:51 +00:00
Tiziano Bacocco
aafd13027a nv50,nvc0: implement half_pixel_center
LAST_LINE_PIXEL has actually been renamed to PIXEL_CENTER_INTEGER in
rnndb; use that method to implement the rasterizer setting, used for
st/nine.

Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 609c3e51f5)
2015-01-04 21:37:32 +00:00
Michel Dänzer
1f42230fa7 radeonsi: Don't modify PA_SC_RASTER_CONFIG register value if rb_mask == 0
E.g. this could happen on older kernels which don't support the
RADEON_INFO_SI_BACKEND_ENABLED_MASK query yet. The code in
si_write_harvested_raster_configs() doesn't deal with this correctly and
would probably mangle the value badly.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit b3057f8097)
2015-01-04 21:34:08 +00:00
Kenneth Graunke
2b85ed72db i965: Add missing BRW_NEW_*_PROG_DATA to texture/renderbuffer atoms.
This was probably missed when moving from a fixed binding table layout
to a dynamic one that changes based on the shader.

Fixes newly proposed Piglit test fbo-mrt-new-bind.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87619
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Mike Stroyan <mike@LunarG.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4616b2ef85)
2015-01-04 21:33:26 +00:00
Emil Velikov
4cd38a592e docs: Add sha256 sums for the 10.4.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-30 02:38:02 +00:00
Emil Velikov
60e2e04fe8 Add release notes for the 10.4.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-30 02:11:34 +00:00
Emil Velikov
1a3df8cc77 Update version to 10.4.1
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-30 02:07:33 +00:00
Emil Velikov
45416a255f Revert "glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)"
This reverts commit ee241a6889.

May not be the correct fix. Discussion is ongoing.

http://lists.freedesktop.org/archives/mesa-dev/2014-December/072969.html
2014-12-30 01:03:14 +00:00
Cody Northrop
fb3f7c0bc5 i965: Require pixel alignment for GPU copy blit
The blitter will start at a pixel's natural alignment. For PBOs, if the
provided offset is not aligned, bits will get dropped.

This change adds offset alignment check for src and dst, kicking back if
the requirements are not met.

The change is based on following verbiage from BSPEC:
 Color pixel sizes supported are 8, 16, and 32 bits per pixel (bpp).
 All pixels are naturally aligned.

Found in the following locations:
page 35 of intel-gfx-prm-osrc-hsw-blitter.pdf
page 29 of ivb_ihd_os_vol1_part4.pdf
page 29 of snb_ihd_os_vol1_part5.pdf

This behavior was observed with Steam Big Picture rendering incorrect
icon colors.  The fix has been tested on Ubuntu and SteamOS on Haswell.

Signed-off-by: Cody Northrop <cody@lunarg.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83908
Reviewed-by: Neil Roberts <neil@linux.intel.com>
(cherry picked from commit 83e8bb5b1a)
Nominated-by: Matt Turner <mattst88@gmail.com>
2014-12-21 21:19:31 +00:00
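The alignment check described in fb3f7c0bc5 amounts to requiring that both offsets be multiples of the pixel size. A minimal sketch, assuming a hypothetical helper name and a bytes-per-pixel parameter `cpp`; the driver's actual check may be structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a blit between the two buffer offsets can use
 * the blitter.  cpp is bytes per pixel; 1, 2 and 4 correspond to the
 * 8, 16 and 32 bpp sizes listed in the BSPEC excerpt. */
static bool
blit_offsets_aligned(uint32_t src_offset, uint32_t dst_offset,
                     uint32_t cpp)
{
   if (cpp != 1 && cpp != 2 && cpp != 4)
      return false;

   /* Pixels are naturally aligned, so an offset that is not a
    * multiple of the pixel size would lose its low bits. */
   return (src_offset % cpp) == 0 && (dst_offset % cpp) == 0;
}
```

A caller that gets false back would fall back to another copy path instead of issuing the blit.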
Ian Romanick
4f570f2fb3 linker: Assign varying locations geometry shader inputs for SSO
Previously only geometry shader outputs would be assigned locations if
the geometry shader was the only stage in the linked program.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: pavol@klacansky.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82585
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit a909b995d9)
Nominated-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-21 21:18:09 +00:00
Ian Romanick
a4c8348597 linker: Wrap access of producer_var with a NULL check
producer_var could be NULL if consumer_var is not NULL and
consumer_is_fs is false.  This will occur when the producer is NULL and
the consumer is the geometry shader for a program that contains only a
geometry shader.  This will occur starting with the next patch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: pavol@klacansky.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82585
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 5eca78a00a)
Nominated-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-21 21:17:45 +00:00
Maxence Le Doré
893583776e glsl: Add gl_MaxViewports to available builtin constants
It seems to have been forgotten when viewport arrays were implemented.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 19e05d6898)
2014-12-21 21:17:24 +00:00
Andres Gomez
2d669f6583 i965/brw_reg: struct constructor now needs explicit negate and abs values.
We were assuming, when constructing a new brw_reg struct, that the
negate and abs register modifiers would not be present by default in
the new register.

Now, we force explicitly setting these values when constructing a new
register.

This will avoid problems like forgetting to properly set them when we
are using a previous register to generate this new register, as it was
happening in the dFdx and dFdy generation functions.

Fixes piglit test shaders/glsl-deriv-varyings

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82991
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 8517e665bc)
2014-12-21 21:17:16 +00:00
Mario Kleiner
bccfe7ae0f glx/dri3: Don't fail on glXSwapBuffersMscOML(dpy, window, 0, 0, 0) (v2)
glXSwapBuffersMscOML() with target_msc=divisor=remainder=0 gets
translated into target_msc=divisor=0 but remainder=1 by the mesa
api. This is done for server DRI2 where there needs to be a way
to tell the server-side DRI2ScheduleSwap implementation if a call
to glXSwapBuffers() or glXSwapBuffersMscOML(dpy,window,0,0,0) was
done. remainder = 1 was (ab)used as a flag to tell the server to
select proper semantic. The DRI3/Present backend ignored this
signalling, treated any target_msc=0 as glXSwapBuffers() request,
and called xcb_present_pixmap with invalid divisor=0, remainder=1
combo. The present extension responded kindly to this with a
BadValue error and dropped the request, but mesa's DRI3/Present
backend doesn't check for error codes. From there on stuff went
downhill quickly for the calling OpenGL client...

This patch fixes the problem.

v2: Change comments to be more clear, with reference to
relevant spec, as suggested by Eric Anholt.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 0d7f4c8658)
2014-12-14 15:45:27 +00:00
Mario Kleiner
ee241a6889 glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)
Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>

v2: Add Frank Binns' Signed-off-by for his original earlier
patch from April 2014, which is identical to this one, and
Chris Wilson's Reviewed-by tag from May 2014 for that patch, ergo
also for this one.

v3: Incorporate comment about triple buffering as suggested
by Axel Davy, and reference to relevant spec provided by
Eric Anholt.

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 455d3036fa)
2014-12-14 15:45:21 +00:00
Mario Kleiner
4b37a18da5 glx/dri3: Track separate (ust, msc) for PresentPixmap vs. PresentNotifyMsc (v2)
Prevent calls to glXGetSyncValuesOML() and glXWaitForMscOML()
from overwriting the (ust,msc) values of the last successful
swapbuffers call (PresentPixmapCompleteNotify event), as
glXWaitForSbcOML() relies on those values corresponding to
the most recent completed swap, not to whatever was last
returned from the server.

Problematic call sequence without this patch would have been, e.g.,

glXSwapBuffers()
... wait ...
swap completes -> PresentPixmapComplete event -> (ust,msc)
updated to reflect swap completion time and count.
... wait for at least 1 video refresh cycle/vblank increment.

glXGetSyncValuesOML()
-> PresentNotifyMsc event overwrites (ust,msc) of swap
completion with (ust,msc) of most recent vblank

glXWaitForSbcOML()
-> Returns sbc of last completed swap but (ust,msc) of last
completed vblank, not of last completed swap.
-> Client is confused.

Do this by tracking a separate set of (ust, msc) for the
dri3_wait_for_msc() call than for the dri3_wait_for_sbc()
call.

This makes the glXWaitForSbcOML() call robust again and restores
consistent behaviour with the DRI2 implementation.

Fixes applications originally written and tested against
DRI2 which also rely on this not regressing under DRI3/Present,
e.g., Neuro-Science software like Psychtoolbox-3.

This patch fixes the problem.

v2: Rename vblank_msc/ust to notify_msc/ust as suggested by
Axel Davy for better clarity.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit ad8b0e8bf6)
2014-12-14 15:45:15 +00:00
Mario Kleiner
93f6f55983 glx/dri3: Fix glXWaitForSbcOML() to handle targetSBC==0 correctly. (v2)
targetSBC == 0 is a special case, which asks the function
to block until all pending OpenGL bufferswap requests have
completed.

Currently the function just falls through for targetSBC == 0,
returning bogus results.

This breaks applications originally written and tested against
DRI2 which also rely on this not regressing under DRI3/Present,
e.g., Neuro-Science software like Psychtoolbox-3.

This patch fixes the problem.

v2: Simplify as suggested by Axel Davy. Add comments proposed
by Eric Anholt.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 8cab54de16)
2014-12-14 15:45:10 +00:00
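The special case described in 93f6f55983 can be sketched like this. The function names and the send_sbc/recv_sbc counters are assumptions made for the example (a count of swaps submitted and a count of swaps completed); this is not the dri3 code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Map the caller's target onto a concrete swap count.  A target of 0
 * means "wait for everything submitted so far", so it is replaced by
 * the count of the most recently submitted swap. */
static int64_t
effective_target_sbc(int64_t target_sbc, int64_t send_sbc)
{
   return target_sbc == 0 ? send_sbc : target_sbc;
}

/* True once the wait can return: enough swaps have completed. */
static bool
sbc_wait_done(int64_t target_sbc, int64_t send_sbc, int64_t recv_sbc)
{
   return recv_sbc >= effective_target_sbc(target_sbc, send_sbc);
}
```

Without the substitution, a target of 0 compares as already reached, which matches the "falls through" behaviour the commit message describes.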
Emil Velikov
af0c82099b docs: Add 10.4 sha256 sums, news item and link release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-14 13:57:54 +00:00
Emil Velikov
5fe79b0b12 docs: Update 10.4.0 release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-14 13:45:54 +00:00
Emil Velikov
45f3aa0bc7 Bump version to 10.4.0 (final)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-14 13:32:44 +00:00
Alexander von Gluck IV
90239276ff mesa/drivers: Add missing mesautil lib to Haiku swrast
* Resolves missing util_format_linear_to_srgb_8unorm_table symbol.

(cherry picked from commit ad2ffd3bc6)
2014-12-11 13:54:54 +00:00
Roland Scheidegger
57868b1ee4 llvmpipe: fix lp_test_arit denorm handling
llvmpipe disables denorms on purpose (on x86/sse only), because denorms are
generally neither required nor desired for graphic apis (and in case of d3d10,
they are forbidden).
However, this caused some arithmetic tests using denorms to fail on some
systems, because the reference did not generate the same results anymore.
(It did not fail on all systems - behavior of these math functions is sort
of undefined when called with non-standard floating point mode, hence the
result differing depending on implementation and in particular the sse
capabilities.)
So, for the reference, simply flush all (input/output) denorms manually
to zero in this case.

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=67672.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 8148a06b8f)
Nominated-by: Matt Turner <mattst88@gmail.com>
2014-12-11 13:54:54 +00:00
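Flushing denormals in the reference path, as 57868b1ee4 describes, can be done with the standard classification macro. A minimal sketch with a hypothetical helper name; the test in Mesa may flush differently.

```c
#include <math.h>

/* Return x with denormal (subnormal) values replaced by a zero of
 * the same sign; all other values pass through unchanged. */
static float
flush_denorm_to_zero(float x)
{
   if (fpclassify(x) == FP_SUBNORMAL)
      return x < 0.0f ? -0.0f : 0.0f;
   return x;
}
```

Applying this to both the inputs and the expected outputs of a reference function makes it comparable with code that runs with denormals disabled.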
Marek Olšák
fe2eac2237 docs/relnotes: document the removal of GALLIUM_MSAA
Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ac319d94d3)
2014-12-11 13:54:54 +00:00
Matt Turner
db784a09f1 i965: Disable unlit-centroid workaround on Gen < 6.
Back to the original commit (8313f444) adding the workaround, we were
enabling it on gens <= 7, even though gens <= 5 can't do multisampling.

I cannot find documentation that says that Sandybridge needs this
workaround but in practice disabling it causes these piglit tests to
fail:

EXT_framebuffer_multisample/interpolation {2,4} centroid-deriv{,-disabled}

On Ironlake:

total instructions in shared programs: 4358478 -> 4349671 (-0.20%)
instructions in affected programs:     117680 -> 108873 (-7.48%)

A bunch of shaders in TF2, Portal 2, and L4D2 are cut by 25~30%.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit 1a2de7dce8)
2014-12-11 13:54:53 +00:00
Dave Airlie
d9f4aaa095 r600g: only init GS_VERT_ITEMSIZE on r600
On evergreen there are 4 regs, on r600/700 there is only one.

Don't initialise regs and trash someone else's state.

Not sure this fixes anything, but hey one less stupid.

Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 7f21cf7198)
2014-12-11 13:54:53 +00:00
Timothy Arceri
e340a28dba mesa: use build flag to ensure stack is realigned on x86
Nowadays GCC assumes the stack pointer is 16-byte aligned even on 32-bit,
but that is an assumption OpenGL drivers (or any dynamic library for that
matter) can't afford to make, as there are many closed- and open-source
application binaries out there that only assume 4-byte stack alignment.

V4: fix comment and indentation

V3: move all sse4.1 build flag config to the same location
 and add comment as to why we need to do the realign

V2: use $target_cpu rather than $host_cpu
  and setup build flags in config rather than makefile

https://bugs.freedesktop.org/show_bug.cgi?id=86788
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Matt Turner <mattst88@gmail.com>
CC: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f1b5f2b157)
2014-12-11 13:54:53 +00:00
Tom Stellard
6b908efd58 radeonsi: Program RASTER_CONFIG for harvested GPUs v5
Harvested GPUs have some of their render backends disabled, so
in order to prevent the hardware from trying to render things
with these disabled backends we need to correctly program
the PA_SC_RASTER_CONFIG register.

v2:
  - Write RASTER_CONFIG for all SEs.

v3:
  - Set GRBM_GFX_INDEX.INSTANCE_BROADCAST_WRITES bit.
  - Set GRBM_GFX_INDEX.SH_BROADCAST_WRITES bit when done setting
    PA_SC_RASTER_CONFIG.
  - Get num_se and num_sh_per_se from kernel.

v4:
  - Get correct value for num_se
  - Remove loop for setting PA_SC_RASTER_CONFIG
  - Only compute raster config when a backend has been disabled.

v5: Michel Dänzer
  - Fix computation for chips with multiple SEs

https://bugs.freedesktop.org/show_bug.cgi?id=60879

CC: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 67dcbcd92c)
2014-12-11 13:54:53 +00:00
Abdiel Janulgue
65f03e6733 ir_to_mesa: Remove sat to clamp lowering pass
Fixes an infinite loop in swrast where the lowering pass unpacks saturate into
clamp but the opt_algebraic pass tries to do the opposite.

v3 (Ian):
This is a revert of commit cfa8c1cb "ir_to_mesa: lower ir_unop_saturate" on
the ir_to_mesa.cpp portion. prog_execute.c can handle saturates in vertex
shaders, so classic swrast shouldn't need this lowering pass.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83463
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit 39f7b72428)
2014-12-11 13:54:53 +00:00
Chris Forbes
ffaf58e7d0 i965/Gen6-7: Fix point sprites with PolygonMode(GL_POINT)
This was an oversight in the original patch. When PolygonMode is
used, then front faces, back faces, or both may be rendered as
points and are affected by point sprite state.

Note that SNB/IVB can't actually be fully conformant here, for
a legacy context -- we don't have separate sets of pointsprite
enables for front and back faces. Haswell ignores pointsprite
state correctly in hardware for non-point rasterization, so can
do this correctly, but it doesn't seem worth it.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86764
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit ed56c16820)
2014-12-11 13:54:53 +00:00
Ben Widawsky
bb9dea8a29 i965/gs: Avoid DW * DW mul
The GS has an interesting use for mul. Because the GS can emit multiple
vertices per input vertex, and it also has a unique count at the top of the URB
payload, the GS unit needs to be able to dynamically specify URB write offsets
(relative to the global offset). The documentation in the function has a very
good explanation from Paul on the mechanics.

This fixes around 2000 piglit tests on BSW.

v2:
Reworded commit message (Ben) no mention of CHV (Matt)
Change SHRT_MAX to USHRT_MAX (Ken, and Matt)
Update comment in code to reflect the use of UW (Ben)
Add Gen7+ assertion for the relevant GS code, since it won't work on Gen6- (Ken)
Drop the bogus hunk in emit_control_data_bits() (Ken)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84777 (with many dupes)
Cc: "10.4 10.3 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit f13870db09)
2014-12-11 13:54:53 +00:00
José Fonseca
be59440b53 util/primconvert: Avoid point arithmetic; apply offset on all cases.
Matches what u_vbuf_get_minmax_index() does.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit f9098f0972)
2014-12-11 13:54:52 +00:00
Ilia Mirkin
ac8d596498 util/primconvert: take ib offset into account
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit c3bed13604)
2014-12-11 13:54:52 +00:00
Ilia Mirkin
112d2fdb17 util/primconvert: support instanced rendering
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit fb434e675f)
2014-12-11 13:54:52 +00:00
Ilia Mirkin
c6353cee0c util/primconvert: pass index bias through
The index_bias (aka base_vertex) applies to the downstream draw just as
much, since the actual index values are never modified.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 1dfa039168)
2014-12-11 13:54:52 +00:00
Emil Velikov
09e4f1a50f Increment version to 10.4.0-rc4
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-05 18:52:11 +00:00
Axel Davy
c7b9a2e38a st/nine: Fix vertex declarations for non-standard (usage/index)
Nine code to match vertex declaration to vs inputs was limiting
the number of possible combinations.

Some sm3 games have issues with that, because arbitrary (usage/index)
can be used.

This patch does the following changes to fix the problem:
. Change the numbers given to (usage/index) combinations to uint16
. Do not put limits on the indices when it doesn't make sense
. change the conversion rule (usage/index) -> number to fit all combinations
. Instead of having a table usage_map mapping a (usage/index) number to
an input index, usage_map maps input indices to their (usage/index)

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: Yaroslav Andrusyak <pontostroy@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 712a4c5438)
2014-12-03 23:20:56 +00:00
Axel Davy
6fcbf9aee3 st/nine: sm1_declusage_to_tgsi, do not restrict indices with TGSI_SEMANTIC_GENERIC
With sm3, you can declare an input/output with a usage and a usage index.

Nine code hardcodes the translation of usage/index to a corresponding TGSI code.
The translation was limited to a few usage/index combinations that covered
the needs of most games, but some games did not work.

This patch rewrites that Nine code to map all possible usage/index combinations
to TGSI code. The index associated with TGSI_SEMANTIC_GENERIC doesn't need to be
low for good performance, as the old code assumed, and is not particularly bounded
(it's UINT16). Given the index is BYTE, we can map all combinations.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: Yaroslav Andrusyak <pontostroy@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 5d6d260833)
2014-12-03 23:20:01 +00:00
Axel Davy
fd2852fe5b st/nine: Queries: Fix D3DISSUE_END behaviour.
Issuing D3DISSUE_END should:
. reset previous queries if possible
. end the query

Previous behaviour wasn't calling end_query for
queries not needing D3DISSUE_BEGIN, nor resetting
previous queries.

This fixes several applications not launching properly.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit eac0b9b68a)

Conflicts:
	src/gallium/state_trackers/nine/query9.c
2014-12-03 23:18:48 +00:00
Brian Paul
57057c439e mesa: fix height error check for 1D array textures
height=0 is legal for 1D array textures (as depth=0 is legal for
2D arrays).  Fixes new piglit ext_texture_array-errors test.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 4e6244e80f)
2014-12-03 23:16:36 +00:00
Dave Airlie
b5cc04b6ad r600g/sb: fix issues cause by GLSL switching to loops for switch
Since 73dd50acf6
glsl: implement switch flow control using a loop

The SB backend was falling over in an assert or crashing.

Tracked this down to the loops having no repeats but requiring
a working break. The initial code just called the loop handler for
all non-if statements, but this caused a regression in
tests/shaders/dead-code-break-interaction.shader_test.
So I had to add further code to detect if all the departure
nodes are empty and avoid generating an empty loop for that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86089
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 7b0067d23a)
2014-12-03 23:15:27 +00:00
Brian Paul
d2e9fd5b6d mesa: fix arithmetic error in _mesa_compute_compressed_pixelstore()
We need parentheses around the expression which computes the number of
blocks per row.
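
A minimal sketch of why the parentheses matter. The function names and the block width `bw` are hypothetical and not taken from _mesa_compute_compressed_pixelstore(); the second function only illustrates how operator precedence changes the result when the grouping is dropped.

```c
#include <assert.h>

/* Number of compressed blocks needed to cover `width` texels when each
 * block is `bw` texels wide (rounded up). Hypothetical helper. */
int blocks_per_row(int width, int bw)
{
    assert(bw > 0);
    return (width + bw - 1) / bw;
}

/* The same formula without the grouping parentheses. Division binds
 * tighter than addition, so this evaluates width + bw - (1 / bw). */
int blocks_per_row_unparenthesized(int width, int bw)
{
    assert(bw > 0);
    return width + bw - 1 / bw;
}
```

For a 10-texel row and 4-texel blocks, the first form yields 3 blocks, while the second yields 14.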

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 991d5cf8ce)
2014-12-03 23:15:12 +00:00
Ilia Mirkin
b61192f2ae freedreno/ir3: fix UMAD
Looks like none of the mad variants do u16 * u16 + u32, so just add in
the extra value "by hand".

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit de83ef677f)
2014-12-03 23:15:05 +00:00
Ilia Mirkin
75c4824d2f freedreno/a3xx: only enable blend clamp for non-float formats
This fixes arb_color_buffer_float-render GL_RGBA16F.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 3de9fa8ff4)
2014-12-03 23:14:48 +00:00
Christoph Bumiller
f30fbbdbdd nv50/ir/tgsi: handle TGSI_OPCODE_ARR
This instruction is used by st/nine.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f3b4b263c2)
2014-12-03 23:14:34 +00:00
Emil Velikov
b247956c77 cherry-ignore: drop whitespace commit
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-03 23:13:53 +00:00
Axel Davy
72a802a9c2 st/nine: Fix setting of the shift modifier in nine_shader
It is an sint_4, but it was stored in a uint_8...
The code using it was acting as if it was signed.
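
As an illustration of the signedness issue, here is a sketch of reading a 4-bit two's-complement field into a signed 8-bit value. The helper name is hypothetical and this is not the nine_shader code itself.

```c
#include <stdint.h>

/* Sign-extend the low 4 bits of `raw` into a signed 8-bit value.
 * Bit 3 is the sign bit of a 4-bit two's-complement field, so the
 * representable range is -8..7. Hypothetical helper. */
int8_t sign_extend_4(uint8_t raw)
{
    uint8_t field = raw & 0x0F;
    /* Flip the sign bit, then subtract its weight. */
    return (int8_t)((field ^ 0x08) - 0x08);
}
```

Stored in an unsigned byte, the field 0xF would read as 15; sign-extended it is -1.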

Problem found thanks to Coverity

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit d52328fc39)
2014-12-03 22:59:28 +00:00
David Heidelberg
cfbc474d80 st/nine: remove unused pipe_viewport_state::translate[3] and scale[3]
2efabd9f5a removed them as unused.

This caused random memory overwrites (reported by Coverity).

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: David Heidelberg <david@ixit.cz>
(cherry picked from commit 90fea6b3e0)
2014-12-03 22:59:21 +00:00
Axel Davy
360872a45e st/nine: fix wrong variable reset
Error detected by Coverity (COPY_PASTE_ERROR)

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: David Heidelberg <david@ixit.cz>
(cherry picked from commit 614d9387c7)
2014-12-03 22:59:12 +00:00
David Heidelberg
42839ea5ba st/nine: return GetAvailableTextureMem in bytes as expected (v2)
PIPE_CAP_VIDEO_MEMORY returns the amount of video memory in megabytes,
so it needs to be converted to bytes.

Fixed Warframe memory detection.

v2: also prepare for cards with more than 4GB memory
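
A sketch of the megabyte-to-byte conversion, assuming the result has to fit a 32-bit unsigned return value. The function name and the clamping policy are assumptions for illustration, not the actual st/nine code.

```c
#include <stdint.h>

/* Convert a size in megabytes to bytes without overflowing. The shift
 * is done in 64 bits; the result is then clamped to what a 32-bit
 * return value can hold. Hypothetical helper. */
uint32_t available_mem_bytes(uint32_t megabytes)
{
    uint64_t bytes = (uint64_t)megabytes << 20;   /* 1 MB = 2^20 bytes */
    return bytes > UINT32_MAX ? UINT32_MAX : (uint32_t)bytes;
}
```

Doing the shift in 32 bits would wrap to 0 for a 4096 MB card; widening first keeps the intermediate value exact.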

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: Yaroslav Andrusyak <pontostroy@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: David Heidelberg <david@ixit.cz>
(cherry picked from commit a99f31bced)
2014-12-03 22:59:07 +00:00
Axel Davy
8dc03bd575 st/nine: Add pool check to SetTexture (v2)
D3DPOOL_SCRATCH is disallowed according to spec.
D3DPOOL_SYSTEMMEM should be allowed but we don't handle it right for now.

v2: Fixes segfault in SetTexture when unsetting the texture

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 4eea2496bc)
2014-12-03 22:58:54 +00:00
Axel Davy
41906e9764 st/nine: propertly declare constants (v2)
Fixes "Error : CONST[20]: Undeclared source register" when running
dx9_alpha_blending_material. Also artifacts on ilo.

v2: also remove unused MISC_CONST

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 890f963d64)
2014-12-03 22:58:49 +00:00
Stanislaw Halik
56572002fc st/nine: call DBG() at more external entry points
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Stanislaw Halik <sthalik@misaki.pl>
(cherry picked from commit 7f74b9d479)
2014-12-03 22:58:44 +00:00
Axel Davy
c0e0de45dc st/nine: rework the way D3DPOOL_SYSTEMMEM is handled
This patch moves the data field from Resource9 to Surface9 and cleans
D3DPOOL_SYSTEMMEM handling in Texture9. This fixes HL2 lost coast.

It also removes in Texture9 some code written to support importing
and exporting non D3DPOOL_SYSTEMMEM shared buffers. This code lacked
the design required to support the feature and wasn't used.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 6aeae7442d)
2014-12-03 22:58:39 +00:00
Axel Davy
b75a285633 st/nine: Rework Basetexture9 and Resource9.
Instead of having parts of the structures initialised by the parents,
have them initialised by the children.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 133b2087c5)
2014-12-03 22:58:35 +00:00
Axel Davy
1cf4dbdc81 st/nine: clean device9ex.
Pass ex specific parameters as arguments to device9 ctor instead
of passing them by filling the structure.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 104b5a8193)
2014-12-03 22:58:29 +00:00
Emil Velikov
c29ddc923f Increment version to 10.4.0-rc3
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-28 17:58:26 +00:00
Ilia Mirkin
085de45812 freedreno/ir3: don't pass consts to madsh.m16 in MOD logic
madsh.m16 can't handle a const in src1, make sure to unconst it

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 37fe347542)
2014-11-28 17:29:29 +00:00
Dave Airlie
31c7e6c51d r600g: merge the TXQ and BUFFER constant buffers (v1.1)
We are using 1 more buffer than we have, although in the future the
driver should just end up using one buffer in total probably, this
is a good first step, it merges the txq cube array and buffer info
constants on r600 and evergreen.

This should in theory fix geom shader tests on r600.

v1.1: fix comments from Glenn.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 07ae69753c)

Squashed with commit

r600g: fix fallout from last patch

I accidentally rebased from the wrong machine and missed some
fixes that were on my r600 box.

doh.

this fixes a bunch of geom shader textureSize tests on rv635
from gpu reset to pass.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86760
Reported-by: wolput@onsneteindhoven.nl
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b10ddf962f)

Squashed with commit

r600g: make llvm code compile this time

Actually compiling the code helps make it compile.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 91a827624c)
2014-11-28 17:29:07 +00:00
José Fonseca
2a0290d5f5 st/wgl: Don't export wglGetExtensionsStringARB.
It's not exported by the official opengl32.dll either.  Applications are
supposed to get it via wglGetProcAddress(), not GetProcAddress().

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit cb009bdd44)
2014-11-28 17:28:28 +00:00
José Fonseca
f77a97f057 mapi/glapi: Fix dll linkage of GLES1 symbols.
This fixes several MSVC warnings like:

  warning C4273: 'glClearColorx' : inconsistent dll linkage

In fact, we should avoid using `declspec(dllexport)` altogether, and use
exclusively the .DEF instead, which gives more precise control of which
symbols must be exported, but all the public GL/GLES headers practically
force us to pick between `declspec(dllexport)` or
`declspec(dllimport)`.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 5fdb6d6839)
2014-11-28 17:28:20 +00:00
José Fonseca
d45c35c3d7 util/u_snprintf: Don't redefine HAVE_STDINT_H as 0.
We now always guarantee availability of stdint.h on MSVC -- if MSVC
doesn't supply one we use our own.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 4b6e93650c)
2014-11-28 17:28:13 +00:00
Emil Velikov
16eaf01a6a nine: the .pc file should not follow mesa version
The version provided by it should be the same as the one
provided/handled by the module. Add the missing tiny version.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: David Heidelberg <david@ixit.cz>
(cherry picked from commit 9b7037a369)
2014-11-26 21:23:59 +00:00
Chris Forbes
6316d415c4 i965/Gen6-7: Do not replace texcoords with point coord if not drawing points
Fixes broken rendering in Windows-based QtQuick2 apps run through Wine.
This library sets all texture units' GL_COORD_REPLACE, leaves point
sprite mode enabled, and then draws a triangle fan.

Will need a slightly different fix for Gen4-5, but I don't have my old
machines in a usable state currently.

V2: - Simplify patch -- the real changes are no longer duplicated across
      the Gen6 and Gen7 atoms.
    - Also don't clobber attr overrides -- which matters on Haswell too,
      and fixes the other half of the problem
    - Fix newly-introduced warnings
V3: - Use BRW_NEW_GEOMETRY_PROGRAM and brw->geometry_program rather than
      core flag and state; keep the state flags in order.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84651
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 0008d0e59e)
2014-11-26 21:23:23 +00:00
Kenneth Graunke
dca88397ca glsl: Make lower_constant_arrays_to_uniforms require dereferences.
Ilia noticed that my lowering pass was converting the constant array
used by textureGatherOffsets' offsets parameter to a uniform.  This
broke textureGather for Nouveau, and is generally a horrible plan,
since it violates the GLSL constraint that offsets must be an
immediate constant.

When I wrote this pass, I neglected to consider whole array assignment.
I figured opt_array_splitting would handle constant indexing, so this
pass was really about fixing variable indexing.

textureGatherOffsets is an example of whole array access that we really
don't want to touch.  Whole array copies don't appear to benefit from
this either - they're most likely initializers for temporary arrays
which are going to be mutated anyway.  Since you're copying, you may
as well copy from immediates, not uniforms.

This patch makes the pass look for ir_dereference_arrays of
ir_constants, rather than looking for any ir_constant directly.
This way, it ignores whole array assignment.

No shader-db changes or Piglit regressions on Haswell.  Some Piglit
tests generate different code (fixing textureGatherOffsets on Nouveau).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 60f011af1a)
2014-11-26 21:23:14 +00:00
Chris Forbes
6c383aaadd mesa: Fix Get(GL_TRANSPOSE_CURRENT_MATRIX_ARB) to transpose
This was just returning the same value as GL_CURRENT_MATRIX_ARB.
Spotted while investigating something else in apitrace.
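
For reference, the expected relationship between the two queries can be sketched as a plain 4x4 transpose over 16 consecutive floats. The function name is hypothetical and this is not the Mesa get.c code.

```c
/* Transpose a 4x4 matrix stored as 16 consecutive floats:
 * element (row, col) of `in` becomes element (col, row) of `out`.
 * `out` and `in` must not alias. Hypothetical helper. */
void transpose_4x4(float out[16], const float in[16])
{
    for (int row = 0; row < 4; row++)
        for (int col = 0; col < 4; col++)
            out[col * 4 + row] = in[row * 4 + col];
}
```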

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 2b4fe85f0e)
2014-11-26 21:23:08 +00:00
Chris Forbes
7e47ae3185 glsl: Generate unique names for each const array lowered to uniforms
Uniform names (even for hidden uniforms) are required to be unique; some
parts of the compiler assume they can be looked up by name.

Fixes the piglit test: tests/spec/glsl-1.20/linker/array-initializers-1

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 129178893b)
2014-11-26 21:20:33 +00:00
Chris Forbes
9e94c05936 i965: Handle nested uniform array indexing
When converting a uniform array reference to a pull constant load, the
`reladdr` expression itself may have its own `reladdr`, arbitrarily
deeply. This arises from expressions like:

   a[b[x]]     where a, b are uniform arrays (or lowered const arrays),
               and x is not a constant.

Just iterate the lowering to pull constants until we stop seeing these
nested. For most shaders, there will be only one pass through this loop.

Fixes the piglit test:
tests/spec/glsl-1.20/linker/double-indirect-1.shader_test

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit adefccd12a)
2014-11-26 21:20:11 +00:00
Dave Airlie
4952c49f21 r600g: do all CUBE ALU operations before gradient texture operations (v2.1)
This moves all the CUBE section above the gradients section,
so that the gradient emission happens on one block which
is what sb/hardware expect.

v2: avoid changes to bytecode by using spare temps
v2.1: shame gcc, oh the shame. (uninit var warnings)

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c88385603a)
2014-11-26 21:20:05 +00:00
Dave Airlie
013eba0ec1 r600: fix texture gradients instruction emission (v2)
The piglit tests were failing, and it appeared to be SB
optimising out things, but Glenn pointed out the gradients
are meant to be clause local, so we should emit the texture
instructions in the same clause. This moves things around
to always copy to a temp and then emit the texture clauses
for H/V.

v2: Glenn pointed out we could get another ALU fetch in
the wrong place, so load the src gpr earlier as well.

Fixes at least:
./bin/tex-miplevel-selection textureGrad 2D

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 38ec184419)
2014-11-26 21:19:56 +00:00
Ilia Mirkin
db9a6b96ab nv50,nvc0: buffer resources can be bound as other things down the line
res->bind is not an indicator of how the resource is currently bound.
buffers can be rebound across different binding points without changing
underlying storage.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fecae4625c)
2014-11-24 00:55:28 +00:00
Ilia Mirkin
4d9c0445dd nv50,nvc0: actually check constbufs for invalidation
The number of vertex buffers has nothing to do with the number of bound
constbufs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e80a0a7d9a)
2014-11-24 00:55:22 +00:00
Ilia Mirkin
1a8f90dc70 nv50/ir: set neg modifiers on min/max args
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=86618
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7d07083cfd)
2014-11-24 00:55:17 +00:00
Emil Velikov
7fe9292069 Increment version to 10.4.0-rc2
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-22 03:58:31 +00:00
Chad Versace
c260cb700b i965: Fix segfault in WebGL Conformance on Ivybridge
Fixes regression of WebGL Conformance test texture-size-limit [1] on
Ivybridge Mobile GT2 0x0166 with Google Chrome R38.

Regression introduced by

    commit 6c04423153
    Author: Kenneth Graunke <kenneth@whitecape.org>
    Date:   Sun Feb 2 02:58:42 2014 -0800

        i965: Bump GL_MAX_CUBE_MAP_TEXTURE_SIZE to 8192.

The test regressed because the pointer offset arithmetic in
intel_miptree_map_gtt() overflows for large textures. The pointer
arithmetic is not 64-bit safe.
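
The class of overflow described here can be sketched as follows. The function names, parameters and values are hypothetical and chosen only to make the wraparound visible; the actual expression in intel_miptree_map_gtt() is not reproduced.

```c
#include <stdint.h>

/* Byte offset of texel (x, y), computed entirely in 32 bits. For large
 * surfaces the product y * stride can exceed 2^32 and wrap around. */
uint32_t texel_offset_32(uint32_t x, uint32_t y, uint32_t stride, uint32_t cpp)
{
    return y * stride + x * cpp;
}

/* Same computation, widened to 64 bits before multiplying. */
uint64_t texel_offset_64(uint32_t x, uint32_t y, uint32_t stride, uint32_t cpp)
{
    return (uint64_t)y * stride + (uint64_t)x * cpp;
}
```

With y = 70000 and stride = 65536, the 64-bit result is 4587520000, while the 32-bit version wraps to 292552704.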

[1] 52f0dc240f/sdk/tests/conformance/textures/texture-size-limit.html

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=78770
Fixes: Intel CHRMOS-1377
Reported-by: Lu Hua <huax.lu@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit b69c7c5dac)
2014-11-19 18:58:22 +00:00
Dave Airlie
aab3758916 r600g: limit texture offset application to specific types (v2)
For 1D and 2D arrays we don't want the other coordinates being
offset and affecting where we sample. I wrote this patch 6 months
ago but lost it.

Fixes:
./bin/tex-miplevel-selection textureLodOffset 1DArray
./bin/tex-miplevel-selection textureLodOffset 2DArray
./bin/tex-miplevel-selection textureOffset 1DArray
./bin/tex-miplevel-selection textureOffset 1DArrayShadow
./bin/tex-miplevel-selection textureOffset 2DArray
./bin/tex-miplevel-selection textureOffset(bias) 1DArray
./bin/tex-miplevel-selection textureOffset(bias) 2DArray

v2: rewrite to handle more cases and be consistent with code
above.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1830138cc0)
2014-11-19 00:53:30 +00:00
Dave Airlie
be24d54195 r600g: geom shaders: always load texture src regs from inputs
Otherwise we seem to lose the split_gs_inputs and try and
pull from an uninitialised register.

fixes 9 texelFetch geom shader tests.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d4c342f67e)
2014-11-19 00:53:23 +00:00
Marek Olšák
8751abf752 radeonsi: support per-sample gl_FragCoord
Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit da2dea3843)
2014-11-19 00:53:14 +00:00
Ilia Mirkin
da7475f35f st/mesa: add a fallback for clear_with_quad when no vs_layer
Not all drivers can set gl_Layer from VS. Add a fallback that passes the
instance id from VS to GS, and then uses the GS to set the layer.

Tested by adding

  quad_buffers |= clear_buffers;
  clear_buffers = 0;

to the st_Clear logic, and forcing set_vertex_shader_layered in all
cases. No piglit regressions (on piglits with 'clear' in the name).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 68db29c434)
2014-11-19 00:53:08 +00:00
Dave Airlie
7b62f0eb50 r600g/cayman: handle empty vertex shaders
Some of the geom shader tests produce an empty vertex shader,
on cayman we'd crash in the finaliser because last_cf was NULL.

cayman doesn't need the NOP workaround, so if the code arrives
here with no last_cf, just emit an END.

fixes crashes in a bunch of piglit geom shader tests.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4e520101e6)
2014-11-19 00:53:00 +00:00
Dave Airlie
fa62619da5 r600g/cayman: fix texture gather tests
It appears on cayman the TG4 outputs were reordered.

This fixes a lot of piglit tests.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 27e1e0e710)
2014-11-19 00:52:52 +00:00
Dave Airlie
2e3d2035cf r600g/cayman: fix integer multiplication output overwrite (v2)
This fixes tests/spec/glsl-1.10/execution/fs-op-assign-mult-ivec2-ivec2-overwrite.shader_test.

hopeful fix for fd.o bug 85376

Reported-by: ghallberg
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4a128d5a16)
2014-11-19 00:52:44 +00:00
Brian Paul
edb2186671 st/mesa: copy sampler_array_size field when copying instructions
The sampler_array_size field was added by "mesa/st: add support for
dynamic sampler offsets".  But the field wasn't getting copied in
the get_pixel_transfer_visitor() or get_bitmap_visitor() functions.

The count_resources() function then didn't properly compute the
glsl_to_tgsi_visitor::samplers_used bitmask.  Then, we didn't declare
all the sampler registers in st_translate_program().  Finally, we
asserted when we tried to emit a tgsi ureg src register with File =
TGSI_FILE_UNDEFINED.

Add the missing assignments and some new assertions to catch the
invalid register sooner.

Cc: "10.3, 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 11abd7b2bc)
2014-11-19 00:52:36 +00:00
Michel Dänzer
5a2ff2002b radeonsi: Disable asynchronous DMA except for PIPE_BUFFER
Using the asynchronous DMA engine for multi-dimensional operations seems
to cause random GPU lockups for various people. While the root cause for
this might need to be fixed in the kernel, let's disable it for now.

Before re-enabling this, please make sure you can hit all newly enabled
paths in your testing, preferably with both piglit and real world apps,
and get in touch with people on the bug reports below for stability
testing.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85647
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83500
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
(cherry picked from commit ae4536b4f7)
2014-11-19 00:51:57 +00:00
Vinson Lee
0a3c146723 scons: Require glproto >= 1.4.13 for X11.
GLXBadProfileARB and X_GLXCreateContextAtrribsARB require glproto >=
1.4.13. These symbols were added in commit
d5d41112cb "st/xlib: Generate errors as
specified."

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 876c53375e)
2014-11-19 00:51:50 +00:00
Emil Velikov
6452e24ebc configure.ac: roll up a program for the sse4.1 check
So when checking/building sse code we have three possibilities:
 1 Old compiler, throws an error when using -msse*
 2 New compiler, user disables sse* (-mno-sse*)
 3 New compiler, user doesn't disable sse

The original code added code for #1 but not #2. Later on we patched
around the lack of handling #2 by wrapping the code in __SSE4_1__.
Yet it led to a missing/undefined symbol in case of #1 or #2, which
might cause an issue for #2 when using the i965 driver.

A bit later we "fixed" the undefined symbol by using #1, rather than
updating it to handle #2. With this commit we set things straight :)

To top it all off, conventions state that in case of conflicting
(-enable-foo -disable-foo) options, the latter one takes precedence.
Thus we need to make sure to prepend -msse4.1 to CFLAGS in our test.
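
A sketch of the `__SSE4_1__` guard pattern referred to above, written so that the symbol stays defined whether or not SSE4.1 is enabled at compile time. The function name and the scalar fallback are assumptions for illustration; this is neither the configure.ac test program nor the i965 code.

```c
#include <stdint.h>
#ifdef __SSE4_1__
#include <smmintrin.h>   /* SSE4.1 intrinsics, e.g. _mm_max_epu32 */
#endif

/* Unsigned 32-bit maximum. The SSE4.1 path is compiled only when the
 * compiler defines __SSE4_1__ (for example with -msse4.1); otherwise a
 * scalar fallback keeps the symbol defined. Hypothetical helper. */
uint32_t max_u32(uint32_t a, uint32_t b)
{
#ifdef __SSE4_1__
    __m128i va = _mm_set1_epi32((int)a);
    __m128i vb = _mm_set1_epi32((int)b);
    return (uint32_t)_mm_cvtsi128_si32(_mm_max_epu32(va, vb));
#else
    return a > b ? a : b;
#endif
}
```

Guarding only the body, rather than the whole function, is one way to avoid the missing-symbol problem when the flag is absent or disabled.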

v2: Clean the #includes. Suggested by Ilia, Matt & Siavash.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberg <david@ixit.cz>
Tested-by: Siavash Eliasi <siavashserver@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 1a6ae84041)
2014-11-19 00:51:44 +00:00
Ilia Mirkin
4186c1c7b1 nv50,nvc0: use clip_halfz setting when creating rasterizer state
This enables the ARB_clip_control extension.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3bc42a09e2)
2014-11-19 00:51:38 +00:00
Emil Velikov
d133096d26 Increment version to 10.4.0-rc1
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-18 02:30:18 +00:00
Axel Davy
01c9bf999e nine: Implement threadpool
DRI_PRIME setups have different issues due to the lack of dma-buf fence
support in the drivers. For DRI3 DRI_PRIME, a race can appear, making
tearing visible, or worse, showing older content than expected. Until
dma-buf fences are well supported (and by all drivers), an alternative
is to send the buffers to the server only when rendering has finished.
Since waiting in the main thread for rendering to finish has a
performance impact, this patch uses an additional thread to offload the
wait and the sending of the buffers to the server.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 7f565845a1)
2014-11-18 02:30:18 +00:00
Axel Davy
df63e76c2c nine: Add drirc options (v2)
Implements vblank_mode and throttling, which allows us to change the default ratio
between framerate and input lag.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 948e6c5228)
2014-11-18 02:30:18 +00:00
Joakim Sindholt
b46e80ae60 nine: Add state tracker nine for Direct3D9 (v3)
Work of Joakim Sindholt (zhasha) and Christoph Bumiller (chrisbmr).
DRI3 port done by Axel Davy (mannerov).

v2: - nine_debug.c: klass extended from 32 chars to 96 (for sure) by glennk
    - Nine improvements by Axel Davy (which also fixed some wine tests)
    - by Emil Velikov:
     - convert to static/shared drivers
     - Sort and cleanup the includes
     - Use AM_CPPFLAGS for the defines
     - Add the linker garbage collector
     - Restrict the exported symbols (think llvm)

v3: - small nine fixes
    - build system improvements by Emil Velikov

v4: [Emil Velikov]
   - Do not link against libudev. No longer needed.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: David Heidelberg <david@ixit.cz>
(cherry picked from commit fdd96578ef)
[Emil Velikov: use correct ureg_property* functions]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-18 02:29:26 +00:00
Christoph Bumiller
ff97fbd9e9 gallium/auxiliary: add contained and rect checks (v6)
v3: thanks to Brian, improved coding style, also glennk helped spot few
things (unsigned -> int, two constify)
v4: thanks Ilia improved function, dropped u_box_clip_3d
v5: incorporated rest of Gregor proposed changes,clean ups
v6: u_box_clip_2d simplify proposed by Ilia Mirkin

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
(cherry picked from commit 7d2573b537)
2014-11-18 02:23:12 +00:00
Christoph Bumiller
504d73f342 gallium/auxiliary: add inc and dec alternative with return (v4)
At this moment we use only zero or positive values.

v2: Implement it also for Solaris, MSVC assembly
    and enable for other combinations.

v3: Replace MSVC assembly by assert + warning during compilation

v4: remove inc and dec with return for MSVC assembly
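
The idea of an increment/decrement that also returns the new value can be sketched with C11 atomics. This is an assumption-laden illustration: the function names are hypothetical and gallium/auxiliary uses its own per-platform implementations rather than `<stdatomic.h>`.

```c
#include <stdatomic.h>

/* Atomically increment *v and return the new value. atomic_fetch_add
 * returns the value held before the addition, hence the + 1. */
int counter_inc_return(atomic_int *v)
{
    return atomic_fetch_add(v, 1) + 1;
}

/* Atomically decrement *v and return the new value. */
int counter_dec_return(atomic_int *v)
{
    return atomic_fetch_sub(v, 1) - 1;
}
```

Returning the new value lets a caller act on the result (for example, free an object when a reference count reaches zero) without a separate, racy read.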

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: David Heidelberg <david@ixit.cz>
(cherry picked from commit cb49132166)
2014-11-18 02:23:11 +00:00
Christoph Bumiller
7bbf0836c8 gallium/auxiliary: implement sw_probe_wrapped (v2)
Implement pipe_loader_sw_probe_wrapped, which allows using the wrapped
software renderer backend when using the pipe loader.

v2: - remove unneeded ifdef
    - use GALLIUM_PIPE_LOADER_WINSYS_LIBS
    - check for CALLOC_STRUCT
    thanks to Emil Velikov

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
(cherry picked from commit e23d63cffd)
2014-11-18 02:23:10 +00:00
Christoph Bumiller
8d6963f005 winsys/sw/wrapper: implement is_displaytarget_format_supported for swrast
Acked-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
(cherry picked from commit 8314315dff)
2014-11-18 02:23:08 +00:00
Christoph Bumiller
50e6b471c5 tgsi/ureg: add ureg_UARL shortcut (v2)
v2: moved in in same order as in p_shader_tokens (thanks Brian)

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
(cherry picked from commit 259ec77db9)
2014-11-18 02:23:05 +00:00
Kristian Høgsberg
a4ffc2a445 i965: Move fs_visitor ra pass to new fs_visitor::allocate_registers()
This will be reused for the scalar VS pass.

v2 (Ken): Rebase on master.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-14 19:38:08 -08:00
Kristian Høgsberg
c50f2dadc5 i965: Move fs_visitor optimization pass into new method fs_visitor::optimize()
We'll reuse this toplevel optimization driver for the scalar VS.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-14 19:38:06 -08:00
Kristian Høgsberg
5c4efc644e i965: Move more code into codegen-branch of the fs_visitor::run() if statement
These last few operations all only apply when we've actually generated
code, optimized and allocated registers.  The dummy and the repclear
shaders don't need the gen4 send workaround, and don't spill.  This
means we can move these lines into the else-branch, which will make
the following refactoring easier.

v2 (Ken): Rebase on master, which removed the uncompressed stack.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-14 19:38:05 -08:00
Kristian Høgsberg
f2bb655ac7 i965: Refactor fs_generator API
We split out SIMD8 and SIMD16 generation into separate calls to
new method generate_code(), which returns the start offset for the
generated code.  A new get_assembly() method returns the generated code.

This avoids asserting MESA_SHADER_FRAGMENT and accessing wm_prog_data
in the generator.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-14 19:38:03 -08:00
José Fonseca
13849f327c st/wgl: Implement WGL_EXT_create_context_es/es2_profile.
Derived from st/glx's GLX_EXT_create_context_es/es2_profile implementation.

Tested with an OpenGL ES 2.0 ApiTrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-11-14 23:29:59 +00:00
José Fonseca
d5d41112cb st/xlib: Generate errors as specified.
Tested with piglit glx tests.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-11-14 23:29:59 +00:00
Rob Clark
82103206fe freedreno/ir3: move some helpers
Split out a few helpers from fd3_program so we don't have to duplicate
for fd4_program.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-11-14 13:59:54 -05:00
Rob Clark
e091c08089 freedreno: rename draw->draw_vbo
Gets rid of a namespace conflict w/ a4xx which wants an fd4_draw()
version of fd_draw().

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-11-14 13:59:31 -05:00
Rob Clark
2f024d2b10 freedreno/a3xx: missing u_upload_destroy
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-11-14 12:07:57 -05:00
Rob Clark
28b2269ee0 freedreno: fix borked check for a320.0
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-11-14 12:07:39 -05:00
Rob Clark
8b898c1174 freedreno/ir3: half vs full reg in standalone compiler output
Handle hrN.c in printing outputs/inputs.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-11-14 12:02:43 -05:00
José Fonseca
7037793f6b st/dri: Support EGL_CONTEXT_OPENGL_DEBUG_BIT_KHR/GLX_CONTEXT_DEBUG_BIT_ARB on ES contexts.
The latest version of the specs explicitly allow it, and given that Mesa
universally supports KHR_debug we should definitely support it.

Totally untested.  (Just happened to notice this while implementing
GLX_EXT_create_context_es2_profile for st/xlib.)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-11-14 16:10:22 +00:00
Marek Olšák
363b53f000 egl: remove egl_gallium from the loader
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Informally acked-by: Jose Fonseca
2014-11-14 16:16:12 +01:00
Marek Olšák
c46c551c56 configure.ac: remove enable flags for EGL and GBM Gallium state trackers
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Informally acked-by: Jose Fonseca
2014-11-14 16:16:12 +01:00
Kenneth Graunke
bd20fad316 i965/vec4: Combine all the math emitters.
17 insertions(+), 102 deletions(-).  Works just as well.

v2: Make emit_math take const references (suggested by Matt),
    drop redundant WRITEMASK_XYZW setting (Matt and Curro).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-11-13 20:55:41 -08:00
Kenneth Graunke
dba683cf16 i965/vec4: Use const references in emit() functions.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-11-13 20:55:41 -08:00
Kenneth Graunke
0efc53a96c i965: Use macros to create prototypes for emitter helpers.
We do this almost everywhere else; this should make it easier to modify.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-11-13 20:55:41 -08:00
Ben Widawsky
f14a35f9dc i965: Always enable VF statistics
Every other unit in the geometry pipeline automatically enables
statistics gathering. This part of the pipe has been controlled by the
DEBUG_STATS variable, but this is asymmetric. This dates back to the
original implementation, and I am not sure if there is a reason for it.

I need access to these stats to implement ARB_pipeline_statistics_query.

Eric wrote it, and Ken touched it last. Do you have any opposition?

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86145
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2014-11-13 10:48:24 -08:00
Ville Syrjälä
0d924738d9 i915: Emit 3DSTATE_SCISSOR_RECTANGLE_0 before 3DSTATE_SCISSOR_ENABLE
According to gen2 BSpec the pipeline must be flushed at least up to the
windower before changing the scissor rect enable field. Emitting the
3DSTATE_SCISSOR_RECTANGLE_0 before 3DSTATE_SCISSOR_ENABLE is sufficient
to do that.

gen3 BSpec no longer has that piece of text, but let's make the same
change there too for symmetry. The spec does still say that the scissor
rectangle must be defined before enabling it, so the new order does seem
more in line with the spec.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2014-11-13 19:13:27 +02:00
Ville Syrjälä
81c31e560f i915: Don't call _mesa_meta_glsl_Clear() on gen2
Gen2 doesn't have fragment shaders so we shouldn't be calling
_mesa_meta_glsl_Clear() on gen2. Restore the appropriate
ARB_fragment_shader check to the clear path which was lost in:

 commit 94f22fbe78
 Author: Tapani Pälli <tapani.palli@intel.com>
 Date:   Wed Aug 8 20:46:45 2012 +0300

    intel: use _mesa_meta_Clear with OpenGL ES 1.1 v2

v2: Fix spelling in commit message

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2014-11-13 19:13:27 +02:00
Ville Syrjälä
4747b2638c i915: Protect macro argument for TEXTURE_SET()
TEXTURE_SET() is the only register macro that forgets to wrap the
argument evaluation in parens. Only simple integers are passed to this
macro so there's no bug, but still it seems prudent to add the
parens.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2014-11-13 19:13:27 +02:00
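
Illustration for 4747b2638c above: a minimal sketch of why an unparenthesized macro argument is fragile. The macros REG_UNSAFE and REG_SAFE and the 0x100 base are hypothetical and are not the i915 TEXTURE_SET() definition; they only show the general C pitfall the commit guards against.

```c
/* Hypothetical register macros, for illustration only. */
#define REG_UNSAFE(n) (0x100 + n * 4)     /* argument not parenthesized */
#define REG_SAFE(n)   (0x100 + (n) * 4)   /* argument parenthesized */

/* Expands to 0x100 + base + unit * 4: only `unit` is multiplied. */
static int reg_unsafe_for(int base, int unit)
{
    return REG_UNSAFE(base + unit);
}

/* Expands to 0x100 + (base + unit) * 4, which is what the caller meant. */
static int reg_safe_for(int base, int unit)
{
    return REG_SAFE(base + unit);
}
```

With a plain integer literal both macros agree, which matches the commit's remark that there is no bug as long as only simple integers are passed.
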
Ville Syrjälä
3746ff89bc i915: Kill intel_context::hw_stencil
ctx.hw_stencil is not used anywhere so kill it.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2014-11-13 19:13:27 +02:00
Ville Syrjälä
dafae910d4 i915: Accept GL_DEPTH_STENCIL GL_DEPTH_COMPONENT formats for renderbuffers
Gen2 doesn't support depth/stencil textures, and since

 commit c1d4d49993
 Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
 Date:   Thu Apr 24 14:11:43 2014 +0300

    i915: Don't advertise Z formats in TextureFormatSupported on gen2

depth/stencil formats are no longer accepted as texture formats.
However we still want depth/stencil renderbuffers, so add explicit
format checks to intel_alloc_renderbuffer_storage() to allow such
things.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2014-11-13 19:13:27 +02:00
Ville Syrjälä
a071425817 i915: Override mip filter to nearest with aniso
gen2 doesn't support linear mip filter with anisotropic min/mag
filtering. The hardware would automagically downgrade the min/mag
filters to linear in such cases, which IMO looks worse than forcing
the mip filter to nearest.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2014-11-13 19:13:27 +02:00
Ville Syrjälä
40a08e0d6a i915: Use L8A8 instead of I8 to simulate A8 on gen2
Gen2 doesn't support the A8 texture format. Currently the driver
substitutes it with I8, but that results in incorrect RGB values.
Use A8L8 instead. We end up wasting a bit of memory, but at least
we should get the correct results.

v2: Handle the fallback in _mesa_choose_tex_format() and also
    do it for all alpha formats that currently accept A8

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72819
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80050
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38873
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2014-11-13 19:13:27 +02:00
Ville Syrjälä
7988ff2fd1 i915: Fix GL_DOT3_RGBA a bit
The spec says using DOT4 for alpha is undefined unless DOT4 is also used
for color. It seems to do the right thing anyway, but better safe than sorry.

Also override numAlphaArgs to 2 for DOT4 since that's what it wants.
This might fix something in case the specified alpha mode has only one
argument. Also avoids emitting a needless 3DSTATE_MAP_BLEND_ARG if
the specified alpha mode has three arguments.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2014-11-13 19:13:27 +02:00
Neil Roberts
352f8f2d13 linker: Add a missing space in an error message
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-11-13 16:46:12 +00:00
José Fonseca
d5b1731178 llvmpipe: Call pipe_thread_wait() on Linux.
To address http://lists.freedesktop.org/archives/mesa-dev/2014-November/070569.html

In short, revert 706ad3b649 for non-Windows
OSes.
2014-11-13 15:01:19 +00:00
Kenneth Graunke
2b6e703863 i915g: we also have more than 0 viewports!
See 546d6c8d for the corresponding fix in freedreno.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Stephane Marchesin <stephane.marchesin@gmail.com>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
2014-11-12 20:59:28 -08:00
Eric Anholt
b3d269f5ae vc4: Avoid reusing a pointer from c->outputs[] after add_output().
add_output() can resize the qreg array, so we might use a stale pointer.
2014-11-12 18:24:10 -08:00
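
Illustration for b3d269f5ae above: a minimal sketch of the stale-pointer hazard when an append can resize the backing array. The struct int_array and both helpers are hypothetical stand-ins, not the vc4 c->outputs[] or add_output() code; the point is only that an index survives a realloc() while a pointer may not.

```c
#include <stdlib.h>

/* Hypothetical growable array; stands in for a resizable output array. */
struct int_array {
    int *data;
    size_t count;
    size_t capacity;
};

/* Appends a value and returns its index, or (size_t)-1 on allocation failure.
 * May move `data`, so pointers taken before the call must not be reused. */
static size_t int_array_add(struct int_array *a, int value)
{
    if (a->count == a->capacity) {
        size_t new_cap = a->capacity ? a->capacity * 2 : 1;
        int *p = realloc(a->data, new_cap * sizeof *p);
        if (!p)
            return (size_t)-1;
        a->data = p;
        a->capacity = new_cap;
    }
    a->data[a->count] = value;
    return a->count++;
}

/* Safe pattern: remember an index across the add and re-derive the element.
 * Assumes the array already holds at least one element. */
static int first_plus_new(struct int_array *a, int value)
{
    size_t first = 0;                          /* index, not &a->data[0] */
    size_t added = int_array_add(a, value);
    if (added == (size_t)-1)
        return 0;
    return a->data[first] + a->data[added];
}
```
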
Eric Anholt
acc1cca7ae vc4: Fix assumption of TGSI OUT[0] being POSITION in the VS.
All the shaders we've received so far had this be the case, but with
nir-to-tgsi that changed.

I might decide to make nir-to-tgsi keep the outputs in the same order, for
debugging sanity, but I'm not sure.
2014-11-12 18:23:40 -08:00
Ilia Mirkin
22543dd8a1 nvc0: remove unused mm_VRAM_fe0
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-11-12 15:24:15 -05:00
José Fonseca
9247509a8d st/glx: Implement GLX_EXT_create_context_es2_profile.
apitrace now supports it, and it makes it much easier to test
tracing/replaying on OpenGL ES contexts since
GLX_EXT_create_context_{es2,es}_profile are widely available.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-11-12 19:03:50 +00:00
Tom Stellard
0cae7ea271 Revert "clover: Fix build after llvm r221375"
This reverts commit cd93d82ba9.

llvm r221375 was reverted, so this commit needs to be too.
2014-11-12 12:30:08 -05:00
José Fonseca
977b18e486 gallivm: Fix build with LLVM 3.6 (r221751).
Tested with LLVM 3.3, 3.4, 3.5, and 3.6.

Trivial.
2014-11-12 11:08:07 +00:00
Matt Turner
7a82961b71 i965/cfg: Remove if_block/else_block.
I used these in the SEL peephole, but they require extra tracking and
fix ups. The SEL peephole can pretty easily find the blocks it needs
without these.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-11-11 09:41:06 -08:00
Matt Turner
4001181ba3 i965/fs: Don't use if_block/else_block in SEL peephole.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-11-11 09:41:06 -08:00
Chia-I Wu
20a061d2b4 ilo: clean up gen6_3DSTATE_SF()
Make the helpers fill out valid Gen7 3DSTATE_SF and 3DSTATE_SBE.  This
prevents the helpers from having to do

  dw[0] = GEN7_SBE_DW1_x; // setting DW1 value to dw[0]!?

and simplifies gen7_3DSTATE_{SF,SBE}().

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-11 16:04:18 +08:00
Chia-I Wu
239dca78b1 ilo: clean up gen7_3DSTATE_STREAMOUT()
Render stream and render enable are independent from so enable.  Having a
single return point makes it easier to see that.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-11 13:52:26 +08:00
Chia-I Wu
eab595d573 ilo: rework gen7_3DSTATE_SO_DECL_LIST()
Started to make pipe_stream_output_info mandatory, but ended up adding support
for stream id and making a workaround Gen7-specific.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-11 13:52:26 +08:00
Chia-I Wu
c637075ea2 ilo: add 3DSTATE_SO_BUFFER variants
Add gen7_disable_3DSTATE_SO_BUFFER() to disable SO buffers.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-11 13:52:25 +08:00
Chia-I Wu
2ff88ce4be ilo: add gen6_3dstate_constant()
It replaces gen6_fill_3dstate_constant().  gen6_3DSTATE_CONSTANT_{VS,GS,PS}
are made wrappers of the new function.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-11 13:52:25 +08:00
Chia-I Wu
31372f2d2c ilo: add variants of 3DSTATE_{HS,DS}
Rename them to gen7_disable_3DSTATE_{HS,DS}() to reflect the fact.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-11 13:52:25 +08:00
Chia-I Wu
421b565b3b ilo: add variants of 3DSTATE_GS
Add gen6_so_3DSTATE_GS(), gen6_disable_3DSTATE_GS(), and
gen7_disable_3DSTATE_GS() to do SO on GEN6 or to disable GS.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-11 13:52:22 +08:00
Chia-I Wu
63ded78e1c ilo: add variants of 3DSTATE_VS
Add gen6_disable_3DSTATE_VS() to disable VS.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-11 13:51:36 +08:00
Chia-I Wu
9087239df8 ilo: add variants of 3DSTATE_PS
Add gen7_disable_3DSTATE_PS() to disable PS.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-11 13:51:31 +08:00
Chia-I Wu
8ebb86325b ilo: add variants of 3DSTATE_WM
Add gen6_hiz_3DSTATE_WM() and gen7_hiz_3DSTATE_WM() for HiZ ops without
dispatching.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-11 13:51:28 +08:00
Chia-I Wu
703ae84ac2 ilo: add variants of 3DSTATE_CLIP
Add gen6_disable_3DSTATE_CLIP to disable clipping.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-11 13:51:21 +08:00
Chia-I Wu
8abf4976c6 ilo: prefix 3DSTATE_VF with gen75
3DSTATE_VF is Gen7.5+ only.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-11 09:11:56 +08:00
Michael Varga
9d6253cf82 st/va: MPEG4 call vlVaDecoderFixMPEG4Startcode()
If the VOP and GOV headers were truncated they will be regenerated.

Signed-off-by: Michael Varga <Michael.Varga@amd.com>
2014-11-10 10:24:07 -05:00
Michael Varga
d335f5ffa6 st/va: MPEG4 generate GOV and VOP header
Also implemented a small, locally used interface for writing bits to a buffer.

Signed-off-by: Michael Varga <Michael.Varga@amd.com>
2014-11-10 10:24:07 -05:00
Michael Varga
fa9e461967 st/va: MPEG4 populate the SPS structure
Signed-off-by: Michael Varga <Michael.Varga@amd.com>
2014-11-10 10:24:07 -05:00
Michael Varga
92350a65c4 st/va: MPEG4 populate the iq matrix buffers
Signed-off-by: Michael Varga <Michael.Varga@amd.com>
2014-11-10 10:24:07 -05:00
Michael Varga
9f1ee1b5c9 st/va: MPEG4 populate the PPS structure
Signed-off-by: Michael Varga <Michael.Varga@amd.com>
2014-11-10 10:24:07 -05:00
Michael Varga
c24ee2cf43 st/va: refactored handleVASliceDataBufferType
This patch cleans the function handleVASliceDataBufferType() for better
readability.

Signed-off-by: Michael Varga <Michael.Varga@amd.com>
2014-11-10 10:24:07 -05:00
Ian Romanick
46a2323c3f mesa: Remove _mesa_max_buffer_index
It appears to be completely unused since f9be8543 (February 2012).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-10 05:38:03 -08:00
Ian Romanick
8e4a6481e8 mesa: Uniform logging is very, very unlikely
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-11-10 04:25:40 -08:00
Ian Romanick
9cdf66657a glsl: Swap the order of glsl_type::name and ::length
On x86-64 this saves 8 bytes of padding in the structure, and this
reduces the size of the structure to 32 bytes.

v2: Fix constructor so that GCC won't warn about the order of
initialization.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-11-10 04:25:40 -08:00
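
Illustration for 9cdf66657a above: a minimal sketch of how member order changes padding. The two structs are hypothetical and much smaller than the real glsl_type; on a typical x86-64 ABI (8-byte pointers, 4-byte uint32_t) the first layout is 24 bytes and the second is 16, but the exact sizes are ABI-dependent.

```c
#include <stdint.h>

/* Hypothetical layouts; not the real glsl_type members. */
struct before {
    uint32_t base_type;  /* 4 bytes, then a 4-byte hole before the pointer */
    const char *name;    /* 8-byte aligned on typical 64-bit ABIs */
    uint32_t length;     /* 4 bytes, then 4 bytes of tail padding */
};

struct after {
    uint32_t base_type;
    uint32_t length;     /* fills the hole left in front of the pointer */
    const char *name;
};
```
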
Ian Romanick
3711abd780 glsl: Store glsl_type::vector_elements and ::matrix_columns as uint8_t
Due to the total number of bits used in the bitfield, this does not
increase the size of the structure.

It does, however, reduce the number of instructions required each time
one of these fields is accessed.  To access ::matrix_columns with the
bitfield, three instructions were required:

    movzbl 0x9(%rdx),%eax
    shr    %al
    and    $0x7,%eax

As a uint8_t, only one instruction is required.

    movzbl 0xa(%rdx),%eax

These fields are accessed *a lot*.

Valgrind callgrind results for a trace of Tesseract:

                 _mesa_Uniform4fv  _mesa_Uniform4f  _mesa_Uniform1i
Before (64-bit):       48,103,497       16,556,096          676,447
After  (64-bit):       45,722,616       15,737,964          670,607

                 _mesa_Uniform4fv  _mesa_Uniform4f  _mesa_Uniform1i
Before (32-bit):       61,472,611       21,051,222          821,361
After  (32-bit):       57,987,421       19,872,226          811,609

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-11-10 04:25:40 -08:00
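
Illustration for 3711abd780 above: a minimal sketch contrasting a bitfield member with a uint8_t member. The field widths and struct names are assumptions made for the example and do not reproduce the real glsl_type layout; whether a compiler emits a shift and mask for the bitfield read depends on the target and optimization level.

```c
#include <stdint.h>

/* Hypothetical layouts for illustration; not the real glsl_type definition. */
struct packed_bits {
    unsigned base_type : 5;
    unsigned vector_elements : 3;
    unsigned matrix_columns : 3;  /* a read typically needs load, shift, mask */
};

struct byte_fields {
    uint8_t base_type;
    uint8_t vector_elements;
    uint8_t matrix_columns;       /* a read can be a single byte load */
};

static unsigned cols_bits(const struct packed_bits *t)
{
    return t->matrix_columns;
}

static unsigned cols_bytes(const struct byte_fields *t)
{
    return t->matrix_columns;
}
```
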
Ian Romanick
378d92c74e mesa: Don't check for API_OPENGLES in _mesa_uniform_matrix
There are no uniforms in OpenGL ES 1.x, so we can't even get to this
code in that API.

Also, reorder the checks.  First check that transpose is true, then
check whether or not that is legal in the current API.  transpose should
never be true in an ES2 context, so this gets one check (the more
expensive one) out of the main path.

Valgrind callgrind results for a trace of Tesseract:

                 _mesa_UniformMatrix4fv  _mesa_UniformMatrix3fv
Before (64-bit):             96,119,025              24,240,510
After  (64-bit):             90,726,569              22,926,662

                 _mesa_UniformMatrix4fv  _mesa_UniformMatrix3fv
Before (32-bit):            132,434,452              29,051,808
After  (32-bit):            126,658,112              27,989,316

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-11-10 04:25:40 -08:00
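
Illustration for 378d92c74e above: a minimal sketch of putting the cheap, usually-false test first so the API check only runs on the rare path. The enums and function names are hypothetical, and the rule "transpose is rejected on ES2" is encoded here as an assumption for the example rather than copied from Mesa.

```c
#include <stdbool.h>

/* Hypothetical API enum and result codes; names are illustrative only. */
enum api { API_GL_COMPAT, API_GL_CORE, API_GLES2 };
enum result { RESULT_OK, RESULT_INVALID_VALUE };

/* Assumed rule for this sketch: transpose is not legal on ES2. */
static bool transpose_allowed(enum api api)
{
    return api != API_GLES2;
}

static enum result check_transpose(enum api api, bool transpose)
{
    /* `transpose` is tested first; when it is false (the common case)
     * the API lookup is short-circuited away. */
    if (transpose && !transpose_allowed(api))
        return RESULT_INVALID_VALUE;
    return RESULT_OK;
}
```
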
Ian Romanick
91a2fa1490 mesa: Rework array error checks in validate_uniform_parameters
Before ARB_explicit_uniform_location, Mesa's location encoding allowed
locations for non-array types that had non-zero array indices.
Basically, part of the location was the uniform and part was the array
index.  This meant that some checks had to occur for arrays and
non-arrays.  This is no longer possible, so the checks can be split up.

Valgrind callgrind results for a trace of Tesseract:

                 _mesa_Uniform4fv  _mesa_Uniform4f  _mesa_Uniform1i
Before (64-bit):       50,499,557      17,487,316           686,227
After  (64-bit):       50,023,791      17,274,432           684,293

                 _mesa_Uniform4fv  _mesa_Uniform4f  _mesa_Uniform1i
Before (32-bit):       62,968,039       21,732,380          828,147
After  (32-bit):       62,373,967       21,490,756          826,223

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-11-10 04:25:40 -08:00
Ian Romanick
366540e9af mesa: Get some gl_shader_program::LinkStatus checking out of the main path
I really wanted to remove 'shProg != NULL' as well, but that would have
required adding a dummy program as the default program.  That seemed
like more churn than removing one test was worth.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-11-10 04:25:39 -08:00
Ian Romanick
3f5ebb98b7 mesa: Rework location == -1 error checking
Only one caller wanted to generate an error when location == -1, so move
the error generation to that caller.  There will be more callers in the
future that do not want to generate errors.

Move the location == -1 check later in validate_uniform_parameters.  As
currently implemented, glUniform1iv(-1, -1, data) would not generate an
error, but it should due to count being < 0.

The location that I have moved it to will make more sense with the next
commit.

Valgrind callgrind results for a trace of Tesseract:

                 _mesa_Uniform4fv  _mesa_Uniform4f  _mesa_Uniform1i
Before (64-bit):       51,241,217      17,740,162           689,181
After  (64-bit):       50,499,557      17,487,316           686,227

                 _mesa_Uniform4fv  _mesa_Uniform4f  _mesa_Uniform1i
Before (32-bit):       63,940,605       21,987,918          831,065
After  (32-bit):       62,968,039       21,732,380          828,147

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-11-10 04:25:39 -08:00
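
Illustration for 3f5ebb98b7 above: a minimal sketch of the check ordering the commit describes, where a negative count is reported before location == -1 is treated as "ignore". The enum and validate_uniform_args() are hypothetical and far simpler than validate_uniform_parameters.

```c
/* Hypothetical result codes and validator; not Mesa's actual function. */
enum check { CHECK_OK, CHECK_SKIP, CHECK_ERROR };

static enum check validate_uniform_args(int location, int count)
{
    /* Argument errors are reported first, whatever the location is. */
    if (count < 0)
        return CHECK_ERROR;

    /* location == -1 is silently ignored, but only after the check above. */
    if (location == -1)
        return CHECK_SKIP;

    return CHECK_OK;
}
```

Under this ordering a call shaped like glUniform1iv(-1, -1, data) reaches the count check and is flagged, which is the behavior the commit message asks for.
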
Ian Romanick
23dcbf623f mesa: Minor clean ups in _mesa_uniform
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-11-10 04:25:39 -08:00
Ian Romanick
9c38d4db52 mesa: Remove GLSL_TYPE_SAMPLER check
Noting the assertion just a few lines earlier, returnType cannot be
GLSL_TYPE_SAMPLER.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-11-10 04:25:39 -08:00
Ian Romanick
5b9cf337b4 mesa/main: Pass the data that _mesa_uniform actually wants
The GL_ enums were previously used because glsl_types.h couldn't be used
in C code.  That was fixed some time ago (and uniforms.c already
includes glsl_types.h), so this is no longer necessary.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-11-10 04:25:39 -08:00
Chia-I Wu
d388d8576f ilo: derive fb blending caps at bind time
Derive whether a RT supports blending, logicop, and the like when
set_framebuffer_state() is called.  This enables us to simplify
gen6_BLEND_STATE().

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-10 15:46:31 +08:00
Chia-I Wu
55d70e0669 ilo: remove inlined state functions
We had some inlined state functions for dispatching.  They were not needed
with the new top/bottom split.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-10 15:46:19 +08:00
Chia-I Wu
c88c49baf4 ilo: use top/bottom split for state functions
Follow the builder and split state functions into top (vertex processing) and
bottom (pixel processing).

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-10 13:14:04 +08:00
Kenneth Graunke
f3b709c0ac i965: Advertise a line width of 40.0 on Cherryview and Skylake.
According to the documentation, line widths higher than 40.0 may have
quality problems.  That's already 20 times larger than we've been
exposing, so it seems totally sufficient.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-11-08 22:24:08 -08:00
Kenneth Graunke
6dab04d7e3 i965: Advertise larger line widths.
We've artificially been limiting this to 5 for no particular reason.

On Gen4-5, the limit is [0, 7.5] with a granularity of 0.5 (U3.1).
On Gen6+, the limit is [0, 7.9921875].  Since it's a U3.7, the
granularity should be 0.125 (1/8).

This patch conservatively advertises one granularity smaller than the
hardware's maximum value, just in case there's a problem using the
largest possible value.  On Gen4-5, this is 7.5 - 0.5 = 7.0.  On Gen6+,
this is 8.0 - 0.125 = 7.875.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-11-08 22:22:54 -08:00
Kenneth Graunke
61838fd9ad i965: Use ctx->Const.MaxLineWidth when clamping ctx->Line.Width.
Rather than hardcoding platform values in every code path, just use the
maximum value we set.

Currently, ctx->Const.MaxLineWidth == 5, which is smaller than the hardware
limit.  But applications shouldn't be using a value larger than we
support anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-11-08 22:22:53 -08:00
Kenneth Graunke
87927ed1f0 i965: Set Line Width correctly on Cherryview and Skylake.
Line Width moved to DW1 bits 29:12.  It's actually now a U11.7.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-11-08 22:22:18 -08:00
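
Illustration for 87927ed1f0 above: a minimal sketch of encoding a line width as unsigned fixed point and placing it at bits 29:12. In a Um.n format the raw value is the width multiplied by 2^n, so U11.7 stores 1.0 as 128. The helpers to_ufixed() and pack_line_width_dw1() are hypothetical; the rounding and clamping policy is an assumption, and only the U11.7 format and the bit position come from the commit message.

```c
#include <stdint.h>

/* Convert to unsigned fixed point with `frac_bits` fractional bits,
 * rounding to nearest and clamping to the field's range.
 * Hypothetical helper; callers are assumed not to pass NaN. */
static uint32_t to_ufixed(float value, unsigned int_bits, unsigned frac_bits)
{
    const uint32_t max_raw = (1u << (int_bits + frac_bits)) - 1u;
    float scaled = value * (float)(1u << frac_bits) + 0.5f;

    if (scaled <= 0.0f)
        return 0;
    if (scaled >= (float)max_raw)
        return max_raw;
    return (uint32_t)scaled;
}

/* Place a U11.7 line width (18 bits) at bits 29:12 of a dword. */
static uint32_t pack_line_width_dw1(float width)
{
    return to_ufixed(width, 11, 7) << 12;
}
```
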
Emil Velikov
a6d8413d7c docs: add news item and link release notes for mesa 10.3.3
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-08 17:22:15 +00:00
Emil Velikov
caa0fb4709 docs: Add sha256 sums for the 10.3.3 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 9cc26056ee)
2014-11-08 17:22:15 +00:00
Emil Velikov
0d5da6d9a8 Add release notes for the 10.3.3 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 1a9cc5f50d)
2014-11-08 17:22:15 +00:00
José Fonseca
b238c756da util/format: Fix clamping to 32bit integers.
Use clamping constants that guarantee no integer overflows.

As spotted by Chris Forbes.

This causes the code to change as:

-         value |= (uint32_t)CLAMP(src[0], 0.0f, 4294967295.0f);
+         value |= (uint32_t)CLAMP(src[0], 0.0f, 4294967040.0f);

-         value |= (uint32_t)((int32_t)CLAMP(src[0], -2147483648.0f, 2147483647.0f));
+         value |= (uint32_t)((int32_t)CLAMP(src[0], -2147483648.0f, 2147483520.0f));

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-11-08 10:32:39 +00:00
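
Illustration for b238c756da above: a minimal sketch of why the clamp bound changed. A 32-bit float has a 24-bit significand, so 4294967295 is not representable and the literal 4294967295.0f rounds up to 2^32, which no longer fits in uint32_t; 4294967040.0f (2^32 - 256) is the largest float below 2^32. The helper names are hypothetical and the code is not the generated util/format code.

```c
#include <stdint.h>

/* Returns nonzero when the old bound has silently become 2^32. */
static int old_bound_rounds_to_2_pow_32(void)
{
    return 4294967295.0f == 4294967296.0f;
}

/* Clamp, then convert. The upper bound is exactly representable and
 * strictly below 2^32, so the cast cannot overflow. */
static uint32_t float_to_u32_clamped(float x)
{
    const float hi = 4294967040.0f;   /* 2^32 - 256 */

    if (!(x > 0.0f))                  /* also catches NaN */
        return 0;
    if (x > hi)
        x = hi;
    return (uint32_t)x;
}
```

The signed case in the commit follows the same reasoning: 2147483520.0f is 2^31 - 128, the largest float below 2^31.
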
José Fonseca
d268eac3a9 util/format: Generate floating point constants for clamping.
This commit causes the generated C code to change as

            union util_format_r32g32b32a32_sscaled pixel;
  -         pixel.chan.r = (int32_t)CLAMP(src[0], -2147483648, 2147483647);
  -         pixel.chan.g = (int32_t)CLAMP(src[1], -2147483648, 2147483647);
  -         pixel.chan.b = (int32_t)CLAMP(src[2], -2147483648, 2147483647);
  -         pixel.chan.a = (int32_t)CLAMP(src[3], -2147483648, 2147483647);
  +         pixel.chan.r = (int32_t)CLAMP(src[0], -2147483648.0f, 2147483647.0f);
  +         pixel.chan.g = (int32_t)CLAMP(src[1], -2147483648.0f, 2147483647.0f);
  +         pixel.chan.b = (int32_t)CLAMP(src[2], -2147483648.0f, 2147483647.0f);
  +         pixel.chan.a = (int32_t)CLAMP(src[3], -2147483648.0f, 2147483647.0f);
            memcpy(dst, &pixel, sizeof pixel);

which surprisingly makes a difference for MSVC.

Thanks to Juraj Svec for diagnosing this and drafting a fix.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=29661
2014-11-08 10:32:39 +00:00
Vinson Lee
42443339f1 glsl/list: Revert unintentional file mode change in previous commit.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2014-11-07 21:04:08 -08:00
Vinson Lee
f9fc3949e1 glsl/list: Move declaration before code.
Fixes MSVC build error.

shaderapi.c
src\glsl\list.h(535) : error C2143: syntax error : missing ';' before 'type'
src\glsl\list.h(535) : error C2143: syntax error : missing ')' before 'type'
src\glsl\list.h(536) : error C2065: 'node' : undeclared identifier

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86025
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2014-11-07 15:36:26 -08:00
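
Illustration for f9fc3949e1 above: a minimal sketch of the declaration-before-code rule. Compilers that apply C89 rules to .c files require all declarations at the top of a block, which is consistent with the C2143 errors quoted in the commit. The struct node and list_length() are hypothetical and are not the list.h code that was fixed.

```c
#include <stddef.h>

struct node {
    struct node *next;
};

/* Counts nodes. Every declaration sits at the top of the block, before the
 * first statement, so the function also compiles under C89 rules. */
static size_t list_length(const struct node *head)
{
    const struct node *n;
    size_t len = 0;

    for (n = head; n != NULL; n = n->next)
        len++;
    return len;
}
```
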
Jason Ekstrand
0c36aac832 glsl/list: Add an exec_list_validate function
This can be very useful for trying to debug list corruptions.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-07 14:53:40 -08:00
José Fonseca
706ad3b649 llvmpipe: Avoid deadlock when unloading opengl32.dll
On Windows, DllMain calls and thread creation/destruction are
serialized, so when llvmpipe is destroyed from DllMain waiting for the
rasterizer threads to finish will deadlock.

So, instead of waiting for rasterizer threads to have finished, simply wait for the
rasterizer threads to notify they are just about to finish.

Verified with this very simple program:

   #include <windows.h>
   int main() {
      HMODULE hModule = LoadLibraryA("opengl32.dll");
      FreeLibrary(hModule);
   }

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=76252

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
2014-11-07 21:00:06 +00:00
José Fonseca
edb7b1c566 docs: Update minimum required LLVM version.
2014-11-07 21:00:06 +00:00
Emil Velikov
21925ec3fc i965: drop the custom gen8_instruction CFLAG
No longer needed as the file was removed with
commit 8c229d306b
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Mon Aug 11 10:07:07 2014 -0700

    i965: Delete the Gen8 code generators.

    We now use the brw_eu_emit.c code instead.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-07 18:32:17 +00:00
Emil Velikov
f6432c4d72 gbm/dri: cleanup memory leak on teardown
During teardown we free the driver_configs list pointer, but we forget
to deallocate each config in that list.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-07 18:32:07 +00:00
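
Illustration for f6432c4d72 above: a minimal sketch of freeing each element of a pointer list before freeing the list itself. The struct config type and free_config_list() are hypothetical, and the NULL-terminated layout is an assumption made for the example rather than something the commit message states.

```c
#include <stdlib.h>

/* Hypothetical config type; the real driver_configs element type differs. */
struct config {
    int id;
};

/* Frees every element of a NULL-terminated pointer array, then the array.
 * Returns how many elements were freed. */
static size_t free_config_list(struct config **configs)
{
    size_t freed = 0;

    if (!configs)
        return 0;
    for (size_t i = 0; configs[i] != NULL; i++) {
        free(configs[i]);
        freed++;
    }
    free(configs);   /* freeing only this pointer would leak the elements */
    return freed;
}
```
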
Emil Velikov
8ed08e69bc egl_dri2: add a note about dri2_create_screen
The function is not called by platform_drm. As such one needs to
pay special attention at teardown.

v2: Fix the comment block. Spotted by Ken.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2014-11-07 18:31:23 +00:00
Emil Velikov
38cec0303b egl_dri2: fix double free on drm platforms
An earlier commit failed to account for the fact that for drm platforms one does not
call dri2_create_screen, thus it does not create the screen and
driver_configs but inherits them from the "display" - gbm.

As such wrap cleanup in Platform != _EGL_PLATFORM_DRM to prevent
the issue and still cleanup correctly for non-drm platforms.

v2:
 - Drop the ifdef HAVE_DRM_PLATFORM, reindent the code and fix the
comment block. Suggested by Ken.

Reported-by: Kenneth Graunke <kenneth@whitecape.org>
Reported-by: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2014-11-07 18:29:08 +00:00
Chia-I Wu
9a0a4d67a9 ilo: tidy up message descriptor decoding
Move opcode to string mappings to functions of their own.  Have consistent
outputs for similar opcodes.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-07 23:34:56 +08:00
Chia-I Wu
d3c5976a3b ilo: decode INTERFACE_DESCRIPTOR_DATA
This is at least much better than decoding as blobs.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-07 23:33:21 +08:00
Matt Turner
58a54091a9 i965/fs: Wire up control flow correctly in predicated break pass.
When the earlier block ended with control flow, we'd mistakenly remove
some of its links to its children. The same happened with the later
block.

Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-11-06 16:37:56 -08:00
Matt Turner
f0cfc4fca0 i965/cfg: Add functions to get first and last non-CF instructions.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-11-06 16:37:56 -08:00
Kenneth Graunke
a16ca4ac6a glsl: Skip loop-too-large heuristic if indexing arrays of a certain size
A pattern in certain shaders is:

   uniform vec4 colors[NUM_LIGHTS];

   for (int i = 0; i < NUM_LIGHTS; i++) {
      ...use colors[i]...
   }

In this case, the application author expects the shader compiler to
unroll the loop.  By doing so, it replaces variable indexing of the
array with constant indexing, which is more efficient.

This patch extends the heuristic to see if arrays accessed within the
loop are indexed by an induction variable, and if the array size exactly
matches the number of loop iterations.  If so, the application author
probably intended us to unroll it.  If not, we rely on the existing
loop-too-large heuristic.

Improves performance in a phong shading microbenchmark by 2.88x, and a
shadow mapping microbenchmark by 1.63x.  Without variable indexing, we
can upload the small uniform arrays as push constants instead of pull
constants, avoiding shader memory access.  Affects several games, but
doesn't appear to impact their performance.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2014-11-06 16:30:47 -08:00
Kenneth Graunke
4f22db5fbb glsl: Lower constant arrays to uniform arrays.
Consider GLSL code such as:

   const ivec2 offsets[] =
      ivec2[](ivec2(-1, -1), ivec2(-1, 0), ivec2(-1, 1),
              ivec2(0, -1),  ivec2(0, 0),  ivec2(0, 1),
              ivec2(1, -1),  ivec2(1, 0),  ivec2(1, 1));

   ivec2 offset = offsets[<non-constant expression>];

Both i965 and nv50 currently handle this very poorly.  On i965, this
becomes a pile of MOVs to load the immediate constants into registers,
a pile of scratch writes to move the whole array to memory, and one
scratch read to actually access the value - effectively the same as if
it were a non-constant array.

We'd much rather upload large blocks of constant data as uniform data,
so drivers can simply upload the data via constbufs, and not have to
populate it via shader instructions.

This is currently non-optional because both i965 and nouveau benefit
from it, and according to Marek radeonsi would benefit today as well.
(According to Tom, radeonsi may want to handle this itself in the long
term, but we can always add a flag when it becomes useful.)

Improves performance in a terrain rendering microbenchmark by about 2x,
and cuts the number of instructions in about half.  Helps a lot of
"Natural Selection 2" shaders, as well as one "HOARD" shader.

total instructions in shared programs: 5473459 -> 5471765 (-0.03%)
instructions in affected programs:     5880 -> 4186 (-28.81%)

v2: Use ir_var_hidden to avoid exposing the new uniform via the GL
    uniform introspection API.

v3: Alphabetize Makefile.sources properly.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77957
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-06 16:20:01 -08:00
Kenneth Graunke
0c0bfb2ead glsl: Add infrastructure for "hidden" uniforms.
In the compiler, we'd like to generate implicit uniforms for internal
use.  These should not be visible via the GL uniform introspection API.

To support that, we add a new ir_variable::how_declared value of
ir_var_hidden, and plumb that through to gl_uniform_storage.

v2 (idr): Fix some memory management issues in
move_hidden_uniforms_to_end.  The comment block on the function has more
details.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-11-06 16:20:01 -08:00
Timothy Arceri
1378617218 mesa: Add SSE 4.1 optimisation for glDrawElements.
Makes use of SSE 4.1 to speed up compute of min and max elements.

Callgrind cpu usage results from pts benchmarks:

Openarena 0.8.8: 3.67% -> 1.03%
UrbanTerror: 2.36% -> 0.81%

V5:
- actually make use of the optimisation in android (Emil Velikov)
- set a better array size limit for using SSE and added TODO

V4:
- fixed bugs with incrementing pointer and updating counters

V3:
- Removed sse_minmax.c from Makefile.sources
- handle the first few values without SSE until the pointer is aligned
 and use _mm_load_si128 rather than _mm_loadu_si128
- guard the call to the SSE code better at build time

V2:
- removed GL* types
- use _mm_store_si128() rather than _mm_store_ps()
- add runtime check for SSE
- use aligned attribute for local mix/max
- bunch of tidyups

Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
2014-11-06 11:39:59 -08:00
Matt Turner
9557cf7d0d i965: Remove non-existent vertical strides from array.
These never existed, as far as I can tell.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-06 11:11:37 -08:00
Matt Turner
cc3b028a4f i965: Convert stride/width/execution size macros into enums.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-06 11:11:34 -08:00
Matt Turner
497122a338 i965/fs: Remove force uncompressed stack.
Last use was in shader_time.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-11-06 11:09:46 -08:00
Matt Turner
7e19e6c877 i965/fs: Use execution size of 1 for some shader_time operations.
The ADDs depended on dispatch_width, which really isn't what we wanted.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-11-06 11:09:46 -08:00
Matt Turner
ee7e6009a9 i965/fs: Use mov(4) instructions to read timestamp.
We only want fields 0-2.
2014-11-06 11:09:45 -08:00
Jan Vesely
cd93d82ba9 clover: Fix build after llvm r221375
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
2014-11-06 11:39:36 -05:00
Emil Velikov
ba0bb4227e egl_dri2: do not leak dri2_dpy->driver_configs
Walk through the list and free each config, and finally free the list
itself. Freeing approx 20KiB of memory, according to valgrind.
Inspired by a similar patch by enpeng xu.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-06 13:23:51 +00:00
Emil Velikov
54a065d9a6 ilo: add two missing headers to the sources list
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-06 13:19:08 +00:00
Alexandros Frantzis
f53b6d0134 Releasing a surfaceless EGL context doesn't release underlying DRI context.
driUnbindContext() checks for valid drawables before calling the driver
unbind function. In case of Surfaceless contexts, the drawables are always
Null and we end up not releasing the underlying DRI context. Moving the
call to the driver function before the drawable validity checks fixes things.

Steps to trigger this bug are as follows:

   - create surfaceless context and make it current
   - make some other context current
   - {another thread} destroy surfaceless context
   - make another context current

Signed-off-by: Alexandros Frantzis <Alexandros.Frantzis@canonical.com>
Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74563
2014-11-06 13:40:39 +02:00
Chia-I Wu
cd745d46ce ilo: let ilo_shader_compile_cs() return a dummy shader
The dummy shader sends an EOT message to end itself.  Much more work needs to
be done on the compiler side before we can advertise PIPE_CAP_COMPUTE.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-06 10:45:20 +08:00
Chia-I Wu
ce40fa3a4a ilo: hook up launch_grid()
All we need to do is to upload the input data and call
ilo_render_emit_launch_grid() with space checking.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-06 10:43:53 +08:00
Chia-I Wu
a1a701877a ilo: add ilo_render_emit_launch_grid()
ilo_render_emit_launch_grid() emits all the hardware states needed for a
launch_grid() call.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-06 10:43:53 +08:00
Chia-I Wu
9dd596c99f ilo: improve media command helpers
They were written for Gen6 but mostly untested.  Make them work for Gen7+.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-06 10:43:53 +08:00
Chia-I Wu
a2054af85c ilo: disassemble DP DC messages
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-06 10:43:53 +08:00
Chia-I Wu
58099ed0a1 ilo: disassemble TS messages
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-06 10:43:53 +08:00
Chia-I Wu
bfaed536dd ilo: update genhw headers for media pipeline
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-06 10:43:45 +08:00
Chia-I Wu
207eccc5bf ilo: add ilo_finalize_compute_states()
It updates the handles of the global bindings.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-06 10:43:31 +08:00
Chia-I Wu
9feb637cd0 ilo: use a dynamic array for global bindings
Use util_dynarray in ilo_set_global_binding() to allow for an unlimited number
of global bindings.  Add a comment for global bindings.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-06 10:43:31 +08:00
Chia-I Wu
1d51947693 ilo: add kernel queries for compute shaders
We need to know the local/input/private sizes and others.  This is not
complete.  We need many others for CURBE setup.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-06 10:42:19 +08:00
Chia-I Wu
99742998fc ilo: fix compute params
Based on beignet, hardware capabilities, and OpenCL requirements.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-06 10:26:34 +08:00
Chia-I Wu
510a1a9012 ilo: add eu_count and thread_count to ilo_dev_info
They will be used to report compute params or program compute states.
thread_count can also be used for 3DSTATE_VS.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-06 10:26:34 +08:00
Chia-I Wu
29253f44d0 ilo: fix intel_bo_wait() on kernel 3.17
drm_intel_gem_bo_wait() with negative timeout is broken on kernel 3.17.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-11-06 10:26:34 +08:00
Ian Romanick
93a92d2c69 mesa: Silence unused parameter warning in check_context_limits in non-debug builds
../../src/mesa/main/context.c: In function 'check_context_limits':
../../src/mesa/main/context.c:733:41: warning: unused parameter 'ctx' [-Wunused-parameter]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-05 09:36:04 -08:00
Ian Romanick
6f3b8bb747 util: Implement unreachable for MSVC using __assume
Based on the description of __assume at:

http://msdn.microsoft.com/en-us/library/1b3fsfxw.aspx
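
A minimal sketch of what such a macro can look like, assuming the usual pattern of asserting in debug builds and then giving the optimizer a hint. The macro and helper names below are illustrative, not the ones in the tree; only __assume (MSVC) and __builtin_unreachable (GCC/Clang) are real compiler features.

```c
#include <assert.h>

/* Illustrative unreachable() helper: assert in debug builds, then tell
 * the compiler that control flow can never get here.
 */
#if defined(_MSC_VER)
#define UNREACHABLE_SKETCH(msg) \
   do { assert(!msg); __assume(0); } while (0)
#elif defined(__GNUC__)
#define UNREACHABLE_SKETCH(msg) \
   do { assert(!msg); __builtin_unreachable(); } while (0)
#else
#define UNREACHABLE_SKETCH(msg) assert(!msg)
#endif

enum axis_sketch { AXIS_X, AXIS_Y };

/* Example user: every valid enum value returns inside the switch. */
static int
axis_index_sketch(enum axis_sketch a)
{
   switch (a) {
   case AXIS_X: return 0;
   case AXIS_Y: return 1;
   }
   UNREACHABLE_SKETCH("invalid axis");
   return -1; /* only reached on compilers without an unreachable hint */
}
```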

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-11-05 09:36:04 -08:00
Chris Forbes
1ca88aa582 i965: Fix sampler state pointer adjustment for nonconst samplers
This started hitting an assertion recently. Only affects Haswell
(Ivybridge doesn't support this meddling with the sampler state pointer,
and ARB_gpu_shader5 is not enabled yet on Broadwell)

14 Piglits crash->pass.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-05 23:32:42 +13:00
Nick Sarnie
9e2473763d ilo: add drm_configuration for the pipe-target
Allows the driver to advertise DMA-BUF and throttling.
2014-11-04 21:22:52 +00:00
Kenneth Graunke
6107557f8f i965: Re-enable Z16 on Gen8+.
Improves performance in GLBenchmark 2.7 TRex by 3.88889% +/- 0.336383%
(n=80) at 1280x720 on Broadwell GT3.  Together with the previous patch,
it improves performance by 5.42738% +/- 0.541971% (n=10) at 1920x1080.

Note that without the PMA stall fix, this would instead decrease
performance by 22%.

v2: Update comment (noticed by Kristian Høgsberg).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-04 11:39:18 -08:00
Kenneth Graunke
7423cc891b i965: Implement the PMA stall fix.
Certain non-promoted depth cases typically incur stalls.  In very
specific cases, we can enable a workaround which improves performance.

Improves performance in GLBenchmark 2.7 TRex by 1.17762% +/- 0.448765%
(n=75) at 1280x720 on Broadwell GT3.

Haswell has this feature as well, but we can't currently write registers
from userspace batches (and we'd incur additional software batch
scanning overhead as well), so we haven't enabled it.  Broadwell allows
us to write CACHE_MODE_1.  Backporters beware: the formula and flushing
incantation differs between Haswell and Broadwell.

v2: Move pma_stall_bits from brw->state to brw itself (requested by
    Kristian Høgsberg).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-04 11:38:01 -08:00
Kenneth Graunke
8ccf54ab09 i965: Add #defines for Broadwell HiZ workarounds in CACHE_MODE_1.
This patch adds macros needed for the HiZ PMA stall optimization.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-04 11:35:11 -08:00
Kenneth Graunke
b5ad8a5d72 i965: Update compaction code to handle Skylake like Cherryview.
Matt requested this in review feedback on the original patch, which I
completely missed when pushing this series.  Kristian also made this
change, but I grabbed the wrong version of the patch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-03 22:37:11 -08:00
Kenneth Graunke
8ca8dd123a mesa: Don't call _mesa_ClipControl from glPopAttrib when unsupported.
Otherwise, calling glPopAttrib on drivers that don't support
ARB_clip_control gives you a GL error, which is surprising at best.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-03 18:26:08 -08:00
Kenneth Graunke
f781965097 i965: Disable fast color clears on Skylake for now.
We're not programming the clear values yet, so this won't work.

This patch should be (effectively) reverted eventually.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-03 15:35:25 -08:00
Kristian Høgsberg
c31ce2c40c i965/skl: Use new MOCS for SKL
On Skylake, the MOCS bits are an index into a table of 63 different,
configurable cache configurations.  As for previous GENs, we only care about
WB and WT, which are available in the documented default set.  Define
SKL_MOCS_WB and SKL_MOCS_WT to the indices for those configurations and use
those for the Skylake MOCS values.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-03 15:33:12 -08:00
Jordan Justen
5745aaf15c i965/skl: Implement workaround for VF Invalidate issue
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-03 15:33:09 -08:00
Kenneth Graunke
35bbe177ec i965/skl: Update Viewport Z Clip Test Enable bits for Skylake.
Skylake has separate controls for enabling the Z Clip Test for the near
and far planes.  For now, maintain the legacy behavior by setting both.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-03 15:33:07 -08:00
Kenneth Graunke
77f584c7f9 i965/skl: Emit extra zeros in 3DSTATE_DS on Skylake.
Skylake's 3DSTATE_DS packet has a few more fields; we don't support
domain shaders yet though.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-03 15:33:05 -08:00
Kristian Høgsberg
0bb072b42b i965/skl: Init instructions compaction tables for SKL
They are the same as for BDW, so just add a case for SKL to the init switch.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-03 15:32:59 -08:00
Kristian Høgsberg
d235c5afde i965/skl: Add fast clear resolve rect multipliers for SKL
SKL updates the resolve rectangle scaling factors again.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-03 15:32:55 -08:00
Kenneth Graunke
051bfe4d52 i965/skl: Always emit 3DSTATE_BINDING_TABLE_POINTERS_* on Skylake.
On SKL, 3DSTATE_CONSTANT_* command is not committed until we give
the corresponding 3DSTATE_BINDING_TABLE_POINTERS_* command.  If we
fail to do so, the constant buffers won't be read and push constants
will be wrong.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-03 15:32:53 -08:00
Kenneth Graunke
1df496edb9 i965/skl: Allocate 16 DWords for SURFACE_STATE on Skylake.
Otherwise they overlap and horrible things happen.  All the new DWords
are for fast color clear values, which we don't do yet.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-03 15:32:51 -08:00
Kenneth Graunke
d18949ad82 i965/skl: Refactor surface state allocation.
We will need to allocate more DWords on Skylake.

v2: Don't mark brw_context parameter const.  It's modified.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-03 15:32:49 -08:00
Kenneth Graunke
263b584d5e i965/skl: Emit extra zeros in STATE_BASE_ADDRESS on Skylake.
Skylake introduces a new base address for a feature we don't yet expose.
Setting these to 0 should be safe.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-03 15:32:47 -08:00
Kenneth Graunke
eaf12022d2 i965/skl: Update stencil reference handling for Skylake.
Skylake uploads the stencil reference values in DW3 of the
3DSTATE_WM_DEPTH_STENCIL packet, rather than in COLOR_CALC_STATE.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-03 15:32:45 -08:00
Kenneth Graunke
822e791321 i965/skl: Set mask bits in PIPELINE_SELECT on Skylake.
Skylake has some extra bits in PIPELINE_SELECT, none of which are
interesting for a 3D driver.  In order to selectively change them, it
also introduces new "mask bits" in 15:8.  We care about the "Pipeline
Selection" bits (1:0), so set the mask to 0x3.
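
As a rough illustration of the bit layout described above (mask bits in 15:8, "Pipeline Selection" in 1:0), the low bits of the dword could be assembled as below. The macro and function names are hypothetical, and the command opcode bits are deliberately left out.

```c
#include <stdint.h>

/* Field positions taken from the description above; names are made up. */
#define PIPELINE_SELECTION_BITS_SKETCH  0x3u  /* "Pipeline Selection", bits 1:0 */
#define MASK_BITS_SHIFT_SKETCH          8     /* mask bits start at bit 8 */

/* Builds only the selection field plus its write mask; the command
 * header/opcode would be OR'd in by the caller.
 */
static uint32_t
pipeline_select_bits_sketch(uint32_t pipeline)
{
   uint32_t select = pipeline & PIPELINE_SELECTION_BITS_SKETCH;
   uint32_t mask = PIPELINE_SELECTION_BITS_SKETCH << MASK_BITS_SHIFT_SKETCH;

   return mask | select;
}
```

With the mask set to 0x3, only bits 1:0 are updated and the other new PIPELINE_SELECT bits keep their current values.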

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-03 15:32:43 -08:00
Jordan Justen
e813728b2b i965/skl: Set max OpenGL version the same as gen7/8
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-03 15:32:41 -08:00
Damien Lespiau
48157b904a i965/skl: Update 3DSTATE_SBE for Skylake.
This command has gained 2 dwords that allow specifying
which channels of which attributes need to be forwarded to the fragment
shader.

v2: Rebase forward a year (done by Ken).

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-11-03 15:32:34 -08:00
Kenneth Graunke
2b7f73af9c glsl: Improve the CSE pass debugging output.
The CSE pass now prints out why it thinks a value is not a candidate for
adding to the AE set.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-03 15:16:50 -08:00
Matt Turner
799106d387 i965/fs: Don't compute_to_mrf() on Gen >= 7.
No differences in shader-db on Haswell (Gen 7.5).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-03 11:27:52 -08:00
Matt Turner
5fbcb1b41d glsl: Remove now useless dot optimization on basis vect
The optimization in commit d056863b covers these cases, which were the
first optimizations I added to the GLSL compiler.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-11-03 11:27:50 -08:00
Matt Turner
336e76c143 glsl: Emit mul instead of dot if only one component left.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85683
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85691
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-11-03 11:27:38 -08:00
Tom Stellard
263eb7fa39 clover: Fix clBuildProgram piglit regression
Should trigger CL_INVALID_VALUE if device_list is NULL and num_devices
is greater than zero.
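
The check described above can be sketched as follows. This is not clover's code (clover is C++); the function name is hypothetical, and the error constant is redefined locally with the value CL_INVALID_VALUE has in the OpenCL headers so that the sketch needs no CL headers.

```c
#include <stddef.h>

#define SKETCH_CL_SUCCESS        0
#define SKETCH_CL_INVALID_VALUE  (-30)  /* same value as CL_INVALID_VALUE */

/* Rejects a NULL device list paired with a non-zero device count. */
static int
validate_device_list_sketch(unsigned num_devices, const void *device_list)
{
   if (device_list == NULL && num_devices > 0)
      return SKETCH_CL_INVALID_VALUE;

   return SKETCH_CL_SUCCESS;
}
```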

Introduced by e5468dfa52

Reported by: EdB

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-11-03 10:35:07 -05:00
José Fonseca
bfd453f942 gallivm: Disable frame-pointer-omission on x86 to ensure right stack alignment.
Between releases 3.2 and 3.3, LLVM stopped aligning the stack properly under
certain conditions (no allocas, but a large number of vectors causing spills
to the stack, and frame pointer omission enabled).

We were already disabling frame-pointer-omission on several build types,
but we now disable it on all build types.

It's not clear whether this affects 32-bit x86 processes only, or if it
can also affect 64-bit x86_64 processes when AVX registers are
available and used.  So disable frame-pointer-omission on both
x86/x86_64 to be on the safe side.

See also:
- http://llvm.org/PR21435

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-11-03 14:47:00 +00:00
José Fonseca
b7e447d323 gallivm: When disassembling a function, start by printing out its name.
This helps recognize what the function is supposed to do.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-11-03 14:47:00 +00:00
Ben Widawsky
5695303563 i965/chv: Increase VS and GS thread counts
AFAICT the number of threads is 80, not 70. I am not sure if Ken knows
something I do not.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-02 21:18:08 -08:00
Brian Paul
52576dcb88 gallium/docs: fix NRM, NRM4 docs
Need to do a sqrt().

FWIW, the html that Sphinx 1.1.3 generates for the math expressions
looks completely broken.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2014-11-01 09:00:07 -06:00
Brian Paul
afdc4309dc softpipe: use the tgsi_free_tokens() function
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-10-31 15:30:00 -06:00
Brian Paul
e6ee85ec61 tgsi: add a tgsi_free_tokens() function
To match tgsi_alloc_tokens().

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-10-31 15:29:59 -06:00
Brian Paul
c996b22329 util: simplify u_pstipple.c code
Use the new helper functions in the tgsi_transform.h file to emit
declarations and instructions.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-10-31 15:29:59 -06:00
Brian Paul
55008ef697 util: simplify temp register selection in u_pstipple.c
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-10-31 15:29:59 -06:00
Brian Paul
ccd1ea9d52 util: simplify util_pstipple_create_fragment_shader() params
Pass and return tgsi_token buffers instead of pipe_shader_state.

And update softpipe driver (the only user of this function).

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-10-31 15:29:59 -06:00
Brian Paul
e3ecb8206a softpipe: remove unused softpipe_create_fs_variant_exec() parameter
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-10-31 15:29:59 -06:00
Brian Paul
2b9e63823f softpipe: check for SP_NEW_STIPPLE when building quad pipeline
Fixes polygon stipple if both DO_PSTIPPLE_IN_DRAW_MODULE and
DO_PSTIPPLE_IN_HELPER_MODULE are zero/off.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-10-31 15:29:59 -06:00
Tom Stellard
b9e41b587f r600g: Fix build with opencl and radeonsi disabled
2014-10-31 16:26:52 -04:00
Tom Stellard
64b0fac5e2 clover: Fix bug when binary programs are passed to clBuildProgram() v2
This was a regression introduced by
611d66fe45

Passing a binary program to clBuildProgram() is legal, but passing one
to clCompileProgram() is not.

v2:
  - Code cleanups.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-31 15:24:00 -04:00
Tom Stellard
e5468dfa52 clover: Factor input validation of clCompileProgram into a new function v2
This factors out the validation that is common with clBuildProgram().

v2:
  - Code cleanups.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-31 15:24:00 -04:00
Tom Stellard
1f4e48d5b5 radeonsi/compute: Enable PIPE_SHADER_IR_NATIVE for compute shaders v2
v2:
  - Drop dependency on LLVM >= 3.5.1
  - Rename si_create_shader() to si_shader_binary_read()
2014-10-31 15:24:00 -04:00
Tom Stellard
fa07f4b68a r600g/compute: Enable PIPE_SHADER_IR_NATIVE for compute shaders v2
v2:
  - Drop dependency on LLVM >= 3.5.1
2014-10-31 15:24:00 -04:00
Tom Stellard
e91735a641 gallium/radeon: Add query for symbol specific config information
This adds a query which allows drivers to access the config
information of a specific function within the LLVM generated ELF
binary.  This makes it possible for the driver to handle ELF
binaries with multiple kernels / global functions.
2014-10-31 15:24:00 -04:00
Marek Olšák
f058c6bbd1 r300g: remove enabled/disabled hyperz and AA compression messages
It's annoying with octave. Reported by Michael Burian.

Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
2014-10-30 22:24:18 +01:00
Dieter Nützel
068b9f4f7a r600g: Delete unused variable 'max_global_size' in 'r600_get_compute_param'
Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2014-10-30 22:24:18 +01:00
Chia-I Wu
4ded2ef5e8 mesa: protect the debug state with a mutex
We are about to change mesa to spawn threads for deferred glCompileShader and
glLinkProgram, and we need to make sure those threads can send compiler
warnings/errors to the debug output safely.

Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-30 02:26:19 -07:00
Chia-I Wu
2d64e4ffba glsl: protect glsl_type with a mutex
glsl_type has several static hash tables and a static ralloc context.  They
need to be protected by a mutex as they are not thread-safe.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69200
Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-30 02:26:19 -07:00
Chia-I Wu
a6706163cb glsl: protect anonymous struct id with a mutex
There may be two contexts compiling shaders at the same time, and we want the
anonymous struct id to be globally unique.
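
A minimal sketch of a mutex-protected id counter, assuming POSIX threads. The variable and function names are hypothetical; the actual patch may use a different locking primitive.

```c
#include <pthread.h>

/* Process-wide counter shared by every context that compiles shaders. */
static pthread_mutex_t anon_id_mutex_sketch = PTHREAD_MUTEX_INITIALIZER;
static unsigned anon_id_counter_sketch = 0;

/* Hands out a unique id; the lock keeps two threads from reading the
 * same counter value before either of them increments it.
 */
static unsigned
next_anonymous_id_sketch(void)
{
   unsigned id;

   pthread_mutex_lock(&anon_id_mutex_sketch);
   id = anon_id_counter_sketch++;
   pthread_mutex_unlock(&anon_id_mutex_sketch);

   return id;
}
```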

Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-30 02:26:19 -07:00
Chia-I Wu
61c3d49388 util: initialize locale_t with a static object
_mesa_strtod and _mesa_strtof may be called from multiple threads.  They need
to be thread-safe.

v2: platform checks are now done in configure.ac

Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-30 02:26:19 -07:00
Chia-I Wu
b039dbfffd configure: check for xlocale.h and strtof
This assumes that xlocale.h implies newlocale and strtof_l.  SCons is
updated to define HAVE_XLOCALE_H on linux and darwin.

Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-30 02:26:19 -07:00
Chia-I Wu
e3f2029479 util: add _mesa_strtod and _mesa_strtof
Both core mesa and glsl have their own wrappers for strtof_l.  Merge
and move them to util/.  They are compiled with a C++ compiler so that
we can make them thread-safe in a following commit.

Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-30 02:26:19 -07:00
Mathias Fröhlich
2c2ada6720 mesa/gallium: Signal _NEW_TRANSFORM from glClipControl.
This removes the need for the gallium rasterizer state
to listen to viewport changes.
Thanks to Marek Olšák <maraeo@gmail.com>.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2014-10-30 07:52:00 +01:00
Matt Turner
600066af93 Revert "i965/compaction: Disable compaction on SNB temporarily."
This reverts commit cabc93c5ad.

Mark thinks the failures on the SNB GT2 in the lab are actually because
of faulty hardware, not instruction compaction. The GT1 didn't see any
problems after changes to the compaction code.
2014-10-29 21:38:39 -07:00
Matt Turner
601a134180 i965/vec4: Perform CSE on MAD instructions with final arguments switched.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-29 21:35:46 -07:00
Matt Turner
b65bd9583b i965/fs: Perform CSE on MAD instructions with final arguments switched.
Multiplication is commutative.
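
The operand comparison this implies can be sketched as below. The struct and the operand layout (dst = src[0] + src[1] * src[2], with sources reduced to plain register ids) are simplifying assumptions for illustration, not the backend's real instruction type.

```c
#include <stdbool.h>

/* Hypothetical MAD instruction: dst = src[0] + src[1] * src[2]. */
struct mad_sketch {
   int src[3];
};

/* Two MADs compute the same value if the addend matches and the two
 * multiplicands match in either order.
 */
static bool
mad_operands_match_sketch(const struct mad_sketch *a,
                          const struct mad_sketch *b)
{
   if (a->src[0] != b->src[0])
      return false;

   return (a->src[1] == b->src[1] && a->src[2] == b->src[2]) ||
          (a->src[1] == b->src[2] && a->src[2] == b->src[1]);
}
```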

instructions in affected programs:     48314 -> 47954 (-0.75%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-29 21:35:46 -07:00
Matt Turner
d056863b3c glsl: Drop constant 0.0 components from dot products.
Helps a small number of vertex shaders in the games Dungeon Defenders
and Shank, as well as an internal benchmark.
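
The idea can be illustrated at runtime with the sketch below: components whose constant operand is 0.0 contribute nothing and are skipped. The compiler pass does this on the IR at compile time; the function here is a hypothetical stand-in and ignores NaN/Inf corner cases.

```c
/* Computes dot(a, b_const) while skipping components where the constant
 * operand is exactly 0.0.  *kept receives the number of components that
 * remained; when it is 1, the whole dot product is a single multiply.
 */
static float
dot_skip_zero_sketch(const float *a, const float *b_const,
                     unsigned n, unsigned *kept)
{
   float sum = 0.0f;
   unsigned used = 0;

   for (unsigned i = 0; i < n; i++) {
      if (b_const[i] == 0.0f)
         continue; /* a[i] * 0.0 adds nothing to the sum */
      sum += a[i] * b_const[i];
      used++;
   }

   if (kept)
      *kept = used;
   return sum;
}
```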

instructions in affected programs:     2801 -> 2719 (-2.93%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-29 21:35:46 -07:00
Kenneth Graunke
26122e09a3 glx/dri3: Implement LIBGL_SHOW_FPS=1 for DRI3/Present.
v2: Use the UST value provided in the PRESENT_COMPLETE_NOTIFY event
    rather than gettimeofday(), which gives us the presentation time
    instead of the time when SwapBuffers was called.  Suggested by
    Keith Packard.  This relies on the fact that the X DRI3/Present
    implementations use microseconds for UST.

v3: Properly ignore PresentCompleteKindMSCNotify; multiply in 64 bits
    (caught by Keith Packard).
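
A rough sketch of the arithmetic involved, assuming UST values in microseconds as noted in v2. The function names and the reporting-interval parameter are hypothetical; the point is that the seconds-to-microseconds multiply is widened to 64 bits before it can overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* True once at least interval_s seconds of UST time have elapsed.
 * The cast makes the multiplication happen in 64 bits.
 */
static bool
fps_report_due_sketch(uint64_t prev_ust, uint64_t cur_ust,
                      uint32_t interval_s)
{
   uint64_t interval_us = (uint64_t)interval_s * 1000000u;

   return cur_ust - prev_ust >= interval_us;
}

/* Frames per second over the elapsed UST interval (microseconds). */
static double
fps_from_ust_sketch(uint32_t frames, uint64_t prev_ust, uint64_t cur_ust)
{
   if (cur_ust <= prev_ust)
      return 0.0; /* no time elapsed; avoid dividing by zero */

   return (double)frames * 1000000.0 / (double)(cur_ust - prev_ust);
}
```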

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Keith Packard <keithp@keithp.com> [v3]
Reviewed-by: Marek Olšák <marek.olsak@amd.com> [v1]
2014-10-29 15:13:58 -07:00
Kenneth Graunke
62b07b934e i965: Rename brw_vec4_gs.[ch] to brw_gs.[ch].
These source files support actual geometry shaders, so using "gs" for
the name makes a lot of sense.  We're going to be adding SIMD8 geometry
shader support as well, at which point "vec4_gs" will be a misnomer.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
2014-10-29 12:38:56 -07:00
Kenneth Graunke
02f8f90cc2 i965: Rename brw_gs{,_emit}.[ch] to brw_ff_gs{,_emit}.[ch].
The brw_gs.[ch] and brw_gs_emit.c source files contain code for
emulating fixed-function unit functionality (VF primitive decomposition
or SOL) using the GS unit.  They do not contain code to support proper
geometry shaders.

We've taken to calling that code "ff_gs" (see brw_ff_gs_prog_key,
brw_ff_gs_prog_data, brw_context::ff_gs, brw_ff_gs_compile,
brw_ff_gs_prog).  So it makes sense to make the filenames match.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
2014-10-29 12:38:42 -07:00
Kenneth Graunke
1480814173 i965: Rename intel_bufferobj_* functions to match GL and DD hooks.
The GL functions and driver hooks use corresponding names---for example,
glMapBufferRange and Driver.MapBufferRange.  But our implementation was
called "intel_bufferobj_map_range," which has the words "map" and
"buffer" swapped, as well as randomly adding "obj."

FlushMappedBufferRange was even trickier: it ordered the words
3, "obj", 1, 2, 4: intel_bufferobj_flush_mapped_range.

Even though the old names were consistent, I always had trouble
rearranging the jumble of words when searching for a function,
and it took a few tries to eventually land there.

The new names match the word order of GL and the driver hooks;
FlushMappedBufferRange is simply brw_flush_mapped_buffer_range.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-10-29 12:38:28 -07:00
Jan Vesely
993e2922c9 configure: fix typos
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
2014-10-29 19:10:48 +00:00
Jan Vesely
af9551e68c configure: include llvm systemlibs when using static llvm
v2: drop -Wl,--exclude-libs, it's not necessary
    fix tabs/spaces

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70410
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
2014-10-29 18:52:46 +00:00
Michel Dänzer
402ab50bed radeon/llvm: Dynamically allocate branch/loop stack arrays
This prevents us from silently overflowing the stack arrays, and allows
arbitrary stack depths.
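
The general technique can be sketched as a stack that grows with realloc() instead of writing past a fixed-size array. The struct and function names are hypothetical and do not mirror the radeon/llvm code.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical growable stack of branch/loop entries. */
struct block_stack_sketch {
   int *entries;
   unsigned depth;
   unsigned capacity;
};

/* Pushes a value, doubling the storage when it is full.  Returns false
 * on allocation failure and leaves the stack unchanged.
 */
static bool
block_stack_push_sketch(struct block_stack_sketch *s, int value)
{
   if (s->depth == s->capacity) {
      unsigned new_capacity = s->capacity ? s->capacity * 2 : 4;
      int *grown = realloc(s->entries, new_capacity * sizeof(*grown));

      if (!grown)
         return false;
      s->entries = grown;
      s->capacity = new_capacity;
   }

   s->entries[s->depth++] = value;
   return true;
}
```

A zero-initialized struct is a valid empty stack, and the owner releases the storage with free(s.entries) when done.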

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85454

Cc: mesa-stable@lists.freedesktop.org
Reported-and-Tested-by: Nick Sarnie <commendsarnex@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-29 19:01:25 +09:00
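The idea behind 402ab50bed (growing a stack on demand instead of using a fixed-size array) can be sketched roughly as follows. This is a minimal illustration only: `struct dyn_stack`, `dyn_stack_push` and `dyn_stack_pop` are hypothetical names, not the actual radeon/llvm data structures.

```c
#include <stdlib.h>

/* Hypothetical growable stack; not the actual radeon/llvm structures. */
struct dyn_stack {
   void **items;
   unsigned depth;
   unsigned capacity;
};

/* Push an item, doubling the backing array when it is full.
 * Returns 0 on success, -1 if the allocation failed. */
static int dyn_stack_push(struct dyn_stack *s, void *item)
{
   if (s->depth == s->capacity) {
      unsigned new_cap = s->capacity ? s->capacity * 2 : 8;
      void **grown = realloc(s->items, new_cap * sizeof(*grown));
      if (!grown)
         return -1;
      s->items = grown;
      s->capacity = new_cap;
   }
   s->items[s->depth++] = item;
   return 0;
}

/* Pop the top item, or return NULL if the stack is empty. */
static void *dyn_stack_pop(struct dyn_stack *s)
{
   return s->depth ? s->items[--s->depth] : NULL;
}
```

With a scheme like this, nesting depth is limited by available memory rather than by a compile-time array size, and an overflow can no longer go unnoticed.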
Chris Forbes
0d5f4960a4 mesa: Fix order of errors for glDrawTransformFeedbackStream
The OpenGL 4.0 core profile specification, section 2.17.3
Transform Feedback Draw Operations says:

   "The error INVALID_VALUE is generated if <stream> is greater
    than or equal to the value of MAX_VERTEX_STREAMS.
    ...
    The error INVALID_OPERATION
    is generated if EndTransformFeedback has never been called
    while the object named by id was bound."

Fixes the piglit test:
   ARB_transform_feedback3/arb_transform_feedback3-draw_using_invalid_stream_index
   (with the test itself fixed to eliminate an unrelated failure)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-29 21:25:20 +13:00
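Assuming the intended order follows the spec text quoted in 0d5f4960a4, the validation can be sketched as below. The enum, the limit of 4 streams and the `ended_anytime` field are placeholders chosen for illustration, not Mesa's actual definitions.

```c
#include <stdbool.h>

/* Placeholder error codes and stream limit (assumptions for this sketch). */
enum draw_error { ERR_NONE, ERR_INVALID_VALUE, ERR_INVALID_OPERATION };
#define SKETCH_MAX_VERTEX_STREAMS 4u

struct xfb_object {
   bool ended_anytime; /* true once EndTransformFeedback has been called */
};

static enum draw_error
validate_draw_xfb_stream(const struct xfb_object *obj, unsigned stream)
{
   /* The stream index is checked first, so an out-of-range stream
    * reports INVALID_VALUE even if the object was never ended. */
   if (stream >= SKETCH_MAX_VERTEX_STREAMS)
      return ERR_INVALID_VALUE;

   /* Only then is the object's state checked. */
   if (!obj->ended_anytime)
      return ERR_INVALID_OPERATION;

   return ERR_NONE;
}
```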
Eric Anholt
f87c700895 vc4: Add support for ARL and indirect register access on TGSI_FILE_CONSTANT.
Fixes 14 ARB_vp tests (which had no lowering done), and should improve
performance of indirect uniform array access in GLSL.
2014-10-28 17:16:05 -07:00
Eric Anholt
5539a5b685 vc4: Fix mixup of return type in reloc_tex(). 2014-10-28 17:15:36 -07:00
Eric Anholt
926ab7dfa5 vc4: Drop redundant check for is_tmu_write().
This function is only called when it would return true.
2014-10-28 17:15:36 -07:00
Eric Anholt
8911879dec vc4: Don't forget to validate code that's got PROG_END on it.
This signal doesn't terminate the program now, it terminates the program
soon.  So you have to actually validate the code in the instruction.
2014-10-28 17:15:36 -07:00
Eric Anholt
fc1eb614a7 vc4: Add .dir-locals.el for kernel style in the kernel code. 2014-10-28 17:15:36 -07:00
Eric Anholt
6576dc1e92 vc4: Fix a couple missing '\n's in error output. 2014-10-28 17:15:36 -07:00
Brian Paul
6ad1c1eec1 st/mesa: use PIPE_BIND_DISPLAY_TARGET when checking for sRGB capability
When we're checking if the framebuffer is sRGB capable, call
is_format_supported() with the PIPE_BIND_DISPLAY_TARGET flag.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-10-28 18:07:54 -06:00
Marek Olšák
6fcb5520b7 Revert "st/mesa: set MaxUnrollIterations = 255"
This reverts commit 20836c8185.

255 is a huge number. If you have a loop with 255 iterations, unrolling it
will exceed the SM3 instruction limit. Let's use the default again.

The comment about a SM3 limit doesn't make sense. For SM3, we generally
want 32 (default) or a lower number due to the SM3 instruction limit, which
is 512 instructions. For SM4, we can try higher numbers if needed, but
some shaders can end up being pretty huge and shader compilation can take
more time.

This fixes a shader compile failure on R500/SM3. Reported on IRC.

Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-10-28 23:20:51 +01:00
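The arithmetic behind the revert in 6fcb5520b7 can be illustrated with a deliberately simple model: a fully unrolled loop costs roughly iterations times body size, which is compared against the 512-instruction SM3 limit mentioned in the message. The helper names are hypothetical and the model ignores everything outside the loop.

```c
/* SM3 instruction limit as stated in the commit message. */
#define SM3_INSTRUCTION_LIMIT 512u

/* Rough instruction count of a fully unrolled loop (simplified model). */
static unsigned unrolled_instructions(unsigned iterations, unsigned body)
{
   return iterations * body;
}

/* Returns 1 if the unrolled loop fits within the SM3 limit, 0 otherwise. */
static int fits_sm3(unsigned iterations, unsigned body)
{
   return unrolled_instructions(iterations, body) <= SM3_INSTRUCTION_LIMIT;
}
```

Under this model a 3-instruction body unrolled 255 times needs 765 instructions and no longer fits, while the default of 32 iterations needs only 96.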
David Heidelberger
b7186ebea9 r300g/vdpau: enable again
Signed-off-by: David Heidelberger <david.heidelberger@ixit.cz>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2014-10-28 23:20:51 +01:00
Marek Olšák
3fc499a1dd r300g: only set clip_halfz for chips with HW TCL
I forgot that we cannot emit vertex shader state on a chip without VS.
In such a case, clip_halfz is handled by the Draw module.
2014-10-28 23:20:45 +01:00
Marek Olšák
e05259b637 radeonsi: fix incorrect index buffer max size for lowered 8-bit indices
Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-28 23:20:45 +01:00
Marek Olšák
72424061e0 radeonsi: fix polygon mode for points and lines and point/line fill modes
Fixes piglit/polygon-mode-offset.

Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-28 23:20:45 +01:00
Marek Olšák
dab177ea99 r600g: fix polygon mode for points and lines and point/line fill modes
Fixes piglit/polygon-mode-offset.

Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-28 23:20:45 +01:00
Glenn Kennard
7b1c0cbc90 r600g: Implement sm5 UBO/sampler indexing
Caveat: Shaders using UBO/sampler indexing will
not be optimized by SB, due to SB not currently
supporting the necessary CF_INDEX_[01] index
registers.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
2014-10-28 23:20:45 +01:00
Glenn Kennard
444c8c2f28 r600g: Implement sm5 interpolation functions
Requires evergreen/cayman

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
2014-10-28 23:20:44 +01:00
Neil Roberts
3b83a5c35c docs: Update GL3.txt and relnotes for GL_KHR_context_flush_control 2014-10-28 16:51:12 +00:00
Neil Roberts
60ec95fa1e mesa: Add support for the GL_KHR_context_flush_control extension
The GL side of this extension just provides an accessor via glGetIntegerv for
the value of GL_CONTEXT_RELEASE_BEHAVIOR so it is trivial to implement. There
is a constant on the context for the value of the enum which is initialised to
GL_CONTEXT_RELEASE_BEHAVIOR_FLUSH. The extension is always enabled because it
doesn't need any driver interaction to retrieve the value.

If the value of the enum is anything but FLUSH then _mesa_make_current will
now refrain from calling _mesa_flush. This should only affect drivers that
explicitly change the enum to a non-default value.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-28 16:40:18 +00:00
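The make-current behaviour described in 60ec95fa1e can be sketched as follows. The types and function names here are stand-ins invented for the example; only the rule itself (flush the outgoing context only when its release behaviour is FLUSH) comes from the message.

```c
#include <stddef.h>

/* Stand-in for the context release behaviour constant. */
enum release_behavior { RELEASE_BEHAVIOR_NONE, RELEASE_BEHAVIOR_FLUSH };

struct context {
   enum release_behavior release_behavior;
   unsigned flush_count; /* counts flushes so the effect is observable */
};

static void flush(struct context *ctx)
{
   ctx->flush_count++;
}

/* Switch contexts; the outgoing context is flushed only if its
 * release behaviour is FLUSH (the default). */
static struct context *
make_current(struct context *old_ctx, struct context *new_ctx)
{
   if (old_ctx != NULL &&
       old_ctx->release_behavior == RELEASE_BEHAVIOR_FLUSH)
      flush(old_ctx);
   return new_ctx;
}
```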
Neil Roberts
1ecf6e1595 gles2: Update gl2ext.h to revision 28335
The main incentive to do this is to get the defines for the
GL_KHR_context_flush_control extension.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-28 16:40:18 +00:00
Jason Ekstrand
17d98ae254 i965/fs: Don't set dependency hints on instructions with spilled destinations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-27 17:54:10 -07:00
Jason Ekstrand
547a7fb458 i965/fs: Make scratch write instructions use the correct execution size
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-10-27 13:35:57 -07:00
Jason Ekstrand
9d1f72ebde i965/fs: Use correct spill offsets
Different platforms require the offset to be in different units.  However,
the generator fixes all of this up for us and only requires an offset in
bytes.  Previously, we were getting this wrong all over the place.  Some
computed/used it correctly as bytes while others treated the offset as
whole registers or computed it as bytes or bytes*2 in SIMD16 mode.  This
commit cleans all this up and makes us properly treat it as bytes
everywhere.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-10-27 13:35:57 -07:00
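The convention adopted in 9d1f72ebde (spill offsets are always in bytes, regardless of SIMD width) can be sketched like this. The helper name is hypothetical, and the 32-byte register size is an assumption of the sketch (one GRF holding 8 channels of 4 bytes).

```c
/* Assumed size of one GRF register in bytes (8 channels x 4 bytes). */
#define SKETCH_REG_SIZE 32u

/* Hand out scratch space for a value occupying num_regs registers.
 * Returns the byte offset assigned to it and advances *next_offset,
 * which is also kept in bytes. */
static unsigned assign_spill_offset(unsigned *next_offset, unsigned num_regs)
{
   unsigned offset = *next_offset;
   *next_offset += num_regs * SKETCH_REG_SIZE;
   return offset;
}
```

Keeping a single unit everywhere means no caller has to know whether an offset is in registers, bytes, or bytes*2.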
Jason Ekstrand
4242eb14c1 i965: Use the spill destination for the message header on GEN >= 7
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-10-27 13:35:57 -07:00
Jason Ekstrand
76bb695f09 i965/fs: Don't [un]spill multiple registers at a time in SIMD8 mode
I thought this would be a clever way to make spilling less expensive.
However, it appears that the oword read/write messages we are using for
spilling ignore the execution size and assume SIMD16 whenever working with
more than one register.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-10-27 13:35:57 -07:00
Jason Ekstrand
3a5df8b612 i965/fs: Use instruction execution sizes when generating scratch reads/writes
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-10-27 13:35:57 -07:00
Lionel Landwerlin
d175e7c16b egl/drm: do not crash when swapping buffers without any rendering
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-10-27 10:36:21 -07:00
Tobias Klausmann
1a170980a0 nv50: handle inverted render conditions
This enables ARB_conditional_render_inverted.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-26 07:33:16 -04:00
Rob Clark
13862812dc freedreno/ir3: consider instruction neighbors in cp
Fanin (merge) nodes require their srcs to be "adjacent" in consecutive
scalar registers.  Keep track of instruction neighbors in copy-
propagation step and avoid eliminating mov's which would cause an
instruction to need multiple distinct left and/or right neighbors.

This lets us not fall on our face when we encounter things like:

  1: MOV TEMP[2], IN[0].xyzw
  2: TEX OUT[0].xy, TEMP[2], SAMP[0], SHADOW2D
  3: MOV TEMP[2].xy, IN[0].yxzz
  4: TEX OUT[0].zw, TEMP[2], SAMP[0], SHADOW2D
  5: END

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-25 12:07:43 -04:00
Rob Clark
4dff2a6429 freedreno/ir3: always mov tex coords
Always insert extra mov's for the tex coord into the fanin.  This
simplifies things a bit, and avoids a scenario where multiple sam
instructions can have mutually exclusive inputs to their fanin, for
example:

  1: TEX OUT[0].xy, IN[0].xyxx, SAMP[0], 2D
  2: TEX OUT[0].zw, IN[0].yxxx, SAMP[0], 2D

The CP pass can always remove the mov's that are not actually needed,
so better to start out with too many mov's in the front end, than not
enough.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-25 12:07:34 -04:00
Rob Clark
33193540fc freedreno: rename a couple debug flags
dscis -> noscis
dbypass -> nobypass

A bit more consistent w/ nobin, etc.  And IMO a bit more sensible names.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-25 12:07:21 -04:00
Rob Clark
ded5013c4c freedreno/ir3: skip virtual outputs in standalone compiler
Kills get added to the outputs list, to ensure they get scheduled.  But
they aren't *really* outputs so skip them in the header comment block.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-25 10:25:15 -04:00
Mathias Fröhlich
a9c634dded glx: Fix make check.
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=85429.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2014-10-25 15:14:24 +02:00
Mathias Fröhlich
ce61559413 mesa: Add ARB_clip_control.xml to automake.
Adding this makes 'make check' catch failures introduced from
within ARB_clip_control.xml earlier.

Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2014-10-25 15:14:24 +02:00
Rob Clark
d6252d0f63 freedreno/ir3: standalone compiler updates for ir3test
In order to test compiler changes more easily, spit out the assembled
shader with some header information so that we can know about
inputs/outputs more easily.

See: git://people.freedesktop.org/~robclark/ir3test

In ir3test we have a big collection of tgsi shaders and reference
ir3_compiler outputs.  When making compiler changes, regenerate the
compiler outputs and feed to ir3test to compare the new vs reference
shader.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-25 09:08:15 -04:00
Chia-I Wu
762c68b879 ilo: improve blob decoding
The last few dwords were skipped if the total number of dwords was not a
multiple of 4.  Change the formatting for better readability.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-10-25 14:28:08 +08:00
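The kind of bug fixed in 762c68b879 (a tail of fewer than four dwords being dropped) is avoided by clamping the per-line count, as in this sketch. The function name and output format are illustrative and not taken from ilo.

```c
#include <stdint.h>
#include <stdio.h>

/* Print `count` dwords, four per line, including a final partial line.
 * Returns the number of dwords actually printed. */
static unsigned dump_dwords(FILE *out, const uint32_t *dw, unsigned count)
{
   unsigned printed = 0;

   for (unsigned i = 0; i < count; i += 4) {
      /* The last line may hold fewer than four dwords. */
      unsigned n = (count - i < 4) ? count - i : 4;

      fprintf(out, "0x%08x:", i * 4); /* byte offset of this line */
      for (unsigned j = 0; j < n; j++) {
         fprintf(out, " 0x%08x", (unsigned)dw[i + j]);
         printed++;
      }
      fputc('\n', out);
   }
   return printed;
}
```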
Eric Anholt
08599f668c i965: Skip recalculating URB allocations if the entry size didn't change.
We only get here if the VS/GS compiled programs change, but we can even
skip it if the VS/GS size didn't change.

Affects cairo runtime on glamor by -1.26471% +/- 0.674335% (n=234)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-24 23:17:14 -07:00
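The early-out described in 08599f668c amounts to remembering the last entry sizes and returning when they are unchanged. A minimal sketch, with a hypothetical `struct urb_state` and a counter standing in for the real recalculation:

```c
#include <stdbool.h>

/* Hypothetical cached URB state; not the actual i965 structures. */
struct urb_state {
   unsigned vs_entry_size;
   unsigned gs_entry_size;
   bool valid;       /* false until the first calculation */
   unsigned recalcs; /* stands in for the expensive recalculation */
};

static void urb_update(struct urb_state *urb, unsigned vs_size,
                       unsigned gs_size)
{
   /* Nothing to do if both entry sizes match the cached values. */
   if (urb->valid &&
       urb->vs_entry_size == vs_size &&
       urb->gs_entry_size == gs_size)
      return;

   urb->vs_entry_size = vs_size;
   urb->gs_entry_size = gs_size;
   urb->valid = true;
   urb->recalcs++;
}
```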
Andres Gomez
b0e0c26f02 glsl: Standardize names and fix typos
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-24 23:14:04 -07:00
Ian Romanick
7d560a3861 i965: Silence unused parameter warning in brw_dump_ir
Just remove the parameter.  Silences:

brw_program.c: In function 'brw_dump_ir':
brw_program.c:566:33: warning: unused parameter 'brw' [-Wunused-parameter]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-24 19:54:39 -07:00
Ian Romanick
4939c2eced i965: Remove brwIsProgramNative
Originally I just fixed some unused parameter warnings in this
function.  However, Ken pointed out:

    "You could instead remove this driver hook.  If the dd pointer is
    NULL, arbprogram.c will return true.  I think I'd prefer that."

Way, way back in time, I think _mesa_GetProgramivARB had the opposite
behavior.  Given that it works the way it now works, I also prefer
removing the driver hook.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-24 19:54:39 -07:00
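The behaviour Ken describes in 4939c2eced (a NULL driver hook means "native") is the usual optional-callback pattern. The sketch below uses made-up type and function names; it does not reproduce the real `dd_function_table` or arbprogram.c code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical table with one optional driver hook. */
struct driver_hooks {
   bool (*is_program_native)(unsigned target);
};

/* Example hook used only to show the non-default path. */
static bool never_native(unsigned target)
{
   (void)target;
   return false;
}

/* If the driver does not provide the hook, assume the program is native. */
static bool program_is_native(const struct driver_hooks *dd, unsigned target)
{
   if (dd->is_program_native == NULL)
      return true;
   return dd->is_program_native(target);
}
```

A driver whose hook would always return true can therefore simply leave the pointer unset.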
Ian Romanick
66d950464c mesa: Silence unused parameter warning in _mesa_init_shader_program
Just remove the parameter.  Silences:

../../src/mesa/main/uniform_query.cpp:1062:1: warning: unused parameter 'ctx' [-Wunused-parameter]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-24 19:54:39 -07:00
Ian Romanick
99e8a3973f mesa: Remove context parameter from dd_function_table::NewShaderProgram
This fixes some unused parameter warnings introduced by the previous
commit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-24 19:54:39 -07:00
Ian Romanick
c76cc7bab0 mesa: Make _mesa_init_shader_program static
Since a couple commits ago, there is only one caller, and that caller is
in the same file.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-24 19:54:39 -07:00
Ian Romanick
cfe195f901 mesa: Remove context parameter from _mesa_init_shader_program
Silences:

../../src/mesa/main/shaderobj.c: In function '_mesa_init_shader_program':
../../src/mesa/main/shaderobj.c:239:46: warning: unused parameter 'ctx' [-Wunused-parameter]

For now, this adds a couple other unused parameter warnings, but future
patches will clean those up.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-24 19:54:39 -07:00
Ian Romanick
edcba62655 glsl_to_tgsi: Remove st_new_shader
It was identical to the default implementation in _mesa_new_shader.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
2014-10-24 19:54:39 -07:00
Ian Romanick
deee3b0f9e glsl_to_tgsi: Remove st_new_shader_program
It was identical to the default implementation in
_mesa_new_shader_program.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
2014-10-24 19:54:39 -07:00
Ian Romanick
a2dc16ed81 i965: Remove brw_new_shader_program
It was identical to the default implementation in
_mesa_new_shader_program.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-24 19:54:39 -07:00
Ian Romanick
9cdf2f78fc mesa: Silence unused parameter warning in _mesa_clear_shader_program_data
Just remove the parameter.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-24 19:54:38 -07:00
Ian Romanick
fefead3b63 linker: Rely on _mesa_clear_shader_program_data to clear link information
_mesa_link_shader_program already calls _mesa_clear_shader_program_data
before calling link_shaders, so this is already done.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-24 19:54:38 -07:00
Ian Romanick
7cbcff0606 mesa: Add some missing clean-up to _mesa_clear_shader_program_data
All of this is already done in link_shaders.  More clean-ups coming.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-24 19:54:38 -07:00
Ian Romanick
a3bfc7d313 mesa: Remove prototypes for nonexistent functions
_mesa_UseShaderProgramEXT, _mesa_ActiveProgramEXT, and
_mesa_CreateShaderProgramEXT were all removed when support for
GL_EXT_separate_shader_objects was removed.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-24 19:54:38 -07:00
Ian Romanick
1ac924a77d ff_fragment_shader: Silence unused parameter warning in smear
Just remove the parameter.  Silences:

../../src/mesa/main/ff_fragment_shader.cpp:668:1: warning: unused parameter 'p' [-Wunused-parameter]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-24 19:54:38 -07:00
Ian Romanick
3e462d9221 meta: Only use _mesa_ClipControl if the extension is supported
Fixes many piglit failures on IVB since 85edaa8.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85425
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Mathias Fröhlich <Mathias.Froehlich@gmx.net>
2014-10-24 19:24:54 -07:00
Emil Velikov
f9a9054b61 docs: add news item and link release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-25 01:13:11 +00:00
Emil Velikov
95d00f6640 docs: Add sha256 sums for the 10.3.2 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 9599470642)
2014-10-25 01:11:02 +00:00
Emil Velikov
95d31ab54c Add release notes for the 10.3.2 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 3b6a4758fa)
2014-10-25 01:09:55 +00:00
Jason Ekstrand
5d1046291a i965/fs: Compute q-values for register allocation manually
Previously, we were allowing the register allocation code to do the
computation for us in ra_set_finalize.  However, the runtime for this
computation is O(c^4 * g) where c is the number of classes and g is the
number of GRF registers.  These q-values are directly computable
based on the way we lay out our register classes, so there is no need for
the awful runtime algorithm.

We were doing ok until commit 7210583eb where we bumped the number of
register classes from 11 to 16.  While startup times don't normally matter,
this caused piglit to take 4 times as long to run on Bay Trail.  This patch
should make generating the ra_set much faster and melt the piglit run
times.

v2: Fixed a couple of bugs.  I have now verified that the same q-values are
generated both ways.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-24 16:25:31 -07:00
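As a rough sanity check on the numbers in 5d1046291a, the O(c^4 * g) cost model can be evaluated directly for 11 and 16 classes. This only evaluates the stated complexity formula; it says nothing about constant factors or about how the real `ra_set_finalize` behaves, and the function names are invented for the example.

```c
/* Cost model from the commit message: c^4 * g. */
static unsigned long long qvalue_cost(unsigned classes, unsigned grfs)
{
   unsigned long long c = classes;
   return c * c * c * c * grfs;
}

/* How much the modelled cost grows when the class count changes. */
static double cost_ratio(unsigned old_classes, unsigned new_classes,
                         unsigned grfs)
{
   return (double)qvalue_cost(new_classes, grfs) /
          (double)qvalue_cost(old_classes, grfs);
}
```

For 128 registers, going from 11 to 16 classes multiplies the modelled cost by 65536 / 14641, about 4.48, which is in line with the roughly fourfold slowdown reported above.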
Jason Ekstrand
2ec161b239 i965/fs: Don't interfere with too many base registers
On older GENs in SIMD16 mode, we were accidentally building too much
interference into our register classes.  Since everything is divided by 2,
the register allocator thinks we have 64 base registers instead of 128.
The actual GRF mapping still needs to be doubled, but as far as the ra_set
is concerned, we only have 64.  We were accidentally adding way too much
interference.

Signed-off-by: Jason Ekstrand <jason.ekstrand@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-24 16:24:05 -07:00
Jason Ekstrand
ee65f2b50d i965/fs: Properly precolor payload registers on GEN5 in SIMD16
For GEN6 SIMD16 mode, we have to 2-align all the registers, so we only have
the even-numbered ones.  This means that we have to divide the register
number by 2 when we precolor.  This wasn't a problem before because we were
setting up the interference between ra_node registers wrong.  This will be
fixed in the next commit.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-24 16:23:54 -07:00
Jason Ekstrand
1988b71655 i965/fs: Add another use of MAX_VGRF_SIZE
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-24 16:23:24 -07:00
Jason Ekstrand
f84adb8481 util: Use reg_belongs_to_class instead of BITSET_TEST
This shouldn't be a functional change since reg_belongs_to_class is just a
wrapper around BITSET_TEST.  It just makes the code a little easier to
read.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-24 16:23:08 -07:00
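The readability argument in f84adb8481 is the usual one for wrapping a bit test in a named predicate. In the sketch below the macros, the struct layout and the function signature are all assumptions made for illustration; only the name `reg_belongs_to_class` comes from the message.

```c
#include <limits.h>
#include <stdbool.h>

/* Minimal bitset helpers, standing in for the real BITSET_* macros. */
#define SKETCH_WORD_BITS (sizeof(unsigned) * CHAR_BIT)
#define SKETCH_BITSET_TEST(w, b) \
   (((w)[(b) / SKETCH_WORD_BITS] >> ((b) % SKETCH_WORD_BITS)) & 1u)
#define SKETCH_BITSET_SET(w, b) \
   ((w)[(b) / SKETCH_WORD_BITS] |= 1u << ((b) % SKETCH_WORD_BITS))

/* Hypothetical register class: one bit per register it contains
 * (4 words, i.e. 128 registers when unsigned is 32 bits wide). */
struct reg_class {
   unsigned regs[4];
};

/* Named wrapper: states intent instead of exposing the bit test. */
static bool reg_belongs_to_class(unsigned reg, const struct reg_class *c)
{
   return SKETCH_BITSET_TEST(c->regs, reg);
}
```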
José Fonseca
701f739d7f llvmpipe: Ensure the packed input of the lp_test_format is aligned.
Fixes:
- https://bugs.freedesktop.org/show_bug.cgi?id=85377
- http://llvm.org/bugs/show_bug.cgi?id=21365

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-10-24 21:35:23 +01:00
José Fonseca
1ef6d439ba llvmpipe: Flush stdout on lp_test_* unit tests.
So that the order of test messages and gallivm/llvmpipe debug output is
preserved.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-10-24 21:35:09 +01:00
Mathias Fröhlich
5fc0e11053 gallium: Enable ARB_clip_control for gallium drivers.
Gallium should be prepared fine for ARB_clip_control.
So enable this and mention it in the release notes.

v2:
Only enable for drivers announcing the freshly introduced
PIPE_CAP_CLIP_HALFZ capability.

v3:
Use extension enable infrastructure to connect PIPE_CAP_CLIP_HALFZ
with ARB_clip_control.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2014-10-24 19:21:21 +02:00
Mathias Fröhlich
56088131d0 gallium: introduce PIPE_CAP_CLIP_HALFZ.
In preparation of ARB_clip_control. Let the driver decide if
it supports pipe_rasterizer_state::clip_halfz being set to true.

v3:
Initially enable on ilo.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2014-10-24 19:21:21 +02:00
Mathias Fröhlich
85edaa8b72 mesa: Handle clip control in meta operations.
Restore clip control to the default state if MESA_META_VIEWPORT
or MESA_META_DEPTH_TEST is requested.

v3:
Handle clip control state with MESA_META_TRANSFORM.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2014-10-24 19:21:21 +02:00
Mathias Fröhlich
34a3c97fe6 mesa: Implement ARB_clip_control.
Implement the mesa parts of ARB_clip_control.
So far no driver enables this.

v3:
Restrict getting clip control state to the availability
of ARB_clip_control.
Move to transformation state.
Handle clip control state with the GL_TRANSFORM_BIT.
Move _FrontBit update into state.c.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2014-10-24 19:21:21 +02:00
Mathias Fröhlich
6340e609a3 mesa: Refactor viewport transform computation.
This is for preparation of ARB_clip_control.

v3:
Add comments.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2014-10-24 19:21:20 +02:00
Eric Anholt
8c7ac377b7 vc4: Reuse uniform_data/contents indices when making uniforms.
This allows vc4_opt_cse.c to CSE-away operations involving the same
uniform values.

total instructions in shared programs: 37341 -> 36906 (-1.16%)
instructions in affected programs:     10233 -> 9798 (-4.25%)
total uniforms in shared programs: 10523 -> 10320 (-1.93%)
uniforms in affected programs:     2467 -> 2264 (-8.23%)
2014-10-24 18:04:26 +01:00
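The reuse described in 8c7ac377b7 can be pictured as a lookup before append: if a uniform with the same contents and data already exists, its index is returned, so two uses refer to the same value and CSE can match them. The table layout, size limit and function name below are assumptions, not the vc4 code.

```c
#include <stdint.h>

#define SKETCH_MAX_UNIFORMS 64

/* Hypothetical uniform table: parallel arrays of contents kind and data. */
struct uniform_table {
   unsigned count;
   int contents[SKETCH_MAX_UNIFORMS];
   uint32_t data[SKETCH_MAX_UNIFORMS];
};

/* Return the index of the (contents, data) pair, reusing an existing
 * entry when one matches. Returns -1 if the table is full. */
static int get_uniform_index(struct uniform_table *t, int contents,
                             uint32_t data)
{
   for (unsigned i = 0; i < t->count; i++) {
      if (t->contents[i] == contents && t->data[i] == data)
         return (int)i;
   }

   if (t->count == SKETCH_MAX_UNIFORMS)
      return -1;

   t->contents[t->count] = contents;
   t->data[t->count] = data;
   return (int)t->count++;
}
```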
Eric Anholt
18ccda7b86 vc4: When asked to discard-map a whole resource, discard it.
This saves a bunch of extra flushes when texsubimaging a whole texture
that's been used for rendering, or subdataing a whole BO.  In particular,
this massively reduces the runtime of piglit texture-packed-formats (when
the probes have been moved out of the inner loop).
2014-10-24 18:04:26 +01:00
Eric Anholt
a71c3b885a vc4: Refactor flushing before mapping a BO.
I'm going to want to make some other decisions here before flushing.
2014-10-24 18:04:26 +01:00
Eric Anholt
52824811b9 vc4: Allow dead code elimination of unused varyings.
total instructions in shared programs: 39022 -> 37341 (-4.31%)
instructions in affected programs:     26979 -> 25298 (-6.23%)
total uniforms in shared programs: 11242 -> 10523 (-6.40%)
uniforms in affected programs:     5836 -> 5117 (-12.32%)
2014-10-24 18:04:26 +01:00
Eric Anholt
5d32e26335 vc4: Add debug output to match shaderdb info to program dumps.
I'm going to be using VC4_DEBUG=shaderdb,norast to do shaderdb stats, but
when debugging regressions, I want to match shaderdb output to shader
disassembly.
2014-10-24 18:04:26 +01:00
Andreas Boll
14bdcc6ff9 radeon: enable Hyper-Z on r600g and radeonsi by default
This reverts commit 01e6371149.
Since then many Hyper-Z issues have been fixed or worked around.

Enable Hyper-Z by default so that we get enough feedback for the upcoming
mesa 10.4 release.

If you have issues with Hyper-Z try to disable Hyper-Z using the environment
variable R600_DEBUG=nohyperz and please report the issue on the bugtracker.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75011
See also: https://bugs.freedesktop.org/show_bug.cgi?id=75112

Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-24 09:11:51 +02:00
Matt Turner
76f27a6b03 i965: Silence unused variable warning. 2014-10-23 16:20:07 -07:00
Matt Turner
40492be2a4 i965/fs: Silence uninitialized variable warning.
The compiler isn't privy to the knowledge that we're doing at least one
framebuffer write.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-10-23 16:20:07 -07:00
Matt Turner
2695891088 util: Add assume() macro.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-23 16:20:07 -07:00
Jan Vesely
bbe93161e7 glapi: Fix compiler warning and script name
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-23 16:03:16 +01:00
Rob Clark
4f1fec6060 Revert "freedreno/a3xx: only emit dirty consts"
This reverts commit 94bb33617d.

Which somehow broke gnome-shell.. and needs more investigation.  For
now, revert..
2014-10-23 10:46:51 -04:00
Rob Clark
6eabc11936 freedreno: fix PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE
fd_bo_cpu_prep() doesn't realize the bo is already referenced in
unflushed cmdstream.  It could be made to do so (but would have to be
implemented twice, ie. both for msm and kgsl).  But we still can't do
the expected thing if the caller isn't using _NOSYNC.  Because of the
way the tiling works, we need to build quite a bit of cmdstream at flush
time, which is not possible to do at the libdrm level.

So rather than trying to make fd_bo_cpu_prep() smarter than it can
possibly be, just *always* discard and reallocate if the
PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE flag is set.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-23 10:46:51 -04:00
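The approach taken in 6eabc11936 (always swap in fresh storage when the whole resource is being discarded) can be sketched as below. The flag value, `struct buffer` and `buffer_map` are hypothetical, and the sketch frees the old storage immediately, whereas a real driver would keep it alive until the GPU has finished with it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Placeholder for the discard-whole-resource transfer flag. */
#define SKETCH_DISCARD_WHOLE_RESOURCE 0x1u

struct buffer {
   void *storage;
   size_t size;
   bool busy; /* true while unflushed commands reference the storage */
};

/* Map the buffer. With the discard flag, the old contents are not
 * needed, so new storage is allocated instead of waiting on the old. */
static void *buffer_map(struct buffer *buf, unsigned flags)
{
   if (flags & SKETCH_DISCARD_WHOLE_RESOURCE) {
      void *fresh = malloc(buf->size);
      if (fresh == NULL)
         return NULL;
      free(buf->storage); /* simplification: see note above */
      buf->storage = fresh;
      buf->busy = false;
   }
   /* Without the flag a real driver would flush and wait here. */
   return buf->storage;
}
```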
Jan Vesely
ab53830b95 clover: Require libelf
v2: test for libelf once, check in both radeon and clover

CC: Tom Stellard <tom@stellard.net>
CC: Emil Velikov <emil.l.velikov@gmail.com>
CC: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-23 15:19:00 +01:00
Emil Velikov
b4039cf15a clover: use correct typenames for compat::pair's first/second
Seems to be a typo judging from the overall declaration of the
template.

Cc: EdB <edb+mesa@sigluy.net>
Cc: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-23 15:18:12 +01:00
Emil Velikov
c63eb5dd5e auxiliary/os: get the mmap/munmap wrappers working with android
- Use macro for munmap under Android - the STATIC_ASSERT uses
an off_t, which is not used under Android for mmap. As loff_t size
does not vary the way off_t does, just ignore the assert.

 - Wrap the long lines to improve readability.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-23 15:18:11 +01:00
Mauro Rossi
417b17378a gallium/nouveau: fully build the driver under android
Fix the trivial typo in the variable name.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-10-23 15:18:11 +01:00
Alon Levy
d897e7c34a mesa/shaderimage.c: fix inconsistent sign warning
Signed-off-by: Alon Levy <alevy@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-10-23 14:45:41 +01:00
Alon Levy
501baa6bbb wgl: stw_pixelformat_get_info: correct type for index variable
Signed-off-by: Alon Levy <alevy@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-10-23 14:45:40 +01:00
Alon Levy
23080e49c4 u_math.h: fix 64 to 32 bit truncation warning
Signed-off-by: Alon Levy <alevy@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-10-23 14:45:40 +01:00
José Fonseca
75ad4fe78e gallivm: Fix build with LLVM 3.3.
The setMCJITMemoryManager method doesn't exist in LLVM 3.3.

I thought I had tested the latest version of my earlier change with LLVM
3.3, but it looks like I missed it.

Trivial.
2014-10-23 10:42:12 +01:00
José Fonseca
065256dfc4 gallivm: Properly update for removal of JITMemoryManager in LLVM 3.6.
JITMemoryManager was removed in LLVM 3.6, and replaced by its base class
RTDyldMemoryManager.

This change fixes our JIT memory manager specializations to derive from
RTDyldMemoryManager in LLVM 3.6 instead of JITMemoryManager.

This enables llvmpipe to run with LLVM 3.6.

However, lp_free_generated_code is basically a no-op because there are
not enough hook points in RTDyldMemoryManager to track and free the code
of a module.  In other words, with MCJIT, code once created, stays
forever allocated until process destruction.  This is not specific to
LLVM 3.6 -- it will happen whenever MCJIT is used regardless of version.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-10-23 10:19:33 +01:00
José Fonseca
3fd220e2eb gallivm: Fix white-space.
Replace tabs with spaces.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-10-23 10:19:33 +01:00
José Fonseca
013ff2fae1 gallivm,llvmpipe,clover: Bump required LLVM version to 3.3.
We'll need to update gallivm for the interface changes in LLVM 3.6, and
the fewer the number of older LLVM versions we support the less hairy that
will be.

As a consequence, the HAVE_AVX define can disappear.  (Note HAVE_AVX meant
whether LLVM version supports AVX or not.  Runtime support for AVX is
always checked and enforced independently.)

Verified llvmpipe builds and runs with LLVM 3.3, 3.4, and 3.5.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-10-23 10:18:56 +01:00
Ilia Mirkin
9ad80d1d18 mesa: remove conditional render and rgtc from ES3 requirements
The functionality exposed by those extensions does not appear in ES3

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-23 00:45:08 -04:00
Brian Paul
c9a6ec1978 u_blitter: put a comment on util_blitter_cache_all_shaders()
Trivial.
2014-10-22 17:33:40 -06:00
Brian Paul
f82a84c097 u_blitter: use ctx->bind_fs_state(), not pipe->bind_fs_state()
Consistently use the function pointer we saved earlier.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-22 17:33:40 -06:00
Brian Paul
0bcd9f5469 u_blitter: create basic fs shaders in util_blitter_cache_all_shaders()
We need to create all fs shaders in this function.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-22 17:33:40 -06:00
Brian Paul
27de89d266 u_blitter: do error checking assertions for shader caching
If the user calls util_blitter_cache_all_shaders(), set a flag and assert
that we never try to create any new fragment shaders after that point.
If the assertion fails, it means we missed generating some shader in
util_blitter_cache_all_shaders().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-22 17:33:40 -06:00
Anuj Phogat
7a652c41b4 glsl: Use signed array index in update_max_array_access()
Avoids a crash when a negative array index is used in a
shader program.
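
A minimal C sketch of why the signedness matters. The function names are
hypothetical and this is not Mesa's actual update_max_array_access(); it
only assumes that the function keeps track of the highest index seen.
With an unsigned index, -2 wraps around to a huge value and looks like
the largest access so far.

```c
#include <assert.h>

/* Unsigned variant: a negative index converted to unsigned wraps around
 * and is recorded as an enormous "maximum" access. */
static unsigned update_max_unsigned(unsigned current_max, unsigned idx)
{
   return idx > current_max ? idx : current_max;
}

/* Signed variant: a negative index never raises the recorded maximum. */
static int update_max_signed(int current_max, int idx)
{
   assert(current_max >= 0);
   return idx > current_max ? idx : current_max;
}
```

Calling update_max_signed(3, -2) leaves the maximum at 3, while the
unsigned variant would record the wrapped value.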

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-22 16:13:37 -07:00
Anuj Phogat
6f0089e92e glsl: Fix crash due to negative array index
Currently Mesa crashes with a shader like this:

[fragment shader]
float[5] array;
int idx = -2;
void main()
{
   gl_FragColor = vec4(0.0, 1.0, 0.0, array[idx]);
}

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-22 16:13:37 -07:00
Marek Olšák
8ec40adf7e radeonsi: implement pipe_rasterizer_state::clip_halfz
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-10-22 21:05:00 +02:00
Marek Olšák
a3591da1a0 r600g: implement pipe_rasterizer_state::clip_halfz
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-10-22 21:04:58 +02:00
Marek Olšák
8ddd2f7aee r300g: implement pipe_rasterizer_state::clip_halfz
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-10-22 21:04:56 +02:00
Michel Dänzer
ae879718c4 r600g: Drop references to destroyed blend state
Fixes use-after-free when the currently bound blend state is destroyed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85267
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84140

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

Cc: mesa-stable@lists.freedesktop.org
2014-10-22 17:09:43 +09:00
Kenneth Graunke
6dc6e6e0d9 i965/vec4: Generate better code for ir_triop_csel.
Previously, we generated an extra CMP instruction:

   cmp.ge.f0(8)    g6<1>D          g1<0,4,1>F      0F
   cmp.nz.f0(8)    null            g6<4,4,1>D      0D
   (+f0) sel(8)    g5<1>F          g1.4<0,4,1>F    g2<0,4,1>F

The first operand is always a boolean, and we want to predicate the SEL
on that.  Rather than producing a boolean value and comparing it against
zero, we can just produce a condition code in the flag register.

Now we generate:

   cmp.ge.f0(8)    null            g1<0,4,1>F      0F
   (+f0) sel(8)    g5<1>F          g1.4<0,4,1>F    g2<0,4,1>F

No difference in shader-db.

v2: Remember to delete the old code (thanks Matt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-21 21:14:03 -07:00
Kenneth Graunke
f5c3f095b9 i965/vec4: Simplify visit(ir_expression *)'s result_src/dst setup.
Using dst_reg(this, ir->type) automatically sets the writemask to the
proper size for the type; src_reg(dst_reg) preserves that.  This should
be equivalent, but less code.

Note that src_reg(dst_reg) either uses SWIZZLE_XXXX or SWIZZLE_XYZW, so
the old code did need the manual writemask adjustment, since it
constructed the registers the other way around.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-21 21:14:00 -07:00
Kenneth Graunke
cb36e79f96 i965/vec4: Delete some dead code in visit(ir_expression *).
Nothing uses the vector_elements temporary variable.

Setting this->result.file is dead because we overwrite this->result a
few lines later.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-21 21:13:37 -07:00
Kenneth Graunke
4d34c4b582 i965/fs: Generate better code for ir_triop_csel.
Previously, we generated an extra CMP instruction:

   cmp.ge.f0(8)   g4<1>D          g2<0,1,0>F      0F
   cmp.nz.f0(8)   null            g4<8,8,1>D      0D
   (+f0) sel(8)   g120<1>F        g2.4<0,1,0>F    g3<0,1,0>F

The first operand is always a boolean, and we want to predicate the SEL
on that.  Rather than producing a boolean value and comparing it against
zero, we can just produce a condition code in the flag register.

Now we generate:

   cmp.ge.f0(8)    null            g2<0,1,0>F      0F
   (+f0) sel(8)    g124<1>F        g2.4<0,1,0>F    g3<0,1,0>F

total instructions in shared programs: 5473459 -> 5473253 (-0.00%)
instructions in affected programs:     6219 -> 6013 (-3.31%)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-21 21:13:37 -07:00
Kenneth Graunke
32364a1fe5 glsl: Delete unused gl_uniform_driver_format enum values.
A while back, Matt made the uniform upload functions simply upload
ctx->Const.UniformBooleanTrue for boolean values instead of 0/1, which
removed the need to convert it later.  We also set UniformBooleanTrue to
1.0f for drivers which want to treat booleans as 0.0/1.0f.

Nothing ever sets these, so they are dead.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-21 18:53:13 -07:00
Rob Clark
36310d9d56 freedreno/a3xx: fix depth/stencil restore format
Also fix z16 restore format which was completely wrong.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-21 20:08:49 -04:00
Rob Clark
2bc2ab66d9 freedreno/a3xx: fix viewport state during clear
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-21 20:08:49 -04:00
Rob Clark
3eb8289aa4 freedreno: mark scissor state dirty when enable bit changes
We don't have a scissor enable bit in hw, so when a raster state change
results in scissor enable bit changing, we need to also mark scissor
state as dirty.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-21 20:08:49 -04:00
Rob Clark
01b757e2b0 freedreno: clear vs scissor
The optimization of avoiding restore (mem2gmem) if there was a clear
falls down a bit if you don't have a fullscreen scissor.  We need to
make the decision logic a bit more clever to keep track of *what* was
cleared, so that we can (a) completely skip mem2gmem if entire buffer
was cleared, or (b) skip mem2gmem on a per-tile basis for tiles that
were completely cleared.
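
A rough sketch of the per-tile decision in case (b), assuming the driver
remembers the cleared region as a rectangle. The struct and function
names are hypothetical and do not come from the freedreno source.

```c
#include <assert.h>
#include <stdbool.h>

/* Half-open rectangle: covers x0 <= x < x1 and y0 <= y < y1. */
struct rect {
   int x0, y0, x1, y1;
};

/* True if 'inner' lies entirely within 'outer'. */
static bool rect_contains(const struct rect *outer, const struct rect *inner)
{
   assert(outer->x0 <= outer->x1 && outer->y0 <= outer->y1);
   assert(inner->x0 <= inner->x1 && inner->y0 <= inner->y1);
   return inner->x0 >= outer->x0 && inner->y0 >= outer->y0 &&
          inner->x1 <= outer->x1 && inner->y1 <= outer->y1;
}

/* A tile needs a restore (mem2gmem) unless the clear covered all of it. */
static bool tile_needs_restore(const struct rect *cleared,
                               const struct rect *tile)
{
   return !rect_contains(cleared, tile);
}
```

Passing the whole buffer as the "tile" gives case (a): if the clear
covered the entire buffer, no restore is needed at all.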

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-21 20:08:49 -04:00
Vinson Lee
1ab6543431 clover: Fix build error with LLVM 3.4.
DataLayoutPass was added in LLVM 3.5 r202168, commit
57edc9d4ff1648568a5dd7e9958649065b260dca "Make DataLayout a plain
object, not a pass.".

This patch fixes this build error with LLVM 3.4.

  CXX      llvm/libclllvm_la-invocation.lo
llvm/invocation.cpp: In function 'void {anonymous}::optimize(llvm::Module*, unsigned int, const std::vector<llvm::Function*>&)':
llvm/invocation.cpp:324:18: error: expected type-specifier
       PM.add(new llvm::DataLayoutPass(mod));
                  ^

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85189
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-10-21 15:40:47 -07:00
Marek Olšák
43b2432368 r600g,radeonsi: convert TGSI shader type to LLVM shader type
The values are hardcoded in the LLVM backend, but the TGSI definitions are
going to be changed with tessellation, e.g. TGSI_PROCESSOR_COMPUTE will be
increased by 2.

We'll use VS for LS and HS, because there's nothing special about them
from the LLVM backend point of view, even though the hardware side is
different. We do the same for ES.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:39:50 +02:00
Marek Olšák
c5a44cf3f8 radeonsi: add some missing register definitions
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:39:50 +02:00
Marek Olšák
fc3b3354d7 radeonsi: load ring resource descriptors only once
v2: document the new functions

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:39:35 +02:00
Marek Olšák
d787608957 radeonsi: clarify shader constant load functions
I'll need indexed loads without the meta data flag for tessellation later.
Also rename load_const to buffer_load_const to distinguish it from indexed
const loads.

v2: add comments

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:35:44 +02:00
Marek Olšák
55a9b778c8 radeonsi: statically declare resource and sampler arrays
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:48 +02:00
Marek Olšák
e827bb6fe7 radeonsi: remove conversion of DX9 FACE input to GL
st/mesa and gallium expect the DX9 format, so this is useless.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:41 +02:00
Marek Olšák
a18f803a86 radeonsi: revert hack for random failures in glsl-max-varyings
This reverts commit 032e5548b3.

I've run glsl-max-varyings 30 times and it always passed.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:29 +02:00
Marek Olšák
b9b0973db2 radeonsi: generate shader pm4 states right after shader compilation
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:26 +02:00
Marek Olšák
c94af8f0d7 radeonsi: make pm4 state generation for shaders independent of the context
The si_pm4_delete_state calls became useless, because the pm4 state is
always generated only once.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:22 +02:00
Marek Olšák
139bde061a radeonsi: inline si_pm4_alloc_state
It seemed like the function needed a context pointer. Let's remove it
to make it less confusing.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:15 +02:00
Marek Olšák
22c5886f3f r300g: replace r300_get_num_samples with a util variant
2014-10-21 22:03:55 +02:00
Marek Olšák
013850a1b7 glsl_to_tgsi: use _mesa_copy_linked_program_data
This deduplicates some code.
2014-10-21 22:01:16 +02:00
Marek Olšák
9ec305ead7 glsl_to_tgsi: fix the value of gl_FrontFacing with native integers
We must convert it to boolean from the DX9 float encoding that Gallium
specifies.

Later, we should probably define that FACE should be 0 or ~0 if native
integers are supported.

Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2014-10-21 22:01:16 +02:00
Marek Olšák
e8764a4673 st/mesa: add ST_DEBUG=wf option which enables wireframe rendering
Useful for tessellation.
2014-10-21 22:01:16 +02:00
Marek Olšák
5f5b83cbba gallium: add PIPE_SHADER_CAP_MAX_OUTPUTS and use it in st/mesa
With 5 shader stages and various combinations of enabled and disabled shaders,
the maximum number of outputs in one shader doesn't have to be equal to
the maximum number of inputs in the following shader.

v2: return 32 for softpipe and llvmpipe
2014-10-21 21:59:02 +02:00
Eric Anholt
ef280c95f2 vc4: Fix SRC_ALPHA_SATURATE blending.
Fixes glean blendFunc.
2014-10-21 15:46:48 +01:00
Eric Anholt
cc298023c9 vc4: Fix stencil writemask handling.
If the writemask doesn't compress, then we want to put in the uncompressed
writemask, not the compressed writemask failure value (all-on).

Fixes glean's stencil2 and fbo-clear-formats on stencil.
2014-10-21 15:16:41 +01:00
Eric Anholt
48f6351940 vc4: Don't look at back stencil state unless two-sided stencil is enabled.
Fixes regressions in the next bugfix, because gallium util stuff leaves
the back stencil state as 0 if !back->enabled.
2014-10-21 15:16:41 +01:00
Rob Clark
4f17e026bb freedreno/ir3: add debug flag to disable cp
FD_MESA_DEBUG=nocp will disable copy propagation pass.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Ilia Mirkin
f0ca26725e freedreno: positions come out as integers, not half-integers
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Rob Clark
3fcb021201 freedreno/a3xx: disable early-z when we have kill's
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Rob Clark
8a0ffedd8d freedreno/ir3: fix potential gpu lockup with kill
It seems like the hardware is unhappy if we execute a kill instruction
prior to last input (ei).  Probably the shader thread stops executing
and the end-input flag is never set.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Rob Clark
ab33a24089 freedreno/ir3: comment + better fxn name
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Rob Clark
94bb33617d freedreno/a3xx: only emit dirty consts
If an app only updates (for example) vertex uniforms, it would be nice to
only re-emit those and not also frag uniforms.  This means we need to mark
the first frag shader const buffer dirty after a clear.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Rob Clark
74069e324e freedreno/a3xx: more layer/level fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Brian Paul
aafbd89c5e mesa: fix 'feeedback' typo in comment
Trivial.
2014-10-20 11:53:34 -06:00
Brian Paul
4676c6c25b mesa: fix 'misalgned' typos in error messages
Trivial.
2014-10-20 11:50:49 -06:00
Brian Paul
14379a0644 glsl: fix several use-after-free bugs
The get_variable_being_redeclared() function can free the 'var' argument.
Thereafter, we cannot assume that 'var' is a valid pointer.  This patch
replaces 'var->name' with 'earlier->name' in two places and calls
is_gl_identifier(var->name) before 'var' might get freed.
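
The pattern can be sketched in plain C as follows. This is an
illustration only: struct variable, get_redeclared() and declare() are
hypothetical stand-ins, not the GLSL compiler's real types or signatures.
Everything that is needed from 'var' is read before the call that may
free it, and the surviving 'earlier' object is used afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct variable {
   char name[32];
};

/* Stand-in for a function that may free its argument: if an earlier
 * declaration exists, the new one is freed and the earlier one returned. */
static struct variable *get_redeclared(struct variable *var,
                                       struct variable *earlier)
{
   if (earlier != NULL) {
      free(var);
      return earlier;
   }
   return NULL;
}

static bool is_gl_identifier(const char *name)
{
   return strncmp(name, "gl_", 3) == 0;
}

/* Returns whether the declared name is a gl_ identifier and reports the
 * name of whichever variable object is still alive. */
static bool declare(struct variable *var, struct variable *existing,
                    const char **final_name)
{
   assert(var != NULL && final_name != NULL);

   /* Query 'var' before the call that might free it. */
   const bool var_is_gl_id = is_gl_identifier(var->name);

   struct variable *earlier = get_redeclared(var, existing);

   /* 'var' is only valid here if no earlier declaration was found. */
   *final_name = earlier != NULL ? earlier->name : var->name;
   return var_is_gl_id;
}
```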

This fixes several piglit GLSL crashes, including:
spec/glsl-1.50/execution/geometry/clip-distance-in-param
spec/glsl-1.50/execution/geometry/clip-distance-bulk-copy
spec/glsl-1.50/compiler/gs-redeclares-pervertex-out-before-global-redeclaration.geom

I'm not sure why these were not spotted sooner.
A similar bug was previously fixed by f9cecca7a.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-20 08:59:32 -06:00
Tapani Pälli
953a0af8e3 mesa: validate sampler uniforms during gluniform calls
Patch fixes 'glsl-2types-of-textures-on-same-unit' in the WebGL conformance
test suite. No Piglit regressions, fixes gl-2.0-active-sampler-conflict.

To avoid adding a potentially heavy check during draw (valid_to_render),
the check is done during uniform updates by inspecting the TexturesUsed mask.

A new boolean variable is introduced to cache validation state.

v2: take into account the case where 2 uniforms use the same unit (curro);
    also do the check only when SSO is not in use, since SSO has its own
    path for sampler validation.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-20 11:07:12 +03:00
EdB
01d94193ac clover: Don't return CL_INVALID_VALUE if there is no header.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-20 10:35:10 +03:00
EdB
aa93af809f clover: Add allow_empty_tag.
To allow empty objs() list checks.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-20 10:35:10 +03:00
EdB
611d66fe45 clover: Add initial implementation of clCompileProgram for CL 1.2.
[ Francisco Jerez: General clean-up. ]

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-20 10:34:51 +03:00
EdB
fead2b0463 clover: Add a simple compat::pair.
std::pair is not c++98/c++11 safe.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-20 10:33:02 +03:00
Francisco Jerez
5583459655 clover/util: Allow using key_equals with pair-like objects other than std::pair.
2014-10-20 10:33:02 +03:00
Francisco Jerez
e987fd5dc6 clover/util: Define equality operators for a couple of compat classes.
2014-10-20 10:33:01 +03:00
Francisco Jerez
1441a3c1bb clover/util: Fix construction of compat::vector with a general container as argument.
2014-10-20 10:33:01 +03:00
Tapani Pälli
73dd50acf6 glsl: implement switch flow control using a loop
This patch removes the old variable-based logic for handling a break
inside a switch. The switch is put inside a loop so that the existing
infrastructure for loop flow control can be used for it; as a result,
dead code elimination now also works properly.

A 'continue' inside a switch now needs special handling, which is done
by detecting the continue, breaking out of the switch loop, and calling
continue for the outside loop.
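
The idea can be illustrated in C. This is a sketch of the lowering
strategy only, not the IR that the GLSL compiler generates; the function
names and the hit_continue flag are hypothetical, and case fallthrough is
not modeled. The first function uses an ordinary switch; the second
expresses the same control flow with a single-iteration loop, where
'break' leaves the loop and 'continue' is turned into a flag plus a
break.

```c
#include <assert.h>
#include <stdbool.h>

/* Reference: an ordinary switch nested in a loop. */
static int sum_with_switch(int n)
{
   int sum = 0;
   for (int i = 0; i < n; i++) {
      switch (i % 3) {
      case 0:
         continue;   /* skips the "sum += 1" below */
      case 1:
         break;
      default:
         sum += 10;
         break;
      }
      sum += 1;
   }
   return sum;
}

/* Lowered form: the switch body becomes a loop that runs once. */
static int sum_with_lowered_switch(int n)
{
   int sum = 0;
   assert(n >= 0);
   for (int i = 0; i < n; i++) {
      bool hit_continue = false;
      do {
         const int selector = i % 3;
         if (selector == 0) {
            /* 'continue' in the switch: remember it and break out. */
            hit_continue = true;
            break;
         }
         if (selector == 1)
            break;   /* 'break' in the switch is a plain loop break */
         sum += 10;
      } while (false);
      if (hit_continue)
         continue;   /* re-issued for the outside loop */
      sum += 1;
   }
   return sum;
}
```

Both functions return the same value for any non-negative n.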

v2: remove one unnecessary ir_expression (Curro)

Fixes following Piglit tests:

   fs-exec-after-break.shader_test
   fs-conditional-break.shader_test

No Piglit or es3conform regressions.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-20 07:55:58 +03:00
Eric Anholt
6212d2402d vc4: Translate 4-byte index buffers to 2 bytes.
Fixes assertion failures in 14 piglit tests (half of which now pass).
2014-10-19 08:44:56 +01:00
Eric Anholt
572fba95e4 vc4: Add support for rebasing texture levels so firstlevel == 0.
GLES2 doesn't have GL_TEXTURE_BASE_LEVEL, so the hardware doesn't.  Fixes
piglit levelclamp, tex-miplevel-selection, and texture-storage/2D mipmap
rendering.
2014-10-19 08:42:33 +01:00
Eric Anholt
15eb4c59f6 vc4: Apply a Newton-Raphson step to improve RSQ
Fixes all the piglit built-in-functions/*sqrt tests, among others.
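
For reference, the textbook Newton-Raphson refinement for a reciprocal
square root looks like this in C. It is a sketch of the general formula,
not the instruction sequence the vc4 driver emits; refine_rsq() is a
hypothetical name and y0 stands for the hardware's rough estimate.

```c
#include <assert.h>

/* One Newton-Raphson step for 1/sqrt(a), starting from estimate y0.
 * Derived from f(y) = 1/y^2 - a:  y1 = y0 * (1.5 - 0.5 * a * y0 * y0). */
static float refine_rsq(float a, float y0)
{
   assert(a > 0.0f);
   return y0 * (1.5f - 0.5f * a * y0 * y0);
}
```

For a = 4 and a rough estimate of 0.45, one step moves the result to
about 0.4928, closer to the exact value 0.5.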
2014-10-18 10:08:59 +01:00
Eric Anholt
1fc124b80f vc4: Apply a Newton-Raphson step to improve RCP.
Fixes all the piglit floating-point *-op-div tests, among others.
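
For reference, the textbook Newton-Raphson refinement for a reciprocal
looks like this in C. It is a sketch of the general formula, not the
instruction sequence the vc4 driver emits; refine_rcp() is a hypothetical
name and x0 stands for the hardware's rough estimate.

```c
#include <assert.h>

/* One Newton-Raphson step for 1/a, starting from estimate x0.
 * Derived from f(x) = 1/x - a:  x1 = x0 * (2 - a * x0). */
static float refine_rcp(float a, float x0)
{
   assert(a != 0.0f);
   return x0 * (2.0f - a * x0);
}
```

For a = 3 and a rough estimate of 0.3, one step gives about 0.33, which
cuts the error against 1/3 by roughly a factor of ten.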
2014-10-18 10:08:59 +01:00
Eric Anholt
0fdc5111b4 vc4: Add a little bit more packet parsing to make dump reading easier.
Probably should have done this *before* staring at all those render lists
today.
2014-10-18 10:08:59 +01:00
Chris Forbes
81041c4a4a meta/msaa-blit: consider weird sample count case unreachable
Suppresses a bunch of warning noise about sample_map possibly being used
uninitialized.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-10-18 19:09:28 +13:00
Jason Ekstrand
4656c14e57 i965/fs: Change the type of booleans to UD and emit correct immediates
Before, we used a signed d-word for booleans, and the immediates we
emitted varied between signed and unsigned.  This commit changes the type
to unsigned (I think that makes more sense) and makes immediates more
consistent.  This allows copy propagation to work better and cleans up
some instructions.

total instructions in shared programs: 5473519 -> 5465864 (-0.14%)
instructions in affected programs:     432849 -> 425194 (-1.77%)
GAINED:                                27
LOST:                                  0

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-17 13:36:48 -07:00
Kenneth Graunke
ffe582aa20 i965/fs: Don't pass ir_variable * to emit_sampleid_setup().
gl_SampleID is a built-in variable that always is of type "int".

Suggested by Connor Abbott.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2014-10-17 13:03:18 -07:00
Eric Anholt
9ebfb3014e vc4: Make some assertions about how many flushes/EOFs the simulator sees.
This caught the previous commit's bug in the kernel validator.
2014-10-17 13:13:43 +01:00
Eric Anholt
1f7048419e vc4: Fix accidental dropping of the low bits of the store tilebuffer packet.
Notably this included the EOF flag (the other bits are the full buffer
dump selection, but we don't do full dumps), which caused the kernel
checking for frame completion to trigger.
2014-10-17 13:09:29 +01:00
Eric Anholt
afc3aa373d vc4: Set the primitive list format at the start of rendering.
The other driver does this manually before calling into each tile, but we
can just let it get binned into the tiles (saving repeated kernel
validation on the packet).

Fixes simulator assertion failures on polygon-mode and non-auto texwrap.
2014-10-17 13:09:28 +01:00
Eric Anholt
895c904103 vc4: Replace the FLUSH_ALL with FLUSH.
We don't need to emit all of our current state at the end of each bin
list.  We're going to be smashing it all at the start of the next tile's
bin list, anyway.
2014-10-17 13:09:28 +01:00
Eric Anholt
000976ed99 vc4: Add some comments about state management.
2014-10-17 13:09:28 +01:00
Eric Anholt
135287db17 vc4: Make sure there's exactly 1 tile store per tile coords packet.
It's not documented that I can see, but the other driver does it (check
vg_hw_4.c), and one of the HW guys confirmed that you really do need to do
it.
2014-10-17 13:09:25 +01:00
Michel Dänzer
c4db733fac winsys/radeon: Use a single buffer cache manager again
The trick is to generate a unique buffer usage value for each possible
combination of domains and flags, with only one bit set each for the
domains and flags. This ensures pb_check_usage() only returns TRUE when
the domains and flags the cached buffer was created for exactly match
the requested ones.
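
A sketch of the encoding idea in C, assuming that the usage check is a
subset test (requested bits must all be present in the provided bits).
check_usage(), encode_usage() and the combination counts are hypothetical
and only illustrate the trick; they are not the winsys code.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_DOMAIN_COMBOS 4u   /* hypothetical: 2 domain bits, 4 combinations */
#define NUM_FLAG_COMBOS   4u   /* hypothetical: 2 flag bits, 4 combinations */

/* Assumed subset-style check: every requested bit must be provided. */
static bool check_usage(unsigned requested, unsigned provided)
{
   return (requested & provided) == requested;
}

/* One-hot encoding: exactly one bit identifies the domain combination
 * and exactly one bit identifies the flag combination. */
static unsigned encode_usage(unsigned domains, unsigned flags)
{
   assert(domains < NUM_DOMAIN_COMBOS && flags < NUM_FLAG_COMBOS);
   return (1u << domains) | (1u << (NUM_DOMAIN_COMBOS + flags));
}
```

With raw bitmasks, a request for domain mask 1 would match a cached
buffer created with domain mask 3. After one-hot encoding, the two values
share no domain bit, so the subset test only succeeds on an exact match.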

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-17 17:09:49 +09:00
Tom Stellard
e1d363b3ff clover: Add environment variables for dumping kernel code v2
There are two debug variables:

CLOVER_DEBUG which you can set to any combination of llvm,clc,asm
(separated by commas) to dump llvm IR, OpenCL C, and native assembly.

CLOVER_DEBUG_FILE which you can set to a file name for dumping output
instead of stderr.  If you set this variable, the output will be split
into three separate files with different suffixes: .cl for OpenCL C,
.ll for LLVM IR, and .asm for native assembly.  Note that when data
is written, it is always appended to the files.

v2:
  - Code cleanups
  - Add CLOVER_DEBUG_FILE environment variable for dumping to a file.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-16 19:42:52 -04:00
Tom Stellard
76136c29bb clover: Register an llvm diagnostic handler v3
This will allow us to handle internal compiler errors.

v2:
  - Code cleanups.

v3:
  - More cleanups.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-16 19:42:41 -04:00
Tom Stellard
8e7df519bd clover: Add support for compiling to native object code v3
v2:
  - Split build_module_native() into three separate functions.
  - Code cleanups.

v3:
  - More cleanups.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-16 19:42:30 -04:00
Tom Stellard
8b7cc90cef gallium: Add PIPE_SHADER_IR_NATIVE to enum pipe_shader_ir
Drivers can return this value for PIPE_COMPUTE_CAP_IR_TARGET
if they want clover to give them native object code.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-16 19:42:22 -04:00
Tom Stellard
dc39b32c9b clover: Factor kernel argument parsing into its own function v2
v2:
  - Code cleanups.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-16 19:42:14 -04:00
Marek Olšák
833d698ad5 st/mesa: use pipe_sampler_view_release for releasing sampler views
This fixes a crash when exiting Firefox. I have really no idea how Firefox
does it. It seems to involve multiple contexts and multithreading.

v2: added an XXX comment

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81680

Acked by Christian König.
Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Tested-by: Benjamin Bellec <b.bellec@gmail.com>
2014-10-16 23:31:20 +02:00
Kenneth Graunke
63c6509ad2 mesa: Drop the "target" parameter from NewBufferObject().
NewBufferObject took a "target" parameter, which it blindly passed to
_mesa_initialize_buffer_object(), which ignored it.

Not much point in passing it around.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-16 10:56:19 -07:00
Andres Gomez
af31f930ab glsl: Update and fix typos in README.
2014-10-16 09:38:36 -07:00
Chris Forbes
2883aff3be i965: Flag BRW_ATOMIC_COUNTER_BUFFER when a possible ABO is respecified
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-16 22:31:44 +13:00
Chris Forbes
7bd6dfe934 mesa: Mark buffer objects that are used as atomic counter buffers
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-16 22:31:44 +13:00
Chris Forbes
f1261db1ee i965/disasm: Add missing message type for Gen7 DP untyped surface read
This is used to implement GLSL's atomicCounter() intrinsic. Previously
it *worked*, but the disassembly was bogus.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-16 22:31:43 +13:00
Chris Forbes
0dc56600aa i965: Correctly use ABO count to trigger flagging of new surfaces.
This would have *almost never* actually been an issue, since other state
tends to get flagged at the same time as new ABOs -- but still bogus.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-16 22:31:43 +13:00
Chris Forbes
25189c72ce i965: No longer reemit textures on BRW_NEW_UNIFORM_BUFFER
This didn't make any sense, but papered over the missing TexBO flagging
we've just fixed, in a bunch of cases.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-10-16 22:31:43 +13:00
Chris Forbes
1655f6fc61 i965: Dirty state in BO reallocation based on usage history
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-10-16 22:31:43 +13:00
Chris Forbes
c442745981 i965: Have mesa flag BRW_NEW_TEXTURE_BUFFER when a TexBO binding changes
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-10-16 22:31:43 +13:00
Chris Forbes
be5df28941 i965: Add new dirty flag for new TexBOs.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-10-16 22:31:43 +13:00
Chris Forbes
8db38ba4d2 mesa: Mark buffer objects that are used as TexBOs
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-10-16 22:31:43 +13:00
Chris Forbes
fe3133fe78 mesa: Mark buffer objects which are bound as UBOs
When a buffer object is bound to one of the indexed uniform buffer
binding points, assume that from that point on it may be used as
a uniform buffer.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-10-16 22:31:43 +13:00
Chris Forbes
3d989467f1 mesa: Add usage history bitfield to buffer objects
In the drivers, we occasionally want to reallocate the backing
store for a buffer object; often to avoid waiting for the GPU
to be finished with the previous contents.

At the point that happens, we don't have a good way of determining
where else the buffer object may be bound, and so no good way of
determining which dirty flags need to be raised -- it's fairly
expensive to go looking at all the possible binding points.

Until now, we've considered any BO to be possibly bound as a UBO or
TexBO, and flagged all that state to be reemitted.

Instead, remember what kinds of binding point this buffer has ever
been used with, so that the drivers can flag only what they need.
I don't expect these bits to ever be reset, but that doesn't matter
for reasonable apps.
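
A small C sketch of the approach. The enum values, struct layout and
function names below are hypothetical, chosen for illustration rather
than copied from Mesa: binding only ever sets bits, and reallocation
raises just the dirty flags that correspond to recorded uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "has ever been used as" bits. */
enum usage_bit {
   USAGE_UNIFORM_BUFFER        = 1 << 0,
   USAGE_TEXTURE_BUFFER        = 1 << 1,
   USAGE_ATOMIC_COUNTER_BUFFER = 1 << 2
};

/* Hypothetical driver dirty flags. */
enum dirty_bit {
   DIRTY_UNIFORM_BUFFER = 1 << 0,
   DIRTY_TEXTURE_BUFFER = 1 << 1,
   DIRTY_ATOMIC_BUFFER  = 1 << 2
};

struct buffer_object {
   unsigned usage_history;   /* sticky: bits are set, never cleared */
};

/* Called whenever the buffer is bound to a given kind of binding point. */
static void note_usage(struct buffer_object *bo, unsigned usage)
{
   assert(bo != NULL);
   bo->usage_history |= usage;
}

/* Called when the backing store is reallocated: flag only the state
 * that could reference this buffer, based on its history. */
static unsigned dirty_flags_on_realloc(const struct buffer_object *bo)
{
   unsigned dirty = 0;

   assert(bo != NULL);
   if (bo->usage_history & USAGE_UNIFORM_BUFFER)
      dirty |= DIRTY_UNIFORM_BUFFER;
   if (bo->usage_history & USAGE_TEXTURE_BUFFER)
      dirty |= DIRTY_TEXTURE_BUFFER;
   if (bo->usage_history & USAGE_ATOMIC_COUNTER_BUFFER)
      dirty |= DIRTY_ATOMIC_BUFFER;
   return dirty;
}
```

A buffer that was only ever bound as a texture buffer then triggers
re-emission of texture buffer state alone.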

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-10-16 22:31:43 +13:00
Emil Velikov
79d09a4b12 vc4: correctly include the source files
The kernel files are built into a separate static library and
all the functions that require it are already wrapped in ifdef
USE_VC4_SIMULATOR. Don't forget the header file :)

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2014-10-16 10:00:14 +01:00
Connor Abbott
70fa53be5e i965/fs: don't make a fake ir_texture in the Mesa IR frontend
Now that we've made all the texture emit code mostly independent of GLSL
IR, this isn't necessary any more.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:05:25 -07:00
Kenneth Graunke
b17f571945 i965/fs: Refactor the texture emission logic into a single function.
Before, we had 3 different emit functions for the various gens,
as well as some ancillary work that was the same across all gens which
was either contained in functions or duplicated across the GLSL IR and
Mesa IR backends. Now, we have a single method, emit_texture(), that
takes all the information needed to make a texture instruction and
handles all the setup, and all we have to do to emit a texture
instruction while converting from GLSL IR, Mesa IR, or any new backend
is to extract the information emit_texture() needs and then call it.

v2: Significant rebasing (by Ken).

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:05:22 -07:00
Connor Abbott
9e95d8ebf8 i965/fs: Make gather_channel() not use ir_texture.
Our new IR won't have ir_texture objects.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:05:20 -07:00
Connor Abbott
12d9a8cd86 i965/fs: Make swizzle_result() not use ir_texture.
Our new IR won't have ir_texture objects.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:05:19 -07:00
Connor Abbott
cf94dfdb96 i965/fs: fix integer textures with swizzles
This happened to work before, but it would convert the output to a float
and then back to an integer which seems bad.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:05:16 -07:00
Connor Abbott
7c8f0b7cd9 i965/fs: don't pass in ir_texture to emit_texture_*
At this point, the only thing it's used for is the opcode.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:05:14 -07:00
Connor Abbott
4bffcb7e8e i965/fs: don't use ir->type in emit_texture_gen4()
We already have the type from the original destination.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:05:05 -07:00
Connor Abbott
eaadc43192 i965/fs: Don't use ir->lod_info.grad.dPd<x,y> in emit_texture_*.
This drops a dependency on ir_texture objects.

v2 (Ken): Rename lod_components to grad_components, as it only has a
          meaningful value for ir_txd.  We could set it to 1 for TXL,
          but there's no real need.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:05:00 -07:00
Connor Abbott
cbde5407c9 i965/fs: Don't use ir->coordinate in emit_texture_*.
This drops a dependency on ir_texture objects.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:04:58 -07:00
Connor Abbott
a8905e8c09 i965/fs: make rescale_texcoord() not use ir_texture.
Our new IR won't have ir_texture objects, but using glsl_type is fine.

v2 (Ken): Drop redundant ir->coordinate NULL check; rebase.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:04:56 -07:00
Connor Abbott
e599837fed i965/fs: Make emit_mcs_fetch() not use ir_texture.
Our new IR won't have ir_texture objects.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:04:54 -07:00
Kenneth Graunke
465373535e i965/fs: Rename "length" to "components" in emit_mcs_fetch().
This is slightly clearer.  Based on a patch by Connor Abbott.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:04:52 -07:00
Connor Abbott
fa212c6b98 i965: Make brw_texture_offset() not use ir_texture.
Our new IR won't have ir_texture objects.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:04:50 -07:00
Connor Abbott
a71455bc99 i965/fs: don't use ir->offset in emit_texture_gen5.
v2 (Ken): Refactor the Gen7 code separately; rebase.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:04:47 -07:00
Kenneth Graunke
1f76fcf231 i965/fs: Move texel offset handling to visit(ir_texture *).
This moves the handling of non-constant texel offset subexpression trees
to the place where we visit other such subtrees.  It also removes some
uses of ir->offset in emit_texture_gen7, which will be useful when we
write the backend for our new upcoming IR.

Based on a patch by Connor Abbott.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:04:45 -07:00
Kenneth Graunke
cee2027574 i965: Drop ir->op != ir_txf condition in offset checking.
brw_lower_unnormalized_offset sets ir->offset to NULL if it applies the
texelFetchOffset workarounds, so there's no need to special case it
here---there won't be an offset for ir_txf.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:04:43 -07:00
Kenneth Graunke
a2c3cfbb4d i965: Restore a lost comment about TXF offset bugs.
Eric's original code to work around TXF offset bugs contained a comment
explaining the problem, which was lost when Chris generalized it to an
IR transformation (in commit 598ca510b8).

This commit adds the original comment to the newer code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-15 17:04:27 -07:00
Rob Clark
652b8fbbbb freedreno/ir3: large const support
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-15 15:49:49 -04:00
Rob Clark
e71a3f80fb freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-15 15:49:48 -04:00
Rob Clark
dd332fe641 freedreno: fix layer_stride
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-15 15:49:48 -04:00
Rob Clark
8233b36a17 freedreno: inline fd_draw_emit()
Manual LTO

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-15 15:49:48 -04:00
Rob Clark
368466b7b7 freedreno/ir3: optimize shader key comparison
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-15 15:49:48 -04:00
Rob Clark
d595987ea3 freedreno/a3xx: refactor/optimize emit
Because we reuse various bits of emit code (for state/vertex/prog/etc)
for both regular draws and internal draws (gmem<->mem, clear, etc), the
number of parameters getting passed around has been growing.  Refactor
to group these into fd3_emit.  This simplifies fxn signatures, avoids
passing around shader key on the stack, etc.  It also gives us a nice
place to cache shader-variant lookup to avoid looking up shader variants
multiple times per draw (without having to *also* pass them around as
fxn args everywhere).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-15 15:49:48 -04:00
Rob Clark
d5d80b3739 freedreno/a3xx: refactor vertex state emit
Get rid of fd3_vertex_buf and use fd_vertex_state directly for all
draws.  Removes a tiny bit of CPU overhead for munging around the vertex
state every time it is emitted, but more importantly it cleans things up
for later optimizations, so the emit paths don't have to special case
internal draws (gmem<->mem, clears, etc) with regular draws.

Instead of constructing the fd3_vertex_buf array each time for internal
draws, pre-create solid_vbuf_state and blit_vbuf_state at context init
time.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-15 15:49:48 -04:00
Eric Anholt
57de9bbb63 vc4: Fix the uniform debug output.
I dropped the shader index when moving to the compiled shader struct, but
didn't update the format string here.
2014-10-15 18:12:03 +01:00
Eric Anholt
201d4c0b2a vc4: Add support for user clip plane and gl_ClipVertex.
Fixes about 15 piglit tests about interpolation and clipping.
2014-10-15 18:11:46 +01:00
Eric Anholt
6a0bf67048 vc4: Move the output semantics setup to a helper.
I want to reuse it elsewhere to set up outputs that aren't in the TGSI.
2014-10-15 18:11:46 +01:00
Kenneth Graunke
39a5a60b57 i965: Allow CSE on Gen4-5 unary math.
Due to the implicit move-from-GRF, unary math looks a lot like the Gen6+
math instruction: it's a single instruction (SEND) with a GRF source.
The difference is that it also implicitly clobbers a message register.

The only visible effect is that CSE will remove the MRF-clobbering from
later math operations.  This should be fine; compute_to_mrf and
remove_redundant_mrf_writes don't look at the values populated by
implied writes, so they can't rely on those values being present.
Less interference may actually help those passes make more progress.

Binary math is still problematic, since it involves a separate MOV
instruction to load the second operand.  We continue disallowing CSE for
binary math operations.

total instructions in shared programs: 3340303 -> 3340100 (-0.01%)
instructions in affected programs:     26927 -> 26724 (-0.75%)
Nothing hurt, gained, or lost.  ~6% reduction on a few shaders.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-15 08:44:54 -07:00
Michel Dänzer
159f93cf39 r600g,radeonsi: Only set use_staging_texture = TRUE once
No need to check for setting the flag after we set it already.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-15 16:26:30 +09:00
Michel Dänzer
87da286755 r600g,radeonsi: Use staging texture for transfers if any miplevel is tiled
We set the NO_CPU_ACCESS flag for BO allocation in that case, so direct CPU
access may not work.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-15 16:26:14 +09:00
Michel Dänzer
3ede67a4c6 winsys/radeon: Use separate caching buffer manager for each set of flags
Otherwise the caching buffer manager may return a buffer which was created
with a different set of flags, which can cause trouble.
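
The idea can be sketched as a free-buffer cache that keeps one bucket per
flag set, so a lookup never sees buffers created with other flags. This is
only an illustration: BufferCache, alloc, release and the dict-based buffer
are hypothetical names, not the winsys API.

```python
from collections import defaultdict


class BufferCache:
    """Toy cache of released buffers, bucketed by creation flags (hypothetical)."""

    def __init__(self):
        # One free list per distinct flag set.
        self._free = defaultdict(list)

    def alloc(self, size, flags):
        bucket = self._free[flags]
        for i, buf in enumerate(bucket):
            if buf["size"] >= size:
                # Reuse only a buffer that was created with the same flags.
                return bucket.pop(i)
        # Nothing suitable cached: create a new buffer.
        return {"size": size, "flags": flags}

    def release(self, buf):
        self._free[buf["flags"]].append(buf)
```

With a single shared free list, alloc(size, flags) could hand back a buffer
whose flags differ from the ones requested, which is the failure described
above.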

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-15 16:11:40 +09:00
Andres Gomez
657764c21c configure.ac: check for libexpat when no pkg-config is available
Previously, when no pkg-config was available for
libexpat we would just add the needed linking
flags without any extra check.

Now, we check that the library and the headers are
also installed in the building environment.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-15 08:59:12 +02:00
Tom Stellard
8cf6482c3d clover: Fix regression in module serialization
We need to serialize semantic information for arguments, which was added
in 06139c56fa.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-14 17:58:06 -04:00
Jason Ekstrand
3435aa49f4 i965/fs: Use the correct regs_written on unspill instructions
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-14 12:39:45 -07:00
Ilia Mirkin
742158b51e st/gbm: fix order of arguments passed to is_format_supported
Reported by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2014-10-14 12:33:38 -04:00
Ilia Mirkin
5524af8136 nouveau: 3d textures are unsupported, limit 3d levels to 1
Ideally there would be a swrast fallback, but the driver isn't ready for
that. This should avoid crashes if someone tries to use 3d textures
though.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: mesa-stable@lists.freedesktop.org
2014-10-14 12:33:38 -04:00
Rob Clark
abe3b3d1e0 freedreno: use tgsi_lowering
Now that the freedreno_lowering code is moved to tgsi_lowering, remove
our private copy and switch over to using the common version.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-14 12:30:08 -04:00
David Heidelberger
d2c1d9693f r300/compiler: remove useless check
This code is already inside if (!variable->C->is_r500), so there is no
need to check twice.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: David Heidelberger <david.heidelberger@ixit.cz>
2014-10-14 12:18:32 -04:00
Nick Sarnie
e5bf8d38db ilo: Build pipe-loader for ilo
Trivial patch to create the pipe loader for ilo. All the code was already there.

Signed-off-by: Nick Sarnie <commendsarnex@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-14 16:16:08 +01:00
Emil Velikov
af897df508 automake: explicitly set TARGET_RADEON_{WINSYS,COMMON}
Originally the variables were set only once via the ?= operator but
that causes issues when doing incremental builds. They appear to be
undefined and missing from the dependency list despite their addition
to LIBADD.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84807
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-14 16:16:08 +01:00
Eric Anholt
a2d8b6dbd5 vc4: Fix render target NPOT alignment at small miplevels.
The texturing hardware takes the POT level 0 width/height and minifies
those.  This is different from what we were doing, for example, for
273-wide's level 5: POT(273>>5) == 8, while POT(273)>>5 == 16.

Fixes piglit-depthstencil-render-miplevels 273.
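
The arithmetic above can be checked with a short sketch. The helper names
are made up for this illustration and do not appear in the driver.

```python
def next_pot(n: int) -> int:
    """Round n (>= 1) up to the next power of two."""
    return 1 << (n - 1).bit_length()


def level_width_minify_then_pot(width: int, level: int) -> int:
    # Previous behaviour: minify the NPOT width, then align to a power of two.
    return next_pot(max(width >> level, 1))


def level_width_pot_then_minify(width: int, level: int) -> int:
    # Texturing hardware: align level 0 to a power of two, then minify.
    return max(next_pot(width) >> level, 1)


print(level_width_minify_then_pot(273, 5))  # 8
print(level_width_pot_then_minify(273, 5))  # 16
```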
2014-10-14 14:57:50 +01:00
Eric Anholt
b5fc9d5664 vc4: Add support for having 0 vertex elements used.
You have to load at least 1, according to the simulator.  Fixes 4 piglit
tests and even more ES2 conformance tests.
2014-10-14 11:29:48 +01:00
Vinson Lee
a2fd55cfb6 auxiliary/os: Add DragonFly BSD support in os_get_total_physical_memory.
This patch fixes this build error on DragonFly BSD.

  CC       os/os_misc.lo
os/os_misc.c: In function 'os_get_total_physical_memory':
os/os_misc.c:132:2: error: #error Unsupported *BSD

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-10-13 23:40:46 -07:00
Daniel Manjarres
291be28476 glx: Fix glxUseXFont for glxWindow and glxPixmaps
The current implementation of glxUseXFont requires creating
a temporary pixmap and graphics context, which requires a real
old-school X11 Window, not a glxDrawable. This patch changes
things so that glxUseXFont will also accept a glxWindow or
glxPixmap, and lookup the underlying X11 Drawable. Without
this patch glxUseXFont generates a giant stream of Xerrors
about bad drawables and bad graphics contexts.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54372

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-10-13 22:05:58 -06:00
Chia-I Wu
4e2cf84b1f ilo: clear writer pointer after unmapping
It does not look like an issue now but it is good to be future proof.  Spotted
by Courtney Goeltzenleuchter.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-10-14 08:54:25 +08:00
Eric Anholt
615bbf0ca6 vc4: Write the VPM read setup multiple times to queue all the inputs.
There's a 4-element fifo, and the size (number of dwords per vertex) field
is just 4 bits.

Fixes glsl-routing on sim.
2014-10-13 17:16:05 +01:00
Eric Anholt
e1d1c39626 vc4: Add support for the TXL opcode.
There's a bit at the bottom of cube map stride (which has some formatting
bugs in the docs) which flips the bias coordinate to being an absolute
LOD.
2014-10-13 17:15:47 +01:00
Eric Anholt
5bc91b6e32 vc4: Improve the accuracy of SIN and COS.
This gets them to pass glsl-sin/cos.  There was an obvious problem that I
was using the FRC code on the scaled input value, which means that we had
a range in [0, 1], while our Taylor series is most accurate across [-0.5, 0.5].
We can just slide things over, but that means flipping the sign of the
coefficients.  After that, it was just a matter of stuffing more
coefficients in.
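
A rough sketch of the range shift, assuming the input is scaled by 1/(2*pi)
before taking the fractional part. The term count and helper names are
assumptions for illustration, not the driver's values.

```python
import math


def frc(x: float) -> float:
    """Fractional part, always in [0, 1)."""
    return x - math.floor(x)


def sin_approx(x: float, terms: int = 8) -> float:
    # Slide the scaled fraction from [0, 1) to [-0.5, 0.5).
    f = frc(x / (2.0 * math.pi)) - 0.5
    t = 2.0 * math.pi * f  # t is in [-pi, pi)
    # x = t + pi (mod 2*pi), so sin(x) = -sin(t): every Taylor
    # coefficient of sin(t) gets its sign flipped.
    total = 0.0
    for k in range(terms):
        coeff = -((-1.0) ** k) / math.factorial(2 * k + 1)
        total += coeff * t ** (2 * k + 1)
    return total
```

Centering the argument on zero halves the largest magnitude the series has
to handle, which is why fewer terms are needed for the same accuracy.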
2014-10-13 17:15:47 +01:00
Kenneth Graunke
02ca66fbc3 i965: Use unsynchronized maps for the program cache on LLC platforms.
There's no reason to stall on pwrite - the CPU always appends to the
buffer and never modifies existing contents, and the GPU never writes
it.  Further, the CPU always appends new data before submitting a batch
that requires it.

This code predates the unsynchronized mapping feature, so we simply
didn't have the option when it was written.

Ideally, we would do this for non-LLC platforms too, but unsynchronized
mapping support only exists for LLC systems.

Saves a bunch of stall avoidance copies when uploading shaders.

v2: Rebase on changes to previous patch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> [v1]
2014-10-13 06:55:08 -07:00
Kenneth Graunke
4c766c7959 i965: Issue performance warnings when copying the program cache BO.
We don't really want unnecessary buffer copying, so it'd be nice to know
when it's happening.

v2: Drop stall warnings when doing a read-only CPU mapping of the cache
    BO.  The GPU also uses it in a read-only fashion, so there won't be
    any stalls, even though the buffer is busy.  (Thanks to Chris Wilson
    for catching this mistake.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> [v1]
2014-10-13 06:53:54 -07:00
Kenneth Graunke
a4fe727a5d i965: Issue performance warnings on MapBufferRange stalls.
This is easy: we just need to use brw_map_bo instead of mapping it
directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-10-13 06:44:05 -07:00
Eric Anholt
5d72a1c956 vc4: Match VS outputs to FS inputs.
If the VS doesn't output a value that the FS needs, we still need to read
the right contents for the remaining FS inputs, by emitting padding.  And
if the VS outputs something the FS doesn't need, we shouldn't put it in
the VPM at all (so the code producing it can get DCEed).

Fixes 77 piglit tests.
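
A minimal sketch of the matching rule, with hypothetical names and plain
strings standing in for the real shader semantics:

```python
def build_vpm_layout(vs_outputs, fs_inputs):
    """Return one VPM slot per FS input, in FS input order (illustrative)."""
    available = set(vs_outputs)
    layout = []
    for name in fs_inputs:
        if name in available:
            layout.append(("vs_output", name))
        else:
            # The VS never wrote this value: emit padding so that the
            # remaining FS inputs still land in the right slots.
            layout.append(("padding", name))
    # VS outputs that no FS input consumes are never added to the layout.
    return layout
```

For example, build_vpm_layout(["color", "fog"], ["color", "texcoord"]) keeps
a slot for "color", pads "texcoord", and drops "fog".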
2014-10-13 13:23:48 +01:00
Christian König
d561a42bc1 configure: use $libdir/dri as default for VA-API
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-13 12:43:59 +02:00
Christian König
966ae170b0 configure: remove superfluous VA-API line from configure.ac
We don't have GALLIUM_STATE_TRACKERS_DIRS any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-13 12:43:54 +02:00
Christian König
d3004a267a configure: respect $libdir for the OMX installation dir
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-13 12:43:49 +02:00
Christian König
5ce06d12ff configure: Revert "ask vdpau.pc for the default location of the vdpau drivers"
This reverts commit bbe6f7f865.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-13 12:43:05 +02:00
Eric Anholt
83365a5b57 vc4: Add support for the CEIL opcode.
Not as big of a deal as SSG, but still +9 piglit tests.
2014-10-13 08:06:48 +01:00
Eric Anholt
926eaa9af4 vc4: Add support for the SSG opcode.
2014-10-13 08:06:48 +01:00
Emil Velikov
b86f814afd docs: add news item and link release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-13 02:14:02 +01:00
Emil Velikov
fc6345a916 docs: Add sha256 sums for the 10.3.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit fa98c74692)
2014-10-13 02:06:29 +01:00
Emil Velikov
04fae07f0e Add release notes for the 10.3.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 088d350178)
2014-10-13 02:06:20 +01:00
Emil Velikov
66ea8a581d docs: Add sha256 sums for the 10.2.9 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 52bd154980)
2014-10-13 02:05:53 +01:00
Emil Velikov
f5e61295cd Add release notes for the 10.2.9 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 9f1149876f)
2014-10-13 02:05:22 +01:00
Glenn Kennard
a327fa3a06 r600g: Implement GL_ARB_sample_shading
Also fixes two sided lighting which was broken at least
on pre-evergreen by commit b1eb00.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2014-10-12 23:53:57 +02:00
Marek Olšák
75e97e2e3f radeonsi: use tgsi_shader_info in si_llvm_emit_fs_epilogue
This is the last use of tgsi_parse_token in radeonsi.

It looks ugly because the code was re-indented, but there is really no change
in behavior.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:57 +02:00
Marek Olšák
558f7770a7 radeonsi: remove si_shader_output_values::index
It's redundant now.

It led to a simplification in si_llvm_emit_streamout, because outidx == reg.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:57 +02:00
Marek Olšák
ec0d16872b radeonsi: use tgsi_shader_info in si_llvm_emit_vs_epilogue
That code was really ugly.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:57 +02:00
Marek Olšák
8067732740 radeonsi: remove shader->input[] and output[] arrays and dependencies
They were reinventing tgsi_shader_info. They are unused now.

radeon_llvm_context::load_input can be NULL if input fetching is implemented
in some other way.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:57 +02:00
Marek Olšák
8b057ddaea radeonsi: move param_offset out of shader->input[] and output[]
Those are going away.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:57 +02:00
Marek Olšák
02134cfaae radeonsi: use tgsi_shader_info to get a list of GS outputs
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:54 +02:00
Marek Olšák
101905d3f7 radeonsi: use tgsi_shader_info in si_update_spi_map
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:54 +02:00
Marek Olšák
6f04cf7fac radeonsi: simplify dereferences in si_update_spi_map
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:54 +02:00
Marek Olšák
639f6b41d2 radeonsi: use tgsi_shader_info in si_shader_vs
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:54 +02:00
Marek Olšák
fa933438a2 radeonsi: use tgsi_shader_info in si_shader_ps
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:54 +02:00
Marek Olšák
e23fec1445 radeonsi: use tgsi_shader_info in fetch_input_gs
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:51 +02:00
Marek Olšák
7a645c5366 radeonsi: don't rely on shader->output in si_llvm_emit_fs_epilogue
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:52:16 +02:00
Marek Olšák
216cf86ec4 radeonsi: use tgsi_shader_info in si_llvm_emit_es_epilogue
tgsi_shader_info contains everything we need.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:52:13 +02:00
Marek Olšák
34e8200599 radeonsi: don't recompile shaders when changing nr_cbufs from 0 to 1
Both cases are equivalent.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:52:07 +02:00
Marek Olšák
5e0fbe1b63 radeonsi: remove vs.ucps_enabled from the shader key
Written CLIPDIST outputs are simply disabled in PA_CL_VS_OUT_CNTL.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:52:02 +02:00
Marek Olšák
a9592cd3ac radeonsi: assume ClipDistance usage mask is always 0xf
No code in Mesa sets the usage mask to any other value.
The final mask is AND'ed with enable bits from the rasterizer state anyway.

If somebody implements setting usage masks in st/mesa, we can use
tgsi_shader_info to get it more easily.

This is a prerequisite for the following commit.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:51:44 +02:00
Francisco Jerez
2286edce16 clover: Fix unintended fall-through in kernel::argument::bind.
2014-10-12 11:44:05 +03:00
Jan Vesely
5bffc5e262 clover: Append implicit arguments to the kernel argument list.
[ Francisco Jerez: Split off from a larger patch, and take a slightly
  different approach for passing the implicit arguments around. ]

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-12 01:50:13 +03:00
Francisco Jerez
bf89a97748 clover: Pass execution dimensions and offset to the kernel as implicit arguments.
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
2014-10-12 01:44:19 +03:00
Francisco Jerez
06139c56fa clover: Add semantic information to module::argument for implicit parameter passing.
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
2014-10-12 01:39:21 +03:00
Francisco Jerez
27c51b5f58 clover: Use unreachable() from util/macros.h instead of assert(0).
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-11 12:44:09 +03:00
Vinson Lee
5480d6b13f gallium: Add tokens for DragonFly BSD.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Brian Paul <brianp@vmware.com>
2014-10-10 21:32:35 -07:00
Chia-I Wu
566d1889ea ilo: disassemble compacted instructions
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-10-11 11:55:50 +08:00
Erik Faye-Lund
326e303175 glsl: improve accuracy of atan()
Our current atan()-approximation is pretty inaccurate at 1.0, so
let's try to improve the situation by doing a direct approximation
without going through atan.

This new implementation uses an 11th degree polynomial to approximate
atan in the [-1..1] range, and the following identity to reduce the
entire range to [-1..1]:

atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x)

This range-reduction idea is taken from the paper "Fast computation
of Arctangent Functions for Embedded Applications: A Comparative
Analysis" (Ukil et al. 2011).

The polynomial that approximates atan(x) is:

x   * 0.9999793128310355 - x^3  * 0.3326756418091246 +
x^5 * 0.1938924977115610 - x^7  * 0.1173503194786851 +
x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444

This polynomial was found with the following GNU Octave script:

x = linspace(0, 1);
y = atan(x);
n = [1, 3, 5, 7, 9, 11];
format long;
polyfitc(x, y, n)

The polyfitc function is not built-in, but too long to include here.
It can be downloaded from the following URL:

http://www.mathworks.com/matlabcentral/fileexchange/47851-constraint-polynomial-fit/content/polyfitc.m

This fixes the following piglit test:
shaders/glsl-const-folding-01
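
The polynomial and the range reduction above can be tried out with a small
sketch. It uses the coefficients quoted in this message; the function names
and the Horner evaluation are choices made for this illustration and are
not taken from the GLSL built-in code.

```python
import math

# Coefficients of x, x^3, ..., x^11 as listed above, with their signs.
COEFFS = (
    0.9999793128310355,
    -0.3326756418091246,
    0.1938924977115610,
    -0.1173503194786851,
    0.0536813784310406,
    -0.0121323213173444,
)


def atan_poly(x: float) -> float:
    """Evaluate the odd polynomial for |x| <= 1 (Horner form in x^2)."""
    x2 = x * x
    acc = 0.0
    for c in reversed(COEFFS):
        acc = acc * x2 + c
    return acc * x


def atan_approx(x: float) -> float:
    if abs(x) > 1.0:
        # atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x)
        return math.copysign(0.5 * math.pi, x) - atan_poly(1.0 / x)
    return atan_poly(x)


worst = max(abs(atan_approx(i / 100.0) - math.atan(i / 100.0))
            for i in range(-5000, 5001))
print(worst)  # largest absolute error seen on [-50, 50]
```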

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-10 20:44:27 +02:00
Eric Anholt
070b2c2efc vc4: Use the fnv1 hash function instead of gallium util's crc32.
Improves simulated norast performance on a little benchmark by 13.4012%
+/- 2.08459% (n=13).
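
For reference, a sketch of a 32-bit FNV-1 hash. The message only names
"fnv1", so the 32-bit width and the FNV-1 (rather than FNV-1a) variant are
assumptions here; the constants are the standard FNV offset basis and prime.

```python
FNV32_OFFSET_BASIS = 0x811C9DC5
FNV32_PRIME = 0x01000193


def fnv1_32(data: bytes) -> int:
    """32-bit FNV-1: multiply by the prime, then XOR in each byte."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h = (h * FNV32_PRIME) & 0xFFFFFFFF
        h ^= byte
    return h


print(hex(fnv1_32(b"a")))  # 0x50c5d7e
```

The inner loop is one multiply and one XOR per byte, with no lookup table.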
2014-10-10 15:49:34 +02:00
Eric Anholt
d09509da2a vc4: Don't look up the compiled shaders unless state has changed.
Improves simulated norast performance on a little benchmark by 38.0965%
+/- 3.27534% (n=11).
2014-10-10 15:49:22 +02:00
Eric Anholt
c6f50c4086 vc4: Actually clear the context's dirty flags.
I was trying to skip state updates when !dirty, and suspiciously
everything was always dirty.
2014-10-10 15:03:13 +02:00
Eric Anholt
7c474f9f2e vc4: Optimize the other case of SEL_X_Y with a 0 -> SEL_X_0(a).
Cleans up some output to be more obvious in a piglit test I'm looking at.
2014-10-10 15:03:12 +02:00
Tapani Pälli
ac557b4c12 mesa: fix error reported on glTexSubImage2D when level not valid
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
2014-10-10 15:01:51 +03:00
Kenneth Graunke
94841b6d5d i965: Fix register write checks.
When mapping the buffer a second time, we need to use the new pointer,
not the one from the previous mapping.  Otherwise, we will most likely
crash.

Apparently, we've just been getting lucky and getting the same
bo->virtual pointer in both cases.  libdrm probably has a hand in that.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2014-10-10 00:04:39 +02:00
Eric Anholt
7e67ea994c vc4: Optimize out adds of 0.
2014-10-09 21:47:06 +02:00
Eric Anholt
0401f55fff vc4: Optimize fmul(x, 0) and fmul(x, 1).
This was being generated frequently by matrix multiplies of 2 and
3-channel vertex attributes (which have the 0 or 1 loaded in the shader).
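
A toy sketch of this kind of rewrite, together with the add-of-0 case. The
tuple-based instruction format and the opcode names are invented for the
example and are not the driver's IR. Folding fmul(x, 0) to 0 ignores NaN
and infinity inputs, which this sketch accepts.

```python
def simplify(inst):
    """Turn trivial fmul/fadd instructions into a mov (illustrative only).

    inst is (op, dst, src0, src1); a source is either a temp name (str)
    or a float constant.
    """
    op, dst, a, b = inst
    if op == "fmul":
        if a == 0.0 or b == 0.0:
            return ("mov", dst, 0.0)
        if a == 1.0:
            return ("mov", dst, b)
        if b == 1.0:
            return ("mov", dst, a)
    elif op == "fadd":
        if a == 0.0:
            return ("mov", dst, b)
        if b == 0.0:
            return ("mov", dst, a)
    return inst
```

For example, simplify(("fmul", "t2", "t1", 1.0)) yields ("mov", "t2", "t1"),
which a later copy-propagation pass could then remove.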
2014-10-09 21:47:06 +02:00
Eric Anholt
1cd8c1aab0 vc4: Factor out the turn-it-into-a-mov in opt_algebraic.
This will be used more in the next commits.
2014-10-09 21:47:06 +02:00
Eric Anholt
40748cf8d9 vc4: Eliminate unused texture instructions.
2014-10-09 21:47:06 +02:00
Eric Anholt
b73cab6826 vc4: Dead code eliminate unused SF instructions.
2014-10-09 21:47:06 +02:00
Eric Anholt
93cac2637b vc4: Prevent copy propagating out the MOVs from r4.
Copy propagating these might result in reading the r4 after some other
instruction has written r4.  Just prevent all copy propagation of this for
now.

Fixes bad rendering with upcoming indirect register access support, where
the copy propagation was consistently happening across another read.
2014-10-09 21:47:06 +02:00
Eric Anholt
c4b0dd5356 vc4: Split the coordinate shader to its own vc4_compiled_shader.
Merging VS and CS into the same struct wasn't winning us anything except
for not allocating a separate BO (but if we want to pack programs into
BOs, we should pack not just those 2 programs together).  What it was
getting us was a bunch of code duplication about hash table lookups and
propagating vc4_compile contents into a vc4_compiled_shader.

I was about to make the situation worse with indirect uniform buffer
access.
2014-10-09 21:47:06 +02:00
Eric Anholt
5c72d7706c vc4: Add #defines for the texture uniform fields.
I wanted to make another set of texture uploads for handling reladdr
constants, and duplicating all the bitshifting looked like a terrible
idea.  In the process, this fixes a swap of the s/t texture wrap modes.
2014-10-09 21:47:06 +02:00
Eric Anholt
5cfab07639 vc4: Initialize undefined temporaries to 0.
Under the simulator, reading registers before writing them triggers an
assertion failure.  c->undef gets treated as r0, which will usually be
written, but not if it's used in the first instruction.  We should
definitely not be aborting in this case, and return some sort of undefined
value instead.

Fixes glsl-user-varying-ff.
2014-10-09 21:47:06 +02:00
Kenneth Graunke
4ce11de4ae i965: Skip uploading border color when unnecessary.
The border color is only needed when using the GL_CLAMP_TO_BORDER or
(deprecated) GL_CLAMP wrap modes; all others ignore it, including the
common GL_CLAMP_TO_EDGE and GL_REPEAT wrap modes.

In those cases, we can skip uploading it entirely, saving a bit of space
in the batchbuffer.  Instead, we just point it at the start of the
batch (offset 0); we have to program something, and that address is safe
to read.
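
The decision can be sketched as follows. The function names and the
upload callback are hypothetical; only the wrap mode names come from the
text above.

```python
# Wrap modes that can sample the border color.
BORDER_WRAP_MODES = {"GL_CLAMP_TO_BORDER", "GL_CLAMP"}


def needs_border_color(wrap_s, wrap_t, wrap_r):
    return any(mode in BORDER_WRAP_MODES for mode in (wrap_s, wrap_t, wrap_r))


def border_color_offset(wrap_s, wrap_t, wrap_r, upload):
    """Return the batch offset to program as the border color pointer."""
    if needs_border_color(wrap_s, wrap_t, wrap_r):
        # upload() stands in for writing the color into the batch and
        # returning its offset.
        return upload()
    # Nothing will read it: point at the start of the batch instead.
    return 0
```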

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-09 15:43:18 +02:00
Kenneth Graunke
b7844d1248 i965: Use BDW_MOCS_PTE for renderbuffers.
Write-back caching cannot be used for buffers being scanned out by the
display engine; surfaces used for scan-out must be write-through or
uncached.  I originally chose WT for render targets because it works in
all cases.  However, we really want to use write-back caching where
possible, as it is more efficient.

Most renderbuffers are not used for scanout - off-screen FBOs certainly
are fine, and non-pageflipped backbuffers should be fine as well.  So
in most cases WB will work.  However, we don't know what will be used
for scan-out, so we instead simply use the PTE value specified by the
kernel, as it knows these things.

This matches our MOCS choice on Haswell.

Fixes performance regressions since commit ee4484be3d
in a microbenchmark (spotted by Eero Tamminen).  Improves performance
in GLBenchmark 2.7/EgyptHD by 7.44362% +/- 0.496939% (n=55) on a
Broadwell GT2.  Improves performance in a bunch of other microbenchmarks
by ~15% or so.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reported-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: mesa-stable@lists.freedesktop.org
2014-10-09 15:03:32 +02:00
Kenneth Graunke
d489a8a73d i965: Add a BRW_MOCS_PTE #define.
Like BDW_MOCS_WB and BDW_MOCS_WT, this specifies that we want to use all
three caches (L3, LLC, and eLLC where available), but leaves the LLC
caching mode up to the kernel's page table entry.

This allows the kernel to pick WB/WT/UC based on whether it's using a
buffer for scanout.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: mesa-stable@lists.freedesktop.org
2014-10-09 15:02:55 +02:00
Kenneth Graunke
8892a6b79c mesa: Make _mesa_print_arrays use stderr.
These days, most driver debug output happens via stderr, not stdout.
Some applications (such as Xephyr) also appear to close stdout which
makes these messages go nowhere.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-10-09 15:02:42 +02:00
Michel Dänzer
7b4276d7ac r600g,radeonsi: Always use GTT again for PIPE_USAGE_STREAM buffers
Putting those in VRAM can cause long pauses due to buffers being moved
into / out of VRAM.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84662
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-10-09 18:11:44 +09:00
Eric Anholt
5a13522898 vc4: Optimize SF(ITOF(x)) -> SF(x).
This is a common production of st_glsl_to_tgsi, because CMP takes a float
argument.
2014-10-09 11:01:18 +02:00
Eric Anholt
00a9aebfe0 vc4: Add some optimization of FADD(FSUB(0, x)).
This is a common production of st_glsl_to_tgsi, which uses negate flags on
source arguments to handle subtraction.
2014-10-09 11:01:18 +02:00
Eric Anholt
67aea92964 vc4: Mostly fix offset calculation for NPOT mipmap levels.
The non-base NPOT levels are stored as POT-aligned images.  We get that
POT alignment by minifying the POT-aligned base level.

This means that level strides are also POT aligned, so we have to tell the
rendering mode config that our resource is larger than the actual
requested area.

Fixes the fbo-generatemipmap-formats NPOT cases.  Regresses
depthstencil-render-miplevels 273 * -- the texture presentation now works
(where it was completely broken before), but it looks like there's some
overflow of image bounds happening at the lower miplevels.
2014-10-09 11:01:09 +02:00
Eric Anholt
0b96a086cb vc4: Move the mirrored kernel code to a kernel/ directory.
Now this whole setup matches the kernel's file layout much more closely.
2014-10-09 09:46:39 +02:00
Eric Anholt
ef9914aa74 vc4: Enable LIT lowering in TGSI instead of our own code.
This brings us the -128/128 clamping on the w component.
2014-10-08 22:47:39 +02:00
Eric Anholt
9773d45908 vc4: Fix scalar math opcodes to replicate their result from the X channel.
Thanks to robclark for pointing out that I was probably failing to do this
when I reported a "bug" in his lowering code.
2014-10-08 22:47:39 +02:00
Chia-I Wu
4e50a32be6 ilo: fix rectlist on GEN7+
It was broken by 343b014b57.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-10-09 03:37:04 +08:00
Eric Anholt
581418585e vc4: Add support for two-sided color.
It's fairly easy, thanks to Rob Clark's lowering code.  Fixes
two-sided-lighting and 4 vertex-program-two-side testcases, while
regressing 8 testcases that involve enabling two-sided color while only
initializing one of the two colors in the VS.  If you're enabling two
sided color, it's of course expected that you really do set up both
colors, so this is still an improvement (and when we set up a linker for
TGSI, we'll hopefully fix those 8 fails).
2014-10-08 17:45:16 +02:00
Eric Anholt
4dccdbf5cb vc4: Enable POW lowering in TGSI instead of our own code. 2014-10-08 17:42:59 +02:00
Eric Anholt
1aef5a337f vc4: Enable DP lowering in TGSI instead of our own code. 2014-10-08 17:42:59 +02:00
Eric Anholt
4f6e4c7370 vc4: Start using tgsi_lowering for opcodes we haven't supported before. 2014-10-08 17:42:59 +02:00
Eric Anholt
f9854e169f gallium: Rename freedreno parts of tgsi_lowering.[ch].
Acked-by: Rob Clark <robclark@freedesktop.org>
2014-10-08 17:42:59 +02:00
Eric Anholt
19df602b39 gallium: Reformat tgsi_lowering.c for the normal style.
Acked-by: Rob Clark <robclark@freedesktop.org>
2014-10-08 17:42:59 +02:00
Eric Anholt
3141dc8e87 gallium: Copy fd_lowering.[ch] to tgsi_lowering.[ch] for code sharing.
Lots of drivers need to transform the weird instructions in TGSI into
reasonable scalar ops, and this code can make those translations
canonical.

Acked-by: Rob Clark <robclark@freedesktop.org>
2014-10-08 17:42:59 +02:00
Eric Anholt
84caf5a861 vc4: Set unused raddr fields to QPU_R_NOP.
The simulator assertion fails if you have a write to a reg and then a read
(for example, in the NOP side of an instruction), even if the read isn't
used for anything.  By setting unused raddrs to NOP, we avoid the problem
(since only the physical registers are tracked).
2014-10-08 17:42:59 +02:00
Eric Anholt
48af7426f2 vc4: Abstract out the field-merging logic for instructions.
I'm going to be doing the same logic for some more fields next.
2014-10-08 17:42:59 +02:00
Niels Ole Salscheider
acdcef6788 r600: Use DMA transfers in r600_copy_global_buffer
v2: Do not demote items that are already in the pool

Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
2014-10-07 15:59:43 -04:00
Iago Toral Quiroga
fd31628c49 glsl: Optimize min/max expression trees
Original patch by Petri Latvala <petri.latvala@intel.com>:

Add an optimization pass that drops min/max expression operands that
can be proven to not contribute to the final result. The algorithm is
similar to alpha-beta pruning on a minimax search, from the field of
AI.

This optimization pass can optimize min/max expressions where operands
are min/max expressions. Such code can appear in shaders by itself, or
as the result of clamp() or AMD_shader_trinary_minmax functions.

This optimization pass improves the generated code for piglit's
AMD_shader_trinary_minmax tests as follows:

total instructions in shared programs: 75 -> 67 (-10.67%)
instructions in affected programs:     60 -> 52 (-13.33%)
GAINED:                                0
LOST:                                  0

All tests (max3, min3, mid3) improved.

A full shader-db run:

total instructions in shared programs: 4293603 -> 4293575 (-0.00%)
instructions in affected programs:     1188 -> 1160 (-2.36%)
GAINED:                                0
LOST:                                  0

Improvements happen in Guacamelee and Serious Sam 3. One shader from
Dungeon Defenders is hurt by shader-db metrics (26 -> 28), because of
dropping of a (constant float (0.00000)) operand, which was
compiled to a saturate modifier.

Version 2 by Iago Toral Quiroga <itoral@igalia.com>:

Changes from review feedback:
- Squashed various cosmetic changes sent by Matt Turner.
- Make less_all_components return an enum rather than setting a class member.
  (Suggested by Matt Turner). Also, renamed it to compare_components.
- Make less_all_components, smaller_constant and larger_constant static.
  (Suggested by Matt Turner)
- Change minmax_range to call its limits "low" and "high" instead of
  "range[0]" and "range[1]". (Suggested by Connor Abbott).
- Use ir_builder swizzle helpers in swizzle_if_required(). (Suggested by
  Connor Abbott).
- Make the logic clearer by rearranging the code and commenting.
  (Suggested by Connor Abbott).
- Added comment to explain why we need to recurse twice. (Suggested by
  Connor Abbott).
- If we cannot prune an expression, do not return early. Instead, attempt
  to prune its children. (Suggested by Connor Abbott).

Other changes:
- Instead of having a global "valid" visitor member, let the various functions
  that can determine this status return a boolean and check for its value
  to decide what to do in each case. This is more flexible and allows us to
  recurse into children of parents that could not be pruned due to invalid
  ranges (so related to the last bullet in the review feedback).
- Make sure we always check if a range is valid before working with it. Since
  any use of get_range, combine_range or range_intersection can invalidate
  a range we should check for this situation every time we use any of these
  functions.

Version 3 by Iago Toral Quiroga <itoral@igalia.com>:

Changes from review feedback:
- Now we can make get_range, combine_range and range_intersection static too
  (suggested by Connor Abbott).
- Do not return NULL when looking for the smaller or larger constant in
  mixed vector constants. Instead, produce a new constant by doing a
  component-wise minmax. With this we can also remove the validations when
  we call into these functions (suggested by Connor Abbott).
- Add a comment explaining the meaning of the baserange argument in
  prune_expression (suggested by Connor Abbott).

Other changes:
- Eliminate minmax expressions operating on constant vectors with mixed values
  by resolving them.

No piglit regressions observed with Version 3.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76861

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2014-10-07 12:37:51 +02:00
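The idea behind fd31628c49 can be sketched with value ranges: if one operand of a max can never exceed the other (or one operand of a min can never go below the other), it does not contribute and can be dropped. The code below is a simplified range check at a single node, not the actual GLSL IR pass, which also threads a base range down the tree and handles vectors. All type and function names here are hypothetical.

```c
#include <assert.h>
#include <math.h>    /* INFINITY macro only */
#include <stddef.h>

enum expr_kind { EXPR_VAR, EXPR_CONST, EXPR_MIN, EXPR_MAX };

struct expr {
    enum expr_kind kind;
    float value;               /* used by EXPR_CONST */
    const struct expr *a, *b;  /* used by EXPR_MIN / EXPR_MAX */
};

struct range { float low, high; };

static float minf(float x, float y) { return x < y ? x : y; }
static float maxf(float x, float y) { return x > y ? x : y; }

/* Conservative bounds on the value of an expression. */
static struct range get_range(const struct expr *e)
{
    struct range r = { -INFINITY, INFINITY };   /* unknown variable */
    if (e->kind == EXPR_CONST) {
        r.low = r.high = e->value;
    } else if (e->kind == EXPR_MIN || e->kind == EXPR_MAX) {
        struct range ra = get_range(e->a), rb = get_range(e->b);
        if (e->kind == EXPR_MIN) {
            r.low = minf(ra.low, rb.low);
            r.high = minf(ra.high, rb.high);
        } else {
            r.low = maxf(ra.low, rb.low);
            r.high = maxf(ra.high, rb.high);
        }
    }
    return r;
}

/* Return the operand that fully determines e, or e itself. */
static const struct expr *prune(const struct expr *e)
{
    if (e->kind != EXPR_MIN && e->kind != EXPR_MAX)
        return e;
    struct range ra = get_range(e->a), rb = get_range(e->b);
    if (e->kind == EXPR_MAX) {
        if (ra.high <= rb.low) return prune(e->b);  /* a never exceeds b */
        if (rb.high <= ra.low) return prune(e->a);
    } else {
        if (ra.low >= rb.high) return prune(e->b);  /* a never goes below b */
        if (rb.low >= ra.high) return prune(e->a);
    }
    return e;   /* nothing provable at this node */
}

static void example_prune(void)
{
    struct expr x  = { EXPR_VAR,   0.0f, NULL, NULL };
    struct expr c3 = { EXPR_CONST, 3.0f, NULL, NULL };
    struct expr c1 = { EXPR_CONST, 1.0f, NULL, NULL };
    struct expr mx = { EXPR_MAX,   0.0f, &x,  &c3 };  /* max(x, 3) */
    struct expr mn = { EXPR_MIN,   0.0f, &mx, &c1 };  /* min(max(x, 3), 1) */
    assert(prune(&mn) == &c1);   /* max(x, 3) >= 3 > 1, so only 1 matters */
    assert(prune(&mx) == &mx);   /* x is unbounded, nothing to drop */
}
```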
Tapani Pälli
16b53005a7 glsl: do not emit error for non written varyings on OpenGL ES
Patch fixes following test case from 'shaders-with-varyings' WebGL
conformance suite: "vertex shader with unused varying and fragment
shader with used varying must succeed"

v2: still emit a warning if the condition happens (Ian)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-07 08:28:51 +03:00
Michel Dänzer
be0a994fb8 radeonsi: Use dummy pixel shader if compilation of the real shader failed
Instead of crashing.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79155#c5
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-07 12:07:13 +09:00
Chia-I Wu
f358462640 ilo: let shaders determine surface counts
When a shader needs N surfaces, we should upload N surfaces and not depend on
how many are bound.  This commit is larger than it should be because we did
not export how many surfaces a shader uses before.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-10-06 15:10:30 +08:00
Chia-I Wu
ca824e6940 ilo: let shaders determine sampler counts
When a shader needs N samplers, we should upload N samplers and not depend on
how many are bound.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-10-04 23:18:51 +08:00
Marek Olšák
0c4bc1e292 tgsi: change tgsi_shader_info::properties to a one-dimensional array
Reviewed-by: Roland Scheidegger <sroland@vmware.com>

v2: fix svga too
2014-10-04 15:36:39 +02:00
Marek Olšák
1f6c0b55df radeonsi: set number of userdata SGPRs of GS copy shader to 4
It only needs the constant buffer with clip planes and read-write resources
for the GS->VS ring and streamout. That's 2 pointers.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-04 15:16:15 +02:00
Marek Olšák
68d36c0bb5 radeonsi: pass the GS shader directly to si_generate_gs_copy_shader
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-04 15:16:15 +02:00
Marek Olšák
aeb05f011e radeonsi: set LLVMByValAttribute for all descriptor arrays
I hope this is correct.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-04 15:16:15 +02:00
Marek Olšák
91f1a79f78 radeonsi: make the vertex shader key smaller
We only support 16 vertex attribs, not 32.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-04 15:16:14 +02:00
Marek Olšák
90611297fa radeonsi: don't flush shader caches when building PM4 shader states
This is the wrong place to flush caches, to say the least.

I don't think we need to flush the instruction caches if we don't patch
shaders with DMA.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-04 15:16:14 +02:00
Marek Olšák
10e386f4aa radeonsi: remove interp_at_sample from the key, use TGSI_INTERPOLATE_LOC_SAMPLE
st/mesa has the same flag in its shader key, we don't need to do it
in the driver anymore.

Instead, use TGSI_INTERPOLATE_LOC_SAMPLE, which is what st/mesa sets.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-04 15:16:14 +02:00
Marek Olšák
0a2d6f0c4e radeonsi: move geometry shader properties from si_shader to si_shader_selector
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-04 15:16:14 +02:00
Marek Olšák
54de709911 radeonsi: always compile shaders on demand
The first compiled shader is sometimes useless, because the key doesn't match
the key for the draw call where it's used.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-04 15:16:14 +02:00
Marek Olšák
6c9f61c97e radeonsi: remove unused variable si_shader::gs_input_prim
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-04 15:16:14 +02:00
Marek Olšák
7dc0164192 tgsi: remove some not so useful variables from tgsi_shader_info 2014-10-04 15:16:14 +02:00
Marek Olšák
8860584045 radeonsi: get fs_write_all from tgsi_shader_info directly
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-04 15:16:14 +02:00
Marek Olšák
8908fae243 tgsi: simplify shader properties in tgsi_shader_info
Use an array of properties indexed by TGSI_PROPERTY_* definitions.
2014-10-04 15:16:14 +02:00
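A sketch of the layout 8908fae243 describes: one array of property values, indexed directly by the property id. The enum values and struct name below are hypothetical stand-ins for the TGSI_PROPERTY_* definitions and tgsi_shader_info.

```c
#include <assert.h>

/* Hypothetical stand-ins for the TGSI_PROPERTY_* definitions. */
enum shader_property {
    PROP_GS_INPUT_PRIM,
    PROP_GS_OUTPUT_PRIM,
    PROP_FS_COLOR0_WRITES_ALL_CBUFS,
    PROP_COUNT
};

struct shader_info_sketch {
    unsigned properties[PROP_COUNT];   /* one slot per property id */
};

static void set_property(struct shader_info_sketch *info,
                         enum shader_property name, unsigned value)
{
    assert((unsigned)name < PROP_COUNT);
    info->properties[name] = value;
}

static unsigned get_property(const struct shader_info_sketch *info,
                             enum shader_property name)
{
    assert((unsigned)name < PROP_COUNT);
    return info->properties[name];
}

static void example_properties(void)
{
    struct shader_info_sketch info = { { 0 } };
    set_property(&info, PROP_FS_COLOR0_WRITES_ALL_CBUFS, 1);
    assert(get_property(&info, PROP_FS_COLOR0_WRITES_ALL_CBUFS) == 1);
    assert(get_property(&info, PROP_GS_INPUT_PRIM) == 0);
}
```

With this layout a driver can read a property in one lookup instead of scanning a list of name/value pairs.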
Marek Olšák
5233568861 radeonsi: get tgsi_shader_info only once before compilation
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-04 15:16:14 +02:00
Marek Olšák
af4f5a7c97 gallium/util: add util_bitcount64
I'll need this in radeonsi.

v2: use __builtin_popcountll if available

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-04 15:16:14 +02:00
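A minimal sketch of a 64-bit population count in the spirit of af4f5a7c97: use __builtin_popcountll where the compiler provides it and fall back to a portable loop otherwise. The function names are illustrative; the real helper is util_bitcount64, and its exact feature test is not shown in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Portable fallback: clear the lowest set bit until none remain. */
static unsigned bitcount64_portable(uint64_t n)
{
    unsigned count = 0;
    while (n) {
        n &= n - 1;
        count++;
    }
    return count;
}

static unsigned bitcount64(uint64_t n)
{
#if defined(__GNUC__)   /* assumption: GCC-compatible compilers only */
    return (unsigned)__builtin_popcountll(n);
#else
    return bitcount64_portable(n);
#endif
}

static void example_bitcount64(void)
{
    assert(bitcount64(0) == 0);
    assert(bitcount64(0xFFull) == 8);
    assert(bitcount64(1ull << 63) == 1);
    assert(bitcount64(~0ull) == bitcount64_portable(~0ull));
}
```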
Marek Olšák
837907b8b3 radeonsi: fix CS tracing and remove excessive CS dumping 2014-10-04 15:16:14 +02:00
Ilia Mirkin
c74be01e80 gk110/ir: add dnz flag emission for fmul/fmad
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-10-03 20:37:59 -04:00
Ilia Mirkin
d58037ccf5 gm107/ir: add dnz emission for fmul
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
2014-10-03 20:37:59 -04:00
Brian Paul
90dc71b454 st/wgl: add WINAPI qualifiers on wgl function typedefs
Fixes a release build segfault when wglCreateContextAttribsARB()
calls the wglCreateContext() function.

Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matthew McClure <mcclurem@vmware.com>
2014-10-03 13:45:52 -06:00
Rob Clark
7297bdbd50 freedreno: query fixes
Fixes a few issues, including a potential empty-IB (which triggers gpu
hangs in piglit occlusion_query_meta_no_fragments)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-03 14:19:52 -04:00
Rob Clark
a262c601d3 freedreno/a3xx: handle VS only outputting BCOLOR
Possibly we should map the front color to black (zeroes).  But not sure
there is a way to do that without generating a shader variant.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-03 14:19:52 -04:00
Rob Clark
af4d088395 freedreno/ir3: fix lockups with lame FRAG shaders
Shaders like:

  FRAG
  PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
  DCL IN[0], GENERIC[0], PERSPECTIVE
  DCL OUT[0], COLOR
  DCL SAMP[0]
  DCL TEMP[0], LOCAL
  IMM[0] FLT32 {    0.0000,     1.0000,     0.0000,     0.0000}
    0: TEX TEMP[0], IN[0].xyyy, SAMP[0], 2D
    1: MOV OUT[0], IMM[0].xyxx
    2: END

cause unhappiness.  They have an IN[], but once this is compiled the
useless TEX instruction goes away, leaving a varying that is never
fetched, which makes the hw unhappy.

In the process fix a signed vs unsigned compare.  If the vertex shader
has max_reg=-1, MAX2() vs an unsigned would not give the desired result.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-03 14:19:52 -04:00
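The signed vs unsigned compare mentioned in af4d088395 can be reproduced in isolation. When an int holding -1 is compared against an unsigned value, C converts the int to unsigned first, so -1 becomes UINT_MAX and wins the MAX2(). The MAX2 definition and function names below are a sketch, not the freedreno code.

```c
#include <assert.h>
#include <limits.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Mixed signedness: vs_max_reg is implicitly converted to unsigned. */
static unsigned max_reg_mixed(int vs_max_reg, unsigned fs_max_reg)
{
    return MAX2(vs_max_reg, fs_max_reg);
}

/* Both operands signed: -1 compares as less than any valid register. */
static int max_reg_signed(int vs_max_reg, int fs_max_reg)
{
    return MAX2(vs_max_reg, fs_max_reg);
}

static void example_max_reg(void)
{
    assert(max_reg_mixed(-1, 3) == UINT_MAX);  /* not the desired result */
    assert(max_reg_signed(-1, 3) == 3);        /* desired result */
}
```

Compilers typically flag the first function with a sign-compare warning, which is one way such bugs get noticed.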
Matt Turner
cabc93c5ad i965/compaction: Disable compaction on SNB temporarily.
Will investigate after XDC.
2014-10-03 10:41:57 -07:00
Matt Turner
0d5c9bf1e4 Revert "i965: Emit ELSE/ENDIF JIP with type D on Gen 7."
This reverts commit 54e30dbf4d.

Will investigate after XDC.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84557
2014-10-03 10:02:24 -07:00
Matt Turner
b59db8e0f0 i965/fs: Remove dead generate_rep_fb_write prototype.
Added in commit f9dc7aab.
2014-10-03 10:02:24 -07:00
Brian Paul
c7f0755caa mesa: fix spurious wglGetProcAddress / GL_INVALID_OPERATION error
On Windows, the Piglit primitive-restart test was failing a
glGetError()==0 assertion when it was run w/out any command line
arguments.  Piglit's all.py script only runs primitive-restart
with arguments so this case isn't normally hit during a full
piglit run.

The basic problem is Microsoft's opengl32.dll calls glFlush
from wglGetProcAddress() and Piglit uses wglGetProcAddress() to
resolve glPrimitiveRestartNV() which is called inside glBegin/End.
See comments in the code for more info.

Plus, improve the comments for _mesa_alloc_dispatch_table().

Cc: <mesa-stable@lists.freedesktop.org>
Acked-by: Sinclair Yeh <syeh@vmware.com>
2014-10-03 10:04:48 -06:00
Ilia Mirkin
33c9ad97bf freedreno/ir3: add TXF support
Still failing a bunch of the fairly picky texelFetch tests, but the
1D(Array) ones are full passes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-02 23:30:47 -04:00
Ilia Mirkin
e6acf3ac24 freedreno/ir3: add TXD support and expose ARB_shader_texture_lod
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-02 23:30:47 -04:00
Ilia Mirkin
c49107c889 freedreno/ir3: add texture offset support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-02 23:30:47 -04:00
Ilia Mirkin
5bba74c64b freedreno/ir3: shadow comes before array
Experimentally, this makes *ArrayShadow tex-miplevel-selection tests
pass.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-02 23:30:47 -04:00
Ilia Mirkin
81b34e4461 freedreno/ir3: make TXQ return integers, not floats
We're still doing something wrong for array textures.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-02 23:30:47 -04:00
Ilia Mirkin
c4e2a196c3 freedreno/ir3: add UMAD support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-02 23:30:47 -04:00
Ilia Mirkin
347bc197a6 freedreno/ir3: add ISSG support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-02 23:30:47 -04:00
Ilia Mirkin
ad5db64e7e freedreno/ir3: add MOD support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-02 23:30:47 -04:00
Ilia Mirkin
cab3cb1d71 freedreno/ir3: add UMOD support, based on UDIV
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-02 23:30:47 -04:00
Ilia Mirkin
8f7d01c2cb freedreno/ir3: add IDIV/UDIV support
Logic shamelessly copied from nv50 lowering pass.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-02 23:30:47 -04:00
Michel Dänzer
ed03747e6a radeonsi: Clear sampler view flags when binding a buffer
Fixes assertion failure while running the Unreal Engine 4 Elemental demo:

.../si_blit.c:322:si_decompress_color_textures: Assertion `tex->cmask.size || tex->fmask.size' failed.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-03 11:15:38 +09:00
Eric Anholt
ca00070259 vc4: Add support for framebuffer sRGB encoding. 2014-10-02 18:29:18 -07:00
Eric Anholt
24d9980562 vc4: Add support for sampling from sRGB.
This isn't perfect -- the filtering is happening on the srgb values, and
we're decoding afterwards, which is not what you want.  I think that's the
cause of some additional texwrap(GL_CLAMP, LINEAR) failures, though many
other texwrap tests on srgb start to pass since unfiltered values come out
correct.
2014-10-02 18:28:45 -07:00
Ilia Mirkin
3dd9a0d6fd freedreno/ir3: avoid fan-in sources referring to same instruction
Since the RA has to be done s.t. each one gets its own (adjacent)
register, it would complicate matters if instructions were allowed to be
repeated. This enables copy-propagation use in situations where
previously that might have happened.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-02 21:05:50 -04:00
Rob Clark
f5eeb8a6dc freedreno/a3xx: emit all immediates in one shot
Makes the command stream a bit tighter when there are lots of
immediates.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-02 21:05:50 -04:00
Ilia Mirkin
be00852bae freedreno: instanced drawing/compute not yet supported
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-02 21:05:50 -04:00
Dave Airlie
8df3c02cdc mesa: fix GetTexImage for 1D array depth textures
While running piglit in virgl, I hit an assert in the intel driver.

"qemu-system-x86_64: intel_tex.c:219: intel_map_texture_image: Assertion `tex_image->TexObject->Target != 0x8C18 || h == 1' failed."

Thanks to Eric and Ken for pointing me in the right direction.

Fix get_tex_depth to do the same fixup as get_tex_rgba does
for 1D array textures.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-10-03 10:37:55 +10:00
Tomasz Figa
b4ffd19e6c st/mesa: Fix paths used in Android builds
With current makefiles the build fails because source and build paths
are generated incorrectly. With Android build system the top_srcdir and
top_builddir variables are undefined and all paths are relative to where
Android.mk is located. This ends up with paths like
external/mesa/src/mesa/src/mesa/ for both source and build paths, which
are obviously wrong.

This patch fixes this by overriding resulting SRCDIR and BUILDDIR
variables with empty string, so that paths end up being relative to
Android.mk file again. Appending correct build path to generated files
is already done in Android.gen.mk.

Signed-off-by: Tomasz Figa <tomasz.figa@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-03 01:25:35 +01:00
Tomasz Figa
98445fd25e st/mesa: Generate format_info.c in Android builds
Current Android makefiles lack generation of format_info.c, which is
a dependency of main/format.c. This patch adds necessary code to
Android.gen.mk.

Signed-off-by: Tomasz Figa <tomasz.figa@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-03 01:25:32 +01:00
Tomasz Figa
d703abf735 util: Include in Android builds
This patch fixes Android build failures by including src/util directory
in compilation. Files inside of this directory are compiled into
libmesa_util static library and linked with resulting libGLES_mesa.

Signed-off-by: Tomasz Figa <tomasz.figa@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-03 01:25:28 +01:00
Jason Ekstrand
493bfa54a5 i965/fs: Use the correct base_mrf for spilling pairs in SIMD8
Before, we were hard-coding the base_mrf based on the dispatch width rather
than the number of registers spilled at a time.  This caused us to emit
instructions with a base_mrf of 14 and an mlen of 3, so we used the magical
non-existent m16 register.  This fixes the problem.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-02 16:38:25 -07:00
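The arithmetic behind 493bfa54a5 is simple to check: a message occupies registers base_mrf through base_mrf + mlen - 1, so a base of 14 with a length of 3 ends at m16. The sketch below assumes 16 message registers (m0 to m15), as implied by the commit message; the helper names are hypothetical, and the base values actually chosen by the fix are not stated in the message.

```c
#include <assert.h>
#include <stdbool.h>

#define MRF_COUNT 16   /* assumption: m0..m15 exist, m16 does not */

/* Index of the last message register a message touches. */
static int last_mrf(int base_mrf, int mlen)
{
    assert(mlen >= 1);
    return base_mrf + mlen - 1;
}

static bool message_fits(int base_mrf, int mlen)
{
    return base_mrf >= 0 && last_mrf(base_mrf, mlen) < MRF_COUNT;
}

/* Highest base that still keeps an mlen-register message in range. */
static int highest_base_mrf(int mlen)
{
    assert(mlen >= 1 && mlen <= MRF_COUNT);
    return MRF_COUNT - mlen;
}

static void example_base_mrf(void)
{
    assert(last_mrf(14, 3) == 16);       /* the bad case from the commit */
    assert(!message_fits(14, 3));
    assert(message_fits(14, 2));         /* a shorter message still fits */
    assert(highest_base_mrf(3) == 13);
    assert(message_fits(highest_base_mrf(3), 3));
}
```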
Jason Ekstrand
50d0e2e118 i965/fs: Add a MAX_GRF_SIZE define and use it various places
Previously, we had a MAX_SAMPLER_MESSAGE_SIZE which we used instead.
However, some FB write messages can validly be longer than this so we need
something different.  Since MAX_SAMPLER_MESSAGE_SIZE is validly useful on
its own, we leave it alone and add a new MAX_GRF_SIZE that's big enough for
FB writes.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84539
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-02 14:14:25 -07:00
Jason Ekstrand
b33e5465a7 i965/fs: Use the actual register width in brw_reg_from_fs_reg
This fixes a bug where 1-wide operations don't properly translate down to
1-wide instructions.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-02 13:17:03 -07:00
Jason Ekstrand
75986830b4 i965/fs_fp: Use null_reg from fs_visitor instead of rolling our own
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84529
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-02 13:17:03 -07:00
Rob Clark
7309c6126f freedreno/a3xx: handle large shader program sizes
Above a certain limit use CACHE mode instead of BUFFER mode.  This
should solve gpu hangs with large shader programs.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-02 13:57:07 -04:00
Rob Clark
d01ee5923d freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-02 13:57:07 -04:00
Ilia Mirkin
3dc47c5960 freedreno: dual-source render targets are not supported
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-02 13:57:07 -04:00
Ilia Mirkin
786f01c492 gallium/hud: use u_sampler_view_default_template helper
The existing code was not setting several fields, most importantly the
target, which is required on nv50/nvc0.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-02 12:18:21 -04:00
Iago Toral Quiroga
db8cd4d519 glsl: Fix memory leak in builtin_builder::_image_prototype.
in_var calls the ir_variable constructor, which dups the variable name.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-02 15:39:05 +02:00
Tapani Pälli
f4b4ae8c24 mesa: relax draw api validation on ES2
Patch fixes failing test in WebGL conformance test
'point-no-attributes' when running Chrome on OpenGL ES.
(Shader program may draw points using constant data in shader.)

No Piglit regressions.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-02 11:55:13 +03:00
Ilia Mirkin
3914dc579e glsl: make consistent use of DECLARE_RALLOC_CXX_OPERATORS
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-02 00:59:35 -04:00
Eric Anholt
4111b1d54b vc4: Fix the mapping of the minification filter to HW values.
They're actually as documented in the HW specs and the GL mipmapping enums
order.  Fixes fbo-generatemipmap-filtering, and some other tests where we
were off by a few bits due to unexpected linear filtering.
2014-10-01 17:03:36 -07:00
Eric Anholt
75f8e0bc2a vc4: Make the last static array in vc4_program.c dynamically sized. 2014-10-01 17:03:35 -07:00
Eric Anholt
ebff93ac19 vc4: Fix some broken indentation. 2014-10-01 17:03:35 -07:00
Eric Anholt
d7a0502a54 vc4: Add support for the FACE semantic.
Fixes glsl-fs-frontfacing.
2014-10-01 17:03:35 -07:00
Eric Anholt
1bf2d17a60 vc4: Add support for TGSI_OPCODE_CLAMP.
This will be used by the shared LIT lowering code.
2014-10-01 17:03:35 -07:00
Eric Anholt
0c8c7d32f0 vc4: Fix compiler warning 2014-10-01 17:03:35 -07:00
Anuj Phogat
25266b2c11 meta: Fix make check failures in setup_glsl_msaa_blit_scaled_shader()
introduced by commit 68ee950.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reported-by: Mark Janes <mark.a.janes@intel.com>
2014-10-01 15:27:31 -07:00
Brian Paul
44b500f5f2 mesa: fix _mesa_alloc_dispatch_table() declaration
Insert 'void' parameter to match declaration in api_exec.h.  Trivial.
2014-10-01 15:17:47 -06:00
Roland Scheidegger
dea0fcf4e6 meta: (trivial) remove accidental double semicolon 2014-10-01 23:14:46 +02:00
Anuj Phogat
4330fa970b i965: Enable EXT_framebuffer_multisample_blit_scaled for gen8
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-10-01 12:04:15 -07:00
Anuj Phogat
68ee950c78 meta: Implement ext_framebuffer_multisample_blit_scaled extension
The extension enables doing a multisample buffer resolve and buffer
scaling using a single glBlitFramebuffer() call. Currently, we
have this extension implemented in BLORP which is only used by
SNB and IVB. This patch implements the extension in meta path
which makes it available to Broadwell.

Implementation features:
 - Supports scaled resolves of 2X, 4X and 8X multisample buffers.

 - Avoids unnecessary shader compilations by storing the pre compiled
   shaders for each supported sample count.

 - Uses bilinear filtering for both GL_SCALED_RESOLVE_FASTEST_EXT and
   GL_SCALED_RESOLVE_NICEST_EXT filter options. This is an allowed
   behavior in the extension's spec.

 - I tried doing bicubic filtering for GL_SCALED_RESOLVE_NICEST_EXT
   filter. It made the edges in the image look a little smoother but
   the image gets blurred causing no overall quality improvement.
   For now I have dropped the idea of doing different filtering for
   nicest filter.

V2:
 - Minor changes to simplify the fragment shader.
 - Refactor the code to move i965 specific sample_map computation out
   of Meta. We now use ctx->Const.SampleMap{2,4,8}x variables initialized
   by the driver.
 - Use a simple msaa resolve shader for scaled resolves with scaling
   factor = 1.0.

V3:
 - Make changes to create a string out of ctx->Const.SampleMap{2,4,8}x
   variables and use it in fragment shader.

V4:
 - Make changes to use uint8_t type ctx->Const.SampleMap{2,4,8}x
   variables.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-10-01 12:04:15 -07:00
Anuj Phogat
7a4790148c i965: Initialize the SampleMap{2,4,8}x variables
with values specific to Intel hardware.

V2: Define and use gen6_get_sample_map() function to initialize
    the variables.

V3: Change the function name to gen6_set_sample_maps() and use
    memcpy() to fill in the data.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-10-01 12:04:15 -07:00
Anuj Phogat
38cd40faab mesa: Add new variables in gl_context to store sample layout
SampleMap{2,4,8}x variables are used in later patches to implement
EXT_framebuffer_multisample_blit_scaled extension.

V2: Use integer array instead of a string.
    Bump up the comment.

V3: Use uint8_t type array.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-10-01 12:04:15 -07:00
Leo Liu
4f7916ab4f st/va: implement vlVa(Query|Create|Get|Put|Destroy)Image
This patch implements the image functions, which copy data between
video surfaces and user buffers. This supports SW decode and other
video output.
v2: fix buffer size for odd-sized image case
    expose I420 format as well
v3: fix YUV 4:2:2 format data buffer size
    cleanup I420 format  exposure

Signed-off-by: Leo Liu <leo.liu@amd.com>
2014-10-01 13:21:36 -04:00
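The v2 note in 4f7916ab4f mentions a buffer size fix for odd-sized images without giving details. One common way such sizes go wrong, shown here purely as an assumption, is truncating the half-resolution chroma planes of a 4:2:0 format such as I420 instead of rounding them up. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* I420: full-size luma plane plus two chroma planes at half resolution,
 * with the chroma dimensions rounded up for odd sizes. */
static size_t i420_buffer_size(unsigned width, unsigned height)
{
    size_t chroma_w = ((size_t)width + 1) / 2;
    size_t chroma_h = ((size_t)height + 1) / 2;
    return (size_t)width * height + 2 * chroma_w * chroma_h;
}

/* Truncating variant, which undersizes the buffer for odd dimensions. */
static size_t i420_buffer_size_truncating(unsigned width, unsigned height)
{
    size_t chroma_w = width / 2;
    size_t chroma_h = height / 2;
    return (size_t)width * height + 2 * chroma_w * chroma_h;
}

static void example_i420_size(void)
{
    /* Even sizes: both variants agree (16 luma + 2 * 4 chroma). */
    assert(i420_buffer_size(4, 4) == 24);
    assert(i420_buffer_size_truncating(4, 4) == 24);
    /* Odd sizes: 25 luma + 2 * 9 chroma versus 25 + 2 * 4. */
    assert(i420_buffer_size(5, 5) == 43);
    assert(i420_buffer_size_truncating(5, 5) == 33);
}
```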
Christian König
7913c8943a st/va: implement Picture functions for mpeg2 h264 and vc1
This patch implements the codecs for mpeg2, h264 and vc1: it
populates the codec parameters and passes them to the HW driver.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
2014-10-01 13:21:36 -04:00
Christian König
1be5515838 st/va: implement Context Surface and Buffer
This patch implements context management and ties it to the HW driver,
along with functions for video surface management and for
application data memory buffer management.

implemented functions:
vlVa(Create|Destroy)Context
vlVa(Create|Destroy|Put)Surfaces
vlVa(Create|Destroy)Buffer

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
2014-10-01 13:21:36 -04:00
Christian König
2825ef3abf st/va: implement vlVa(Create|Destroy|Query|Get)Config
This patch lets applications query the configuration,
such as profiles, entrypoints, and attributes.

v2: fix missing profile with query

Signed-off-by: Michael Varga <michael.varga@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
2014-10-01 13:21:36 -04:00
Christian König
3867933ecb st/va: skeleton VAAPI state tracker
This patch adds a skeleton VA-API state tracker,
which is filled with life in the subsequent patches.

v2: fixes in configure.ac and va state_tracker Makefile.am
v3: do not link against libva.
    detect libva version, and correctly set driver entrypoint name.
    rebase(cleanup) targets/va/Makefile.am
v4: cleanup va version auto detection
    add back targets/va/va.sym

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-10-01 13:21:36 -04:00
Leo Liu
0eb8f89981 st/vdpau: move common functions to util
Break out these functions so that they can be shared with other
state trackers.  They will be used in subsequent patches for the new
VA-API state tracker.

Signed-off-by: Leo Liu <leo.liu@amd.com>
2014-10-01 13:21:36 -04:00
Rob Clark
204dd73c99 freedreno: max-texture-lod-bias should be 15.0f
Fixes piglit lodbias test.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-01 07:28:06 -04:00
Kenneth Graunke
95073a2dca mesa: Avoid flagging _NEW_VIEWPORT on redundant viewport updates.
Cuts the number of i965 color calculator viewport uploads by 100x
(11017983 -> 113385) in 'x11perf -gc' with Glamor in Xephyr.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-01 01:08:26 -07:00
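The pattern in 95073a2dca is to compare the incoming state against the stored state and return early when nothing changed, so the dirty bit is only raised for real updates. The struct layout, flag name and function below are a hypothetical sketch, not core Mesa code.

```c
#include <assert.h>
#include <stdint.h>

#define NEW_VIEWPORT (1u << 0)   /* hypothetical dirty bit */

struct viewport { float x, y, width, height; };

struct context {
    struct viewport vp;
    uint32_t new_state;          /* accumulated dirty flags */
};

static void set_viewport(struct context *ctx,
                         float x, float y, float width, float height)
{
    if (ctx->vp.x == x && ctx->vp.y == y &&
        ctx->vp.width == width && ctx->vp.height == height)
        return;                  /* redundant update: flag nothing */

    ctx->vp = (struct viewport){ x, y, width, height };
    ctx->new_state |= NEW_VIEWPORT;
}

static void example_viewport(void)
{
    struct context ctx = { { 0.0f, 0.0f, 0.0f, 0.0f }, 0 };

    set_viewport(&ctx, 0.0f, 0.0f, 640.0f, 480.0f);
    assert(ctx.new_state & NEW_VIEWPORT);      /* first update is real */

    ctx.new_state = 0;                         /* driver consumed the flag */
    set_viewport(&ctx, 0.0f, 0.0f, 640.0f, 480.0f);
    assert(ctx.new_state == 0);                /* same values: no re-upload */

    set_viewport(&ctx, 0.0f, 0.0f, 800.0f, 600.0f);
    assert(ctx.new_state & NEW_VIEWPORT);      /* changed values: flagged */
}
```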
Kenneth Graunke
0a1730200e i965: Drop CACHE_NEW_VS_PROG from the gen7_sf_state atom.
I believe when I wrote this code, gen6_sf_state used CACHE_NEW_VS_PROG,
which has since been replaced by BRW_NEW_VUE_MAP_GEOM_OUT.  It's not
needed here anyway - only SBE needs it.  Just a copy and paste mistake.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-01 01:08:07 -07:00
Kenneth Graunke
106e0db769 i965: Drop brwBindProgram driver hook.
This function flagged BRW_NEW_*_PROGRAM

When ctx->{Vertex,Geometry,Fragment}Program._Current changes, core Mesa
calls the BindProgram driver hook, which flagged BRW_NEW_*_PROGRAM.

However, brw_upload_state also checks for that changing, sets the same
flags, and also updates brw->fragment_program and so on.  So, this looks
to be entirely redundant.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-01 01:05:41 -07:00
Kenneth Graunke
e25a453b7f i965: Add missing /* BRW_NEW_FRAGMENT_PROGRAM */ comments.
I had to dig a bit to figure out why this was necessary.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-01 01:05:39 -07:00
Kenneth Graunke
3d31ed0d93 i965: Use "1ull" instead of "1" in BRW_NEW_* defines.
Now that the bitfield is a uint64_t, we should use 1ull.  Currently, we
only have 32 entries, so 1 works fine, but it's not future-proof.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-01 01:05:38 -07:00
Kenneth Graunke
a114f452ae i965: Use ~0ull when flagging all BRW_NEW_* dirty flags.
~0 is 0xFFFFFFFF, which only covers the first 32 bits.  We need all 64.
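
A minimal sketch of the difference, assuming an unsigned 32-bit operand (how
a plain `~0` widens depends on its signedness). The EXAMPLE_* flags and the
helper names are hypothetical and are not the driver's BRW_NEW_* values:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical dirty flags; the real BRW_NEW_* values are not shown here. */
#define EXAMPLE_NEW_LOW   (1ull << 3)
#define EXAMPLE_NEW_HIGH  (1ull << 40)   /* only representable in 64 bits */

/* All-ones mask built from a 32-bit unsigned value: it is zero-extended
 * when widened, so bits 32..63 stay clear. */
static uint64_t example_all_dirty_32(void)
{
    return (uint32_t)~0u;
}

/* All-ones mask built from a 64-bit constant: every bit is set. */
static uint64_t example_all_dirty_64(void)
{
    return ~0ull;
}

/* Returns nonzero when the given flag is set in the dirty mask. */
static int example_is_dirty(uint64_t dirty, uint64_t flag)
{
    assert(flag != 0);
    return (dirty & flag) != 0;
}
```

With the 32-bit mask, example_is_dirty() reports EXAMPLE_NEW_LOW but misses
EXAMPLE_NEW_HIGH; with the 64-bit mask it reports both.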

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-01 01:05:36 -07:00
Kenneth Graunke
5105f9a7ae i965: Fix INTEL_DEBUG=state to work with 64-bit dirty bits.
This will keep INTEL_DEBUG=state working when we add BRW_NEW_* bits
beyond 1 << 31.  We missed doing this when widening the driver flags
from uint32_t to uint64_t.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-01 01:05:35 -07:00
Kenneth Graunke
fbebd5e4a5 i965: Delete CACHE_NEW_BLORP_CONST_COLOR_PROG.
Unused since krh rewrote fast clears to use meta.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-01 01:05:24 -07:00
Chris Forbes
e4e3b0fc0d i965: Fix typo in comment
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-01 18:37:06 +13:00
Chris Forbes
d8c5c4f3e4 i965: Fix spelling of GEN7_SAMPLER_EWA_ANISOTROPIC_ALGORITHM
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-01 18:37:06 +13:00
Vinson Lee
6a238ac0b7 llvmpipe: Add missing LLVMGetGlobalContext() arg in lp_test_format.c.
Fix build error introduced with commit
eedbce9c63.

lp_test_format.c: In function ‘test_format_unorm8’:
lp_test_format.c:226:4: error: too few arguments to function ‘gallivm_create’
    gallivm = gallivm_create("test_module_unorm8");
    ^
In file included from ../../../../src/gallium/auxiliary/gallivm/lp_bld_format.h:38:0,
                 from lp_test_format.c:42:
../../../../src/gallium/auxiliary/gallivm/lp_bld_init.h:58:1: note: declared here
 gallivm_create(const char *name, LLVMContextRef context);
 ^

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84538
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2014-09-30 21:52:13 -07:00
Keith Packard
3202926746 glx/dri3: Provide error diagnostics when DRI3 allocation fails
Instead of just segfaulting in the driver when a buffer allocation fails,
report error messages indicating what went wrong so that we can debug things.

As a simple example, chromium wraps Mesa in a sandbox which doesn't allow
access to most syscalls, including the ability to create shared memory
segments for fences. Before, you'd get a simple segfault in mesa and your 3D
acceleration would fail. Now you get:

$ chromium --disable-gpu-blacklist
[10618:10643:0930/200525:ERROR:nss_util.cc(856)] After loading Root Certs, loaded==false: NSS error code: -8018
libGL: pci id for fd 12: 8086:0a16, driver i965
libGL: OpenDriver: trying /local-miki/src/mesa/mesa/lib/i965_dri.so
libGL: Can't open configuration file /home/keithp/.drirc: Operation not permitted.
libGL: Can't open configuration file /home/keithp/.drirc: Operation not permitted.
libGL error: DRI3 Fence object allocation failure Operation not permitted
[10618:10618:0930/200525:ERROR:command_buffer_proxy_impl.cc(153)] Could not send GpuCommandBufferMsg_Initialize.
[10618:10618:0930/200525:ERROR:webgraphicscontext3d_command_buffer_impl.cc(236)] CommandBufferProxy::Initialize failed.
[10618:10618:0930/200525:ERROR:webgraphicscontext3d_command_buffer_impl.cc(256)] Failed to initialize command buffer.

This made it pretty easy to diagnose the problem in the referenced bug report.

Bugzilla: https://code.google.com/p/chromium/issues/detail?id=415681
Signed-off-by: Keith Packard <keithp@keithp.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 21:23:04 -07:00
Keith Packard
f7a355556e glx/dri3: Use four buffers until X driver supports async flips
A driver which doesn't have async flip support will queue up flips without any
way to replace them afterwards. This means we've got a scanout buffer pinned
as soon as we schedule a flip and so we need another buffer to keep from
stalling.

When vblank_mode=0, if there are only three buffers we do:

        current scanout buffer = 0 at MSC 0

        Render frame 1 to buffer 1
        PresentPixmap for buffer 1 at MSC 1

                This is sitting down in the kernel waiting for vblank to
                become the next scanout buffer

        Render frame 2 to buffer 2
        PresentPixmap for buffer 2 at MSC 1

                This cannot be displayed at MSC 1 because the
                kernel doesn't have any way to replace buffer 1 as the pending
                scanout buffer. So, best case this will get displayed at MSC 2.

Now we block after this, waiting for one of the three buffers to become idle.
We can't use buffer 0 because it is the scanout buffer. We can't use buffer 1
because it's sitting in the kernel waiting to become the next scanout buffer
and we can't use buffer 2 because that's the most recent frame which will
become the next scanout buffer if the application doesn't manage to generate
another complete frame by MSC 2.

With four buffers, we get:

        current scanout buffer = 0 at MSC 0

        Render frame 1 to buffer 1
        PresentPixmap for buffer 1 at MSC 1

                This is sitting down in the kernel waiting for vblank to
                become the next scanout buffer

        Render frame 2 to buffer 2
        PresentPixmap for buffer 2 at MSC 1

                This cannot be displayed at MSC 1 because the
                kernel doesn't have any way to replace buffer 1 as the pending
                scanout buffer. So, best case this will get displayed at MSC
                2. The X server will queue this swap until buffer 1 becomes
                the scanout buffer.

        Render frame 3 to buffer 3
        PresentPixmap for buffer 3 at MSC 1

                As soon as the X server sees this, it will replace the pending
                buffer 2 swap with this swap and release buffer 2 back to the
                application

        Render frame 4 to buffer 2
        PresentPixmap for buffer 2 at MSC 1

                Now we're in a steady state, flipping between buffer 2 and 3
                waiting for one of them to be queued to the kernel.

        ...

        current scanout buffer = 1 at MSC 1

                Now buffer 0 is free and (e.g.) buffer 2 is queued in
                the kernel to be the scanout buffer at MSC 2

        Render frames, flipping between buffer 0 and 3

When the system can replace a queued buffer, and we update Present to take
advantage of that, we can use three buffers and get:

        current scanout buffer = 0 at MSC 0

        Render frame 1 to buffer 1
        PresentPixmap for buffer 1 at MSC 1

                This is sitting waiting for vblank to become the next scanout
                buffer

        Render frame 2 to buffer 2
        PresentPixmap for buffer 2 at MSC 1

                Queue this for display at MSC 1. There are three possible
                results:

                  1) We're still before MSC 1. Buffer 1 is released,
                     buffer 2 is queued waiting for MSC 1.

                  2) We're now after MSC 1. Buffer 0 was released at MSC 1.
                     Buffer 1 is the current scanout buffer.

                     a) If the user asked for a tearing update, we swap
                        scanout from buffer 1 to buffer 2 and release buffer
                        1.

                     b) If the user asked for non-tearing update, we
                        queue buffer 2 for the MSC 2.

                In all three cases, we have a buffer released (call it 'n'),
                ready to receive the next frame.

        Render frame 3 to buffer n
        PresentPixmap for buffer n

                If we're still before MSC 1, then we'll ask to present at MSC
                1. Otherwise, we'll ask to present at MSC 2.

Present already does this if the driver offers async flips, however it does
this by waiting for the right vblank event and sending an async flip right at
that point.

I've hacked the intel driver to offer this, but I get tearing at the top of
the screen. I think this is because flips are always done from within the
ring, and so the latency between the vblank event and the async flip happening
can cause tearing at the top of the screen.

That's why I'm keying the need for the extra buffer on the lack of 2D
driver support for async flips.
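
The buffer-count decision described above can be sketched roughly as follows.
The function name and its boolean parameter are hypothetical and only
illustrate the rule; they are not the actual glx/dri3 code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: picks how many buffers to cycle through. */
static int example_num_buffers(bool driver_has_async_flips)
{
    /* One buffer on scanout, one queued to become the next scanout
     * buffer, one for the frame being rendered. */
    int buffers = 3;

    /* Without async flips the queued flip cannot be replaced, so the most
     * recent frame pins another buffer and rendering needs one more. */
    if (!driver_has_async_flips)
        buffers++;

    return buffers;
}

/* Small self-check of the two cases discussed in the message. */
static void example_num_buffers_check(void)
{
    assert(example_num_buffers(true) == 3);
    assert(example_num_buffers(false) == 4);
}
```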

Signed-off-by: Keith Packard <keithp@keithp.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
Tested-by: Dylan Baker <baker.dylan.c@gmail.com>
2014-09-30 20:08:28 -07:00
Jason Ekstrand
eedbce9c63 i965/fs: Fix the build
2014-09-30 17:27:33 -07:00
Jason Ekstrand
83669fac9d i965/fs: Fix an uninitialized value warning
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 17:26:05 -07:00
Roland Scheidegger
9750ae8ca9 galahad: fix indirect draw
Need to unwrap the indirect resource otherwise bad things will happen.

Fixes random crashes and timeouts with piglit's arb_indirect_draw tests.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-10-01 02:17:24 +02:00
Roland Scheidegger
e3da8c110c galahad: (trivial) handle cubemap arrays
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-10-01 02:16:57 +02:00
Matt Turner
3e7f8005db i965/fs: Emit compressed BFI2 instructions on Gen > 7.
IVB had a restriction that prevented us from emitting compressed
three-source instructions, and although that was lifted on Haswell,
Haswell had a new restriction that said BFI instructions specifically
couldn't be compressed.
2014-09-30 17:09:34 -07:00
Matt Turner
9f5e5bd34d i965/fs: Allow SIMD16 borrow/carry/64-bit multiply on Gen > 7.
These checks were intended for Gen 7 only. None of these restrictions
apply to Gen 8.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-30 17:09:34 -07:00
Matt Turner
05586f9bc1 i965/fs: Set MUL source type to W/UW in 64-bit mul macro on Gen8.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-30 17:09:34 -07:00
Matt Turner
94b68109fb i965/fs: Optimize sqrt+inv into rsq.
Transform

   sqrt a, b
   rcp  c, a

into

   sqrt a, b
   rsq  c, b

The improvement here is that we've broken a dependency between these
instructions. Leads to 330 fewer INV instructions and 330 more RSQ.
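
A rough sketch of this kind of peephole on a toy IR, for illustration only.
The instruction layout, opcode names and function names are invented here and
do not match the i965 backend's data structures:

```c
#include <assert.h>
#include <stddef.h>

/* Toy single-source IR, invented for illustration. */
enum example_op { OP_SQRT, OP_RCP, OP_RSQ, OP_OTHER };

struct example_inst {
    enum example_op op;
    int dst;   /* destination register number */
    int src;   /* source register number */
};

/* Returns 1 if any instruction in [from, to) writes the given register. */
static int example_written_between(const struct example_inst *insts,
                                   size_t from, size_t to, int reg)
{
    for (size_t k = from; k < to; k++)
        if (insts[k].dst == reg)
            return 1;
    return 0;
}

/* Rewrites "rcp c, a" into "rsq c, b" when a was last written by
 * "sqrt a, b" and b still holds the same value at the rcp.
 * Returns the number of instructions rewritten. */
static int example_opt_sqrt_rcp(struct example_inst *insts, size_t count)
{
    int progress = 0;

    assert(insts != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (insts[i].op != OP_RCP)
            continue;

        /* Walk backwards to the nearest instruction defining the source. */
        for (size_t j = i; j-- > 0; ) {
            if (insts[j].dst != insts[i].src)
                continue;

            if (insts[j].op == OP_SQRT &&
                insts[j].src != insts[j].dst &&
                !example_written_between(insts, j + 1, i, insts[j].src)) {
                insts[i].op = OP_RSQ;
                insts[i].src = insts[j].src;
                progress++;
            }
            break;
        }
    }

    return progress;
}
```

The sqrt itself is left in place; removing it when its result becomes unused
would be the job of a separate dead-code pass.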

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-30 17:09:34 -07:00
Matt Turner
b52126b44f i965/vec4: Optimize sqrt+inv into rsq.
Transform

   sqrt a, b
   rcp  c, a

into

   sqrt a, b
   rsq  c, b

In most cases the sqrt's result is still used, so the improvement here
is that we've broken a dependency between these instructions. Leads to
80 fewer INV instructions and 80 more RSQ.

Occasionally the sqrt's result is no longer used, leading to:

instructions in affected programs:     5005 -> 4949 (-1.12%)

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-30 17:09:34 -07:00
Matt Turner
189ac07764 i965/vec4: Call opt_algebraic after opt_cse.
The next patch adds an algebraic optimization for the pattern

   sqrt a, b
   rcp  c, a

and turns it into

   sqrt a, b
   rsq  c, b

but many vertex shaders do

   a = sqrt(b);
   var1 /= a;
   var2 /= a;

which generates

   sqrt a, b
   rcp  c, a
   rcp  d, a

If we apply the algebraic optimization before CSE, we'll end up with

   sqrt a, b
   rsq  c, b
   rcp  d, a

Applying CSE combines the RCP instructions, preventing this from
happening.

No shader-db changes.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-30 17:09:34 -07:00
Matt Turner
d13bcdb3a9 i965/fs: Extend predicated break pass to predicate WHILE.
Helps a handful of programs in Serious Sam 3 that use do-while loops.

instructions in affected programs:     16114 -> 16075 (-0.24%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-30 17:09:34 -07:00
Mathias Fröhlich
6e7d36fd2c gallivm: Fix build for LLVM 3.2
Do not rely on LLVMMCJITMemoryManagerRef being available.
The C binding to the memory manager objects only appeared
in llvm-3.4.
The change is based on an initial patch of Brian Paul.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2014-10-01 00:29:31 +02:00
Rob Clark
cc355f1c06 freedreno: destroy transfer pool after blitter
Blitter can still have transfers hanging around which it frees in
util_blitter_destroy().  So let it clean up before we yank the
transfer_pool from under it.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-30 16:56:15 -04:00
Rob Clark
01ff0b28b3 freedreno/lowering: fix token calculation for lowering
Indirect registers consume an additional token.  Try to clean up the
token calculation math a bit, and fix it at the same time.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-30 16:56:15 -04:00
Ian Romanick
408aa46ca8 i965/fs: Don't make a name for a vector splitting temporary
If the name is just going to get dropped, don't bother making it.  If
the name is made, release it sooner (rather than later).

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 13:34:43 -07:00
Ian Romanick
0b47252999 glsl: Don't make a name for the function return variable
If the name is just going to get dropped, don't bother making it.  If
the name is made, release it sooner (rather than later).

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 13:34:43 -07:00
Ian Romanick
c87d09d7f0 glsl: Don't allocate a name for ir_var_temporary variables
Valgrind massif results for a trimmed apitrace of dota2:

                  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
Before (32-bit): 74 40,578,719,715       67,762,208       62,263,404     5,498,804            0
After  (32-bit): 52 40,565,579,466       66,359,800       61,187,818     5,171,982            0

Before (64-bit): 74 37,129,541,061       95,195,160       87,369,671     7,825,489            0
After  (64-bit): 76 37,134,691,404       93,271,352       85,900,223     7,371,129            0

A real savings of 1.0MiB on 32-bit and 1.4MiB on 64-bit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 13:34:43 -07:00
Ian Romanick
eaa0c74142 glsl: Use ir_var_temporary for compiler generated temporaries
These few places were using ir_var_auto for seemingly no reason.  The
names were not added to the symbol table.

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 13:34:43 -07:00
Ian Romanick
04e1357d97 glsl: Add context-level controls for whether temporaries have real names
No change Valgrind massif results for a trimmed apitrace of dota2.

v2: Minor rebase on _mesa_init_constants changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 13:34:42 -07:00
Ian Romanick
a99482482d glsl: Never put ir_var_temporary variables in the symbol table
Later patches will give every ir_var_temporary the same name in release
builds.  Adding a bunch of variables named "compiler_temp" to the symbol
table can only cause problems.

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 13:34:42 -07:00
Ian Romanick
7625babfae glsl: Add the possibility for ir_variable to have a non-ralloced name
Specifically, ir_var_temporary variables constructed with a NULL name
will all have the name "compiler_temp" in static storage.
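
A simplified sketch of the idea, using malloc/free as a stand-in for ralloc.
The struct and function names below are hypothetical, not the ir_variable
implementation:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Shared name in static storage; never allocated, never freed. */
static const char example_tmp_name[] = "compiler_temp";

struct example_var {
    const char *name;
};

/* Initializes the variable. A NULL name selects the shared static string;
 * any other name is copied to the heap. Returns 1 on success, 0 on
 * allocation failure. */
static int example_var_init(struct example_var *v, const char *name)
{
    assert(v != NULL);

    if (name == NULL) {
        v->name = example_tmp_name;
        return 1;
    }

    char *copy = malloc(strlen(name) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, name);
    v->name = copy;
    return 1;
}

/* Releases the name only if it was heap-allocated. */
static void example_var_fini(struct example_var *v)
{
    if (v->name != example_tmp_name)
        free((void *)v->name);
    v->name = NULL;
}
```

Comparing against the address of the static string is what lets the
destructor tell the two kinds of name apart without an extra flag.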

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 13:34:42 -07:00
Ian Romanick
0e654ab1b9 glsl: Store ir_variable_data::_num_state_slots and ::binding in 16-bits each
Valgrind massif results for a trimmed apitrace of dota2:

                  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
Before (32-bit): 44 40,577,049,140       68,118,608       62,441,063     5,677,545            0
After  (32-bit): 71 40,583,408,411       67,761,528       62,263,519     5,498,009            0

Before (64-bit): 63 37,122,829,194       95,153,008       87,333,600     7,819,408            0
After  (64-bit): 67 37,123,303,706       95,150,544       87,333,600     7,816,944            0

A real savings of 173KiB on 32-bit and no change on 64-bit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-09-30 13:34:42 -07:00
Ian Romanick
a32ac726ee glsl: Squish ir_variable::max_ifc_array_access and ::state_slots together
At least one of these pointers must be NULL, and we can determine which
will be NULL by looking at other fields.  Use this information to store
both pointers in the same location.
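
Roughly, the layout looks like the sketch below. Only the union name "u" and
the two pointer names come from this message; the discriminating field and
the accessor names are assumptions made for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the real state slot type. */
struct example_state_slot {
    int tokens[5];
};

struct example_variable {
    /* Hypothetical discriminator; the real code derives this from other
     * fields of the variable. */
    unsigned is_interface_array : 1;

    /* Only one of these is ever non-NULL, so they can share storage. */
    union {
        unsigned *max_ifc_array_access;          /* if is_interface_array */
        struct example_state_slot *state_slots;  /* otherwise */
    } u;
};

static unsigned *
example_get_max_ifc_array_access(const struct example_variable *v)
{
    assert(v != NULL);
    return v->is_interface_array ? v->u.max_ifc_array_access : NULL;
}

static struct example_state_slot *
example_get_state_slots(const struct example_variable *v)
{
    assert(v != NULL);
    return v->is_interface_array ? NULL : v->u.state_slots;
}
```

Going through accessors keeps callers from reading the union member that is
not currently valid.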

If anyone can think of a better name for the union than "u", I'm all
ears.

Valgrind massif results for a trimmed apitrace of dota2:

                  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
Before (32-bit): 63 40,574,239,515       68,117,280       62,618,607     5,498,673            0
After  (32-bit): 44 40,577,049,140       68,118,608       62,441,063     5,677,545            0

Before (64-bit): 53 37,126,451,468       95,150,256       87,711,304     7,438,952            0
After  (64-bit): 63 37,122,829,194       95,153,008       87,333,600     7,819,408            0

A real savings of 173KiB on 32-bit and 368KiB on 64-bit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-09-30 13:34:42 -07:00
Ian Romanick
5aa8d8194c glsl: Make ir_variable::num_state_slots and ir_variable::state_slots private
Also move num_state_slots inside ir_variable_data for better packing.

The payoff for this will come in a few more patches.

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-09-30 13:34:42 -07:00
Ian Romanick
21df016902 glsl: Make ir_variable::max_ifc_array_access private
The payoff for this will come in a few more patches.

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-09-30 13:34:42 -07:00
Ian Romanick
8afe6efa21 glsl: Store ir_variable::depth_layout using 3 bits
warn_extension_index was moved to improve packing.

Valgrind massif results for a trimmed apitrace of dota2:

                  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
Before (32-bit): 73 40,580,476,304       68,488,400       62,796,151     5,692,249            0
After  (32-bit): 73 40,575,751,558       68,116,528       62,618,607     5,497,921            0

Before (64-bit): 71 37,124,890,613       95,889,584       88,089,008     7,800,576            0
After  (64-bit): 62 37,123,578,526       95,150,784       87,711,304     7,439,480            0

A real savings of 173KiB on 32-bit and 368KiB on 64-bit.

v2: Use the enum name with the bit-field and remove the extra casts.
Suggested by Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> [v1]
2014-09-30 13:34:42 -07:00
Ian Romanick
ab51179f1f glsl: Replace ir_variable::warn_extension pointer with an 8-bit index
Also move the new warn_extension_index into ir_variable::data.  This
enables slightly better packing.
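
As an illustration of the pointer-to-index change, a sketch with a made-up
extension table. The table contents, struct and function names are
hypothetical; only the warn_extension_index field name is from this patch:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical extension-name table; index 0 means "no warning". */
static const char *const example_extension_names[] = {
    NULL,
    "GL_EXAMPLE_first",
    "GL_EXAMPLE_second",
};

struct example_var_data {
    /* One byte instead of a 4- or 8-byte pointer. */
    uint8_t warn_extension_index;
};

/* Looks the name up in the table and stores its index.
 * Returns 1 if the name was found, 0 otherwise. */
static int example_enable_warning(struct example_var_data *d, const char *name)
{
    const size_t n = sizeof(example_extension_names) /
                     sizeof(example_extension_names[0]);

    assert(n <= 256);   /* every index must fit in 8 bits */

    for (size_t i = 1; i < n; i++) {
        if (strcmp(example_extension_names[i], name) == 0) {
            d->warn_extension_index = (uint8_t)i;
            return 1;
        }
    }
    return 0;
}

/* Returns the extension name to warn about, or NULL if there is none. */
static const char *example_get_warning(const struct example_var_data *d)
{
    return example_extension_names[d->warn_extension_index];
}
```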

Valgrind massif results for a trimmed apitrace of dota2:

                  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
Before (32-bit): 82 40,580,040,531       68,488,992       62,973,695     5,515,297            0
After  (32-bit): 73 40,580,476,304       68,488,400       62,796,151     5,692,249            0

Before (64-bit): 65 37,124,013,542       95,892,768       88,466,712     7,426,056            0
After  (64-bit): 71 37,124,890,613       95,889,584       88,089,008     7,800,576            0

A real savings of 173KiB on 32-bit and 368KiB on 64-bit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-09-30 13:34:41 -07:00
Ian Romanick
baf5a75664 glsl: Use accessors for ir_variable::warn_extension
The payoff for this will come in the next patch.

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-09-30 13:34:41 -07:00
Ian Romanick
1012e95a40 glsl: Eliminate unused built-in variables after compilation
After compilation (and before linking) we can eliminate quite a few
built-in variables.  Basically, any uniform or constant (e.g.,
gl_MaxVertexTextureImageUnits) that isn't used (with one exception) can
be eliminated.  System values, vertex shader inputs (with one
exception), and fragment shader outputs that are not used and not
re-declared in the shader text can also be removed.

gl_ModelViewProjectionMatrix and gl_Vertex are used by the built-in
function ftransform.  There are some complications with eliminating
these variables (see the comment in the patch), so they are not
eliminated.

Valgrind massif results for a trimmed apitrace of dota2:

                  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
Before (32-bit): 46 40,661,487,174       75,116,800       68,854,065     6,262,735            0
After  (32-bit): 50 40,564,927,443       69,185,408       63,683,871     5,501,537            0

Before (64-bit): 64 37,200,329,700      104,872,672       96,514,546     8,358,126            0
After  (64-bit): 59 36,822,048,449       96,526,888       89,113,000     7,413,888            0

A real savings of 4.9MiB on 32-bit and 7.0MiB on 64-bit.

v2: Don't remove any built-in with Transpose in the name.

v3: Fix comment typo noticed by Anuj.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
2014-09-30 13:34:41 -07:00
Ian Romanick
77005cfabd glsl: Validate that built-in uniforms have backing state
All built-in uniforms are supposed to be backed by some GL state.  The
state_slots field describes this backing state.

This helped me track down a bug in a later patch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-30 13:34:41 -07:00
Eric Anholt
8786544b3e vc4: Don't forget to store stencil along with depth when storing either.
Otherwise, we'd replace the stencil in our packed depth/stencil with 0s.
Fixes about 50 piglit tests.
2014-09-30 12:55:28 -07:00
Mathias Fröhlich
43e2109326 llvmpipe: Reuse llvmpipes LLVMContext in the draw context.
Reuse the LLVMContext already allocated in llvmpipe_context
for draw_llvm if possible. This should decrease the memory
footprint of an llvmpipe context.

v2: Fix compile with llvm disabled.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2014-09-30 20:51:02 +02:00
Mathias Fröhlich
d90ff351f3 llvmpipe: Make a llvmpipe OpenGL context thread safe.
This fixes the remaining problem with the recently introduced
global jit memory manager. This change again uses a memory manager
that is local to gallivm_state. This implementation still frees
the majority of the memory immediately after compilation.
Only freeing the generated code is deferred until this code is no longer used.

With this change and the previous one using private LLVMContext instances,
I can now safely run several independent OpenGL contexts driven
by llvmpipe from different threads.

v3: Rebase on llvm-3.6 compile fixes.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2014-09-30 20:51:02 +02:00
Mathias Fröhlich
83c62597fc llvmpipe: Use two LLVMContexts per OpenGL context instead of a global one.
This is one step to make llvmpipe thread safe as mandated by the OpenGL
standard. Using the global LLVMContext is obviously a problem for
that kind of use pattern. The patch introduces two LLVMContext
instances that are private to an OpenGL context and used for all
compiles. One is put into struct draw_llvm and the other
one into struct llvmpipe_context.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2014-09-30 20:45:19 +02:00
Jason Ekstrand
98d00d6640 i965/brw_reg: Make the accumulator register take an explicit width.
The big pile of patches I just pushed regresses about 25 piglit tests on
SNB.  This fixes the regressions.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-09-30 11:42:34 -07:00
Brian Paul
6b65847835 llvmpipe: move lp_jit_screen_init() call after allocation of screen object
The screen argument isn't actually used by lp_jit_screen_init() at this
time, but let's move the call so that we pass a valid pointer.

v2: don't leak screen if lp_jit_screen_init() fails.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-09-30 12:09:14 -06:00
Brian Paul
b12899d752 tgsi: fix Semantic.Name assignment in tgsi_transform_input_decl()
Assign the sem_name parameter, not TGSI_SEMANTIC_GENERIC.
Fixes polygon stipple regression.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-09-30 12:08:49 -06:00
Brian Paul
0fb1e6b7b4 util: simplify PIPE_TEXTURE_CUBE case in util_max_layer()
For cube resources, the array_size value should be 6.  So handle
that case as we do for array texture resources.  But assert that
array_size==6 just to be safe.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-30 12:08:49 -06:00
Brian Paul
59562e9ba5 softpipe: don't special case PIPE_TEXTURE_CUBE in softpipe_resource_layout()
As with the previous patch for llvmpipe.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-30 12:08:49 -06:00
Brian Paul
3d77b80d80 llvmpipe: remove special case for PIPE_TEXTURE_CUBE in llvmpipe_texture_layout()
layers (aka array_size) should be 6 for cube textures so we don't need
to special-case it.  But add an assertion just to be safe.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-30 12:08:49 -06:00
Brian Paul
8269bfdb83 gallium: add doc note about cube textures and can_create_resource()
Just to be clear, and echo the description for resource_create().

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-30 12:08:49 -06:00
Brian Paul
3bfc9a73ad st/mesa: remove unneeded PIPE_TEXTURE_CUBE check in st_texture_create()
Earlier in the function we assert layers==6 for PIPE_TEXTURE_CUBE so
there's no reason to special-case the pt.array_size = layers assignment.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-30 12:08:49 -06:00
Eric Anholt
2b76ee9031 mesa: Drop the always-software-primitive-restart paths.
The core sw primitive restart code is still around, because i965 uses it
in some cases, but there are no drivers that want it on all the time.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2014-09-30 11:00:52 -07:00
Eric Anholt
bcb722d830 gallium: Drop software-only primitive restart support.
The drivers not flagging primitive restart support are r300 swtcl, svga,
nv30, and vc4.

The point of primitive restart is to slightly reduce draw call overhead
for apps by batching multiple draws.  If we do an extra pass to read the
index buffer and split back into multiple draws, we've entirely missed the
point.  This is particularly bad for drivers that otherwise have hardware
IB reads, where the readback is probably uncached.
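
To make the cost concrete, a sketch of what such a software fallback has to
do: read every index on the CPU and emit one sub-draw per run. The types and
function name are hypothetical and are not the removed gallium code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct example_draw {
    size_t start;   /* first index of the sub-draw */
    size_t count;   /* number of indices in the sub-draw */
};

/* Scans the whole index buffer once and records one sub-draw for each run
 * of indices between restart markers. Empty runs are skipped. Returns the
 * number of sub-draws written, at most max_draws. */
static size_t example_split_on_restart(const uint32_t *indices, size_t n,
                                       uint32_t restart_index,
                                       struct example_draw *out,
                                       size_t max_draws)
{
    size_t num = 0;
    size_t run_start = 0;

    assert(indices != NULL || n == 0);

    for (size_t i = 0; i <= n; i++) {
        /* The end of the buffer closes the last run like a restart does. */
        if (i == n || indices[i] == restart_index) {
            if (i > run_start && num < max_draws) {
                out[num].start = run_start;
                out[num].count = i - run_start;
                num++;
            }
            run_start = i + 1;
        }
    }

    return num;
}
```

Every index is read back by the CPU, and the single batched draw turns back
into several draws, which is the overhead described above.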

Reviewed-by: Rob Clark <robdclark@gmail.com>
2014-09-30 10:59:58 -07:00
Jason Ekstrand
4ddc25a8d4 i965/fs: Properly calculate the number of instructions in calculate_register_pressure
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:15 -07:00
Jason Ekstrand
514fd1c55e i965/fs: Use the GRF for FB writes on gen >= 7
On gen 7, the MRF was removed and we gained the ability to do send
instructions directly from the GRF.  This commit enables that
functionality for FB writes.

   v2: Make handling of components more sane.

i965/fs: Force a high register for the final FB write

   v2: Renamed the array for the range mappings and added a comment

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:15 -07:00
Jason Ekstrand
1dd9b90ecd i965/fs: Handle COMPR4 in LOAD_PAYLOAD
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:15 -07:00
Jason Ekstrand
29f4c5b5d5 i965/fs: Constant propagate into LOAD_PAYLOAD
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:15 -07:00
Jason Ekstrand
6d770ce93a i965/fs: Add split_virtual_grfs and compute_to_mrf after lower_load_payload
If we are going to use LOAD_PAYLOAD operations to fill MRF registers, then
we will need this.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:15 -07:00
Jason Ekstrand
8b0e4b387a i965/fs: Add an optional source to the FS_OPCODE_FB_WRITE instruction
Previously, we were using the base_mrf parameter of fs_inst to store the MRF
location.  In preparation for doing FB writes from the GRF, we now also
allow you to set inst->base_mrf to -1 and provide a source register.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:15 -07:00
Jason Ekstrand
9e1f52a6e2 i965/fs: Use the GRF for UNTYPED_SURFACE_READ instructions
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:15 -07:00
Jason Ekstrand
d25aaf1cb1 i965/fs: Use the GRF for UNTYPED_ATOMIC instructions
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:15 -07:00
Jason Ekstrand
65ddf6f404 i965/fs: Add a function for getting a component of an 8- or 16-wide register
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:15 -07:00
Jason Ekstrand
30d718c2fb i965/fs: Use the instruction execution size directly for texture generation
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:15 -07:00
Jason Ekstrand
48ddd2889e i965/fs: Use exec_size instead of force_uncompressed in dump_instruction
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:15 -07:00
Jason Ekstrand
b18fd234da i965/fs: Use instruction execution sizes instead of heuristics
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:15 -07:00
Jason Ekstrand
894ec5a1d8 i965/fs: Use instruction execution sizes to set compression state
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
8f1adb5965 i965/fs: Remove unneeded uses of force_uncompressed
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
2999f83bd9 i965/fs: Derive force_uncompressed from instruction exec_size
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
5f41d052bf i965/fs: Make fs_reg::effective_width take fs_inst* instead of fs_visitor*
Now that we have execution sizes, we can use that instead of the
dispatch width.  This way it also works for 8-wide instructions in
SIMD16.

i965/fs: Make effective_width a variable instead of a function

i965/fs: Preserve effective width in constant propagation

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
6ba31cc000 i965/fs: Better guess the width of LOAD_PAYLOAD
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
071ac3a467 i965/fs: Add an exec_size field to fs_inst
This will, eventually, allow us to manage execution sizes of
instructions in a much more natural way from the fs_visitor level.

i965/fs: Explicitly set instruction execute size a couple of places

i965/blorp: Explicitly set instruction execute sizes

   Since blorp is all 16-wide and isn't, in general, very careful
   about register width, we'll just set it all explicitly.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
fbc0a798ee i965/fs: Determine partial writes based on the destination width
Now that we track both halves of a 16-wide vgrf, we no longer need to worry
about force_sechalf or force_uncompressed.  The only real issue is if the
destination is too small.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
27d7ef094a i965/fs: Fix a bug in register coalesce
This commit fixes a bug in register coalesce that happens when one register
is moved to another the proper number of times but the channels are
re-arranged.  When this happens, the previous code would happily coalesce
the registers regardless of the fact that the channel mappings were wrong.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
16819b48ab i965/fs: Rework GEN5 texturing code to use fs_reg and offset()
Now that offset() can properly handle MRF registers, we can use an MRF
fs_reg and let offset() handle incrementing it correctly for different
dispatch widths.  While this doesn't have any noticeable effect currently,
it does ensure that the destination register is 16-wide which will be
necessary later when we start detecting execution sizes based on source and
destination registers.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
7210583eb8 i965/fs_reg: Allocate double the number of vgrfs in SIMD16 mode
This is actually the squash of a bunch of different changes.  Individual
commit titles follow:

i965/fs: Always 2-align registers SIMD16 for gen <= 5

i965/fs: Use the register width when applying offsets

   This reworks both byte_offset() and offset() to be more intelligent.
   The byte_offset() function now supports offsets bigger than 32. The
   offset() function uses the byte_offset() function together with the
   register width and the type size to offset the register by the correct
   amount.
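
The arithmetic described above can be sketched as follows. This is only an illustration, not the driver's code: the helper names and the 32-byte register size are assumptions introduced here.

```python
REG_SIZE_BYTES = 32  # assumed size of one hardware register


def offset_in_bytes(delta, reg_width, type_size):
    """Bytes to advance when stepping `delta` logical components.

    Hypothetical helper: one component of a register covers
    reg_width channels of type_size bytes each.
    """
    return delta * reg_width * type_size


def split_byte_offset(total_bytes):
    """Split a byte offset into (whole registers, leftover bytes)."""
    return divmod(total_bytes, REG_SIZE_BYTES)


# One component of a 16-wide float register spans 64 bytes (two whole
# registers); the same step on an 8-wide register spans one register.
print(split_byte_offset(offset_in_bytes(1, 16, 4)))
print(split_byte_offset(offset_in_bytes(1, 8, 4)))
```

Going through a byte count is what lets a single helper cover offsets larger than one register.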

i965/fs: Change regs_read to be in hardware registers

i965/fs: Change regs_written to be actual hardware registers

i965/fs: Properly handle register widths in LOAD_PAYLOAD

   The LOAD_PAYLOAD instruction is a bit special because it collects a
   bunch of registers (with possibly different widths) into a single
   payload block.  Once the payload is constructed, it's treated as a
   single block of data and most of the information such as register widths
   doesn't matter anymore.  In particular, the offset of any particular
   source register is the accumulation of the sizes of the previous source
   registers.
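
As a rough sketch of that accumulation (the function name and the example sizes are made up for illustration; sizes are counted in hardware registers):

```python
def payload_offsets(source_sizes):
    """Return the starting offset of each source inside the payload.

    Source i starts where sources 0..i-1 end, so each offset is the
    running sum of the sizes that came before it.
    """
    offsets = []
    running = 0
    for size in source_sizes:
        offsets.append(running)
        running += size
    return offsets


# Four sources of mixed size packed back to back.
print(payload_offsets([1, 2, 2, 1]))
```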

i965/fs: Properly set writemasks in LOAD_PAYLOAD

i965/fs: Handle register widths in demote_pull_constants

i965/fs: Get rid of implicit register doubling in the allocator

i965/fs: Reserve enough registers for PLN instructions

i965/fs: Make sources and destinations interfere in 16-wide

i965/fs: Properly handle register widths in CSE

i965/fs: Properly handle register widths in register_coalesce

i965/fs: Properly handle widths in copy propagation

i965/fs: Properly handle register widths in VARYING_PULL_CONSTANT_LOAD

i965/fs: Properly handle register widths and odd register sizes in spilling

i965/fs: Don't waste a register on texture lookups for gen >= 7

   Previously, we were wasting a register in SIMD16 mode because we could
   only allocate registers in pairs.  Now that we can allocate and address
   odd-sized registers, let's get rid of this special-case.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
4232a776a6 i965/fs: Handle printing of registers better.
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
5390ca8ce9 i965: Explicitly set widths on gen5 math instruction destinations.
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
004fbd5375 i965/fs: Make half() divide the register width by 2 and use it more
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
24d023b9fe i965/fs: Add a concept of a width to fs_reg
Every register in i965 assembly implicitly has a concept of a "width".
Usually, this is derived from the execution size of the instruction.
However, when writing a compiler it turns out that it is frequently
useful to have the width explicitly in the register and derive the
execution size of the instruction from the widths of the registers used in
it.

This commit adds a width field to fs_reg along with an effective_width()
helper function.  The effective_width() function tells you how wide the
register effectively is when used in an instruction.  For example, uniform
values have width 1 since the data is not actually repeated, but when used
in an instruction they take on the width of the instruction.  However, for
some instructions (LOAD_PAYLOAD being the notable exception), the width is
not the same.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
1030ee6e9b i965/fs: A little harmless refactoring of register_coalesce
Just pass the visitor into is_copy_payload() and is_coalesce_candidate()
instead of a register size and the virtual_grf_sizes array.  Among other
things, this makes the code more obvious because you don't have to figure
out where src_size came from.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
f91b566f55 i965/brw_reg: Add a firsthalf function and use it in the generator
Right now, this function is a no-op but it indicates that we intend to only
use the first half of the 16-wide register.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
1728e74957 i965/fs: Copy propagate partial reads.
This commit reworks copy propagation a bit to support propagating the
copying of partial registers.  This comes up every time we have pull
constants because we do a pull constant read immediately followed by a move
to splat the one component out to 8 or 16-wide.  This allows us to
eliminate the copy and simply use the one component of the register.

Shader DB results:

total instructions in shared programs: 5044937 -> 5044428 (-0.01%)
instructions in affected programs:     66112 -> 65603 (-0.77%)
GAINED:                                0
LOST:                                  0

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
4d5f0eb048 i965/fs: Refactor fs_inst::is_send_from_grf()
A switch statement is much easier to read/edit than a big giant or
statement.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
54688cd03b i965/fs: Clean up emit_fb_writes
This splits emit_fb_writes into two functions: emit_fb_writes and
emit_single_fb_write.  This reduces the amount of duplicated code in
emit_fb_writes and makes the register number fiddling less arcane.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
72a3780f26 i965/fs: Print BAD_FILE registers in dump_instruction
Sometimes these show up in LOAD_PAYLOAD instructions and it's nice to be
able to see them.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:14 -07:00
Jason Ekstrand
2af4b0aeaf i965/fs: Make compact_virtual_grfs an optimization pass
Previously we disabled compact_virtual_grfs when dumping optimizations.
The idea here was to make it easier to diff the dumped shader because you
didn't have a sudden renaming.  However, sometimes a bug is affected by
compact_virtual_grfs and, when this happens, you want to keep dumping
instructions with compact_virtual_grfs enabled.  By turning it into an
optimization pass and dumping it along with the others, we retain the
ability to diff because you can just diff against the compact_virtual_grf
output.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:13 -07:00
Jason Ekstrand
a25db10c12 i965/fs: Make immediate fs_reg constructors explicit
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:13 -07:00
Jason Ekstrand
1c89e098e8 i965/fs: Make null_reg_* const members of fs_visitor instead of globals
We also set the register width equal to the dispatch width.  Right now,
this is effectively a no-op since we don't do anything with it.  However,
it will be important once we add an actual width field to fs_reg.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:13 -07:00
Jason Ekstrand
ab7234c852 i965/fs: Use the var_from_vgrf helper function instead of doing it manually
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:13 -07:00
Jason Ekstrand
c24dd54f97 i965/fs: Fix a bug with dead_code_eliminate on large writes
Previously, if an instruction wrote to more than one register, we
implicitly assumed that it filled the entire register.  We never hit this
before because the only time we did multi-register writes was things like
texturing which always wrote to all of the registers.  However, with the
upcoming ability to do 16-wide instructions in SIMD8 and things of that
nature, we can have multi-register writes at offsets and we'll hit this.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:13 -07:00
Jason Ekstrand
1385a4b706 i965/fs: Use the UW type for the destination of VARYING_PULL_CONSTANT_LOAD instructions
Using a floating-point type doesn't usually cause hangs on my HSW, but the
simulator complains about it quite a bit.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:13 -07:00
Jason Ekstrand
f0d43c09b2 i965/fs: Use offset a lot more places
We have this wonderful offset() function for advancing registers, but we're
not using it.  Using offset() allows us to do some sanity checking and
avoid manually touching fs_reg::reg_offset.  In a few commits, we will make
offset do even more nifty things for us.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:13 -07:00
Jason Ekstrand
0089d025aa i965/fs: fix a comment in compact_virtual_grfs
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:13 -07:00
Jason Ekstrand
3dc3fccb75 i965/fs: Rewrite fs_visitor::split_virtual_grfs
The original vgrf splitting code was written with the assumption that vgrfs
came in two types: those that can be split into single registers and those
that can't be split at all.  It was very conservative and bailed as soon as
more than one element of a register was read or written.  This won't work
once we start allowing a regular MOV or ADD operation to operate on
multiple registers.  This rewrite allows for the case where a vgrf of size
5 may appropriately be split into one register of size 1 and two registers
of size 2.
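
One way to picture the size-5 example is the simplified model below. It is a sketch under stated assumptions, not the actual pass: accesses are given as hypothetical (start, length) pairs in registers, and a boundary between two registers is a legal split point only if no access straddles it.

```python
def split_sizes(vgrf_size, accesses):
    """Return the sizes of the pieces a vgrf can be split into.

    can_split[i] refers to the boundary just before register i.
    Accesses are assumed to lie entirely within the vgrf.
    """
    can_split = [True] * (vgrf_size + 1)
    for start, length in accesses:
        # Every interior boundary of a multi-register access is pinned.
        for i in range(start + 1, start + length):
            can_split[i] = False

    pieces = []
    piece_start = 0
    for i in range(1, vgrf_size + 1):
        if i == vgrf_size or can_split[i]:
            pieces.append(i - piece_start)
            piece_start = i
    return pieces


# A size-5 vgrf with one single-register access and two 2-register
# accesses splits into pieces of size 1, 2 and 2.
print(split_sizes(5, [(0, 1), (1, 2), (3, 2)]))
```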

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:13 -07:00
Jason Ekstrand
f9da0740e2 i965/fs_live_variables: Use var_from_vgrf instead of repeating the calculation
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-30 10:29:13 -07:00
Jason Ekstrand
75afe17b79 i965/fs: Manually generate the meta fast-clear shader
Previously, we were generating the fast-clear shader from GLSL.  The
problem is that fast clears require that we use a replicated write rather
than a regular write instruction.  In order to get this we had a
complicated and somewhat fragile optimization pass that looked for places
where we can use a replicated write and used it.  Since replicated writes
have a lot of restrictions, we only ever use them for fast-clear
operations.

This commit replaces the optimization pass with a function that just
generates the shader we want.  This is a) less code, b) less fragile than
the optimization pass, and c) generates a more efficient shader.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-30 10:29:13 -07:00
Michel Dänzer
61128d7507 radeonsi: Pass the slice size to si_dma_copy_buffer
Otherwise some parts of tiled slices can be missed.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-30 18:55:48 +09:00
Michel Dänzer
74aeccd701 radeonsi: Catch more cases that can't be handled by si_dma_copy_buffer/tile
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-30 18:55:48 +09:00
Michel Dänzer
d17b85524d radeonsi: Fix si_dma_copy(_tile) for compressed formats
Fixes GPUVM faults when running the piglit test "getteximage-formats
init-by-rendering" with R600_DEBUG=forcedma on SI.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-30 18:55:48 +09:00
Michel Dänzer
761d80ddab radeonsi: Fix tiling mode index for stencil resources
We are currently only dealing with depth-only or stencil-only resources
here, not with resources having both depth and stencil[0]. In both cases,
the tiling mode index is in the tile_mode field, not in the
stencil_tile_mode field.

[0] Add an assertion for that.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-30 18:55:48 +09:00
Chia-I Wu
594e1a2f4b ilo: fix format of edge flag pointer
The VE format of edge flag pointers was changed in
780ce576bb.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-30 16:41:32 +08:00
Chia-I Wu
2d13b5ac81 ilo: add a pass to finalize ilo_ve_state
Add finalize_vertex_elements() to finalize ilo_ve_state.  This fixes a
potential issue with URB entry allocation for VS and moves the complexity of
gen6_3DSTATE_VERTEX_ELEMENTS() to the new function.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-30 16:41:32 +08:00
Chia-I Wu
2b4c8ffc30 ilo: precalculate aligned depth buffer size
To replace the hacky zs_align_surface().

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-30 16:41:31 +08:00
Chia-I Wu
343b014b57 ilo: use dynamic bo for rectlist vertices
The size is always 24 bytes.  We can upload them to the dynamic buffer.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-30 16:41:31 +08:00
Thomas Hellstrom
46537f1d03 st/xa: Fix regression in xa_yuv_planar_blit()
Commit "st/xa: scissor to help tilers" broke xa_yuv_planar_blit() and vmwgfx
textured video. Fix this by implementing scissors also in the yuv draw path.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Cc: Rob Clark <robclark@freedesktop.org>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-30 08:31:33 +02:00
Kenneth Graunke
68627235f2 i965: Delete intel_chipset.h.
Unused; it was replaced by include/pci_ids/i965_pci_ids.h long ago.

Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-29 20:10:00 -07:00
Alex Henrie
3bea907797 driconf: Correct and update Catalan translation
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2014-09-29 17:45:41 -07:00
Alex Henrie
33a7d0d040 driconf: Update Spanish translation
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2014-09-29 17:45:26 -07:00
Alex Henrie
3b34b876f4 driconf: Synchronize po files
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2014-09-29 17:45:10 -07:00
Eric Anholt
4ceaad14ff vc4: Don't try to do stores to buffers that aren't bound.
The code was kind of mixed up about which buffers were getting stored in the case
that a resolve bit was unset (which are set based on the GL state at draw
time) and the buffer wasn't actually bound.  In particular, depth-only
rendering would store the color buffer contents, which happen to be
pointing at the depth buffer.

Thanks to clearing out the resolve bits for things we really can't
resolve, now I can drop the safety checks for buffer presence around the
actual stores.

Fixes 42 piglit tests.
2014-09-29 17:44:15 -07:00
Eric Anholt
1d42aa8358 vc4: Shove some depth comparison bits down to where they're used.
2014-09-29 17:44:15 -07:00
Matt Turner
66ab9c22fe i965: Use BRW_MATH_DATA_SCALAR when source regioning is scalar.
Notice the mistaken (but harmless) argument swapping in brw_math_invert().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-29 15:59:19 -07:00
Matt Turner
a0df258f89 i965/compaction: Move variable declarations to their uses.
Tested-by: Mark Janes <mark.a.janes@intel.com>
2014-09-29 15:59:16 -07:00
Matt Turner
a36631b74c i965/compaction: Simplify jump target code.
My attempts to clarify the code with _compacted/_uncompacted prefixed
variables apparently failed. Hopefully this is clearer.

In any case, the previous code wasn't clear enough to gcc to let it
optimize division by a power of two into a shift. No problems now.

Also, the previous code (in the ADD case) didn't work on 32-bit x86, due
to a complicated set of interactions best summed up as unsigned division
and compiler optimizations.

Tested-by: Mark Janes <mark.a.janes@intel.com>
2014-09-29 15:58:57 -07:00
Rob Clark
dce96f6da2 freedreno/a3xx: re-emit shaders on variant change
We need to keep track if a state change other than frag/vert shader
state will trigger us to need a different shader variant, and if
necessary mark the appropriate shader state as dirty.  Otherwise we will
forget to re-emit the shader state.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-29 18:30:43 -04:00
Rob Clark
3aaab87563 freedreno/ir3: add some cmdline args
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-29 18:30:43 -04:00
Rob Clark
7cdd467994 freedreno/a3xx: add support to emulate GL_CLAMP
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-29 18:30:43 -04:00
Rob Clark
3541705816 freedreno: add texcoord clamp support to lowering
This is for hw that needs to emulate some texture wrap modes (like
CLAMP) with some help from the shader.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-29 18:30:43 -04:00
Rob Clark
a6746d1124 freedreno: move bind_sampler_states to per-generation
Keep the existing function as a common helper.  But this lets us move an
a2xx specific hack out of common code.  And the PIPE_TEX_WRAP_CLAMP
emulation will require an a3xx specific hack.  So rather than piling on
hacks, split this out.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-29 18:30:42 -04:00
Rob Clark
7e20c09d4a freedreno/a3xx: fix border color order
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-29 18:30:42 -04:00
Rob Clark
c61133046e freedreno/a3xx: add 32bit integer vtx formats
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-29 18:30:42 -04:00
Eric Anholt
fc4b5b85ce vc4: Add support for GL 1.1's stupid CLAMP mode.
We just clamp the incoming texture coordinates.  This breaks the lambda
calculation, but it gets the piglit tests to pass.  This is the same
behavior as in i965.
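
A minimal sketch of the coordinate clamp, assuming normalized coordinates and a [0, 1] clamp range (the message above does not state the range, and the function name is made up here):

```python
def clamp_texcoord(coord, lo=0.0, hi=1.0):
    """Clamp one texture coordinate into [lo, hi] before sampling."""
    return max(lo, min(hi, coord))


# Coordinates outside the range are pinned to the nearest edge;
# coordinates inside it pass through unchanged.
for c in (-0.25, 0.3, 1.5):
    print(c, clamp_texcoord(c))
```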
2014-09-29 14:12:33 -07:00
Eric Anholt
ae22f5aa14 vc4: Add support for texture border color.
One spot in the docs says that it's stored at a miplevel just beyond the
last miplevel, which was scary.  But really, you just load it as the R
coordinate (which conflicts with cubemaps, but you don't do border
clamping on cubes).
2014-09-29 13:48:08 -07:00
Eric Anholt
b65761f764 vc4: Add the necessary stubs for occlusion queries.
We have to expose them for GL 2.0, but we just always return a value of 0.
We should be advertising 0 query bits instead of 64, but gallium doesn't
have plumbing for that yet.  At least this stops the segfaults.
2014-09-29 11:51:09 -07:00
Eric Anholt
76cd9955d9 vc4: Optimize out silly SUBs of 0.
Drops instructions on vs-temp-array-mat4-index-col-row-wr.shader_test,
which I was looking at because it's failing to register allocate.
2014-09-29 11:33:34 -07:00
Eric Anholt
64122b16ce vc4: Dump constant uniform values in VC4_DEBUG=qir.
Definitely helps when trying to understand and optimize a program.
2014-09-29 11:33:34 -07:00
Eric Anholt
3311513041 vc4: Turn a SEL_X_Y(x, 0) into SEL_X_0(x).
This may reduce register pressure and uniform counts.  Drops a bunch of 0
uniform loads on vs-temp-array-mat4-index-col-row-wr.shader_test, which is
failing to register allocate.
2014-09-29 11:33:34 -07:00
Eric Anholt
730267eb23 vc4: Add support for texture cube maps.
It's not passing some of the piglit tests, because it looks like at small
miplevels some contents from surrounding faces are getting filtered in at
the corners.  It does get 7 new tests passing.
2014-09-29 11:29:28 -07:00
Eric Anholt
c4245d8b2e vc4: Rename the slice's size0.
In the other related fields, "0" refers to the size of the first miplevel,
while this is a field in a slice.  The other implicit slices we have
(cubemap layers) don't vary in size compared to the first one.
2014-09-29 11:26:43 -07:00
Eric Anholt
7a85ebf6e2 vc4: Stop trying to reuse temporaries that store uniform values.
Almost always, the MOV will get copy propagated out.  Even if it doesn't,
it's probably better to just reload the uniform at next use (to reduce
register pressure) rather than try to save instruction count.

I was looking at this because in the presence of texturing (which calls
add_uniform() directly to get the uniform load forced into the
instruction) the c->uniform_contents indices don't match 1:1 with the
temporary qregs.
2014-09-29 10:07:24 -07:00
Tapani Pälli
3386e95994 egl: setup screen iterator before using it
commit 4ed23fd broke creation of pbuffer surfaces, patch fixes
the failure, noticed when running chrome with '--use-gl=egl'.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
2014-09-29 15:12:11 +03:00
Chia-I Wu
8c7c0f7114 ilo: fix a missing 'else'
An 'else' is missing in the disassembler.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-29 16:58:36 +08:00
Kalyan Kondapally
66a2fe4cf9 glsl: Allow texture2DProjLod and textureCubeLod in GL ES
According to GLES (i.e. 1.0 and above) spec textureCubeLod and
texture2DProjLod are built in functions. We seem to disable support
for these functions with GLES. This patch enables the support.

Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84355
2014-09-29 11:10:38 +03:00
Rob Clark
40aabc0e80 configure.ac: bump libdrm_freedreno requirement
We need 2.4.57 for fd_bo_dmabuf() / fd_bo_from_dmabuf().

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-28 12:46:17 -04:00
Matt Turner
5ccdc23a86 glsl: Recognize open-coded pow(x, y).
pow(x, y) is equivalent to exp(log(x) * y).
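
The identity can be checked numerically with a short script. This is only a sanity check of the math for positive x, written with Python's math module; it says nothing about how the compiler pass matches the expression.

```python
import math


def open_coded_pow(x, y):
    """pow(x, y) spelled out as exp(log(x) * y); requires x > 0."""
    return math.exp(math.log(x) * y)


# Compare the open-coded form against the library pow on a few inputs.
for x, y in [(2.0, 10.0), (9.0, 0.5), (1.5, -3.0)]:
    assert math.isclose(open_coded_pow(x, y), math.pow(x, y), rel_tol=1e-9)
```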

instructions in affected programs:     578 -> 458 (-20.76%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-27 12:18:37 -07:00
Matt Turner
e9aee2572a i965/fs: Don't invalidate live intervals in saturate propagation.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-09-27 12:18:37 -07:00
Matt Turner
b9689c6bda i965/fs: Ignore mov.sat instructions in interference check in sat prop.
When an instruction's result was consumed by multiple mov.sat
instructions, we would decide that we couldn't move the saturate
modifier because something else was using the result, even though it was
just another mov.sat!

total instructions in shared programs: 4275598 -> 4274842 (-0.02%)
instructions in affected programs:     75634 -> 74878 (-1.00%)

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-09-27 12:18:37 -07:00
Matt Turner
82bdb559a1 i965/fs: Walk instructions in reverse in saturate propagation.
When we find a mov.sat, we search backwards. We might as well search
everything else backwards as well and potentially look at fewer
instructions.

This change enables the next patch.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-09-27 12:18:37 -07:00
Rob Clark
ed48f91275 freedreno/a3xx: add flat interpolation mode
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-27 13:34:07 -04:00
Rob Clark
df2f0c6d55 freedreno/a3xx: add LOD_BIAS
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-27 13:34:07 -04:00
Rob Clark
f7259949da freedreno: turn missing caps into compile warnings
Get rid of the 'default' case (as suggested by imirkin) so compiler
warns us about missing caps.  Also add some caps that were missing until
now.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-27 13:34:07 -04:00
Rob Clark
546d6c8dc9 freedreno: we have more than 0 viewports!
4155d1c7 'st/mesa: drop dependence on API profile in st_init_extensions'
broke freedreno because somehow 'PIPE_CAP_MAX_VIEWPORTS' fell through
the cracks.  As a result, we reported zero viewports, so the state
tracker never bothered to give us any valid viewport!

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-27 13:34:07 -04:00
Rob Clark
24cd746e4b freedreno: update generated headers
Among other things, fixes a bug for fixed point registers/bitfields.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-27 13:34:07 -04:00
Rob Clark
5c72672cdc freedreno: don't advertise mirror-clamp support
At least on a3xx, we cannot do it without some emulation in shader.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-27 13:34:07 -04:00
Rob Clark
e4c678c164 freedreno: fix compiler warning
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-27 13:34:07 -04:00
Tom Stellard
ec566e0f16 configure.ac: Compute LLVM_VERSION_PATCH using llvm-config
This is the only guaranteed way to get the patch level for llvm,
since the define cannot always be found in config.h depending
on the version of llvm or the build system used.

CC: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jonathan Gray <jsg@jsg.id.au>
2014-09-27 17:46:39 +01:00
Emil Velikov
5ef6eb4654 Remove Bluegene/L wrappers
Added back in 2009, with osmesa/GLU in mind. Unlikely to be working
any more since the removal of the static makefiles.

Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-27 15:21:22 +01:00
Emil Velikov
343795e445 mesa: remove last DJGPP remains
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-27 15:20:49 +01:00
Emil Velikov
a662fa94c1 configure: use explicit enabled/disabled in config switch description
Rather than having double negatives -> disable-opencl, default=no
simply use enabled/disabled. It makes things a bit easier for the
reader and consistent throughout the file.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-27 15:20:42 +01:00
Emil Velikov
bbe6f7f865 configure: ask vdpau.pc for the default location of the vdpau drivers
Rather than using hardcoded values honor the value set at libvdpau
build time - i.e. the moduledir variable from vdpau.pc

Update the omx description to match reality while we're here.

Cc: Christian König <deathsimple@vodafone.de>
Cc: Alexandre Demers <alexandre.f.demers@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80615
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-27 15:20:26 +01:00
Emil Velikov
407450eb84 configure: drop --with-egl-driver-dir switch
The location of the egl driver(s) is a matter that we should have
never exposed to the user. Currently the dri2 driver is built
into the libEGL loader, with the gallium based one soon to follow.

v2: Fold EGL_DRIVER_INSTALL_DIR within the makefiles. Suggested by Matt.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80615
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-27 15:20:14 +01:00
Emil Velikov
2e6fc0647a configure: remove non-functional --with-opencl-libdir
The parameter used to control where the gallium pipe-drivers
were installed, but was broken since

commit 45270fb0fd
Author: Matt Turner <mattst88@gmail.com>
Date:   Thu Sep 13 10:45:01 2012 -0700

    targets/pipe-loader: Convert to automake

Considering that nowadays the pipe-drivers can be used by
more than just the opencl target, even fixing this up will
not be the best idea.

Cc: Matt Turner <mattst88@gmail.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61415
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-27 15:15:58 +01:00
Ian Romanick
c3f17bb18f glsl: Strip arrayness from ir_type_dereference_variable too
If the thing being dereferenced is a record or an array of records, it
should be treated as row-major.  The ir_type_dereference_record path
already does this, and I think I intended to do the same for this path
in b17a4d5d.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83741
Cc: mesa-stable@lists.freedesktop.org
2014-09-26 07:59:53 -07:00
Ian Romanick
2ab71e1486 glsl: Round struct size up to at least 16 bytes
Per rule #9, the size of the structure is vec4 aligned.  The MAX2 in the
loop ensures that sizes >= 16 bytes are vec4 aligned.  The new MAX2
after the loop ensures that sizes < 16 bytes are vec4 aligned.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82932
Cc: mesa-stable@lists.freedesktop.org
2014-09-26 07:59:50 -07:00
Ian Romanick
5c75270c34 glsl: Make sure row-major array-of-structure get correct layout
Whether or not the field is row-major (because it might be a bvec2 or
something) does not affect the array itself.  We need to know whether an
array element in its entirety is row-major.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83506
Cc: mesa-stable@lists.freedesktop.org
2014-09-26 07:59:47 -07:00
Ian Romanick
8e01c66da6 glsl: Make sure fields after small structs have correct padding
Previously the linker would correctly calculate the layout, but the
lower_ubo_reference pass would not apply correct alignment to fields
following small (less than 16-byte) nested structures.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83533
Cc: mesa-stable@lists.freedesktop.org
2014-09-26 07:59:25 -07:00
Chia-I Wu
24653bcd7d ilo: give gen6_draw_session a better prefix
gen6_draw_session is not GEN dependent.  Rename it to ilo_render_draw_session.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Chia-I Wu
4be7b7ee85 ilo: make ilo_render opaque
It is not used outside the render code.  There are also too many details in it
that we do not want other components to access directly.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Chia-I Wu
8f284343e0 ilo: make ilo_render_emit_draw() direct
Remove emit_draw() and ILO_RENDER_DRAW indirections.  With all emit functions
being direct now, ilo_render_estimate_size() and more can also be removed.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Chia-I Wu
a05ce904aa ilo: make ilo_render_emit_rectlist() direct
Remove emit_rectlist() and ILO_RENDER_RECTLIST indirections.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Chia-I Wu
362d2fb982 ilo: clean up draw and rectlist state emission
Add these new high-level functions

  ilo_render_get_draw_dynamic_states_len()
  ilo_render_emit_draw_dynamic_states()
  ilo_render_get_rectlist_dynamic_states_len()
  ilo_render_emit_rectlist_dynamic_states()
  ilo_render_get_draw_surface_states_len()
  ilo_render_emit_draw_surface_states()

for draw and rectlist state emission.  They are implemented in the new
ilo_render_dynamic.c and ilo_render_surface.c.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Chia-I Wu
f1662e3670 ilo: sanity check ilo_render_get_*_len()
Assert that we never write more than what ilo_render_get_*_len() returns.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Chia-I Wu
7fc7415316 ilo: simplify ilo_render_get_query_len()
For all supported query types, we always emit a PIPE_CONTROL.  Call
ilo_render_get_flush_len() for simplicity and clarity.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Chia-I Wu
0afc17ea49 ilo: make ilo_render_emit_query() direct
Remove emit_query() and ILO_RENDER_QUERY indirections.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Chia-I Wu
18cbd3cc34 ilo: make ilo_render_emit_flush() direct
Remove emit_flush() and ILO_RENDER_FLUSH indirections.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Chia-I Wu
e3451552d2 ilo: simplify ilo_render invalidation
ilo_render is based on ilo_builder.  We should only care if the builder
buffers are invalidated, or if the hardware context is invalidated.  Replace
ilo_render_invalidate() with flags by ilo_render_invalidate_builder() and
ilo_render_invalidate_hw().

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Chia-I Wu
ce2bda300d ilo: add ilo_builder_{dynamic,surface}_used()
Return how many DWords are used in dynamic and surface buffers respectively.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Chia-I Wu
2df2f60e8d ilo: rename state buffer to dynamic buffer
Both dynamic buffer and surface buffer are state buffers.  We should not use
state buffer to refer to the former.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Chia-I Wu
a7f2ab668c ilo: constify ilo_render in ilo_render_get_sample_position()
It is a getter and is not supposed to modify ilo_render.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Chia-I Wu
23d66a42a3 ilo: rename 3d_pipeline to render
Follow the file renaming.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Chia-I Wu
3afe30e64b ilo: remove struct ilo_3d
Move members of ilo_3d that still make sense to ilo_context.  With ilo_3d
gone, rename functions whose names begin with ilo_3d to something more
appropriate.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Chia-I Wu
b6443ae969 ilo: rename ilo_3d_pipeline*.[ch] to ilo_render*.[ch]
They are used to build render engine commands, which can be more than 3D.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Chia-I Wu
392890d5de ilo: rename ilo_3d.[ch] to ilo_draw.[ch]
There is not much left in struct ilo_3d.  We want to kill it and ilo_3d.[ch]
will be bad names.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-26 21:15:55 +08:00
Michel Dänzer
7e55c3b352 st/mesa: Use PIPE_USAGE_STAGING for GL_STATIC/DYNAMIC/STREAM_READ buffers
Such buffers can only be useful by reading from them with the CPU, so we
need to make sure CPU reads are fast.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84178
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2014-09-26 16:53:13 +09:00
Tapani Pälli
9caa5c3b13 glsl: remove unused link_assign_uniform_block_offsets
ubo offsets are assigned by link_uniform_blocks since 514f8c7e

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-26 08:29:10 +03:00
Kalyan Kondapally
e018ea81bf glsl: Structures must have same name to be considered same type.
According to GLSL(4.2) and GLSL-ES (1.0, 3.0) spec, Structures must
have the same name to be considered same type. We currently ignore
the name check while checking if two records are same. This patch
fixes this.

Patch fixes failing tests in WebGL conformance test
'shaders-with-uniform-structs' when running Chrome on OpenGL ES.

v2: Do not force name comparison with unnamed types (Tapani)
v3: Cleanups (Matt)

Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83934
2014-09-26 08:29:10 +03:00
Tapani Pälli
1cb81d3a9b glsl: fix uniform location count used for glsl types
Patch fixes the slot count used by vector types and adds 1 slot
to be used by image and sampler types.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82921
2014-09-26 08:29:10 +03:00
Ilia Mirkin
a5bbfeda97 gm107/ir: take relative pfetch offset into account
There is no dedicated instruction for this, so just combine it with the
constant offset.

Acked-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
2014-09-26 01:13:06 -04:00
Michel Dänzer
4a38b154fd gallivm: More fallout from disabling with LLVM 3.6
The draw module would still try to use gallivm, causing many piglit tests
to fail with an assertion failure. llvmpipe might have been similarly
affected.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-09-26 11:35:52 +09:00
Ilia Mirkin
cdc4de1215 gm107/ir: add support for indirect const buffer selection
This was missed in the commit that enabled it for fermi/kepler as part
of ARB_gpu_shader5

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
2014-09-25 22:15:50 -04:00
Ilia Mirkin
0532a5fd00 gm107/ir: fix texture argument order
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
2014-09-25 22:15:50 -04:00
Ilia Mirkin
d3c3bba6d0 gm107/ir: fix manual TXD for array targets
This parallels the fixes in commit afea9bae.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
2014-09-25 22:15:49 -04:00
Ilia Mirkin
d78b533c29 nouveau: fix glCompressedTexImage
mesa_texstore expects pixel data, not compressed data. For compressed
textures, we want to just copy the bits in without any conversion.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Francisco Jerez <currojerez@riseup.net>
2014-09-25 22:15:49 -04:00
Ilia Mirkin
0147c10c5f nv50/ir: avoid deleting pseudo instructions too early
What happens is that a SPLIT operation is part of the spill node, and as
a pseudo op, the instruction gets erased after processing its first def.
However the later defs still need to refer to it, so instead delay
deleting until after that whole RA node is done processing.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79462
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-25 22:15:49 -04:00
Ilia Mirkin
9d2e298dd4 mesa/st: NumLayers is only valid for array textures
For 3d textures, NumLayers is set to 1, which is not what we want. This
fixes the newly added gl-layer-render-storage test (which constructs
immutable 3d textures). Fixes regression introduced in d82bd7eb06.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84145
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
2014-09-25 22:15:49 -04:00
Ilia Mirkin
fca2216ced nv50/ir: add some comments on edge classification
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-25 22:15:49 -04:00
Ilia Mirkin
1ae32e24ca nv50,nvc0: fix 3d blit logic for odd depth/stencil formats
Reported-by: David Heidelberger <david.heidelberger@ixit.cz>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-25 22:15:49 -04:00
Ilia Mirkin
b49dfb68ed nv50,nvc0: add missing depth/stencil formats to tile flag selection
Reported-by: David Heidelberger <david.heidelberger@ixit.cz>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-25 22:15:49 -04:00
Eric Anholt
db11eb92cf vc4: Switch from errx() to fprintf() and abort().
These are pretty catastrophic, "should never happen" failure paths (though
4 tests in piglit hit them currently, due to a single bug).  An abort()
that you can gdb on easily is probably more useful than a clean exit,
particularly since a bug in piglit framework right now is causing early
exit(1)s to simply not be recorded in the results at all.
2014-09-25 16:41:25 -07:00
Eric Anholt
45962fbeee vc4: Fix miplevel validation for raster textures.
We were using the un-minified value, meaning we'd reject correctly laid
out textures.
2014-09-25 16:41:25 -07:00
Matt Turner
43267a325f mesa: Replace IS_NEGATIVE(x) with x < 0.0f.
I only made IS_NEGATIVE(x) use signbit in commit 0f3ba405 in an attempt
to fix 54805, but it didn't help. We didn't use signbit on some
platforms and instead defined it to x < 0.0f.
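
For reference, the two definitions mentioned above are not equivalent for
every input in standard C: a comparison treats negative zero as
non-negative, while signbit() reports its sign bit. The sketch below shows
the difference; the function names are hypothetical and are not the macros
from the tree.

```c
#include <math.h>

/* Comparison form: false for negative zero. */
static int
is_negative_compare(float x)
{
   return x < 0.0f;
}

/* Sign-bit form: true for negative zero as well. */
static int
is_negative_signbit(float x)
{
   return signbit(x) != 0;
}
```

Both helpers agree for ordinary negative and positive values; they differ
only for inputs such as -0.0f.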

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-25 13:57:29 -07:00
Matt Turner
50e2f70093 radeon: Use PRINTLIKE macro. 2014-09-25 13:57:29 -07:00
Matt Turner
b66791d47f configure.ac: Replace gallium_check_st with gallium_require_drm. 2014-09-25 13:57:29 -07:00
Matt Turner
28e84c93bb configure.ac: Drop gallium directory tracking.
Was only tracked to be printed at the end of configure, but configure
quits if it can't build something we requested, rather than silently
dropping it, so printing these directories has little use.
2014-09-25 13:57:29 -07:00
Matt Turner
691bd9b9df configure.ac: Use autoconf macro for GNU make. 2014-09-25 13:57:28 -07:00
Matt Turner
e4be17fd04 ralloc: Mark ralloc functions with gcc's malloc attribute.
Cuts a few hundred bytes from the DRI drivers, so it must give gcc some
extra information.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-25 13:52:55 -07:00
Matt Turner
976464c210 mesa: Replace a priori knowledge of gcc attributes with configure tests.
Note that I had to add support for testing the packed attribute to
m4/ax_gcc_func_attribute.m4.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [C bits]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-25 13:52:55 -07:00
Matt Turner
4a96df73e7 mesa: Replace a priori knowledge of gcc builtins with configure tests.
Presumably this will let clang and other compilers use the built-ins as
well.

Notice two changes specifically:
   - in _mesa_next_pow_two_64(), always use __builtin_clzll and add a
     static assertion that this is safe.
   - in macros.h, remove the clang-specific definition since it should
     be able to detect __builtin_unreachable in configure.
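
A minimal sketch of the first point, assuming a compiler that provides
__builtin_clzll (gcc and clang do). The function name and the overflow
handling are choices made for this example and may differ from what
_mesa_next_pow_two_64() does.

```c
#include <assert.h>
#include <stdint.h>

/* __builtin_clzll takes an unsigned long long, so check that it is
 * exactly as wide as uint64_t before relying on it.
 */
static_assert(sizeof(unsigned long long) == sizeof(uint64_t),
              "unsigned long long must be 64 bits");

/* Hypothetical helper: smallest power of two >= x.
 * Returns 0 when the result would not fit in 64 bits.
 */
static uint64_t
next_pow_two_64(uint64_t x)
{
   if (x <= 1)
      return 1;
   if (x > (UINT64_C(1) << 63))
      return 0;

   /* x - 1 is non-zero here, so the builtin's result is defined. */
   return UINT64_C(1) << (64 - __builtin_clzll(x - 1));
}
```

The early returns keep the builtin away from a zero argument and keep the
shift count below 64, both of which would otherwise be undefined.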

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [C bits]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-25 13:52:55 -07:00
Matt Turner
3e00822619 i965/compaction: Document instruction compaction capabilities.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-25 11:02:37 -07:00
Matt Turner
54e30dbf4d i965: Emit ELSE/ENDIF JIP with type D on Gen 7.
The spec says the type must be W (JIP is 16-bits after all), but we've
been emitting it with a UD type all along and have experienced no
adverse effects. Changing the type to D allows ELSE and ENDIF
instructions to be compacted.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-09-25 11:02:36 -07:00
Matt Turner
6a4e84edfa i965/compaction: Support compaction of control flow instructions.
We're currently emitting compactable control flow instructions with the
wrong types, preventing their compaction. The next patch will fix this and
actually enable compaction.

On chips that cannot compact control flow instructions, attempts to find
a match in the datatype table will fail.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-09-25 11:02:36 -07:00
Matt Turner
14e44f896f i965/compaction: Add support for G45.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-09-25 11:02:36 -07:00
Matt Turner
5a559557e6 i965: Add BRW_OPCODE_NENOP for G45.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-09-25 11:02:36 -07:00
Matt Turner
64c0f62018 i965/compaction: Add support for Gen5.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-25 11:02:36 -07:00
Matt Turner
bb05b530ab i965/compaction: Reduce size of compacted_counts[] array.
The array was previously indexed in units of brw_compact_inst (8-bytes),
but before compaction all instructions are uncompacted, so every odd
element was unused.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-09-25 11:02:36 -07:00
Matt Turner
90c982a8a8 i965/compaction: Use sizeof brw_inst/brw_compact_inst.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-09-25 11:02:36 -07:00
Matt Turner
b92a1e2174 i965/compaction: Increment offset in for loop.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-25 11:02:36 -07:00
Matt Turner
eebf1f5441 i965/compaction: Make src_offset local to the for loop.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-09-25 11:02:36 -07:00
Matt Turner
cde887ccb1 i965/compaction: Remove unnecessary is-compacted? check.
Used to pass over previously compacted instructions in this loop, but no
longer. No point in checking.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-09-25 11:02:36 -07:00
Matt Turner
118021f929 i965/compaction: Don't set UIP on ELSE on Gen < 8.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-25 11:02:36 -07:00
Matt Turner
45c3ece266 i965/compaction: Rework 3-src compaction logic.
It may be possible to create a contrived example in which a 3-src
instruction would have been compacted on Gen < 8. I'd rather not
discover it in the wild.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-25 11:02:36 -07:00
Matt Turner
1fce6fcac1 i965/sf: Disable instruction compaction.
Currently a no-op, since instruction compaction isn't implemented for the
generations that have a programmable strips-and-fans unit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-25 11:02:36 -07:00
Matt Turner
b5466707d6 i965: Set JumpCount, not JIP, on ENDIF on Gen 6.
Despite what the Sandybridge PRM says, ENDIF has Jump Count in <dst>,
not JIP in <src1>. (The same mistake appears about WHILE as well).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-25 11:02:36 -07:00
Kenneth Graunke
23247e8059 mesa: Use VertexArray, not _VertexArray, in array size expressions.
Both sizes are VERT_ATTRIB_MAX, so this has no effect.  But it drops a
few trivial uses of the derived state.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-09-24 23:14:26 -07:00
Kenneth Graunke
f81052dc9b mesa: Set correct array element in vbo_exec_vtx_init.
I'm not familiar with this code, but this sure appears to be a typo.
It looks like the intent is to set each array element, not arrays[0]
each time.  Notably, the loop just below uses "array", not "arrays".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: mesa-stable@lists.freedesktop.org
2014-09-24 23:14:23 -07:00
Kenneth Graunke
d0ec6e8509 mesa: Use proper structure for glGet*(GL_TEXTURE_COORD_ARRAY*).
The code in get.c that handles this uses ctx->Array.VAO->VertexAttrib,
which is a gl_vertex_attrib_array structure, not a gl_client_array.

The offsets of all fields happened to be the same in both structures, at
least on x86_64.  "Size," "Type," and "Stride" are obviously the same:
both structures start with the same fields, in the same order.

"Enabled" is dicier: there are different fields before it in both
structures, including pointer sized values which might need special
alignment.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: mesa-stable@lists.freedesktop.org
2014-09-24 23:14:16 -07:00
Kenneth Graunke
8c16a0d7ba mesa: Remove some dead helper functions.
Dead since the _MaxElement removal, but these functions seemed generally
applicable, so I decided to remove them in a separate patch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-09-24 23:13:56 -07:00
Eric Anholt
a04605a8ca vc4: Compute max_index instead of trusting the rest of userspace.
max_index was coming from either the user telling us as part of
glDrawRangeElements, or from an incidental calculation as part of some
sort of primitive conversion fallback.  Sometimes, it was just set to the
default "I don't know" ~0 value.

If it wasn't set to the actual max index, then the kernel would reject the
draw call for allowing out-of-bounds VBO reads.  So, compute the max index
from the sizes of the VBOs, which isn't too expensive (unlike mapping and
reading the index buffer) and is reliable.
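
The idea can be sketched roughly as follows. The vbo_binding structure and
compute_max_index() are hypothetical stand-ins, not the driver's types, and
the sketch assumes every bound buffer can hold at least one element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one vertex buffer binding. */
struct vbo_binding {
   uint32_t buffer_size; /* size of the buffer in bytes */
   uint32_t offset;      /* byte offset of element 0 */
   uint32_t stride;      /* bytes between consecutive elements */
   uint32_t attr_size;   /* bytes read per element */
};

/* Largest vertex index that stays in bounds for every binding.
 * Returns UINT32_MAX ("no limit known") if nothing constrains it.
 */
static uint32_t
compute_max_index(const struct vbo_binding *vbs, size_t count)
{
   uint32_t max_index = UINT32_MAX;

   for (size_t i = 0; i < count; i++) {
      const struct vbo_binding *vb = &vbs[i];
      const uint64_t first_end = (uint64_t)vb->offset + vb->attr_size;

      /* A zero stride re-reads element 0 for every index. */
      if (vb->stride == 0)
         continue;

      /* Assumption of this sketch: element 0 fits in the buffer. */
      assert(first_end <= vb->buffer_size);

      const uint32_t limit =
         (uint32_t)((vb->buffer_size - first_end) / vb->stride);
      if (limit < max_index)
         max_index = limit;
   }

   return max_index;
}
```

Only the buffer sizes are inspected, so no buffer has to be mapped or read,
which is why this is cheap compared to scanning the index buffer.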

Fixes piglit vao-element-array-buffer.
2014-09-24 20:51:15 -07:00
Eric Anholt
61cb08ab4f vc4: Move shader record setup before the draw call.
The flush only happens after both are written, so we can do them in either
order.  This will let me compute max_index during the shader record setup.
2014-09-24 20:49:08 -07:00
Matt Turner
ba0c0a186d i965/vec4: Call calculate_cfg() in test programs to avoid crashing.
Reported-by: Mark Janes <mark.a.janes@intel.com>
2014-09-24 16:06:41 -07:00
Eric Anholt
52476b35c1 vc4: Add support for gl_PointCoord.
Fixes piglit glsl-fs-pointcoord, point-sprite, and fbo-gl_pointcoord.
2014-09-24 15:59:03 -07:00
Eric Anholt
66b7bd60e0 vc4: Add support for point size setting.
This is the support for both the global and per-vertex modes.
2014-09-24 15:56:39 -07:00
Eric Anholt
f24588d64e vc4: Add support for line width setting.
I don't see piglit tests for it, but this should be better than not
emitting it at all.
2014-09-24 15:56:39 -07:00
Eric Anholt
7fa399f93a vc4: Actually add support for polygon offset.
Setting the bit without setting the offset values is kind of useless.
Fixes piglit polygon-offset (but not polygon-mode-offset).
2014-09-24 15:56:39 -07:00
Eric Anholt
6abbdfe3db vc4: Fix swapped 565 dithering versus no-dithering render configs.
Fixes many 565 piglit tests (like fbo-generatemipmap-formats) that weren't
expecting dithering.
2014-09-24 15:56:39 -07:00
Eric Anholt
8cd165051b vc4: Add support for alpha test.
Fixes most of piglit fbo-alphatest-formats (but not RGB565/332).
2014-09-24 15:56:39 -07:00
Rob Clark
a87e44da3a freedreno/a3xx: initial texture border-color
Still some open questions.. and at any rate, no additional piglit passes
due to various wrap modes that we need to emulate in at least some
cases :-(

But it does fix some mystery page-faults.. So add some comments in the
code where there are things that we need to emulate or do more r/e, and
push as-is.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-24 18:52:58 -04:00
Brian Paul
9f47220450 util: use linear formats in util_blit_pixels()
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-09-24 15:35:11 -06:00
Brian Paul
b6947e02de util: simplify writemask parameters for util_blit_pixels()
Instead of separate color and Z/S writemasks, just have one writemask
parameter that takes a mask of the PIPE_MASK_[RGBAZS] flags.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-09-24 15:35:11 -06:00
Brian Paul
b32f05e153 util: s/PIPE_TEX_MIPFILTER/PIPE_TEX_FILTER/ in u_blit code
PIPE_TEX_MIPFILTER_x is not legal for the pipe_sampler_state::
min/mag_img_filter fields.  But PIPE_TEX_MIPFILTER_x == PIPE_TEX_FILTER_x
so we were getting lucky.

This also makes the code consistent with u_blitter.c.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-09-24 15:35:10 -06:00
Brian Paul
f5e8b30472 mesa: remove EXT suffix from FBO error messages
And pass caller="" for _mesa_FramebufferTexture().

Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-24 15:35:10 -06:00
Matt Turner
5980fc35c9 mesa: Drop _mesa_getenv() wrapper.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-24 09:58:43 -07:00
Matt Turner
209eba42eb mesa: Drop _mesa_bsearch() wrapper.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-24 09:58:43 -07:00
Matt Turner
9499d6e358 mesa: Unifdef _WIN32_WCE.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-24 09:58:43 -07:00
Matt Turner
d20015a576 mesa: Unifdef _XBOX.
Inexplicably added in commit 36940429.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-24 09:58:43 -07:00
Matt Turner
b133b84733 configure.ac: Remove duplicate -DHAVE_PTHREAD.
It's also defined by the AX_PTHREAD macro.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-24 09:58:43 -07:00
Matt Turner
d1022529fe configure.ac: Stop checking for perl.
Added by commit a75c6163, but no longer used.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-24 09:58:43 -07:00
Matt Turner
585e250dd2 configure.ac: Use test -a, rather than another test.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-24 09:58:43 -07:00
Matt Turner
452926a5ec mesa: Use realloc() instead of _mesa_realloc() and remove the latter.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-24 09:58:42 -07:00
Matt Turner
e5162defc8 mesa: Remove duplicate _mesa_{init,free}_shader_state prototypes.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-09-24 09:58:42 -07:00
Tom Stellard
180b152b24 gallivm: Wrap deleted include in if HAVE_LLVM < 0x0306
This was missed in 8f4ee56.
2014-09-24 11:54:44 -04:00
Matt Turner
ef75f60822 i965: Add and use functions to get next/prev blocks.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-24 09:42:46 -07:00
Matt Turner
444fc0b4a8 i965: Call insert and remove functions from exec_node directly.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-24 09:42:46 -07:00
Matt Turner
49374fab5d i965: Make instruction lists local to the bblocks.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-24 09:42:46 -07:00
Matt Turner
3fe1a84bbe i965/cfg: Add note about double-loop macros and break behavior.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-24 09:42:46 -07:00
Matt Turner
153d148e9e i965: Replace initialization loops with memset().
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-24 09:42:46 -07:00
Matt Turner
72bb3f81c6 i965/vec4: Don't iterate between blocks with inst->next/prev.
The register coalescing portion of this patch hurts three shaders in
Guacamelee by one instruction each, but examining the diff makes me
believe that what we were generating was (perhaps harmlessly) incorrect.
2014-09-24 09:42:46 -07:00
Matt Turner
f0598d413b i965/fs: Don't iterate between blocks with inst->next/prev.
When instruction lists are per-basic block, this won't work.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-24 09:42:46 -07:00
Matt Turner
7119712f45 i965/cfg: Add macros to iterate through a block given a starting point.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-24 09:42:46 -07:00
Matt Turner
235f451f7a i965/fs: Make count_to_loop_end() use basic blocks.
When the instructions aren't in a flat list, this wouldn't have worked.
Also, this should be faster.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-24 09:42:46 -07:00
Matt Turner
90bfeb2244 i965/vec4: Don't use instruction list after calculating the cfg.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-24 09:42:46 -07:00
Matt Turner
2ff0ff880c i965/fs: Don't use instruction list after calculating the cfg.
The only trick is changing a break into a return true in register
coalescing, since the macro is actually a double loop, and break will do
something different than you expect. (Wish I'd realized that earlier!)
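
The pitfall can be reproduced with a simplified stand-in for such a macro.
The macro below, its fixed loop bounds and the two functions are assumptions
made for this sketch and do not match the driver's definitions.

```c
#define NUM_BLOCKS      3
#define INSTS_PER_BLOCK 4

/* Simplified stand-in: one macro that expands to two nested loops. */
#define sketch_foreach_block_and_inst(block, inst)           \
   for (int block = 0; block < NUM_BLOCKS; block++)          \
      for (int inst = 0; inst < INSTS_PER_BLOCK; inst++)

/* break only leaves the inner loop, so the remaining blocks are
 * still visited.
 */
static int
visits_with_break(void)
{
   int visited = 0;

   sketch_foreach_block_and_inst(block, inst) {
      visited++;
      if (block == 0 && inst == 1)
         break;
   }
   return visited;
}

/* return leaves both loops at once. */
static int
visits_with_return(void)
{
   int visited = 0;

   sketch_foreach_block_and_inst(block, inst) {
      visited++;
      if (block == 0 && inst == 1)
         return visited;
   }
   return visited;
}
```

With these bounds the break variant stops early only in the first block and
then walks the other two blocks in full, while the return variant stops
after the second instruction.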

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-24 09:42:46 -07:00
Matt Turner
a4fb8897a2 i965: Remove now unneeded calls to calculate_cfg().
Now that nothing invalidates the CFG, we can calculate_cfg() immediately
after emit_fb_writes()/emit_thread_end() and never again.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-24 09:42:46 -07:00
Matt Turner
072ea414d0 i965: Remove cfg-invalidating parameter from invalidate_live_intervals.
Everything has been converted to preserve the CFG.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-24 09:42:46 -07:00
Matt Turner
9e28bb863c i965: Preserve the CFG in instruction scheduling.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-24 09:42:46 -07:00
Matt Turner
269b6e24d6 i965/vec4: Preserve CFG in spill_reg().
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-24 09:42:46 -07:00
Matt Turner
b0b64c85e4 i965/vec4: Preserve the CFG in a few more places.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-24 09:42:46 -07:00
Matt Turner
a9f8296dbb i965/fs: Preserve the CFG in a few more places.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-24 09:42:46 -07:00
Kristian Høgsberg
9b75663866 i965: Restructure debug flags
This cleans up the debug flags to be consistently indented, use bit
shifting instead of hex-values and fixes a bug where the new DEBUG_NO8 flag
used the same value as the DEBUG_VUE flag.  This was hidden by the numbers not
being aligned.  Also removes gaps in the range where DEBUG_IOCTL (0x4) and
DEBUG_REGION (0x400) used to be.
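
A short sketch of why the shift notation helps, using two of the flag names
from this message with made-up bit positions (the real values are not given
here). The static assertion and the debug_enabled() helper are additions for
this example only.

```c
#include <assert.h>

/* Hypothetical bit positions; with aligned shifts a duplicate
 * is easy to spot by eye.
 */
#define DEBUG_VUE   (1u << 5)
#define DEBUG_NO8   (1u << 6)

/* Catch an accidental overlap at compile time. */
static_assert((DEBUG_VUE & DEBUG_NO8) == 0,
              "debug flags must not share a bit");

/* Return non-zero if flag is set in flags. */
static int
debug_enabled(unsigned flags, unsigned flag)
{
   return (flags & flag) != 0;
}
```

If both macros expanded to the same bit, enabling one flag would silently
enable the other, which is the kind of bug described above.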

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-09-24 09:15:09 -07:00
Tom Stellard
8f4ee56e49 gallivm: Disable gallivm to fix build with LLVM 3.6
LLVM commit r218316 removes the JITMemoryManager class, which is
the parent for a seemingly important class in gallivm.  In order to
fix the build, I've wrapped most of lp_bld_misc.cpp in
if HAVE_LLVM < 0x0306 and modified the
lp_build_create_jit_compiler_for_module() function to return false
for 3.6 and newer which effectively disables the gallivm functionality.

I realize this is overkill, but I could not come up with a simple
solution to fix the build.  Also, since 3.6 will be the first release
without the old JIT, it would be really great if we could
move gallivm to use the C API only for accessing MCJIT.  There
is still time before the 3.6 release to extend the C API in
case it is missing some functionality that is required by gallivm.
2014-09-24 10:34:19 -04:00
Marek Olšák
2f7714e071 gallium/rbug: correctly unreference a sampler view
This fixes heap corruption. The sampler view can be bound in the context,
so we cannot call destroy directly.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
91ddf49c87 gallium/rbug: unlock a mutex in rbug_create_query
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
c944866708 radeonsi: remove old cache flushing code
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
dd53d53dc6 radeonsi/compute: do CS partial flush with si_emit_cache_flush
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
604b58b554 radeonsi/compute: flush caches with si_emit_cache_flush
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
628f8ee1d9 radeonsi/compute: directly emit CONTEXT_CONTROL
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
dc05a9e4e0 radeonsi: properly destroy the GS copy shader and scratch_bo for compute
Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
711623f7c8 radeonsi: release GS rings at context destruction
Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
2833dc4e45 radeonsi: don't use pipe_constant_buffer for GS rings
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
1abb1a97b0 radeonsi: don't pass the context to the shader translator
This should prevent accessing context state there.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
e29353ff20 radeonsi: don't snoop currently-bound GS shader when compiling ES
Instead, pass the layout of GS inputs in memory to the ES using the shader
key. Only 64 bits are needed to represent the layout in the key.

Mixing and matching different VS and GS shaders should now always work.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
2774abd4ce radeonsi: shorten si_pipe_* prefixes to si_*
This was the original naming convention in r600g and it somehow crept
into radeonsi.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
8c37c16cbc radeonsi: merge si_pipe_shader into si_shader
One is part of the other anyway.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
07c0b4d9b7 radeonsi: disable gl_SampleMask fragment shader output if MSAA is disabled
This fixes piglit: arb_sample_shading-builtin-gl-sample-mask 0

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
b53b1ceb3e radeonsi: only update MSAA-specific framebuffer state if nr_samples is changed
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
dba4c5baf4 radeonsi: move DB_SHADER_CONTROL into db_render_state
I will need this for fixing sample shading with 1 sample.

The good news is that all shader pm4 states no longer use the current context
state, so we can generate the pm4 states outside of draw_vbo if needed.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
adc5797f54 radeonsi: set KILL_ENABLE during shader compilation, remove uses_kill flag
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
a34c9f70b1 radeonsi: remove shader.ps_conservative_z, set db_shader_control instead
Also set the field on SI too. It's not just specific to CIK.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
884f1654e2 radeonsi: move DB registers from draw_vbo into new db_render_state
It's called db_misc_state in r600g.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
a768b43bc3 radeonsi: remove unused variable si_pipe_shader::sprite_coord_enable
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
fd076259ff radeonsi: document what si_descriptors.c does
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
4ace4190ac r300g: implement MSAA copies by resolving and upsampling
There's no other way. It will use hw resolve + blit.
2014-09-24 14:48:02 +02:00
Marek Olšák
6cfedf8797 st/mesa: redefine mapping from VARYING_SLOT_TEXi/PNTC/VARi to TGSI GENERIC[i]
Generic varyings in TGSI were based on the value of VARYING_SLOT_TEX0, so VAR0
was always GENERIC[22] (with tessellation patches). Some drivers might not
be able to cope with that.

This commit defines a proper mapping, so that PNTC is GENERIC[8] and VAR0 is
GENERIC[9].

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-09-24 14:48:02 +02:00
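
The mapping described above can be sketched as a small lookup. The slot numbers below are hypothetical (Mesa's real VARYING_SLOT_* values differ), and the assumption that there are eight TEX slots is inferred only from PNTC landing at GENERIC[8].

```c
/* Hypothetical slot numbering, chosen so the mapping is not the identity.
 * Mesa's real VARYING_SLOT_* enum has different values and more entries. */
enum ex_varying_slot {
    EX_SLOT_TEX0 = 4,    /* TEX0..TEX7 occupy 4..11 (assumed 8 slots) */
    EX_SLOT_PNTC = 12,
    EX_SLOT_VAR0 = 20    /* VAR0, VAR1, ... follow */
};

/* Returns the GENERIC[] index for a slot, or -1 if the slot is not one of
 * TEXi, PNTC or VARi. */
static int ex_generic_index(int slot)
{
    if (slot >= EX_SLOT_TEX0 && slot < EX_SLOT_TEX0 + 8)
        return slot - EX_SLOT_TEX0;          /* TEXi -> GENERIC[i]     */
    if (slot == EX_SLOT_PNTC)
        return 8;                            /* PNTC -> GENERIC[8]     */
    if (slot >= EX_SLOT_VAR0)
        return 9 + (slot - EX_SLOT_VAR0);    /* VARi -> GENERIC[9 + i] */
    return -1;
}
```

The point of a fixed mapping is that the GENERIC index no longer depends on where VARYING_SLOT_TEX0 happens to sit in the enum.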
Marek Olšák
77038cd35a st/mesa: don't set coord_enable for gl_PointCoord if using TGSI_SEMANTIC_PCOORD
This was missed when Christoph Bumiller added PIPE_CAP_TGSI_TEXCOORD.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
ffbcee8a57 st/mesa: use UniformBooleanTrue in glsl_to_tgsi
Just for consistency. This doesn't fix anything as the original code was
already pretty good.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
4155d1c7b0 st/mesa: drop dependence on API profile in st_init_extensions
The extensions and limits being set in the conditional block are core-only
anyway and don't have any effect on other profiles.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-09-24 14:48:02 +02:00
Marek Olšák
2599b92eb9 mesa: allow forcing >=3.1 compatibility contexts with MESA_GL_VERSION_OVERRIDE
E.g. the 4.0 compatibility profile can be forced with:

MESA_GL_VERSION_OVERRIDE=4.0COMPAT

Some tests that I have require 4.0 compatibility.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-09-24 14:48:01 +02:00
Marek Olšák
10ffd98c34 mesa: don't set ES versions to GLSLVersion in _mesa_init_constants
No place in Mesa expects an ES version there.
Drivers don't even set it like this.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-09-24 14:48:01 +02:00
Emil Velikov
a3e9582f09 targets/vl: don't forget to set GALLIUM_STATIC_TARGETS
git rebase failure while dropping out a patch that reworks
the way we build aux/vl.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-24 11:54:28 +01:00
Emil Velikov
5a68432f04 targets/egl: fold in target LDFLAGS variables
Both variables are identical thus we can fold them into AM_LDFLAGS.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-24 10:44:01 +01:00
Emil Velikov
a37b9bb555 targets: drop the old MEGADRIVERS & STATIC_TARGET... variables
No longer used/needed as of last commit.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-24 10:44:01 +01:00
Emil Velikov
0f3c0ff17b gallium/softpipe,llvmpipe: add automake target 'templates'
Rather than duplicating the libdeps, extra define... all over the
targets, define them only once and use when applicable.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-24 10:44:01 +01:00
Emil Velikov
29c4ae0ebf configure: remove NEED_{SOFT,LLVM}PIPE_DRIVER variables
The respective HAVE_{SOFT,LLVM}PIPE are already descriptive
enough. Additionally the svga module does not really use either
one, but the auxiliary draw & gallivm modules.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-24 10:44:01 +01:00
Emil Velikov
3d909864c8 gallium/vc4: add automake target 'templates'
Rather than duplicating the libdeps, extra define... all over the
targets, define them only once and use when applicable.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-24 10:44:01 +01:00
Emil Velikov
c2b5d7024e gallium/r300,r600,radeonsi: add automake target 'templates'
Rather than duplicating the libdeps, extra define... all over the
targets, define them only once and use when applicable.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-24 10:44:01 +01:00
Emil Velikov
fd4cd8e20a gallium/svga: add automake target 'template'
Rather than duplicating the libdeps, extra define... all over the
targets, define them only once and use when applicable.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-24 10:44:00 +01:00
Emil Velikov
ca32ce40b1 gallium/ilo: add automake target 'template'
Rather than duplicating the libdeps, extra define... all over the
targets, define them only once and use when applicable.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-24 10:44:00 +01:00
Emil Velikov
defd48c6c5 gallium/i915: add automake target 'template'
Rather than duplicating the libdeps, extra define... all over the
targets, define them only once and use when applicable.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-24 10:44:00 +01:00
Emil Velikov
97bec98ac9 gallium/freedreno: add automake target 'template'
Rather than duplicating the libdeps, extra define... all over the
targets, define them only once and use when applicable.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-24 10:44:00 +01:00
Emil Velikov
0e59153229 gallium/nouveau: add automake target 'template'
Rather than duplicating the libdeps, extra define... all over the
targets, define them only once and use when applicable.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-24 10:44:00 +01:00
Emil Velikov
6e1f846ce0 targets/pipe-loader: drop unused authentication
The dri, vdpau, omx, xvmc and gbm targets don't need any authentication;
even the VL ones never used it. Either the respective loader or the
library itself (vl) does its auth prior to calling create_screen().

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-24 10:43:44 +01:00
Emil Velikov
18137c5fe0 targets/vl: fix hard-links when building shared pipe-drivers
Make sure that MEGADRIVERS is set in order to create the hardlinks.
The variable name is not the most appropriate and will be sorted
out in upcoming commits.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-24 10:38:43 +01:00
Emil Velikov
1cb8bba499 configure: remove unused variable OSMESA_MESA_DEPS
Leftover from the static Makefiles

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-24 10:38:43 +01:00
Emil Velikov
523fa2f1ce gallium/freedreno: remove unused draw header
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-24 10:38:43 +01:00
Emil Velikov
e8053bb65e gallium/r300: remove obsolete declaration
The definition of rc_pair_regalloc_inputs_only() is no longer
around so drop the declaration.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-24 10:38:43 +01:00
Eric Anholt
bf4aecfb2a vc4: Drop maximum number of varyings down to 8.
There are only 32 bits in the flatshade flags (which are 1 bit per
component), the simulator crashes when you use more than about this many
varyings, and the original Broadcom code drop only exposed 8 as well.

Fixes 26 piglit tests in the varying-packing group, and makes many others
go from crash to fail (due to not checking their varying counts and
treating link failures as failures).  Regresses ARB_fp/minmax (due to 8
varyings instead of 10).
2014-09-24 00:25:07 -07:00
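
The arithmetic behind the limit of 8 can be sketched as follows, assuming four components per varying (an assumption; the message only states one bit per component in a 32-bit word). The names are hypothetical and not taken from the vc4 driver.

```c
#include <stdint.h>

#define EX_FLATSHADE_BITS          32
#define EX_COMPONENTS_PER_VARYING   4   /* assumption: vec4 varyings */
#define EX_MAX_VARYINGS  (EX_FLATSHADE_BITS / EX_COMPONENTS_PER_VARYING)

/* Marks one component of one varying as flat-shaded.
 * Returns 1 on success, 0 if that component has no bit in the 32-bit word. */
static int ex_mark_flat(uint32_t *flags, unsigned varying, unsigned component)
{
    if (varying >= EX_MAX_VARYINGS || component >= EX_COMPONENTS_PER_VARYING)
        return 0;
    *flags |= UINT32_C(1) << (varying * EX_COMPONENTS_PER_VARYING + component);
    return 1;
}
```

Under that assumption, 32 bits divided by 4 components gives 8 varyings; a ninth varying would need bit 32, which does not exist.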
Eric Anholt
45b104e0a2 vc4: Add support for flat shading.
This is just the GL 1.1 flat shading of colors -- we don't need to support
TGSI constant interpolation bits, because we don't do GLSL 1.30.

Fixes 7 piglit tests.
2014-09-23 17:23:29 -07:00
Eric Anholt
0e7bc3088b vc4: Drop stale comment.
This should have been in 001247d230.
2014-09-23 17:23:29 -07:00
Brian Paul
e8ea783d79 util: fix SCons build
after register_allocate.c was moved to util/ directory.
2014-09-23 16:33:17 -06:00
Eric Anholt
9dbfca10a3 vc4: Put dead writes into the NOP register when generating code.
They still provide register pressure since I haven't made a special class
for them, but since they're only live for one instruction it probably
doesn't matter.

This improves the readability of QPU assembly.
2014-09-23 13:51:42 -07:00
Eric Anholt
d2b58240b4 vc4: When possible, resolve raddr conflicts by swapping files on specials.
Cleans up a bunch of ugliness in perspective interpolation.
2014-09-23 13:51:41 -07:00
Eric Anholt
3e5325e8c9 vc4: Fix overzealous raddr conflict resolution.
We only need to do the fixup when both args are in the same file, not just
when both are in physical registers.
2014-09-23 13:51:29 -07:00
Eric Anholt
2e48b286bf vc4: Add support for 8-bit unorm/snorm vertex inputs. 2014-09-23 13:40:10 -07:00
Eric Anholt
b7edf30191 vc4: Add disasm for A-file unpack operations.
The A-file unpack is just like R4 unpack, except that if you don't do a
floating-point operation it won't do float conversion (so int16 gets
scaled up to int32).
2014-09-23 13:40:10 -07:00
Eric Anholt
71e5ba9c01 vc4: Switch to using Mesa's register allocator.
This will let me more reliably allocate a-file registers, which are going
to be even more in demand when I start using a-file unpacks.

Also fixes a bug where the reservation of payload registers (FRAG_Z/W) was
off by one but just caused failure to register allocate at all if the
off-by-one was fixed.
2014-09-23 13:40:10 -07:00
Eric Anholt
0148690ac7 vc4: Make a static list of all the registers. 2014-09-23 13:40:10 -07:00
Eric Anholt
e157837282 vc4: Switch the context struct to use ralloc.
I wanted to hang the ra_regs off it so I didn't have to free, but it
turned out it wasn't ralloced yet.
2014-09-23 13:40:10 -07:00
Eric Anholt
517e01b5c3 mesa: Move register_allocate.c to util.
The r300 gallium driver is using it outside of the Mesa tree, and I wanted
to do so for vc4 as well.  Rather than make the multiple-definitions
problem even more complicated, just move it to more-shared code.

v2: Don't forget to delete the symlink in r300 (review by Matt).
    Delete more r300-helper references (review by Emil)
    Don't prefix util/ header inclusion with "util/" (review by Emil)

Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> (v1)
2014-09-23 13:40:10 -07:00
Roland Scheidegger
5e1fcc6258 gallivm: fix idiv
ffeb77c7b0 had a typo which turned all signed
integer divisions into unsigned ones. Oops.
This gets us back the 51 little piglits
(all from glsl built-in-functions, fs/vs/gs-op-div-int-ivec2 and similar).

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-09-23 21:46:00 +02:00
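
Why a signed/unsigned mix-up breaks integer division can be shown in plain C. This is an illustration of the arithmetic only, not the gallivm code; both function names are made up for the example.

```c
#include <stdint.h>

/* Signed division: C truncates toward zero. */
static int32_t ex_idiv(int32_t a, int32_t b)
{
    return a / b;
}

/* The same bit patterns divided as unsigned values, which is what a
 * signed division effectively becomes if it is emitted as unsigned. */
static int32_t ex_idiv_as_unsigned(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a / (uint32_t)b);
}
```

For non-negative operands the two agree, which is why such a typo can go unnoticed until a test divides a negative value.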
Juha-Pekka Heikkila
4ed23fd590 egl: extra null checks for get_xcb_screen() return values
Verify the pointer returned by get_xcb_screen() before using it.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila
b9463813ee meta: Fix error paths in meta_copy_image.c
If _mesa_get_tex_image() returns NULL, an error is already
set in the context. Other error paths free the allocated texture.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila
e13a8dc37d meta: Avoid null access on setup_glsl_msaa_blit_shader()
On the default fallback path there was a null access on src_rb.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila
ba089cfa82 i965: Add extra null check in intel_bufferobj_alloc()
Check that calloc returned the requested memory.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila
51aa221480 mesa/main: Check allocations success in _mesa_one_time_init_extension_overrides()
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila
a3d6146e3a glsl: Check realloc return value in ir_function::matching_signature()
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-23 10:25:02 +03:00
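
A common shape for this kind of check is sketched below. It is a generic C pattern, not the code from ir_function::matching_signature(); the helper name is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Grows *buf to hold new_count ints.
 * Returns 1 on success and 0 on failure. On failure *buf is left untouched,
 * so the caller still owns the old block and can free it. */
static int ex_grow_int_array(int **buf, size_t new_count)
{
    int *tmp;

    /* Reject zero-sized requests and sizes that would overflow size_t. */
    if (new_count == 0 || new_count > SIZE_MAX / sizeof **buf)
        return 0;

    /* Assign to a temporary first: writing the result straight into *buf
     * would leak the old block if realloc returned NULL. */
    tmp = realloc(*buf, new_count * sizeof **buf);
    if (tmp == NULL)
        return 0;

    *buf = tmp;
    return 1;
}
```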
Juha-Pekka Heikkila
261120daef loader: Check dlsym() did not fail in libudev_get_device_name_for_fd()
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila
d2f0442bf6 glsl: Check calloc return value in link_intrastage_shaders()
Check calloc return value while adding built-in functions.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila
808b8e59c0 i965: Avoid null access in intelMakeCurrent()
Separate two null checks connected with && into their own if branches.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila
36f8042e8c mesa: add null checks in symbol_table.c
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila
6e56eaf7b7 glsl: add missing null check in tfeedback_decl::init()
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila
a82b29d526 i965: in set_read_rb_tex_image() check _mesa_meta_bind_rb_as_tex_image() did succeed
Check if _mesa_meta_bind_rb_as_tex_image() did give the texture.
If no texture was given there is already either
GL_INVALID_VALUE or GL_OUT_OF_MEMORY error set in context.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila
5a6ec26aec glsl: Fix memory leak in glsl_lexer.ll
Running fast clear glClear with SNB caused Valgrind to
complain about this.

v2: line 237 fixed glClear from leaking memory, other
strdups are also now changed to ralloc_strdups but I
don't know what effect those have. At least no changes in
my Piglit quick run.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-23 10:25:02 +03:00
Chia-I Wu
6c9d67118a ilo: rework pipeline workarounds
Add current_pipe_control_dw1 and deferred_pipe_control_dw1 to track what has
been done since the last 3DPRIMITIVE and what needs to be done before the next
3DPRIMITIVE.  Based on them, we can emit WAs more smartly.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-23 10:08:05 +08:00
Chia-I Wu
34e807817f ilo: remove handle_invalid_batch_bo()
It was used to set has_gen6_wa_pipe_control to false when the batch buffer
changed.  When called from emit_flush() and others, it also unset
ILO_3D_PIPELINE_INVALIDATE_BATCH_BO so that the following emit_draw() will not
set has_gen6_wa_pipe_control to false again.  It sounded error-prone and was
just ugly.

We should be able to achieve the same goal by resetting has_gen6_wa_pipe_control
in ilo_3d_pipeline_invalidate().  With handle_invalid_batch_bo() gone, the
emit functions can also be inlined.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-23 10:08:05 +08:00
Chia-I Wu
2c1f978d6c ilo: make gen6_pipeline_update_max_svbi() static
We do not need to call it from GEN7 pipeline anymore since software
PIPE_QUERY_PRIMITIVES_EMITTED is gone.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-23 10:08:05 +08:00
Ilia Mirkin
f6ff4cd517 freedreno/ir3: add TXB2 support
Handles texture(samplerCubeShadow, bias), part of GLES3 and GL3

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-22 22:06:34 -04:00
Ilia Mirkin
9b7961f9a3 freedreno/ir3: add TXQ support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-22 22:06:34 -04:00
Ilia Mirkin
9a3dcf21d7 freedreno/ir3: fix TXB/TXL to actually pull the bias/lod argument
Previously we would get a potentially computed post-swizzle coord based
on the texture target info, which would not include the bias/lod in the
last argument.

The second argument does not have to be adjacent, so adjusting the order
array did not make sense.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-22 22:06:34 -04:00
Ilia Mirkin
53678f5e6b freedreno/ir3: make texture instruction construction more dynamic
This will make life a lot easier as we add support for additional
instructions.

v2: shadow reference value is always .z or .w

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-22 22:06:34 -04:00
Andreas Pokorny
df341320c9 i915: Fix black buffers when importing prime fds
Width and Height of the imported image were never initialized from the
imported bo.

Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Pokorny <andreas.pokorny@canonical.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2014-09-23 00:26:17 +01:00
Andreas Pokorny
53b614bfd3 egl/drm: expose KHR_image_pixmap extension
This change enables EGL_KHR_image_pixmap in the egl drm platform, which is implemented
there but has not been advertised yet.

Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Pokorny <andreas.pokorny@canonical.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2014-09-23 00:25:45 +01:00
Brian Paul
6addb7f42b gallium: update comment for enum pipe_format
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-09-22 16:59:48 -06:00
Brian Paul
e7a614c60c gallium: replace pipe_type enum with tgsi_return_type enum
The only place the enum pipe_type was used is for the TGSI sampler
view return type.  So make it a TGSI type.  Note: it appears this
part of TGSI isn't used by anyone so it may be removed in the future.

v2: the new name is tgsi_return_type, not tgsi_type.  This means we
can drop the previously posted tgsi_type -> tgsi_opcode_type patch.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-09-22 16:56:24 -06:00
Brian Paul
9ce72ac1fa draw: use new tgsi_transform inst/decl helpers in pstipple code
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-09-22 16:56:24 -06:00
Brian Paul
493ab77551 draw: use new tgsi_transform inst/decl helpers in aapoint code
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-09-22 16:56:24 -06:00
Brian Paul
d7e5b7138a draw: use new tgsi_transform inst/decl helpers in aaline code
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-09-22 16:56:24 -06:00
Brian Paul
e9d076e6d0 tgsi: add inst/decl helpers for tgsi_transform utility
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-09-22 16:56:24 -06:00
Brian Paul
16ff2fdd70 draw: use tgsi transform prolog callback in polygon stipple code
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-09-22 16:56:24 -06:00
Brian Paul
6581aa441e draw: use tgsi transform prolog/epilog callbacks in AA line code
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-09-22 16:56:24 -06:00
Brian Paul
d77c0a2b52 draw: use tgsi transform prolog/epilog callbacks in AA point code
This simplifies the code and makes it a little easier to understand.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-09-22 16:56:24 -06:00
Brian Paul
9e0160fc58 tgsi: fix tgsi transform's epilog callback
We want to call the caller's epilog callback when we find the TGSI
END instruction, not after it.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-09-22 16:56:23 -06:00
Brian Paul
b16bb3f50f tgsi: add prolog() method to tgsi_transform_context
Called when the user can insert new decls, instructions.
This could be used in a few places in the 'draw' module.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-09-22 16:56:23 -06:00
Brian Paul
2826212dc7 glsl: use ptrdiff_t cast to silence g++ sign warning
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-22 16:56:23 -06:00
Jordan Justen
19b08e1bb3 i965/fs: Remove direct fs_visitor brw_wm_prog_key dependence
Instead we store a void pointer to the key, and cast it to
brw_wm_prog_key for fragment shader specific code paths.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-22 11:11:33 -07:00
Jordan Justen
e9be6a7833 i965/fs: Use brw_sampler_prog_key_data instead of brw_wm_prog_key::tex
This helps:
1. Reduce the need to have fs_visitor::key's type be brw_wm_prog_key*
2. Align the code to allow brw_sampler_prog_key_data to be pulled out of other
   prog_key types for different stages.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-22 11:11:33 -07:00
Jordan Justen
49e5f76a65 i965/fs: Remove direct fs_visitor brw_wm_prog_data dependence
Instead we store a brw_stage_prog_data pointer, and cast it to
brw_wm_prog_data for fragment shader specific code paths.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-22 11:11:33 -07:00
Tom Stellard
c6d9801409 clover: Add support to mem objects for multiple destructor callbacks v2
The spec says that mem objects should maintain a stack of callbacks
not just one.

v2:
  - Remove stray printf.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

CC: "10.3" <mesa-stable@lists.freedesktop.org>
2014-09-22 12:32:34 -04:00
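
The stack behaviour the message refers to can be sketched in C. This is not clover's implementation (clover is C++); the struct and function names are hypothetical, and the sketch assumes callbacks are run in the reverse of their registration order, as a stack implies.

```c
#include <stdlib.h>

typedef void (*ex_destroy_cb)(void *user_data);

struct ex_cb_node {
    ex_destroy_cb cb;
    void *user_data;
    struct ex_cb_node *next;
};

struct ex_mem_object {
    struct ex_cb_node *callbacks;   /* top of the callback stack */
};

/* Example callback: appends the int pointed to by user_data to a log. */
static int ex_log[8];
static int ex_log_len;

static void ex_record(void *user_data)
{
    if (ex_log_len < 8)
        ex_log[ex_log_len++] = *(int *)user_data;
}

/* Pushes a callback; returns 1 on success, 0 if allocation fails. */
static int ex_push_destructor(struct ex_mem_object *obj,
                              ex_destroy_cb cb, void *user_data)
{
    struct ex_cb_node *node = malloc(sizeof *node);

    if (node == NULL)
        return 0;
    node->cb = cb;
    node->user_data = user_data;
    node->next = obj->callbacks;
    obj->callbacks = node;
    return 1;
}

/* Pops and runs every callback, most recently registered first. */
static void ex_run_destructors(struct ex_mem_object *obj)
{
    while (obj->callbacks != NULL) {
        struct ex_cb_node *node = obj->callbacks;

        obj->callbacks = node->next;
        node->cb(node->user_data);
        free(node);
    }
}
```

Keeping a single callback slot would silently drop every registration but the last, which is the behaviour a stack avoids.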
Brian Paul
cc71457b48 st/xa: silence unused variable warning
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-22 08:04:34 -06:00
Brian Paul
0100d45b7e target-helpers: add inline qualifier on configuration_query()
To silence unused function warnings.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-22 08:04:34 -06:00
Chia-I Wu
a68f421d73 ilo: clean up fallback path for primitive restart
We should be able to draw with the index buffer mapped.  That simplifies
things a lot.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-22 14:22:37 +08:00
Chia-I Wu
d69faf851f ilo: handle conditional rendering in the context
Conditional rendering is not limited to draw_vbo().  Move the support to
ilo_context, and replace ilo_3d_pass_render_condition() by
ilo_skip_rendering().
2014-09-22 12:51:42 +08:00
Chia-I Wu
295a3a3ff0 ilo: create the pipeline from the builder
The pipeline needs just the builder to build commands.  It does not need CP.
2014-09-22 11:47:33 +08:00
Chia-I Wu
61c6a294dd ilo: move aperture checks out of pipeline
They can be done outside of the pipeline.  Move them and let the pipeline
focus on building commands.
2014-09-22 11:45:38 +08:00
Chia-I Wu
672592de7e ilo: flush before setting SOL_RESET
SOL_RESET happens before bo execution.  It should not be observed by the
commands that are already in the bo.

Move the code out of the pipeline now that it submits.
2014-09-22 10:41:13 +08:00
Chia-I Wu
17e7582465 ilo: move size estimation check out of pipeline
It can be done outside of the pipeline.  Let's move it.
2014-09-22 10:36:27 +08:00
Rob Clark
49b8fb937f freedreno/a3xx: more texture array fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-21 15:36:26 -04:00
Rob Clark
18291ee17a freedreno: add DRM_CONF_SHARE_FD
Add config query and DRM_CONF_SHARE_FD to both mega-driver and
traditional build configs, so that EGL_EXT_image_dma_buf_import
works.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-21 15:35:53 -04:00
Chia-I Wu
41f072a4f8 ilo: use a single list for queries
We used different lists for different types of queries because we wanted to
update software queries quickly.  Now that there are no software queries, we
are fine with a single list.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-21 23:36:00 +08:00
Chia-I Wu
6b79d894d7 ilo: replace software queries by hardware ones
Read PIPE_QUERY_PRIMITIVES_GENERATED and PIPE_QUERY_PRIMITIVES_EMITTED from
hardware registers.  Because all queries now have a bo, remove unnecessary
checks for q->bo.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-21 23:31:41 +08:00
Chia-I Wu
154972700d ilo: support prim queries in ilo_3d_pipeline_emit_query()
Add support for PIPE_QUERY_PRIMITIVES_GENERATED and
PIPE_QUERY_PRIMITIVES_EMITTED in ilo_3d_pipeline_emit_query().

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-21 23:31:31 +08:00
Chia-I Wu
900d8136e1 ilo: add ilo_3d_pipeline_emit_query()
It replaces

  ilo_3d_pipeline_emit_write_timestamp(),
  ilo_3d_pipeline_emit_write_depth_count(), and
  ilo_3d_pipeline_emit_write_statistics().

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-21 23:31:20 +08:00
Chia-I Wu
9c873816a8 ilo: rework query support
This fixes some corner cases, but more importantly, the new code should be
easier to reason about.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-21 23:31:10 +08:00
Chia-I Wu
26fefae9a7 ilo: clarify cp owning/releasing
Make it own()'s responsibility to make room for release() and itself.  To be
able to do that, allow ilo_cp_submit() in own().

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-21 23:30:51 +08:00
Chia-I Wu
4eb2bbefd2 ilo: add a pointer to builder in ilo_3d_pipeline
It saves quite a bit of typing.
2014-09-20 11:45:31 +08:00
Chia-I Wu
8b4726d32e ilo: add a helper for RECTLIST blitter
Add ilo_3d_draw_rectlist() for use by RECTLIST blitter.
2014-09-20 11:29:40 +08:00
Chia-I Wu
bca549691e ilo: no direct ilo_context access in BLT blitter
We need ilo_builder for command building and ilo_cp for size check.
ilo_context is not used.
2014-09-20 11:06:08 +08:00
Chia-I Wu
c1165c8ea0 ilo: fix headers in Makefile.sources 2014-09-20 11:01:35 +08:00
Chia-I Wu
6c0de4b979 ilo: add a new struct for context states
Move pipe states in ilo_context to the new ilo_state_vector.  The motivation
is that ilo_context consists of several loosely related things.  When we need
an ilo_context somewhere, we usually need only one or two of the things in it.
This change makes ilo_state_vector one such thing.

An immediate result is that we no longer need ilo_context in 3D pipelines,
something we have planned for since early days.
2014-09-20 10:13:53 +08:00
Chia-I Wu
284d767be0 ilo: merge ilo_gpe.h to ilo_state*.h
Move the #define's and struct's to ilo_state.h.  Move the inline functions and
function declarations to ilo_state_gen.h.
2014-09-20 10:13:53 +08:00
Chia-I Wu
4a8a6ce154 ilo: rename ilo_gpe_gen*.[ch]
Rename them to ilo_state_gen*.[ch].
2014-09-20 10:13:53 +08:00
Chia-I Wu
3cb383c1c9 ilo: make ilo_fence opaque
It is manipulated only in ilo_screen.c.
2014-09-20 10:13:53 +08:00
Chris Forbes
c4ed6c730f i965/gen6: Enable GL 3.3 and GLSL 3.30
Tested on my snb-gt2:

4 tests skip->pass in spec/EXT_texture_array
51 tests skip->pass in spec.glsl-3.30
4 tests skip->pass in spec/!OpenGL 3.3
No regressions; no skip->fail changes.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-20 13:59:36 +12:00
Roland Scheidegger
7ede5a1a7b gallivm: add information about different sampler/view units if analyzing shader
Useful to know in some cases.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-09-20 02:19:02 +02:00
Emil Velikov
4824eecc0c docs: Add 10.3 sha256 sums, news item and link release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 18571edea8)

Conflicts:
	docs/index.html
	docs/relnotes.html
2014-09-19 20:18:43 +01:00
Emil Velikov
991242ece1 docs: Update 10.3 release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 1b12af300d)
2014-09-19 20:16:37 +01:00
Emil Velikov
878e8a89f4 docs: Add sha256 sums for the 10.2.8 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit f95fcb1716)
2014-09-19 20:16:25 +01:00
Emil Velikov
4e8d1c7899 Add release notes for the 10.2.8 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 1e2b4120f7)
2014-09-19 20:16:14 +01:00
Marek Olšák
8449121971 st/dri: remove GALLIUM_MSAA and __GL_FSAA_MODE environment variables
Some users don't understand that these variables can break OpenGL.
The general rule is that if an app supports MSAA, you mustn't use
GALLIUM_MSAA.

For example, if an app has an 8xMSAA FBO and GALLIUM_MSAA=4
is set, resolving the FBO to the back buffer will be rejected which will look
like this on all gallium drivers:

http://www.phoronix.com/scan.php?page=article&item=amd_radeonsi_msaa

The environment variables also have no effect on modern apps like TF2, but
there is still a performance hit due to wasted bandwidth and VRAM.

In a nutshell, it does more harm than good.

Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-19 20:38:06 +02:00
Eric Anholt
001247d230 vc4: Fix perspective interpolation.
Fixes the mesa reflect demo and 6 tests under interpolation/
2014-09-19 11:25:02 -07:00
Eric Anholt
dcd03e7476 vc4: Use the same method as for FRAG_Z to handle fragcoord W.
I need to get the non-reciprocal version of W for interpolation, anyway.
2014-09-19 11:09:04 -07:00
Roland Scheidegger
f2c39dd0e1 util: don't try to emit half-float intrinsics if avx isn't available
These instructions only have vex encodings, thus they can't be used without
avx. (Technically, one can still use avx-128 if avx isn't available because
the environment doesn't store the ymm registers, however I don't think llvm
can.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-09-19 16:58:28 +02:00
Samuel Iglesias Gonsalvez
74d7ff2efd i965/gen6: enable GLSL 1.50, OpenGL 3.2 and GL_AMD_vertex_shader_layered
Geometry shaders were the only thing we needed to enable GLSL 1.50 and
OpenGL 3.2 in gen6.

v2: Layered clears do not work properly in gen6 with OpenGL 3.2. Kenneth
and Jordan realized that for this to work we also need
GL_AMD_vertex_shader_layered (which requires OpenGL 3.2, so it could not be
enabled before this patch), so we agreed to enable this together with
OpenGL 3.2 in this patch.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-19 15:01:16 +02:00
Iago Toral Quiroga
d2c2ca9ee8 i965/gen6/gs: Use a specific implementation of geometry shaders for gen6.
In gen6 we will use the geometry shader implementation from gen6_gs_visitor.cpp
and keep the implementation in brw_vec4_gs_visitor.cpp for gen7+. Notice that
gen6_gs_visitor inherits from brw_vec4_gs_visitor so it is not a completely
separate implementation of geometry shaders.

Also, gen6 does not support multiple dispatch modes, its default operation mode
is equivalent to gen7's SINGLE mode, so select that in gen6 for consistency.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:16 +02:00
Iago Toral Quiroga
3a4aee34a2 i965/gen6/gs: upload ubo and pull constants surfaces.
Uniforms declared as uniform blocks are stored in ubo surfaces and need to
be pulled from the geometry shader program so make sure we upload them first
and do the same for pull constants.

This fixes all piglit tests that use uniform blocks:
bin/shader_runner tests/spec/glsl-1.50/uniform_buffer/gs-*

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:16 +02:00
Samuel Iglesias Gonsalvez
6947a8a593 i965/gen6/gs: Enable transform feedback support in geometry shaders
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:16 +02:00
Iago Toral Quiroga
c66165ab2b i965/gen6/gs: Fix binding table clash between TF surfaces and textures.
For gen6 geometry shaders we use the first BRW_MAX_SOL_BINDINGS entries of the
binding table for transform feedback surfaces. However, vec4_visitor will
setup the binding table so that textures use the same space in the binding
table. This is done when calling assign_common_binding_table_offsets(0) as
part of its run() method.

To fix this clash we add a virtual method to the vec4_visitor hierarchy to
assign the binding table offsets, so that we can change this behavior
specifically for gen6 geometry shaders by mapping textures right after the
first BRW_MAX_SOL_BINDINGS entries.
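
The offset scheme described above can be sketched in plain C as follows. This is only an illustration, not the driver's code: the function pointer stands in for the C++ virtual method, every name prefixed with sketch_ is hypothetical, and the value 64 is a placeholder rather than a confirmed value of BRW_MAX_SOL_BINDINGS.

```c
/* Placeholder value for this sketch only; the real constant is defined
 * in the i965 driver. */
#define SKETCH_MAX_SOL_BINDINGS 64u

/* Hypothetical stand-in for the visitor hierarchy: the function pointer
 * plays the role of the virtual method that assigns binding table offsets. */
struct sketch_visitor {
    unsigned (*texture_binding_start)(void);
};

/* Default behavior: textures start at the beginning of the table. */
static unsigned sketch_default_texture_start(void)
{
    return 0u;
}

/* Gen6 GS behavior: the first entries are reserved for transform
 * feedback surfaces, so textures start right after them. */
static unsigned sketch_gen6_gs_texture_start(void)
{
    return SKETCH_MAX_SOL_BINDINGS;
}

/* Binding table slot used by a given texture unit. */
static unsigned sketch_texture_slot(const struct sketch_visitor *v,
                                    unsigned unit)
{
    return v->texture_binding_start() + unit;
}
```

With the override in place, texture unit 0 of a gen6 geometry shader no longer lands on the slot used by the first transform feedback surface.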

Also, when there is no user-provided geometry shader, we only need to upload
the binding table if we have transform feedback, however, in the case of a
user-provided geometry shader, we can't only look into transform feedback
to make that decision.

This fixes multiple piglit tests for textureSize() and texelFetch() when these
functions are called from a geometry shader in gen6, like these:

bin/textureSize gs sampler2D -fbo -auto
bin/texelFetch gs usampler2D -fbo -auto

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:16 +02:00
Iago Toral Quiroga
2614cde998 i965/gen6/gs: Avoid buffering transform feedback varyings twice.
Currently we buffer transform feedback varyings separately. This patch makes
it so that we reuse the values we have already buffered for all the output
varyings of the geometry shader instead.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:16 +02:00
Samuel Iglesias Gonsalvez
2120443484 i965/gen6/gs: Buffer PSIZ/flags vertex data in gen6_gs_visitor
Since geometry shaders can alter the value of varyings packed in the first
output VUE slot (PSIZ), we need to buffer it together with all the other
vertex data so we can emit the right value for each vertex when we do the
URB writes.

This fixes the following piglit test in gen6:
tests/spec/glsl-1.50/execution/redeclare-pervertex-out-subset-gs.shader_test

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:16 +02:00
Samuel Iglesias Gonsalvez
28a7da612b i965/gen6/gs: Setup SOL surfaces for user-provided geometry shaders
Update gen6_gs_binding_table and gen6_sol_surface to use user-provided
geometry program information when present. This is necessary to implement
transform feedback support.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:16 +02:00
Samuel Iglesias Gonsalvez
fda4470944 i965/gen6/gs: implement transform feedback support in gen6_gs_visitor
This takes care of generating code required to handle transform feedback.
Notice that transform feedback isn't enabled yet, since that requires
additional setups in other parts of the code that will come in later patches.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:16 +02:00
Samuel Iglesias Gonsalvez
1f77bfce7d i965/gen6/gs: Add an additional parameter to the FF_SYNC opcode.
We will use this parameter in later patches to provide information relevant
to transform feedback that needs to be set as part of the FF_SYNC message.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:16 +02:00
Samuel Iglesias Gonsalvez
3ea410972a i965/gen6/gs: implement GS_OPCODE_FF_SYNC_SET_PRIMITIVES opcode
This opcode will be used when filling FF_SYNC header before
emitting vertices and their data.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:16 +02:00
Samuel Iglesias Gonsalvez
5933a08bd9 i965/gen6/gs: implement GS_OPCODE_SVB_SET_DST_INDEX opcode
This opcode generates code to copy the specified destination index
into subregister 5 of the MRF message header.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:16 +02:00
Samuel Iglesias Gonsalvez
e86ae1b0a3 i965/gen6/gs: implement GS_OPCODE_SVB_WRITE opcode
This opcode will be used when sending SVB WRITE messages to save
transform feedback outputs into Streamed Vertex Buffers.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:16 +02:00
Iago Toral Quiroga
66ec61c49f i965/gen6/gs: Enable texture units and upload sampler state.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:16 +02:00
Iago Toral Quiroga
6669fd0818 i965/gen6/gs: Assign geometry shader VUE map properly.
So far in gen6 we only used geometry shaders to implement transform feedback
in vertex shaders, so we assumed that the VUE map for the geometry shader
stage was always the same as for the vertex shader stage. This is no longer
true now that we support user provided geometry shaders in gen6 too.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:15 +02:00
Iago Toral Quiroga
524ad6b901 i965/gen6/gs: Implement support for gl_PrimitiveIdIn.
For this we will need to move PrimitiveID information, delivered in the thread
payload in r0.1, to a separate register (we use GS_OPCODE_SET_PRIMITIVE_ID
for this), then map the corresponding varying slot to that register in the
setup_payload() method.

Notice that we cannot use a virtual register as the destination for the
PrimitiveID because we need to map all input attributes to hardware registers
in setup_payload(), which happens before virtual registers are mapped to
hardware registers. We could work around that issue if we were able to compute
the first non-payload register in emit_prolog() and move the PrimitiveID
information to that register, but we can't because at that point we still
don't know the final number of uniforms that will be included in the payload.

So, what we do is to place PrimitiveID information in r1, which is always
delivered as part of the payload but is only populated with data
relevant for transform feedback when we set GEN6_GS_SVBI_PAYLOAD_ENABLE
in the 3DSTATE_GS state packet.

When we implement transform feedback, we will make sure to move the value of r1
to another register before we overwrite it with the PrimitiveID.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:15 +02:00
Iago Toral Quiroga
024b7c0f33 i965/gen6/gs: Implement GS_OPCODE_SET_PRIMITIVE_ID.
In gen6 the geometry shader payload includes the PrimitiveID information in
r0.1. When the shader code uses gl_PrimitiveIdIn we will have to move this to
a separate hardware register where we can map this attribute. This opcode
takes the selected destination register and moves r0.1 there.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:15 +02:00
Iago Toral Quiroga
c091804f4c i965/gen6/gs: Handle the case where a geometry shader emits no output.
In gen6 we need to end the thread differently depending on whether we have
emitted at least one vertex or not. In case we did, the EOT message must
always include the COMPLETE flag or else the GPU hangs. If we have not
produced any output, however, we can't use the COMPLETE flag.

This would lead us to end the program with an ENDIF opcode, which we want
to avoid (and actually is not permitted since it hits an assertion), so
instead what we do is that we always request a new VUE handle every time we do
an URB WRITE, even for the last vertex we emit. With this we make sure that
whether we have emitted at least one vertex or none at all we have to finish the
thread without writing to the URB, which works for both cases by setting the
COMPLETE and UNUSED flags in the EOT message.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:15 +02:00
Iago Toral Quiroga
c1b8a5155b i965/gen6/gs: Make sure we complete the last primitive.
Just in case the GS algorithm does not call EndPrimitive() for the last
primitive produced. This is relevant only for non-point outputs, since for
points we are already setting the PrimEnd flag on each vertex we emit.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:15 +02:00
Iago Toral Quiroga
d93ca68666 i965/gen6/gs: Implement geometry shaders for outputs other than points.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:15 +02:00
Iago Toral Quiroga
8411bf2c69 i965/gen6/gs: Add initial implementation for a gen6 geometry shader visitor.
Geometry shaders in gen6 are significantly different from gen7+ so it is better
to have them implemented in a different file rather than adding gen6 branching
paths all over brw_vec4_gs_visitor.cpp.

This commit adds an initial implementation that only handles point output, which
is the simplest case.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:15 +02:00
Iago Toral Quiroga
5c30da1845 i965: Generalize emit_urb_slot() to emit to any dst_reg.
In gen7+ we emit vertices as they come, however in gen6 geometry shaders we
have to buffer vertex data for all vertices and then emit it all in one go
at the end. To achieve this we need to generalize emit_urb_slot() to store
vertex data in general purpose registers and not only MRF registers.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:15 +02:00
Iago Toral Quiroga
9b32fd0f70 i965: Provide means to create registers of a given size.
Implemented by Ilia Mirkin <imirkin@alum.mit.edu>.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:15 +02:00
Iago Toral Quiroga
f373b7ed82 i965/gen6/gs: Implement GS_OPCODE_SET_DWORD_2.
We had GS_OPCODE_SET_DWORD_2_IMMED but this required its source argument to be
an immediate. In gen6 we need to set dword 2 of the URB write message header
from values stored in separate register, so we need something more flexible.
This change replaces GS_OPCODE_SET_DWORD_2_IMMED with GS_OPCODE_SET_DWORD_2.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:15 +02:00
Iago Toral Quiroga
7ccd47d644 i965/gen6/gs: Upload binding table for user-provided geometry shaders.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:15 +02:00
Iago Toral Quiroga
5ac8294f9b i965/gen6/gs: Enable URB space for user-provided geometry shaders.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:15 +02:00
Iago Toral Quiroga
c09ddf82ff i965/gen6/gs: Compute URB entry size for user-provided geometry shaders.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:15 +02:00
Iago Toral Quiroga
621685ad4c i965/gen6/gs: Add instruction URB flags to geometry shaders EOT message.
Gen6 seems to require that EOT messages include the complete flag too or else
the GPU hangs. We will add this flag to the instruction when we emit the
thread end opcode.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:15 +02:00
Iago Toral Quiroga
2c85132e51 i965/gen6/gs: Implement GS_OPCODE_URB_WRITE_ALLOCATE.
Gen6 geometry shaders need to allocate URB handles for each new vertex they
emit after the first (the URB handle for the first vertex is obtained via the
FF_SYNC message).

This opcode adds the URB allocation mechanism to regular URB writes.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:15 +02:00
Iago Toral Quiroga
d0bdd4ce98 i965/gen6/gs: Implement GS_OPCODE_FF_SYNC.
This implements the FF_SYNC message required in gen6 geometry shaders to
get the initial URB handle.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-19 15:01:15 +02:00
Samuel Iglesias Gonsalvez
406e04113f i965/gs: Reuse gen6 constant push buffers setup code in gen7+.
The code required for gen6 and gen7+ is almost the same, so reuse it.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-19 15:01:15 +02:00
Iago Toral Quiroga
96012dfe80 i965/gen6/gs: Setup constant push buffers for gen6 geometry shaders.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-19 15:01:15 +02:00
Samuel Iglesias Gonsalvez
cf06136b63 i965/gen6/gs: Set brw->gs.enabled to FALSE in gen6_blorp_emit_gs_disable()
See 7dfb4b2d00 for more details.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-19 15:01:14 +02:00
Samuel Iglesias Gonsalvez
bc383cb55b i965/gen6/gs: use brw_gs_prog atom instead of brw_ff_gs_prog
This is needed to support user-provided geometry shaders, since the
brw_ff_gs_prog atom in gen6 only takes care of implementing transform feedback
for vertex shaders.

If there is no user-provided geometry shader the implementation falls back to
the original code.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-19 15:01:14 +02:00
Samuel Iglesias Gonsalvez
dd376bdb25 i965/gen6/gs: Skeleton for user GS program support
Currently, gen6 only uses geometry shaders for transform feedback so the state
we emit is not suitable to accommodate general purpose, user-provided geometry
shaders. This patch paves the way to add this support and the needed
3DSTATE_GS packet modifications for it.

Previous code that emitted state to implement transform feedback in gen6 goes
to upload_gs_state_adhoc_tf().

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-19 15:01:14 +02:00
Iago Toral Quiroga
03164f6285 i965/gs: Use single dispatch mode as fallback to dual object mode when possible.
Currently, when a geometry shader can't use dual object mode we fall back to
dual instance mode, however, when invocations == 1, single dispatch mode is
more performant and equally efficient in terms of register pressure.
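
The fallback order described above can be sketched as a small selection function. The enum and function names here are hypothetical and chosen for illustration; they are not the identifiers used in the i965 driver.

```c
#include <stdbool.h>

/* Hypothetical names for the three dispatch modes mentioned above. */
enum sketch_gs_dispatch_mode {
    SKETCH_GS_DUAL_OBJECT,
    SKETCH_GS_DUAL_INSTANCE,
    SKETCH_GS_SINGLE
};

/* Prefer dual object mode; if it cannot be used, fall back to single
 * dispatch mode when the shader has exactly one invocation, and to dual
 * instance mode otherwise. */
static enum sketch_gs_dispatch_mode
sketch_choose_gs_dispatch_mode(bool dual_object_ok, unsigned invocations)
{
    if (dual_object_ok)
        return SKETCH_GS_DUAL_OBJECT;
    if (invocations == 1)
        return SKETCH_GS_SINGLE;
    return SKETCH_GS_DUAL_INSTANCE;
}
```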

Single dispatch mode requires that the driver can handle interleaving of
input registers, but this is already supported (dual instance mode has
the same requirement). However, to take full advantage of single dispatch mode
to reduce register pressure we would also need the ability to store two
separate vec4 output values into vec8 registers, which would approximately
double our capacity to store temporary values, but currently the vec4 visitor
and generator classes do not support this, so at the moment register pressure
in single and dual instance modes is the same.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-19 15:01:14 +02:00
Chia-I Wu
45cbc9267e ilo: rename ILO_DEBUG=3d
It has been a bad name since we added the builder.  Rename it to
ILO_DEBUG=batch to match i965, and call ilo_builder_decode() from
ilo_cp_submit_internal().
2014-09-19 16:02:11 +08:00
Chia-I Wu
8a2352262e ilo: rename ilo_cp_flush()
"Flush" is used for too many things already: pipe resource flush, pipe context
flush, pipe transfer region flush, and hardware pipeline flush.  Rename it to
ilo_cp_submit().  As such, ILO_DEBUG=flush is renamed to ILO_DEBUG=submit.
2014-09-19 16:02:11 +08:00
Chia-I Wu
1887d15eed ilo: remove ilo_cp_empty()
Call ilo_builder_batch_used() directly.
2014-09-19 16:02:11 +08:00
Chia-I Wu
270667472f ilo: simplify ilo_cp_set_owner()
The simplification allows us to get rid of ilo_cp_set_ring() and
ilo_cp_implicit_flush().  The 3D query code is refactored for the
simplification.
2014-09-19 16:02:11 +08:00
Kenneth Graunke
26ee6f23a9 mesa: Delete VAO _MaxElement code and index buffer bounds checking.
Fredrik's implementation of ARB_vertex_attrib_binding introduced new
gl_vertex_attrib_array and gl_vertex_buffer_binding structures, and
converted Mesa's older gl_client_array to be derived state.  Ultimately,
we'd like to drop gl_client_array and use those structures directly.

One hitch is that gl_client_array::_MaxElement doesn't correspond to
either structure (unlike every other field), so we'd have to figure out
where to store it.  The _MaxElement computation uses values from both
structures, so it doesn't really belong in either place.  We could put
it in the VAO, but we'd have to pass it around everywhere.

It turns out that it's only used when ctx->Const.CheckArrayBounds is
set, which is only set by the (rarely used) classic swrast driver.
It appears that drivers/x11 used to set it as well, which was intended
to avoid segmentation faults on out-of-bounds memory access in the X
server (probably for indirect GLX clients).  However, ajax deleted that
code in 2010 (commit 1ccef926be).

The bounds checking apparently doesn't actually work, either.  Non-VBO
attributes arbitrarily set _MaxElement to 2 * 1000 * 1000 * 1000.
vbo_save_draw and vbo_exec_draw remark /* ??? */ when setting it, and
the i965 code contains a comment noting that _MaxElement is often bogus.

Given that the code is complex, rarely used, and dubiously functional,
it doesn't seem worth maintaining going forward.  This patch drops it.

This will probably mean the classic swrast driver may begin crashing on
out of bounds vertex buffer access in some cases, but I believe that is
allowed by OpenGL (and probably happened for non-VBO accesses anyway).
There do not appear to be any Piglit regressions, either.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2014-09-19 00:43:01 -07:00
Eric Anholt
19589147ef vc4: Add support for stencil operations.
While depth test state is passed through the fragment shader as sideband
data, the stencil test state has to be set by the fragment shader itself.

Many tests are still failing, but this gets most of hiz/ passing.
2014-09-18 17:46:43 -07:00
Eric Anholt
6e39854e23 vc4: Actually implement VC4_DEBUG=cl. 2014-09-18 11:46:50 -07:00
Roland Scheidegger
019ca99bee draw: (trivial) remove duplicated lines 2014-09-18 16:13:24 +02:00
Brian Paul
7b2c703244 mesa: fix prog_optimize.c assertions triggered by SWZ opcode
The SWZ instruction can have swizzle terms >= 4 (SWIZZLE_ZERO, SWIZZLE_ONE).
These swizzle terms caused a few assertions to fail.
This started happening after the commit "mesa: Actually use the Mesa IR
optimizer for ARB programs." when replaying some apitrace files.

A new piglit test (tests/asmparsertest/shaders/ARBfp1.0/swz-08.txt)
exercises this.

Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-09-18 07:28:36 -06:00
Eric Anholt
71d4fc88d6 vc4: Allow copy propagation of uniforms.
Fixes 12 piglit tests (and 8 more crash -> fail) from reducing register
pressure.
2014-09-17 14:21:24 -07:00
Eric Anholt
79be2cc383 vc4: Make sure thread end doesn't have a uniform read.
Prevents regression when I start doing copy propagation on uniforms.
2014-09-17 14:21:24 -07:00
Eric Anholt
44b8eb743d vc4: Allow dead code elimination of instructions that read uniforms. 2014-09-17 14:21:24 -07:00
Eric Anholt
5e90ed79f6 vc4: Add support for reordering the uniform stream after optimization.
This allows for introducing dead code elimination of uniforms, copy
propagation of uniforms, and instruction rescheduling between instructions
that both read uniforms.
2014-09-17 14:21:24 -07:00
Eric Anholt
b0256fb75f vc4: Initialize the various qreg arrays when allocating them.
This is particularly important for outputs, where we try to MOV the whole
vec4 to the VPM, even if only 1-3 components had been set up.  It might
also be important for temporaries, if the shader reads components before
writing them.
2014-09-17 14:21:24 -07:00
Eric Anholt
b44a7a3223 vc4: Fix stray disable of the CSE pass.
Somehow I slipped this in with the original commit of CSE.
2014-09-17 14:21:24 -07:00
rconde
ffeb77c7b0 gallivm,tgsi: fix idiv by zero crash
While the result of signed integer division by zero is undefined by glsl
(and doesn't exist with d3d10), we must not crash, so need to make sure we
don't get sigfpe much like udiv already does.
Unlike udiv where we return 0xffffffff (as required by d3d10) there is
no requirement right now to return anything specific so we use zero.
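
The intended semantics can be sketched in scalar C as below. This is only a model of the behavior described in the message, not the gallivm code (which emits LLVM IR); the helper names are hypothetical, and the INT32_MIN / -1 guard is an extra precaution needed in C that the commit message does not mention.

```c
#include <stdint.h>

/* Signed division that returns 0 for a zero divisor instead of trapping
 * (hypothetical helper, for illustration only). */
static int32_t sketch_safe_idiv(int32_t a, int32_t b)
{
    if (b == 0)
        return 0;
    /* Assumption beyond the commit message: INT32_MIN / -1 overflows in C
     * and can also raise SIGFPE on x86, so it is handled separately. */
    if (a == INT32_MIN && b == -1)
        return INT32_MIN;
    return a / b;
}

/* Unsigned division that returns 0xffffffff for a zero divisor, the
 * value the message says d3d10 requires. */
static uint32_t sketch_safe_udiv(uint32_t a, uint32_t b)
{
    return b == 0 ? 0xffffffffu : a / b;
}
```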
2014-09-17 18:31:54 +02:00
Roland Scheidegger
4d996877ca gallivm: add texture target information for sample opcodes to tgsi info
sample opcodes don't have valid texture target information (and I don't think
this should be changed), however it would be nice if we had that information
ready elsewhere, so stuff that information into the tgsi info when analyzing
a shader.

v2: Ilia Mirkin spotted some bugs wrt not handling msaa resources. So add them
and while there also add them to the tex opcode analysis this was cloned from
as well (plus get rid of some bug not detecting indirect textures there in some
cases too).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-09-17 18:31:54 +02:00
Richard Sandiford
2e49559c77 st/mesa: Fix handling of 8888 SNORM and SRGB formats for big-endian
MESA_FORMAT_x8y8z8w8 puts the x channel in the least significant part of
the containing 32-bit integer, which is equivalent to PIPE_FORMAT_xyzw8888.
PIPE_FORMAT_x8y8z8w8 puts the x channel first in memory.

This patch fixes up the mesa<->gallium mapping accordingly.
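
The distinction between the two conventions can be illustrated with a short C sketch. It assumes nothing about Mesa's or Gallium's actual tables; the helper names are hypothetical, and the code only shows why "x in the least significant byte" and "x first in memory" coincide on little-endian hosts but not on big-endian ones.

```c
#include <stdint.h>
#include <string.h>

/* Pack four 8-bit channels so that x occupies the least significant byte
 * of the 32-bit word (the packed-integer convention). */
static uint32_t sketch_pack_x_in_lsb(uint8_t x, uint8_t y, uint8_t z, uint8_t w)
{
    return (uint32_t)x | ((uint32_t)y << 8) |
           ((uint32_t)z << 16) | ((uint32_t)w << 24);
}

/* Returns 1 if storing the packed word places x first in memory. */
static int sketch_packed_matches_memory_order(void)
{
    uint32_t word = sketch_pack_x_in_lsb(0x11, 0x22, 0x33, 0x44);
    uint8_t bytes[4];

    memcpy(bytes, &word, sizeof bytes);
    return bytes[0] == 0x11;
}

/* Returns 1 on a little-endian host, 0 otherwise. */
static int sketch_host_is_little_endian(void)
{
    uint16_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a big-endian host sketch_packed_matches_memory_order() returns 0, which is the case where a packed format and a memory-order format of the same name stop being interchangeable.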

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-17 13:20:08 +10:00
Richard Sandiford
ccdbcd9586 st/mesa: Fix handling of LA and RG formats for big-endian
MESA_FORMAT_LnAn puts the luminance in the least significant part of
the containing integer, which is equivalent to PIPE_FORMAT_LAnn.
PIPE_FORMAT_LnAn puts the luminance first in memory.

This patch fixes up the mesa<->gallium mapping accordingly.

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-17 13:20:08 +10:00
Richard Sandiford
be6ef203aa mesa: Add MESA_FORMAT_{A8R8G8B8, X8R8G8B8, X8B8G8R8}_SRGB (v2)
This means that each 8888 SRGB format has a reversed counterpart,
which is necessary for handling big-endian mesa<->gallium mappings.

v2: fix missing i965 additions. (Jason)
fix 127->255 max alpha for SRGB formats. (Jason)

v1: Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-17 13:19:45 +10:00
Richard Sandiford
df14091c58 mesa: Add MESA_FORMAT_A8L8_{SNORM,SRGB}
The associated UNORM format already existed.

This means that each LnAn format has a reversed counterpart,
which is necessary for handling big-endian mesa<->gallium mappings.

[airlied: rebased onto current master]

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-17 13:17:47 +10:00
Richard Sandiford
234d194b49 gallium: Define PIPE_FORMAT_xyzw8888_{SNORM, SRGB} aliases
...i.e. formats in which the first listed component is in the least
significant byte of the integer.  The corresponding UNORM aliases already exist.

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-17 13:17:46 +10:00
Richard Sandiford
f9d8574b5e gallium: Add PIPE_FORMAT_x8B8G8R8_SNORM formats
This means that each RnGnBnxn format has a reversed counterpart,
which is necessary for handling big-endian mesa<->gallium mappings.
The associated UNORM and SRGB formats already exist.

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-17 13:17:46 +10:00
Richard Sandiford
9b4c13995c gallium: Define PIPE_FORMAT_{LA, AL, RG, GR}nn aliases
...i.e. formats in which the first listed component is in the least
significant half of the integer.

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-17 13:17:46 +10:00
Richard Sandiford
f14b40ab32 gallium: Add PIPE_FORMAT_AnLn and PIPE_FORMAT_GnRn formats
...i.e. formats in which the alpha or green channel is first in memory.

This means that each LnAn and RnGn format has a reversed counterpart,
which is necessary for handling big-endian mesa<->gallium mappings.

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-17 13:17:46 +10:00
Dave Airlie
9ea045e85e mesa: fix SRGB alpha channel value in pack_float_R8G8B8X8_SRGB
Jason pointed out the bug on review adding new formats,
but the existing format also appears to have the bug, so
use 255 as the max, these are SRGB not SNORM.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-17 13:28:18 +10:00
Richard Sandiford
ecc48f83c8 swrast: Fix handling of MESA_FORMAT_L8A8_SRGB for big-endian
Luminance is the least-significant byte of the uint16, rather than the
lowest byte in memory.  Other parts of mesa already handle this correctly
for big-endian, and swrast already handles other MESA_FORMAT_x8y8 formats
correctly.  This case was just an odd-one-out.

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-17 12:07:13 +10:00
Richard Sandiford
3e3b1db5f7 mesa: Tweak unpack name for MESA_FORMAT_R8G8B8X8_SNORM
MESA_FORMAT_R8G8B8X8_SNORM used a function called unpack_X8B8G8R8_SNORM
while MESA_FORMAT_R8G8B8X8_SRGB used a function called unpack_R8G8B8X8_SRGB.
This patch renames the SNORM function to have the same order as the
MESA_FORMAT name, like the SRGB function does.

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-17 12:05:08 +10:00
Richard Sandiford
3ff5c6a6c4 mesa: Fix alpha component in unpack_R8G8B8X8_SRGB.
The function was using the "X" component as the alpha channel,
rather than setting alpha to 1.0.

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-17 11:56:23 +10:00
Dave Airlie
ebcb2ee989 util: move shared rgtc code to util (v2)
This was being shared using a ../../ get out of gallium into
mesa, and I swore when I did it I'd fix things when we got a util
dir, we did, so I have.

v2: move RGTC_DEBUG define

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-17 11:27:25 +10:00
Eric Anholt
2b6711cc5f vc4: Claim ARB_fbo.
This gets a ton of piglit working that crashes in waffle context
management stuff otherwise.  Actually supporting mismatched FB sizes is at
best going to require some more load/store generals for color buffers, but
if I can't manage to do that I'll want to just have state_tracker reject
those FBOs as unsupported, rather than deny GL 2.1.
2014-09-16 15:14:52 -07:00
Eric Anholt
3c6d85e725 vc4: Fix memory leaks in register allocation. 2014-09-16 15:14:52 -07:00
Eric Anholt
ad02ba42f0 vc4: Move register allocation to a separate file.
I'm going to be rewriting it all, and having it mixed up with the
QIR-to-QPU opcode translation was messy.
2014-09-16 15:14:52 -07:00
Chris Forbes
b84c02f9cd glsl: fix error message for redeclaring gl_PerVertex as output
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-17 08:33:16 +12:00
Chris Forbes
667f758788 i965/vec4: slightly improve insn dumping with no srcs
Previously, we would get a trailing ', ' which looked strange.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-17 08:32:46 +12:00
Eric Anholt
2264925f85 vc4: Add support for computed depth writes.
Fixes piglit glsl-1.10-fragdepth and early-z.
2014-09-16 13:03:41 -07:00
Eric Anholt
aae4223fbd vc4: Restructure depth input/output in fragment shaders.
The goal here is to have an argument for the depth write opcode so that I
can do computed depth.  In the process, this makes the calculations that
will be emitted more obvious in the QIR.
2014-09-16 13:03:32 -07:00
Ilia Mirkin
a420aa1b41 freedreno: add a standalone ir3_compiler binary for building TGSI
Compiler taken from the combo old/new compiler comparer + simulator.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-16 12:13:22 -04:00
Ilia Mirkin
5b1d316c51 freedreno: add default .dir-locals.el for emacs settings
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-16 12:13:22 -04:00
Gwenole Beauchesne
e1c50abf8a i965: add support for RGBA dma_buf imports.
This allows for importing foreign buffers in RGB32 native endian
byte order, i.e. DRM_FORMAT_XBGR8888, and DRM_FORMAT_ABGR8888.

Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
2014-09-16 01:11:06 -07:00
Kenneth Graunke
78bd126194 i965: Mark delta_x/y as BAD_FILE if remapped away completely.
Commit afe3d1556f (i965: Stop doing
remapping of "special" regs.) stopped remapping delta_x/delta_y, and
additionally stopped considering them always-live.  We later realized
delta_x was used in register allocation, so we actually needed to remap
it, which was fixed in commit 23d782067a
(i965/fs: Keep track of the register that hold delta_x/delta_y.).

However, that commit didn't restore the "always consider it live" part.
If all the code using delta_x was eliminated, fs_visitor::delta_x would
be left pointing at its old register number.  Later code in register
allocation would handle that register number specially...even though it
wasn't actually delta_x.

To combat this, set delta_x/y to BAD_FILE if they're eliminated, and
check for that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83127
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
2014-09-16 00:46:46 -07:00
Dave Airlie
7f6872d012 st_glsl_to_tgsi: init have_sqrt field.
Coverity reported this.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-16 15:13:05 +10:00
Dave Airlie
8de5522d93 llvmpipe: fix rast debugging output
The triangle_32_ rast functions never made it into the debug output,
confused me for a few seconds.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-16 15:12:54 +10:00
Richard Sandiford
f93b6d8cc5 util: Add big-endian layout for a number of formats.
This patch builds on 6c8f547f66 and
previous patches by allowing u_format.csv to specify separate big-endian
and little-endian layouts.  It then uses this to specify the correct layouts
for various depth/stencil formats.  Later patches handle other formats.

To recap, the idea is that u_format.csv lists the channels for an N-byte
value as though it were an N-byte integer.  For little-endian targets
the channels are listed starting at the least-significant bit of the
integer while for big-endian targets the channels are listed starting
at the most-significant bit.  This means that for something like
PIPE_FORMAT_B8G8R8A8_UNORM (blue in first byte of memory, alpha in last
byte of memory) the orders are the same for both endiannesses.  But for
something like PIPE_FORMAT_S8_UINT_Z24_UNORM, where the stencil is in
the least significant byte of a 32-bit integer, there need to be separate
channel definitions for each endianness.

The effect of this patch is to make the affected PIPE_FORMAT_*s have
the same layout as the associated MESA_FORMAT_*s for big-endian.
The MESA_FORMAT_*s are already handled correctly.

Fixes various piglit tests on z.  No regressions on x86_64.

[airlied: squash subsequent patches]
util: Add big-endian layout for 5551 and 565 formats
util: Add big-endian layout for 10/10/10/2 formats
util: Add big-endian layout for 4444 formats
util: Add big-endian layout for 233 format
util: Add big-endian layout for 44 formats

Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-16 14:02:56 +10:00
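The listing convention recapped in the u_format commit above can be sketched as a small C helper. This is a hypothetical illustration, not code from u_format.csv's parser: it only turns a list of channel sizes into bit shifts under the stated rule.

```c
#include <stddef.h>

/* Hypothetical helper: compute the bit shift of each channel within an
 * N-bit integer from the channel sizes in listed order.  Little-endian
 * lists start at the least-significant bit, big-endian lists start at
 * the most-significant bit. */
static void channel_shifts(const unsigned *sizes, size_t count,
                           int big_endian, unsigned *shifts)
{
   unsigned total = 0, used = 0;
   size_t i;

   for (i = 0; i < count; i++)
      total += sizes[i];

   for (i = 0; i < count; i++) {
      shifts[i] = big_endian ? total - used - sizes[i] : used;
      used += sizes[i];
   }
}
```

With this rule, an 8/8/8/8 list such as B8G8R8A8 puts blue at shift 0 on little-endian and at shift 24 on big-endian, which is the first byte in memory in both cases. For S8_UINT_Z24_UNORM the stencil must sit at shift 0 on both, so the little-endian list is (S8, Z24) and the big-endian list is (Z24, S8).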
Richard Sandiford
9cd4dced06 llvmpipe: Fix PIPE_FORMAT_Z32_FLOAT_S8X24_UINT handling for big-endian.
llvmpipe treats PIPE_FORMAT_Z32_FLOAT_S8X24_UINT as a bit of a special case,
handling it as two 32-bit pieces rather than a single 64-bit block:

   /* 64bit d/s format is special already extracted 32 bits */
   total_bits = format_desc->block.bits > 32 ? 32 : format_desc->block.bits;

The format_desc describes the whole 64-bit block, so the z shift
will be 32 for big-endian.  But since we're accessing the z channel
as a 32-bit value rather than a 64-bit value, we need to mask the shift
with 31.

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-16 14:02:55 +10:00
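The masking step described in the llvmpipe commit above amounts to the following. The helper is hypothetical and only shows the arithmetic; it is not the code that was changed.

```c
/* Hypothetical helper: reduce a shift taken from the 64-bit block
 * description to a shift usable on one 32-bit piece. */
static inline unsigned shift_within_32bit_piece(unsigned block_shift)
{
   return block_shift & 31;
}
```

A z shift of 32, as the commit describes for big-endian, becomes 0 once the channel is accessed as a 32-bit value; shifts below 32 are unchanged.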
Richard Sandiford
1a65629ccc gallivm: Fix uses of 2^24
Fallback cases in lp_bld_arit.c used 2^24 to mean "2 to the power 24",
but in C it's "2 xor 24", i.e. 26.  Fixed by using 1<< instead.

Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-16 14:02:55 +10:00
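The pitfall fixed in the gallivm commit above is easy to reproduce in isolation. A minimal C sketch follows; the function names are illustrative and are not taken from lp_bld_arit.c, and the operands are held in variables only to keep compilers from warning about the literal form.

```c
#include <stdint.h>

/* In C, '^' is bitwise XOR, so "2 ^ 24" is 26, not 2 to the power 24. */
static inline uint32_t two_xor_24(void)
{
   uint32_t base = 2, exponent = 24;
   return base ^ exponent;
}

/* The intended value, 2 to the power 24, written as a shift. */
static inline uint32_t two_to_the_24(void)
{
   return (uint32_t)1 << 24;
}
```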
Richard Sandiford
0a7f9fe42b gallivm: Add SNORM clamping to lp_build_{add, sub}
...fixing the associated TODO.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2014-09-16 14:02:54 +10:00
Rafael Ávila de Espíndola
f6e71ff9eb gallivm: attach DataLayout to module too, not just pass manager.
It looks like it was possible to attach it to both for a long time, however
since llvm r217548 attaching it to just the pass manager is no longer
sufficient and causes bugs (see http://llvm.org/bugs/show_bug.cgi?id=20903).

Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-09-16 03:50:32 +02:00
Roland Scheidegger
145fef9636 gallivm: handle SAMPLE opcode in aos sampling
This is just a very limited version, in particular sampler and sampler view
index must be the same. It cannot handle any modifiers either.
Works much the same as soa version otherwise, to figure out the target we
need to store the sampler view dcls.
While here, also handle (no-op) RET and get rid of a couple bogus deprecated
comments.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-09-16 03:50:31 +02:00
Roland Scheidegger
02595c55b0 tgsi: accept offsets for sample opcodes too in the text parser
sample opcodes are a little oddly represented in the opcode_info, since
they don't count as texture instructions - they don't have valid target
information, but they may have offsets (unlike "ordinary" texture
instructions, the texture token may be optional for them).
So just make sure with these opcodes the optional offsets are accepted.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-09-16 03:50:31 +02:00
Roland Scheidegger
3a9eb40ee1 tgsi: don't print texture target for sample opcodes
sample opcodes don't encode a texture target, it would thus always
print UNKNOWN, which is not helpful (and wouldn't parse when giving
back the shader text to tgsi).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-09-16 03:50:31 +02:00
Eric Anholt
84ca6bc113 vc4: Bump maximum ARB program temporaries to match Intel/AMD.
This query has always been useless, but we could potentially reject
well-formed, runnable programs if we expose a value that's too low.
2014-09-15 15:05:06 -07:00
Eric Anholt
33443f506f vc4: Bump maximum uniforms count to match other drivers.
We don't have any specific limits in the hardware, just like the other
GPUs, so match their behavior.  Fixes minmax_gles2 and several other
piglit tests relying on the specced uniform minmax values.
2014-09-15 15:04:38 -07:00
Eric Anholt
5638b87d4c vc4: Dynamically allocate the TGSI-to-qreg arrays.
Fixes buffer overflows in some piglit tests (which are still failing to
register allocate anyway).
2014-09-15 13:12:27 -07:00
Eric Anholt
2147dd9681 vc4: Fix memory leaks of struct qinst. 2014-09-15 13:12:27 -07:00
Eric Anholt
f78ee1b280 vc4: Fix memory leaks of some vc4_compile contents. 2014-09-15 13:12:27 -07:00
Eric Anholt
50292d76c5 vc4: Reuse the util header instead of defining our own ARRAY_SIZE.
Fixes redefinition warnings if you end up including this header before
util stuff.
2014-09-15 13:12:27 -07:00
Brian Paul
418da97905 mesa: move i, j var decls into SWIZZLE_CONVERT_LOOP() macro
Put macro code in do {} while loop and put semicolons on macro calls
so auto indentation works properly.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-09-15 09:52:44 -06:00
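The macro style adopted in the SWIZZLE_CONVERT_LOOP() commit above is the usual do { } while (0) idiom. A minimal sketch with an illustrative macro (SUM_LOOP and sum_or_zero are made up for this example, not Mesa code):

```c
/* Illustrative macro: the loop counter is declared inside the macro body,
 * and the do { } while (0) wrapper makes the expansion a single statement
 * that is terminated by the caller's ';'. */
#define SUM_LOOP(dst, src, count)            \
   do {                                      \
      int i;                                 \
      (dst) = 0;                             \
      for (i = 0; i < (count); i++)          \
         (dst) += (src)[i];                  \
   } while (0)

static int sum_or_zero(const int *values, int count, int enabled)
{
   int total = 0;

   /* Safe in an unbraced if/else because the macro is one statement. */
   if (enabled)
      SUM_LOOP(total, values, count);
   else
      total = 0;

   return total;
}
```

Because every call site ends in a semicolon, editors indent the following line as they would after any ordinary statement.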
Brian Paul
cfeb394224 mesa: break up _mesa_swizzle_and_convert() to reduce compile time
This reduces gcc -O3 compile time to 1/4 of what it was on my system.
Reduces MSVC release build time too.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-09-15 09:52:44 -06:00
Kalyan Kondapally
dbc2d81d2b Generate a warning when not writing gl_Position with GLES.
With GLES we don't give any kind of warning in case we don't
write to gl_Position. This patch makes changes so that we
generate a warning in case of GLES (VER < 300) and an error
in case of GL.

Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-15 08:14:33 +03:00
Tapani Pälli
9bd139e451 mesa: check that uniform exists in glUniform* functions
Remap table for uniforms may contain empty entries when using explicit
uniform locations. If no active/inactive variable exists with given
location, remap table contains NULL.

v2: move remap table bounds check before existence check (Ian Romanick)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Erik Faye-Lund <kusmabite@gmail.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83574
2014-09-15 07:33:12 +03:00
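The order of checks from the glUniform* commit above (bounds first, then existence) might look like this. The structures and names are hypothetical stand-ins, not the real Mesa types.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real Mesa structures. */
struct uniform_storage { int dummy; };

struct program {
   struct uniform_storage **remap_table; /* entries may be NULL */
   unsigned num_remap_entries;
};

/* Returns the uniform at 'location', or NULL if the location is out of
 * bounds or no variable exists there.  Bounds are checked first so the
 * table is never indexed with an invalid location. */
static struct uniform_storage *
lookup_uniform(const struct program *prog, int location)
{
   if (location < 0 || (unsigned)location >= prog->num_remap_entries)
      return NULL;

   return prog->remap_table[location];
}
```

A caller would treat a NULL result as "no such uniform" for both an out-of-range location and an empty remap entry.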
Chia-I Wu
ce50a61d36 ilo: clean up 3D/media functions
Mostly style changes to set dw[0] directly.
2014-09-15 10:25:35 +08:00
Chia-I Wu
c39377d3fc ilo: fix gen6_3DSTATE_MULTISAMPLE()
There was a typo introduced by 90f4b131fc.
2014-09-15 09:00:54 +08:00
Rob Clark
ca29c4c3b0 freedreno/a3xx: 3d/array textures
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-13 15:31:58 -04:00
Rob Clark
eea1cdf687 freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-13 15:31:58 -04:00
Chia-I Wu
a32f48361a ilo: trust vertex element count more
We might run into ve->count == 0 and last_velement_edgeflag == true in
gen6_3DSTATE_VERTEX_ELEMENTS() when the state tracker sets an invalid
combination of VS and VE (does not seem to happen with st/mesa).  Do not
assume ve->count is positive when last_velement_edgeflag is true.

Reported by Coverity.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-14 00:30:33 +08:00
Chia-I Wu
8fcf1b1f90 ilo: simplify src operand gathering in disassembler
Always initialize the operand array to point to src0, src1, and src2.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-14 00:30:33 +08:00
Chia-I Wu
5341001b94 ilo: derive 3-src instructions from the opcode table
One less switch statement to maintain.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-09-14 00:30:33 +08:00
Ilia Mirkin
1d7b0d832c nouveau: check for mesa context init failure
Reported by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-13 11:29:23 -04:00
Ilia Mirkin
2e86432cc1 nouveau: avoid leaking screen on initialization fail
Reported by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-13 11:17:26 -04:00
Ilia Mirkin
b13a4ca3f7 nouveau: change internal variables to avoid conflicts with macro args
Reported by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-13 10:55:16 -04:00
Chia-I Wu
9133784a46 ilo: clean up 3DPRIMITIVE functions
Add ILO_PRIM_RECTANGLES to replace the rectlist bool.
2014-09-13 09:33:20 +08:00
Chia-I Wu
eca98153e9 ilo: clean up 3D/media common functions
Rename ilo_builder_batch_state_base_address() to gen6_state_base_address() for
consistency and remove unused gen6_STATE_BASE_ADDRESS().  Reorder the code in
gen6_PIPE_CONTROL() a bit.  Finally, some mostly cosmetic changes.
2014-09-13 09:31:08 +08:00
Chia-I Wu
ea8e7a8d4a ilo: move 3D functions to ilo_builder_3d*.h
Move functions for the 3D pipeline to the new headers.  We artificially split
the functions into top (vertex processing) and bottom (pixel processing), to
keep the headers at reasonable sizes.
2014-09-13 09:31:08 +08:00
Chia-I Wu
aec8521166 ilo: move media functions to ilo_builder_media.h
Move functions for the media pipeline to the new header.
2014-09-13 08:32:25 +08:00
Chia-I Wu
45023db7a9 ilo: move GPE common functions to ilo_builder_render.h
Move 3D/media common functions to the new header.
2014-09-13 08:30:32 +08:00
Kenneth Graunke
84a40ce86b glsl: Speed up constant folding for swizzles.
ir_rvalue::constant_expression_value() recursively walks down an IR
tree, attempting to reduce it to a single constant value.  This is
useful when you want to know whether a variable has a constant
expression value at all, and if so, what it is.

The constant folding optimization pass attempts to replace rvalues with
their constant expression value from the bottom up.  That way, we can
optimize subexpressions, and ideally stop as soon as we find a
non-constant subexpression.

In order to obtain the actual value of an expression, the optimization
pass calls constant_expression_value().  But it should only do so if it
knows the value can be combined into a constant.  Otherwise, at each
step of walking back up the tree, it will walk down the tree again, only
to discover what it already knew: it isn't constant.

We properly avoided this call for ir_expression nodes, but not for
ir_swizzle nodes.  This patch fixes that, drastically reducing compile
times on certain shaders where tree grafting has given us huge
expression trees.  It also fixes SuperTuxKart.

Thanks to Iago and Mike for help in tracking this down.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78468
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2014-09-12 16:35:39 -07:00
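The cost described in the constant folding commit above can be modelled with a toy tree. This is a C sketch under loud assumptions: Mesa's IR is C++, and the node type, the visits counter and both functions here are invented for illustration. Each inner node has one operand, like a swizzle.

```c
#include <stddef.h>

/* Hypothetical expression node. */
struct node {
   struct node *operand; /* NULL for a leaf */
   int is_constant;      /* for a leaf: whether it is a literal constant */
};

static unsigned long visits; /* nodes touched by the recursive walk */

/* Recursive walk down the tree, in the spirit of
 * constant_expression_value(): returns 1 if the subtree is constant. */
static int subtree_is_constant(const struct node *n)
{
   visits++;
   if (!n->operand)
      return n->is_constant;
   return subtree_is_constant(n->operand);
}

/* Bottom-up folding.  With 'guard' set, the recursive walk is started
 * only when the operand has already been folded to a constant. */
static int fold(struct node *n, int guard)
{
   int operand_is_constant;

   if (!n->operand)
      return n->is_constant;

   operand_is_constant = fold(n->operand, guard);
   if (guard && !operand_is_constant)
      return 0;

   n->is_constant = subtree_is_constant(n);
   if (n->is_constant)
      n->operand = NULL; /* model replacing the node with a constant */
   return n->is_constant;
}
```

For a chain of inner nodes over a non-constant leaf, the unguarded version re-walks the whole subtree at every level, which grows quadratically with depth, while the guarded version never starts a walk.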
Kenneth Graunke
7865026c04 i965/vec4: Make type_size() return 0 for samplers.
The FS backend has always used 0, and the VS backend has always used 1.
I think 1 is just working around other problems, and is incorrect.
Samplers are baked in; nothing uses the UNIFORM register we would
create, and we shouldn't upload any constant values for them.

Fixes ES3-CTS.shaders.struct.uniform.sampler_array_vertex.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-12 16:35:39 -07:00
Kenneth Graunke
2408f166db i965: Skip allocating UNIFORM file storage for uniforms of size 0.
Samplers take up zero slots and therefore don't exist in the params
array, nor are they included in stage_prog_data->nr_params.  There's no
need to store their size in param_size, as it's only used for dealing
with arrays of "real" uniforms (ones uploaded as shader constants).

We run into all kinds of problems trying to refer to the uniform storage
for variables that don't have uniform storage.  For one, we may use some
other variable's index, or access out of bounds in arrays.  In the FS
backend, our extra 2 * MaxSamplerImageUnits params for texture rectangle
rescaling paper over a lot of problems.  In the VS backend, we claim
samplers take up a slot, which also papers over problems.

Instead, just skip allocating storage for variables that don't have any.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-12 16:35:39 -07:00
Kenneth Graunke
6b6145204d i965: Separate gl_InstanceID and gl_VertexID uploading.
We always uploaded them together, mostly out of laziness - both required
an additional vertex element.  However, gl_VertexID now also requires an
additional vertex buffer for storing gl_BaseVertex; for non-indirect
draws this also means uploading (a small amount of) data.  This is extra
overhead we don't need if the shader only uses gl_InstanceID.

In particular, our clear shaders currently use gl_InstanceID for doing
layered clears, but don't need gl_VertexID.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-12 16:35:35 -07:00
Kenneth Graunke
e980fe6071 i965: Fix reference counting in new basevertex upload code.
In the non-indirect draw case, we call intel_upload_data to upload
gl_BaseVertex.  It makes brw->draw.draw_params_bo point to the upload
buffer, and increments the upload BO reference count.

So, we need to unreference it when making brw->draw.draw_params_bo point
at something else, or else we'll retain a reference to stale upload
buffers and hold on to them forever.

This also means that the indirect case should increment the reference
count on the indirect draw buffer when making brw->draw.draw_params_bo
point at it.  That way, both paths increment the reference count, so
we can safely unreference it every time.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-12 16:23:02 -07:00
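The reference counting rule from the basevertex commit above (every path that repoints the pointer takes a reference, so the old target can always be unreferenced) can be sketched as follows. struct bo and bo_slot_set() are hypothetical stand-ins for the driver's buffer objects and helpers.

```c
#include <stddef.h>

/* Hypothetical buffer object with a reference count. */
struct bo { int refcount; };

/* Point *slot at 'bo', taking a reference on the new buffer and dropping
 * the reference held on whatever the slot pointed at before. */
static void bo_slot_set(struct bo **slot, struct bo *bo)
{
   if (bo)
      bo->refcount++;
   if (*slot)
      (*slot)->refcount--;
   *slot = bo;
}
```

Taking the new reference before dropping the old one also keeps the helper safe when the slot is set to the buffer it already holds.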
Rob Clark
9b6281a7da freedreno: "fix" problems with excessive flushes
4f338c9b introduced logic to trigger a flush rather than overflowing
cmdstream buffer.  But the threshold was too low, triggering flushes
where they were not needed.  This caused problems with games like
xonotic.

Part of the problem is that we need to mark all state dirty between
cmdstream submit ioctls, because we cannot rely on state being
preserved across ioctls.  But even with that, there are still some
problems that are still being debugged.  For now:

1) correctly mark all state dirty
2) introduce FD_MESA_DEBUG flush flag to force rendering to be flushed
between each draw, to trigger problems (so that I can debug)
3) use a more reasonable threshold so that for normal use cases we don't
trigger the problems

This at least corrects the regression, but there is still more debugging
to do.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-12 18:35:39 -04:00
Marek Olšák
d13d2fd161 r600g,radeonsi: add debug option which forces DMA for copy_region and blit 2014-09-12 22:51:28 +02:00
Ilia Mirkin
d7ec3db349 freedreno/ir3: implement UMUL correctly
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-12 16:26:21 -04:00
Ilia Mirkin
436dd1e2f8 freedreno/ir3: fix UCMP handling
UCMP does not require a compare, only a select.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-12 16:26:15 -04:00
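The "only a select" remark in the UCMP commit above corresponds to this per-component behaviour. A scalar C sketch, not ir3 code:

```c
#include <stdint.h>

/* UCMP per component: pick src1 when src0 is non-zero, else src2.
 * No separate comparison instruction is involved. */
static inline uint32_t ucmp(uint32_t src0, uint32_t src1, uint32_t src2)
{
   return src0 ? src1 : src2;
}
```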
Ilia Mirkin
9f5bd154d7 freedreno/ir3: add TXL support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-12 16:26:11 -04:00
Rob Clark
459f8f3d66 freedreno/ir3: add missing put_dst
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-12 16:26:09 -04:00
Rob Clark
59ff81663a freedreno/ir3: catch incorrect usage of tmp-dst
Each get_dst() should have a matching put_dst().  Add a bit of checking
to catch mistakes.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-12 16:26:09 -04:00
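The pairing rule stated in the commit above (each get_dst() needs a matching put_dst()) could be checked roughly like this. The context struct and the sketch_ function names are hypothetical; the real ir3 helpers have different signatures.

```c
#include <assert.h>

/* Hypothetical compile context tracking one outstanding destination. */
struct sketch_ctx { int dst_outstanding; };

static void sketch_get_dst(struct sketch_ctx *ctx)
{
   /* A second get without an intervening put is a usage error. */
   assert(!ctx->dst_outstanding);
   ctx->dst_outstanding = 1;
}

static void sketch_put_dst(struct sketch_ctx *ctx)
{
   /* A put without a matching get is a usage error too. */
   assert(ctx->dst_outstanding);
   ctx->dst_outstanding = 0;
}
```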
Ilia Mirkin
db1a94b1cc freedreno/ir3: use unsigned comparison for UIF
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-12 16:26:05 -04:00
Ilia Mirkin
11d72553c5 freedreno/ir3: negate result of USLT/etc
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-12 16:26:01 -04:00
Ilia Mirkin
8edf83b377 freedreno/ir3: add UARL support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-12 16:25:57 -04:00
Ilia Mirkin
10273f84c2 freedreno/ir3: INEG operates on src0, not src1
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-12 16:25:52 -04:00
Ilia Mirkin
572ffca050 freedreno/ir3: fix FSLT/etc handling to return 0/-1 instead of 0/1.0
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-12 16:25:47 -04:00
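The 0/-1 convention named in the FSLT commit above means the comparison yields an integer mask rather than a float. A scalar C sketch, not ir3 code:

```c
#include <stdint.h>

/* FSLT per component: all bits set (-1 as a signed integer) when
 * a < b, otherwise 0.  The result is not the float 1.0. */
static inline uint32_t fslt(float a, float b)
{
   return a < b ? ~(uint32_t)0 : 0;
}
```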
Rob Clark
80058c0f08 freedreno/a3xx: alpha render-target shenanigans
We need the .w component to end up in .x, since the hw appears to fetch
gl_FragColor starting with the .x coordinate regardless of MRT format.
As long as we are doing this, we might as well throw out the remaining
unneeded components.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-12 16:23:52 -04:00
Rob Clark
3e0a82b52e util/u_format: add _is_alpha()
Because of render-to-alpha (000x) shenanigans, freedreno needs to do
some special handling when rendering to alpha-only formats.  And I
noticed that while we had _is_luminance(), _is_intensity(), etc, an
_is_alpha() helper was missing.  So fix that.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-12 16:23:52 -04:00
Rob Clark
480fe244dd freedreno/a3xx: format fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-12 16:23:52 -04:00
Rob Clark
1fba490569 freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-12 16:23:52 -04:00
Rob Clark
2ed7640eec freedreno/a3xx: handle rendering to layer != 0
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-12 16:23:52 -04:00
Brian Paul
0d73ac6b02 mesa: fix _mesa_free_pipeline_data() use-after-free bug
Unreference the ctx->_Shader object before we delete all the pipeline
objects in the hash table.  Before, ctx->_Shader could point to freed
memory when _mesa_reference_pipeline_object(ctx, &ctx->_Shader, NULL)
was called.

Fixes crash when exiting the piglit rendezvous_by_location test on
Windows.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-12 09:17:31 -06:00
Connor Abbott
2828680e39 ra: assert against unsigned underflow in q_total
q_total should never go below 0 (which is why it's defined as unsigned),
and if it does, then something is seriously wrong.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-12 16:07:47 +02:00
Connor Abbott
ec046bc08e ra: note a restriction in the interference graph API
As noted in the previous commit, this was introduced in
567e2769b8 ("ra: make the p, q test more
efficient"), but I forgot to mention it.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-12 16:07:47 +02:00
Connor Abbott
afd82dcad1 r300g: set register classes before interferences
In commit 567e2769b8 ("ra: make the p, q
test more efficient") I unknowingly introduced a new requirement to the
register allocator API: the user must set the register class of all
nodes before setting up their interferences, because
ra_add_conflict_list() now uses the classes of the two interfering
nodes. i965 already did this, but r300g was setting up register classes
interleaved with setting up the interference graph. This led to us
calculating the wrong q total, and in certain cases
e78a01d5e6 ("ra: optimistically color
only one node at a time") made it so that this bug caused a segfault. In
particular, the error occurred if the q total was decremented to 1 below
0 for the last node to be pushed onto the stack.  Since q_total is an
unsigned integer, it wrapped around to 0xffffffff, which is what
lowest_q_total happens to be initialized to. This means that we would
fail the "new_q_total < lowest_q_total" check on line 476 of
register_allocate.c, and so the node would never be pushed onto the
stack, which led to segfaults in ra_select() when we failed to ever give
it a register.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82828
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Tested-by: Pavel Ondračka <pavel.ondracka@email.cz>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-09-12 16:07:07 +02:00
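The failure mode traced in the r300g commit above comes down to unsigned arithmetic. A minimal C sketch, assuming a 32-bit unsigned int so that UINT_MAX equals 0xffffffff; the function names are illustrative and are not from register_allocate.c.

```c
#include <limits.h>

/* An unsigned counter decremented below zero wraps to UINT_MAX. */
static unsigned decrement(unsigned q_total)
{
   return q_total - 1;
}

/* The "new < lowest" test can never succeed for a wrapped value when
 * 'lowest' starts out as UINT_MAX. */
static int is_new_lowest(unsigned new_q_total, unsigned lowest_q_total)
{
   return new_q_total < lowest_q_total;
}
```

A node whose q total has wrapped therefore never becomes the chosen candidate, which matches the description of it never being pushed onto the stack.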
Andreas Boll
2a13ff954d gallium/util: add missing u_debug include
Needed for assert.
Fixes build on BE archs with -Werror=implicit-function-declaration.

In file included from
../../../../../src/gallium/auxiliary/draw/draw_fs.c:30:0:
../../../../../src/gallium/auxiliary/util/u_math.h: In function
'util_memcpy_cpu_to_le32':
../../../../../src/gallium/auxiliary/util/u_math.h:810:4: error:
implicit declaration of function 'assert'
[-Werror=implicit-function-declaration]
    assert(n % 4 == 0);
        ^

Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-12 15:55:12 +02:00
Chia-I Wu
802018df5f ilo: fix builder size checks for BLT buffer clear/copy
In buf_clear_region() and buf_copy_region(), max_cmd_size was set to 0.  If
either of the functions is called and there is not enough space in the
builder, the next ilo_cp_flush() will fail silently in a release build.

Replace magic numbers by size defines in tex_clear_region()/tex_copy_region()
for consistency and readability.
2014-09-12 16:58:31 +08:00
Chia-I Wu
07e0923203 ilo: reduce BLT function parameters
Introduce gen6_blt_bo and gen6_blt_xy_bo to describe BOs.  In the extreme case
of gen6_XY_SRC_COPY_BLT(), the number of parameters goes down from 18 to 8.
2014-09-12 16:58:30 +08:00
Chia-I Wu
8fa62a9982 ilo: clean up BLT functions
Follow the changes for MI functions, but for BLT this time.
2014-09-12 16:58:30 +08:00
Chia-I Wu
a77aaf4363 ilo: clean up MI functions
With ilo_builder in place, some conventions we had to build commands are no
longer needed.
2014-09-12 16:58:30 +08:00
Chia-I Wu
0c6a9cde94 ilo: move BLT functions to ilo_builder_blt.h
Follow the changes for MI functions, but for BLT this time.
2014-09-12 16:58:30 +08:00
Chia-I Wu
50d2d9a69d ilo: move MI functions to ilo_builder_mi.h
Have a centralized place for MI functions, and remove the duplicated
gen6_MI_LOAD_REGISTER_IMM().
2014-09-12 16:58:30 +08:00
Chia-I Wu
521887f9fd ilo: add ILO_DEV_ASSERT()
It replaces ILO_GPE_VALID_GEN().
2014-09-12 16:58:30 +08:00
Chia-I Wu
56d2ebb019 ilo: use an accessor for dev->gen
It should enable us to do specialized builds by making the accessor return a
constant.
2014-09-12 16:58:30 +08:00
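The idea in the accessor commit above, where a specialized build makes the accessor return a constant, might be sketched like this. The struct, the function and the SKETCH_FIXED_GEN macro are all hypothetical and are not the ilo names.

```c
/* Hypothetical device info. */
struct sketch_dev_info { int gen; };

/* All readers go through the accessor.  Defining SKETCH_FIXED_GEN at
 * build time turns it into a compile-time constant, which lets the
 * compiler drop code paths for other generations. */
static inline int sketch_dev_gen(const struct sketch_dev_info *dev)
{
#ifdef SKETCH_FIXED_GEN
   (void)dev;
   return SKETCH_FIXED_GEN;
#else
   return dev->gen;
#endif
}
```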
Chia-I Wu
ea5de3e0bd ilo: add GEN_EXTRACT() and GEN_SHIFT32()
They replace READ() and SET_FIELD() that we have been using.
2014-09-12 16:58:29 +08:00
Chia-I Wu
e8f4dd70ab ilo: remove ILO_GEN_GET_MAJOR()
The last user has gone away.
2014-09-12 16:58:29 +08:00
Chia-I Wu
611f09890e ilo: careful with empty fb state in ilo_gpe_set_fb()
We cannot pass 0 as the width or height to ilo_gpe_init_view_surface_null().
2014-09-12 16:58:29 +08:00
Ilia Mirkin
95058bdec3 nv50,nvc0: enable ARB_texture_view
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-12 00:57:45 -04:00
Ilia Mirkin
d82bd7eb06 mesa/st: add ARB_texture_view support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-09-12 00:55:26 -04:00
Ilia Mirkin
c113095acd gallium: add a texture target to sampler view and a CAP to use it
This allows a sampler view to have a different texture target than the
underlying resource. This will be used to implement the type casting
between 2d arrays and cube maps as specified in ARB_texture_view.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-12 00:54:55 -04:00
Ilia Mirkin
3c81de5851 nouveau: only enable stencil func if the visual has stencil bits
The _Enabled property already has the relevant information.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-12 00:51:20 -04:00
Ilia Mirkin
79959e5de5 nouveau: only enable the depth test if there actually is a depth buffer
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-12 00:50:56 -04:00
Maarten Lankhorst
8ab85bfcd5 nouveau: remove unneeded assert
No idea why it was added, but the code runs fine even on videos
where it triggers.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-11 23:18:07 -04:00
Maarten Lankhorst
a41aad8431 nouveau: rework reference frame handling
Fixes a regression from "nouveau/vdec: small fixes to h264 handling"

New picking order for frames:
 1. Vidbuf pointer matches.
 2. Take the first kicked ref.
 3. If that fails, take a ref that has a different last_used.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-11 23:18:05 -04:00
Maarten Lankhorst
121ceb38f4 nouveau: fix MPEG4 hw decoding
Reorder some fields to make I-frame decoding work correctly.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-11 23:18:03 -04:00
Maarten Lankhorst
f6afed7076 nouveau: re-allocate bo's on overflow
The BSP bo might be too small to contain all of the bsp data,
bump its size on overflow. Also bump inter_bo when this happens,
it might be too small otherwise.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-11 23:17:52 -04:00
Chia-I Wu
1187dbdd10 ilo: fix a compile error with -Werror=format-security
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83777
2014-09-12 09:45:42 +08:00
Ian Romanick
7aeb853c90 i965/vec4: Only examine virtual_grf_end for GRF sources
If the source is not a GRF, it could have a register >= virtual_grf_count.
Accessing virtual_grf_end with such a register would lead to
out-of-bounds access.  Make sure the source is a GRF before accessing
virtual_grf_end.
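
A minimal sketch of the guard described above, not the actual i965 code. The `reg_file` enum, the `src_reg` struct and `src_live_range_ends_at()` are hypothetical names made up for illustration; only `virtual_grf_end` and `virtual_grf_count` are taken from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register files; only GRF sources index virtual_grf_end. */
enum reg_file { FILE_GRF, FILE_UNIFORM, FILE_ATTR };

struct src_reg {
   enum reg_file file;
   unsigned reg;
};

/* Returns true only if src is a GRF whose live range ends at ip.
 * Non-GRF sources are rejected before the array is touched, since
 * their register number may be >= virtual_grf_count. */
bool
src_live_range_ends_at(const struct src_reg *src,
                       const int *virtual_grf_end,
                       size_t virtual_grf_count,
                       int ip)
{
   if (src->file != FILE_GRF)
      return false;
   if (src->reg >= virtual_grf_count)
      return false;
   return virtual_grf_end[src->reg] == ip;
}
```

The file check comes first so that the array is never indexed with a register number that belongs to another register file.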

Fixes Valgrind complaints while compiling some shaders.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2014-09-11 11:18:36 -07:00
Brian Paul
a46d7579e9 st/mesa: handle failed context creation for core profile
If the glx/wgl state tracker requested a core profile but the gallium
driver did not support some feature of GL 3.1 or later, we were setting
ctx->Version=0 and then failing the assertion in
_mesa_initialize_exec_table().

With this change we check for ctx->Version=0 and tear down the context
and return NULL from st_create_context().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-11 08:22:55 -06:00
Iago Toral Quiroga
f976b4c1bf i965: Implement GL_PRIMITIVES_GENERATED with non-zero streams.
So far we have been using CL_INVOCATION_COUNT to resolve this query but this
is no good with streams, as only stream 0 reaches the clipping stage.

From ARB_transform_feedback3:

"When a generated primitive query for a vertex stream is active, the
 primitives-generated count is incremented every time a primitive emitted to
 that stream reaches the Discarding Rasterization stage (see Section 3.x)
 right before rasterization. This counter is incremented whether or not
 transform feedback is active."

Unfortunately, we don't have any registers that provide the number of primitives
written to a specific stream other than the ones that track the number of
primitives written to transform feedback in the SOL stage, so we can't
implement this exactly as specified.

In the past we implemented this feature by activating the SOL unit even if
transform feedback was disabled, but making it so that all buffers were
disabled and it only recorded statistics, which gave us the right semantics
(see 3178d2474a). Unfortunately, this came with
a significant performance impact and had to be reverted.

This new take does not intend to implement the exact semantics required by
the spec, but improves what we have now, since now we return the primitive
count for stream 0 in all cases. With this patch we use
GEN7_SO_PRIM_STORAGE_NEEDED to resolve GL_PRIMITIVES_GENERATED queries
for non-zero streams. This would return the number of primitives written
to transform feedback for each stream instead. Since non-zero streams are
only useful in combination with transform feedback this should not be too
bad, and the only case that I think we would not be supporting would be
the one in which we want to use both GL_PRIMITIVES_GENERATED and
GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN on the same non-zero stream to
detect buffer overflow.

This patch also fixes the following piglit test:
arb_gpu_shader5-xfb-streams-without-invocations

This test uses both GL_PRIMITIVES_GENERATED and
GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN queries on non-zero streams, but it
never hits the overflow case, so both queries are always expected to return
the same value.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
2014-09-11 15:17:22 +02:00
Christian König
6327b58415 radeon/uvd: use PIPE_USAGE_STAGING for msg&fb buffers
That better matches the actual userspace use case, the
kernel will force it to VRAM if the hardware requires it.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-11 15:06:09 +02:00
Christian König
4dfdcdb4b3 radeon/video: use the hw to initial clear the buffers
Less CPU overhead and avoids contention over CPU accessible memory on startup.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-11 15:06:03 +02:00
Christian König
4bc0059229 radeon/video: use more of the common buffer code v2
In preparation for using buffer clears with the hw engine(s).

v2: split out flipping to using hw buffer clears.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-11 15:05:55 +02:00
José Fonseca
771ab951a8 scons: add /dynamicbase and /nxcompat to MinGW linkflags
Just like b26503b196d51dc46c815e241343e42ab30e8d66 for MSVC.
2014-09-11 11:59:28 +01:00
Brian Paul
4860e98972 scons: add /dynamicbase and /nxcompat to MSVC linkflags
This builds the opengl DLLs with address space layout randomization
(ASLR) and data execution prevention (DEP) for better security.

Reviewed-by: Kurt Daverman <krd@vmware.com>
2014-09-11 11:59:28 +01:00
Chia-I Wu
6816d853db ilo: add a new disassembler
The old disassembler was modified from i965's.  It is as much work as doing a
new one to keep it up-to-date, which also requires copying more headers over.

The outputs of this new disassembler should match i965's as closely as
possible.
2014-09-11 16:29:38 +08:00
Chia-I Wu
b51b349942 ilo: update genhw headers
Add some new registers and some tweaks.  The changes that affect ilo are

 GEN6_REG_HS_INVOCATION_COUNT -> GEN7_REG_HS_INVOCATION_COUNT
 GEN6_REG_DS_INVOCATION_COUNT -> GEN7_REG_DS_INVOCATION_COUNT
 GEN6_COND_NORMAL             -> GEN6_COND_NONE
2014-09-11 16:29:38 +08:00
Frank Henigman
9c707d065a glsl: allow precision qualifier on sampler arrays
If a precision qualifier is allowed on type T, it should be allowed
on an array of T.  Refactor the check to ensure this is the case.

(Fixes failures in WebGL conformance test 'gl-min-textures')

Signed-off-by: Frank Henigman <fjhenigman@google.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-09-11 10:41:00 +03:00
Tapani Pälli
096ee4c3b0 glsl: mark variable as loop constant when it is set read only
Patch modifies is_loop_constant() to take advantage of 'read_only' bit
in ir_variable to detect a loop constant. Variables marked read-only
are loop constant, as mentioned by a comment in the function.

v2: remove unnecessary comment (Francisco)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82537
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-09-11 10:09:12 +03:00
Michel Dänzer
82edcb918b radeonsi: Simplify si_dma_copy_tile function
No functional change intended.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-11 12:36:03 +09:00
Brian Paul
5cf8d9f54b u_vbuf: simple whitespace fix
2014-09-10 16:37:54 -06:00
Brian Paul
9608193cbc mesa: fix UNCLAMPED_FLOAT_TO_UBYTE() macro for MSVC
MSVC replaces the "F" in "255.0F" with the macro argument which leads
to an error.  s/F/FLT/ to avoid that.

It turns out we weren't using this macro at all on MSVC until the
recent "mesa: Drop USE_IEEE define." change.
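
To illustrate the naming problem, here is a simplified sketch, assuming a plain clamp-and-round conversion rather than Mesa's real macro body. The parameter is called `FLT` so that a preprocessor which wrongly splits the literal `255.0F` has no parameter named `F` to substitute. The `float_to_ubyte()` wrapper is a hypothetical helper added only to exercise the macro.

```c
#include <stdint.h>

/* Simplified stand-in for the macro: clamp to [0, 1], then scale and round.
 * The second parameter is FLT, not F, so it cannot collide with the
 * float suffix in 255.0F on a preprocessor that mis-tokenizes literals. */
#define UNCLAMPED_FLOAT_TO_UBYTE(UB, FLT)                       \
   do {                                                          \
      float f_ = (FLT);                                          \
      if (!(f_ > 0.0F))                                          \
         (UB) = 0;            /* also catches NaN */             \
      else if (f_ >= 1.0F)                                       \
         (UB) = 255;                                             \
      else                                                       \
         (UB) = (uint8_t)(f_ * 255.0F + 0.5F);                   \
   } while (0)

/* Hypothetical wrapper so the macro can be called like a function. */
uint8_t
float_to_ubyte(float f)
{
   uint8_t ub;
   UNCLAMPED_FLOAT_TO_UBYTE(ub, f);
   return ub;
}
```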

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-09-10 16:37:54 -06:00
Brian Paul
56d8cfd7a5 mesa: trim down some #includes
2014-09-10 13:16:00 -06:00
Vinson Lee
cc20c45a36 pipe-loader: Include unistd.h in pipe_loader_drm.c for close function.
This patch fixes a build error on DragonFly.

  CC       libpipe_loader_la-pipe_loader_drm.lo
pipe_loader_drm.c: In function 'pipe_loader_drm_probe':
pipe_loader_drm.c:207:10: error: implicit declaration of function 'close' [-Werror=implicit-function-declaration]

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-09-10 11:59:38 -07:00
Kenneth Graunke
0bac2551e4 i965: Disable guardband clipping in the smaller-than-viewport case.
Apparently guardband clipping doesn't work like we thought: objects
entirely outside the guardband are trivially rejected, regardless of
their relation to the viewport.  Normally, the guardband is larger than
the viewport, so this is not a problem.  However, when the viewport is
larger than the guardband, this means that we would discard primitives
which were wholly outside of the guardband, but still visible.

We always program the guardband to 8K x 8K to enforce the restriction
that the screenspace bounding box of a single triangle must be no more
than 8K x 8K.  So, if the viewport is larger than that, we need to
disable guardband clipping.
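
The resulting rule can be sketched as a single predicate. This is an illustration under stated assumptions, not the driver code: `GUARDBAND_EXTENT` and `guardband_clip_allowed()` are hypothetical names, and the viewport extents are assumed to be passed as non-negative sizes.

```c
#include <stdbool.h>

/* The guardband is always programmed to 8K x 8K, per the commit message. */
#define GUARDBAND_EXTENT 8192.0f

/* Guardband clipping is only safe while the viewport fits inside the
 * guardband; otherwise visible primitives outside the guardband would be
 * trivially rejected. */
bool
guardband_clip_allowed(float viewport_width, float viewport_height)
{
   return viewport_width <= GUARDBAND_EXTENT &&
          viewport_height <= GUARDBAND_EXTENT;
}
```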

Fixes ES3 conformance tests:
- framebuffer_blit_functionality_negative_height_blit
- framebuffer_blit_functionality_negative_width_blit
- framebuffer_blit_functionality_negative_dimensions_blit
- framebuffer_blit_functionality_magnifying_blit
- framebuffer_blit_functionality_multisampled_to_singlesampled_blit

v2: Mention the acronym expansion for TA/TR/MC in the comments.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-09-10 11:40:30 -07:00
Ian Romanick
927f5db461 i965: Request lowering gl_VertexID
Fixes the (new) piglit tests gles-3.0-drawarrays-vertexid,
gl-3.0-multidrawarrays-vertexid, and gl-3.2-basevertex-vertexid.

Fixes gles3conform failure in:

ES3-CTS.gtf.GL3Tests.transform_feedback.transform_feedback_vertex_id

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80247
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-10 11:05:09 -07:00
Kenneth Graunke
fbb353bc13 i965: Expose gl_BaseVertex via a vertex attribute.
Now that we have the data available, we need to expose it to the
shaders.  We can reuse the same vertex element that we use for
gl_VertexID, but we need to back it by an actual vertex buffer.

A hardware restriction requires that vertex attributes coming from a
buffer (STORE_SRC) must come before any other types (i.e. STORE_0).
So, we have to make gl_BaseVertex be the .x component of the vertex
attribute.  This means moving gl_VertexID to a different component.

I chose to move gl_VertexID and gl_InstanceID to the .z and .w
components, respectively, to make room for gl_BaseInstance in the .y
component (which would also come from a buffer, and therefore be
STORE_SRC).
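
The component layout described above can be written down as follows. This is only a sketch: the enum and both helper names are hypothetical, and the code models nothing more than the ordering rule that buffer-sourced components must come first.

```c
#include <stdbool.h>

/* Component slots of the shared vertex element, in the order given in the
 * commit message. */
enum draw_param_slot {
   SLOT_BASE_VERTEX   = 0,  /* .x, read from a buffer (STORE_SRC) */
   SLOT_BASE_INSTANCE = 1,  /* .y, reserved, would also be STORE_SRC */
   SLOT_VERTEX_ID     = 2,  /* .z, not buffer-sourced */
   SLOT_INSTANCE_ID   = 3,  /* .w, not buffer-sourced */
};

bool
slot_is_buffer_sourced(enum draw_param_slot slot)
{
   return slot == SLOT_BASE_VERTEX || slot == SLOT_BASE_INSTANCE;
}

/* Checks the restriction: once a slot that is not buffer-sourced appears,
 * no later slot may be buffer-sourced. */
bool
layout_respects_ordering(void)
{
   bool seen_other = false;

   for (int s = SLOT_BASE_VERTEX; s <= SLOT_INSTANCE_ID; s++) {
      if (slot_is_buffer_sourced((enum draw_param_slot)s)) {
         if (seen_other)
            return false;
      } else {
         seen_other = true;
      }
   }
   return true;
}
```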

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-10 11:05:08 -07:00
Kenneth Graunke
87b10c4a71 i965: Refactor Gen4-7 VERTEX_BUFFER_STATE emission into a helper.
We'll need to emit another VERTEX_BUFFER_STATE for gl_BaseVertex;
pulling this into a helper function will save us from having to deal
with cross-generation differences in that code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-10 11:05:08 -07:00
Kenneth Graunke
fdbabf22e1 i965: Make gl_BaseVertex available in a buffer object.
This will be used for GL_ARB_shader_draw_parameters, as well as fixing
gl_VertexID, which is supposed to include gl_BaseVertex's value.

For indirect draws, we simply point at the indirect buffer; for normal
draws, we upload the value via the upload buffer.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-10 11:05:08 -07:00
Kenneth Graunke
c89306983c i965: Calculate start/base_vertex_location after preparing vertices.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-10 11:05:08 -07:00
Ian Romanick
9975792abd i965: Handle SYSTEM_VALUE_VERTEX_ID_ZERO_BASE
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-10 11:05:08 -07:00
Kenneth Graunke
26e949b26e mesa: Fix glGetActiveAttribute for gl_VertexID when lowered.
The lower_vertex_id pass converts uses of the gl_VertexID system value
to the gl_BaseVertex and gl_VertexIDMESA system values.  Since
gl_VertexID is no longer accessed, it would not be considered active.

Of course, it should be, since the shader uses gl_VertexID.

v2: Move the var->name dereference past the var != NULL check.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-10 11:05:08 -07:00
Kenneth Graunke
26c9514155 mesa: Replace string comparisons with SYSTEM_VALUE enum checks.
This is more efficient.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-09-10 11:05:08 -07:00
Ian Romanick
ec08b5e768 glsl: Add a lowering pass for gl_VertexID
Converts gl_VertexID to (gl_VertexIDMESA + gl_BaseVertex). gl_VertexIDMESA
is backed by SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, and gl_BaseVertex is backed
by SYSTEM_VALUE_BASE_VERTEX.
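
In plain C terms, the lowered expression amounts to the sum below. The function name is hypothetical; the real pass rewrites GLSL IR rather than calling a helper like this.

```c
/* gl_VertexID as the GL semantic defines it: the zero-based vertex id
 * (gl_VertexIDMESA) plus the basevertex value of the draw call. */
int
lowered_vertex_id(int vertex_id_zero_base, int base_vertex)
{
   return vertex_id_zero_base + base_vertex;
}
```

For a draw with basevertex 100, the third vertex processed (zero-based id 2) therefore sees gl_VertexID 102.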

v2: Put the enum in struct gl_constants and properly resolve the scope
in C++ code.  Fix suggested by Marek.

v3: Rebase on Matt's foreach_in_list changes (was using foreach_list).

v4 (Ken): Use a systemvalue instead of a uniform because
STATE_BASE_VERTEX has been removed.

v5: Use a boolean to select lowering, and only allow one lowering
method.  Suggested by Ken.

v6 (Ken): Replace strcmp against literal "gl_BaseVertex"/"gl_VertexID"
with SYSTEM_VALUE enum checks, for efficiency.

v7: Rebase on context constant initialization work.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-10 11:05:08 -07:00
Ian Romanick
04d3323d4b glsl/linker: Make get_main_function_signature public
The next patch will use this function in a different file.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-09-10 11:05:05 -07:00
Ian Romanick
1e87fbd78f mesa: Add SYSTEM_VALUE_BASE_VERTEX
This system value represents the basevertex value passed to
glDrawElementsBaseVertex and related functions.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-09-10 11:04:50 -07:00
Ian Romanick
5964a4f344 mesa: Add SYSTEM_VALUE_VERTEX_ID_ZERO_BASE
There exists hardware, such as i965, that does not implement the OpenGL
semantic for gl_VertexID.  Instead, that hardware does not include the
value of basevertex in the gl_VertexID value.
SYSTEM_VALUE_VERTEX_ID_ZERO_BASE is the system value that represents
this semantic.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-09-10 11:04:48 -07:00
Ian Romanick
9afb5ae8ca mesa: Document SYSTEM_VALUE_VERTEX_ID and SYSTEM_VALUE_INSTANCE_ID
v2: Additions to the documentation for SYSTEM_VALUE_VERTEX_ID.  Quote
the GL_ARB_shader_draw_parameters spec and mention DirectX SV_VertexID.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-09-10 11:04:44 -07:00
Jonathan Gray
cdb353539c configure.ac: unbreak the build with non gnu grep
181581280b changed the way the
llvm-config version is read from sed to grep and introduced
a requirement for gnu grep extension that treats BREs as EREs.

Avoid this by calling egrep instead of grep which should be
able to handle EREs everywhere.

This allows Mesa to build on OpenBSD again.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
2014-09-10 08:35:11 -07:00
Eric Anholt
d64ca0a765 vc4: Add support for shadow samplers.
This doesn't quite make depth-tex-compare work, presumably because we're
not hitting equality with itof(sample) * 1.0/0xffffff in the 0xffffff
case.  arb_fragment_program_shadow tests pass, though, as well as a bunch
of other shadow-related stuff.
2014-09-09 20:41:43 -07:00
Eric Anholt
7d5c57f8e9 vc4: Add support for texture swizzles.
Fixes depth-tex-modes.
2014-09-09 20:39:29 -07:00
Eric Anholt
1e77c93340 vc4: Move the texture format into a struct.
I'm going to be putting some bitfields into the struct as well.
2014-09-09 20:38:39 -07:00
Eric Anholt
e7a6c54473 vc4: Add support for depth texturing.
2014-09-09 20:38:39 -07:00
Eric Anholt
d952a98c53 vc4: Expose r4 to register allocation.
We potentially need to be careful that use of a value stored in r4 isn't
copy-propagated (or something) across another r4 write.  That doesn't
appear to happen currently, and this makes the dataflow more obvious.  It
also opens up not unpacking the r4 value, which will be useful for depth
textures.
2014-09-09 20:38:39 -07:00
Eric Anholt
be1fcd2cd3 vc4: Drop pointless raddr conflict handling on SF.
SF doesn't have a src[1].
2014-09-09 20:38:39 -07:00
Eric Anholt
04faeff28a vc4: The r4_count is supposed to be how many writes, not reads.
It's part of the key so that you can tell which r4 value is being read.
2014-09-09 20:38:38 -07:00
Michel Dänzer
5679ccfcaf r600g,radeonsi: Set RADEON_GEM_NO_CPU_ACCESS flag for tiled BOs
This lets the kernel know that such BOs can be pinned outside of the CPU
accessible part of VRAM.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-10 12:01:10 +09:00
Rob Clark
720cfb6fe9 freedreno/a3xx: enable hw primitive-restart
Since software primitive-restart emulation is going to be removed (and
anyway mostly seemed to be crash-prone in combination with
u_primconvert and oddball scenarios, like PIPE_PRIM_POLYGON with only a
single vertex), we might as well do it in hardware (which fortunately
didn't turn out to be too hard to figure out).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-09 19:42:18 -04:00
Rob Clark
564183f39c freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-09 19:42:18 -04:00
Rob Clark
a2c22d80d4 freedreno/ir3: fix potential segfault in RA
Triggered by shaders like:

  FRAG
  PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
  DCL OUT[0], COLOR
  DCL CONST[0]
  DCL TEMP[0..2], LOCAL
    0: IF CONST[0].xxxx :0
    1:   MOV TEMP[0], TEMP[1]
    2: ELSE :0
    3:   MOV TEMP[0], TEMP[2]
    4: ENDIF
    5: MOV OUT[0], TEMP[0]
    6: END

not really a sane shader, although driver segfaulting is probably
not the appropriate response.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-09 19:42:18 -04:00
Rob Clark
4f338c9bbf freedreno: don't overflow cmdstream buffer so much
We currently aren't too clever about dealing with running out of
cmdstream buffer space.  Since we use a single buffer for both drawing
and tiling commands, we need to ensure there is enough space at the tail
of the cmdstream buffer to fit the tiling commands.

Until we get more clever, the easy solution is a threshold to trigger
flushing rendering even if the application does not trigger flush (swap,
changing render target, etc).  This way we at least don't crash for apps
that do several thousand draw calls (like some piglit tests do).
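
A threshold check of this kind might look like the sketch below. The `cmdstream` struct, its fields and `needs_flush()` are hypothetical; the actual freedreno data structures and threshold value are not shown in the commit message.

```c
#include <stdbool.h>

/* Hypothetical view of a command stream buffer, sizes in dwords. */
struct cmdstream {
   unsigned size;   /* total capacity */
   unsigned used;   /* dwords already written by draw commands */
};

/* Returns true when the space left at the tail is no larger than what
 * the tiling commands are assumed to need, so rendering should be
 * flushed now even though the application did not ask for it. */
bool
needs_flush(const struct cmdstream *cs, unsigned tiling_reserve)
{
   unsigned remaining = cs->size > cs->used ? cs->size - cs->used : 0;

   return remaining <= tiling_reserve;
}
```

Computing the remaining space by subtraction avoids the unsigned overflow that `used + tiling_reserve` could hit.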

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-09 19:42:18 -04:00
Rob Clark
fd4884e929 freedreno/ir3: add no-copy-propagate fallback step
Most of the things the new compiler still has trouble with basically
amount to cp stage removing too many copies.  But without the cp stage,
the shaders the new compiler produces are still better (perf and
correctness) than the old compiler.  So a simple thing to do until I
have more time to work on it is to first try falling back to the new
compiler without cp, before finally falling back to the old compiler.
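
The fallback order can be sketched like this. Everything here is a stand-in: the `shader` struct merely simulates which attempts succeed, and `compile_new()`, `compile_old()` and `compile_shader()` are hypothetical names, not the freedreno entry points.

```c
#include <stdbool.h>

enum compile_path {
   PATH_NEW_WITH_CP,   /* new compiler, copy-propagation enabled */
   PATH_NEW_NO_CP,     /* new compiler, copy-propagation skipped */
   PATH_OLD,           /* old compiler */
   PATH_FAILED,
};

/* Stand-in shader: the flags simulate which attempts would succeed. */
struct shader {
   bool new_cp_ok;
   bool new_nocp_ok;
   bool old_ok;
};

bool
compile_new(const struct shader *s, bool use_cp)
{
   return use_cp ? s->new_cp_ok : s->new_nocp_ok;
}

bool
compile_old(const struct shader *s)
{
   return s->old_ok;
}

/* Try the best option first and degrade one step at a time. */
enum compile_path
compile_shader(const struct shader *s)
{
   if (compile_new(s, true))
      return PATH_NEW_WITH_CP;
   if (compile_new(s, false))
      return PATH_NEW_NO_CP;
   if (compile_old(s))
      return PATH_OLD;
   return PATH_FAILED;
}
```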

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-09 19:42:18 -04:00
Emil Velikov
e387fdd235 ilo: add ilo_builder.h to the sources list
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-09 22:17:39 +01:00
Kenneth Graunke
e36bbff0e6 ir_to_mesa: Stop converting uniform booleans.
Excess conversions considered harmful.

Recently Matt reworked the boolean uniform handling to use the value of
UniformBooleanTrue, rather than integer 1, when uploading uniforms:

    mesa: Upload boolean uniforms using UniformBooleanTrue.
    glsl: Use UniformBooleanTrue value for uniform initializers.

Marek then set the default to 1.0f for drivers without native integer
support:

    mesa: set UniformBooleanTrue = 1.0f by default

However, ir_to_mesa was assuming a value of integer 1, and arranging for
it to be converted to 1.0f on upload.  Since Marek's commit, we were
uploading 1.0f = 0x3f800000 which was being interpreted as the integer
value 1065353216 and converted to float as 1.06535322E9, which broke
assumptions in ir_to_mesa that "true" was exactly 1.0f.
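
The arithmetic can be reproduced in a few lines of C, assuming 32-bit IEEE 754 floats. The helper names are hypothetical; they only demonstrate the reinterpretation, not the ir_to_mesa upload path.

```c
#include <stdint.h>
#include <string.h>

/* Returns the raw bit pattern of a float (assumes 32-bit IEEE 754). */
uint32_t
float_bits(float f)
{
   uint32_t u;

   memcpy(&u, &f, sizeof u);
   return u;
}

/* Models the bug: the uploaded float's bits are treated as an integer
 * and then converted to float a second time. */
float
convert_as_if_integer(float uploaded)
{
   return (float)(int32_t)float_bits(uploaded);
}
```

With 1.0f as input, `float_bits()` yields 0x3f800000 and `convert_as_if_integer()` yields 1065353216.0f, which is the value quoted above and clearly not 1.0f.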

+13 Piglits on classic swrast (fs-bool-less-compare-true,
{vs,fs}-op-not-bool-using-if, glsl-1.20/execution/uniform-initializer).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83573
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-09 13:19:44 -07:00
Jonathan Gray
c68073e65f configure.ac: strip _GNU_SOURCE from llvm-config output
Mesa already defines _GNU_SOURCE for glibc based systems and defining
_GNU_SOURCE will break the Mesa build on other systems such as OpenBSD.

_GNU_SOURCE only seems to be included in llvm-config output when
LLVM is built via autoconf and not when it is built by cmake.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
2014-09-09 20:04:45 +01:00
Stefan Dirsch
49022a9713 xmlconfig: suppress libGL warnings when LIBGL_DEBUG == "quiet"
Let's handle LIBGL_DEBUG env. variable in Mesa in a consistent way.

Fixes: https://bugzilla.novell.com/show_bug.cgi?id=895730
Signed-off-by: Stefan Dirsch <sndirsch@suse.de>
Reviewed-by: Courtney Goeltzenleuchter <courtney@lunarg.com>
2014-09-09 19:46:57 +01:00
Emil Velikov
3d8b53ffb4 automake: remove obsolete NEED_GALLIUM_LOADER
Superseded by HAVE_LOADER_GALLIUM. The latter has a *DRM* sibling,
which makes the choice of which one to keep easier.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-09 19:45:24 +01:00
Emil Velikov
44ec468e80 configure: enable the gallium loader only when needed
With the gallium megadrivers we've converted most state trackers to
optionally use either statically linked or shared pipe-drivers.

The hardcoded switch forgot to conditionally enable the build of the
shared pipe-drivers, which resulted in them always being built.
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Cc: James Ausmus <james.ausmus@intel.com>
Reported-by: James Ausmus <james.ausmus@intel.com>
Tested-by: James Ausmus <james.ausmus@intel.com>
Bugzilla: https://code.google.com/p/chromium/issues/detail?id=412089
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-09 19:45:10 +01:00
Emil Velikov
6dcd5ae725 configure: inform the user when we're building sw/kms-dri
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-09 19:39:37 +01:00
Emil Velikov
2903289706 configure: kill off NEED_WINSYS_WRAPPER
Just drop the conditional and simplify our build. This means that
it'll build every time, but it does not require any dependencies nor
does it take that long to compile 200 lines of boilerplate code.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-09 19:39:37 +01:00
Emil Velikov
0d0313ce9b configure: kill off NEED_NONNULL_WINSYS
The variable was unused and gave false information. The need for nonnull
winsys currently does not relate as it used to. Nowadays one can mix and
match more freely with plenty of winsys' to make your head spin.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-09 19:39:36 +01:00
Emil Velikov
40bb6f9313 configure: bail out if building svga without libdrm
With a recent commit we removed the NEED_NONNULL_WINSYS checks when
selecting the hardware (inc svga) winsys. svga has only one winsys,
which explicitly requires libdrm (via its bundled version of
vmwgfx_drm.h), but configure.ac never really checks for it.

Add the check early to prevent people from shooting themselves in the
foot when they select the driver but lack libdrm.

$ ./autogen.sh --disable-dri --disable-egl --disable-gallium-llvm
--with-dri-drivers=swrast --with-gallium-drivers=svga,swrast

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82539
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-09 19:39:36 +01:00
Eric Anholt
2220692330 vc4: Fix segfaults when rendering with no color render target.
2014-09-09 07:29:16 -07:00
Eric Anholt
5774f16453 vc4: Fill out the stencil clear field.
The rest of stencil handling isn't done yet, but it documents an extra
cl_u8(0) and helps make it obvious why we don't need to format clear_depth
the same way the depth/stencil buffer is formatted.
2014-09-09 07:29:16 -07:00
Eric Anholt
fd6e4fccad vc4: Flip around the depth/stencil fields.
After implementing depth stores, it looks like this is the way things
actually are, according to hiz-depth-read-fbo-d24-s0's probes.
2014-09-09 07:29:16 -07:00
Eric Anholt
2cbecee4b7 vc4: Add support for loading/storing the depth buffer.
For now it still requires the color buffer to be present -- we're relying
on the store of color buffer contents to end the frame, and we have to do
something with color buffers in the rendering config packet.
2014-09-09 07:29:16 -07:00
Eric Anholt
1663a89374 vc4: Don't forget to do initial tile clearing for depth/stencil.
2014-09-09 07:29:16 -07:00
Eric Anholt
2cbdbeb4fa vc4: Ignore non-address bits of the offset for load/store.
These only get used for full buffer dumps, which we don't support yet
anyway.
2014-09-09 07:29:16 -07:00
Eric Anholt
a894898255 vc4: Add a debug flag for flushing after every draw.
It was useful on i965, but it's even more useful for debugging tiled
renderers.
2014-09-09 07:29:12 -07:00
Eric Anholt
840f381120 vc4: Add missing null terminator to the debug options list.
So far, apparently there's been some NULL lying at the address just after
the options anyway, but the next commit changed that.
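
Why the terminator matters is easiest to see with a sentinel-terminated table. The sketch below is generic: the struct, the option names and the flag values are hypothetical and are not vc4's actual debug options.

```c
#include <stddef.h>
#include <string.h>

struct debug_option {
   const char *name;
   unsigned flag;
};

/* The last entry is the sentinel.  Without it, the loop below would walk
 * past the end of the array and only stop if the following memory
 * happened to contain a NULL pointer. */
const struct debug_option debug_options[] = {
   { "cl",           0x1 },
   { "always_flush", 0x2 },
   { NULL,           0   },
};

/* Returns the flag for name, or 0 if the option is unknown. */
unsigned
lookup_debug_flag(const struct debug_option *opts, const char *name)
{
   for (; opts->name != NULL; opts++) {
      if (strcmp(opts->name, name) == 0)
         return opts->flag;
   }
   return 0;
}
```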
2014-09-09 07:28:12 -07:00
Tom Stellard
181581280b configure.ac: Fix build with git-svn llvm version string
Reviewed-and-tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2014-09-09 09:47:25 -04:00
Kalyan Kondapally
78c9201a5b Linking fails when not writing gl_Position.
According to the GLSL-ES specs (i.e. 1.0, 3.0), the value of gl_Position is
undefined after the vertex processing stage if we don't write gl_Position.
However, the GLSL 1.10 spec says that writing to gl_Position is mandatory.
In the case of GLSL-ES, it's not an error and at least the linking should pass.
Currently, Mesa throws a linker error if we don't write to gl_Position
and the version is less than 140 (GLSL) or 300 (GLSL-ES). This patch changes
it so that we don't report an error in the case of GLSL-ES.
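
Read literally, the rule after this patch reduces to the predicate below. This is a sketch of the condition only; the function name and parameters are hypothetical and the real check lives in the GLSL linker.

```c
#include <stdbool.h>

/* Returns true if a vertex shader that never writes gl_Position should
 * fail to link.  After the patch this applies only to desktop GLSL
 * below version 140; GLSL-ES shaders always link. */
bool
missing_position_is_link_error(bool is_es, unsigned glsl_version)
{
   return !is_es && glsl_version < 140;
}
```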

Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83380
2014-09-09 10:39:39 +03:00
Chia-I Wu
2a49a94079 ilo: remove unused ilo_cp functions
Remove

  ilo_cp_begin()
  ilo_cp_steal()
  ilo_cp_write()
  ilo_cp_write_multi()
  ilo_cp_write_bo()
  ilo_cp_end()
  ilo_cp_steal_ptr()
  ilo_cp_assert_no_implicit_flush()
2014-09-09 13:31:37 +08:00
Chia-I Wu
90f4b131fc ilo: convert GPE GEN6 command functions to use ilo_builder
Similar to the changes to GEN7 command functions, but to GEN6 this time.

As every GPE function has been converted, remove
ilo_cp_assert_no_implicit_flush() calls.
2014-09-09 13:31:37 +08:00
Chia-I Wu
80e29ae42c ilo: convert GPE GEN7 command functions to use ilo_builder
Make these changes

  ilo_cp_begin()    -> ilo_builder_batch_pointer()
  ilo_cp_write()    -> direct memory set
  ilo_cp_write_bo() -> ilo_builder_batch_reloc()

and use this chance to drop the "_emit_" infix.
2014-09-09 13:31:37 +08:00
Chia-I Wu
fff9869164 ilo: convert GPE state functions to use ilo_builder
Make these changes

  ilo_cp_steal_ptr() and memcpy() -> ilo_builder_state_write()
  ilo_cp_steal_ptr()              -> ilo_builder_state_pointer()

and use this chance to drop the "_emit_" infix.
2014-09-09 13:31:37 +08:00
Chia-I Wu
c81a973e04 ilo: convert GPE surface functions to use ilo_builder
Make these changes

  ilo_cp_steal_ptr() and memcpy()   -> ilo_builder_surface_write()
  ilo_cp_steal() and ilo_cp_write() -> ilo_builder_surface_write()
  ilo_cp_write_bo()                 -> ilo_builder_surface_reloc()

and use this chance to drop the "_emit_" infix.
2014-09-09 13:31:37 +08:00
Chia-I Wu
6cbd1f4bd3 ilo: convert BLT to use ilo_builder
Make these changes

  ilo_cp_begin()    -> ilo_builder_batch_pointer()
  ilo_cp_write()    -> direct memory set
  ilo_cp_write_bo() -> ilo_builder_batch_reloc()

and make sure there is no implicit flush.  Use this chance to drop the
"_emit_" infix.
2014-09-09 13:31:37 +08:00
Chia-I Wu
d2acd67313 ilo: use ilo_builder for kernels and STATE_BASE_ADDRESS
Remove instruction buffer management from ilo_3d and adapt ilo_shader_cache to
upload kernels to ilo_builder.  To be able to do that, we also let ilo_builder
manage STATE_BASE_ADDRESS.
2014-09-09 13:31:37 +08:00
Chia-I Wu
55f80a3290 ilo: make ilo_cp based on ilo_builder
This makes ilo_cp use the builder to manage batch buffers, and use
ilo_builder_decode() to replace ilo_3d_pipeline_dump().
2014-09-09 13:31:36 +08:00
Chia-I Wu
dab4a676f7 ilo: add a builder for building BOs for submission
Compared to how we manage batch and instruction buffers, the new builder

 - does not flush
 - manages both types of buffers
 - manages STATE_BASE_ADDRESS
 - uploads kernels using unsynchronized mapping
 - has its own decoder for the buffers
 - provides more helpers
2014-09-09 13:31:36 +08:00
Chia-I Wu
43bf14eaeb ilo: make toy_compiler_disassemble() more useful
Do not require a toy_compiler so that it can be used in other places, such as
state dumping.  Add a bool to control whether the raw instruction words are
shown.
2014-09-09 13:31:30 +08:00
Ilia Mirkin
4ea1565bbc nv50/ir: accommodate all file types, there are now more than 8
Reported by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-08 20:06:12 -04:00
Ilia Mirkin
5966903c28 nvc0/ir: uses was always null at that point in the code
Reported by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-08 20:06:12 -04:00
Ilia Mirkin
874a9396c5 nv50/ir: avoid array overrun when checking for supported mods
Reported by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-08 20:06:12 -04:00
Ilia Mirkin
64c5aeaa94 nouveau: buffer can never be null
Reported by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-08 20:06:11 -04:00
Ilia Mirkin
1792d60900 nvc0/ir: insn can never be null
Reported by Coverity.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-08 20:06:11 -04:00
Ilia Mirkin
9ced42b1aa nvc0: size is a uint16_t, remove unnecessary assertion
Reported by Coverity.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-08 20:06:11 -04:00
Ilia Mirkin
564e305094 nvc0: avoid null deref of screen when collecting stats
Reported by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-08 20:06:11 -04:00
Ilia Mirkin
c02ac40837 nvc0: use 64-bit math when scaling the query results
Reported by Coverity.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-08 20:06:11 -04:00
Roland Scheidegger
08f13ff439 gallivm: (trivial) don't try to use rcp when the division 1/x is integer
This would just crash. Noticed by accident while checking int divisions by zero
with a quickly hacked piglit test.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-09-09 01:44:08 +02:00
Roland Scheidegger
51b52ea013 docs: (trivial) mark softpipe, llvmpipe as done for GL_ARB_base_instance
Forgot to add it when I fixed up the start instance handling in (llvm) draw.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-09-09 01:44:07 +02:00
Roland Scheidegger
9405e15f51 gallivm: (trivial) fix min / max variable names
Calling the variable min when it's really max and vice versa seems a bit
confusing.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-09-09 01:44:05 +02:00
Kenneth Graunke
a20cc2796f i965: Handle ir_binop_ubo_load in boolean expression code.
UBO loads can be boolean-valued expressions, too, so we need to handle
them in emit_bool_to_cond_code() and emit_if_gen6().

However, unlike most expressions, it doesn't make sense to evaluate
their operands, then do something with the results.  We just want to
evaluate the UBO load as a whole---which performs the read from
memory---then load the boolean result into the flag register.

Instead of adding code to handle it, we can simply bypass the
ir_expression handling, and fall through to the default code, which will
do exactly that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83468
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2014-09-08 15:43:52 -07:00
Kenneth Graunke
b9699e09bc i965/fs: Make emit_if_gen6 never fall back to emit_bool_to_cond_code.
Matt and I believe that Sandybridge actually uses 0xFFFFFFFF for a
"true" comparison result, similar to Ivybridge.  This matches the
internal documentation, and empirical results, but contradicts the PRM.

So, the comment is inaccurate, and we can actually just handle these
directly without ever needing to fall through to the condition code
path.

Also, the vec4 backend has always done it this way, and has apparently
been working fine.  This patch makes the FS backend match the vec4
backend's behavior.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-08 15:43:51 -07:00
Kenneth Graunke
6272e60ca3 i965: Handle ir_triop_csel in emit_if_gen6().
ir_triop_csel can return a boolean expression, so we need to handle it
here; we simply forgot when we added ir_triop_csel, and forgot again
when adding it to emit_bool_to_cond_code.

Fixes Piglit's EXT_shader_integer_mix/{vs,fs}-mix-if-bool on Sandybridge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2014-09-08 15:43:49 -07:00
Christian König
12fb74fe89 mesa/st: don't advertise NV_vdpau_interop if it doesn't work.
As long as we don't have a workaround for frame based
decoding in VDPAU we should not advertise NV_vdpau_interop.

v2: fix commit message, check if get_video_param is present

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2014-09-08 16:53:39 +02:00
Brian Paul
a3306f028e docs: add news link to 10.2.7 release notes 2014-09-08 08:08:46 -06:00
Jordan Justen
dc0bd799ca i965/fs: Remove direct fs_visitor gl_fragment_program dependence
Instead we cast backend_visitor::prog for fragment shader specific code paths.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-06 11:17:53 -07:00
Ulrich Weigand
0feb977bbf gallivm: Fix Altivec pack intrinsics for little-endian
This patch fixes use of Altivec pack intrinsics on little-endian PowerPC
systems.  Since little-endian operation only affects the load and store
instructions, the semantics of pack (and other) instructions that take
two input vectors implicitly change: the pack instructions still fill
a register placing values from the first operand into the "high" parts
of the register, and values from the second operand into the "low" parts
of the register, but since vector loads and stores perform an endian swap,
the high parts end up at high memory addresses.

To still achieve the desired effect, we have to swap the two inputs to
the pack instruction on little-endian systems.  This is done automatically
by the back-end for instructions generated by LLVM, but needs to be done
manually when emitting intrinsics (which still result in that instruction
being emitted directly).

Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Signed-off-by: Maarten Lankhorst <dev@mblankhorst.nl>
2014-09-06 15:51:58 +02:00
Jordan Justen
1f184bc114 i965/fs: Remove direct fs_generator brw_wm_prog_key dependence
Instead we store a void pointer to the key, and cast it to
brw_wm_prog_key for fragment shader specific code paths.
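
The pattern can be sketched in C as follows. The names here (struct generator,
struct fragment_key, flat_shade) are hypothetical stand-ins rather than the
i965 types, and the sketch assumes a stage field is available to decide when
the cast is valid.

```c
#include <assert.h>
#include <stddef.h>

enum shader_stage { STAGE_VERTEX, STAGE_FRAGMENT };

/* Hypothetical fragment-shader key; only meaningful for STAGE_FRAGMENT. */
struct fragment_key {
    int flat_shade;
};

struct generator {
    enum shader_stage stage;
    const void *key;  /* stage-specific; interpret according to 'stage' */
};

/* Fragment-specific code path: cast the opaque key only after checking the
 * stage, and fall back to a neutral value for every other stage. */
static int generator_flat_shade(const struct generator *gen)
{
    const struct fragment_key *key;

    assert(gen != NULL && gen->key != NULL);
    if (gen->stage != STAGE_FRAGMENT)
        return 0;

    key = gen->key;
    return key->flat_shade;
}
```

Keeping the pointer opaque lets the same generator structure serve other
stages without pulling fragment-only types into its declaration.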

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-05 22:15:06 -07:00
Jordan Justen
c43ae405aa i965/fs: Remove direct fs_generator brw_wm_prog_data dependence
Instead we store a brw_stage_prog_data pointer, and cast it to
brw_wm_prog_data for fragment shader specific code paths.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-05 22:15:06 -07:00
Jordan Justen
f96a02c7ca i965/fs: Don't store gl_fragment_program* in fs_generator
gl_program* is named prog similar to backend_visitor.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-05 22:15:06 -07:00
Jordan Justen
936ca6f3cf i965: Add uses_kill to brw_wm_prog_data
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-05 22:15:06 -07:00
Jordan Justen
d0e166752a i965/fs: Rename fs_generator::prog to shader_prog
This matches backend_visitor, and will allow gl_program to be named prog.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-05 22:15:06 -07:00
Jordan Justen
000a9ee1ba i965/fs: Add stage variable to fs_generator
This will allow for stage specific code paths.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-05 22:15:06 -07:00
Kristian Høgsberg
2d6d3461d3 i965: Adjust fast-clear resolve rect for BDW
The scale factors for the resolve rectangle change for BDW and we have
to look at brw->gen now to figure out how big it should be.

Fixes: https://bugs.freedesktop.org/attachment.cgi?id=105777
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-05 20:47:03 -07:00
Christoph Bumiller
ca9ab05d45 nvc0/ir: clarify recursion fix to finding first tex uses
This is a simple shader for reproducing the case mentioned:

FRAG
DCL IN[0], GENERIC[0], PERSPECTIVE
DCL OUT[0], COLOR
DCL SAMP[0]
DCL CONST[0]
DCL TEMP[0..1], LOCAL
IMM[0] FLT32 {    0.0000,    -1.0000,     1.0000,     0.0000}
  0: MOV TEMP[0].x, CONST[0].wwww
  1: MOV TEMP[1].x, CONST[0].wwww
  2: BGNLOOP
  3:   IF TEMP[0].xxxx
  4:     BRK
  5:   ENDIF
  6:   ADD TEMP[0].x, TEMP[0], IMM[0].zzzz
  7:   IF CONST[0].xxxx
  8:     TEX TEMP[1].x, CONST[0], SAMP[0], 2D
  9:   ENDIF
 10:   IF CONST[0].zzzz
 11:     MOV TEMP[1].x, CONST[0].zzzz
 12:   ENDIF
 13: ENDLOOP
 14: MOV OUT[0], TEMP[1].xxxx
 15: END

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-05 23:08:24 -04:00
Christoph Bumiller
b9f9e3ce03 nv50/ir/util: fix BitSet issues
BitSet::allocate() is being used with the expectation that it would
leave the bitfield untouched if its size hasn't changed, however,
the function always zeroed the last word, which led to obscure bugs
with live set computation.

This also fixes BitSet::resize(), which was broken, but luckily not
being used.
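
The expectation callers have of allocate() can be sketched in C. This is not
the nv50 BitSet code: struct bitset and bitset_allocate() are hypothetical, and
for simplicity the sketch discards old contents whenever the size changes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct bitset {
    uint32_t *data;
    unsigned size;  /* number of bits */
};

static size_t words_for(unsigned nbits)
{
    return ((size_t)nbits + 31) / 32;
}

/* Returns 1 on success, 0 on allocation failure.  If the size is unchanged
 * and 'zero' is not set, the existing words are left untouched, including
 * the last one. */
static int bitset_allocate(struct bitset *bs, unsigned nbits, int zero)
{
    uint32_t *fresh;

    assert(bs != NULL && nbits > 0);

    if (bs->data != NULL && bs->size == nbits) {
        if (zero)
            memset(bs->data, 0, words_for(nbits) * sizeof(*bs->data));
        return 1;
    }

    /* Size changed: start from a clean, zeroed buffer. */
    fresh = calloc(words_for(nbits), sizeof(*fresh));
    if (fresh == NULL)
        return 0;

    free(bs->data);
    bs->data = fresh;
    bs->size = nbits;
    return 1;
}
```

A caller that re-allocates a live set at the same size each iteration then
keeps the bits it already computed instead of silently losing the tail word.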

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-05 23:05:42 -04:00
Ilia Mirkin
a71380040c nvc0: remove nvc0_push, replaced with nvc0_vbo_translate
Fixes build.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-05 23:00:27 -04:00
Ilia Mirkin
12311c7c52 nv50,nvc0: get rid of draw module support
This hasn't been enabled in a long time and is completely stale and
unnecessary. Remove, esp since it doesn't build.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-05 23:00:27 -04:00
Jason Ekstrand
ecf6c26757 i965/fs: Don't look at virtual_grf_sizes for uniforms
Uniform values are in the UNIFORM register file, not the GRF register file.
Looking in virtual_grf_sizes makes no sense and only makes the output of
dump_instructions confusing.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-05 17:33:17 -07:00
Dave Airlie
291ae622fd loader: fds can be 0
Possible resource leak reported by coverity.

Reported-by: Coverity scanner.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-06 10:24:25 +10:00
Emil Velikov
196e949cf7 docs: Import 10.2.7 release notes, add news item.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-06 01:18:45 +01:00
Emil Velikov
2c69c9fdcb gallium/vc4: ship all files in the tarball
- include all headers in Makefile.sources

Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:27 +01:00
Emil Velikov
ec9d8060e4 gallium/trace: ship all files in the tarball
- include all headers in Makefile.sources
 - bundle the scons buildscript, README and trace.xsl

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:27 +01:00
Emil Velikov
7134043837 gallium/svga: ship all files in the tarball
- include all headers in Makefile.sources
 - sort the list(s)
 - bundle the android & scons buildscript
 - include the headers' README & svga_dump.py

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:27 +01:00
Emil Velikov
f7008a6c5e gallium/softpipe: ship all files in the tarball
- include all headers in Makefile.sources
 - sort the list(s)
 - bundle the android & scons buildscript

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:27 +01:00
Emil Velikov
858d932d6a gallium/rbug: ship all files in the tarball
- include all headers in Makefile.sources
 - sort the list(s)
 - bundle the android buildscript & README

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:26 +01:00
Emil Velikov
36b5012a8d gallium/radeonsi: ship all files in the tarball
- include all headers in Makefile.sources
 - bundle the android buildscript

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:26 +01:00
Emil Velikov
8b48e14a48 gallium/radeon: ship all files in the tarball
- include all headers in Makefile.sources
 - sort the list(s)
 - bundle the android buildscript & LLVM note

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:26 +01:00
Emil Velikov
27d4f2eae3 gallium/r600: ship all files in the tarball
- include all headers in Makefile.sources
 - sort the list(s)
 - bundle the android buildscript & custom include

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:26 +01:00
Emil Velikov
cdd3a34096 gallium/r300: ship all files in the tarball
- include all headers in Makefile.sources
 - sort the list(s)
 - bundle the android buildscript & the tests

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:26 +01:00
Emil Velikov
2ba31a5185 gallium/nouveau: ship all files in the tarball
- include all headers in Makefile.sources
 - sort the list(s)
 - bundle the android buildscript

v2: Don't double-include the compiler sources.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:26 +01:00
Emil Velikov
0cba104921 gallium/noop: ship all files in the tarball
- include all headers in Makefile.sources
 - bundle the scons buildscript

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:26 +01:00
Emil Velikov
48d251cebb gallium/llvmpipe: ship all files in the tarball
- include all headers in Makefile.sources
 - sort the list(s)
 - bundle the scons buildscript

v2: Don't double include the test sources.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:25 +01:00
Emil Velikov
a408b75849 gallium/identity: ship all files in the tarball
- include all headers in Makefile.sources
 - sort the list(s)
 - bundle the scons buildscript

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:25 +01:00
Emil Velikov
930afeaa54 gallium/ilo: ship all files in the tarball
- include all headers in Makefile.sources
 - bundle the android buildscript

Cc: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:25 +01:00
Emil Velikov
38719795a6 gallium/i915: ship all files in the tarball
- include all headers in Makefile.sources
 - sort the list(s)
 - bundle the android buildscript & TODO

Cc: Stephane Marchesin <stephane.marchesin@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:25 +01:00
Emil Velikov
8928788d58 gallium/galahad: ship all files in the tarball
- include all headers in Makefile.sources
 - sort the list(s)
 - bundle the scons buildscript

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:25 +01:00
Emil Velikov
0ea9569d8f gallium/freedreno: ship all files in the tarball
- include all headers in Makefile.sources
 - sort the list(s)
 - bundle the android build

Cc: freedreno@lists.freedesktop.org
Cc: Rob Clark <robclark@freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:25 +01:00
Emil Velikov
525c48a316 gallium/tools: pick up the tools for distribution
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:25 +01:00
Emil Velikov
c6948da666 gallium/tests: ship all the tests in the release tarball
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:24 +01:00
Emil Velikov
13a5adc1b7 st/vega: ship the final headers
Commit 60d772cd9d1(st/vega: add headers and SConscript in
the tarball) meant to pick all the headers to be included in
the release tarball yet it missed a few.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:24 +01:00
Emil Velikov
cd2e62a2f3 st/egl: include the remaining files in the tarball
A few files were missing, namely:
 - a few of the common headers
 - the android + gdi sources

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:24 +01:00
Emil Velikov
96fb492583 st/glx/xlib: ship the SConscript in the release tarball
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:24 +01:00
Emil Velikov
fc69d1141b st/dri: ship the scons buildscript in the release tarball
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:24 +01:00
Emil Velikov
3d3d9c3617 st/clover: ship Doxyfile in the release tarball
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:24 +01:00
Emil Velikov
cf0c4d6d63 gallium: ship state-tracker/README in the release tarball
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:23 +01:00
Emil Velikov
c553b6e2df gallium: ship the non-automaked state-trackers & targets
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:23 +01:00
Emil Velikov
0fd45d3079 winsys/intel: drop intel_winsys.h from makefile.sources
With the last revisions of commit 664c2d76947(gallium/ilo: cleanup
intel_winsys.h) we moved the header from winsys to drivers, but we
forgot to update the makefile.sources to reflect this.

Cc: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-09-05 23:46:23 +01:00
Anuj Phogat
d09167a39f meta: Store precompiled msaa shaders for all supported sample counts
Currently, BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE* and
BLIT_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_RESOLVE* shaders
in setup_glsl_msaa_blit_shader() are not recompiled
when the source buffer sample count changes. For example,
the implementation keeps using a 4x MSAA shader even if the
source buffer changes from 4x MSAA to 8x MSAA, which causes
incorrect rendering.

This patch adds new enums in blit_msaa_shader, one for
each supported sample count, and uses them to store
msaa shaders.
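
The idea of keeping one shader per sample count can be sketched in C. This is
a simplified model, not the meta code: the cache structure, the slot layout
(indexed by log2 of the sample count) and the fake "compile" step are all
assumptions made for illustration.

```c
#include <assert.h>

enum { MAX_SAMPLE_LOG2 = 4 };  /* slots for 1x, 2x, 4x, 8x, 16x (assumed set) */

struct resolve_shader_cache {
    unsigned program[MAX_SAMPLE_LOG2 + 1];  /* 0 means "not compiled yet" */
    unsigned compile_count;                 /* how many compiles happened */
};

/* Map a power-of-two sample count to its cache slot. */
static unsigned sample_slot(unsigned samples)
{
    unsigned slot = 0;

    assert(samples != 0 && (samples & (samples - 1)) == 0);
    while (samples > 1) {
        samples >>= 1;
        slot++;
    }
    assert(slot <= MAX_SAMPLE_LOG2);
    return slot;
}

/* Return the shader for this sample count, compiling it on first use.  A
 * cache with a single slot would hand back the 4x shader for an 8x source. */
static unsigned get_resolve_shader(struct resolve_shader_cache *cache,
                                   unsigned samples)
{
    unsigned slot = sample_slot(samples);

    if (cache->program[slot] == 0) {
        cache->program[slot] = 100 + samples;  /* stand-in for a compile */
        cache->compile_count++;
    }
    return cache->program[slot];
}
```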

Fixes following piglit tests on Broadwell:
ext_framebuffer_multisample-accuracy all_samples color
ext_framebuffer_multisample-accuracy all_samples depth_draw
ext_framebuffer_multisample-accuracy all_samples depth_resolve
ext_framebuffer_multisample-accuracy all_samples stencil_draw
ext_framebuffer_multisample-accuracy all_samples stencil_resolve
ext_framebuffer_multisample-formats all_samples

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstarnd <jason@jlekstrand.net>
2014-09-05 15:40:37 -07:00
Emil Velikov
0b76c51728 configure: check for core xcb and link the VL targets against it
Make sure to check the presence of the module in order to pick the
correct libs flag and before feeding them to the compiler/linker.

Currently libXvMC*, libvdpau* and libomx_mesa depend unconditionally
upon xcb, due to their usage of the aux/vl gallium module.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2014-09-05 23:18:00 +01:00
Emil Velikov
17798bfb47 configure: check for core xcb and link libEGL against it
Make sure to check the presence of the module in order to pick the
correct libs flag and before feeding them to the compiler/linker.

Currently libEGL depends conditionally (when building with x11 platform)
upon xcb.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-05 23:17:59 +01:00
Emil Velikov
da029f8081 configure: check for core xcb and link libGL against it
Make sure to check the presence of the module in order to pick the
correct libs flag and before feeding them to the compiler/linker.

Currently libGL depends conditionally (when building with dri3) upon
xcb 1.9.3 and unconditionally on ancient xcb functions -
xcb_generate_id and xcb_request_check amongst others.

v2: Use PKG_CHECK_EXISTS() when checking for dri3 xcb.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80848
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-05 23:17:49 +01:00
Jason Ekstrand
7599886b26 i965/blorp: Pass image formats separately from the miptree
When a texture is wrapped in a texture view, we can't trust the format in
the miptree itself.  This patch allows us to pass the format separately
through blorp so we can properly handle wrapped textures.

It's worth noting here that we can use the miptree format directly for
depth/stencil formats because they cannot be reinterpreted by a texture
view.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
CC: "10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-09-05 10:45:27 -07:00
Matt Turner
87472ae58c i965/fs: Brown bag fix. 2014-09-05 10:29:38 -07:00
Matt Turner
e8df6a6b32 i965/vec4: Add ability to reswizzle arbitrary swizzles.
Before commit 04895f5c we would only reswizzle dot product instructions
(since they wrote the same value into all channels, and we didn't have
to think about anything else). That commit extended reswizzling to cases
when the swizzle was single valued -- i.e., writing the same result into
all channels.

But allowing reswizzling of arbitrary things is actually really easy and
is even less code. (Why didn't we do this in the first place?!)
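
One building block such a pass needs is composing two swizzles. The C sketch
below shows only that composition; it is not the i965 implementation, and
struct swizzle and swizzle_compose() are hypothetical names.

```c
#include <assert.h>

/* A swizzle as four channel indices (0 = x, 1 = y, 2 = z, 3 = w). */
struct swizzle {
    unsigned char chan[4];
};

/* Swizzle equivalent to applying 'inner' first and 'outer' to the result:
 * if t = v.inner and u = t.outer, then u[i] = v[inner[outer[i]]]. */
static struct swizzle swizzle_compose(struct swizzle inner,
                                      struct swizzle outer)
{
    struct swizzle result;

    for (int i = 0; i < 4; i++) {
        assert(outer.chan[i] < 4);
        assert(inner.chan[outer.chan[i]] < 4);
        result.chan[i] = inner.chan[outer.chan[i]];
    }
    return result;
}
```

For example, composing inner = .yzwx with outer = .wzyx gives .xwzy. Nothing in
the composition depends on the swizzle being single valued, which is consistent
with the general case needing no extra code.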

total instructions in shared programs: 4266079 -> 4261000 (-0.12%)
instructions in affected programs:     351933 -> 346854 (-1.44%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-05 10:22:06 -07:00
Matt Turner
1ee1d8ab46 i965/vec4: Reswizzle sources when necessary.
Despite the comment above the function claiming otherwise, the function
did not reswizzle sources, which would lead to bad code generation since
commit 04895f5c, which began claiming we could do such swizzling when we
could not.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82932
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-05 10:22:06 -07:00
Jason Ekstrand
e49cfe9bfc i965/fs: Clean up emitting of untyped atomic and surface reads
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-05 10:04:06 -07:00
Matt Turner
ef8477cddf i965/fs: Fix basic block tracking in try_rep_send().
The 'start' instruction is always in the current block, except for the
case of shader time, which emits code in a pattern seen nowhere else.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-05 09:53:21 -07:00
Matt Turner
248eaff63d i965/fs: Pass block to insert and remove functions missed earlier.
Otherwise, the basic block start/end IPs don't get updated properly,
leading to a broken CFG.  This usually results in the following
assertion failure:

brw_fs_live_variables.cpp:141:
void brw::fs_live_variables::setup_def_use():
Assertion `ip == block->start_ip' failed.

Fixes KWin, WebGL demos, and a score of Piglit tests on Sandybridge and
earlier hardware.
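
The bookkeeping behind that assertion can be sketched in C. This is a
simplified model with hypothetical names (struct block, note_insert(),
ips_consistent()), not the i965 CFG code.

```c
#include <assert.h>

/* A basic block covering instructions start_ip..end_ip, inclusive. */
struct block {
    int start_ip;
    int end_ip;
};

/* Record that one instruction was inserted into blocks[target]: that block
 * grows by one, and every later block shifts by one. */
static void note_insert(struct block *blocks, int num_blocks, int target)
{
    assert(target >= 0 && target < num_blocks);

    blocks[target].end_ip++;
    for (int i = target + 1; i < num_blocks; i++) {
        blocks[i].start_ip++;
        blocks[i].end_ip++;
    }
}

/* The invariant a pass like setup_def_use() relies on: walking the blocks
 * in order, each one starts right after the previous one ends. */
static int ips_consistent(const struct block *blocks, int num_blocks)
{
    int ip = 0;

    for (int i = 0; i < num_blocks; i++) {
        if (blocks[i].start_ip != ip)
            return 0;
        ip = blocks[i].end_ip + 1;
    }
    return 1;
}
```

If an insertion is made without telling the CFG which block it belongs to,
the second function is the check that ends up failing.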

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-05 09:52:50 -07:00
Kenneth Graunke
6ff5bb2465 i965: Mark cfg dumping functions const.
The dump() methods don't alter the CFG or basic blocks, so we should
mark them as const.  This lets you call them even if you have a const
cfg_t - which is the case in certain portions of the code (such as live
interval handling).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-05 09:52:38 -07:00
Matt Turner
88d673bde6 i965: Update if_block/else_block in the dead control flow pass.
I think this bug crept in only recently.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-05 09:52:29 -07:00
Matt Turner
3e248e0418 i965/fs: Connect cfg properly in predicated break peephole.
If the ENDIF instruction was the only instruction in its block, we'd
leave the successors of the merged if+jump block in a bad state.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83080
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-05 09:08:59 -07:00
Marek Olšák
1a00f24751 st/mesa: use 1.0f as boolean true on drivers without integer support
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82882

Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-05 15:41:47 +02:00
Marek Olšák
d67db73458 mesa: set UniformBooleanTrue = 1.0f by default
because NativeIntegers is 0 by default.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82882

Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-05 15:41:47 +02:00
Jonathan Gray
635477dc4b automake: check if the linker supports --dynamic-list
Older versions of GNU ld did not support --dynamic-list, so check whether
it is supported before using it.  Non-GNU linkers such as the Apple one
likely lack this option as well.

Fixes the build on OpenBSD which has binutils 2.15 and 2.17.
The --dynamic-list option seems to have been introduced sometime after
binutils 2.17 was released as it is present in 2.18.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-05 14:20:42 +01:00
Jonathan Gray
d3dee3df97 st/xvmc/tests: avoid non portable error.h functions
To improve compatibility with OpenBSD.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-05 14:17:34 +01:00
Andreas Pokorny
8bcd57a46c kms-swrast: Support Prime fd handling
Allows using prime fds as display target and from display target.
Test for PRIME capability after initializing kms_swrast screen.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Andreas Pokorny <andreas.pokorny@canonical.com>
2014-09-05 14:14:37 +01:00
Michel Dänzer
76b906c9f6 configure.ac: Add AC_SYS_LARGEFILE
Making sure large file support is enabled across the tree even on 32-bit
systems.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-05 18:08:59 +09:00
Francisco Jerez
b4539274b6 clover/util: Null-terminate the result of compat::string::c_str().
Reported-by: EdB <edb+mesa@sigluy.net>
2014-09-05 09:27:20 +03:00
Francisco Jerez
923c72982e clover/util: Implement compat::string using aggregation instead of inheritance. 2014-09-05 09:27:20 +03:00
Francisco Jerez
7c1e6d582c clover/util: Have compat::vector track separate size and capacity.
In order to make the behaviour of resize() and reserve() closer to the
standard.
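
The distinction between size and capacity can be sketched in C. This is an
illustration of the standard-like behaviour, not the clover compat::vector
code; struct int_vec and both functions are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct int_vec {
    int *data;
    size_t size;      /* number of elements in use */
    size_t capacity;  /* number of elements allocated */
};

/* Grow the allocation if needed; never changes 'size'.  Returns 1 on
 * success, 0 on allocation failure. */
static int vec_reserve(struct int_vec *v, size_t n)
{
    int *p;

    assert(v != NULL);
    if (n <= v->capacity)
        return 1;

    p = realloc(v->data, n * sizeof(*p));
    if (p == NULL)
        return 0;

    v->data = p;
    v->capacity = n;
    return 1;
}

/* Change the element count, zero-filling new elements.  Shrinking keeps the
 * allocation, so capacity is unchanged. */
static int vec_resize(struct int_vec *v, size_t n)
{
    if (!vec_reserve(v, n))
        return 0;

    if (n > v->size)
        memset(v->data + v->size, 0, (n - v->size) * sizeof(*v->data));
    v->size = n;
    return 1;
}
```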

Reported-by: EdB <edb+mesa@sigluy.net>
2014-09-05 09:27:20 +03:00
Francisco Jerez
995f7b37da clover: Use conversion operator to initialize build log from compat::string.
Fixes binary garbage in the compilation logs caused by
compat::string::c_str() not being null-terminated (which is a bug on
its own that will be fixed in another commit).

Reported-by: EdB <edb+mesa@sigluy.net>
2014-09-05 09:27:20 +03:00
Jordan Justen
864c463485 Revert 5 i965 patches: 8e27a4d2, 373143ed, c5bdf9be, 6f56e142, 88e3d404
Reverts
* "i965: Modify state upload to allow 2 different sets of state atoms."
   8e27a4d2b3
* "i965: Modify dirty bit handling to support 2 pipelines."
   373143ed91
* "i965: Create a macro for checking a dirty bit."
   c5bdf9be1e
   Conflicts:
	src/mesa/drivers/dri/i965/brw_context.h
* "i965: Create a macro for setting all dirty bits."
   6f56e1424d
   Conflicts:
	src/mesa/drivers/dri/i965/brw_blorp.cpp
	src/mesa/drivers/dri/i965/brw_state_cache.c
	src/mesa/drivers/dri/i965/brw_state_upload.c
* "i965: Create a macro for setting a dirty bit."
   88e3d404da

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-04 23:06:27 -07:00
Rob Clark
5d8f40a53a freedreno/ir3: fix constlen with relative addressing
We can't rely on the value from the assembler if relative addressing is
used.  So instead use the max of declared-consts (which does not include
compiler immediates) and what we get from the assembler (which does).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-04 22:28:50 -04:00
Rob Clark
73ff4c5f70 freedreno/ir3: fix error in bail logic
all_delayed will also be true if we didn't attempt to schedule anything
due to no more instructions using current addr/pred.  We rely on coming
in to block_sched_undelayed() to detect and clean up when there are no
more uses of the current addr/pred, which isn't necessarily an error.

This fixes a regression introduced in b823abed.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-04 22:28:50 -04:00
Rob Clark
08ee0488e6 freedreno/ir3: bit of debug
Make it easier to figure out which compiler stage failed.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-04 22:28:50 -04:00
Eric Anholt
4bca922878 vc4: Merge qcompile and tgsi_to_qir
The split between these two didn't make much sense.  I'm going to want the
chance to look at uniform contents in optimization passes, and the QPU
emit I think is going to end up rewriting the uniforms stream.
2014-09-04 17:00:54 -07:00
Jordan Justen
23e20f4687 i965/fs: Use prog rather than fp->Base in fs_visitor
Reduce fs_visitor's dependence on gl_fragment_program.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-04 11:46:42 -07:00
Jordan Justen
a346870ba8 i965/fs: Use stage_prog_data instead of prog_data->base in fs_visitor
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-04 11:46:42 -07:00
Jordan Justen
246211d366 i965/fs: Add init function to fs_visitor
This common init routine can be used by constructors for multiple program
types.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-04 11:46:42 -07:00
Eric Anholt
55d2a16262 vc4: Add a CSE optimization pass.
Debugging a regression in discard support was just too full of duplicate
instructions, so I decided to remove them instead of re-analyzing each of
them as I dumped their outputs in simulation.
2014-09-04 11:39:51 -07:00
Eric Anholt
80b27ca2cd vc4: Switch to using native integers.
There were troubles with bools without using native integers
(st_glsl_to_tgsi seemed to think bool true was 1.0f sometimes, when as a
uniform it's stored as ~0), and since I've got native integers other than
divide, I might as well just support them.
2014-09-04 11:39:51 -07:00
Eric Anholt
874dfa8b2e vc4: Expose compares at a lower level in QIR.
Before, we had some special opcodes like CMP and SNE that emitted multiple
instructions.  Now, we reduce those operations significantly, giving
optimization more to look at for reducing redundant operations.

The downside is that QOP_SF is pretty special -- we're going to have to
track it separately when we're doing instruction scheduling, and we want
to peephole it into the instruction generating the destination write in
most cases (and not allocate the destination reg, probably.  Unless it's
used for some other purpose, as well).
2014-09-04 11:39:51 -07:00
Eric Anholt
3972a6f057 vc4: Stop being so clever in CMP handling.
This kind of cleverness should be in a general merging-of-ADD-and-MUL
instruction scheduler, rather than individual opcodes.
2014-09-04 11:39:51 -07:00
Eric Anholt
511d2f9a13 state_tracker: Fix bug in conditional discards with native ints.
A bool is 0 or ~0, and KILL_IF takes a float arg that's <0 for discard or
>= 0 for not.  By negating it, we ended up doing a floating point subtract
of (0 - ~0), which ended up as an inf.  To make this actually work, we
need to convert the bool to a float.
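
The conversion can be sketched in C. This models the semantics described above
rather than the TGSI the state tracker emits; the function names are
hypothetical, and mapping true to 1.0f is one possible choice of float value.

```c
#include <assert.h>
#include <stdint.h>

/* With native integers a bool is 0 or ~0.  Convert it to a float first
 * instead of doing float arithmetic on the raw bits. */
static float bool_to_float(uint32_t b)
{
    assert(b == 0u || b == UINT32_MAX);
    return b ? 1.0f : 0.0f;
}

/* KILL_IF-style argument: negative means discard, so negate the float form
 * of the condition. */
static float kill_if_arg(uint32_t cond)
{
    return -bool_to_float(cond);
}

/* Model of the discard test applied to that argument. */
static int discards(float arg)
{
    return arg < 0.0f;
}
```

A true condition becomes -1.0f and discards; a false one becomes -0.0f, which
is not less than zero, so the fragment is kept.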

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-09-04 11:39:50 -07:00
Brian Paul
e69b4abc43 swrast: s/INLINE/inline/
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-04 12:17:44 -06:00
Brian Paul
0f255fd26b osmesa: s/INLINE/inline/
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-04 12:17:44 -06:00
Brian Paul
27727b8479 xlib: s/INLINE/inline/
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-04 12:17:44 -06:00
Brian Paul
c4a0be73ea meta: s/INLINE/inline/
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-04 12:17:43 -06:00
Brian Paul
44df6df05b mesa: s/INLINE/inline/
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-04 12:17:40 -06:00
Marek Olšák
3dbf55c1be r600g,radeonsi: make sure there's enough CS space before resuming queries
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83432

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-04 16:15:21 +02:00
Marek Olšák
374f3e9e19 mesa: invalidate draw state in glPopClientAttrib
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82538

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-09-04 16:09:56 +02:00
Marek Olšák
8bd6723179 Revert "r600g,radeonsi: initialize HTILE to fully-expanded state"
This reverts commit f05fe294e7.

Apparently the hw doesn't like this. Revert to the "cleared" state.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83418
2014-09-04 15:48:38 +02:00
Thomas Hellstrom
2d6206140a winsys/svga: Fix incorrect type usage in IOCTL v2
While similar in layout, the size of the SVGA3dSize type may be smaller than
the struct drm_vmw_size type that is part of the ioctl interface. The kernel
driver could accordingly overwrite a memory area following the size variable
on the stack. Typically that would be another local variable, causing
breakage in, for example, ubuntu 12.04.5 where the handle local variable
becomes overwritten.

v2: Fix whitespace errors

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Cc: "10.1 10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-04 14:31:52 +02:00
Timothy Arceri
504f5f9d1a glapi: Add KHR_debug functions to check_table test
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
2014-09-04 12:29:14 +10:00
Carl Worth
ecc89e4e42 egl: Restrict multiplication in calloc arguments to use compile-time constants
As explained in the previous commit, we want to avoid the possibility of
integer-multiplication overflow while allocating buffers.

In these two cases, the final allocation size is the product of three values:
one variable and two that are fixed constants at compile time.

In this commit, we move the explicit multiplication to involve only the
compile-time constants, preventing any overflow from that multiplication, (and
allowing calloc to catch any potential overflow from the remaining implicit
multiplication).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-03 18:37:02 -07:00
Carl Worth
c35f14f368 Eliminate several cases of multiplication in arguments to calloc
In commit 32f2fd1c5d, several calls to
_mesa_calloc(x) were replaced with calls to calloc(1, x). This is strictly
equivalent to what the code was doing previously.

But for cases where "x" involves multiplication, now that we are explicitly
using the two-argument calloc, we can do one step better and replace:

	calloc(1, A * B);

with:

	calloc(A, B);

The advantage of the latter is that calloc will detect any overflow that would
have resulted from the multiplication and will fail the allocation, (whereas
the former would return a small allocation). So this fix can change
potentially exploitable buffer overruns into segmentation faults.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-03 18:37:02 -07:00
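Note on c35f14f368 above: a small C sketch of the difference between the two call shapes. The helper names are hypothetical. The first function is the kind of overflow check a two-argument allocator can make on calloc(A, B); the second shows what calloc(1, A * B) is actually asked for, since unsigned multiplication wraps silently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a * b in *out, or return false if the
 * product does not fit in size_t. */
static bool size_mul_checked(size_t a, size_t b, size_t *out)
{
    assert(out != NULL);
    if (b != 0 && a > SIZE_MAX / b)
        return false;
    *out = a * b;
    return true;
}

/* The byte count that reaches the allocator when the caller multiplies
 * first: the product is reduced modulo SIZE_MAX + 1. */
static size_t size_mul_wrapping(size_t a, size_t b)
{
    return a * b;
}
```

For A = SIZE_MAX / 2 + 1 and B = 2, the checked version reports failure, while the wrapping version yields 0, which is the "small allocation" the commit message warns about.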
Kenneth Graunke
96ce065db4 glsl: Report progress from opt_copy_propagation_elements().
It's been altering the tree and reporting "false" since January 2011.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-03 17:26:06 -07:00
Kenneth Graunke
702b6ea051 glsl: Skip rewriting instructions in opt_cpe when unnecessary.
Previously, opt_copy_propagation_elements would always rewrite the
instruction stream, even if it was the same thing as before.  In order to
report progress correctly, we'll need to bail if the suggested
replacement is identical (or equivalent) to the original code.

This also introduced unnecessary noop swizzles, as far as I can tell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-03 17:26:04 -07:00
Kenneth Graunke
5ced83ee15 glsl: Initialize source_chan in opt_copy_propagation_elements.
Previously, if chans < 4, we passed uninitialized stack garbage to the
ir_swizzle constructor for the excess components.  Thankfully, it
ignores that data, as it's unnecessary, so no harm actually comes of it.

However, it's obviously better to initialize it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-03 17:25:56 -07:00
Kenneth Graunke
8270b048cf i965: Handle ir_triop_csel in emit_bool_to_cond_code().
ir_triop_csel can return a boolean expression, so we need to handle it
here; we simply forgot when we added it.

Fixes Piglit's EXT_shader_integer_mix/{vs,fs}-mix-if-bool.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2014-09-03 17:12:03 -07:00
Kenneth Graunke
f92fbd554f i965: Move curb_read_length/total_scratch to brw_stage_prog_data.
All shader stages have these fields, so it makes sense to store them in
the common base structure, rather than duplicating them in each.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-03 17:11:33 -07:00
Carl Worth
7528f6fd17 build: Rename md5 to checksums as part of .PHONY target
In commit 46d03d37bf I renamed a Makefile target
from md5 to checksums, (as we switched from MD5 checksums to SHA-256
checksums, so the more general name is more future proof).

But that commit missed one mention of "md5" as a dependency of the .PHONY
target. Rename that here as well.
2014-09-03 16:08:20 -07:00
tiffany
cfc42db592 glsl: fix assertion which fails for unsigned array indices.
According to the GLSL 1.40 spec, section 5.7 Structure and Array Operations:

"Array elements are accessed using an expression whose type is int or uint."

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-09-03 13:52:39 -06:00
Jason Ekstrand
11ee9a4d99 i965/copy_image: Divide the x offsets by block width when using the blitter
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82804
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-03 12:27:19 -07:00
Jason Ekstrand
499acf6e4a i965/copy_image: Use the correct block dimension
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82804
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-03 12:27:19 -07:00
Jason Ekstrand
b608cd7fbf meta/copy_image: Use the correct texture level when creating views
Previously, we were accidentally assuming that the level of both textures
was 0.  Now we actually use the correct level in our hacked texture view.
This doesn't 100% fix the meta path because the texture type is getting
lost somewhere in the pipeline.  However, it actually copies to/from the
correct layer now.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82804
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-03 12:27:19 -07:00
Jason Ekstrand
fcb6d5b9ef i965/copy_image: Use the correct texture level
Previously, we were using the source images level for both source and
destination.  Also, we weren't taking the MinLevel from a potential texture
view into account.  This commit fixes both problems.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82804
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-03 12:27:19 -07:00
Michel Dänzer
58b386dce4 gallivm: Fix build against LLVM SVN >= r216982
Only MCJIT is available anymore.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-03 09:15:01 -07:00
Marek Olšák
8abdc3c4a9 r600g: fix alpha-test with HyperZ enabled, fixing L4D2 tree corruption
*_update_db_shader_control depends on the alpha test state. The problem was
it was in a block which is only entered if the pixel shader is changed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74863

Cc: mesa-stable@lists.freedesktop.org
Tested-by: Benjamin Bellec <b.bellec@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-03 11:50:21 +02:00
Michel Dänzer
2adf7ee92e r600g,radeonsi: Preserve existing buffer flags
The default case was accidentally clearing RADEON_FLAG_CPU_ACCESS from the
previous fall-through cases.

Reported-by: Mathias Fröhlich <Mathias.Froehlich@gmx.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-03 12:49:59 +09:00
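Note on 2adf7ee92e above: a C sketch of the pattern, using made-up flag and usage names rather than the real radeon ones. When an earlier case falls through into the default case, the default case has to OR its flag in; a plain assignment would discard whatever the earlier case set.

```c
#include <stdint.h>

/* Hypothetical flags and usages standing in for the driver's own. */
enum { FLAG_CPU_ACCESS = 1 << 0, FLAG_DEFAULT = 1 << 1 };
enum usage_example { USAGE_STREAM, USAGE_OTHER };

static uint32_t buffer_flags(enum usage_example u)
{
    uint32_t flags = 0;

    switch (u) {
    case USAGE_STREAM:
        flags |= FLAG_CPU_ACCESS;
        /* fall through */
    default:
        /* "flags = FLAG_DEFAULT" here would clear FLAG_CPU_ACCESS. */
        flags |= FLAG_DEFAULT;
        break;
    }
    return flags;
}
```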
Jason Ekstrand
454aab45ef main: Don't leak temporary texture rows
Reviewed-by: Dave Airlie <airlied@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-09-02 15:50:27 -07:00
Dave Airlie
8380b894ad r300g: pointless assignment of info.indexed
Did this code mean to do something else, you tell me!

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-03 07:59:09 +10:00
Dave Airlie
2b24e58310 omx/h264: remove stray semicolon after if
Coverity reported this, looks wrong to me.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-03 07:58:58 +10:00
Dave Airlie
f4ccf687a6 vdpau: unlock the mutex on error paths in attribute setting.
Coverity pointed out we never dropped the lock here, so fix
it by using a common exit path.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-03 07:58:50 +10:00
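Note on f4ccf687a6 above: a C sketch of the "common exit path" idiom. The lock type and the attribute setter are hypothetical stand-ins (a counter instead of a real mutex) so that the example stays self-contained; the point is only that every error path jumps to the label that releases the lock.

```c
#include <assert.h>

/* Hypothetical stand-in for a mutex: counts how often it is held. */
struct fake_lock { int held; };

static void fake_lock_acquire(struct fake_lock *l) { l->held++; }

static void fake_lock_release(struct fake_lock *l)
{
    assert(l->held > 0);
    l->held--;
}

enum status_example { STATUS_OK, STATUS_INVALID_VALUE };

static enum status_example set_attribute(struct fake_lock *l, float *attr,
                                         float value)
{
    enum status_example ret = STATUS_OK;

    fake_lock_acquire(l);
    if (value < 0.0f || value > 1.0f) {
        ret = STATUS_INVALID_VALUE;
        goto out;   /* a bare "return" here would leave the lock held */
    }
    *attr = value;
out:
    fake_lock_release(l);
    return ret;
}
```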
Eric Anholt
2da9118852 u_primconvert: Use u_upload_mgr for our little IB allocations.
tex-miplevel-selection was hammering my memory manager with primconverts
on individual quads.  This gets all those converted IBs packed into larger
IBs.

Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-09-02 13:55:15 -07:00
Eric Anholt
6720d1573a u_primconvert: Shut up compiler warning.
gcc isn't detecting that src is set before used, since both are under if
(info->indexed).

Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-09-02 13:55:15 -07:00
Eric Anholt
1718ba30e5 gbm: Fix gallium build when X11 is in a non-system directory
pipe-loader.h will include Xlib.h when HAVE_PIPE_LOADER_XLIB is set in the
build.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-02 13:48:54 -07:00
Eric Anholt
d71a9b7d9d vc4: Handle a couple of the transfer map flags.
This is part of fixing extremely long runtimes on some piglit tests that
involve streaming vertex reuploads due to format conversions, and will
similarly be important for X performance, which relies on these flags.
2014-09-02 12:10:56 -07:00
Kristian Høgsberg
8f55174fbd meta: Make MESA_META_DRAW_BUFFERS restore properly
A meta begin/end pair with MESA_META_DRAW_BUFFERS will change visible GL
state.  We recreate the draw buffer enums from the buffer bitfield, which
changes GL_BACK to GL_BACK_LEFT (and GL_FRONT to GL_FRONT_LEFT).

This commit modifies the save/restore logic to instead copy the buffer enums
from the gl_framebuffer and then set them on restore using
_mesa_drawbuffers().

It's not clear how this breaks the benchmark in 82796, but fixing meta to not
leak the state change fixes the regression.

No piglit regressions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=82796
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: mesa-stable@lists.freedesktop.org
2014-09-02 10:33:13 -07:00
Emil Velikov
5a4e0f3873 Revert "mesa: fix make tarballs"
This reverts commit 0fbb9a599d.

Rather than adding hacks around the issue drop the sources from the
final tarball, and re-add them back with 'make dist'. This fixes a
problem when running parallel 'make install' fails as it recreates
sources and triggers partial recompilation.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83355
Reported-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
2014-09-02 11:39:29 +01:00
Dave Airlie
021e84f292 mesa/program_cache: calloc the correct size for the cache.
Coverity reported this, and I think this is the right solution,
since cache->items is struct cache_item ** not struct cache_item *,
we also realloc it using struct cache_item * at some point.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-02 16:42:24 +10:00
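Note on 021e84f292 above: a C sketch of sizing an allocation from the pointer being assigned, so the element size cannot drift from the element type. The struct layouts are invented for illustration and are not Mesa's definitions; only the "array of pointers" shape is taken from the commit message.

```c
#include <stdlib.h>

/* Hypothetical item and cache types. */
struct cache_item_example { unsigned hash; void *key; void *program; };

struct cache_example {
    struct cache_item_example **items;  /* array of pointers, not of structs */
    size_t size;
};

static int cache_example_init(struct cache_example *c, size_t size)
{
    /* sizeof(*c->items) is the size of one pointer, which matches the
     * element type of the array. */
    c->items = calloc(size, sizeof(*c->items));
    c->size = c->items ? size : 0;
    return c->items != NULL;
}
```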
Michel Dänzer
a75fee78c6 radeonsi: Compile dummy pixel shader on demand
It's never used under normal circumstances.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-02 15:24:07 +09:00
Michel Dänzer
b84b9eae20 u_blitter: Create all shaders on demand
Not all of these are used in every context, so this can make a
significant difference for short-lived contexts such as in piglit tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-02 15:24:07 +09:00
Michel Dänzer
51131c423c r600g,radeonsi: Inform the kernel if a BO will likely be accessed by the CPU
This allows the kernel to prevent such BOs from ever being stored in the
CPU inaccessible part of VRAM.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-02 15:24:07 +09:00
Dave Airlie
2d5d1f5598 glsl: free uniform_map on failure path.
If we fail in reserve_explicit_locations, we leak uniform_map.

Reported-by: coverity scanner.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-02 16:05:52 +10:00
Paul Berry
9f20503658 main/cs: Add gl_context::ComputeProgram
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-01 19:38:27 -07:00
Jordan Justen
d035d50e05 mesa: Convert NewDriverState to 64-bits
i965 will have more than 32 bits when BRW_STATE_COMPUTE_PROGRAM is added.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-01 19:38:27 -07:00
Paul Berry
8e27a4d2b3 i965: Modify state upload to allow 2 different sets of state atoms.
The set of state atoms for compute shaders is currently empty; it will
be filled in by future patches.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-01 19:38:27 -07:00
Paul Berry
373143ed91 i965: Modify dirty bit handling to support 2 pipelines.
The hardware state for compute shaders is almost entirely orthogonal
to the hardware state for 3D rendering.  To avoid sending unnecessary
state to the hardware, we'll need to have a separate set of state
atoms for the compute pipeline and the 3D pipeline.  That means we
need to maintain two separate sets of dirty bits to determine which
state atoms need to be run.

But the dirty bits are not completely independent; for example, if
BRW_NEW_SURFACES is flagged while doing 3D rendering, then not only do
we need to re-run 3D state atoms that depend on BRW_NEW_SURFACES, but
we also need to re-run compute state atoms that depend on
BRW_NEW_SURFACES.  But we'll also need to re-run those state atoms the
next time the compute pipeline is run.

To accomplish this, we record two sets of dirty bits, one for each
pipeline.  When bits are dirtied (via SET_DIRTY_BIT() or
SET_DIRTY_ALL()) we set them to the dirty state in both pipelines.
When brw_state_upload() is run, we clear the dirty bits just for the
pipeline that was run.

Note that since the number of pipelines is known at compile time to be
2, the compiler should unroll the loops in SET_DIRTY_BIT() and
SET_DIRTY_ALL().

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-01 19:38:27 -07:00
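Note on 373143ed91 above: a C sketch of the two-pipeline dirty-bit scheme as the message describes it. All names are hypothetical and the "upload" function only returns the bits it would have processed; this is an illustration of the bookkeeping, not the i965 implementation.

```c
#include <assert.h>
#include <stdint.h>

enum pipeline_example { PIPELINE_3D, PIPELINE_COMPUTE, PIPELINE_COUNT };

struct state_example {
    uint64_t dirty[PIPELINE_COUNT];  /* one set of dirty bits per pipeline */
};

/* Dirtying a bit marks it in every pipeline. */
static void set_dirty_bit(struct state_example *s, uint64_t bit)
{
    for (int i = 0; i < PIPELINE_COUNT; i++)
        s->dirty[i] |= bit;
}

/* Uploading state clears the bits only for the pipeline that ran and
 * returns the bits that were handled. */
static uint64_t upload(struct state_example *s, enum pipeline_example p)
{
    assert(p < PIPELINE_COUNT);
    uint64_t handled = s->dirty[p];
    s->dirty[p] = 0;
    return handled;
}
```

After a bit is set and the 3D pipeline uploads, the compute pipeline still sees the bit as dirty until its own upload runs.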
Paul Berry
c5bdf9be1e i965: Create a macro for checking a dirty bit.
This will make it easier to extend dirty bit handling to support
compute shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-01 19:38:27 -07:00
Paul Berry
6f56e1424d i965: Create a macro for setting all dirty bits.
This will make it easier to extend dirty bit handling to support
compute shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-01 19:38:27 -07:00
Paul Berry
88e3d404da i965: Create a macro for setting a dirty bit.
This will make it easier to extend dirty bit handling to support
compute shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-01 19:38:27 -07:00
Dave Airlie
94a909ec2d i965: add missing parens in vec4 visitor
Coverity reported this; Matt said it looks like missing parens,
not bad indenting, so let's try that.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-02 11:07:11 +10:00
Dave Airlie
19f6e80a1e nouveau: don't leak dec struct on error
This one path doesn't goto fail, so it seems to leak dec.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-02 10:08:58 +10:00
Dave Airlie
32a8b2cf54 xvmc/tests: %C isn't a valid printf specifier.
Reported-by: Coverity scanner.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-02 10:07:54 +10:00
Dave Airlie
ea88b1de2f nouveau/nv40: quieten coverity warning in unused vertex texture code.
This fixes the code, but we never run it anyways, so silence coverity.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-02 10:04:29 +10:00
Ilia Mirkin
d0cd86686d nv50: remove unused variables
Recent code changes have caused these to no longer be used. Remove them.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-01 18:47:42 -04:00
Ilia Mirkin
0c38006b55 mesa: force height of 1D textures to be 1 in texture views
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-09-01 18:38:02 -04:00
Ilia Mirkin
2c44043313 nv50: attach the buffer bo to the miptree structures
The current code... makes no sense. Use nouveau_bo_ref to attach the bo
to the exposed resource so as to have the proper lifetime guarantees.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-01 18:38:02 -04:00
Ilia Mirkin
9d52e551a5 nv50: mt address may not be the underlying bo's start address
With VP2, nv50_miptree is faked because the underlying bo's have to be
laid out in a certain way. This is done by adjusting the address. Make
sure that blits (and everything else for consistency) use the mt address
rather than the bo address as a base.

This fixes retrieving chroma plane with VDPAU.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82255
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-01 18:38:02 -04:00
Ilia Mirkin
2528d402b9 nv50: set the miptree address when clearing bo's in vp2 init
The mt address is about to be used more, make sure it's set
appropriately.

Reported-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-01 18:38:02 -04:00
Ilia Mirkin
6c2b079231 nv50/ir: avoid creating instructions that can't be emitted
When constant folding a MAD operation, we first fold the multiply and
generate an ADD. However we do so without making sure that the immediate
can be handled in the saturate case. If it can't, load the immediate in
a separate instruction.

Reported-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-01 18:38:02 -04:00
Ilia Mirkin
115d9a5525 nvc0: don't make 1d staging textures linear
Experimentally, the sampler doesn't appear to like these, neither as
buffer nor as rect textures. So remove 1D from the list of texture types
to make linear when used for staging.

This fixes the OSD in mplayer for VDPAU.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-01 18:38:02 -04:00
Ilia Mirkin
362cd26960 nv50: zero out unbound samplers
Samplers are only defined up to num_samplers, so set all samplers above
nr to NULL so that we don't try to read them again later.

Tested-by: Christian Ruppert <idl0r@qasl.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-01 18:38:02 -04:00
Ilia Mirkin
c4bb436f76 nvc0/ir: avoid infinite recursion when finding first uses of tex
In certain circumstances, findFirstUses could end up doubling back on
instructions it had already processed, resulting in an infinite
recursion. Avoid this by keeping track of already-visited instructions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83079
Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-01 18:38:02 -04:00
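Note on c4bb436f76 above: a C sketch of guarding a recursive walk with a visited marker. The instruction type and the walk are hypothetical; the real pass may track visited instructions differently (for example in a separate set), and this only shows why the marker terminates a walk over a cyclic use graph.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_USES 2

/* Hypothetical instruction node; unused "uses" slots are NULL. */
struct insn_example {
    struct insn_example *uses[MAX_USES];
    bool visited;
};

/* Counts instructions reachable through "uses". Without the visited
 * check, a cycle in the graph would recurse forever. */
static int count_reachable(struct insn_example *insn)
{
    if (insn == NULL || insn->visited)
        return 0;
    insn->visited = true;

    int n = 1;
    for (size_t i = 0; i < MAX_USES; i++)
        n += count_reachable(insn->uses[i]);
    return n;
}
```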
Rob Clark
ef858ac770 freedreno/ir3: add DDX/DDY
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-01 18:08:21 -04:00
Rob Clark
5e5604cc28 freedreno/ir3: don't keep IR around
Once we've assembled the shader, no need to keep the intermediate
around.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-01 18:08:21 -04:00
Jason Ekstrand
e8f83538dd i965/fs: Don't segfault when debug-logging a null program
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-01 12:33:13 -07:00
Jason Ekstrand
1c573c9adb i965/vec4: Don't segfault when debug-logging a null program
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-01 12:31:56 -07:00
Marek Olšák
a10c8db715 radeonsi: implement EXPCLEAR optimization for depth
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:52 +02:00
Marek Olšák
f05fe294e7 r600g,radeonsi: initialize HTILE to fully-expanded state
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:52 +02:00
Marek Olšák
573313c94e radeonsi: implement fast depth clear
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:51 +02:00
Marek Olšák
63cb4077e6 radeonsi: move DB_RENDER_CONTROL into draw_vbo
So that I can add fast depth clear.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:51 +02:00
Marek Olšák
78aa717601 radeonsi: disable occlusion queries if they are not needed
We always left them enabled, which turned off HiZ in some cases.
This should improve performance with Hyper-Z.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:51 +02:00
Marek Olšák
ab9ad91779 r600g,radeonsi: force fast stencil and HTILE stencil off, fixing a Hyper-Z hang
This should be as fast as no HTILE for stencil. I think we can still get full
performance with depth-only rendering even if stencil is present in the buffer
but not used, but I'm not 100% sure. This may be revisited when HiS and fast
stencil clear are implemented.

This fixes a hang in Brutal Legend.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64471

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:51 +02:00
Marek Olšák
ba14d4910c r600g: set VGT_ENHANCE=4 on R7xx
This is a golden setting on RV740, but there is a hw bug which recommends
setting it on all R7xx chipsets.

Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:49 +02:00
Marek Olšák
13b93596da r600g: expose AMD_vertex_shader_layer and *_viewport_index on R600-R700
already implemented

Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:45 +02:00
Marek Olšák
d159c5e3e0 r600g: fix layered clear
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:42 +02:00
Marek Olšák
e6d191bb6f r600g: some DB bug workarounds for R6xx DB flushing
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:40 +02:00
Marek Olšák
0ccc653c70 r600g: enable fast depth clear for array textures and cubemaps
I have a piglit test that hits this.

Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:37 +02:00
Marek Olšák
6d751065cc r600g: use HTILE allocator from SI
It's almost the same.

This enables tiling for HTILE. It also enables Hyper-Z for other texture
targets (1D, 1D_ARRAY, 2D_ARRAY, CUBE, CUBE_ARRAY, 3D, RECT).

2D array depth textures are tested by Unigine Sanctuary and my new piglit
test.

Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:33 +02:00
Marek Olšák
ee1b30eaff r600g: set DB_DEPTH_SIZE.HEIGHT_TILE_MAX for EG/CM, inline other fields
This fixes rendering to non-zero layer/face/slice with HTILE.

v2: added the assertion

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:17:40 +02:00
Marek Olšák
91050ff215 radeonsi: set DB_DEPTH_SIZE.HEIGHT_TILE_MAX, inline other fields
This fixes rendering to a non-zero layer/face/slice with HTILE.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72685

v2: added the assertion

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:15:36 +02:00
Glenn Kennard
8d0f6ff810 r600g: Implement sm5 geometry shader instancing
Requires Evergreen or later hardware.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
2014-09-01 21:12:03 +02:00
Marek Olšák
482def592f glsl_to_tgsi: allocate and enlarge arrays for temporaries on demand
This fixes crashes if the number of temporaries is greater than 4096.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66184

v2: added fail paths for realloc failures

Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-01 21:03:58 +02:00
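Note on 482def592f above: a C sketch of an array that is allocated and enlarged on demand, with a fail path for realloc. The type, the initial capacity of 16 and the doubling policy are assumptions for illustration, not what glsl_to_tgsi does.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable array of per-temporary data. */
struct temp_array_example {
    int *data;
    size_t count;
    size_t capacity;
};

/* Returns 0 on success and -1 if memory could not be obtained. On failure
 * the existing contents are left untouched and remain valid. */
static int temp_array_push(struct temp_array_example *a, int value)
{
    if (a->count == a->capacity) {
        size_t new_cap = a->capacity ? a->capacity * 2 : 16;
        if (new_cap > SIZE_MAX / sizeof(*a->data))
            return -1;  /* byte count would overflow */

        int *p = realloc(a->data, new_cap * sizeof(*a->data));
        if (p == NULL)
            return -1;  /* fail path: a->data is still the old block */

        a->data = p;
        a->capacity = new_cap;
    }
    a->data[a->count++] = value;
    return 0;
}
```

Pushing more than 4096 entries simply triggers further reallocations instead of running past a fixed-size table.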
Marek Olšák
b419c651fb gallium/pb_bufmgr_cache: limit the size of cache
This should make a machine which is running piglit more responsive at times.
e.g. streaming-texture-leak can easily eat 600 MB because of how fast it
creates new textures.
2014-09-01 20:17:48 +02:00
Marek Olšák
bba7d29a86 pipe-loader: use the correct screen index
2014-09-01 20:09:19 +02:00
Marek Olšák
0b56e23e7f egl/dri2: use the correct screen index
Required for multi-GPU configuration where each GPU has its own X screen.
2014-09-01 20:09:19 +02:00
Jordan Justen
1a428a5256 docs: Mark ARB_compute_shader as work in progress
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2014-09-01 10:45:37 -07:00
Connor Abbott
d571f2b15d i965/fs: don't use ir->shadow_comparitor in emit_texture_*
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-01 00:55:14 -07:00
Connor Abbott
cbfcb1b069 i965/fs: don't pass ir_variable * to emit_samplepos_setup()
We were only using it to get at its type, which we already know because
it's a builtin variable.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-01 00:12:15 -07:00
Connor Abbott
ec3d06f591 i965/fs: don't pass ir_variable * to emit_frontfacing_interpolation()
We were only using it to get at its type, which we already know because
it's a builtin variable.

v2 (Ken): Rebase on Matt's optimized gl_FrontFacing calculations.

Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-09-01 00:11:16 -07:00
Kenneth Graunke
70691f0c28 i965: Fix GPU hangs when INTEL_DEBUG=no16 is set.
The replicated data clear shader needs to be SIMD16, or else the GPU
will hang.  So, compile it even if INTEL_DEBUG=no16 is set.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-08-31 17:03:31 -07:00
Emil Velikov
88cbe3908f mesa: fix make tarballs
Current method of generating distribution tar-balls involves manually
invoking make + target name in the appropriate places. This temporary
solution is used until we get 'make dist' working.

Currently it does not work, as in order to have the target (which is
also a filename) available in the final Makefile we need to add a PHONY
target + use the correct target name.

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-01 00:22:20 +01:00
Abdiel Janulgue
5598458e69 i965/vec4: Remove try_emit_saturate
Now that saturate is implemented natively as an instruction,
we can cut down on unneeded functionality.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-08-31 21:04:09 +03:00
Abdiel Janulgue
cbd225057a i965/fs: Refactor try_emit_saturate
v3: Since the fs backend can emit saturate as a separate instruction, there is
    no need to detect for min/max instructions and to rewrite the instruction tree
    accordingly. On the other hand, we don't need to emit a separate saturated
    mov either when the expression generating src can do saturate directly.
v4: Add can_do_saturate() check before enabling saturate modifier (Ken)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-08-31 21:04:09 +03:00
Abdiel Janulgue
b2c0c35907 ir_to_mesa, glsl_to_tgsi: Remove try_emit_saturate
Now that saturate is implemented natively as an instruction,
we can cut down on unneeded functionality.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-08-31 21:04:09 +03:00
Abdiel Janulgue
7841a246b9 i965/vec4: Allow propagation of instructions with saturate flag to sel
When the sel condition is bounded within 0 and 1.0. This allows code such as:
        mov.sat a b
        sel.ge  dst a 0.25F

To be propagated as:
        sel.ge.sat dst b 0.25F

v3: - Syntax clarifications in inst->saturate assignment
    - Remove extra parenthesis when assigning src_reg value
      from copy_entry (Matt Turner)
v4: - Take channels into consideration when propagating saturated instructions.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-08-31 21:04:09 +03:00
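Note on 7841a246b9 above: a C sketch that checks the rewrite numerically, assuming sel.ge selects the larger of its two operands and that .sat clamps the result to [0, 1]. The function names are hypothetical. The comparison constant is a parameter so the sketch can also show why it has to lie within 0 and 1.0.

```c
/* Clamp to [0, 1], modelling the .sat modifier. */
static float sat(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Assumed semantics of "sel.ge dst a b": dst = (a >= b) ? a : b. */
static float sel_ge(float a, float b)
{
    return a >= b ? a : b;
}

/* mov.sat a b ; sel.ge dst a c */
static float unpropagated(float b, float c)
{
    float a = sat(b);
    return sel_ge(a, c);
}

/* sel.ge.sat dst b c */
static float propagated(float b, float c)
{
    return sat(sel_ge(b, c));
}
```

With c = 0.25 the two forms agree for any b. With c = 2.0 they do not: the original sequence yields 2.0 while the propagated one is clamped to 1.0.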
Abdiel Janulgue
40aeb558ce i965/fs: Allow propagation of instructions with saturate flag to sel
When the sel condition is bounded within 0 and 1.0. This allows code such as:
	mov.sat a b
	sel.ge  dst a 0.25F

To be propagated as:
	sel.ge.sat dst b 0.25F

v3: Syntax clarifications in inst->saturate assignment (Matt Turner)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-08-31 21:04:09 +03:00
Abdiel Janulgue
0e2ba3ee82 glsl: Optimize clamp(x, b, 1.0), where b > 0.0 as max(saturate(x),b)
v2: - Output max(saturate(x),b) instead of saturate(max(x,b))
    - Make sure we do component-wise comparison for vectors (Ian Romanick)
v3: - Add missing condition where the outer constant value is > 0.0 and
      inner constant is 1.0.
    - Fix comments to show that the optimization is a commutative operation
      (Matt Turner)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-08-31 21:04:08 +03:00
Abdiel Janulgue
d92394c5d8 glsl: Optimize clamp(x, 0.0, b), where b < 1.0 as min(saturate(x),b)
v2: - Output min(saturate(x),b) instead of saturate(min(x,b)) suggested by Ilia Mirkin
    - Make sure we do component-wise comparison for vectors (Ian Romanick)
v3: - Add missing condition where the outer constant value is zero and
      inner constant is < 1
    - Fix comments to reflect we are doing a commutative operation (Matt Turner)
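
The rewrite can be sanity-checked numerically. This is a small sketch in C, assuming GLSL's definition `clamp(x, lo, hi) = min(max(x, lo), hi)` and a constant `b` with `0.0 <= b < 1.0`. The function names are illustrative and do not come from the GLSL compiler sources.

```c
#include <assert.h>

static float minf(float a, float b) { return a < b ? a : b; }
static float maxf(float a, float b) { return a > b ? a : b; }

/* GLSL-style clamp: min(max(x, lo), hi). */
static float clampf(float x, float lo, float hi)
{
    return minf(maxf(x, lo), hi);
}

/* saturate(x) is clamp(x, 0.0, 1.0). */
static float saturate(float x)
{
    return clampf(x, 0.0f, 1.0f);
}

/* Rewritten form of clamp(x, 0.0, b) for a constant 0.0 <= b < 1.0. */
static float clamp_upper_opt(float x, float b)
{
    assert(b >= 0.0f && b < 1.0f);
    return minf(saturate(x), b);
}
```

Both forms agree because the saturate step already enforces the lower bound of 0.0, and the remaining `min` applies the tighter upper bound `b`.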

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-08-31 21:04:08 +03:00
Abdiel Janulgue
8f890b119e glsl: Optimize clamp(x, 0, 1) as saturate(x)
v2: - Check that the base type is float (Ian Romanick)
v3: - Make sure comments reflect that we are doing a commutative operation
    - Add missing condition where the inner constant is 1.0 and outer constant is 0.0
    - Make indexing of operands easier to read (Matt Turner)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-08-31 21:04:08 +03:00
Abdiel Janulgue
cbd0d643a3 glsl: Implement saturate as ir_unop_saturate
Now that we have the ir_unop_saturate implemented as a single
instruction, generate the correct simplified expression.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-08-31 21:04:08 +03:00
Abdiel Janulgue
cb621166dc i965/vec4: Add support for ir_unop_saturate
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-08-31 21:04:08 +03:00
Abdiel Janulgue
4bfe8a1e61 i965/fs: Add support for ir_unop_saturate
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-08-31 21:04:08 +03:00
Abdiel Janulgue
909fa50f5b ir_to_mesa, glsl_to_tgsi: Add support for ir_unop_saturate
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-08-31 21:04:08 +03:00
Abdiel Janulgue
cfa8c1cb39 ir_to_mesa, glsl_to_tgsi: lower ir_unop_saturate
Needed when vertex programs don't allow saturate

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-08-31 21:04:08 +03:00
Abdiel Janulgue
8935c12937 glsl: Add a pass to lower ir_unop_saturate to clamp(x, 0, 1)
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-31 21:04:08 +03:00
Abdiel Janulgue
4c0ccfc5b3 glsl: Add constant evaluation of ir_unop_saturate
v2: Use CLAMP macro (Ian Romanick)

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-31 21:04:08 +03:00
Abdiel Janulgue
a5f02b6696 glsl: Add ir_unop_saturate
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-31 21:04:08 +03:00
Abdiel Janulgue
f340145107 i965/vec4/fs: Count loops in shader debug
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-08-31 21:04:03 +03:00
Abdiel Janulgue
ddc1d297bc i965/vec4: inline generate_vec4_instruction() within generate_code()
Suggested by Matt. This patch combines and moves back the code-generation
functions from generate_vec4_instruction() into generate_code(). Makes
generate_code() a bit larger, but helps us to count loops in a
straightforward manner.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2014-08-31 21:03:49 +03:00
Kenneth Graunke
e34a363a78 i965: Add 2x MSAA support to Broadwell fast clear code.
According to the cited documentation section (but in the newer docs),
x_scaledown is the same for 2x and 4x MSAA.

+47 piglits.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83081
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
2014-08-31 01:48:10 -07:00
Matt Turner
8b5ac1df17 i965/vec4: Update register coalescing test.
In commit 04895f5c I added support for reswizzling writemasks. This test
was checking that we didn't support this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82881
2014-08-30 21:00:28 -07:00
Matt Turner
0492275038 i965: Use unreachable() to silence warning.
brw_meta_fast_clear.c:211:17: warning: 'x_scaledown' may be used
uninitialized in this function [-Wmaybe-uninitialized]
    unsigned int x_scaledown, y_scaledown;
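
This kind of warning appears when the compiler cannot prove that every path through a switch assigns the variable. Below is a minimal sketch in C of the pattern, assuming the GCC/Clang builtin `__builtin_unreachable()`. The macro `UNREACHABLE`, the function `scaledown_for_samples`, and the numeric values are placeholders for illustration; they are not Mesa's `unreachable()` macro and are not taken from hardware documentation.

```c
#include <assert.h>

/* Hypothetical stand-in for an unreachable() helper (GCC/Clang builtin). */
#define UNREACHABLE(msg) do { assert(!msg); __builtin_unreachable(); } while (0)

/* Placeholder values only; not taken from hardware documentation. */
static unsigned scaledown_for_samples(unsigned num_samples)
{
    unsigned scaledown;

    switch (num_samples) {
    case 2:
    case 4:
        scaledown = 8;
        break;
    case 8:
        scaledown = 2;
        break;
    default:
        /* Tells the compiler this path never continues, so 'scaledown'
         * is always initialized by the time it is read below. */
        UNREACHABLE("unexpected sample count");
    }

    return scaledown;
}
```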

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-08-30 21:00:28 -07:00
Chia-I Wu
a14c23735e ilo: set INTEL_RELOC_GGTT only on GEN6
We asked MI commands to use GGTT only on GEN6.
2014-08-31 10:34:39 +08:00
Chia-I Wu
255b274d75 ilo: fix bound check for 3DSTATE_URB_VS
Fix max/min entries on GEN7.5 GT2/GT3.
2014-08-31 10:34:39 +08:00
Chia-I Wu
5f4b13f5fa ilo: replace cmd by dw0 in GPE
With e3c251071b, the magic values are gone.  We
no longer need "cmd" to hide them.  Replace it by dw0.
2014-08-31 10:34:39 +08:00
Alexander von Gluck IV
7b6ea6ab8c st/hgl: Move st_visual create/destroy into hgl state_tracker
2014-08-30 19:35:24 -04:00
Alexander von Gluck IV
15da8d0761 st/hgl: Move st_manager create/destroy into hgl state_tracker
2014-08-30 19:35:24 -04:00
Rob Clark
c06afcede2 freedreno/ir3: fix potential null ptr deref
Fix potential segfault in debug code.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-08-30 18:02:51 -04:00
Rob Clark
c99f09f4be freedreno/ir3: add TXB
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-08-30 18:02:51 -04:00
Rob Clark
b823abedf8 freedreno/ir3: detect scheduler fail
There are some cases where the scheduler can get itself into impossible
situations, by scheduling the wrong write to pred or addr register
first.  (Ie. it could end up being unable to schedule any instruction if
some instruction which depends on the current addr/reg value also
depends on another addr/reg value.)

To solve this we'd need to be able to insert extra mov instructions
(which would also help when register assignment gets into impossible
situations).  To do that, we'd need to move the nop padding from sched
into legalize.

But to start with, just detect when we get into an impossible situation
and bail, rather than sitting forever in an infinite loop.  This way it
will at least fall back to the old compiler, which might even work if
you are lucky.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-08-30 18:02:50 -04:00
Ian Romanick
932b0ef1ce glsl: Use bit-flags image attributes and uint16_t for the image format
All of the GL image enums fit in 16-bits.

Also move the fields from the anonymous "image" structure to the next
higher structure.  This will enable packing the bits with the other
bitfield.
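
The effect of this kind of packing can be illustrated in isolation. This is a sketch in C with two hypothetical layouts, not the actual `ir_variable` definition: one `bool` per attribute plus a 32-bit format, versus one bit per attribute plus a 16-bit format. Bit-field layout is implementation-defined, so the exact sizes depend on the ABI.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: one bool per attribute, 32-bit format. */
struct image_unpacked {
    bool read_only;
    bool write_only;
    bool coherent;
    bool is_volatile;
    bool is_restrict;
    uint32_t format;
};

/* Hypothetical "after" layout: one bit per attribute, 16-bit format. */
struct image_packed {
    unsigned read_only:1;
    unsigned write_only:1;
    unsigned coherent:1;
    unsigned is_volatile:1;
    unsigned is_restrict:1;
    uint16_t format;
};

/* Bytes saved per instance on the current ABI. */
static size_t bytes_saved(void)
{
    return sizeof(struct image_unpacked) - sizeof(struct image_packed);
}

/* Nonzero when an enum value can be stored in a uint16_t field. */
static int fits_in_16_bits(unsigned gl_enum)
{
    return gl_enum <= UINT16_MAX;
}
```

For example, `fits_in_16_bits(0x8058)` holds for the value of `GL_RGBA8`.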

Valgrind massif results for a trimmed apitrace of dota2:

                  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
Before (32-bit): 76 40,572,916,873       68,831,248       63,328,783     5,502,465            0
After  (32-bit): 70 40,577,421,777       68,487,584       62,973,695     5,513,889            0

Before (64-bit): 60 36,822,640,058       96,526,824       88,735,296     7,791,528            0
After  (64-bit): 74 37,124,603,758       95,891,808       88,466,712     7,425,096            0

A real savings of 346KiB on 32-bit and 262KiB on 64-bit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-08-29 23:29:19 -07:00
Ian Romanick
8eeca7a56c glsl: Use a single bit for the dual-source blend index
The only values allowed are 0 and 1, and the value is checked before
assigning.

Valgrind massif results for a trimmed apitrace of dota2:

                  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
Before (32-bit): 74 40,580,119,657       69,186,544       63,506,327     5,680,217            0
After  (32-bit): 76 40,572,916,873       68,831,248       63,328,783     5,502,465            0

Before (64-bit): 89 36,822,971,897       96,526,616       88,735,296     7,791,320            0
After  (64-bit): 60 36,822,640,058       96,526,824       88,735,296     7,791,528            0

A real savings of 173KiB on 32-bit and no change on 64-bit.
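
The quoted figure follows from the useful-heap(B) column. This is a small sketch in C of that arithmetic, assuming the savings are the before/after difference truncated to whole KiB; `saved_kib` is an illustrative helper, not part of any tool.

```c
/* Difference in useful-heap bytes, expressed in whole KiB (truncated). */
static long saved_kib(long before_bytes, long after_bytes)
{
    return (before_bytes - after_bytes) / 1024;
}
```

Applied to the 32-bit rows, `saved_kib(63506327L, 63328783L)` covers a difference of 177,544 bytes; the 64-bit rows have identical useful-heap values, so the difference is zero.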

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-08-29 23:28:26 -07:00
Ian Romanick
c0cd5bedf6 glsl: Eliminate ir_variable::data.atomic.buffer_index
Just use ir_variable::data.binding... because that's where the
binding is stored for everything else that can use layout(binding=).

Valgrind massif results for a trimmed apitrace of dota2:

                  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
Before (32-bit): 50 40,564,927,443       69,185,408       63,683,871     5,501,537            0
After  (32-bit): 74 40,580,119,657       69,186,544       63,506,327     5,680,217            0

Before (64-bit): 59 36,822,048,449       96,526,888       89,113,000     7,413,888            0
After  (64-bit): 89 36,822,971,897       96,526,616       88,735,296     7,791,320            0

A real savings of 173KiB on 32-bit and 368KiB on 64-bit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-08-29 23:27:59 -07:00
Kenneth Graunke
941269f89c mesa: Delete ctx->GeometryProgram.Cache.
The VertexProgram and FragmentProgram have a Cache member for dealing
with fixed function programs.  There are no fixed function geometry
programs, so this should never have existed, and was just copy and
pasted.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2014-08-29 22:13:37 -07:00
Roland Scheidegger
ca4f0baca2 gallivm: fix somewhat broken NaN behavior for exp2
I actually screwed that up in 754319490f,
mistakenly thinking the code actually wanted the non-nan result before.
So, introduce that missing nan behavior case and use that instead.
For sse, there's no actual change in the resulting code at all, the fallback
code wouldn't have done the right thing though.
Of course, the actual issue I saw with pow() was completely unrelated...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-08-30 01:34:41 +02:00
Roland Scheidegger
3d29e75a5f softpipe: handle vertex texture sampling when using llvm for draw
Pretty trivial, just fill in the offsets and such. The implementation
is near 100% copy and paste from llvmpipe. Should be useful for debugging.

No piglit change when not using SOFTPIPE_USE_LLVM=1.
Now that it can do the same tests with and without using llvm for vs/gs,
more tests pass with llvm; the only things failing only with llvm seem to be
edgeflags tests and vs/gs-pow-float-float (and for the latter I'm not
convinced the zero tolerance it requires is somehow mandated by glsl).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-08-30 01:34:16 +02:00
Roland Scheidegger
62fd871984 llvmpipe: (trivial) enable cube map arrays
The code is all in place now so enable it.
Seems to pass all relevant piglit tests (just like cube maps, some of the
cube map array tests need GALLIVM_DEBUG=no_quad_lod,no_rho_approx)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-08-30 01:33:40 +02:00
Roland Scheidegger
9da75f96bc gallivm: handle cube map arrays for texture sampling
Pretty easy, just make sure that all paths testing for PIPE_TEXTURE_CUBE
also recognize PIPE_TEXTURE_CUBE_ARRAY, and add the layer * 6 calculation
to the calculated face.
Also handle it for texture size query, looks like OpenGL wants the number
of cubes, not layers (so need division by 6).
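
The index arithmetic described here can be sketched as follows in C. It assumes a cube map array is stored as consecutive groups of six face layers; the function names are hypothetical and are not gallivm helpers.

```c
#include <assert.h>

#define FACES_PER_CUBE 6

/* Array layer holding a given face of a given cube: cube * 6 + face. */
static unsigned cube_array_layer(unsigned cube_index, unsigned face)
{
    assert(face < FACES_PER_CUBE);
    return cube_index * FACES_PER_CUBE + face;
}

/* Size query: report whole cubes, not individual face layers. */
static unsigned cube_array_num_cubes(unsigned num_layers)
{
    assert(num_layers % FACES_PER_CUBE == 0);
    return num_layers / FACES_PER_CUBE;
}
```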

No piglit regressions.

v2: fix up adding cube layer to face for seamless filtering (needs to happen
after calculating per-sample face). Undetected by piglit unfortunately.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com> (v1)
2014-08-30 01:33:02 +02:00
Roland Scheidegger
26a5156de7 draw: kill off bogus assertion in tgsi_fetch_gs_outputs
Not sure why it was there but it is definitely not an error if gs outputs are
infs/nans. Besides, the outputs can be ints, in which case any small negative
number asserted.
This fixes piglit's texelFetch gs isamplerXX crashes with softpipe (down from
14 to 2).
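
Why small negative integers trip a float sanity check can be shown directly. This is a minimal sketch in C, assuming 32-bit two's complement integers and IEEE 754 single-precision floats; the helper names are illustrative and are not from the draw module.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a 32-bit integer as a float. */
static float int_bits_as_float(int32_t value)
{
    float f;
    memcpy(&f, &value, sizeof f);
    return f;
}

/* Nonzero when the integer's bit pattern reads back as a NaN. */
static int int_reads_as_nan(int32_t value)
{
    return isnan(int_bits_as_float(value)) != 0;
}
```

A value such as -1 has all bits set, so its exponent field is all ones with a nonzero mantissa, which is a NaN encoding; a small positive integer instead reads back as a denormal.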

Bug https://bugs.freedesktop.org/show_bug.cgi?id=80012

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-08-30 01:17:47 +02:00
Roland Scheidegger
c9ae5038d5 softpipe: don't assert on illegal wrap mode for rect textures
piglit tex-miplevel-selection nowadays doesn't use repeat wrap mode due to
sampler objects any longer, however at the time of the clear the wrap mode
is still illegal and at this point we get to verify the state, including
samplers (even though they won't get used), and because mesa doesn't treat
it as an incomplete texture as the spec says it should, we hit the assertion.
Just warn about this for now instead.
Gets crashes down from 44 to 14 in a piglit run (all were in various tests of
tex-miplevel-selection with texture rectangles). Though just about all
tex-miplevel-selection tests fail anyway for other reasons.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-08-30 01:17:47 +02:00
Roland Scheidegger
032fe4ed23 tgsi: (trivial) fix handling msaa resources on TXF
Just handle as ordinary 2d / 2d array resources. Prevents an assertion failure
with softpipe and piglit glsl-resource-not-bound 2DMS/2DMSArray tests.
While here also fix TXD shadowCube similarly, which fixes the crash with piglit
tex-miplevel-selection textureGrad CubeShadow (the test will still fail due to
softpipe being broken).

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=80011

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-08-30 01:17:47 +02:00
Roland Scheidegger
99105454b0 draw: remove fishy num_samplers/num_sampler_views check in llvm path
This was meant for softpipe to not crash at some point if vertex texturing
was used. It is, however, fishy because it uses values from
draw_set_samplers/draw_set_sampler_views and not from the shader key. Albeit
we should still in all cases actually generate a new shader if this changes
(because the samplers and views themselves are in the key) I don't want to
think again wondering if that's really correct in the future.
Besides, at least today, it does not actually work for softpipe, as this was
relying on softpipe not actually calling draw_set_samplers/sampler_views at
all - I've verified it crashes regardless (if there were a tex instruction in
the vs, which normally should not happen anyway). For drivers which do indeed
not call these functions because they don't support vertex texturing at all
(r300), this should still not crash because the static texture data is all
zero, which causes the sampling functions to take an early out (same as is done
if no texture is bound at the slot used for sampling - verified with hacked up
softpipe).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-08-30 01:17:46 +02:00
Roland Scheidegger
85d4cc4790 mesa: fix fallback texture for cube map array
mesa was creating a cube map array texture with just one layer, which is
not legal. This caused an assertion failure when using that texture later
in llvmpipe (when enabling cube map arrays) since it verifies the number
of layers in the view is divisible by 6 (the sampling code might well crash
randomly otherwise) with piglit glsl-resource-not-bound CubeArray -fbo -auto.

v2: use appropriately sized texel array...

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
2014-08-30 01:17:46 +02:00
Aaron Watry
7c73ee677f r600/compute: Don't leak compute pool item_list/unallocated_list
v3: Fix multi-line comment format
v2: Change to C-style comments and fix indentation

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Bruno Jiménez <brunojimen@gmail.com>
2014-08-29 17:38:24 -05:00
Michel Dänzer
6cd0dbc415 u_vbuf: Make sure all caps are initialized
Pointed out by valgrind.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83148
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-08-29 12:15:10 +09:00
Michel Dänzer
2a99b6e40f r600g: Reinstate include path to common radeon source directory
Fixes build failure since commit a131263a2f
('gallium/radeon: cleanup header inclusion'):

../../../../../src/gallium/drivers/r600/evergreen_compute.c:50:30: fatal error: radeon_llvm_util.h: No such file or directory
 #include "radeon_llvm_util.h"
                              ^
compilation terminated.

Trivial.
2014-08-29 12:09:16 +09:00
Matt Turner
2cab62a68d i965: Mark BRW_CONDITIONAL_R as Gen <= 5.
2014-08-28 19:06:45 -07:00
Matt Turner
4fcefac753 i965/disasm: Show jump count for if/iff/halt.
These instructions don't have pop count.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-08-28 19:06:27 -07:00
Matt Turner
fb2fddefce i965/disasm: Disassemble JMPI's source properly.
The source can be a register as well as an immediate, and disassembling
a register as an immediate can have some strange results.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-08-28 19:06:27 -07:00
Matt Turner
bef7a025eb i965/disasm: Add break/cont/halt to list of has_uip().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-08-28 19:06:27 -07:00
Matt Turner
383eccb77e i965/disasm: Disassemble Z/NZ conditional modifiers as .z/.nz.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-08-28 19:06:27 -07:00
Ilia Mirkin
b4418cd4ce nouveau: allow more tokens by default to avoid parse failures
Also print a note saying that parsing failed to help isolate issues.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-08-28 21:53:55 -04:00
Emil Velikov
76e5406e58 targets/haiku-softpipe: explicitly prefix sw/hgl header
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:41:51 -04:00
Emil Velikov
f5fb9c556b sw/hgl: struct haiku_displaytarget is not public struct
It is meant to be private within the actual winsys. Remove it from
the exported header, and fold it into its only user.

Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:41:46 -04:00
Emil Velikov
3b36ba4c39 include/haiku: fix comment typo
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:41:29 -04:00
Emil Velikov
5b8900ded3 hgl: trivial bits
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:34:43 -04:00
Alexander von Gluck IV
311b59495c gallium/targets: Break haiku state_tracker out to own directory
Ack'ed by Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:27:29 -04:00
Alexander von Gluck IV
86d1aa8531 gallium/targets: Haiku softpipe, perform better framebuffer validation
* Check for back left attachment as well
* Set and act on pipe format none

Ack'ed by Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:27:26 -04:00
Emil Velikov
96b45e67d5 st/egl: ship all the files in the tarball
Namely we were missing the headers and the Android/SCons buildscripts.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:24:42 +01:00
Emil Velikov
da1d324909 st/clover: sort the sources list
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:24:42 +01:00
Emil Velikov
010fa9074e st/gbm: include the header in the sources list
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:24:42 +01:00
Emil Velikov
27be19aa45 st/xlib: Include the headers in the sources list.
Yet another step towards a working 'make dist'.

Cc: José Fonseca <jfonseca@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: José Fonseca <jfonseca@vmware.com>
2014-08-28 21:24:42 +01:00
Emil Velikov
526a9d9c5e st/omx: use makefile.sources to handle sources lists
... and add the headers so that 'make check' is happy.

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2014-08-28 21:24:41 +01:00
Emil Velikov
f6507d2357 st/vdpau: pickup/ship the private header
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2014-08-28 21:24:41 +01:00
Emil Velikov
e3fd703e85 st/vdpau: remove obsolete define VL_HANDLES
This define is always set and it had no real purpose according to
git log. Seems like it is a leftover from the vl/vdpau prototype
stage.

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2014-08-28 21:24:41 +01:00
Emil Velikov
60d772cd9d st/vega: add headers and SConscript in the tarball
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:24:41 +01:00
Emil Velikov
bcdb47d838 st/xa: add remaining files in the tarball
Namely
 - the private header (xa_priv.h)
 - README and
 - xa-indent

Sort the sources list while we're here.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:24:40 +01:00
Emil Velikov
398f6eefee st/xvmc: pick up the headers for distribution
- autotools/make will pick them up in the tarball.
 - Sort the list alphabetically.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:24:40 +01:00
Emil Velikov
c6e5801b40 Revert "configure: Disable xvmc by default"
This reverts commit 6a19bb56e0.

The above commit disabled the default build of xvmc as the xvmc tests
were failing. As pointed out by Ilia, the tests are "broken by design"
as they do not test the object that is built but the one that is
installed and set up on the workstation.

With previous commit we moved the programs from the 'make check' to
noinst automake target. This way they won't be run but will be around
for people to use them.

Cc: Tom Stellard <thomas.stellard@amd.com>
2014-08-28 21:24:40 +01:00
Emil Velikov
91f49befd0 st/xvmc: automake: move tests to noinst
All the tests require an installed and setup XvMC, thus they
are not good candidates for 'make check'.
Keep them around as the user might want to actually test the
implementation post installation/setup.

Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Tom Stellard <thomas.stellard@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:24:40 +01:00
Emil Velikov
015792fb02 winsys/sw: add the final files to the tarball
Add the final remaining files into the tarball (make dist), namely:
 - SConscripts
 - Non-autotooled winsys' - android, gdi and hgl.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:24:39 +01:00
Emil Velikov
95603e259b winsys/sw: automake: consistently use Makefile.sources
- Include the headers within.
 - Update scons to use them.
 - Drop useless include (gallium/drivers) from scons.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:24:39 +01:00
Emil Velikov
f0ae81cc13 winsys/$(hw): ship the Android/SCons scripts in the tarball
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:24:38 +01:00
Emil Velikov
63e9831756 winsys/$(hw): include headers in Makefile.sources
Otherwise 'make dist' will not pick them up :'(

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:24:38 +01:00
Emil Velikov
afdc44deca st/egl: cleanup sw winsys header inclusions
- Drop duplicate include compiler directives.
 - Leave the sw/ prefix for all the software winsys headers.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:24:37 +01:00
Emil Velikov
30f3df4e53 winsys/radeon: move radeon_cs_dump.h to drm
... to ease packaging (make dist).
Update it to fetch libdrm's include/libs via pkg-config.

Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-08-28 21:24:37 +01:00
Emil Velikov
a131263a2f gallium/radeon: cleanup header inclusion
- Add top_srcdir/src/gallium/winsys to GALLIUM_DRIVER_C{XXFLAGS}.
 - Remove top_srcdir/src/gallium/drivers/radeon from the includes.

As a result:
 - Common radeon headers are prefixed with 'radeon/'
 - Winsys header inclusion is prefixed 'radeon/drm'

Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-08-28 21:24:37 +01:00
Emil Velikov
22a13f5b09 winsys/svga: build: cleanup the includes
gallium/drivers is already part of GALLIUM_WINSYS_CFLAGS.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-28 21:24:36 +01:00
Emil Velikov
7dc2f9f919 winsys/i915: remove the software winsys
We stopped building it recently as it was unused and not tested.
Good bye, it's been nice knowing you :)

Cc: Stephane Marchesin <stephane.marchesin@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Stephane Marchesin <stephane.marchesin@gmail.com>
2014-08-28 21:24:36 +01:00
Emil Velikov
664c2d7694 gallium/ilo: cleanup intel_winsys.h
Make the header location, inclusion and contents more common with
its i915,r* and nouveau counterparts:

 - Move the header within drivers/ilo.
 - Separate out intel_winsys_create_for_fd into 'drm_public' header.
 - Cleanup the compiler includes.

v2: Move the header to drivers/ilo. Suggested by Chia-I.
v3: Correct intel_winsys.h inclusion. Spotted by Chia-I.

Cc: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2014-08-28 21:24:16 +01:00
Timothy Arceri
4ca203f6a1 docs: mark GL_MAX_VERTEX_ATTRIB_STRIDE as done
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-27 20:36:03 -10:00
Timothy Arceri
89e6806dea gallium: add cap for MAX_VERTEX_ATTRIB_STRIDE
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-08-27 20:35:59 -10:00
Timothy Arceri
3246e11d33 mesa: implement GL_MAX_VERTEX_ATTRIB_STRIDE
V2: moved test for the VertexAttrib*Pointer() functions
 to update_array(), and made constant available for drivers to set

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-27 20:35:56 -10:00
Michel Dänzer
eae9da879f st/clover: Fix build against LLVM SVN >= r216583
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-08-28 12:05:21 +09:00
Roland Scheidegger
eee9f6ae8a draw: fix base instance handling in llvm path
The base instance needs to be passed to the jited function, otherwise the
instanced data fetch will only work with the same start instance when the
jit function was created (and baking that into the key instead is not a viable
option).
This fixes piglit arb_base_instance-drawarrays (modulo some unrelated
core/compat context trouble I get for the test).
And fix the pipe cap bit in llvmpipe for it now that it actually works (it
already worked for softpipe).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-08-28 03:03:23 +02:00
Roland Scheidegger
17eabfeccf docs: fix up status of softpipe, llvmpipe
The docs were never really up to date for them, missing just about everything.
So mark them off as all done for GL 3.3 (though softpipe is in fact quite
broken for some newer things especially wrt texturing, and both don't have
compliant, real msaa support). And add the extensions missing too (no
guarantee of completeness).

Reviewed-by: Dave Airlie <airlied@gmail.com>
2014-08-28 03:01:16 +02:00
Alexander von Gluck IV
0348429586 glsl: Add strings.h on non-MSC platforms
* IEEE Std 1003.1-2001 placed strcasecmp() in strings.h.
* ISO C99 doesn't mention strcase* in string.h
* On all platforms I could find, strcasecmp is in strings.h and string.h
  as a compatibility layer for software written pre-2001 POSIX
* Technically strcasecmp should be only in strings.h and the man
  pages back this up.
* Tested build on CentOS and Haiku
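
The include pattern described above can be sketched as follows in C. The wrapper `equal_ignore_case` is a hypothetical helper for illustration; it assumes that MSVC provides `_stricmp` in `<string.h>` and that other platforms provide POSIX `strcasecmp` in `<strings.h>`.

```c
#include <string.h>
#ifndef _MSC_VER
#include <strings.h>   /* POSIX home of strcasecmp() */
#endif

/* Case-insensitive string equality. */
static int equal_ignore_case(const char *a, const char *b)
{
#ifdef _MSC_VER
    return _stricmp(a, b) == 0;
#else
    return strcasecmp(a, b) == 0;
#endif
}
```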

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-08-27 20:20:58 -04:00
Alex Deucher
6b48c18b03 radeon/uvd: remove comment about RV770
It doesn't seem to support field based decode after testing.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2014-08-27 10:04:13 -04:00
Christian König
80771e47b6 radeon/uvd: fix field handling on R6XX style UVD
The first UVD generation can only do frame based output.

Signed-off-by: Christian König <christian.koenig@amd.com>
2014-08-26 17:56:57 +02:00
Christian König
03a99ba9e4 vl/compositor: set the scissor before clearing the render target
Otherwise we clear areas that shouldn't be cleared.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2014-08-26 17:56:57 +02:00
Christian König
b73c20759f st/vdpau: fix vlVdpOutputSurfaceRender(Output|Bitmap)Surface
Correctly handle that the source_surface is only optional.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80561

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2014-08-26 17:56:57 +02:00
Chia-I Wu
e3c251071b ilo: use genhw command opcodes
Replace ILO_GPE_MI and ILO_GPE_CMD with magic values by descriptive genhw
macros.
2014-08-26 14:11:02 +08:00
Chia-I Wu
6c73478223 ilo: rename intel_bo_map_unsynchronized()
Rename it to intel_bo_map_gtt_async().
2014-08-26 14:10:50 +08:00
Chia-I Wu
354d84b629 ilo: remove max_batch_size
It is used to derive an artificial limit on max relocs per bo.  We choose not
to export it anymore.
2014-08-26 14:10:50 +08:00
Chia-I Wu
fbb869c1aa ilo: replace domains by reloc flags
It is simpler and is supported by the kernel.  It cannot be used with
libdrm_intel yet though.
2014-08-26 14:10:50 +08:00
Chris Forbes
01887593a4 docs: Update who is working on tessellation
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2014-08-26 07:51:11 +12:00
Chris Forbes
38a3490368 glsl: Remove bogus "OUPTUT" token
This is never used. There is another token "OUTPUT" which the lexer can
generate, though. This has been around since the dawn of time; it is most
likely a typo.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2014-08-26 07:50:43 +12:00
Marek Olšák
83503f9e68 radeonsi: handle PIPE_BIND_BLENDABLE
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-08-25 13:12:24 +02:00
Marek Olšák
770719eb82 r600g: only set PIPE_BIND_BLENDABLE if colorbuffer rendering is supported
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-08-25 13:12:24 +02:00
Marek Olšák
bc0ae40616 r300g: handle PIPE_BIND_BLENDABLE
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-08-25 13:12:23 +02:00
Eric Anholt
7317f11859 vc4: Stop doing qpu_inst(add, NOP) or qpu_inst(NOP, mul).
Now that the extra WADDR is set, we can knock this off.  Saves a lot of
typing, and makes this code much more legible.
2014-08-24 22:13:26 -07:00
Eric Anholt
78d144f7de vc4: Set the other WADDR in the qpu instruction helpers.
Now you don't need to qpu_inst() your instruction with a NOP to get the
other waddr set.
2014-08-24 22:13:26 -07:00
Eric Anholt
54499a85ff vc4: Merge qpu_a_NOP() and qpu_m_NOP to a single qpu_NOP() helper.
Now that qpu_inst() ignores the WADDR from the other half of the
instruction, we can set both the ADD and MUL WADDRs in the NOP helper.
Thanks to that, we also no longer need to qpu_inst(NOP, NOP).
2014-08-24 22:13:25 -07:00
Eric Anholt
1a7035f386 vc4: Ignore WADDRs from the other half of the instruction when merging.
This allows setting the opposite-side WADDR to NOP (a non-zero value) in
qpu_* helpers, so that we don't need to qpu_inst() merge them with NOPs
all the time just to get the waddr set.
2014-08-24 22:13:25 -07:00
Eric Anholt
3212bafc28 vc4: Fix LT/GE set-0-or-1 compares.
We were using the integer sub, which worked for the common case of EQ and
NE.  Fixes fs-lessThan-ivec2-ivec2 and other tests.
2014-08-24 22:13:25 -07:00
Eric Anholt
e2f66315cb u_vbuf: Add a few more format fallbacks.
Fixes piglit draw-vertices and gl-2.0-vertexattribpointer on vc4, where
I'm only advertising R32F to RGBA32F support so far.

Note: regresses gl-1.5-normal3b3s-invariance due to introduced flushes and
missing depth buffer load/store support in the driver.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-08-24 22:13:25 -07:00
Eric Anholt
bbbe3b65ad u_vbuf: Simplify the format fallback translation.
Individual caps made supporting new fallbacks more complicated than it
needed to be.  Instead, just make a table of fallbacks at context init
time.

v2: Fix inverted "do we need to install vbuf?" flagging caught by Marek.

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
2014-08-24 22:13:25 -07:00
Rob Clark
306e421887 freedreno/a2xx: fix segfault
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-08-24 13:09:47 -04:00
Rob Clark
bd3b096467 freedreno/a3xx: handle first/last level properly
Fixes some assumptions about first_level being zero.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-08-24 13:09:23 -04:00
Rob Clark
b40a6c2b17 freedreno: implement pipe_flush_resource()
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-08-24 13:09:00 -04:00
Rob Clark
478a08ebd2 freedreno: don't ignore src/dst level
Don't ignore src/dst_level in pipe_copy_region.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-08-24 13:08:14 -04:00
Eric Anholt
8d8a5eb792 vc4: Fix save/restore of the VS/FS in the blitter.
When I made the shader cache take the .fs member and moved the binding
point to .bind_fs, I failed to update these.  Fixes crashes in
copyteximage-related tests.
2014-08-23 16:30:42 -07:00
Eric Anholt
9542e68207 vc4: Clear padding of ioctl arguments.
Fixes valgrind complaints from valgrind being unaware of our ioctls.
2014-08-23 16:30:34 -07:00
Vinson Lee
c2867f5b36 auxilary/os: Add Solaris support in os_get_total_physical_memory.
The patch fixes the build on Oracle Solaris.

  CC     os/os_misc.lo
"os/os_misc.c", line 59: #error: unexpected platform in os_sysinfo.c

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-22 18:24:34 -07:00
Alexander von Gluck IV
12a679a6f6 gallium/targets: Haiku, Fix some improper type warnings 2014-08-22 19:37:19 -04:00
Alexander von Gluck IV
31406d978d gallium/targets: Clean up Haiku softpipe renderer visual
* Drop creating gl_config first as it's only really used
  to create the state tracker visual.
2014-08-22 19:37:19 -04:00
Carl Worth
23163df24c glcpp: Don't use alternation in the lookahead for empty pragmas.
We've found that there's a buffer overrun bug in flex that's triggered by
using alternation in a lookahead pattern.

Fortunately, we don't need to match the exact {NEWLINE} expression to
detect an empty pragma. It suffices to verify that there are no non-space
characters before any newline character. So we can use a simple [\r\n] to
get the desired behavior while avoiding the flex bug.

Fixes the regression of piglit's 17000-consecutive-chars-identifier test,
(which has been crashing since commit
04e40fd337 ).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82472
Signed-off-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

CC: <mesa-stable@lists.freedesktop.org>
2014-08-22 15:14:59 -07:00
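The empty-pragma check described in 23163df24c can be illustrated outside flex. The sketch below is Python, not the glcpp lexer rule (which does not appear on this page). The name EMPTY_PRAGMA and the exact pattern are assumptions, chosen only to show a lookahead on the single character class [\r\n] rather than an alternation.

```python
import re

# Hypothetical stand-in for the lexer rule: '#pragma' followed only by
# horizontal whitespace, with a lookahead for a single CR or LF character.
EMPTY_PRAGMA = re.compile(r"#[ \t]*pragma[ \t]*(?=[\r\n])")


def is_empty_pragma(line):
    """Return True if the line is a pragma with nothing before the newline."""
    return EMPTY_PRAGMA.match(line) is not None
```

Because the lookahead only needs to see the first CR or LF, it accepts both "\n" and "\r\n" line endings without listing them as alternatives.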
Kenneth Graunke
97d03b9366 i965: Disable try_emit_b2f_of_compare on Gen4-6.
The optimization relies on CMP setting the destination to 0, which is
equivalent to 0.0f.  However, early platforms only set the least
significant byte, leaving the other bits undefined.  So, we must disable
the optimization on those platforms.

Oddly, Sandybridge wasn't reported as broken.  The PRM states that it
only sets the LSB, but the internal documentation says that it follows
the IVB behavior.  Since it wasn't reported as broken, we believe it
really does follow the IVB behavior.

v2: Allow the optimization on Sandybridge (requested by Matt).

+32 piglits on Ironlake.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?=79963
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-08-22 11:40:32 -07:00
Matt Turner
b8aa1005c8 i965/fs: Preserve CFG in predicated break pass.
Operating on this code,

B0: ...
    cmp.ne.f0(8)
    (+f0) if(8)
B1: break(8)
B2: endif(8)

We can delete B2 without attempting to merge any blocks, since the
break/continue instruction necessarily ends the previous block.

After deleting the if instruction, we attempt to merge blocks B0 and B1.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-08-22 10:23:34 -07:00
Matt Turner
3c4c2a6e30 i965/fs: Rename variable in predicated break pass.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-08-22 10:23:34 -07:00
Matt Turner
1db74a423f i965/fs: Preserve CFG in the SEL peephole.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-08-22 10:23:34 -07:00
Matt Turner
81755bc67b i965: Preserve CFG when deleting dead control flow.
This pass deletes an IF/ELSE/ENDIF or IF/ENDIF sequence, or the ELSE in
an ELSE/ENDIF sequence.

In the typical case (where IF and ENDIF aren't the only instructions in
their basic blocks), we can simply remove the instructions (implicitly
deleting the block containing only the ELSE), and attempt to merge
blocks B0 and B2 together.

B0: ...
    (+f0) if(8)
B1: else(8)
B2: endif(8)
    ...

If the IF or ENDIF instructions are the only instructions in their
respective basic blocks (which are deleted by the removal of the
instructions), we'll want to instead merge the next blocks.

Both B0 and B2 are possibly removed by the removal of if & endif.
Same situation for if/endif. E.g., in the following example we'd remove
blocks B1 and B2, and then attempt to combine B0 and B3.

B0: ...
B1: (+f0) if(8)
B2: endif(8)
B3: ...

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-08-22 10:23:34 -07:00
Matt Turner
9cf06e27e1 i965/cfg: Add functions to combine basic blocks.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-08-22 10:23:34 -07:00
Matt Turner
5e6ead5e8b i965/cfg: Point to bblock_t containing associated control flow
... rather than pointing directly to the associated instruction. This
will let us set the block containing the IF statement's else-pointer to
NULL, when we delete a useless ELSE instruction, as in the case

   (+f0) if(8)
   ...
   else(8)
   endif(8)

Also, remove the pointer to the ENDIF, since it's unused, and it was
also potentially wrong, in the case of a basic block containing both an
ENDIF and an IF instruction:

   endif(8)
   cmp.ne.f0(8) ...
   (+f0) if(8)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-08-22 10:23:34 -07:00
Matt Turner
2a98ebd42b i965/fs: Preserve CFG in register allocation.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-08-22 10:23:34 -07:00
Matt Turner
20a849b4aa i965: Use basic-block aware insertion/removal functions.
To avoid invalidating and recreating the control flow graph. Also stop
invalidating the CFG in places we didn't add or remove an instruction.

cfg calculations:     202951 -> 80307 (-60.43%)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-08-22 10:23:34 -07:00
Matt Turner
e0aa45768c i965: Add invalidate_cfg parameter to invalidate_live_intervals().
Will let us avoid invalidating the CFG if the optimization pass has
removed instructions using the new basic block methods.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-08-22 10:23:34 -07:00
Matt Turner
3d6d4dc6f7 i965: Add basic-block aware backend_instruction::insert_* methods.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-08-22 10:23:34 -07:00
Matt Turner
dc527fbf7d i965: Add a basic-block aware backend_instruction::remove method.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-08-22 10:23:33 -07:00
Matt Turner
240adc1346 i965/cfg: Add a function to remove a block from the cfg.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-08-22 10:23:33 -07:00
Matt Turner
b7d50beea4 i965/cfg: Add functions to test if a block is a successor/predecessor.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-08-22 10:23:33 -07:00
Eric Anholt
e51e20c35e vc4: Add support for fragment discards.
Fixes piglit glsl-fs-discard-01 and -03, and allows a lot of mesa demos to
start running.  glsl-fs-discard-02 has a problem where the first tile is
not getting stored on the first render.
2014-08-22 10:16:58 -07:00
Eric Anholt
0f894b2795 vc4: Make some helpers for setting condition codes in instructions. 2014-08-22 10:16:58 -07:00
Eric Anholt
cc68be2620 vc4: Avoid using undefined values when there's no color write.
The simulator assertion fails when you read-before-write a temporary
value, and there's no point in doing the packing if there was no color
written.
2014-08-22 10:16:58 -07:00
Eric Anholt
ae83955b1d vc4: Emit the scoreboard wait just when it's needed.
This should improve performance on real hardware by allowing more shader
instances to run in parallel.  It also fixes assertion failures in tests
that don't emit a fragment color, since otherwise we didn't have enough
instructions to fit our signals in.
2014-08-22 10:16:58 -07:00
Eric Anholt
c3c922289b vc4: Fix FLR for integer values less than 0.
If we didn't truncate at all, then we don't need to fix for truncation
happening in the wrong direction.

Fixes piglit builtin-functions/*-floor-*
2014-08-22 10:16:57 -07:00
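The FLR fix in c3c922289b follows from how a floor can be built on truncation toward zero. The following is a minimal Python sketch of that arithmetic, not the vc4 QPU code; floor_via_trunc is a hypothetical name.

```python
import math


def floor_via_trunc(x):
    """Floor built from truncation toward zero."""
    t = float(math.trunc(x))
    # Truncation rounds negative non-integers up (toward zero), so step
    # down by one only when truncation actually changed the value.
    return t - 1.0 if t > x else t
```

A negative integer such as -2.0 passes through unchanged, which is the case the commit message describes as broken.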
Eric Anholt
2ab4e48f94 vc4: Fix totally broken assertions about inter-instruction reg conflicts.
The spec citation talked about A and B, and I proceeded to pay no
attention to whether the waddrs were for A or B.  As a result, this pair
of instructions would claim to conflict:

mov ra4, ra4 ; nop nop, r0, r0
mov.ns ra4, rb4 ; nop nop, r0, r0
2014-08-22 10:16:57 -07:00
Eric Anholt
b064c9103d vc4: Add support for all the texture and FBO formats we can.
Now that tiling is in place, we can expose the other formats.  Depth is
still broken (need to make changes in the shader), but if you don't expose
it things crash all over.  SNORM is dropped, but we could re-add it later
with some shader fixes to handle converting between [0,1] and [-1,1].
2014-08-22 10:16:57 -07:00
Eric Anholt
3a1efcc7f9 vc4: Add support for texture tiling.
This still treats everything as RGBA8888 for the most part, same as
before.  This is a prerequisite for handling other texture formats, since
only RGBA8888 has a raster-layout mode.
2014-08-22 10:16:57 -07:00
Eric Anholt
1b6dcaf40c vc4: Fix a typo in the validation for miplevels.
It meant that LUMALPHA was being marked as *many* miplevels, and
unsurprisingly wouldn't validate.  On the other hand, some miplevel counts
wouldn't get the small mips validated at all.
2014-08-22 10:16:57 -07:00
Eric Anholt
74ea87cde4 vc4: Convert to using an enum for texture data types 2014-08-22 10:16:57 -07:00
Eric Anholt
1cb5cfba85 vc4: Stop complaining about unknown texture channel types.
It doesn't matter to this code -- the sampler always returns 8-bit unorm
rgba.
2014-08-22 10:16:57 -07:00
Eric Anholt
b0a1e401a9 vc4: Include stdio/stdlib in headers so I don't have to include it per file.
There are a few tools I want to have always available, and fprintf() and
abort() are among them.
2014-08-22 10:16:57 -07:00
Matt Turner
d77f5603a5 i965: Fix JIP/UIP calculations.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82846
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82929
2014-08-22 09:30:03 -07:00
Aaron Watry
2a553e4dc9 st/clover: Change platform name from Default to Clover
Signed-off-by: Aaron Watry <awatry at gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-08-22 10:02:31 -05:00
Emil Velikov
e7f2f2dea5 dri/radeon: nuke the remaining references to sarea
Remainder of the dri1 times.

Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-08-21 21:47:44 +01:00
Emil Velikov
515ffb6c93 dri/radeon: cleanup the radeon_context vtbl
Remove the set-but-unused, and set-but-empty vtable entries.
Most likely a leftover from the dri1 days.

Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-08-21 21:47:40 +01:00
Emil Velikov
dd46f0926d include: move sarea.h next to it's only user
The header is used by DRI1 drivers, which we've removed a while
back. Now only the dri1 loader in libGL is using it, so let's
move it in src/glx, and prefix it accordingly.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-08-21 21:47:37 +01:00
Emil Velikov
7550a24fa6 dri/radeon: drop obsolete radeon_{dri,macros}.h headers
Both have been unused for at least a couple of years.
For example the last user of radeon_macros.h was removed with

commit 8c11f0a883
Author: Eric Anholt <eric@anholt.net>
Date:   Fri Oct 14 13:27:02 2011 -0700

    radeon: Drop the legacy BO manager code.

Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-08-21 21:47:22 +01:00
Vinson Lee
1748ea8b2b SCons: Rename dri2_query_renderer.c to dri_common_query_renderer.c.
Fix SCons build error introduced with commit
3fe7daec14.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2014-08-21 12:22:18 -07:00
Connor Abbott
06ef631573 glsl/linker: pass through the is_intrinsic flag
This flag was set to true for the atomic counter intrinsics, but it
never got plumbed through the linker, so by the time it got to the
backends it would always be set to false. The current i965 backend
code doesn't use is_intrinsic, so this should not change any existing
code, but it's useful for codepaths that want to distinguish between
intrinsics and non-intrinsics without using strcmp.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
2014-08-21 11:46:13 -07:00
Carl Worth
619505ac7c docs: Update instructions for creating a release
This captures all of the steps I have been following in making releases for
the past year or so. This way, the instructions should be sound for anyone who
would like to take over the release process going forward.
2014-08-21 10:46:02 -07:00
Roland Scheidegger
eb4541ebaf llvmpipe: change LP_MAX_SHADER_INSTRUCTIONS definition
This change will double cache size for branches which have a lower
LP_MAX_SHADER_VARIANTS limit (it will not do anything on master).
The reason is that nowadays shaders tend to be quite a bit larger than they
were (they were big when llvmpipe didn't have a fs loop, got much smaller with
that loop, and since then have gradually increased quite a bit though still
smaller than without the fs loop for various reasons - among them being d3d10
compliance, usage of 8-wide vectors, non-swizzled blend code). Thus effectively
less shaders would be cached (unless they were very small and the variant limit
was hit first). Also, since we're getting rid of the IR nowadays, the cached
shaders shouldn't need all that much memory actually.
2014-08-21 19:00:29 +02:00
Carl Worth
399b4e2227 docs: Add my notes on stable-branch patch criteria
This captures the set of rules I have been using for stable-branch management,
(starting with a discussion on the mesa-dev mailing list on July 2013, and
then refined through my own experience of performing stable-branch releases
since then).
2014-08-21 09:46:57 -07:00
Carl Worth
46d03d37bf Makefile: Switch from md5sums to sha256sums
We switched to these several stable releases ago, (since the MD5 algorithm has
been broken for some time), but only now did I get around to fixing this in
the Makefile rather than just performing this step manually.

CC: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-08-21 09:05:01 -07:00
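For readers who want to check a downloaded archive against the published sums without the sha256sum tool, the same digest can be computed with Python's standard hashlib module. This is a sketch; sha256_of is a hypothetical helper, not part of the Mesa Makefile.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The hex string returned for an archive should equal the first field that sha256sum prints for the same file.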
Jon TURNEY
3fe7daec14 glx: Fix build since 679c2ef "glx/drisw: add support for DRI2rendererQueryExtension", when only building drisw renderer
v2:
- Move dri*_query_renderer_* into their respective dri*_priv.h headers
- Drop the then-unneeded include of dri2.h from dri2_query_renderer.c
- Rename dri2_query_renderer.c as dri_common_query_renderer.c, as its contents
are now used for more than dri[23]

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-21 16:59:48 +01:00
Carl Worth
ea565108ae Increment version to 10.4.0-devel
Now that the 10.3 branch has been created
2014-08-21 08:38:24 -07:00
Alex Deucher
153df68834 radeonsi: add new SI pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2014-08-21 11:16:15 -04:00
Alex Deucher
f50b6b4895 radeonsi: add new CIK pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2014-08-21 11:13:17 -04:00
1245 changed files with 91860 additions and 37271 deletions

View File

@@ -81,6 +81,7 @@ SUBDIRS := \
src/mapi \
src/glsl \
src/mesa \
src/util \
src/egl/main
ifeq ($(strip $(MESA_BUILD_CLASSIC)),true)

View File

@@ -64,14 +64,13 @@ IGNORE_FILES = \
parsers: configure
$(MAKE) -C src/glsl glsl_parser.cpp glsl_parser.h glsl_lexer.cpp glcpp/glcpp-lex.c glcpp/glcpp-parse.c glcpp/glcpp-parse.h
$(MAKE) -C src/mesa program/lex.yy.c program/program_parse.tab.c program/program_parse.tab.h
# Everything for a new Mesa release:
ARCHIVES = $(PACKAGE_NAME).tar.gz \
$(PACKAGE_NAME).tar.bz2 \
$(PACKAGE_NAME).zip
tarballs: md5
tarballs: checksums
rm -f ../$(PACKAGE_DIR) $(PACKAGE_NAME).tar
manifest.txt: .git
@@ -98,9 +97,9 @@ $(PACKAGE_NAME).zip: parsers ../$(PACKAGE_DIR) manifest.txt
zip -q -@ $(PACKAGE_NAME).zip < $(PACKAGE_DIR)/manifest.txt ; \
mv $(PACKAGE_NAME).zip $(PACKAGE_DIR)
md5: $(ARCHIVES)
@-md5sum $(PACKAGE_NAME).tar.gz
@-md5sum $(PACKAGE_NAME).tar.bz2
@-md5sum $(PACKAGE_NAME).zip
checksums: $(ARCHIVES)
@-sha256sum $(PACKAGE_NAME).tar.gz
@-sha256sum $(PACKAGE_NAME).tar.bz2
@-sha256sum $(PACKAGE_NAME).zip
.PHONY: tarballs md5
.PHONY: tarballs checksums

View File

@@ -1 +1 @@
10.3.0-devel
10.4.7

18
bin/.cherry-ignore Normal file
View File

@@ -0,0 +1,18 @@
# No whitespace commits in stable.
a10bf5c10caf27232d4df8da74d5c35c23eb883d
# The following patches address code which is missing in 10.4
# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078515.html
06084652fefe49c3d6bf1b476ff74ff602fdc22a common: Correct texture init for meta pbo uploads and downloads.
# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078547.html
ccc5ce6f72c1ec86be4dfcef96c0b51fba0faa6d common: Correct PBO 2D_ARRAY handling.
# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078549.html
546aba143d13ba3f993ead4cc30b2404abfc0202 common: Fix PBOs for 1D_ARRAY.
# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078501.html
2b2fa1865248c6e3b7baec81c4f92774759b201f mesa: Indent break statements and add a missing one.
# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078502.html
87109acbed9c9b52f33d58ca06d9048d0ac7a215 mesa: Free memory allocated for luminance in readpixels.

View File

@@ -14,7 +14,7 @@ git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)' HEAD..origin/master |\
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*10\.4.*mesa-stable\)' HEAD..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.
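The effect of the pattern change in the hunk above can be checked against tag lines that appear in the commit log on this page. The sketch below approximates the two grep patterns in Python (grep -i becomes re.IGNORECASE, and each message line is tested separately); the names OLD, NEW and nominated are hypothetical.

```python
import re

# Approximate Python translation of the old and new --grep patterns.
# Only the second alternative changed: it now requires "10.4" before
# "mesa-stable" on the CC line.
OLD = re.compile(r"^(?:\s*NOTE: .*candidate|CC:.*mesa-stable)", re.IGNORECASE)
NEW = re.compile(r"^(?:\s*NOTE: .*candidate|CC:.*10\.4.*mesa-stable)", re.IGNORECASE)


def nominated(message, pattern):
    """Return True if any line of the commit message matches the pattern."""
    return any(pattern.search(line) for line in message.splitlines())
```

Under these assumptions, a line such as `Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>` is matched by both patterns, while a bare `Cc: mesa-stable@lists.freedesktop.org` or a `CC: "10.2 10.3" <mesa-stable@lists.freedesktop.org>` line is matched only by the old one.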

File diff suppressed because it is too large

View File

@@ -18,7 +18,7 @@ are exposed in the 3.0 context as extensions.
Feature Status
----------------------------------------------------- ------------------------
GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi
GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe (*), softpipe (*)
glBindFragDataLocation, glGetFragDataLocation DONE
Conditional rendering (GL_NV_conditional_render) DONE (r300, swrast)
@@ -47,8 +47,10 @@ GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi
GLX_ARB_create_context (GLX 1.4 is required) DONE
Multisample anti-aliasing DONE (r300)
(*) llvmpipe and softpipe have fake Multisample anti-aliasing support
GL 3.1, GLSL 1.40 --- all DONE: i965, nv50, nvc0, r600, radeonsi
GL 3.1, GLSL 1.40 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
Forward compatible context support/deprecations DONE ()
Instanced drawing (GL_ARB_draw_instanced) DONE (swrast)
@@ -61,7 +63,7 @@ GL 3.1, GLSL 1.40 --- all DONE: i965, nv50, nvc0, r600, radeonsi
Signed normalized textures (GL_EXT_texture_snorm) DONE (r300)
GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi
GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
Core/compatibility profiles DONE
Geometry shaders DONE ()
@@ -76,9 +78,9 @@ GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi
GLX_ARB_create_context_profile DONE
GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi
GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
GL_ARB_blend_func_extended DONE (softpipe)
GL_ARB_blend_func_extended DONE ()
GL_ARB_explicit_attrib_location DONE (all drivers that support GLSL)
GL_ARB_occlusion_query2 DONE (r300, swrast)
GL_ARB_sampler_objects DONE (all drivers)
@@ -92,27 +94,27 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi
GL 4.0, GLSL 4.00:
GL_ARB_draw_buffers_blend DONE (i965, nv50, nvc0, r600, radeonsi, softpipe)
GL_ARB_draw_indirect DONE (i965, nvc0, radeonsi, softpipe, llvmpipe)
GL_ARB_draw_buffers_blend DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_draw_indirect DONE (i965, nvc0, radeonsi, llvmpipe, softpipe)
GL_ARB_gpu_shader5 DONE (i965, nvc0)
- 'precise' qualifier DONE
- Dynamically uniform sampler array indices DONE ()
- Dynamically uniform UBO array indices DONE ()
- Dynamically uniform sampler array indices DONE (r600)
- Dynamically uniform UBO array indices DONE (r600)
- Implicit signed -> unsigned conversions DONE
- Fused multiply-add DONE ()
- Packing/bitfield/conversion functions DONE (r600)
- Enhanced textureGather DONE (r600, radeonsi)
- Geometry shader instancing DONE ()
- Geometry shader instancing DONE (r600)
- Geometry shader multiple streams DONE ()
- Enhanced per-sample shading DONE (r600)
- Interpolation functions DONE ()
- Interpolation functions DONE (r600)
- New overload resolution rules DONE
GL_ARB_gpu_shader_fp64 started (Dave)
GL_ARB_sample_shading DONE (i965, nv50, nvc0, radeonsi)
GL_ARB_sample_shading DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_shader_subroutine not started
GL_ARB_tessellation_shader started (Fabian)
GL_ARB_texture_buffer_object_rgb32 DONE (i965, nvc0, r600, radeonsi, softpipe)
GL_ARB_texture_cube_map_array DONE (i965, nv50, nvc0, r600, radeonsi, softpipe)
GL_ARB_tessellation_shader started (Chris, Ilia)
GL_ARB_texture_buffer_object_rgb32 DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_texture_cube_map_array DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_texture_gather DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_texture_query_lod DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_transform_feedback2 DONE (i965, nv50, nvc0, r600, radeonsi)
@@ -121,12 +123,12 @@ GL 4.0, GLSL 4.00:
GL 4.1, GLSL 4.10:
GL_ARB_ES2_compatibility DONE (i965, nv50, nvc0, r300, r600, radeonsi)
GL_ARB_ES2_compatibility DONE (i965, nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_get_program_binary DONE (0 binary formats)
GL_ARB_separate_shader_objects DONE (all drivers)
GL_ARB_shader_precision started (Micah)
GL_ARB_vertex_attrib_64bit started (Dave)
GL_ARB_viewport_array DONE (i965, nv50, nvc0, r600)
GL_ARB_viewport_array DONE (i965, nv50, nvc0, r600, llvmpipe)
GL 4.2, GLSL 4.20:
@@ -136,11 +138,11 @@ GL 4.2, GLSL 4.20:
GL_ARB_shader_atomic_counters DONE (i965)
GL_ARB_texture_storage DONE (all drivers)
GL_ARB_transform_feedback_instanced DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_base_instance DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_base_instance DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_shader_image_load_store in progress (curro)
GL_ARB_conservative_depth DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_420pack DONE (all drivers that support GLSL 1.30)
GL_ARB_internalformat_query DONE (i965, nv50, nvc0, r300, r600, radeonsi)
GL_ARB_internalformat_query DONE (i965, nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_map_buffer_alignment DONE (all drivers)
@@ -149,58 +151,58 @@ GL 4.3, GLSL 4.30:
GL_ARB_arrays_of_arrays started (Timothy)
GL_ARB_ES3_compatibility DONE (all drivers that support GLSL 3.30)
GL_ARB_clear_buffer_object DONE (all drivers)
GL_ARB_compute_shader started (currently stalled)
GL_ARB_compute_shader in progress (jljusten)
GL_ARB_copy_image DONE (i965)
GL_KHR_debug DONE (all drivers)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_fragment_layer_viewport DONE (nv50, nvc0, r600)
GL_ARB_fragment_layer_viewport DONE (nv50, nvc0, r600, llvmpipe)
GL_ARB_framebuffer_no_attachments not started
GL_ARB_internalformat_query2 not started
GL_ARB_invalidate_subdata DONE (all drivers)
GL_ARB_multi_draw_indirect DONE (i965, nvc0, radeonsi, softpipe, llvmpipe)
GL_ARB_multi_draw_indirect DONE (i965, nvc0, radeonsi, llvmpipe, softpipe)
GL_ARB_program_interface_query not started
GL_ARB_robust_buffer_access_behavior not started
GL_ARB_shader_image_size not started
GL_ARB_shader_storage_buffer_object not started
GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, nvc0, r600, radeonsi)
GL_ARB_texture_buffer_range DONE (nv50, nvc0, i965, r600, radeonsi)
GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_texture_buffer_range DONE (nv50, nvc0, i965, r600, radeonsi, llvmpipe)
GL_ARB_texture_query_levels DONE (all drivers that support GLSL 1.30)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_texture_view DONE (i965)
GL_ARB_texture_view DONE (i965, nv50, nvc0)
GL_ARB_vertex_attrib_binding DONE (all drivers)
GL 4.4, GLSL 4.40:
GL_MAX_VERTEX_ATTRIB_STRIDE not started
GL_MAX_VERTEX_ATTRIB_STRIDE DONE (all drivers)
GL_ARB_buffer_storage DONE (i965, nv30, nv50, nvc0, r300, r600, radeonsi)
GL_ARB_clear_texture DONE (i965)
GL_ARB_enhanced_layouts not started
GL_ARB_multi_bind DONE (all drivers)
GL_ARB_query_buffer_object not started
GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv30, nv50, nvc0, r300, r600, radeonsi, swrast)
GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv30, nv50, nvc0, r300, r600, radeonsi, swrast, llvmpipe, softpipe)
GL_ARB_texture_stencil8 not started
GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL 4.5, GLSL 4.50:
GL_ARB_ES3_1_compatibility not started
GL_ARB_clip_control not started
GL_ARB_conditional_render_inverted DONE (i965, nvc0, softpipe, llvmpipe)
GL_ARB_clip_control DONE (nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_conditional_render_inverted DONE (i965, nv50, nvc0, llvmpipe, softpipe)
GL_ARB_cull_distance not started
GL_ARB_derivative_control DONE (i965, nv50, nvc0, r600)
GL_ARB_direct_state_access not started
GL_ARB_get_texture_sub_image started (Brian Paul)
GL_ARB_shader_texture_image_samples not started
GL_ARB_texture_barrier DONE (nv50, nvc0, r300, r600, radeonsi)
GL_KHR_context_flush_control not started
GL_KHR_context_flush_control DONE (all - but needs GLX/EXT extension to be useful)
GL_KHR_robust_buffer_access_behavior not started
GL_KHR_robustness 90% done (the ARB variant)
These are the extensions cherry-picked to make GLES 3.1
GLES3.1, GLSL ES 3.1
GL_ARB_arrays_of_arrays started (Timothy)
GL_ARB_compute_shader started (currently stalled)
GL_ARB_compute_shader in progress (jljusten)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_framebuffer_no_attachments not started
GL_ARB_program_interface_query not started
@@ -208,7 +210,7 @@ GLES3.1, GLSL ES 3.1
GL_ARB_shader_image_load_store in progress (curro)
GL_ARB_shader_storage_buffer_object not started
GL_ARB_separate_shader_objects DONE (all drivers)
GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, nvc0, r600, radeonsi)
GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_vertex_attrib_binding DONE (all drivers)
GS5 Enhanced textureGather DONE (i965, nvc0, r600, radeonsi)
GS5 Packing/bitfield/conversion functions DONE (i965, nvc0, r600)

View File

@@ -218,15 +218,93 @@ commit ID of the commit of interest (as it appears in the mesa master branch).
The latest set of patches that have been nominated, accepted, or rejected for
the upcoming stable release can always be seen on the
<a href=http://cworth.org/~cworth/mesa-stable-queue/">Mesa Stable Queue</a>
<a href="http://cworth.org/~cworth/mesa-stable-queue/">Mesa Stable Queue</a>
page.
<h2>Cherry-picking candidates for a stable branch</h2>
<h2>Criteria for accepting patches to the stable branch</h2>
<p>
Please use <code>git cherry-pick -x &lt;commit&gt;</code> for cherry-picking a commit
from master to a stable branch.
</p>
Mesa has a designated release manager for each stable branch, and the release
manager is the only developer that should be pushing changes to these
branches. Everyone else should simply nominate patches using the mechanism
described above.
The stable-release manager will work with the list of nominated patches, and
for each patch that meets the criteria below will cherry-pick the patch with:
<code>git cherry-pick -x &lt;commit&gt;</code>. The <code>-x</code> option is
important so that the picked patch references the commit ID of the original
patch.
The stable-release manager may at times need to force-push changes to the
stable branches (for example, to drop a previously-picked patch that was later
identified as causing a regression). These force-pushes may cause changes to
be lost from the stable branch if developers push things directly. Consider
yourself warned.
The stable-release manager is also given broad discretion in rejecting patches
that have been nominated for the stable branch. The most basic rule is that
the stable branch is for bug fixes only, (no new features, no
regressions). Here is a non-exhaustive list of some reasons that a patch may
be rejected:
<ul>
<li>Patch introduces a regression. Any reported build breakage or other
regression caused by a particular patch (a game no longer works, a piglit test
changes from PASS to FAIL) is justification for rejecting a patch.</li>
<li>Patch is too large, (say, larger than 100 lines)</li>
<li>Patch is not a fix. For example, a commit that moves code around with no
functional change should be rejected.</li>
<li>Patch fix is not clearly described. For example, a commit message
of only a single line, no description of the bug, no mention of bugzilla,
etc.</li>
<li>Patch has not obviously been reviewed. For example, the commit message
has no Reviewed-by, Signed-off-by, nor Tested-by tags from anyone but the
author.</li>
<li>Patch has not already been merged to the master branch. As a rule, bug
fixes should never be applied first to a stable branch. Patches should land
first on the master branch and then be cherry-picked to a stable
branch. (This is to avoid future releases causing regressions if the patch
is not also applied to master.) The only things that might look like
exceptions would be backports of patches from master that happen to look
significantly different.</li>
<li>Patch depends on too many other patches. Ideally, all stable-branch
patches should be self-contained. It sometimes occurs that a single, logical
bug-fix occurs as two separate patches on master, (such as an original
patch, then a subsequent fix-up to that patch). In such a case, these two
patches should be squashed into a single, self-contained patch for the
stable branch. (Of course, if the squashing makes the patch too large, then
that could be a reason to reject the patch.)</li>
<li>Patch includes new feature development, not bug fixes. New OpenGL
features, extensions, etc. should be applied to Mesa master and included in
the next major release. Stable releases are intended only for bug fixes.
Note: As an exception to this rule, the stable-release manager may accept
hardware-enabling "features". For example, backports of new code to support
a newly-developed hardware product can be accepted if they can be reasonably
determined to not have effects on other hardware.</li>
<li>Patch is a performance optimization. As a rule, performance patches are
not candidates for the stable branch. The only exception might be a case
where an application's performance was recently severely impacted so as to
become unusable. The fix for this performance regression could then be
considered for a stable branch. The optimization must also be
non-controversial and the patches still need to meet the other criteria of
being simple and self-contained.</li>
<li>Patch introduces a new failure mode (such as an assert). While the new
assert might technically be correct, for example to make Mesa more
conformant, this is not the kind of "bug fix" we want in a stable
release. The potential problem here is that an OpenGL program that was
previously working, (even if technically non-compliant with the
specification), could stop working after this patch. So that would be a
regression that is unacceptable for the stable branch.</li>
</ul>
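<p>
Two of the criteria above (patch size and evidence of review) lend themselves
to a quick mechanical pre-check. The following is only a sketch: the helper
names are hypothetical, the 100-line limit is the illustrative figure from the
list, and the remaining criteria still need human judgment.
</p>

```shell
# Hypothetical helper: count added/removed lines in a unified diff on stdin.
# The "+++ " and "--- " file-header lines are excluded from the count.
patch_changed_lines() {
    grep -E '^[+-]' | grep -Evc '^(\+\+\+ |--- )'
}

# Succeed if the diff on stdin changes at most $1 lines (default 100).
patch_is_small() {
    [ "$(patch_changed_lines)" -le "${1:-100}" ]
}

# Succeed if the commit message on stdin carries a Reviewed-by, Signed-off-by
# or Tested-by tag on a line that does not mention the author ($1).
has_independent_tag() {
    author=$1
    grep -E '^(Reviewed-by|Signed-off-by|Tested-by): ' | grep -Fvq "$author"
}

# Example with a hand-written diff:
printf '%s\n' '--- a/file.c' '+++ b/file.c' '@@ -1,2 +1,2 @@' \
    '-int coords[4];' '+int coords[5];' ' return 0;' | patch_changed_lines
```

<p>
The diff and message would normally come from
<code>git show &lt;commit&gt;</code> and
<code>git log -1 --format=%B &lt;commit&gt;</code> respectively.
</p>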
<h2>Making a New Mesa Release</h2>
@@ -237,64 +315,205 @@ These are the instructions for making a new Mesa release.
<h3>Get latest source files</h3>
<p>
Use git to get the latest Mesa files from the git repository, from whatever
branch is relevant.
branch is relevant. This document uses the convention X.Y.Z for the release
being created, which should be created from a branch named X.Y.
</p>
<h3>Verify and update version info in VERSION</h3>
<h3>Perform basic testing</h3>
<p>
Create a docs/relnotes/x.y.z.html file.
The bin/bugzilla_mesa.sh and bin/shortlog_mesa.sh scripts can be used to
create the HTML-formatted lists of bugfixes and changes to include in the file.
Link the new docs/relnotes/x.y.z.html file into the main <a href="relnotes.html">relnotes.html</a> file.
The release manager should, at the very least, test the code by compiling it,
installing it, and running the latest piglit to ensure that no piglit tests
have regressed since the previous release.
</p>
<p>
Update <a href="index.html">docs/index.html</a>.
The release manager should do this testing with at least one hardware driver,
(say, whatever is contained in the local development machine), as well as on
both Gallium and non-Gallium software drivers. The software testing can be
performed by running piglit with the following environment-variable set:
</p>
<p>
Tag the files with the release name (in the form <b>mesa-x.y</b>)
with: <code>git tag -s mesa-x.y -m "Mesa x.y Release"</code>
Then: <code>git push origin mesa-x.y</code>
</p>
<h3>Make the tarballs</h3>
<p>
Make the distribution files. From inside the Mesa directory:
<pre>
./autogen.sh
make tarballs
LIBGL_ALWAYS_SOFTWARE=1
</pre>
And Gallium vs. non-Gallium software drivers can be obtained by using the
following configure flags on separate builds:
<pre>
--with-dri-drivers=swrast
--with-gallium-drivers=swrast
</pre>
<p>
After the tarballs are created, the md5 checksums for the files will
be computed.
Add them to the docs/relnotes/x.y.html file.
Note: If both options are given in one build, both swrast_dri.so drivers will
be compiled, but only one will be installed. The following command can be used
to ensure the correct driver is being tested:
</p>
<pre>
LIBGL_ALWAYS_SOFTWARE=1 glxinfo | grep "renderer string"
</pre>
If any regressions are found in this testing with piglit, stop here, and do
not perform a release until regressions are fixed.
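<p>
The regression check can be pictured as a comparison of two result sets. The
sketch below assumes a flat "test-name result" text format purely for
illustration (this is not piglit's own output format), and the helper name
<code>find_regressions</code> is hypothetical.
</p>

```shell
# List tests that went from "pass" to "fail" between two result summaries.
# Each input file holds lines of the form "<test-name> <result>"; this flat
# format is a stand-in for illustration only.
find_regressions() {
    awk 'NR == FNR { before[$1] = $2; next }
         before[$1] == "pass" && $2 == "fail" { print $1 }' "$1" "$2"
}

# Example with scratch data:
workdir=$(mktemp -d)
printf 'glsl-fs-uniform-bool-1 pass\npolygon-offset pass\n' > "$workdir/before.txt"
printf 'glsl-fs-uniform-bool-1 fail\npolygon-offset pass\n' > "$workdir/after.txt"
find_regressions "$workdir/before.txt" "$workdir/after.txt"
```

<p>
Any test name printed by such a comparison would be a reason to stop the
release.
</p>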
<h3>Update version in file VERSION</h3>
<p>
Increment the version contained in the file VERSION at Mesa's top-level, then
commit this change.
</p>
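<p>
A minimal sketch of the increment, assuming the VERSION file holds a single
line of the form X.Y.Z. The helper name <code>next_version</code> is
hypothetical, and the example works on a scratch directory rather than a real
checkout. It does not handle a bare X.Y version.
</p>

```shell
# Hypothetical helper: print the next point release for an X.Y.Z string.
next_version() {
    prefix=${1%.*}      # everything before the last dot (X.Y)
    patch=${1##*.}      # everything after the last dot (Z)
    echo "$prefix.$((patch + 1))"
}

# Example: bump a VERSION file in a scratch directory.
workdir=$(mktemp -d)
echo "10.4.6" > "$workdir/VERSION"
next_version "$(cat "$workdir/VERSION")" > "$workdir/VERSION.new"
mv "$workdir/VERSION.new" "$workdir/VERSION"
cat "$workdir/VERSION"
```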
<h3>Create release notes for the new release</h3>
<p>
Create a new file docs/relnotes/X.Y.Z.html, (follow the style of the previous
release notes). Note that the sha256sums section of the release notes should
be empty at this point.
</p>
<p>
Copy the distribution files to a temporary directory, unpack them,
compile everything, and run some demos to be sure everything works.
</p>
Two scripts are available to help generate portions of the release notes:
<pre>
./bin/bugzilla_mesa.sh
./bin/shortlog_mesa.sh
</pre>
<h3>Update the website and announce the release</h3>
<p>
Make a new directory for the release on annarchy.freedesktop.org with:
<br>
<code>
mkdir /srv/ftp.freedesktop.org/pub/mesa/x.y
</code>
The first script identifies commits that reference bugzilla bugs and obtains
the descriptions of those bugs from bugzilla. The second script generates a
log of all commits. In both cases, HTML-formatted lists are printed to stdout
to be included in the release notes.
</p>
<p>
Basically, to upload the tarball files with:
<br>
<code>
rsync -avP -e ssh MesaLib-x.y.* USERNAME@annarchy.freedesktop.org:/srv/ftp.freedesktop.org/pub/mesa/x.y/
</code>
Commit these changes.
</p>
<h3>Make the release archives, signatures, and the release tag</h3>
<p>
From inside the Mesa directory:
<pre>
./autogen.sh
make -j1 tarballs
</pre>
<p>
After the tarballs are created, the sha256 checksums for the files will
be computed and printed. These will be used in a step below.
</p>
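<p>
If the checksums need to be recomputed or re-checked later, something like the
following could be used. This is a sketch that assumes the
<code>sha256sum</code> tool from GNU coreutils is available; the helper names
and the <code>SHA256SUMS</code> file name are hypothetical.
</p>

```shell
# Hypothetical helper: record checksums for every MesaLib-* file in
# directory $1, one "hash  filename" line per archive.
write_checksums() {
    ( cd "$1" && sha256sum MesaLib-* > SHA256SUMS )
}

# Hypothetical helper: re-check the archives in directory $1 against the
# recorded checksums; fails if any file has changed.
verify_checksums() {
    ( cd "$1" && sha256sum -c SHA256SUMS )
}

# Example with a stand-in archive in a scratch directory:
workdir=$(mktemp -d)
printf 'demo\n' > "$workdir/MesaLib-X.Y.Z.tar.gz"
write_checksums "$workdir"
verify_checksums "$workdir"
```

<p>
Running the verification again after uploading is a cheap way to confirm that
the published files match the sums in the release notes.
</p>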
<p>
It's important at this point to also verify that the constructed tar file
actually builds:
</p>
<pre>
tar xjf MesaLib-X.Y.Z.tar.bz2
cd Mesa-X.Y.Z
./configure --enable-gallium-llvm
make -j6
make install
</pre>
<p>
Some touch testing should also be performed at this point, (run glxgears or
more involved OpenGL programs against the installed Mesa).
</p>
<p>
Create detached GPG signatures for each of the archive files created above:
</p>
<pre>
gpg --sign --detach MesaLib-X.Y.Z.tar.gz
gpg --sign --detach MesaLib-X.Y.Z.tar.bz2
gpg --sign --detach MesaLib-X.Y.Z.zip
</pre>
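<p>
The three signing commands can be generated in a loop. The sketch below only
prints the commands (a dry run), since actually signing needs gpg and a secret
key. The helper name is hypothetical, and the <code>.sig</code> suffix assumes
gpg's default output name for a binary detached signature.
</p>

```shell
# Print (rather than run) the signing and verification commands for one
# release version. Drop the echo wrappers to execute them for real.
print_sign_commands() {
    version=$1
    for ext in tar.gz tar.bz2 zip; do
        archive="MesaLib-$version.$ext"
        echo "gpg --sign --detach $archive"
        echo "gpg --verify $archive.sig $archive"
    done
}

print_sign_commands X.Y.Z
```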
<p>
Tag the commit used for the build:
</p>
<pre>
git tag -s mesa-X.Y.Z -m "Mesa X.Y.Z release"
</pre>
<p>
Note: It would be nice to investigate and fix the issue that causes the
tarballs target to fail with multiple build processes, such as with "-j4". It
would also be nice to incorporate all of the above commands into a single
makefile target. And instead of a custom "tarballs" target, we should
incorporate things into the standard "make dist" and "make distcheck" targets.
</p>
<h3>Add the sha256sums to the release notes</h3>
<p>
Edit docs/relnotes/X.Y.Z.html to add the sha256sums printed as part of "make
tarballs" in the previous step. Commit this change.
</p>
<h3>Push all commits and the tag created above</h3>
<p>
This is the first step that cannot easily be undone. The release is going
forward from this point:
</p>
<pre>
git push origin X.Y --tags
</pre>
<h3>Install the release files and signatures on the distribution server</h3>
<p>
The following commands can be used to copy the release archive files and
signatures to the freedesktop.org server:
</p>
<pre>
scp MesaLib-X.Y.Z* people.freedesktop.org:
ssh people.freedesktop.org
cd /srv/ftp.freedesktop.org/pub/mesa
mkdir X.Y.Z
cd X.Y.Z
mv ~/MesaLib-X.Y.Z* .
</pre>
<h3>Back on mesa master, add the new release notes into the tree</h3>
<p>
Something like the following steps will do the trick:
</p>
<pre>
cp docs/relnotes/X.Y.Z.html /tmp
git checkout master
cp /tmp/X.Y.Z.html docs/relnotes
git add docs/relnotes/X.Y.Z.html
</pre>
<p>
Also, edit docs/relnotes.html to add a link to the new release notes, and edit
docs/index.html to add a news entry. Then commit and push:
</p>
<pre>
git commit -a -m "docs: Import X.Y.Z release notes, add news item."
git push origin
</pre>
<h3>Update the mesa3d.org website</h3>
<p>
NOTE: The recent release managers have not been performing this step
themselves, but leaving this to Brian Paul, (who has access to the
sourceforge.net hosting for mesa3d.org). Brian is more than willing to grant
the permission necessary to future release managers to do this step on their
own.
</p>
<p>
@@ -306,13 +525,22 @@ sftp USERNAME,mesa3d@web.sourceforge.net
</code>
</p>
<h3>Announce the release</h3>
<p>
Make an announcement on the mailing lists:
<em>mesa-dev@lists.freedesktop.org</em>,
<em>mesa-users@lists.freedesktop.org</em>
and
<em>mesa-announce@lists.freedesktop.org</em>.
Follow the template of previously-sent release announcements. The following
command can be used to generate the log of changes to be included in the
release announcement:
<pre>
git shortlog mesa-X.Y.Z-1..mesa-X.Y.Z
</pre>
</p>
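<p>
In the command above, "Z-1" stands for the previous point release. A small
sketch of computing the range, assuming the previous release was tagged
<code>mesa-X.Y.(Z-1)</code>; the helper name <code>shortlog_range</code> is
hypothetical, and the first point release of a series (where the previous tag
may be named differently) is not handled.
</p>

```shell
# Hypothetical helper: print the tag range for "git shortlog" given the new
# release version X.Y.Z, assuming the previous tag is mesa-X.Y.(Z-1).
shortlog_range() {
    prefix=${1%.*}      # X.Y
    patch=${1##*.}      # Z
    echo "mesa-$prefix.$((patch - 1))..mesa-$1"
}

shortlog_range 10.4.7
```

<p>
The printed range can then be passed to <code>git shortlog</code> directly.
</p>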
</div>


@@ -77,13 +77,6 @@ drivers will be installed to <code>${libdir}/egl</code>.</p>
</dd>
<dt><code>--enable-gallium-egl</code></dt>
<dd>
<p>Enable the optional <code>egl_gallium</code> driver.</p>
</dd>
<dt><code>--with-egl-platforms</code></dt>
<dd>


@@ -16,6 +16,54 @@
<h1>News</h1>
<h2>December 14, 2014</h2>
<p>
<a href="relnotes/10.4.html">Mesa 10.4</a> is released. This is a new
development release. See the release notes for more information about
the release.
</p>
<h2>November 8, 2014</h2>
<p>
<a href="relnotes/10.3.3.html">Mesa 10.3.3</a> is released.
This is a bug-fix release.
</p>
<h2>October 24, 2014</h2>
<p>
<a href="relnotes/10.3.2.html">Mesa 10.3.2</a> is released.
This is a bug-fix release.
</p>
<h2>October 12, 2014</h2>
<p>
<a href="relnotes/10.2.9.html">Mesa 10.2.9</a>
and <a href="relnotes/10.3.1.html">Mesa 10.3.1</a> are released.
These are bug-fix releases from the 10.2 and 10.3 branches, respectively.
<br>
NOTE: It is anticipated that 10.2.9 will be the final release in the 10.2
series. Users of 10.2 are encouraged to migrate to the 10.3 series in order
to obtain future fixes.
</p>
<h2>September 19, 2014</h2>
<p>
<a href="relnotes/10.3.html">Mesa 10.3</a> is released. This is a new
development release. See the release notes for more information about
the release.
</p>
<p>
Also, <a href="relnotes/10.2.8.html">Mesa 10.2.8</a> is released.
This is a bug fix release from the 10.2 branch.
</p>
<h2>September 6, 2014</h2>
<p>
<a href="relnotes/10.2.7.html">Mesa 10.2.7</a> is released.
This is a bug-fix release.
</p>
<h2>August 19, 2014</h2>
<p>
<a href="relnotes/10.2.6.html">Mesa 10.2.6</a> is released.


@@ -43,7 +43,7 @@ It's the fastest software rasterizer for Mesa.
</p>
</li>
<li>
<p>LLVM: version 3.4 recommended; 3.1 or later required.</p>
<p>LLVM: version 3.4 recommended; 3.3 or later required.</p>
<p>
For Linux, on a recent Debian based distribution do:
</p>


@@ -21,6 +21,14 @@ The release notes summarize what's new or changed in each Mesa release.
</p>
<ul>
<li><a href="relnotes/10.4.html">10.4 release notes</a>
<li><a href="relnotes/10.3.3.html">10.3.3 release notes</a>
<li><a href="relnotes/10.3.2.html">10.3.2 release notes</a>
<li><a href="relnotes/10.3.1.html">10.3.1 release notes</a>
<li><a href="relnotes/10.2.9.html">10.2.9 release notes</a>
<li><a href="relnotes/10.3.html">10.3 release notes</a>
<li><a href="relnotes/10.2.8.html">10.2.8 release notes</a>
<li><a href="relnotes/10.2.7.html">10.2.7 release notes</a>
<li><a href="relnotes/10.2.6.html">10.2.6 release notes</a>
<li><a href="relnotes/10.2.5.html">10.2.5 release notes</a>
<li><a href="relnotes/10.2.4.html">10.2.4 release notes</a>

211
docs/relnotes/10.2.7.html Normal file

@@ -0,0 +1,211 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.2.7 Release Notes / September 06, 2014</h1>
<p>
Mesa 10.2.7 is a bug fix release which fixes bugs found since the 10.2.6 release.
</p>
<p>
Mesa 10.2.7 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
cb67dfaabf88acba29aa2cf0dd58ee17b21ebf9594f8d1226c41794da8de3e9d MesaLib-10.2.7.tar.gz
27b958063a4c002071f14ed45c7d2a1ee52cd85e4ac8876e8a1c273495a7d43f MesaLib-10.2.7.tar.bz2
a2796a2d5bbbc2edd22857ecc267cba68dfe5d0296f5d84ba7510877b216cc40 MesaLib-10.2.7.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=36193">Bug 36193</a> - [i965] brw_eu_emit.c:182: validate_reg: Assertion `execsize &gt;= width' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66184">Bug 66184</a> - src/mesa/state_tracker/st_glsl_to_tgsi.cpp:3216:simplify_cmp: Assertion `inst-&gt;dst.index &lt; 4096' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70441">Bug 70441</a> - [Gen4-5 clip] Piglit spec_OpenGL_1.1_polygon-offset hits (execsize &gt;= width) assertion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76188">Bug 76188</a> - EGL_EXT_image_dma_buf_import fd ownership is incorrect</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76789">Bug 76789</a> - [radeonsi] si_descriptors.c requires -std=gnu99 or -fms-extensions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82139">Bug 82139</a> - [r600g, bisected] multiple ubo piglit regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82255">Bug 82255</a> - [VP2] Chroma planes are vertically stretched during VDPAU playback</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82671">Bug 82671</a> - [r600g-evergreen][compute]Empty kernel execution causes crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82709">Bug 82709</a> - OpenCL not working on radeon hainan</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82814">Bug 82814</a> - glDrawBuffers(0, NULL) segfaults in _mesa_drawbuffers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83079">Bug 83079</a> - [NVC0] Dota 2 (Linux native and Wine) crash with Nouveau Drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83355">Bug 83355</a> - FTBFS: src/mesa/program/program_lexer.l:122:64: error: unknown type name 'YYSTYPE'</li>
</ul>
<h2>Changes</h2>
<p>Adam Jackson (1):</p>
<ul>
<li>radeonsi: Don't use anonymous struct trick in atom tracking</li>
</ul>
<p>Alex Deucher (2):</p>
<ul>
<li>radeonsi: add new CIK pci ids</li>
<li>radeonsi: add new SI pci ids</li>
</ul>
<p>Andreas Boll (1):</p>
<ul>
<li>winsys/radeon: fix nop packet padding for hawaii</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>i965: Bail on vec4 copy propagation for scratch writes with source modifiers</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: fix NULL pointer deref bug in _mesa_drawbuffers()</li>
</ul>
<p>Carl Worth (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.2.6 release</li>
<li>Makefile: Switch from md5sums to sha256sums</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>i965: add missing parens in vec4 visitor</li>
</ul>
<p>Emil Velikov (17):</p>
<ul>
<li>configure.ac: bail out if building gallium_gbm without gallium_egl</li>
<li>android: gallium/nouveau: fix include folders, link against libstlport</li>
<li>android: egl/main: fixup the nouveau build</li>
<li>automake: gallium/freedreno: drop spurious include dirs</li>
<li>android: gallium/freedreno: add preliminary build</li>
<li>android: egl/main: add/enable freedreno</li>
<li>android: gallium/auxiliary: drop log2/log2f redefitions</li>
<li>android: drop HAL_PIXEL_FORMAT_RGBA_{5551,4444}</li>
<li>android: glsl: the stlport over the limited Android STL</li>
<li>android: dri/i915: do not build an 'empty' driver</li>
<li>cherry-ignore: remove patch that lacking previous dependencies</li>
<li>cherry-ignore: PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE is not it 10.2</li>
<li>cherry-ignore: drop whitespace fix</li>
<li>cherry-ignore: reject a15088338eb</li>
<li>get-pick-list.sh: Require explicit "10.2" for nominating stable patches</li>
<li>mesa: fix make tarballs</li>
<li>Update VERSION to 10.2.7</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>mesa: Handle uninitialized textures like other textures in get_tex_level_parameter_image</li>
</ul>
<p>Ilia Mirkin (9):</p>
<ul>
<li>nouveau: make sure to invalidate any vbo state as well</li>
<li>nouveau: don't keep stale pointer to free'd data</li>
<li>nvc0/ir: avoid infinite recursion when finding first uses of tex</li>
<li>nv50: zero out unbound samplers</li>
<li>nvc0: don't make 1d staging textures linear</li>
<li>nv50/ir: avoid creating instructions that can't be emitted</li>
<li>nv50: set the miptree address when clearing bo's in vp2 init</li>
<li>nv50: mt address may not be the underlying bo's start address</li>
<li>nv50: attach the buffer bo to the miptree structures</li>
</ul>
<p>Jan Vesely (1):</p>
<ul>
<li>gallivm: Fix build with latest LLVM</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>mesa: Move declaration to top of block.</li>
</ul>
<p>Kenneth Graunke (3):</p>
<ul>
<li>i965/vec4: Set NoMask for GS_OPCODE_SET_VERTEX_COUNT on Gen8+.</li>
<li>i965/vec4: Respect ir-&gt;force_writemask_all in Gen8 code generation.</li>
<li>i965/clip: Fix brw_clip_unfilled.c/compute_offset's assembly.</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>r600g: fix constant buffer fetches</li>
<li>radeonsi: save scissor state and sample mask for u_blitter</li>
<li>glsl_to_tgsi: allocate and enlarge arrays for temporaries on demand</li>
</ul>
<p>Paulo Sergio Travaglia (2):</p>
<ul>
<li>android: gallium/radeon: attempt to fix the android build</li>
<li>android: egl/main: resolve radeon linking issues</li>
</ul>
<p>Pekka Paalanen (1):</p>
<ul>
<li>egl_dri2: fix EXT_image_dma_buf_import fds</li>
</ul>
<p>Robert Bragg (1):</p>
<ul>
<li>meta: save and restore swizzle for _GenerateMipmap</li>
</ul>
<p>Tom Stellard (7):</p>
<ul>
<li>radeon/compute: Fix reported values for MAX_GLOBAL_SIZE and MAX_MEM_ALLOC_SIZE</li>
<li>radeonsi/compute: Update reference counts for buffers in si_set_global_binding()</li>
<li>radeonsi/compute: Call si_pm4_free_state() after emitting compute state</li>
<li>clover: Flush the command queue in clReleaseCommandQueue()</li>
<li>radeon: Add work-around for missing Hainan support in clang &lt; 3.6 v2</li>
<li>pipe-loader: Fix memory leak v2</li>
<li>r600g/compute: Don't initialize vertex_buffer_state masks to 0x2</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>gallivm: Fix build with LLVM &gt;= 3.6 r215967.</li>
</ul>
</div>
</body>
</html>

130
docs/relnotes/10.2.8.html Normal file

@@ -0,0 +1,130 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.2.8 Release Notes / September 19, 2014</h1>
<p>
Mesa 10.2.8 is a bug fix release which fixes bugs found since the 10.2.7 release.
</p>
<p>
Mesa 10.2.8 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
4c5a25ccaf1a9734bbd10d62a1420cc8fd35a1060ce679f2fc846769a25fbeec MesaLib-10.2.8.tar.gz
1ef9ad3f241788d454f2ff8c9d65b6849dfc31c8fe91f70fd2930b81c8af1398 MesaLib-10.2.8.tar.bz2
d26218da3b44734b1d555267b4c63c48803c4c8b14d2bc53071be57014da37fa MesaLib-10.2.8.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77493">Bug 77493</a> - lp_test_arit fails with llvm &gt;= llvm-3.5svn r206094</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82539">Bug 82539</a> - vmw_screen_dri.lo In file included from vmw_screen_dri.c:41: vmwgfx_drm.h:32:17: error: drm.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82882">Bug 82882</a> - [swrast] piglit glsl-fs-uniform-bool-1 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83432">Bug 83432</a> - r600_query.c:269:r600_emit_query_end: Assertion `ctx-&gt;num_pipelinestat_queries &gt; 0' failed [Gallium HUD]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83567">Bug 83567</a> - Mesa 10.2.6 does not compile with llvm 3.5</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83735">Bug 83735</a> - [mesa-10.2.x] broken with llvm-3.5 and old CPUs</li>
</ul>
<h2>Changes</h2>
<p>Aaron Watry (1):</p>
<ul>
<li>gallivm: Fix build after LLVM commit 211259</li>
</ul>
<p>Christoph Bumiller (2):</p>
<ul>
<li>nv50/ir/util: fix BitSet issues</li>
<li>nvc0/ir: clarify recursion fix to finding first tex uses</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: Add sha256 sums for the 10.2.7 release</li>
<li>configure: bail out if building svga without libdrm</li>
<li>Update VERSION to 10.2.8</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>nv50/ir: avoid array overrun when checking for supported mods</li>
<li>nouveau: only enable the depth test if there actually is a depth buffer</li>
<li>nouveau: only enable stencil func if the visual has stencil bits</li>
<li>nouveau: change internal variables to avoid conflicts with macro args</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>configure.ac: strip _GNU_SOURCE from llvm-config output</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>gallivm: Disable workaround for PR12833 on LLVM 3.2+.</li>
</ul>
<p>Maarten Lankhorst (4):</p>
<ul>
<li>nouveau: re-allocate bo's on overflow</li>
<li>nouveau: fix MPEG4 hw decoding</li>
<li>nouveau: rework reference frame handling</li>
<li>nouveau: remove unneeded assert</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>r600g,radeonsi: make sure there's enough CS space before resuming queries</li>
<li>mesa: set UniformBooleanTrue = 1.0f by default</li>
<li>st/mesa: use 1.0f as boolean true on drivers without integer support</li>
</ul>
<p>Richard Sandiford (1):</p>
<ul>
<li>gallivm: Fix uses of 2^24</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>gallivm: set mcpu when initializing llvm execution engine</li>
</ul>
<p>Thomas Hellstrom (1):</p>
<ul>
<li>winsys/svga: Fix incorrect type usage in IOCTL v2</li>
</ul>
</div>
</body>
</html>

101
docs/relnotes/10.2.9.html Normal file

@@ -0,0 +1,101 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.2.9 Release Notes / October 12, 2014</h1>
<p>
Mesa 10.2.9 is a bug fix release which fixes bugs found since the 10.2.8 release.
This is the final planned release for the 10.2 branch.
</p>
<p>
Mesa 10.2.9 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
f8d62857eed8f604a57710c58a8ffcfb8dab2dc4977ec27c956c7c4fd14032f6 MesaLib-10.2.9.tar.gz
f6031f8b7113a92325b60635c504c510490eebb2e707119bbff7bd86aa34657d MesaLib-10.2.9.tar.bz2
11c0ef4f3308fc29d9f15a77fd8f4842a946fce9e830250a1c95b171a446171a MesaLib-10.2.9.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79462">Bug 79462</a> - [NVC0/Codegen] Shader compilation falis in spill logic</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83570">Bug 83570</a> - Glyphy demo throws unhandled Integer division by zero exception</li>
</ul>
<h2>Changes</h2>
<p>Andreas Pokorny (2):</p>
<ul>
<li>egl/drm: expose KHR_image_pixmap extension</li>
<li>i915: Fix black buffers when importing prime fds</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.2.8 release</li>
<li>Update VERSION to 10.2.9</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nv50/ir: avoid deleting pseudo instructions too early</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>radeonsi: release GS rings at context destruction</li>
<li>radeonsi: properly destroy the GS copy shader and scratch_bo for compute</li>
<li>st/dri: remove GALLIUM_MSAA and __GL_FSAA_MODE environment variables</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>gallivm: fix idiv</li>
</ul>
<p>Thomas Hellstrom (1):</p>
<ul>
<li>st/xa: Fix regression in xa_yuv_planar_blit()</li>
</ul>
<p>Tom Stellard (1):</p>
<ul>
<li>configure.ac: Compute LLVM_VERSION_PATCH using llvm-config</li>
</ul>
<p>rconde (1):</p>
<ul>
<li>gallivm,tgsi: fix idiv by zero crash</li>
</ul>
</div>
</body>
</html>


@@ -88,6 +88,8 @@ following options during configure, if you would like support for svga driver
Note: The files are installed in $(libdir)/gallium-pipe/ and the interface
between them and libxatracker.so is <strong>not</strong> stable.
</p>
<li>The environment variable GALLIUM_MSAA that forced a multisample GLX visual was removed.</li>
</ul>
</div>

158
docs/relnotes/10.3.1.html Normal file

@@ -0,0 +1,158 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.3.1 Release Notes / October 12, 2014</h1>
<p>
Mesa 10.3.1 is a bug fix release which fixes bugs found since the 10.3 release.
</p>
<p>
Mesa 10.3.1 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
155afcbad17be8bb80282c761b957d5cc716c14a1fa16c4f5ee04e76df729c6d MesaLib-10.3.1.tar.gz
b081d077d717e5d56f2d59677490856052c41573e50378ff86d6c72456714add MesaLib-10.3.1.tar.bz2
07a14febfed06412d519e091a62d24513fee6745f1a6f8a8f1956bfe04b77d15 MesaLib-10.3.1.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79462">Bug 79462</a> - [NVC0/Codegen] Shader compilation falis in spill logic</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82932">Bug 82932</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.shaders.indexing.vector_subscript.vec3_static_loop_subscript_write_direct_read_vertex fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83506">Bug 83506</a> - [UBO] row_major layout ignored inside structures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83533">Bug 83533</a> - [UBO] nested structures don't get appropriate padding</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83570">Bug 83570</a> - Glyphy demo throws unhandled Integer division by zero exception</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83741">Bug 83741</a> - [UBO] row_major layout partially ignored for arrays of structures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84178">Bug 84178</a> - Big glamor regression in Xorg server 1.6.99.1 GIT: x11perf 1.5 Test: PutImage XY 500x500 Square</li>
</ul>
<h2>Changes</h2>
<p>Andreas Pokorny (2):</p>
<ul>
<li>egl/drm: expose KHR_image_pixmap extension</li>
<li>i915: Fix black buffers when importing prime fds</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: fix prog_optimize.c assertions triggered by SWZ opcode</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add 10.3 sha256 sums, news item and link release notes</li>
<li>Update VERSION to 10.3.1</li>
</ul>
<p>Ian Romanick (4):</p>
<ul>
<li>glsl: Make sure fields after small structs have correct padding</li>
<li>glsl: Make sure row-major array-of-structure get correct layout</li>
<li>glsl: Round struct size up to at least 16 bytes</li>
<li>glsl: Strip arrayness from ir_type_dereference_variable too</li>
</ul>
<p>Ilia Mirkin (5):</p>
<ul>
<li>nv50/ir: avoid deleting pseudo instructions too early</li>
<li>gm107/ir: fix manual TXD for array targets</li>
<li>gm107/ir: fix texture argument order</li>
<li>gm107/ir: add support for indirect const buffer selection</li>
<li>gm107/ir: take relative pfetch offset into account</li>
</ul>
<p>Keith Packard (1):</p>
<ul>
<li>glx/dri3: Provide error diagnostics when DRI3 allocation fails</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>mesa: Use proper structure for glGet*(GL_TEXTURE_COORD_ARRAY*).</li>
<li>mesa: Set correct array element in vbo_exec_vtx_init.</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>radeonsi: release GS rings at context destruction</li>
<li>radeonsi: properly destroy the GS copy shader and scratch_bo for compute</li>
<li>st/dri: remove GALLIUM_MSAA and __GL_FSAA_MODE environment variables</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>st/mesa: Use PIPE_USAGE_STAGING for GL_STATIC/DYNAMIC/STREAM_READ buffers</li>
</ul>
<p>Richard Sandiford (2):</p>
<ul>
<li>mesa: Fix alpha component in unpack_R8G8B8X8_SRGB.</li>
<li>swrast: Fix handling of MESA_FORMAT_L8A8_SRGB for big-endian</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>gallivm: fix idiv</li>
</ul>
<p>Thomas Hellstrom (1):</p>
<ul>
<li>st/xa: Fix regression in xa_yuv_planar_blit()</li>
</ul>
<p>Tom Stellard (2):</p>
<ul>
<li>clover: Add support to mem objects for multiple destructor callbacks v2</li>
<li>configure.ac: Compute LLVM_VERSION_PATCH using llvm-config</li>
</ul>
<p>Tomasz Figa (3):</p>
<ul>
<li>util: Include in Android builds</li>
<li>st/mesa: Generate format_info.c in Android builds</li>
<li>st/mesa: Fix paths used in Android builds</li>
</ul>
<p>rconde (1):</p>
<ul>
<li>gallivm,tgsi: fix idiv by zero crash</li>
</ul>
</div>
</body>
</html>

115
docs/relnotes/10.3.2.html Normal file

@@ -0,0 +1,115 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.3.2 Release Notes / October 24, 2014</h1>
<p>
Mesa 10.3.2 is a bug fix release which fixes bugs found since the 10.3.1 release.
</p>
<p>
Mesa 10.3.2 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
e65f8e691f06f111c1aeb3a376b13c9cc88cb162bee2709e0e7e6b0e6628ca75 MesaLib-10.3.2.tar.gz
e9849bcb9aa9acd98a753d6d46d2e7d7238d3367036e11357a60efd16de8bea3 MesaLib-10.3.2.tar.bz2
427dc0d670d38e713ebff2675665ec2fe4ff7d04ce227bd54de946999fc1d234 MesaLib-10.3.2.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54372">Bug 54372</a> - GLX_INTEL_swap_event crashes driver when swapping window buffers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81680">Bug 81680</a> - [r600g] Firefox crashes with hardware acceleration turned on</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84140">Bug 84140</a> - mplayer crashes playing some files using vdpau output</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84662">Bug 84662</a> - Long pauses with Unreal demo Elemental on R9270X since : Always flush the HDP cache before submitting a CS to the GPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85267">Bug 85267</a> - vlc crashes with vdpau (Radeon 3850HD) [r600]</li>
</ul>
<h2>Changes</h2>
<p>Brian Paul (3):</p>
<ul>
<li>mesa: fix spurious wglGetProcAddress / GL_INVALID_OPERATION error</li>
<li>st/wgl: add WINAPI qualifiers on wgl function typedefs</li>
<li>glsl: fix several use-after-free bugs</li>
</ul>
<p>Daniel Manjarres (1):</p>
<ul>
<li>glx: Fix glxUseXFont for glxWindow and glxPixmaps</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>mesa: fix GetTexImage for 1D array depth textures</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.3.1 release</li>
<li>Update VERSION to 10.3.2</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>gm107/ir: add dnz emission for fmul</li>
<li>gk110/ir: add dnz flag emission for fmul/fmad</li>
<li>nouveau: 3d textures are unsupported, limit 3d levels to 1</li>
<li>st/gbm: fix order of arguments passed to is_format_supported</li>
</ul>
<p>Kenneth Graunke (3):</p>
<ul>
<li>i965: Add a BRW_MOCS_PTE #define.</li>
<li>i965: Use BDW_MOCS_PTE for renderbuffers.</li>
<li>i965: Fix register write checks.</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>st/mesa: use pipe_sampler_view_release for releasing sampler views</li>
<li>glsl_to_tgsi: fix the value of gl_FrontFacing with native integers</li>
</ul>
<p>Michel Dänzer (4):</p>
<ul>
<li>radeonsi: Clear sampler view flags when binding a buffer</li>
<li>r600g,radeonsi: Always use GTT again for PIPE_USAGE_STREAM buffers</li>
<li>winsys/radeon: Use separate caching buffer manager for each set of flags</li>
<li>r600g: Drop references to destroyed blend state</li>
</ul>
</div>
</body>
</html>

209
docs/relnotes/10.3.3.html Normal file

@@ -0,0 +1,209 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.3.3 Release Notes / November 8, 2014</h1>
<p>
Mesa 10.3.3 is a bug fix release which fixes bugs found since the 10.3.2 release.
</p>
<p>
Mesa 10.3.3 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
23a0c36d88cd5d8968ae6454160de2878192fd1d37b5d606adca1f1b7e788b79 MesaLib-10.3.3.tar.gz
0e4eee4a2ddf86456eed2fc44da367f95471f74249636710491e85cc256c4753 MesaLib-10.3.3.tar.bz2
a83648f17d776b7cf6c813fbb15782d2644b937dc6a7c53d8c0d1b35411f4840 MesaLib-10.3.3.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70410">Bug 70410</a> - egl-static/Makefile: linking fails with llvm &gt;= 3.4</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82921">Bug 82921</a> - layout(location=0) emits error &gt;= MAX_UNIFORM_LOCATIONS due to integer underflow</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83574">Bug 83574</a> - [llvmpipe] [softpipe] piglit arb_explicit_uniform_location-use-of-unused-loc regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85454">Bug 85454</a> - Unigine Sanctuary with Wine crashes on Mesa Git</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85918">Bug 85918</a> - Mesa: MSVC 2010/2012 Compile error</li>
</ul>
<h2>Changes</h2>
<p>Anuj Phogat (2):</p>
<ul>
<li>glsl: Fix crash due to negative array index</li>
<li>glsl: Use signed array index in update_max_array_access()</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: fix UNCLAMPED_FLOAT_TO_UBYTE() macro for MSVC</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.3.2 release</li>
<li>Update version to 10.3.3</li>
</ul>
<p>Ilia Mirkin (27):</p>
<ul>
<li>freedreno/ir3: fix FSLT/etc handling to return 0/-1 instead of 0/1.0</li>
<li>freedreno/ir3: INEG operates on src0, not src1</li>
<li>freedreno/ir3: add UARL support</li>
<li>freedreno/ir3: negate result of USLT/etc</li>
<li>freedreno/ir3: use unsigned comparison for UIF</li>
<li>freedreno/ir3: add TXL support</li>
<li>freedreno/ir3: fix UCMP handling</li>
<li>freedreno/ir3: implement UMUL correctly</li>
<li>freedreno: add default .dir-locals.el for emacs settings</li>
<li>freedreno/ir3: make texture instruction construction more dynamic</li>
<li>freedreno/ir3: fix TXB/TXL to actually pull the bias/lod argument</li>
<li>freedreno/ir3: add TXQ support</li>
<li>freedreno/ir3: add TXB2 support</li>
<li>freedreno: dual-source render targets are not supported</li>
<li>freedreno: instanced drawing/compute not yet supported</li>
<li>freedreno/ir3: avoid fan-in sources referring to same instruction</li>
<li>freedreno/ir3: add IDIV/UDIV support</li>
<li>freedreno/ir3: add UMOD support, based on UDIV</li>
<li>freedreno/ir3: add MOD support</li>
<li>freedreno/ir3: add ISSG support</li>
<li>freedreno/ir3: add UMAD support</li>
<li>freedreno/ir3: make TXQ return integers, not floats</li>
<li>freedreno/ir3: shadow comes before array</li>
<li>freedreno/ir3: add texture offset support</li>
<li>freedreno/ir3: add TXD support and expose ARB_shader_texture_lod</li>
<li>freedreno/ir3: add TXF support</li>
<li>freedreno: positions come out as integers, not half-integers</li>
</ul>
<p>Jan Vesely (1):</p>
<ul>
<li>configure: include llvm systemlibs when using static llvm</li>
</ul>
<p>Marek Olšák (5):</p>
<ul>
<li>r600g: fix polygon mode for points and lines and point/line fill modes</li>
<li>radeonsi: fix polygon mode for points and lines and point/line fill modes</li>
<li>radeonsi: fix incorrect index buffer max size for lowered 8-bit indices</li>
<li>Revert "st/mesa: set MaxUnrollIterations = 255"</li>
<li>r300g: remove enabled/disabled hyperz and AA compression messages</li>
</ul>
<p>Mauro Rossi (1):</p>
<ul>
<li>gallium/nouveau: fully build the driver under android</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>radeon/llvm: Dynamically allocate branch/loop stack arrays</li>
</ul>
<p>Rob Clark (62):</p>
<ul>
<li>freedreno/ir3: detect scheduler fail</li>
<li>freedreno/ir3: add TXB</li>
<li>freedreno/ir3: add DDX/DDY</li>
<li>freedreno/ir3: bit of debug</li>
<li>freedreno/ir3: fix error in bail logic</li>
<li>freedreno/ir3: fix constlen with relative addressing</li>
<li>freedreno/ir3: add no-copy-propagate fallback step</li>
<li>freedreno: don't overflow cmdstream buffer so much</li>
<li>freedreno/ir3: fix potential segfault in RA</li>
<li>freedreno: update generated headers</li>
<li>freedreno/a3xx: enable hw primitive-restart</li>
<li>freedreno/a3xx: handle rendering to layer != 0</li>
<li>freedreno: update generated headers</li>
<li>freedreno/a3xx: format fixes</li>
<li>util/u_format: add _is_alpha()</li>
<li>freedreno/a3xx: alpha render-target shenanigans</li>
<li>freedreno/ir3: catch incorrect usage of tmp-dst</li>
<li>freedreno/ir3: add missing put_dst</li>
<li>freedreno: "fix" problems with excessive flushes</li>
<li>freedreno: update generated headers</li>
<li>freedreno/a3xx: 3d/array textures</li>
<li>freedreno: add DRM_CONF_SHARE_FD</li>
<li>freedreno/a3xx: more texture array fixes</li>
<li>freedreno/a3xx: initial texture border-color</li>
<li>freedreno: fix compiler warning</li>
<li>freedreno: don't advertise mirror-clamp support</li>
<li>freedreno: update generated headers</li>
<li>freedreno: we have more than 0 viewports!</li>
<li>freedreno: turn missing caps into compile warnings</li>
<li>freedreno/a3xx: add LOD_BIAS</li>
<li>freedreno/a3xx: add flat interpolation mode</li>
<li>freedreno/a3xx: add 32bit integer vtx formats</li>
<li>freedreno/a3xx: fix border color order</li>
<li>freedreno: move bind_sampler_states to per-generation</li>
<li>freedreno: add texcoord clamp support to lowering</li>
<li>freedreno/a3xx: add support to emulate GL_CLAMP</li>
<li>freedreno/a3xx: re-emit shaders on variant change</li>
<li>freedreno/lowering: fix token calculation for lowering</li>
<li>freedreno: destroy transfer pool after blitter</li>
<li>freedreno: max-texture-lod-bias should be 15.0f</li>
<li>freedreno: update generated headers</li>
<li>freedreno/a3xx: handle large shader program sizes</li>
<li>freedreno/a3xx: emit all immediates in one shot</li>
<li>freedreno/ir3: fix lockups with lame FRAG shaders</li>
<li>freedreno/a3xx: handle VS only outputting BCOLOR</li>
<li>freedreno: query fixes</li>
<li>freedreno/a3xx: refactor vertex state emit</li>
<li>freedreno/a3xx: refactor/optimize emit</li>
<li>freedreno/ir3: optimize shader key comparision</li>
<li>freedreno: inline fd_draw_emit()</li>
<li>freedreno: fix layer_stride</li>
<li>freedreno: update generated headers</li>
<li>freedreno/ir3: large const support</li>
<li>freedreno/a3xx: more layer/level fixes</li>
<li>freedreno/ir3: comment + better fxn name</li>
<li>freedreno/ir3: fix potential gpu lockup with kill</li>
<li>freedreno/a3xx: disable early-z when we have kill's</li>
<li>freedreno/ir3: add debug flag to disable cp</li>
<li>freedreno: clear vs scissor</li>
<li>freedreno: mark scissor state dirty when enable bit changes</li>
<li>freedreno/a3xx: fix viewport state during clear</li>
<li>freedreno/a3xx: fix depth/stencil restore format</li>
</ul>
<p>Tapani Pälli (2):</p>
<ul>
<li>glsl: fix uniform location count used for glsl types</li>
<li>mesa: check that uniform exists in glUniform* functions</li>
</ul>
</div>
</body>
</html>


@@ -14,7 +14,7 @@
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.3 Release Notes / TBD</h1>
<h1>Mesa 10.3 Release Notes / September 19, 2014</h1>
<p>
Mesa 10.3 is a new development release.
@@ -31,9 +31,11 @@ because compatibility contexts are not supported.
</p>
<h2>MD5 checksums</h2>
<h2>SHA256 checksums</h2>
<pre>
TBD.
9a1bf52040fc3dda81e83a35f944f1c3f532847dbe9fdf57161265cf71ea1bae MesaLib-10.3.0.tar.gz
0283bfe710fa449ed82e465cfa09612a269e19abb7e0382082608062ce7960b5 MesaLib-10.3.0.tar.bz2
221420763c2c3a244836a736e735612c4a6a0377b4e5223fca1e612f49906789 MesaLib-10.3.0.zip
</pre>
@@ -75,7 +77,249 @@ DRM drivers that don't have a full-fledged GEM (such as qxl or simpledrm)</li>
<h2>Bug fixes</h2>
TBD.
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=50754">Bug 50754</a> - Building 32 bit mesa on 64 bit OS fails since change for automake</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=53617">Bug 53617</a> - [llvmpipe] piglit fbo-depthtex regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54372">Bug 54372</a> - GLX_INTEL_swap_event crashes driver when swapping window buffers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=56127">Bug 56127</a> - [ILK bisected]unigine-sanctruary performance reduced by 98%</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66184">Bug 66184</a> - src/mesa/state_tracker/st_glsl_to_tgsi.cpp:3216:simplify_cmp: Assertion `inst-&gt;dst.index &lt; 4096' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66452">Bug 66452</a> - JUNIPER UVD accelerated playback of WMV3 streams does not work</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68365">Bug 68365</a> - [SNB Bisected]Piglit spec_ARB_framebuffer_object_fbo-blit-stretch fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70441">Bug 70441</a> - [Gen4-5 clip] Piglit spec_OpenGL_1.1_polygon-offset hits (execsize &gt;= width) assertion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73846">Bug 73846</a> - [llvmpipe] lp_test_format fails with llvm-3.5svn &gt;= r199602</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74005">Bug 74005</a> - [i965 Bisected]Piglit/glx_glx-make-glxdrawable-current fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74863">Bug 74863</a> - [r600g] HyperZ broken on RV770 and CYPRESS (Left 4 Dead 2 trees corruption) bisected!</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75010">Bug 75010</a> - clang: error: unknown argument: '-fstack-protector-strong'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75478">Bug 75478</a> - [BDW]Some Piglit and Ogles2conform cases cause GPU hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75664">Bug 75664</a> - Unigine Valley &amp; Heaven &quot;error: syntax error, unexpected EXTENSION, expecting $end&quot; IVB HD4000</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75878">Bug 75878</a> - [BDW] GPU hang running Raytracer WebGL demo</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76188">Bug 76188</a> - EGL_EXT_image_dma_buf_import fd ownership is incorrect</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76223">Bug 76223</a> - [radeonsi] luxmark segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76939">Bug 76939</a> - [BDW] GPU hang when running “Metro:Last Light “ /“Crusader Kings II”</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77245">Bug 77245</a> - Bogus GL_ARB_explicit_attrib_location layout identifier warnings</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77493">Bug 77493</a> - lp_test_arit fails with llvm &gt;= llvm-3.5svn r206094</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77703">Bug 77703</a> - [ILK Bisected]Piglit glean_texCombine4 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77704">Bug 77704</a> - [IVB/HSW Bisected]Ogles3conform GL3Tests_shadow_shadow_execution_frag.test fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77705">Bug 77705</a> - [SNB/IVB/HSW/BYT/BDW Bisected]Ogles3conform GL3Tests/packed_pixels/packed_pixels_pixelstore.test segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77707">Bug 77707</a> - [ILK Bisected]Ogles2conform GL_sin_sin_float_frag_xvary.test fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77740">Bug 77740</a> - i965: Relax accumulator dependency scheduling on Gen &lt; 6</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77852">Bug 77852</a> - [BDW]Piglit spec_ARB_framebuffer_object_fbo-drawbuffers-none_glBlitFramebuffer fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77856">Bug 77856</a> - [BDW]Piglit spec_OpenGL_3.0_clearbuffer-mixed-format fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77865">Bug 77865</a> - [BDW] Many Ogles3conform framebuffer_blit cases fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78225">Bug 78225</a> - Compile error due to undefined reference to `gbm_dri_backend', fix attached</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78258">Bug 78258</a> - make check link_varyings.gl_ClipDistance failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78403">Bug 78403</a> - query_renderer_implementation_unittest.cpp:144:4: error: expected primary-expression before . token</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78468">Bug 78468</a> - Compiling of shader gets stuck in infinite loop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78537">Bug 78537</a> - no anisotropic filtering in a native Half-Life 2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78546">Bug 78546</a> - [swrast] piglit copyteximage-border regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78581">Bug 78581</a> - OpenCL: clBuildProgram prints error messages directly rather than storing them</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78648">Bug 78648</a> - Texture artifacts in Kerbal Space Program</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78665">Bug 78665</a> - macros in builtin_functions.cpp make invalid assumptions about M_PI definitions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78679">Bug 78679</a> - Gen4-5 code lost: runtime_check_aads_emit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78691">Bug 78691</a> - [G45 - Tesseract] Mesa 10.1.2 implementation error: Unsupported opcode 169872468 in FS</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78692">Bug 78692</a> - Football Manager 2014, gameplay rendered black &amp; white</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78716">Bug 78716</a> - Fix Mesa bugs for running Unreal Engine 4.1 Cave effects demo compiled for Linux</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78803">Bug 78803</a> - gallivm/lp_bld_debug.cpp:42:28: fatal error: llvm/IR/Module.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78842">Bug 78842</a> - [swrast] piglit fcc-read-after-clear copy rb regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78843">Bug 78843</a> - [swrast] piglit copyteximage 1D regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78872">Bug 78872</a> - [ILK Bisected]Piglit spec_ARB_depth_buffer_float_fbo-depthstencil-GL_DEPTH32F_STENCIL8-blit Aborted</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78875">Bug 78875</a> - [ILK Bisected]Webglc conformance/uniforms/uniform-default-values.html fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78888">Bug 78888</a> - test_eu_compact.c:54:3: error: implicit declaration of function brw_disasm [-Werror=implicit-function-declaration]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79029">Bug 79029</a> - INTEL_DEBUG=shader_time is full of lies</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79095">Bug 79095</a> - x86/common_x86.c:348:14: error: use of undeclared identifier 'bit_SSE4_1'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79115">Bug 79115</a> - glFramebufferRenderbuffer(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_RENDERBUFFER, 0) doesn't unbind stencil buffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79263">Bug 79263</a> - Linking error in egl_gallium.la when compiling 32 bit on multiarch</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79294">Bug 79294</a> - Xlib-based build broken on non x86/x86-64 architectures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79373">Bug 79373</a> - Non-const initializers for matrix and vector constructors</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79382">Bug 79382</a> - build error: multiple definition of `loader_get_pci_id_for_fd'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79421">Bug 79421</a> - [llvmpipe] SIGSEGV src/gallium/drivers/llvmpipe/lp_rast_priv.h:218</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79440">Bug 79440</a> - prog_hash_table.c:146: undefined reference to `_mesa_error_no_memory'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79469">Bug 79469</a> - Commit e3cc0d90e14e62a0a787b6c07a6df0f5c84039be breaks unigine heaven</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79534">Bug 79534</a> - gen&lt;7 renders garbage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79616">Bug 79616</a> - L4D2 crash on startup</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79724">Bug 79724</a> - switch statement type check</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79729">Bug 79729</a> - [i965] glClear on a multisample texture doesn't work</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79809">Bug 79809</a> - radeonsi: mouse cursor corruption using weston on AMD Kaveri</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79823">Bug 79823</a> - [NV30/gallium] Mozilla apps freeze on startup with nouveau-dri-10.2.1 libs on dual-screen</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79885">Bug 79885</a> - commit b52a530 (gallium/egl: st_profiles are build time decision, treat them as such) broke egl</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79903">Bug 79903</a> - [HSW Bisected]Some Piglit and Ogles2conform cases fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79907">Bug 79907</a> - Mesa 10.2.1 --enable-vdpau default=auto broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79948">Bug 79948</a> - [i965] Incorrect pixels when using discard and uniform loads</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80015">Bug 80015</a> - Transparency glitches in native Civilization 5 (Civ5) port</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80115">Bug 80115</a> - MESA_META_DRAW_BUFFERS induced GL_INVALID_VALUE errors</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80211">Bug 80211</a> - [ILK/SNB Bisected]Piglit shaders_glsl-fs-copy-propagation-texcoords-1 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80247">Bug 80247</a> - Khronos conformance test ES3-CTS.gtf.GL3Tests.transform_feedback.transform_feedback_vertex_id fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80254">Bug 80254</a> - pipe_loader_sw.c:90: undefined reference to `dri_create_sw_winsys'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80541">Bug 80541</a> - [softpipe] piglit levelclamp regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80561">Bug 80561</a> - Incorrect implementation of some VDPAU APIs.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80614">Bug 80614</a> - [regression] Error in `omxregister-bellagio': munmap_chunk(): invalid pointer: 0x00007f5f76626dab</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80778">Bug 80778</a> - [bisected regression] piglit spec/glsl-1.50/compiler/incorrect-in-layout-qualifier-repeated-prim.geom</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80827">Bug 80827</a> - [radeonsi,R9 270X] Corruptions in window menus in KDE</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80880">Bug 80880</a> - Unreal Engine 4 demos fail GLSL compiler assertion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80991">Bug 80991</a> - [BDW]Piglit spec_ARB_sample_shading_builtin-gl-sample-mask_2 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81020">Bug 81020</a> - [radeonsi][regresssion] Wireframe of background rendered through objects in Half-Life 2: Episode 2 with MSAA enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81150">Bug 81150</a> - [SNB]Piglit spec_arb_shading_language_packing_execution_built-in-functions_fs-packSnorm4x8 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81157">Bug 81157</a> - [BDW]Piglit some spec_glsl-1.50_execution_built-in-functions* cases fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81450">Bug 81450</a> - [BDW]Piglit spec_glsl-1.30_execution_tex-miplevel-selection_textureGrad_1DArray cases intel_do_flush_locked failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81828">Bug 81828</a> - [BDW Bisected]Ogles3conform GL3Tests_packed_pixels_packed_pixels_pbo.test fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81834">Bug 81834</a> - TGSI constant buffer overrun causes assertion failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81857">Bug 81857</a> - [SNB+]Piglit spec_glsl-1.30_execution_switch_fs-default_last sporadically fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81967">Bug 81967</a> - [regression] Selections in Blender renders wrong</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82139">Bug 82139</a> - [r600g, bisected] multiple ubo piglit regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82159">Bug 82159</a> - No rule to make target `../../../../src/mesa/libmesa.la', needed by `collision'.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82255">Bug 82255</a> - [VP2] Chroma planes are vertically stretched during VDPAU playback</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82268">Bug 82268</a> - Add support for the OpenRISC architecture (or1k)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82428">Bug 82428</a> - [radeonsi,R9 270X] System lockup when using mplayer/mpv with VDPAU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82472">Bug 82472</a> - piglit 16385-consecutive-chars regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82483">Bug 82483</a> - format_srgb.h:145: undefined reference to `util_format_srgb_to_linear_8unorm_table'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82517">Bug 82517</a> - [RADEONSI,VDPAU] SIGSEGV in map_msg_fb_buf called from ruvd_destroy, when closing a Tab with accelerated video player</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82534">Bug 82534</a> - src\egl\main\eglapi.h : fatal error LNK1107: invalid or corrupt file: cannot read at 0x2E02</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82536">Bug 82536</a> - u_current.h:72: undefined reference to `__imp__glapi_Dispatch'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82538">Bug 82538</a> - Super Maryo Chronicles fails with st/mesa assertion failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82539">Bug 82539</a> - vmw_screen_dri.lo In file included from vmw_screen_dri.c:41: vmwgfx_drm.h:32:17: error: drm.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82546">Bug 82546</a> - [regression] libOSMesa build failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82574">Bug 82574</a> - GLSL: opt_vectorize goes wrong on texture lookups</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82628">Bug 82628</a> - bisected: GALLIUM_HUD hangs radeon 7970M (PRIME)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82671">Bug 82671</a> - [r600g-evergreen][compute]Empty kernel execution causes crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82709">Bug 82709</a> - OpenCL not working on radeon hainan</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82796">Bug 82796</a> - [IVB/BYT-M/HSW/BDW Bisected]Synmark2_v6.0_OglTerrainFlyInst/OglTerrainPanInst cannot run as image validation failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82804">Bug 82804</a> - unreal engine 4 rendering errors</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82814">Bug 82814</a> - glDrawBuffers(0, NULL) segfaults in _mesa_drawbuffers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82828">Bug 82828</a> - Regression: Crash in 3Dmark2001</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82846">Bug 82846</a> - [BDW Bisected] Gpu hang when running Lightsmark v2008/Warsow v1.0/Xonotic v0.7/unigine-demos</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82881">Bug 82881</a> - test_vec4_register_coalesce regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82882">Bug 82882</a> - [swrast] piglit glsl-fs-uniform-bool-1 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82929">Bug 82929</a> - [BDW Bisected]glxgears causes X hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82932">Bug 82932</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.shaders.indexing.vector_subscript.vec3_static_loop_subscript_write_direct_read_vertex fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83046">Bug 83046</a> - [BDW bisected]] Warsow v1.0/Xonotic v0.7/Gputest v0.5_triangle_fullscreen/synmark2_v6/GLBenchmark v2.5.0/GLBenchmark v2.7.0/Ungine-demos performance reduced 30%~60%</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83079">Bug 83079</a> - [NVC0] Dota 2 (Linux native and Wine) crash with Nouveau Drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83081">Bug 83081</a> - [BDW Bisected]Piglit spec_ARB_sample_shading_builtin-gl-sample-mask_2 is core dumped</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83127">Bug 83127</a> - [ILK Bisected]Piglit glean_texCombine fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83355">Bug 83355</a> - FTBFS: src/mesa/program/program_lexer.l:122:64: error: unknown type name 'YYSTYPE'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83432">Bug 83432</a> - r600_query.c:269:r600_emit_query_end: Assertion `ctx-&gt;num_pipelinestat_queries &gt; 0' failed [Gallium HUD]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83468">Bug 83468</a> - [UBO] Using bool from UBO as if-statement condition asserts</li>
</ul>
<h2>Changes</h2>
@@ -83,6 +327,7 @@ TBD.
<li>Removed support for the GL_ATI_envmap_bumpmap extension</li>
<li>The hacky --enable-32/64-bit options are no longer available in configure. To build
32/64-bit Mesa, refer to the default method recommended by your distribution.</li>
<li>The environment variable GALLIUM_MSAA that forced a multisample GLX visual was removed.</li>
</ul>
</div>

97
docs/relnotes/10.4.1.html Normal file

@@ -0,0 +1,97 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.1 Release Notes / December 29, 2014</h1>
<p>
Mesa 10.4.1 is a bug fix release which fixes bugs found since the 10.4.0 release.
</p>
<p>
Mesa 10.4.1 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
5311285e791a6bfaa468ad002bd1e1164acb3eaa040b5a1bf958bdb7c27e0a9d MesaLib-10.4.1.tar.gz
91e8b71c8aff4cb92022a09a872b1c5d1ae5bfec8c6c84dbc4221333da5bf1ca MesaLib-10.4.1.tar.bz2
e09c8135f5a86ecb21182c6f8959aafd39ae2f98858fdf7c0e25df65b5abcdb8 MesaLib-10.4.1.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82585">Bug 82585</a> - geometry shader with optional out variable segfaults</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82991">Bug 82991</a> - Inverted bumpmap in webgl applications</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83908">Bug 83908</a> - [i965] Incorrect icon colors in Steam Big Picture</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (1):</p>
<ul>
<li>i965/brw_reg: struct constructor now needs explicit negate and abs values.</li>
</ul>
<p>Cody Northrop (1):</p>
<ul>
<li>i965: Require pixel alignment for GPU copy blit</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: Add 10.4 sha256 sums, news item and link release notes</li>
<li>Revert "glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)"</li>
<li>Update version to 10.4.1</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>linker: Wrap access of producer_var with a NULL check</li>
<li>linker: Assign varying locations geometry shader inputs for SSO</li>
</ul>
<p>Mario Kleiner (4):</p>
<ul>
<li>glx/dri3: Fix glXWaitForSbcOML() to handle targetSBC==0 correctly. (v2)</li>
<li>glx/dri3: Track separate (ust, msc) for PresentPixmap vs. PresentNotifyMsc (v2)</li>
<li>glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)</li>
<li>glx/dri3: Don't fail on glXSwapBuffersMscOML(dpy, window, 0, 0, 0) (v2)</li>
</ul>
<p>Maxence Le Doré (1):</p>
<ul>
<li>glsl: Add gl_MaxViewports to available builtin constants</li>
</ul>
</div>
</body>
</html>

127
docs/relnotes/10.4.2.html Normal file

@@ -0,0 +1,127 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.2 Release Notes / January 12, 2015</h1>
<p>
Mesa 10.4.2 is a bug fix release which fixes bugs found since the 10.4.1 release.
</p>
<p>
Mesa 10.4.2 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
e303e77dd774df0d051b2870b165f98c97084a55980f884731df89c1b56a6146 MesaLib-10.4.2.tar.gz
08a119937d9f2aa2f66dd5de97baffc2a6e675f549e40e699a31f5485d15327f MesaLib-10.4.2.tar.bz2
c2c2921a80a3395824f02bee4572a6a17d6a12a928a3e497618eeea04fb06490 MesaLib-10.4.2.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85529">Bug 85529</a> - Surfaces not drawn in Unvanquished</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87619">Bug 87619</a> - Changes to state such as render targets change fragment shader without marking it dirty.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87658">Bug 87658</a> - [llvmpipe] SEGV in sse2_has_daz on ancient Pentium4-M</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87913">Bug 87913</a> - CPU cacheline size of 0 can be returned by CPUID leaf 0x80000006 in some virtual machines</li>
</ul>
<h2>Changes</h2>
<p>Chad Versace (2):</p>
<ul>
<li>i965: Use safer pointer arithmetic in intel_texsubimage_tiled_memcpy()</li>
<li>i965: Use safer pointer arithmetic in gather_oa_results()</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>Revert "r600g/sb: fix issues cause by GLSL switching to loops for switch"</li>
<li>r600g: fix regression since UCMP change</li>
<li>r600g/sb: implement r600 gpr index workaround. (v3.1)</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.1 release</li>
<li>Update version to 10.4.2</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>nv50,nvc0: set vertex id base to index_bias</li>
<li>nv50/ir: fix texture offsets in release builds</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Add missing BRW_NEW_*_PROG_DATA to texture/renderbuffer atoms.</li>
<li>i965: Fix start/base_vertex_location for &gt;1 prims but !BRW_NEW_VERTICES.</li>
</ul>
<p>Leonid Shatz (1):</p>
<ul>
<li>gallium/util: make sure cache line size is not zero</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>glsl_to_tgsi: fix a bug in copy propagation</li>
<li>vbo: ignore primitive restart if FixedIndex is enabled in DrawArrays</li>
<li>st/mesa: fix GL_PRIMITIVE_RESTART_FIXED_INDEX</li>
<li>radeonsi: fix VertexID for OpenGL</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>radeonsi: Don't modify PA_SC_RASTER_CONFIG register value if rb_mask == 0</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>gallium/util: fix crash with daz detection on x86</li>
</ul>
<p>Tiziano Bacocco (1):</p>
<ul>
<li>nv50,nvc0: implement half_pixel_center</li>
</ul>
<p>Vadim Girlin (1):</p>
<ul>
<li>r600g/sb: fix issues with loops created for switch</li>
</ul>
</div>
</body>
</html>

145
docs/relnotes/10.4.3.html Normal file

@@ -0,0 +1,145 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.3 Release Notes / January 24, 2015</h1>
<p>
Mesa 10.4.3 is a bug fix release which fixes bugs found since the 10.4.2 release.
</p>
<p>
Mesa 10.4.3 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
c53eaafc83d9c6315f63e0904d9954d929b841b0b2be7a328eeb6e14f1376129 MesaLib-10.4.3.tar.gz
ef6ecc9c2f36c9f78d1662382a69ae961f38f03af3a0c3268e53f351aa1978ad MesaLib-10.4.3.tar.bz2
179325fc8ec66529d3b0d0c43ef61a33a44d91daa126c3bbdd1efdfd25a7db1d MesaLib-10.4.3.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80568">Bug 80568</a> - [gen4] GPU Crash During Google Chrome Operation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85367">Bug 85367</a> - [gen4] GPU hang in glmark-es2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85696">Bug 85696</a> - r600g+nine: Bioshock shader failure after 7b1c0cbc90d456384b0950ad21faa3c61a6b43ff</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88219">Bug 88219</a> - include/c11/threads_posix.h:197: undefined reference to `pthread_mutex_lock'</li>
</ul>
<h2>Changes</h2>
<p>Axel Davy (39):</p>
<ul>
<li>st/nine: Add new texture format strings</li>
<li>st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS</li>
<li>st/nine: NineBaseTexture9: fix setting of last_layer</li>
<li>st/nine: CubeTexture: fix GetLevelDesc</li>
<li>st/nine: Fix crash when deleting non-implicit swapchain</li>
<li>st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of bad format</li>
<li>st/nine: NineBaseTexture9: update sampler view creation</li>
<li>st/nine: Check if srgb format is supported before trying to use it.</li>
<li>st/nine: Add ATI1 and ATI2 support</li>
<li>st/nine: Rework of boolean constants</li>
<li>st/nine: Convert integer constants to floats before storing them when cards don't support integers</li>
<li>st/nine: Remove some shader unused code</li>
<li>st/nine: Saturate oFog and oPts vs outputs</li>
<li>st/nine: Correctly declare NineTranslateInstruction_Mkxn inputs</li>
<li>st/nine: Fix typo for M4x4</li>
<li>st/nine: Fix POW implementation</li>
<li>st/nine: Handle RSQ special cases</li>
<li>st/nine: Handle NRM with input of null norm</li>
<li>st/nine: Correct LOG on negative values</li>
<li>st/nine: Rewrite LOOP implementation, and a0 aL handling</li>
<li>st/nine: Fix CND implementation</li>
<li>st/nine: Clamp ps 1.X constants</li>
<li>st/nine: Fix some fixed function pipeline operation</li>
<li>st/nine: Implement TEXCOORD special behaviours</li>
<li>st/nine: Fill missing dst and src number for some instructions.</li>
<li>st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC</li>
<li>st/nine: implement TEXM3x2DEPTH</li>
<li>st/nine: Implement TEXM3x2TEX</li>
<li>st/nine: Implement TEXM3x3SPEC</li>
<li>st/nine: Implement TEXDEPTH</li>
<li>st/nine: Implement TEXDP3</li>
<li>st/nine: Implement TEXDP3TEX</li>
<li>st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGB</li>
<li>st/nine: Correct rules for relative adressing and constants.</li>
<li>st/nine: Remove unused code for ps</li>
<li>st/nine: Fix sm3 relative addressing for non-debug build</li>
<li>st/nine: Add variables containing the size of the constant buffers</li>
<li>st/nine: Allocate the correct size for the user constant buffer</li>
<li>st/nine: Allocate vs constbuf buffer for indirect addressing once.</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.2 release</li>
<li>Update version to 10.4.3</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>mesa: Fix clamping to -1.0 in snorm_to_float</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>glsl: Link glsl_test with pthreads library.</li>
</ul>
<p>Jose Fonseca (1):</p>
<ul>
<li>nine: Drop use of TGSI_OPCODE_CND.</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Respect the no_8 flag on Gen6, not just Gen7+.</li>
<li>i965: Work around mysterious Gen4 GPU hangs with minimal state changes.</li>
</ul>
<p>Stanislaw Halik (1):</p>
<ul>
<li>st/nine: Hack to generate resource if it doesn't exist when getting view</li>
</ul>
<p>Xavier Bouchoux (3):</p>
<ul>
<li>st/nine: Additional defines to d3dtypes.h</li>
<li>st/nine: Add missing c++ declaration for IDirect3DVolumeTexture9</li>
<li>st/nine: Fix D3DRS_POINTSPRITE support</li>
</ul>
</div>
</body>
</html>

100
docs/relnotes/10.4.4.html Normal file

@@ -0,0 +1,100 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.4 Release Notes / February 06, 2015</h1>
<p>
Mesa 10.4.4 is a bug fix release which fixes bugs found since the 10.4.3 release.
</p>
<p>
Mesa 10.4.4 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
5cb427eaf980cb8555953e9928f5797979ed783e277745d5f8cbae8bc5364086 MesaLib-10.4.4.tar.gz
f18a967e9c4d80e054b2fdff8c130ce6e6d1f8eecfc42c9f354f8628d8b4df1c MesaLib-10.4.4.tar.bz2
86baad73b77920c80fe58402a905e7dd17e3ea10ead6ea7d3afdc0a56c860bd7 MesaLib-10.4.4.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88662">Bug 88662</a> - unaligned access to gl_dlist_node</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88930">Bug 88930</a> - [osmesa] osbuffer-&gt;textures should be indexed by attachment type</li>
</ul>
<h2>Changes</h2>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: fix display list 8-byte alignment issue</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.3 release</li>
<li>Update version to 10.4.4</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>egl: Pass the correct X visual depth to xcb_put_image().</li>
</ul>
<p>Mario Kleiner (1):</p>
<ul>
<li>glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>gallium/util: Don't use __builtin_clrsb in util_last_bit().</li>
</ul>
<p>Niels Ole Salscheider (1):</p>
<ul>
<li>configure: Link against all LLVM targets when building clover</li>
</ul>
<p>Park, Jeongmin (1):</p>
<ul>
<li>st/osmesa: Fix osbuffer-&gt;textures indexing</li>
</ul>
<p>Ville Syrjälä (1):</p>
<ul>
<li>i965: Fix max_wm_threads for CHV</li>
</ul>
</div>
</body>
</html>

114
docs/relnotes/10.4.5.html Normal file

@@ -0,0 +1,114 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.5 Release Notes / February 21, 2015</h1>
<p>
Mesa 10.4.5 is a bug fix release which fixes bugs found since the 10.4.4 release.
</p>
<p>
Mesa 10.4.5 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
e12bbdaee9a758617e8ebd0bb0e987f72addd11db2e4da25ba695e386cd63843 MesaLib-10.4.5.tar.gz
bf60000700a9d58e3aca2bfeee7e781053b0d839e61a95b1883e05a2dee247a0 MesaLib-10.4.5.tar.bz2
3b926de8eee500bb67cf85332c51292f826cc539b8636382aadbb8e70c76527a MesaLib-10.4.5.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82477">Bug 82477</a> - [softpipe] piglit fp-long-alu regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88658">Bug 88658</a> - (bisected) Slow video playback on Kabini</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89069">Bug 89069</a> - Lack of grass in The Talos Principle on radeonsi (native\wine\nine)</li>
</ul>
<h2>Changes</h2>
<p>Carl Worth (1):</p>
<ul>
<li>Revert use of Mesa IR optimizer for ARB_fragment_programs</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.4 release</li>
<li>get-pick-list.sh: Require explicit "10.4" for nominating stable patches</li>
<li>Update version to 10.4.5</li>
</ul>
<p>Ilia Mirkin (3):</p>
<ul>
<li>nvc0: bail out of 2d blits with non-A8_UNORM alpha formats</li>
<li>st/mesa: treat resource-less xfb buffers as if they weren't there</li>
<li>nvc0: allow holes in xfb target lists</li>
</ul>
<p>Jeremy Huddleston Sequoia (2):</p>
<ul>
<li>darwin: build fix</li>
<li>darwin: build fix</li>
</ul>
<p>Kenneth Graunke (4):</p>
<ul>
<li>i965: Override swizzles for integer luminance formats.</li>
<li>i965: Use a gl_color_union for sampler border color.</li>
<li>i965: Fix integer border color on Haswell.</li>
<li>glsl: Reduce memory consumption of copy propagation passes.</li>
</ul>
<p>Laura Ekstrand (1):</p>
<ul>
<li>main: Fixed _mesa_GetCompressedTexImage_sw to copy slices correctly.</li>
</ul>
<p>Marek Olšák (5):</p>
<ul>
<li>r600g,radeonsi: don't append to streamout buffers that haven't been used yet</li>
<li>radeonsi: fix instanced arrays with non-zero start instance</li>
<li>radeonsi: small fix in SPI state</li>
<li>mesa: fix AtomicBuffer typo in _mesa_DeleteBuffers</li>
<li>radeonsi: fix a crash if a stencil ref state is set before a DSA state</li>
</ul>
<p>Michel Dänzer (2):</p>
<ul>
<li>st/mesa: Don't use PIPE_USAGE_STREAM for GL_PIXEL_UNPACK_BUFFER_ARB</li>
<li>Revert "radeon/llvm: enable unsafe math for graphics shaders"</li>
</ul>
</div>
</body>
</html>

143
docs/relnotes/10.4.6.html Normal file

@@ -0,0 +1,143 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.6 Release Notes / March 06, 2015</h1>
<p>
Mesa 10.4.6 is a bug fix release which fixes bugs found since the 10.4.5 release.
</p>
<p>
Mesa 10.4.6 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
46c9082142e811c01e49a2c332a9ac0a1eb98f2908985fb9df216539d7eaeaf4 MesaLib-10.4.6.tar.gz
d8baedd20e79ccd98a5a7b05e23d59a30892e68de1fcc057ca6873dafca02735 MesaLib-10.4.6.tar.bz2
6aded6eac7f0d4d55117b8b581d8424710bbb4c768fc90f7b881f29311a751aa MesaLib-10.4.6.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=45348">Bug 45348</a> - [swrast] piglit fbo-drawbuffers-arbfp regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84613">Bug 84613</a> - [G965, bisected] piglit regressions : glslparsertest.glsl2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87516">Bug 87516</a> - glProgramBinary violates spec</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88885">Bug 88885</a> - Transform feedback uses incorrect interleaving if a previous draw did not write gl_Position</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89180">Bug 89180</a> - [IVB regression] Rendering issues in Mass Effect through VMware Workstation</li>
</ul>
<h2>Changes</h2>
<p>Abdiel Janulgue (2):</p>
<ul>
<li>glsl: Don't optimize min/max into saturate when EmitNoSat is set</li>
<li>st/mesa: For vertex shaders, don't emit saturate when SM 3.0 is unsupported</li>
</ul>
<p>Andreas Boll (1):</p>
<ul>
<li>glx: Fix returned values of GLX_RENDERER_PREFERRED_PROFILE_MESA</li>
</ul>
<p>Brian Paul (2):</p>
<ul>
<li>swrast: fix multiple color buffer writing</li>
<li>st/mesa: fix sampler view reference counting bug in glDraw/CopyPixels</li>
</ul>
<p>Chris Forbes (1):</p>
<ul>
<li>i965/gs: Check newly-generated GS-out VUE map against correct stage</li>
</ul>
<p>Eduardo Lima Mitev (1):</p>
<ul>
<li>mesa: Fix error validating args for TexSubImage3D</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.5 release</li>
<li>install-lib-links: remove the .install-lib-links file</li>
<li>Revert "mesa: Correct backwards NULL check."</li>
<li>mesa: cherry-pick the second half of commit 2aa71e9485a</li>
<li>Revert "gallivm: Update for RTDyldMemoryManager becoming an unique_ptr."</li>
<li>Update version to 10.4.6</li>
</ul>
<p>Ian Romanick (3):</p>
<ul>
<li>mesa: Add missing error checks in _mesa_ProgramBinary</li>
<li>mesa: Ensure that length is set to zero in _mesa_GetProgramBinary</li>
<li>mesa: Always generate GL_INVALID_OPERATION in _mesa_GetProgramBinary</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>auxilary/os: correct sysctl use in os_get_total_physical_memory()</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>gallivm: Update for RTDyldMemoryManager becoming an unique_ptr.</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>st/omx/dec/h264: fix picture out-of-order with poc type 0 v2</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>install-lib-links: don't depend on .libs directory</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>vbo: fix an unitialized-variable warning</li>
<li>radeonsi: fix point sprites</li>
</ul>
<p>Matt Turner (4):</p>
<ul>
<li>glsl: Rewrite and fix min/max to saturate optimization.</li>
<li>mesa: Correct backwards NULL check.</li>
<li>i965/fs: Don't use backend_visitor::instructions after creating the CFG.</li>
<li>mesa: Correct backwards NULL check.</li>
</ul>
</div>
</body>
</html>

132
docs/relnotes/10.4.7.html Normal file

@@ -0,0 +1,132 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.7 Release Notes / March 20, 2015</h1>
<p>
Mesa 10.4.7 is a bug fix release which fixes bugs found since the 10.4.6 release.
</p>
<p>
Mesa 10.4.7 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79202">Bug 79202</a> - valgrind errors in glsl-fs-uniform-array-loop-unroll.shader_test; random code generation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89156">Bug 89156</a> - r300g: GL_COMPRESSED_RED_RGTC1 / ATI1N support broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89224">Bug 89224</a> - Incorrect rendering of Unigine Valley running in VM on VMware Workstation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89530">Bug 89530</a> - FTBFS in loader: missing fstat</li>
</ul>
<h2>Changes</h2>
<p>Andrey Sudnik (1):</p>
<ul>
<li>i965/vec4: Don't lose the saturate modifier in copy propagation.</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>egl: Take alpha bits into account when selecting GBM formats</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.6 release</li>
<li>cherry-ignore: add not applicable/rejected commits</li>
<li>mesa: rename format_info.c to format_info.h</li>
<li>loader: include &lt;sys/stat.h&gt; for non-sysfs builds</li>
<li>auxiliary/os: fix the android build - s/drm_munmap/os_munmap/</li>
<li>Update version to 10.4.7</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>i965: Fix out-of-bounds accesses into pull_constant_loc array</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>freedreno: move fb state copy after checking for size change</li>
<li>freedreno/ir3: fix array count returned by TXQ</li>
<li>freedreno/ir3: get the # of miplevels from getinfo</li>
<li>freedreno: fix slice pitch calculations</li>
</ul>
<p>Marc-Andre Lureau (1):</p>
<ul>
<li>gallium/auxiliary/indices: fix start param</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>r300g: fix RGTC1 and LATC1 SNORM formats</li>
<li>r300g: fix a crash when resolving into an sRGB texture</li>
<li>r300g: fix sRGB-&gt;sRGB blits</li>
<li>radeonsi: increase coords array size for radeon_llvm_emit_prepare_cube_coords</li>
</ul>
<p>Mario Kleiner (1):</p>
<ul>
<li>glx: Handle out-of-sequence swap completion events correctly. (v2)</li>
</ul>
<p>Matt Turner (2):</p>
<ul>
<li>r300g: Use PATH_MAX instead of limiting ourselves to 100 chars.</li>
<li>r300g: Check return value of snprintf().</li>
</ul>
<p>Rob Clark (2):</p>
<ul>
<li>freedreno/ir3: fix silly typo for binning pass shaders</li>
<li>freedreno: update generated headers</li>
</ul>
<p>Samuel Iglesias Gonsalvez (1):</p>
<ul>
<li>glsl: optimize (0 cmp x + y) into (-x cmp y).</li>
</ul>
<p>Stefan Dösinger (1):</p>
<ul>
<li>r300g: Fix the ATI1N swizzle (RGTC1 and LATC1)</li>
</ul>
</div>
</body>
</html>

259
docs/relnotes/10.4.html Normal file

@@ -0,0 +1,259 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4 Release Notes / December 14, 2014</h1>
<p>
Mesa 10.4 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for Mesa 10.4.1.
</p>
<p>
Mesa 10.4 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
abfbfd2d91ce81491c5bb6923ae649212ad5f82d0bee277de8704cc948dc221e MesaLib-10.4.0.tar.gz
98a7dff3a1a6708c79789de8b9a05d8042e867067f70e8f30387c15026233219 MesaLib-10.4.0.tar.bz2
443a6d46d0691b5ac811d8d30091b1716c365689b16d49c57cf273c2b76086fe MesaLib-10.4.0.zip
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>GL_ARB_conditional_render_inverted on nv50</li>
<li>GL_ARB_sample_shading on r600</li>
<li>GL_ARB_texture_view on nv50, nvc0</li>
<li>GL_ARB_clip_control on nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe</li>
<li>GL_KHR_context_flush_control on all drivers</li>
</ul>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79963">Bug 79963</a> - [ILK Bisected]some piglit and ogles2conform cases fail </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=29661">Bug 29661</a> - MSVC built u_format_test fails on Windows</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=38873">Bug 38873</a> - [855gm] gnome-shell misrendered</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54372">Bug 54372</a> - GLX_INTEL_swap_event crashes driver when swapping window buffers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60879">Bug 60879</a> - [radeonsi] X11 can't start with acceleration enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61415">Bug 61415</a> - Clover ignores --with-opencl-libdir path</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64471">Bug 64471</a> - Radeon HD6570 lockup in Brütal Legend with HyperZ</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66184">Bug 66184</a> - src/mesa/state_tracker/st_glsl_to_tgsi.cpp:3216:simplify_cmp: Assertion `inst-&gt;dst.index &lt; 4096' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=67672">Bug 67672</a> - [llvmpipe] lp_test_arit fails on old CPUs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69200">Bug 69200</a> - [Bisected]Piglit glx/glx-multithread-shader-compile aborted</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70410">Bug 70410</a> - egl-static/Makefile: linking fails with llvm &gt;= 3.4</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72685">Bug 72685</a> - [radeonsi hyperz] Artifacts in Unigine Sanctuary</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72819">Bug 72819</a> - [855GM] Incorrect drop shadow color on windows and strange white rectangle when showing/hiding GLX-dock...</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74563">Bug 74563</a> - Surfaceless contexts are not properly released by DRI drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74863">Bug 74863</a> - [r600g] HyperZ broken on RV770 and CYPRESS (Left 4 Dead 2 trees corruption) bisected!</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75011">Bug 75011</a> - [hyperz] Performance drop since git-01e6371 (disable hyperz by default) with radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75112">Bug 75112</a> - Meta Bug for HyperZ issues on r600g and radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76252">Bug 76252</a> - Dynamic loading/unloading of opengl32.dll results in a deadlock</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76861">Bug 76861</a> - mid3 generates slow code for constant arguments</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77957">Bug 77957</a> - Variably-indexed constant arrays result in terrible shader code</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78468">Bug 78468</a> - Compiling of shader gets stuck in infinite loop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78770">Bug 78770</a> - [SNB bisected]Webglc conformance/textures/texture-size-limit.html fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79155">Bug 79155</a> - [Tesseract Game] Global Illumination: Medium Causes Color Distortion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79462">Bug 79462</a> - [NVC0/Codegen] Shader compilation falis in spill logic</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80011">Bug 80011</a> - [softpipe] tgsi/tgsi_exec.c:2023:exec_txf: Assertion `0' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80012">Bug 80012</a> - [softpipe] draw/draw_gs.c:113:tgsi_fetch_gs_outputs: Assertion `!util_is_inf_or_nan(output[slot][0])' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80050">Bug 80050</a> - [855GM] Incorrect drop shadow color under windows in Cinnamon persists with MESA 10.1.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80247">Bug 80247</a> - Khronos conformance test ES3-CTS.gtf.GL3Tests.transform_feedback.transform_feedback_vertex_id fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80561">Bug 80561</a> - Incorrect implementation of some VDPAU APIs.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80615">Bug 80615</a> - Files in bellagio directory [omx tracker] don't respect installation folder</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80848">Bug 80848</a> - [dri3] Building mesa fails with dri3 enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81680">Bug 81680</a> - [r600g] Firefox crashes with hardware acceleration turned on</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82255">Bug 82255</a> - [VP2] Chroma planes are vertically stretched during VDPAU playback</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82472">Bug 82472</a> - piglit 16385-consecutive-chars regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82537">Bug 82537</a> - Stunt Rally GLSL compiler assertion failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82538">Bug 82538</a> - Super Maryo Chronicles fails with st/mesa assertion failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82539">Bug 82539</a> - vmw_screen_dri.lo In file included from vmw_screen_dri.c:41: vmwgfx_drm.h:32:17: error: drm.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82796">Bug 82796</a> - [IVB/BYT-M/HSW/BDW Bisected]Synmark2_v6.0_OglTerrainFlyInst/OglTerrainPanInst cannot run as image validation failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82804">Bug 82804</a> - unreal engine 4 rendering errors</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82828">Bug 82828</a> - Regression: Crash in 3Dmark2001</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82846">Bug 82846</a> - [BDW Bisected] Gpu hang when running Lightsmark v2008/Warsow v1.0/Xonotic v0.7/unigine-demos</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82881">Bug 82881</a> - test_vec4_register_coalesce regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82882">Bug 82882</a> - [swrast] piglit glsl-fs-uniform-bool-1 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82921">Bug 82921</a> - layout(location=0) emits error &gt;= MAX_UNIFORM_LOCATIONS due to integer underflow</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82929">Bug 82929</a> - [BDW Bisected]glxgears causes X hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82932">Bug 82932</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.shaders.indexing.vector_subscript.vec3_static_loop_subscript_write_direct_read_vertex fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83079">Bug 83079</a> - [NVC0] Dota 2 (Linux native and Wine) crash with Nouveau Drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83080">Bug 83080</a> - [SNB+ Bisected]ES3-CTS.shaders.loops.do_while_constant_iterations.mixed_break_continue_fragment fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83081">Bug 83081</a> - [BDW Bisected]Piglit spec_ARB_sample_shading_builtin-gl-sample-mask_2 is core dumped</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83127">Bug 83127</a> - [ILK Bisected]Piglit glean_texCombine fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83148">Bug 83148</a> - Unity invisible under Ubuntu 14.04 and 14.10</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83355">Bug 83355</a> - FTBFS: src/mesa/program/program_lexer.l:122:64: error: unknown type name 'YYSTYPE'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83380">Bug 83380</a> - Linking fails when not writing gl_Position.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83418">Bug 83418</a> - EU IV is incorrectly rendered after git1409011930.d571f2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83432">Bug 83432</a> - r600_query.c:269:r600_emit_query_end: Assertion `ctx-&gt;num_pipelinestat_queries &gt; 0' failed [Gallium HUD]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83463">Bug 83463</a> - [swrast] piglit glsl-vs-clamp-1 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83468">Bug 83468</a> - [UBO] Using bool from UBO as if-statement condition asserts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83500">Bug 83500</a> - si_dma_copy_tile causes GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83506">Bug 83506</a> - [UBO] row_major layout ignored inside structures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83533">Bug 83533</a> - [UBO] nested structures don't get appropriate padding</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83573">Bug 83573</a> - [swrast] piglit fs-op-not-bool-using-if regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83574">Bug 83574</a> - [llvmpipe] [softpipe] piglit arb_explicit_uniform_location-use-of-unused-loc regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83741">Bug 83741</a> - [UBO] row_major layout partially ignored for arrays of structures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83777">Bug 83777</a> - [regression] ilo fails to build</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83934">Bug 83934</a> - Structures must have same name to be considered same type.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84140">Bug 84140</a> - mplayer crashes playing some files using vdpau output</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84145">Bug 84145</a> - UE4: Realistic Rendering Demo render blue</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84178">Bug 84178</a> - Big glamor regression in Xorg server 1.6.99.1 GIT: x11perf 1.5 Test: PutImage XY 500x500 Square</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84355">Bug 84355</a> - texture2DProjLod and textureCubeLod are not supported when using GLES.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84529">Bug 84529</a> - [IVB bisected] glean fragProg1 CMP test failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84538">Bug 84538</a> - lp_test_format.c:226:4: error: too few arguments to function gallivm_create</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84539">Bug 84539</a> - brw_fs_register_coalesce.cpp:183: bool fs_visitor::register_coalesce(): Assertion `src_size &lt;= 11' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84557">Bug 84557</a> - [HSW] &quot;Emit ELSE/ENDIF JIP with type D on Gen 7&quot; causes Atomic Afterlife and GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84651">Bug 84651</a> - Distorted graphics or black window when running Battle.net app on Intel hardware via wine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84662">Bug 84662</a> - Long pauses with Unreal demo Elemental on R9270X since : Always flush the HDP cache before submitting a CS to the GPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84777">Bug 84777</a> - [BSW]Piglit spec_glsl-1.50_execution_geometry-basic fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84807">Bug 84807</a> - Build issue starting between bf4aecfb2acc8d0dc815105d2f36eccbc97c284b and a3e9582f09249ad27716ba82c7dfcee685b65d51</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85189">Bug 85189</a> - llvm/invocation.cpp: In function 'void {anonymous}::optimize(llvm::Module*, unsigned int, const std::vector&lt;llvm::Function*&gt;&amp;)': llvm/invocation.cpp:324:18: error: expected type-specifier</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85267">Bug 85267</a> - vlc crashes with vdpau (Radeon 3850HD) [r600]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85377">Bug 85377</a> - lp_test_format failure with llvm-3.6</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85425">Bug 85425</a> - [bisected] Compiler error in clip control operations in meta</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85429">Bug 85429</a> - indirect.c:296: multiple definition of `__indirect_glNewList'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85454">Bug 85454</a> - Unigine Sanctuary with Wine crashes on Mesa Git</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85647">Bug 85647</a> - Random radeonsi crashes with mesa 10.3.x</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85683">Bug 85683</a> - [i965 Bisected]Piglit shaders_glsl-vs-raytrace-bug26691 segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85691">Bug 85691</a> - 'glsl: Drop constant 0.0 components from dot products.' broke piglit shaders/glsl-gnome-shell-dim-window and a few others with Gallium</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86025">Bug 86025</a> - src\glsl\list.h(535) : error C2143: syntax error : missing ';' before 'type'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86089">Bug 86089</a> - [r600g][mesa 10.4.0-dev] shader failure - r600_sb::bc_finalizer::cf_peephole() when starting Second Life</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86145">Bug 86145</a> - Pipeline statistic counter values for VF always 0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86618">Bug 86618</a> - [NV96] neg modifiers not working in MIN and MAX operations</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86760">Bug 86760</a> - mesa doesn't build: recipe for target 'r600_llvm.lo' failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86764">Bug 86764</a> - [SNB+ Bisected]Piglit glean/pointSprite fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86788">Bug 86788</a> - (bisected) 32bit UrbanTerror 4.1 timedemo sse4.1 segfault...</li>
</ul>
<h2>Changes</h2>
<ul>
<li>The environment variable GALLIUM_MSAA that forced a multisample GLX visual was removed.</li>
</ul>
</div>
</body>
</html>

1868
include/D3D9/d3d9.h Normal file

File diff suppressed because it is too large

387
include/D3D9/d3d9caps.h Normal file

@@ -0,0 +1,387 @@
/*
* Copyright 2011 Joakim Sindholt <opensource@zhasha.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE. */
#ifndef _D3D9CAPS_H_
#define _D3D9CAPS_H_
#include "d3d9types.h"
/* Caps flags */
#define D3DCAPS2_FULLSCREENGAMMA 0x00020000
#define D3DCAPS2_CANCALIBRATEGAMMA 0x00100000
#define D3DCAPS2_RESERVED 0x02000000
#define D3DCAPS2_CANMANAGERESOURCE 0x10000000
#define D3DCAPS2_DYNAMICTEXTURES 0x20000000
#define D3DCAPS2_CANAUTOGENMIPMAP 0x40000000
#define D3DCAPS2_CANSHARERESOURCE 0x80000000
#define D3DCAPS3_ALPHA_FULLSCREEN_FLIP_OR_DISCARD 0x00000020
#define D3DCAPS3_LINEAR_TO_SRGB_PRESENTATION 0x00000080
#define D3DCAPS3_COPY_TO_VIDMEM 0x00000100
#define D3DCAPS3_COPY_TO_SYSTEMMEM 0x00000200
#define D3DCAPS3_DXVAHD 0x00000400
#define D3DCAPS3_RESERVED 0x8000001F
#define D3DPRESENT_INTERVAL_DEFAULT 0x00000000
#define D3DPRESENT_INTERVAL_ONE 0x00000001
#define D3DPRESENT_INTERVAL_TWO 0x00000002
#define D3DPRESENT_INTERVAL_THREE 0x00000004
#define D3DPRESENT_INTERVAL_FOUR 0x00000008
#define D3DPRESENT_INTERVAL_IMMEDIATE 0x80000000
#define D3DCURSORCAPS_COLOR 0x00000001
#define D3DCURSORCAPS_LOWRES 0x00000002
#define D3DDEVCAPS_EXECUTESYSTEMMEMORY 0x00000010
#define D3DDEVCAPS_EXECUTEVIDEOMEMORY 0x00000020
#define D3DDEVCAPS_TLVERTEXSYSTEMMEMORY 0x00000040
#define D3DDEVCAPS_TLVERTEXVIDEOMEMORY 0x00000080
#define D3DDEVCAPS_TEXTURESYSTEMMEMORY 0x00000100
#define D3DDEVCAPS_TEXTUREVIDEOMEMORY 0x00000200
#define D3DDEVCAPS_DRAWPRIMTLVERTEX 0x00000400
#define D3DDEVCAPS_CANRENDERAFTERFLIP 0x00000800
#define D3DDEVCAPS_TEXTURENONLOCALVIDMEM 0x00001000
#define D3DDEVCAPS_DRAWPRIMITIVES2 0x00002000
#define D3DDEVCAPS_SEPARATETEXTUREMEMORIES 0x00004000
#define D3DDEVCAPS_DRAWPRIMITIVES2EX 0x00008000
#define D3DDEVCAPS_HWTRANSFORMANDLIGHT 0x00010000
#define D3DDEVCAPS_CANBLTSYSTONONLOCAL 0x00020000
#define D3DDEVCAPS_HWRASTERIZATION 0x00080000
#define D3DDEVCAPS_PUREDEVICE 0x00100000
#define D3DDEVCAPS_QUINTICRTPATCHES 0x00200000
#define D3DDEVCAPS_RTPATCHES 0x00400000
#define D3DDEVCAPS_RTPATCHHANDLEZERO 0x00800000
#define D3DDEVCAPS_NPATCHES 0x01000000
#define D3DPMISCCAPS_MASKZ 0x00000002
#define D3DPMISCCAPS_CULLNONE 0x00000010
#define D3DPMISCCAPS_CULLCW 0x00000020
#define D3DPMISCCAPS_CULLCCW 0x00000040
#define D3DPMISCCAPS_COLORWRITEENABLE 0x00000080
#define D3DPMISCCAPS_CLIPPLANESCALEDPOINTS 0x00000100
#define D3DPMISCCAPS_CLIPTLVERTS 0x00000200
#define D3DPMISCCAPS_TSSARGTEMP 0x00000400
#define D3DPMISCCAPS_BLENDOP 0x00000800
#define D3DPMISCCAPS_NULLREFERENCE 0x00001000
#define D3DPMISCCAPS_INDEPENDENTWRITEMASKS 0x00004000
#define D3DPMISCCAPS_PERSTAGECONSTANT 0x00008000
#define D3DPMISCCAPS_FOGANDSPECULARALPHA 0x00010000
#define D3DPMISCCAPS_SEPARATEALPHABLEND 0x00020000
#define D3DPMISCCAPS_MRTINDEPENDENTBITDEPTHS 0x00040000
#define D3DPMISCCAPS_MRTPOSTPIXELSHADERBLENDING 0x00080000
#define D3DPMISCCAPS_FOGVERTEXCLAMPED 0x00100000
#define D3DPMISCCAPS_POSTBLENDSRGBCONVERT 0x00200000
#define D3DPRASTERCAPS_DITHER 0x00000001
#define D3DPRASTERCAPS_ZTEST 0x00000010
#define D3DPRASTERCAPS_FOGVERTEX 0x00000080
#define D3DPRASTERCAPS_FOGTABLE 0x00000100
#define D3DPRASTERCAPS_MIPMAPLODBIAS 0x00002000
#define D3DPRASTERCAPS_ZBUFFERLESSHSR 0x00008000
#define D3DPRASTERCAPS_FOGRANGE 0x00010000
#define D3DPRASTERCAPS_ANISOTROPY 0x00020000
#define D3DPRASTERCAPS_WBUFFER 0x00040000
#define D3DPRASTERCAPS_WFOG 0x00100000
#define D3DPRASTERCAPS_ZFOG 0x00200000
#define D3DPRASTERCAPS_COLORPERSPECTIVE 0x00400000
#define D3DPRASTERCAPS_SCISSORTEST 0x01000000
#define D3DPRASTERCAPS_SLOPESCALEDEPTHBIAS 0x02000000
#define D3DPRASTERCAPS_DEPTHBIAS 0x04000000
#define D3DPRASTERCAPS_MULTISAMPLE_TOGGLE 0x08000000
#define D3DPCMPCAPS_NEVER 0x00000001
#define D3DPCMPCAPS_LESS 0x00000002
#define D3DPCMPCAPS_EQUAL 0x00000004
#define D3DPCMPCAPS_LESSEQUAL 0x00000008
#define D3DPCMPCAPS_GREATER 0x00000010
#define D3DPCMPCAPS_NOTEQUAL 0x00000020
#define D3DPCMPCAPS_GREATEREQUAL 0x00000040
#define D3DPCMPCAPS_ALWAYS 0x00000080
#define D3DPBLENDCAPS_ZERO 0x00000001
#define D3DPBLENDCAPS_ONE 0x00000002
#define D3DPBLENDCAPS_SRCCOLOR 0x00000004
#define D3DPBLENDCAPS_INVSRCCOLOR 0x00000008
#define D3DPBLENDCAPS_SRCALPHA 0x00000010
#define D3DPBLENDCAPS_INVSRCALPHA 0x00000020
#define D3DPBLENDCAPS_DESTALPHA 0x00000040
#define D3DPBLENDCAPS_INVDESTALPHA 0x00000080
#define D3DPBLENDCAPS_DESTCOLOR 0x00000100
#define D3DPBLENDCAPS_INVDESTCOLOR 0x00000200
#define D3DPBLENDCAPS_SRCALPHASAT 0x00000400
#define D3DPBLENDCAPS_BOTHSRCALPHA 0x00000800
#define D3DPBLENDCAPS_BOTHINVSRCALPHA 0x00001000
#define D3DPBLENDCAPS_BLENDFACTOR 0x00002000
#ifndef D3D_DISABLE_9EX
# define D3DPBLENDCAPS_SRCCOLOR2 0x00004000
# define D3DPBLENDCAPS_INVSRCCOLOR2 0x00008000
#endif
#define D3DPSHADECAPS_COLORGOURAUDRGB 0x00000008
#define D3DPSHADECAPS_SPECULARGOURAUDRGB 0x00000200
#define D3DPSHADECAPS_ALPHAGOURAUDBLEND 0x00004000
#define D3DPSHADECAPS_FOGGOURAUD 0x00080000
#define D3DPTEXTURECAPS_PERSPECTIVE 0x00000001
#define D3DPTEXTURECAPS_POW2 0x00000002
#define D3DPTEXTURECAPS_ALPHA 0x00000004
#define D3DPTEXTURECAPS_SQUAREONLY 0x00000020
#define D3DPTEXTURECAPS_TEXREPEATNOTSCALEDBYSIZE 0x00000040
#define D3DPTEXTURECAPS_ALPHAPALETTE 0x00000080
#define D3DPTEXTURECAPS_NONPOW2CONDITIONAL 0x00000100
#define D3DPTEXTURECAPS_PROJECTED 0x00000400
#define D3DPTEXTURECAPS_CUBEMAP 0x00000800
#define D3DPTEXTURECAPS_VOLUMEMAP 0x00002000
#define D3DPTEXTURECAPS_MIPMAP 0x00004000
#define D3DPTEXTURECAPS_MIPVOLUMEMAP 0x00008000
#define D3DPTEXTURECAPS_MIPCUBEMAP 0x00010000
#define D3DPTEXTURECAPS_CUBEMAP_POW2 0x00020000
#define D3DPTEXTURECAPS_VOLUMEMAP_POW2 0x00040000
#define D3DPTEXTURECAPS_NOPROJECTEDBUMPENV 0x00200000
#define D3DPTFILTERCAPS_MINFPOINT 0x00000100
#define D3DPTFILTERCAPS_MINFLINEAR 0x00000200
#define D3DPTFILTERCAPS_MINFANISOTROPIC 0x00000400
#define D3DPTFILTERCAPS_MINFPYRAMIDALQUAD 0x00000800
#define D3DPTFILTERCAPS_MINFGAUSSIANQUAD 0x00001000
#define D3DPTFILTERCAPS_MIPFPOINT 0x00010000
#define D3DPTFILTERCAPS_MIPFLINEAR 0x00020000
#define D3DPTFILTERCAPS_MAGFPOINT 0x01000000
#define D3DPTFILTERCAPS_MAGFLINEAR 0x02000000
#define D3DPTFILTERCAPS_MAGFANISOTROPIC 0x04000000
#define D3DPTFILTERCAPS_MAGFPYRAMIDALQUAD 0x08000000
#define D3DPTFILTERCAPS_MAGFGAUSSIANQUAD 0x10000000
#define D3DPTADDRESSCAPS_WRAP 0x00000001
#define D3DPTADDRESSCAPS_MIRROR 0x00000002
#define D3DPTADDRESSCAPS_CLAMP 0x00000004
#define D3DPTADDRESSCAPS_BORDER 0x00000008
#define D3DPTADDRESSCAPS_INDEPENDENTUV 0x00000010
#define D3DPTADDRESSCAPS_MIRRORONCE 0x00000020
#define D3DLINECAPS_TEXTURE 0x00000001
#define D3DLINECAPS_ZTEST 0x00000002
#define D3DLINECAPS_BLEND 0x00000004
#define D3DLINECAPS_ALPHACMP 0x00000008
#define D3DLINECAPS_FOG 0x00000010
#define D3DLINECAPS_ANTIALIAS 0x00000020
#define D3DSTENCILCAPS_KEEP 0x00000001
#define D3DSTENCILCAPS_ZERO 0x00000002
#define D3DSTENCILCAPS_REPLACE 0x00000004
#define D3DSTENCILCAPS_INCRSAT 0x00000008
#define D3DSTENCILCAPS_DECRSAT 0x00000010
#define D3DSTENCILCAPS_INVERT 0x00000020
#define D3DSTENCILCAPS_INCR 0x00000040
#define D3DSTENCILCAPS_DECR 0x00000080
#define D3DSTENCILCAPS_TWOSIDED 0x00000100
#define D3DFVFCAPS_TEXCOORDCOUNTMASK 0x0000FFFF
#define D3DFVFCAPS_DONOTSTRIPELEMENTS 0x00080000
#define D3DFVFCAPS_PSIZE 0x00100000
#define D3DTEXOPCAPS_DISABLE 0x00000001
#define D3DTEXOPCAPS_SELECTARG1 0x00000002
#define D3DTEXOPCAPS_SELECTARG2 0x00000004
#define D3DTEXOPCAPS_MODULATE 0x00000008
#define D3DTEXOPCAPS_MODULATE2X 0x00000010
#define D3DTEXOPCAPS_MODULATE4X 0x00000020
#define D3DTEXOPCAPS_ADD 0x00000040
#define D3DTEXOPCAPS_ADDSIGNED 0x00000080
#define D3DTEXOPCAPS_ADDSIGNED2X 0x00000100
#define D3DTEXOPCAPS_SUBTRACT 0x00000200
#define D3DTEXOPCAPS_ADDSMOOTH 0x00000400
#define D3DTEXOPCAPS_BLENDDIFFUSEALPHA 0x00000800
#define D3DTEXOPCAPS_BLENDTEXTUREALPHA 0x00001000
#define D3DTEXOPCAPS_BLENDFACTORALPHA 0x00002000
#define D3DTEXOPCAPS_BLENDTEXTUREALPHAPM 0x00004000
#define D3DTEXOPCAPS_BLENDCURRENTALPHA 0x00008000
#define D3DTEXOPCAPS_PREMODULATE 0x00010000
#define D3DTEXOPCAPS_MODULATEALPHA_ADDCOLOR 0x00020000
#define D3DTEXOPCAPS_MODULATECOLOR_ADDALPHA 0x00040000
#define D3DTEXOPCAPS_MODULATEINVALPHA_ADDCOLOR 0x00080000
#define D3DTEXOPCAPS_MODULATEINVCOLOR_ADDALPHA 0x00100000
#define D3DTEXOPCAPS_BUMPENVMAP 0x00200000
#define D3DTEXOPCAPS_BUMPENVMAPLUMINANCE 0x00400000
#define D3DTEXOPCAPS_DOTPRODUCT3 0x00800000
#define D3DTEXOPCAPS_MULTIPLYADD 0x01000000
#define D3DTEXOPCAPS_LERP 0x02000000
#define D3DVTXPCAPS_TEXGEN 0x00000001
#define D3DVTXPCAPS_MATERIALSOURCE7 0x00000002
#define D3DVTXPCAPS_DIRECTIONALLIGHTS 0x00000008
#define D3DVTXPCAPS_POSITIONALLIGHTS 0x00000010
#define D3DVTXPCAPS_LOCALVIEWER 0x00000020
#define D3DVTXPCAPS_TWEENING 0x00000040
#define D3DVTXPCAPS_TEXGEN_SPHEREMAP 0x00000100
#define D3DVTXPCAPS_NO_TEXGEN_NONLOCALVIEWER 0x00000200
#define D3DDEVCAPS2_STREAMOFFSET 0x00000001
#define D3DDEVCAPS2_DMAPNPATCH 0x00000002
#define D3DDEVCAPS2_ADAPTIVETESSRTPATCH 0x00000004
#define D3DDEVCAPS2_ADAPTIVETESSNPATCH 0x00000008
#define D3DDEVCAPS2_CAN_STRETCHRECT_FROM_TEXTURES 0x00000010
#define D3DDEVCAPS2_PRESAMPLEDDMAPNPATCH 0x00000020
#define D3DDEVCAPS2_VERTEXELEMENTSCANSHARESTREAMOFFSET 0x00000040
#define D3DDTCAPS_UBYTE4 0x00000001
#define D3DDTCAPS_UBYTE4N 0x00000002
#define D3DDTCAPS_SHORT2N 0x00000004
#define D3DDTCAPS_SHORT4N 0x00000008
#define D3DDTCAPS_USHORT2N 0x00000010
#define D3DDTCAPS_USHORT4N 0x00000020
#define D3DDTCAPS_UDEC3 0x00000040
#define D3DDTCAPS_DEC3N 0x00000080
#define D3DDTCAPS_FLOAT16_2 0x00000100
#define D3DDTCAPS_FLOAT16_4 0x00000200
#define D3DVS20_MAX_DYNAMICFLOWCONTROLDEPTH 24
#define D3DVS20_MIN_DYNAMICFLOWCONTROLDEPTH 0
#define D3DVS20_MAX_NUMTEMPS 32
#define D3DVS20_MIN_NUMTEMPS 12
#define D3DVS20_MAX_STATICFLOWCONTROLDEPTH 4
#define D3DVS20_MIN_STATICFLOWCONTROLDEPTH 1
#define D3DVS20CAPS_PREDICATION (1 << 0)
#define D3DPS20CAPS_ARBITRARYSWIZZLE (1 << 0)
#define D3DPS20CAPS_GRADIENTINSTRUCTIONS (1 << 1)
#define D3DPS20CAPS_PREDICATION (1 << 2)
#define D3DPS20CAPS_NODEPENDENTREADLIMIT (1 << 3)
#define D3DPS20CAPS_NOTEXINSTRUCTIONLIMIT (1 << 4)
#define D3DPS20_MAX_DYNAMICFLOWCONTROLDEPTH 24
#define D3DPS20_MIN_DYNAMICFLOWCONTROLDEPTH 0
#define D3DPS20_MAX_NUMTEMPS 32
#define D3DPS20_MIN_NUMTEMPS 12
#define D3DPS20_MAX_STATICFLOWCONTROLDEPTH 4
#define D3DPS20_MIN_STATICFLOWCONTROLDEPTH 0
#define D3DPS20_MAX_NUMINSTRUCTIONSLOTS 512
#define D3DPS20_MIN_NUMINSTRUCTIONSLOTS 96
#define D3DMIN30SHADERINSTRUCTIONS 512
#define D3DMAX30SHADERINSTRUCTIONS 32768
/* Structs */
typedef struct _D3DVSHADERCAPS2_0 {
DWORD Caps;
INT DynamicFlowControlDepth;
INT NumTemps;
INT StaticFlowControlDepth;
} D3DVSHADERCAPS2_0, *PD3DVSHADERCAPS2_0, *LPD3DVSHADERCAPS2_0;
typedef struct _D3DPSHADERCAPS2_0 {
DWORD Caps;
INT DynamicFlowControlDepth;
INT NumTemps;
INT StaticFlowControlDepth;
INT NumInstructionSlots;
} D3DPSHADERCAPS2_0, *PD3DPSHADERCAPS2_0, *LPD3DPSHADERCAPS2_0;
typedef struct _D3DCAPS9 {
D3DDEVTYPE DeviceType;
UINT AdapterOrdinal;
DWORD Caps;
DWORD Caps2;
DWORD Caps3;
DWORD PresentationIntervals;
DWORD CursorCaps;
DWORD DevCaps;
DWORD PrimitiveMiscCaps;
DWORD RasterCaps;
DWORD ZCmpCaps;
DWORD SrcBlendCaps;
DWORD DestBlendCaps;
DWORD AlphaCmpCaps;
DWORD ShadeCaps;
DWORD TextureCaps;
DWORD TextureFilterCaps;
DWORD CubeTextureFilterCaps;
DWORD VolumeTextureFilterCaps;
DWORD TextureAddressCaps;
DWORD VolumeTextureAddressCaps;
DWORD LineCaps;
DWORD MaxTextureWidth;
DWORD MaxTextureHeight;
DWORD MaxVolumeExtent;
DWORD MaxTextureRepeat;
DWORD MaxTextureAspectRatio;
DWORD MaxAnisotropy;
float MaxVertexW;
float GuardBandLeft;
float GuardBandTop;
float GuardBandRight;
float GuardBandBottom;
float ExtentsAdjust;
DWORD StencilCaps;
DWORD FVFCaps;
DWORD TextureOpCaps;
DWORD MaxTextureBlendStages;
DWORD MaxSimultaneousTextures;
DWORD VertexProcessingCaps;
DWORD MaxActiveLights;
DWORD MaxUserClipPlanes;
DWORD MaxVertexBlendMatrices;
DWORD MaxVertexBlendMatrixIndex;
float MaxPointSize;
DWORD MaxPrimitiveCount;
DWORD MaxVertexIndex;
DWORD MaxStreams;
DWORD MaxStreamStride;
DWORD VertexShaderVersion;
DWORD MaxVertexShaderConst;
DWORD PixelShaderVersion;
float PixelShader1xMaxValue;
DWORD DevCaps2;
float MaxNpatchTessellationLevel;
DWORD Reserved5;
UINT MasterAdapterOrdinal;
UINT AdapterOrdinalInGroup;
UINT NumberOfAdaptersInGroup;
DWORD DeclTypes;
DWORD NumSimultaneousRTs;
DWORD StretchRectFilterCaps;
D3DVSHADERCAPS2_0 VS20Caps;
D3DPSHADERCAPS2_0 PS20Caps;
DWORD VertexTextureFilterCaps;
DWORD MaxVShaderInstructionsExecuted;
DWORD MaxPShaderInstructionsExecuted;
DWORD MaxVertexShader30InstructionSlots;
DWORD MaxPixelShader30InstructionSlots;
} D3DCAPS9, *PD3DCAPS9, *LPD3DCAPS9;
typedef struct _D3DCONTENTPROTECTIONCAPS {
DWORD Caps;
GUID KeyExchangeType;
UINT BufferAlignmentStart;
UINT BlockAlignmentSize;
ULONGLONG ProtectedMemorySize;
} D3DCONTENTPROTECTIONCAPS, *PD3DCONTENTPROTECTIONCAPS, *LPD3DCONTENTPROTECTIONCAPS;
typedef struct _D3DOVERLAYCAPS {
UINT Caps;
UINT MaxOverlayDisplayWidth;
UINT MaxOverlayDisplayHeight;
} D3DOVERLAYCAPS, *PD3DOVERLAYCAPS, *LPD3DOVERLAYCAPS;
#endif /* _D3D9CAPS_H_ */
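The caps flags above are bit masks meant to be tested against the matching `DWORD` fields of `D3DCAPS9` (for example `D3DPTEXTURECAPS_*` against `TextureCaps`). The following is a minimal sketch of how a consumer might interpret a few of them. It does not include the real header: `struct caps_subset`, `enum npot_level` and the `caps_*` helpers are hypothetical names introduced only for illustration, the flag values are copied from the listing above, and the NPOT interpretation follows my understanding of the Direct3D 9 documentation rather than anything stated in this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Flag and limit values copied from the d3d9caps.h listing above. */
#define D3DPTEXTURECAPS_POW2               0x00000002
#define D3DPTEXTURECAPS_NONPOW2CONDITIONAL 0x00000100
#define D3DPRASTERCAPS_ANISOTROPY          0x00020000
#define D3DPS20_MIN_NUMTEMPS               12
#define D3DPS20_MAX_NUMTEMPS               32

/* Assumption: DWORD is a 32-bit unsigned integer and INT a 32-bit signed one. */
typedef uint32_t DWORD;
typedef int32_t INT;

/* Hypothetical stand-in for a few D3DCAPS9 fields, so that the sketch
 * compiles without d3d9types.h. */
struct caps_subset {
    DWORD RasterCaps;
    DWORD TextureCaps;
    DWORD MaxAnisotropy;
    INT   PS20NumTemps; /* mirrors PS20Caps.NumTemps */
};

enum npot_level { NPOT_NONE, NPOT_CONDITIONAL, NPOT_FULL };

/* Returns nonzero when every bit of mask is set in field. */
static int caps_has(DWORD field, DWORD mask)
{
    assert(mask != 0);
    return (field & mask) == mask;
}

/* POW2 clear: unrestricted NPOT textures.
 * POW2 and NONPOW2CONDITIONAL both set: NPOT with restrictions.
 * POW2 alone: power-of-two textures only. */
static enum npot_level caps_npot_level(const struct caps_subset *c)
{
    if (!caps_has(c->TextureCaps, D3DPTEXTURECAPS_POW2))
        return NPOT_FULL;
    if (caps_has(c->TextureCaps, D3DPTEXTURECAPS_NONPOW2CONDITIONAL))
        return NPOT_CONDITIONAL;
    return NPOT_NONE;
}

/* Falls back to 1 when the raster caps do not advertise anisotropy. */
static DWORD caps_usable_anisotropy(const struct caps_subset *c)
{
    if (!caps_has(c->RasterCaps, D3DPRASTERCAPS_ANISOTROPY))
        return 1;
    return c->MaxAnisotropy > 1 ? c->MaxAnisotropy : 1;
}

/* Checks a reported temp count against the D3DPS20_MIN/MAX_NUMTEMPS bounds. */
static int caps_ps20_temps_in_range(const struct caps_subset *c)
{
    return c->PS20NumTemps >= D3DPS20_MIN_NUMTEMPS &&
           c->PS20NumTemps <= D3DPS20_MAX_NUMTEMPS;
}
```

Comparing with `(field & mask) == mask` rather than `field & mask` keeps the helper correct if a caller passes a mask with several bits set.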

1810
include/D3D9/d3d9types.h Normal file

File diff suppressed because it is too large


@@ -33,14 +33,14 @@ extern "C" {
** used to make the header, and the header can be found at
** http://www.opengl.org/registry/
**
** Khronos $Revision: 25922 $ on $Date: 2014-03-17 03:54:32 -0700 (Mon, 17 Mar 2014) $
** Khronos $Revision: 28335 $ on $Date: 2014-09-26 18:55:45 -0700 (Fri, 26 Sep 2014) $
*/
#ifndef GL_APIENTRYP
#define GL_APIENTRYP GL_APIENTRY*
#endif
/* Generated on date 20140317 */
/* Generated on date 20140926 */
/* Generated C header for:
* API: gles2
@@ -54,7 +54,6 @@ extern "C" {
#ifndef GL_KHR_blend_equation_advanced
#define GL_KHR_blend_equation_advanced 1
#define GL_BLEND_ADVANCED_COHERENT_KHR 0x9285
#define GL_MULTIPLY_KHR 0x9294
#define GL_SCREEN_KHR 0x9295
#define GL_OVERLAY_KHR 0x9296
@@ -76,6 +75,17 @@ GL_APICALL void GL_APIENTRY glBlendBarrierKHR (void);
#endif
#endif /* GL_KHR_blend_equation_advanced */
#ifndef GL_KHR_blend_equation_advanced_coherent
#define GL_KHR_blend_equation_advanced_coherent 1
#define GL_BLEND_ADVANCED_COHERENT_KHR 0x9285
#endif /* GL_KHR_blend_equation_advanced_coherent */
#ifndef GL_KHR_context_flush_control
#define GL_KHR_context_flush_control 1
#define GL_CONTEXT_RELEASE_BEHAVIOR_KHR 0x82FB
#define GL_CONTEXT_RELEASE_BEHAVIOR_FLUSH_KHR 0x82FC
#endif /* GL_KHR_context_flush_control */
#ifndef GL_KHR_debug
#define GL_KHR_debug 1
typedef void (GL_APIENTRY *GLDEBUGPROCKHR)(GLenum source,GLenum type,GLuint id,GLenum severity,GLsizei length,const GLchar *message,const void *userParam);
@@ -145,6 +155,34 @@ GL_APICALL void GL_APIENTRY glGetPointervKHR (GLenum pname, void **params);
#endif
#endif /* GL_KHR_debug */
#ifndef GL_KHR_robust_buffer_access_behavior
#define GL_KHR_robust_buffer_access_behavior 1
#endif /* GL_KHR_robust_buffer_access_behavior */
#ifndef GL_KHR_robustness
#define GL_KHR_robustness 1
#define GL_CONTEXT_ROBUST_ACCESS_KHR 0x90F3
#define GL_LOSE_CONTEXT_ON_RESET_KHR 0x8252
#define GL_GUILTY_CONTEXT_RESET_KHR 0x8253
#define GL_INNOCENT_CONTEXT_RESET_KHR 0x8254
#define GL_UNKNOWN_CONTEXT_RESET_KHR 0x8255
#define GL_RESET_NOTIFICATION_STRATEGY_KHR 0x8256
#define GL_NO_RESET_NOTIFICATION_KHR 0x8261
#define GL_CONTEXT_LOST_KHR 0x0507
typedef GLenum (GL_APIENTRYP PFNGLGETGRAPHICSRESETSTATUSKHRPROC) (void);
typedef void (GL_APIENTRYP PFNGLREADNPIXELSKHRPROC) (GLint x, GLint y, GLsizei width, GLsizei height, GLenum format, GLenum type, GLsizei bufSize, void *data);
typedef void (GL_APIENTRYP PFNGLGETNUNIFORMFVKHRPROC) (GLuint program, GLint location, GLsizei bufSize, GLfloat *params);
typedef void (GL_APIENTRYP PFNGLGETNUNIFORMIVKHRPROC) (GLuint program, GLint location, GLsizei bufSize, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETNUNIFORMUIVKHRPROC) (GLuint program, GLint location, GLsizei bufSize, GLuint *params);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL GLenum GL_APIENTRY glGetGraphicsResetStatusKHR (void);
GL_APICALL void GL_APIENTRY glReadnPixelsKHR (GLint x, GLint y, GLsizei width, GLsizei height, GLenum format, GLenum type, GLsizei bufSize, void *data);
GL_APICALL void GL_APIENTRY glGetnUniformfvKHR (GLuint program, GLint location, GLsizei bufSize, GLfloat *params);
GL_APICALL void GL_APIENTRY glGetnUniformivKHR (GLuint program, GLint location, GLsizei bufSize, GLint *params);
GL_APICALL void GL_APIENTRY glGetnUniformuivKHR (GLuint program, GLint location, GLsizei bufSize, GLuint *params);
#endif
#endif /* GL_KHR_robustness */
#ifndef GL_KHR_texture_compression_astc_hdr
#define GL_KHR_texture_compression_astc_hdr 1
#define GL_COMPRESSED_RGBA_ASTC_4x4_KHR 0x93B0
@@ -200,6 +238,10 @@ GL_APICALL void GL_APIENTRY glEGLImageTargetRenderbufferStorageOES (GLenum targe
#define GL_SAMPLER_EXTERNAL_OES 0x8D66
#endif /* GL_OES_EGL_image_external */
#ifndef GL_OES_compressed_ETC1_RGB8_sub_texture
#define GL_OES_compressed_ETC1_RGB8_sub_texture 1
#endif /* GL_OES_compressed_ETC1_RGB8_sub_texture */
#ifndef GL_OES_compressed_ETC1_RGB8_texture
#define GL_OES_compressed_ETC1_RGB8_texture 1
#define GL_ETC1_RGB8_OES 0x8D64
@@ -512,6 +554,10 @@ GL_APICALL void GL_APIENTRY glGetPerfMonitorCounterDataAMD (GLuint monitor, GLen
#define GL_Z400_BINARY_AMD 0x8740
#endif /* GL_AMD_program_binary_Z400 */
#ifndef GL_ANDROID_extension_pack_es31a
#define GL_ANDROID_extension_pack_es31a 1
#endif /* GL_ANDROID_extension_pack_es31a */
#ifndef GL_ANGLE_depth_texture
#define GL_ANGLE_depth_texture 1
#endif /* GL_ANGLE_depth_texture */
@@ -587,6 +633,23 @@ GL_APICALL void GL_APIENTRY glGetTranslatedShaderSourceANGLE (GLuint shader, GLs
#endif
#endif /* GL_ANGLE_translated_shader_source */
#ifndef GL_APPLE_clip_distance
#define GL_APPLE_clip_distance 1
#define GL_MAX_CLIP_DISTANCES_APPLE 0x0D32
#define GL_CLIP_DISTANCE0_APPLE 0x3000
#define GL_CLIP_DISTANCE1_APPLE 0x3001
#define GL_CLIP_DISTANCE2_APPLE 0x3002
#define GL_CLIP_DISTANCE3_APPLE 0x3003
#define GL_CLIP_DISTANCE4_APPLE 0x3004
#define GL_CLIP_DISTANCE5_APPLE 0x3005
#define GL_CLIP_DISTANCE6_APPLE 0x3006
#define GL_CLIP_DISTANCE7_APPLE 0x3007
#endif /* GL_APPLE_clip_distance */
#ifndef GL_APPLE_color_buffer_packed_float
#define GL_APPLE_color_buffer_packed_float 1
#endif /* GL_APPLE_color_buffer_packed_float */
#ifndef GL_APPLE_copy_texture_levels
#define GL_APPLE_copy_texture_levels 1
typedef void (GL_APIENTRYP PFNGLCOPYTEXTURELEVELSAPPLEPROC) (GLuint destinationTexture, GLuint sourceTexture, GLint sourceBaseLevel, GLsizei sourceLevelCount);
@@ -667,6 +730,14 @@ GL_APICALL void GL_APIENTRY glGetSyncivAPPLE (GLsync sync, GLenum pname, GLsizei
#define GL_TEXTURE_MAX_LEVEL_APPLE 0x813D
#endif /* GL_APPLE_texture_max_level */
#ifndef GL_APPLE_texture_packed_float
#define GL_APPLE_texture_packed_float 1
#define GL_UNSIGNED_INT_10F_11F_11F_REV_APPLE 0x8C3B
#define GL_UNSIGNED_INT_5_9_9_9_REV_APPLE 0x8C3E
#define GL_R11F_G11F_B10F_APPLE 0x8C3A
#define GL_RGB9_E5_APPLE 0x8C3D
#endif /* GL_APPLE_texture_packed_float */
#ifndef GL_ARM_mali_program_binary
#define GL_ARM_mali_program_binary 1
#define GL_MALI_PROGRAM_BINARY_ARM 0x8F61
@@ -691,6 +762,13 @@ GL_APICALL void GL_APIENTRY glGetSyncivAPPLE (GLsync sync, GLenum pname, GLsizei
#define GL_ARM_shader_framebuffer_fetch_depth_stencil 1
#endif /* GL_ARM_shader_framebuffer_fetch_depth_stencil */
#ifndef GL_DMP_program_binary
#define GL_DMP_program_binary 1
#define GL_SMAPHS30_PROGRAM_BINARY_DMP 0x9251
#define GL_SMAPHS_PROGRAM_BINARY_DMP 0x9252
#define GL_DMP_PROGRAM_BINARY_DMP 0x9253
#endif /* GL_DMP_program_binary */
#ifndef GL_DMP_shader_binary
#define GL_DMP_shader_binary 1
#define GL_SHADER_BINARY_DMP 0x9250
@@ -712,6 +790,14 @@ GL_APICALL void GL_APIENTRY glGetSyncivAPPLE (GLsync sync, GLenum pname, GLsizei
#define GL_UNSIGNED_NORMALIZED_EXT 0x8C17
#endif /* GL_EXT_color_buffer_half_float */
#ifndef GL_EXT_copy_image
#define GL_EXT_copy_image 1
typedef void (GL_APIENTRYP PFNGLCOPYIMAGESUBDATAEXTPROC) (GLuint srcName, GLenum srcTarget, GLint srcLevel, GLint srcX, GLint srcY, GLint srcZ, GLuint dstName, GLenum dstTarget, GLint dstLevel, GLint dstX, GLint dstY, GLint dstZ, GLsizei srcWidth, GLsizei srcHeight, GLsizei srcDepth);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL void GL_APIENTRY glCopyImageSubDataEXT (GLuint srcName, GLenum srcTarget, GLint srcLevel, GLint srcX, GLint srcY, GLint srcZ, GLuint dstName, GLenum dstTarget, GLint dstLevel, GLint dstX, GLint dstY, GLint dstZ, GLsizei srcWidth, GLsizei srcHeight, GLsizei srcDepth);
#endif
#endif /* GL_EXT_copy_image */
#ifndef GL_EXT_debug_label
#define GL_EXT_debug_label 1
#define GL_PROGRAM_PIPELINE_OBJECT_EXT 0x8A4F
@@ -829,6 +915,30 @@ GL_APICALL void GL_APIENTRY glDrawBuffersEXT (GLsizei n, const GLenum *bufs);
#endif
#endif /* GL_EXT_draw_buffers */
#ifndef GL_EXT_draw_buffers_indexed
#define GL_EXT_draw_buffers_indexed 1
#define GL_MIN 0x8007
#define GL_MAX 0x8008
typedef void (GL_APIENTRYP PFNGLENABLEIEXTPROC) (GLenum target, GLuint index);
typedef void (GL_APIENTRYP PFNGLDISABLEIEXTPROC) (GLenum target, GLuint index);
typedef void (GL_APIENTRYP PFNGLBLENDEQUATIONIEXTPROC) (GLuint buf, GLenum mode);
typedef void (GL_APIENTRYP PFNGLBLENDEQUATIONSEPARATEIEXTPROC) (GLuint buf, GLenum modeRGB, GLenum modeAlpha);
typedef void (GL_APIENTRYP PFNGLBLENDFUNCIEXTPROC) (GLuint buf, GLenum src, GLenum dst);
typedef void (GL_APIENTRYP PFNGLBLENDFUNCSEPARATEIEXTPROC) (GLuint buf, GLenum srcRGB, GLenum dstRGB, GLenum srcAlpha, GLenum dstAlpha);
typedef void (GL_APIENTRYP PFNGLCOLORMASKIEXTPROC) (GLuint index, GLboolean r, GLboolean g, GLboolean b, GLboolean a);
typedef GLboolean (GL_APIENTRYP PFNGLISENABLEDIEXTPROC) (GLenum target, GLuint index);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL void GL_APIENTRY glEnableiEXT (GLenum target, GLuint index);
GL_APICALL void GL_APIENTRY glDisableiEXT (GLenum target, GLuint index);
GL_APICALL void GL_APIENTRY glBlendEquationiEXT (GLuint buf, GLenum mode);
GL_APICALL void GL_APIENTRY glBlendEquationSeparateiEXT (GLuint buf, GLenum modeRGB, GLenum modeAlpha);
GL_APICALL void GL_APIENTRY glBlendFunciEXT (GLuint buf, GLenum src, GLenum dst);
GL_APICALL void GL_APIENTRY glBlendFuncSeparateiEXT (GLuint buf, GLenum srcRGB, GLenum dstRGB, GLenum srcAlpha, GLenum dstAlpha);
GL_APICALL void GL_APIENTRY glColorMaskiEXT (GLuint index, GLboolean r, GLboolean g, GLboolean b, GLboolean a);
GL_APICALL GLboolean GL_APIENTRY glIsEnablediEXT (GLenum target, GLuint index);
#endif
#endif /* GL_EXT_draw_buffers_indexed */
#ifndef GL_EXT_draw_instanced
#define GL_EXT_draw_instanced 1
typedef void (GL_APIENTRYP PFNGLDRAWARRAYSINSTANCEDEXTPROC) (GLenum mode, GLint start, GLsizei count, GLsizei primcount);
@@ -839,6 +949,55 @@ GL_APICALL void GL_APIENTRY glDrawElementsInstancedEXT (GLenum mode, GLsizei cou
#endif
#endif /* GL_EXT_draw_instanced */
#ifndef GL_EXT_geometry_point_size
#define GL_EXT_geometry_point_size 1
#endif /* GL_EXT_geometry_point_size */
#ifndef GL_EXT_geometry_shader
#define GL_EXT_geometry_shader 1
#define GL_GEOMETRY_SHADER_EXT 0x8DD9
#define GL_GEOMETRY_SHADER_BIT_EXT 0x00000004
#define GL_GEOMETRY_LINKED_VERTICES_OUT_EXT 0x8916
#define GL_GEOMETRY_LINKED_INPUT_TYPE_EXT 0x8917
#define GL_GEOMETRY_LINKED_OUTPUT_TYPE_EXT 0x8918
#define GL_GEOMETRY_SHADER_INVOCATIONS_EXT 0x887F
#define GL_LAYER_PROVOKING_VERTEX_EXT 0x825E
#define GL_LINES_ADJACENCY_EXT 0x000A
#define GL_LINE_STRIP_ADJACENCY_EXT 0x000B
#define GL_TRIANGLES_ADJACENCY_EXT 0x000C
#define GL_TRIANGLE_STRIP_ADJACENCY_EXT 0x000D
#define GL_MAX_GEOMETRY_UNIFORM_COMPONENTS_EXT 0x8DDF
#define GL_MAX_GEOMETRY_UNIFORM_BLOCKS_EXT 0x8A2C
#define GL_MAX_COMBINED_GEOMETRY_UNIFORM_COMPONENTS_EXT 0x8A32
#define GL_MAX_GEOMETRY_INPUT_COMPONENTS_EXT 0x9123
#define GL_MAX_GEOMETRY_OUTPUT_COMPONENTS_EXT 0x9124
#define GL_MAX_GEOMETRY_OUTPUT_VERTICES_EXT 0x8DE0
#define GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS_EXT 0x8DE1
#define GL_MAX_GEOMETRY_SHADER_INVOCATIONS_EXT 0x8E5A
#define GL_MAX_GEOMETRY_TEXTURE_IMAGE_UNITS_EXT 0x8C29
#define GL_MAX_GEOMETRY_ATOMIC_COUNTER_BUFFERS_EXT 0x92CF
#define GL_MAX_GEOMETRY_ATOMIC_COUNTERS_EXT 0x92D5
#define GL_MAX_GEOMETRY_IMAGE_UNIFORMS_EXT 0x90CD
#define GL_MAX_GEOMETRY_SHADER_STORAGE_BLOCKS_EXT 0x90D7
#define GL_FIRST_VERTEX_CONVENTION_EXT 0x8E4D
#define GL_LAST_VERTEX_CONVENTION_EXT 0x8E4E
#define GL_UNDEFINED_VERTEX_EXT 0x8260
#define GL_PRIMITIVES_GENERATED_EXT 0x8C87
#define GL_FRAMEBUFFER_DEFAULT_LAYERS_EXT 0x9312
#define GL_MAX_FRAMEBUFFER_LAYERS_EXT 0x9317
#define GL_FRAMEBUFFER_INCOMPLETE_LAYER_TARGETS_EXT 0x8DA8
#define GL_FRAMEBUFFER_ATTACHMENT_LAYERED_EXT 0x8DA7
#define GL_REFERENCED_BY_GEOMETRY_SHADER_EXT 0x9309
typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERTEXTUREEXTPROC) (GLenum target, GLenum attachment, GLuint texture, GLint level);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL void GL_APIENTRY glFramebufferTextureEXT (GLenum target, GLenum attachment, GLuint texture, GLint level);
#endif
#endif /* GL_EXT_geometry_shader */
#ifndef GL_EXT_gpu_shader5
#define GL_EXT_gpu_shader5 1
#endif /* GL_EXT_gpu_shader5 */
#ifndef GL_EXT_instanced_arrays
#define GL_EXT_instanced_arrays 1
#define GL_VERTEX_ATTRIB_ARRAY_DIVISOR_EXT 0x88FE
@@ -911,12 +1070,23 @@ GL_APICALL void GL_APIENTRY glGetIntegeri_vEXT (GLenum target, GLuint index, GLi
#define GL_ANY_SAMPLES_PASSED_CONSERVATIVE_EXT 0x8D6A
#endif /* GL_EXT_occlusion_query_boolean */
#ifndef GL_EXT_primitive_bounding_box
#define GL_EXT_primitive_bounding_box 1
#define GL_PRIMITIVE_BOUNDING_BOX_EXT 0x92BE
typedef void (GL_APIENTRYP PFNGLPRIMITIVEBOUNDINGBOXEXTPROC) (GLfloat minX, GLfloat minY, GLfloat minZ, GLfloat minW, GLfloat maxX, GLfloat maxY, GLfloat maxZ, GLfloat maxW);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL void GL_APIENTRY glPrimitiveBoundingBoxEXT (GLfloat minX, GLfloat minY, GLfloat minZ, GLfloat minW, GLfloat maxX, GLfloat maxY, GLfloat maxZ, GLfloat maxW);
#endif
#endif /* GL_EXT_primitive_bounding_box */
#ifndef GL_EXT_pvrtc_sRGB
#define GL_EXT_pvrtc_sRGB 1
#define GL_COMPRESSED_SRGB_PVRTC_2BPPV1_EXT 0x8A54
#define GL_COMPRESSED_SRGB_PVRTC_4BPPV1_EXT 0x8A55
#define GL_COMPRESSED_SRGB_ALPHA_PVRTC_2BPPV1_EXT 0x8A56
#define GL_COMPRESSED_SRGB_ALPHA_PVRTC_4BPPV1_EXT 0x8A57
#define GL_COMPRESSED_SRGB_ALPHA_PVRTC_2BPPV2_IMG 0x93F0
#define GL_COMPRESSED_SRGB_ALPHA_PVRTC_4BPPV2_IMG 0x93F1
#endif /* GL_EXT_pvrtc_sRGB */
#ifndef GL_EXT_read_format_bgra
@@ -1064,10 +1234,18 @@ GL_APICALL void GL_APIENTRY glProgramUniformMatrix4x3fvEXT (GLuint program, GLin
#define GL_FRAGMENT_SHADER_DISCARDS_SAMPLES_EXT 0x8A52
#endif /* GL_EXT_shader_framebuffer_fetch */
#ifndef GL_EXT_shader_implicit_conversions
#define GL_EXT_shader_implicit_conversions 1
#endif /* GL_EXT_shader_implicit_conversions */
#ifndef GL_EXT_shader_integer_mix
#define GL_EXT_shader_integer_mix 1
#endif /* GL_EXT_shader_integer_mix */
#ifndef GL_EXT_shader_io_blocks
#define GL_EXT_shader_io_blocks 1
#endif /* GL_EXT_shader_io_blocks */
#ifndef GL_EXT_shader_pixel_local_storage
#define GL_EXT_shader_pixel_local_storage 1
#define GL_MAX_SHADER_PIXEL_LOCAL_STORAGE_FAST_SIZE_EXT 0x8F63
@@ -1087,6 +1265,109 @@ GL_APICALL void GL_APIENTRY glProgramUniformMatrix4x3fvEXT (GLuint program, GLin
#define GL_SAMPLER_2D_SHADOW_EXT 0x8B62
#endif /* GL_EXT_shadow_samplers */
#ifndef GL_EXT_tessellation_point_size
#define GL_EXT_tessellation_point_size 1
#endif /* GL_EXT_tessellation_point_size */
#ifndef GL_EXT_tessellation_shader
#define GL_EXT_tessellation_shader 1
#define GL_PATCHES_EXT 0x000E
#define GL_PATCH_VERTICES_EXT 0x8E72
#define GL_TESS_CONTROL_OUTPUT_VERTICES_EXT 0x8E75
#define GL_TESS_GEN_MODE_EXT 0x8E76
#define GL_TESS_GEN_SPACING_EXT 0x8E77
#define GL_TESS_GEN_VERTEX_ORDER_EXT 0x8E78
#define GL_TESS_GEN_POINT_MODE_EXT 0x8E79
#define GL_ISOLINES_EXT 0x8E7A
#define GL_QUADS_EXT 0x0007
#define GL_FRACTIONAL_ODD_EXT 0x8E7B
#define GL_FRACTIONAL_EVEN_EXT 0x8E7C
#define GL_MAX_PATCH_VERTICES_EXT 0x8E7D
#define GL_MAX_TESS_GEN_LEVEL_EXT 0x8E7E
#define GL_MAX_TESS_CONTROL_UNIFORM_COMPONENTS_EXT 0x8E7F
#define GL_MAX_TESS_EVALUATION_UNIFORM_COMPONENTS_EXT 0x8E80
#define GL_MAX_TESS_CONTROL_TEXTURE_IMAGE_UNITS_EXT 0x8E81
#define GL_MAX_TESS_EVALUATION_TEXTURE_IMAGE_UNITS_EXT 0x8E82
#define GL_MAX_TESS_CONTROL_OUTPUT_COMPONENTS_EXT 0x8E83
#define GL_MAX_TESS_PATCH_COMPONENTS_EXT 0x8E84
#define GL_MAX_TESS_CONTROL_TOTAL_OUTPUT_COMPONENTS_EXT 0x8E85
#define GL_MAX_TESS_EVALUATION_OUTPUT_COMPONENTS_EXT 0x8E86
#define GL_MAX_TESS_CONTROL_UNIFORM_BLOCKS_EXT 0x8E89
#define GL_MAX_TESS_EVALUATION_UNIFORM_BLOCKS_EXT 0x8E8A
#define GL_MAX_TESS_CONTROL_INPUT_COMPONENTS_EXT 0x886C
#define GL_MAX_TESS_EVALUATION_INPUT_COMPONENTS_EXT 0x886D
#define GL_MAX_COMBINED_TESS_CONTROL_UNIFORM_COMPONENTS_EXT 0x8E1E
#define GL_MAX_COMBINED_TESS_EVALUATION_UNIFORM_COMPONENTS_EXT 0x8E1F
#define GL_MAX_TESS_CONTROL_ATOMIC_COUNTER_BUFFERS_EXT 0x92CD
#define GL_MAX_TESS_EVALUATION_ATOMIC_COUNTER_BUFFERS_EXT 0x92CE
#define GL_MAX_TESS_CONTROL_ATOMIC_COUNTERS_EXT 0x92D3
#define GL_MAX_TESS_EVALUATION_ATOMIC_COUNTERS_EXT 0x92D4
#define GL_MAX_TESS_CONTROL_IMAGE_UNIFORMS_EXT 0x90CB
#define GL_MAX_TESS_EVALUATION_IMAGE_UNIFORMS_EXT 0x90CC
#define GL_MAX_TESS_CONTROL_SHADER_STORAGE_BLOCKS_EXT 0x90D8
#define GL_MAX_TESS_EVALUATION_SHADER_STORAGE_BLOCKS_EXT 0x90D9
#define GL_PRIMITIVE_RESTART_FOR_PATCHES_SUPPORTED 0x8221
#define GL_IS_PER_PATCH_EXT 0x92E7
#define GL_REFERENCED_BY_TESS_CONTROL_SHADER_EXT 0x9307
#define GL_REFERENCED_BY_TESS_EVALUATION_SHADER_EXT 0x9308
#define GL_TESS_CONTROL_SHADER_EXT 0x8E88
#define GL_TESS_EVALUATION_SHADER_EXT 0x8E87
#define GL_TESS_CONTROL_SHADER_BIT_EXT 0x00000008
#define GL_TESS_EVALUATION_SHADER_BIT_EXT 0x00000010
typedef void (GL_APIENTRYP PFNGLPATCHPARAMETERIEXTPROC) (GLenum pname, GLint value);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL void GL_APIENTRY glPatchParameteriEXT (GLenum pname, GLint value);
#endif
#endif /* GL_EXT_tessellation_shader */
#ifndef GL_EXT_texture_border_clamp
#define GL_EXT_texture_border_clamp 1
#define GL_TEXTURE_BORDER_COLOR_EXT 0x1004
#define GL_CLAMP_TO_BORDER_EXT 0x812D
typedef void (GL_APIENTRYP PFNGLTEXPARAMETERIIVEXTPROC) (GLenum target, GLenum pname, const GLint *params);
typedef void (GL_APIENTRYP PFNGLTEXPARAMETERIUIVEXTPROC) (GLenum target, GLenum pname, const GLuint *params);
typedef void (GL_APIENTRYP PFNGLGETTEXPARAMETERIIVEXTPROC) (GLenum target, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETTEXPARAMETERIUIVEXTPROC) (GLenum target, GLenum pname, GLuint *params);
typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERIIVEXTPROC) (GLuint sampler, GLenum pname, const GLint *param);
typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERIUIVEXTPROC) (GLuint sampler, GLenum pname, const GLuint *param);
typedef void (GL_APIENTRYP PFNGLGETSAMPLERPARAMETERIIVEXTPROC) (GLuint sampler, GLenum pname, GLint *params);
typedef void (GL_APIENTRYP PFNGLGETSAMPLERPARAMETERIUIVEXTPROC) (GLuint sampler, GLenum pname, GLuint *params);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL void GL_APIENTRY glTexParameterIivEXT (GLenum target, GLenum pname, const GLint *params);
GL_APICALL void GL_APIENTRY glTexParameterIuivEXT (GLenum target, GLenum pname, const GLuint *params);
GL_APICALL void GL_APIENTRY glGetTexParameterIivEXT (GLenum target, GLenum pname, GLint *params);
GL_APICALL void GL_APIENTRY glGetTexParameterIuivEXT (GLenum target, GLenum pname, GLuint *params);
GL_APICALL void GL_APIENTRY glSamplerParameterIivEXT (GLuint sampler, GLenum pname, const GLint *param);
GL_APICALL void GL_APIENTRY glSamplerParameterIuivEXT (GLuint sampler, GLenum pname, const GLuint *param);
GL_APICALL void GL_APIENTRY glGetSamplerParameterIivEXT (GLuint sampler, GLenum pname, GLint *params);
GL_APICALL void GL_APIENTRY glGetSamplerParameterIuivEXT (GLuint sampler, GLenum pname, GLuint *params);
#endif
#endif /* GL_EXT_texture_border_clamp */
#ifndef GL_EXT_texture_buffer
#define GL_EXT_texture_buffer 1
#define GL_TEXTURE_BUFFER_EXT 0x8C2A
#define GL_TEXTURE_BUFFER_BINDING_EXT 0x8C2A
#define GL_MAX_TEXTURE_BUFFER_SIZE_EXT 0x8C2B
#define GL_TEXTURE_BINDING_BUFFER_EXT 0x8C2C
#define GL_TEXTURE_BUFFER_DATA_STORE_BINDING_EXT 0x8C2D
#define GL_TEXTURE_BUFFER_OFFSET_ALIGNMENT_EXT 0x919F
#define GL_SAMPLER_BUFFER_EXT 0x8DC2
#define GL_INT_SAMPLER_BUFFER_EXT 0x8DD0
#define GL_UNSIGNED_INT_SAMPLER_BUFFER_EXT 0x8DD8
#define GL_IMAGE_BUFFER_EXT 0x9051
#define GL_INT_IMAGE_BUFFER_EXT 0x905C
#define GL_UNSIGNED_INT_IMAGE_BUFFER_EXT 0x9067
#define GL_TEXTURE_BUFFER_OFFSET_EXT 0x919D
#define GL_TEXTURE_BUFFER_SIZE_EXT 0x919E
typedef void (GL_APIENTRYP PFNGLTEXBUFFEREXTPROC) (GLenum target, GLenum internalformat, GLuint buffer);
typedef void (GL_APIENTRYP PFNGLTEXBUFFERRANGEEXTPROC) (GLenum target, GLenum internalformat, GLuint buffer, GLintptr offset, GLsizeiptr size);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL void GL_APIENTRY glTexBufferEXT (GLenum target, GLenum internalformat, GLuint buffer);
GL_APICALL void GL_APIENTRY glTexBufferRangeEXT (GLenum target, GLenum internalformat, GLuint buffer, GLintptr offset, GLsizeiptr size);
#endif
#endif /* GL_EXT_texture_buffer */
#ifndef GL_EXT_texture_compression_dxt1
#define GL_EXT_texture_compression_dxt1 1
#define GL_COMPRESSED_RGB_S3TC_DXT1_EXT 0x83F0
@@ -1099,6 +1380,19 @@ GL_APICALL void GL_APIENTRY glProgramUniformMatrix4x3fvEXT (GLuint program, GLin
#define GL_COMPRESSED_RGBA_S3TC_DXT5_EXT 0x83F3
#endif /* GL_EXT_texture_compression_s3tc */
#ifndef GL_EXT_texture_cube_map_array
#define GL_EXT_texture_cube_map_array 1
#define GL_TEXTURE_CUBE_MAP_ARRAY_EXT 0x9009
#define GL_TEXTURE_BINDING_CUBE_MAP_ARRAY_EXT 0x900A
#define GL_SAMPLER_CUBE_MAP_ARRAY_EXT 0x900C
#define GL_SAMPLER_CUBE_MAP_ARRAY_SHADOW_EXT 0x900D
#define GL_INT_SAMPLER_CUBE_MAP_ARRAY_EXT 0x900E
#define GL_UNSIGNED_INT_SAMPLER_CUBE_MAP_ARRAY_EXT 0x900F
#define GL_IMAGE_CUBE_MAP_ARRAY_EXT 0x9054
#define GL_INT_IMAGE_CUBE_MAP_ARRAY_EXT 0x905F
#define GL_UNSIGNED_INT_IMAGE_CUBE_MAP_ARRAY_EXT 0x906A
#endif /* GL_EXT_texture_cube_map_array */
#ifndef GL_EXT_texture_filter_anisotropic
#define GL_EXT_texture_filter_anisotropic 1
#define GL_TEXTURE_MAX_ANISOTROPY_EXT 0x84FE
@@ -1161,6 +1455,19 @@ GL_APICALL void GL_APIENTRY glTextureStorage3DEXT (GLuint texture, GLenum target
#define GL_UNSIGNED_INT_2_10_10_10_REV_EXT 0x8368
#endif /* GL_EXT_texture_type_2_10_10_10_REV */
#ifndef GL_EXT_texture_view
#define GL_EXT_texture_view 1
#define GL_TEXTURE_VIEW_MIN_LEVEL_EXT 0x82DB
#define GL_TEXTURE_VIEW_NUM_LEVELS_EXT 0x82DC
#define GL_TEXTURE_VIEW_MIN_LAYER_EXT 0x82DD
#define GL_TEXTURE_VIEW_NUM_LAYERS_EXT 0x82DE
#define GL_TEXTURE_IMMUTABLE_LEVELS 0x82DF
typedef void (GL_APIENTRYP PFNGLTEXTUREVIEWEXTPROC) (GLuint texture, GLenum target, GLuint origtexture, GLenum internalformat, GLuint minlevel, GLuint numlevels, GLuint minlayer, GLuint numlayers);
#ifdef GL_GLEXT_PROTOTYPES
GL_APICALL void GL_APIENTRY glTextureViewEXT (GLuint texture, GLenum target, GLuint origtexture, GLenum internalformat, GLuint minlevel, GLuint numlevels, GLuint minlayer, GLuint numlayers);
#endif
#endif /* GL_EXT_texture_view */
#ifndef GL_EXT_unpack_subimage
#define GL_EXT_unpack_subimage 1
#define GL_UNPACK_ROW_LENGTH_EXT 0x0CF2
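
The `#ifndef GL_EXT_... / #define GL_EXT_... 1` guards in this header only say that the header knows about an extension at compile time. Whether the running driver exposes it is a separate, runtime question, usually answered by searching the space-separated string returned by `glGetString(GL_EXTENSIONS)`. A minimal sketch of a whole-token search follows; the helper name `has_gl_extension` is an assumption made for illustration, and the extension list is passed in as a plain string so the example needs no GL headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true when `name` appears as a complete,
 * space-delimited token in `ext_list`. A bare strstr() is not enough,
 * because "GL_EXT_geometry_shader" is a prefix of "GL_EXT_geometry_shader4". */
static bool has_gl_extension(const char *ext_list, const char *name)
{
    if (ext_list == NULL || name == NULL || *name == '\0')
        return false;

    size_t len = strlen(name);
    const char *p = ext_list;

    while ((p = strstr(p, name)) != NULL) {
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
        p += 1; /* keep scanning after a partial match */
    }
    return false;
}
```

A caller would combine both checks: wrap the code in `#ifdef GL_EXT_geometry_shader` so it compiles only against headers that define the tokens, and call the helper before using `glFramebufferTextureEXT`.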


@@ -7,4 +7,4 @@
// Projects needing GL/glu.h and GL/glut.h should now
// include these headers independently as glu and glut
-// are no longe core parts of mesa
+// are no longer core parts of mesa


@@ -0,0 +1,101 @@
/*
* Copyright 2011 Joakim Sindholt <opensource@zhasha.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE. */
#ifndef _D3DADAPTER9_H_
#define _D3DADAPTER9_H_
#include "present.h"
#ifndef __cplusplus
/* Representation of an adapter group, although since this is implemented by
* the driver, it knows nothing about the windowing system it's on */
typedef struct ID3DAdapter9Vtbl
{
/* IUnknown */
HRESULT (WINAPI *QueryInterface)(ID3DAdapter9 *This, REFIID riid, void **ppvObject);
ULONG (WINAPI *AddRef)(ID3DAdapter9 *This);
ULONG (WINAPI *Release)(ID3DAdapter9 *This);
/* ID3DAdapter9 */
HRESULT (WINAPI *GetAdapterIdentifier)(ID3DAdapter9 *This, DWORD Flags, D3DADAPTER_IDENTIFIER9 *pIdentifier);
HRESULT (WINAPI *CheckDeviceType)(ID3DAdapter9 *This, D3DDEVTYPE DevType, D3DFORMAT AdapterFormat, D3DFORMAT BackBufferFormat, BOOL bWindowed);
HRESULT (WINAPI *CheckDeviceFormat)(ID3DAdapter9 *This, D3DDEVTYPE DeviceType, D3DFORMAT AdapterFormat, DWORD Usage, D3DRESOURCETYPE RType, D3DFORMAT CheckFormat);
HRESULT (WINAPI *CheckDeviceMultiSampleType)(ID3DAdapter9 *This, D3DDEVTYPE DeviceType, D3DFORMAT SurfaceFormat, BOOL Windowed, D3DMULTISAMPLE_TYPE MultiSampleType, DWORD *pQualityLevels);
HRESULT (WINAPI *CheckDepthStencilMatch)(ID3DAdapter9 *This, D3DDEVTYPE DeviceType, D3DFORMAT AdapterFormat, D3DFORMAT RenderTargetFormat, D3DFORMAT DepthStencilFormat);
HRESULT (WINAPI *CheckDeviceFormatConversion)(ID3DAdapter9 *This, D3DDEVTYPE DeviceType, D3DFORMAT SourceFormat, D3DFORMAT TargetFormat);
HRESULT (WINAPI *GetDeviceCaps)(ID3DAdapter9 *This, D3DDEVTYPE DeviceType, D3DCAPS9 *pCaps);
HRESULT (WINAPI *CreateDevice)(ID3DAdapter9 *This, UINT RealAdapter, D3DDEVTYPE DeviceType, HWND hFocusWindow, DWORD BehaviorFlags, D3DPRESENT_PARAMETERS *pPresentationParameters, IDirect3D9 *pD3D9, ID3DPresentGroup *pPresentationFactory, IDirect3DDevice9 **ppReturnedDeviceInterface);
HRESULT (WINAPI *CreateDeviceEx)(ID3DAdapter9 *This, UINT RealAdapter, D3DDEVTYPE DeviceType, HWND hFocusWindow, DWORD BehaviorFlags, D3DPRESENT_PARAMETERS *pPresentationParameters, D3DDISPLAYMODEEX *pFullscreenDisplayMode, IDirect3D9Ex *pD3D9Ex, ID3DPresentGroup *pPresentationFactory, IDirect3DDevice9Ex **ppReturnedDeviceInterface);
} ID3DAdapter9Vtbl;
struct ID3DAdapter9
{
ID3DAdapter9Vtbl *lpVtbl;
};
/* IUnknown macros */
#define ID3DAdapter9_QueryInterface(p,a,b) (p)->lpVtbl->QueryInterface(p,a,b)
#define ID3DAdapter9_AddRef(p) (p)->lpVtbl->AddRef(p)
#define ID3DAdapter9_Release(p) (p)->lpVtbl->Release(p)
/* ID3DAdapter9 macros */
#define ID3DAdapter9_GetAdapterIdentifier(p,a,b) (p)->lpVtbl->GetAdapterIdentifier(p,a,b)
#define ID3DAdapter9_CheckDeviceType(p,a,b,c,d) (p)->lpVtbl->CheckDeviceType(p,a,b,c,d)
#define ID3DAdapter9_CheckDeviceFormat(p,a,b,c,d,e) (p)->lpVtbl->CheckDeviceFormat(p,a,b,c,d,e)
#define ID3DAdapter9_CheckDeviceMultiSampleType(p,a,b,c,d,e) (p)->lpVtbl->CheckDeviceMultiSampleType(p,a,b,c,d,e)
#define ID3DAdapter9_CheckDepthStencilMatch(p,a,b,c,d) (p)->lpVtbl->CheckDepthStencilMatch(p,a,b,c,d)
#define ID3DAdapter9_CheckDeviceFormatConversion(p,a,b,c) (p)->lpVtbl->CheckDeviceFormatConversion(p,a,b,c)
#define ID3DAdapter9_GetDeviceCaps(p,a,b) (p)->lpVtbl->GetDeviceCaps(p,a,b)
#define ID3DAdapter9_CreateDevice(p,a,b,c,d,e,f,g,h) (p)->lpVtbl->CreateDevice(p,a,b,c,d,e,f,g,h)
#define ID3DAdapter9_CreateDeviceEx(p,a,b,c,d,e,f,g,h,i) (p)->lpVtbl->CreateDeviceEx(p,a,b,c,d,e,f,g,h,i)
#else /* __cplusplus */
struct ID3DAdapter9 : public IUnknown
{
HRESULT WINAPI GetAdapterIdentifier(DWORD Flags, D3DADAPTER_IDENTIFIER9 *pIdentifier);
HRESULT WINAPI CheckDeviceType(D3DDEVTYPE DevType, D3DFORMAT AdapterFormat, D3DFORMAT BackBufferFormat, BOOL bWindowed);
HRESULT WINAPI CheckDeviceFormat(D3DDEVTYPE DeviceType, D3DFORMAT AdapterFormat, DWORD Usage, D3DRESOURCETYPE RType, D3DFORMAT CheckFormat);
HRESULT WINAPI CheckDeviceMultiSampleType(D3DDEVTYPE DeviceType, D3DFORMAT SurfaceFormat, BOOL Windowed, D3DMULTISAMPLE_TYPE MultiSampleType, DWORD *pQualityLevels);
HRESULT WINAPI CheckDepthStencilMatch(D3DDEVTYPE DeviceType, D3DFORMAT AdapterFormat, D3DFORMAT RenderTargetFormat, D3DFORMAT DepthStencilFormat);
HRESULT WINAPI CheckDeviceFormatConversion(D3DDEVTYPE DeviceType, D3DFORMAT SourceFormat, D3DFORMAT TargetFormat);
HRESULT WINAPI GetDeviceCaps(D3DDEVTYPE DeviceType, D3DCAPS9 *pCaps);
HRESULT WINAPI CreateDevice(UINT RealAdapter, D3DDEVTYPE DeviceType, HWND hFocusWindow, DWORD BehaviorFlags, D3DPRESENT_PARAMETERS *pPresentationParameters, IDirect3D9 *pD3D9, ID3DPresentGroup *pPresentationFactory, IDirect3DDevice9 **ppReturnedDeviceInterface);
HRESULT WINAPI CreateDeviceEx(UINT RealAdapter, D3DDEVTYPE DeviceType, HWND hFocusWindow, DWORD BehaviorFlags, D3DPRESENT_PARAMETERS *pPresentationParameters, D3DDISPLAYMODEEX *pFullscreenDisplayMode, IDirect3D9Ex *pD3D9Ex, ID3DPresentGroup *pPresentationFactory, IDirect3DDevice9Ex **ppReturnedDeviceInterface);
};
#endif /* __cplusplus */
#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */
/* acquire a const struct D3DAdapter9* structure describing the interface
* queried. See */
const void * WINAPI
D3DAdapter9GetProc( const char *name );
#ifdef __cplusplus
}
#endif /* __cplusplus */
#endif /* _D3DADAPTER9_H_ */
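
The C branch of this header follows the usual COM-in-C layout: an object is a struct whose first member points at a table of function pointers, and each `ID3DAdapter9_*` macro expands to `(p)->lpVtbl->Method(p, ...)`, handing the object back as `This`. A minimal self-contained sketch of that pattern follows. `Counter`, `CounterVtbl`, the `Counter_*` macros and the `counter_*` functions are hypothetical stand-ins invented for illustration, not part of the Mesa headers, and plain `unsigned long` replaces `ULONG` so that no Windows headers are needed.

```c
#include <stddef.h>

typedef struct Counter Counter;

/* Function-pointer table, analogous to ID3DAdapter9Vtbl. */
typedef struct CounterVtbl
{
    unsigned long (*AddRef)(Counter *This);
    unsigned long (*Release)(Counter *This);
} CounterVtbl;

/* The object: the vtable pointer comes first, as in struct ID3DAdapter9.
 * It is const here, which the header above does not require. */
struct Counter
{
    const CounterVtbl *lpVtbl;
    unsigned long refs;
};

/* Call macros in the same shape as ID3DAdapter9_AddRef / _Release. */
#define Counter_AddRef(p) (p)->lpVtbl->AddRef(p)
#define Counter_Release(p) (p)->lpVtbl->Release(p)

static unsigned long counter_add_ref(Counter *This)
{
    return ++This->refs;
}

static unsigned long counter_release(Counter *This)
{
    /* A real implementation would free the object when this reaches 0. */
    if (This->refs > 0)
        --This->refs;
    return This->refs;
}

static const CounterVtbl counter_vtbl = { counter_add_ref, counter_release };

/* Initialise a caller-owned Counter with one reference held. */
static void counter_init(Counter *c)
{
    c->lpVtbl = &counter_vtbl;
    c->refs = 1;
}
```

Because the macros are plain text substitution, a macro that names a member missing from the vtable struct compiles silently until the first call site uses it.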

include/d3dadapter/drm.h Normal file

@@ -0,0 +1,44 @@
/*
* Copyright 2011 Joakim Sindholt <opensource@zhasha.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE. */
#ifndef _D3DADAPTER9_DRM_H_
#define _D3DADAPTER9_DRM_H_
#include "d3dadapter9.h"
/* query driver support name */
#define D3DADAPTER9DRM_NAME "drm"
/* current version */
#define D3DADAPTER9DRM_MAJOR 0
#define D3DADAPTER9DRM_MINOR 0
struct D3DAdapter9DRM
{
unsigned major_version; /* ABI break */
unsigned minor_version; /* backwards compatible feature additions */
/* NOTE: upon passing an fd to this function, it's now owned by this
function. If this function fails, the fd will be closed here as well */
HRESULT (WINAPI *create_adapter)(int fd, ID3DAdapter9 **ppAdapter);
};
#endif /* _D3DADAPTER9_DRM_H_ */
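
The comments on `major_version` and `minor_version` suggest a compatibility rule for whoever loads the driver: a differing major version cannot be used, while a driver minor version at or above the one the loader needs is acceptable. The sketch below is one reading of those comments, not code taken from the patch. `struct drm_version` and `drm_abi_compatible` are hypothetical names, and the real lookup through `D3DAdapter9GetProc(D3DADAPTER9DRM_NAME)` is left out so the example compiles without `d3d9.h`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two version fields of struct D3DAdapter9DRM. */
struct drm_version
{
    unsigned major_version; /* ABI break */
    unsigned minor_version; /* backwards compatible feature additions */
};

/* Returns true when a driver reporting `drv` can serve a loader that was
 * built against want_major.want_minor. */
static bool drm_abi_compatible(const struct drm_version *drv,
                               unsigned want_major, unsigned want_minor)
{
    if (drv == NULL)
        return false;
    if (drv->major_version != want_major)
        return false; /* ABI break: never compatible */
    return drv->minor_version >= want_minor;
}
```

A loader would run a check of this kind before calling `create_adapter`, since the header states that the fd passed to that function is owned by it from then on, even on failure.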


@@ -0,0 +1,136 @@
/*
* Copyright 2011 Joakim Sindholt <opensource@zhasha.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* on the rights to use, copy, modify, merge, publish, distribute, sub
* license, and/or sell copies of the Software, and to permit persons to whom
* the Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
* THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
* USE OR OTHER DEALINGS IN THE SOFTWARE. */
#ifndef _D3DADAPTER_PRESENT_H_
#define _D3DADAPTER_PRESENT_H_
#include <d3d9.h>
#ifndef D3DOK_WINDOW_OCCLUDED
#define D3DOK_WINDOW_OCCLUDED MAKE_D3DSTATUS(2531)
#endif /* D3DOK_WINDOW_OCCLUDED */
#ifndef __cplusplus
typedef struct ID3DPresent ID3DPresent;
typedef struct ID3DPresentGroup ID3DPresentGroup;
typedef struct ID3DAdapter9 ID3DAdapter9;
typedef struct D3DWindowBuffer D3DWindowBuffer;
/* Presentation backend for drivers to display their brilliant work */
typedef struct ID3DPresentVtbl
{
/* IUnknown */
HRESULT (WINAPI *QueryInterface)(ID3DPresent *This, REFIID riid, void **ppvObject);
ULONG (WINAPI *AddRef)(ID3DPresent *This);
ULONG (WINAPI *Release)(ID3DPresent *This);
/* ID3DPresent */
/* This function initializes the screen and window provided at creation.
* It should therefore always be one of the first things a new swap chain
* calls */
HRESULT (WINAPI *SetPresentParameters)(ID3DPresent *This, D3DPRESENT_PARAMETERS *pPresentationParameters, D3DDISPLAYMODEEX *pFullscreenDisplayMode);
/* Make a buffer visible to the window system via dma-buf fd.
* For better compatibility, it must be 32bpp and format ARGB/XRGB */
HRESULT (WINAPI *NewD3DWindowBufferFromDmaBuf)(ID3DPresent *This, int dmaBufFd, int width, int height, int stride, int depth, int bpp, D3DWindowBuffer **out);
HRESULT (WINAPI *DestroyD3DWindowBuffer)(ID3DPresent *This, D3DWindowBuffer *buffer);
/* After presenting a buffer to the window system, the buffer
* may be used as is (no copy of the content) by the window system.
* You must not use a non-released buffer, else the user may see undefined content. */
HRESULT (WINAPI *WaitBufferReleased)(ID3DPresent *This, D3DWindowBuffer *buffer);
HRESULT (WINAPI *FrontBufferCopy)(ID3DPresent *This, D3DWindowBuffer *buffer);
/* It is possible to do partial copy, but impossible to do resizing, which must
* be done by the client after checking the front buffer size */
HRESULT (WINAPI *PresentBuffer)(ID3DPresent *This, D3DWindowBuffer *buffer, HWND hWndOverride, const RECT *pSourceRect, const RECT *pDestRect, const RGNDATA *pDirtyRegion, DWORD Flags);
HRESULT (WINAPI *GetRasterStatus)(ID3DPresent *This, D3DRASTER_STATUS *pRasterStatus);
HRESULT (WINAPI *GetDisplayMode)(ID3DPresent *This, D3DDISPLAYMODEEX *pMode, D3DDISPLAYROTATION *pRotation);
HRESULT (WINAPI *GetPresentStats)(ID3DPresent *This, D3DPRESENTSTATS *pStats);
HRESULT (WINAPI *GetCursorPos)(ID3DPresent *This, POINT *pPoint);
HRESULT (WINAPI *SetCursorPos)(ID3DPresent *This, POINT *pPoint);
/* Cursor size is always 32x32. pBitmap and pHotspot can be NULL. */
HRESULT (WINAPI *SetCursor)(ID3DPresent *This, void *pBitmap, POINT *pHotspot, BOOL bShow);
HRESULT (WINAPI *SetGammaRamp)(ID3DPresent *This, const D3DGAMMARAMP *pRamp, HWND hWndOverride);
HRESULT (WINAPI *GetWindowInfo)(ID3DPresent *This, HWND hWnd, int *width, int *height, int *depth);
} ID3DPresentVtbl;
struct ID3DPresent
{
ID3DPresentVtbl *lpVtbl;
};
/* IUnknown macros */
#define ID3DPresent_QueryInterface(p,a,b) (p)->lpVtbl->QueryInterface(p,a,b)
#define ID3DPresent_AddRef(p) (p)->lpVtbl->AddRef(p)
#define ID3DPresent_Release(p) (p)->lpVtbl->Release(p)
/* ID3DPresent macros */
#define ID3DPresent_GetPresentParameters(p,a) (p)->lpVtbl->GetPresentParameters(p,a)
#define ID3DPresent_SetPresentParameters(p,a,b) (p)->lpVtbl->SetPresentParameters(p,a,b)
#define ID3DPresent_NewD3DWindowBufferFromDmaBuf(p,a,b,c,d,e,f,g) (p)->lpVtbl->NewD3DWindowBufferFromDmaBuf(p,a,b,c,d,e,f,g)
#define ID3DPresent_DestroyD3DWindowBuffer(p,a) (p)->lpVtbl->DestroyD3DWindowBuffer(p,a)
#define ID3DPresent_WaitBufferReleased(p,a) (p)->lpVtbl->WaitBufferReleased(p,a)
#define ID3DPresent_FrontBufferCopy(p,a) (p)->lpVtbl->FrontBufferCopy(p,a)
#define ID3DPresent_PresentBuffer(p,a,b,c,d,e,f) (p)->lpVtbl->PresentBuffer(p,a,b,c,d,e,f)
#define ID3DPresent_GetRasterStatus(p,a) (p)->lpVtbl->GetRasterStatus(p,a)
#define ID3DPresent_GetDisplayMode(p,a,b) (p)->lpVtbl->GetDisplayMode(p,a,b)
#define ID3DPresent_GetPresentStats(p,a) (p)->lpVtbl->GetPresentStats(p,a)
#define ID3DPresent_GetCursorPos(p,a) (p)->lpVtbl->GetCursorPos(p,a)
#define ID3DPresent_SetCursorPos(p,a) (p)->lpVtbl->SetCursorPos(p,a)
#define ID3DPresent_SetCursor(p,a,b,c) (p)->lpVtbl->SetCursor(p,a,b,c)
#define ID3DPresent_SetGammaRamp(p,a,b) (p)->lpVtbl->SetGammaRamp(p,a,b)
#define ID3DPresent_GetWindowInfo(p,a,b,c,d) (p)->lpVtbl->GetWindowInfo(p,a,b,c,d)
typedef struct ID3DPresentGroupVtbl
{
/* IUnknown */
HRESULT (WINAPI *QueryInterface)(ID3DPresentGroup *This, REFIID riid, void **ppvObject);
ULONG (WINAPI *AddRef)(ID3DPresentGroup *This);
ULONG (WINAPI *Release)(ID3DPresentGroup *This);
/* ID3DPresentGroup */
/* When creating a device, it's relevant for the driver to know how many
* implicit swap chains to create. It has to create one per monitor in a
* multi-monitor setup */
UINT (WINAPI *GetMultiheadCount)(ID3DPresentGroup *This);
/* returns only the implicit present interfaces */
HRESULT (WINAPI *GetPresent)(ID3DPresentGroup *This, UINT Index, ID3DPresent **ppPresent);
/* used to create additional presentation interfaces along the way */
HRESULT (WINAPI *CreateAdditionalPresent)(ID3DPresentGroup *This, D3DPRESENT_PARAMETERS *pPresentationParameters, ID3DPresent **ppPresent);
void (WINAPI *GetVersion) (ID3DPresentGroup *This, int *major, int *minor);
} ID3DPresentGroupVtbl;
struct ID3DPresentGroup
{
ID3DPresentGroupVtbl *lpVtbl;
};
/* IUnknown macros */
#define ID3DPresentGroup_QueryInterface(p,a,b) (p)->lpVtbl->QueryInterface(p,a,b)
#define ID3DPresentGroup_AddRef(p) (p)->lpVtbl->AddRef(p)
#define ID3DPresentGroup_Release(p) (p)->lpVtbl->Release(p)
/* ID3DPresentGroup */
#define ID3DPresentGroup_GetMultiheadCount(p) (p)->lpVtbl->GetMultiheadCount(p)
#define ID3DPresentGroup_GetPresent(p,a,b) (p)->lpVtbl->GetPresent(p,a,b)
#define ID3DPresentGroup_CreateAdditionalPresent(p,a,b) (p)->lpVtbl->CreateAdditionalPresent(p,a,b)
#define ID3DPresentGroup_GetVersion(p,a,b) (p)->lpVtbl->GetVersion(p,a,b)
#endif /* __cplusplus */
#endif /* _D3DADAPTER_PRESENT_H_ */


@@ -38,6 +38,7 @@ CHIPSET(0x6828, VERDE_6828, VERDE)
CHIPSET(0x6829, VERDE_6829, VERDE)
CHIPSET(0x682A, VERDE_682A, VERDE)
CHIPSET(0x682B, VERDE_682B, VERDE)
CHIPSET(0x682C, VERDE_682C, VERDE)
CHIPSET(0x682D, VERDE_682D, VERDE)
CHIPSET(0x682F, VERDE_682F, VERDE)
CHIPSET(0x6830, VERDE_6830, VERDE)
@@ -54,8 +55,11 @@ CHIPSET(0x6600, OLAND_6600, OLAND)
CHIPSET(0x6601, OLAND_6601, OLAND)
CHIPSET(0x6602, OLAND_6602, OLAND)
CHIPSET(0x6603, OLAND_6603, OLAND)
CHIPSET(0x6604, OLAND_6604, OLAND)
CHIPSET(0x6605, OLAND_6605, OLAND)
CHIPSET(0x6606, OLAND_6606, OLAND)
CHIPSET(0x6607, OLAND_6607, OLAND)
CHIPSET(0x6608, OLAND_6608, OLAND)
CHIPSET(0x6610, OLAND_6610, OLAND)
CHIPSET(0x6611, OLAND_6611, OLAND)
CHIPSET(0x6613, OLAND_6613, OLAND)
@@ -73,6 +77,8 @@ CHIPSET(0x666F, HAINAN_666F, HAINAN)
CHIPSET(0x6640, BONAIRE_6640, BONAIRE)
CHIPSET(0x6641, BONAIRE_6641, BONAIRE)
CHIPSET(0x6646, BONAIRE_6646, BONAIRE)
CHIPSET(0x6647, BONAIRE_6647, BONAIRE)
CHIPSET(0x6649, BONAIRE_6649, BONAIRE)
CHIPSET(0x6650, BONAIRE_6650, BONAIRE)
CHIPSET(0x6651, BONAIRE_6651, BONAIRE)
@@ -132,6 +138,7 @@ CHIPSET(0x1313, KAVERI_1313, KAVERI)
CHIPSET(0x1315, KAVERI_1315, KAVERI)
CHIPSET(0x1316, KAVERI_1316, KAVERI)
CHIPSET(0x1317, KAVERI_1317, KAVERI)
CHIPSET(0x1318, KAVERI_1318, KAVERI)
CHIPSET(0x131B, KAVERI_131B, KAVERI)
CHIPSET(0x131C, KAVERI_131C, KAVERI)
CHIPSET(0x131D, KAVERI_131D, KAVERI)


@@ -3,9 +3,9 @@
if BUILD_SHARED
if HAVE_COMPAT_SYMLINKS
all-local : .libs/install-mesa-links
all-local : .install-mesa-links
.libs/install-mesa-links : $(lib_LTLIBRARIES)
.install-mesa-links : $(lib_LTLIBRARIES)
$(AM_V_GEN)$(MKDIR_P) $(top_builddir)/$(LIB_DIR); \
for f in $(join $(addsuffix .libs/,$(dir $(lib_LTLIBRARIES))),$(notdir $(lib_LTLIBRARIES:%.la=%.$(LIB_EXT)*))); do \
if test -h .libs/$$f; then \
@@ -14,5 +14,9 @@ all-local : .libs/install-mesa-links
ln -f $$f $(top_builddir)/$(LIB_DIR); \
fi; \
done && touch $@
clean-local:
$(RM) .install-mesa-links
endif
endif

m4/ax_check_gnu_make.m4 (new file, 78 lines)

@@ -0,0 +1,78 @@
# ===========================================================================
# http://www.gnu.org/software/autoconf-archive/ax_check_gnu_make.html
# ===========================================================================
#
# SYNOPSIS
#
# AX_CHECK_GNU_MAKE()
#
# DESCRIPTION
#
# This macro searches for a GNU version of make. If a match is found, the
# makefile variable `ifGNUmake' is set to the empty string, otherwise it
# is set to "#". This is useful for including special features in a
# Makefile, which cannot be handled by other versions of make. The
# variable _cv_gnu_make_command is set to the command to invoke GNU make
# if it exists, the empty string otherwise.
#
# Here is an example of its use:
#
# Makefile.in might contain:
#
# # A failsafe way of putting a dependency rule into a makefile
# $(DEPEND):
# $(CC) -MM $(srcdir)/*.c > $(DEPEND)
#
# @ifGNUmake@ ifeq ($(DEPEND),$(wildcard $(DEPEND)))
# @ifGNUmake@ include $(DEPEND)
# @ifGNUmake@ endif
#
# Then configure.in would normally contain:
#
# AX_CHECK_GNU_MAKE()
# AC_OUTPUT(Makefile)
#
# Then perhaps to cause gnu make to override any other make, we could do
# something like this (note that GNU make always looks for GNUmakefile
# first):
#
# if ! test x$_cv_gnu_make_command = x ; then
# mv Makefile GNUmakefile
# echo .DEFAULT: > Makefile ;
# echo \ $_cv_gnu_make_command \$@ >> Makefile;
# fi
#
# Then, if any (well almost any) other make is called, and GNU make also
# exists, then the other make wraps the GNU make.
#
# LICENSE
#
# Copyright (c) 2008 John Darrington <j.darrington@elvis.murdoch.edu.au>
#
# Copying and distribution of this file, with or without modification, are
# permitted in any medium without royalty provided the copyright notice
# and this notice are preserved. This file is offered as-is, without any
# warranty.
#serial 7
AC_DEFUN([AX_CHECK_GNU_MAKE], [ AC_CACHE_CHECK( for GNU make,_cv_gnu_make_command,
_cv_gnu_make_command='' ;
dnl Search all the common names for GNU make
for a in "$MAKE" make gmake gnumake ; do
if test -z "$a" ; then continue ; fi ;
if ( sh -c "$a --version" 2> /dev/null | grep GNU 2>&1 > /dev/null ) ; then
_cv_gnu_make_command=$a ;
break;
fi
done ;
) ;
dnl If there was a GNU version, then set @ifGNUmake@ to the empty string, '#' otherwise
if test "x$_cv_gnu_make_command" != "x" ; then
ifGNUmake='' ;
else
ifGNUmake='#' ;
AC_MSG_RESULT("Not found");
fi
AC_SUBST(ifGNUmake)
] )

m4/ax_gcc_func_attribute.m4 (new file, 223 lines)

@@ -0,0 +1,223 @@
# ===========================================================================
# http://www.gnu.org/software/autoconf-archive/ax_gcc_func_attribute.html
# ===========================================================================
#
# SYNOPSIS
#
# AX_GCC_FUNC_ATTRIBUTE(ATTRIBUTE)
#
# DESCRIPTION
#
# This macro checks if the compiler supports one of GCC's function
# attributes; many other compilers also provide function attributes with
# the same syntax. Compiler warnings are used to detect supported
# attributes as unsupported ones are ignored by default so quieting
# warnings when using this macro will yield false positives.
#
# The ATTRIBUTE parameter holds the name of the attribute to be checked.
#
# If ATTRIBUTE is supported define HAVE_FUNC_ATTRIBUTE_<ATTRIBUTE>.
#
# The macro caches its result in the ax_cv_have_func_attribute_<attribute>
# variable.
#
# The macro currently supports the following function attributes:
#
# alias
# aligned
# alloc_size
# always_inline
# artificial
# cold
# const
# constructor
# deprecated
# destructor
# dllexport
# dllimport
# error
# externally_visible
# flatten
# format
# format_arg
# gnu_inline
# hot
# ifunc
# leaf
# malloc
# noclone
# noinline
# nonnull
# noreturn
# nothrow
# optimize
# packed
# pure
# unused
# used
# visibility
# warning
# warn_unused_result
# weak
# weakref
#
# Unsupported function attributes will be tested with a prototype returning
# an int and not accepting any arguments and the result of the check might
# be wrong or meaningless so use with care.
#
# LICENSE
#
# Copyright (c) 2013 Gabriele Svelto <gabriele.svelto@gmail.com>
#
# Copying and distribution of this file, with or without modification, are
# permitted in any medium without royalty provided the copyright notice
# and this notice are preserved. This file is offered as-is, without any
# warranty.
#serial 2
AC_DEFUN([AX_GCC_FUNC_ATTRIBUTE], [
AS_VAR_PUSHDEF([ac_var], [ax_cv_have_func_attribute_$1])
AC_CACHE_CHECK([for __attribute__(($1))], [ac_var], [
AC_LINK_IFELSE([AC_LANG_PROGRAM([
m4_case([$1],
[alias], [
int foo( void ) { return 0; }
int bar( void ) __attribute__(($1("foo")));
],
[aligned], [
int foo( void ) __attribute__(($1(32)));
],
[alloc_size], [
void *foo(int a) __attribute__(($1(1)));
],
[always_inline], [
inline __attribute__(($1)) int foo( void ) { return 0; }
],
[artificial], [
inline __attribute__(($1)) int foo( void ) { return 0; }
],
[cold], [
int foo( void ) __attribute__(($1));
],
[const], [
int foo( void ) __attribute__(($1));
],
[constructor], [
int foo( void ) __attribute__(($1));
],
[deprecated], [
int foo( void ) __attribute__(($1("")));
],
[destructor], [
int foo( void ) __attribute__(($1));
],
[dllexport], [
__attribute__(($1)) int foo( void ) { return 0; }
],
[dllimport], [
int foo( void ) __attribute__(($1));
],
[error], [
int foo( void ) __attribute__(($1("")));
],
[externally_visible], [
int foo( void ) __attribute__(($1));
],
[flatten], [
int foo( void ) __attribute__(($1));
],
[format], [
int foo(const char *p, ...) __attribute__(($1(printf, 1, 2)));
],
[format_arg], [
char *foo(const char *p) __attribute__(($1(1)));
],
[gnu_inline], [
inline __attribute__(($1)) int foo( void ) { return 0; }
],
[hot], [
int foo( void ) __attribute__(($1));
],
[ifunc], [
int my_foo( void ) { return 0; }
static int (*resolve_foo(void))(void) { return my_foo; }
int foo( void ) __attribute__(($1("resolve_foo")));
],
[leaf], [
__attribute__(($1)) int foo( void ) { return 0; }
],
[malloc], [
void *foo( void ) __attribute__(($1));
],
[noclone], [
int foo( void ) __attribute__(($1));
],
[noinline], [
__attribute__(($1)) int foo( void ) { return 0; }
],
[nonnull], [
int foo(char *p) __attribute__(($1(1)));
],
[noreturn], [
void foo( void ) __attribute__(($1));
],
[nothrow], [
int foo( void ) __attribute__(($1));
],
[optimize], [
__attribute__(($1(3))) int foo( void ) { return 0; }
],
[packed], [
struct __attribute__(($1)) foo { int bar; };
],
[pure], [
int foo( void ) __attribute__(($1));
],
[unused], [
int foo( void ) __attribute__(($1));
],
[used], [
int foo( void ) __attribute__(($1));
],
[visibility], [
int foo_def( void ) __attribute__(($1("default")));
int foo_hid( void ) __attribute__(($1("hidden")));
int foo_int( void ) __attribute__(($1("internal")));
int foo_pro( void ) __attribute__(($1("protected")));
],
[warning], [
int foo( void ) __attribute__(($1("")));
],
[warn_unused_result], [
int foo( void ) __attribute__(($1));
],
[weak], [
int foo( void ) __attribute__(($1));
],
[weakref], [
static int foo( void ) { return 0; }
static int bar( void ) __attribute__(($1("foo")));
],
[
m4_warn([syntax], [Unsupported attribute $1, the test may fail])
int foo( void ) __attribute__(($1));
]
)], [])
],
dnl GCC doesn't exit with an error if an unknown attribute is
dnl provided but only outputs a warning, so accept the attribute
dnl only if no warnings were issued.
[AS_IF([test -s conftest.err],
[AS_VAR_SET([ac_var], [no])],
[AS_VAR_SET([ac_var], [yes])])],
[AS_VAR_SET([ac_var], [no])])
])
AS_IF([test yes = AS_VAR_GET([ac_var])],
[AC_DEFINE_UNQUOTED(AS_TR_CPP(HAVE_FUNC_ATTRIBUTE_$1), 1,
[Define to 1 if the system has the `$1' function attribute])], [])
AS_VAR_POPDEF([ac_var])
])
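
Note on usage: the macro above defines HAVE_FUNC_ATTRIBUTE_<ATTRIBUTE> when the check passes, and C code can then guard the attribute behind that define. The following is a minimal sketch of that pattern, not code from this change set. `PRINTFLIKE` and `format_into` are hypothetical names chosen for illustration, and the sketch assumes the build system passes `HAVE_FUNC_ATTRIBUTE_FORMAT` to the compiler.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper macro: expands to the attribute only when the build
 * system defined HAVE_FUNC_ATTRIBUTE_FORMAT, and to nothing otherwise. */
#ifdef HAVE_FUNC_ATTRIBUTE_FORMAT
#define PRINTFLIKE(fmt_idx, va_idx) \
   __attribute__((format(printf, fmt_idx, va_idx)))
#else
#define PRINTFLIKE(fmt_idx, va_idx)
#endif

/* Hypothetical helper: argument 3 is the format string, varargs start at 4. */
int format_into(char *buf, size_t size, const char *fmt, ...) PRINTFLIKE(3, 4);

int format_into(char *buf, size_t size, const char *fmt, ...)
{
   va_list ap;
   int n;

   assert(buf != NULL && fmt != NULL);

   va_start(ap, fmt);
   n = vsnprintf(buf, size, fmt, ap);   /* returns the untruncated length */
   va_end(ap);
   return n;
}
```

With the define present, a compiler that supports the attribute can warn about mismatched format arguments at call sites. Without it, the code still compiles unchanged.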


@@ -301,6 +301,10 @@ def generate(env):
cppdefines += ['HAVE_ALIAS']
else:
cppdefines += ['GLX_ALIAS_UNSUPPORTED']
if env['platform'] in ('linux', 'darwin'):
cppdefines += ['HAVE_XLOCALE_H']
if env['platform'] == 'haiku':
cppdefines += [
'HAVE_PTHREAD',
@@ -529,6 +533,10 @@ def generate(env):
else:
env['_LIBFLAGS'] = '-Wl,--start-group ' + env['_LIBFLAGS'] + ' -Wl,--end-group'
if env['platform'] == 'windows':
linkflags += [
'-Wl,--nxcompat', # DEP
'-Wl,--dynamicbase', # ASLR
]
# Avoid depending on gcc runtime DLLs
linkflags += ['-static-libgcc']
if 'w64' in env['CC'].split('-'):
@@ -547,6 +555,8 @@ def generate(env):
linkflags += [
'/fixed:no',
'/incremental:no',
'/dynamicbase', # ASLR
'/nxcompat', # DEP
]
env.Append(LINKFLAGS = linkflags)
env.Append(SHLINKFLAGS = shlinkflags)
@@ -577,6 +587,30 @@ def generate(env):
env.Append(CCFLAGS = ['-fopenmp'])
env.Append(LIBS = ['gomp'])
if gcc_compat:
ccversion = env['CCVERSION']
cppdefines += [
'HAVE___BUILTIN_EXPECT',
'HAVE___BUILTIN_FFS',
'HAVE___BUILTIN_FFSLL',
'HAVE_FUNC_ATTRIBUTE_FLATTEN',
]
if distutils.version.LooseVersion(ccversion) >= distutils.version.LooseVersion('3'):
cppdefines += [
'HAVE_FUNC_ATTRIBUTE_FORMAT',
'HAVE_FUNC_ATTRIBUTE_PACKED',
]
if distutils.version.LooseVersion(ccversion) >= distutils.version.LooseVersion('3.4'):
cppdefines += [
'HAVE___BUILTIN_CTZ',
'HAVE___BUILTIN_POPCOUNT',
'HAVE___BUILTIN_POPCOUNTLL',
'HAVE___BUILTIN_CLZ',
'HAVE___BUILTIN_CLZLL',
]
if distutils.version.LooseVersion(ccversion) >= distutils.version.LooseVersion('4.5'):
cppdefines += ['HAVE___BUILTIN_UNREACHABLE']
# Load tools
env.Tool('lex')
env.Tool('yacc')
@@ -587,7 +621,7 @@ def generate(env):
env.Tool('custom')
createInstallMethods(env)
env.PkgCheckModules('X11', ['x11', 'xext', 'xdamage', 'xfixes'])
env.PkgCheckModules('X11', ['x11', 'xext', 'xdamage', 'xfixes', 'glproto >= 1.4.13'])
env.PkgCheckModules('XCB', ['x11-xcb', 'xcb-glx >= 1.8.1', 'xcb-dri2 >= 1.8'])
env.PkgCheckModules('XF86VIDMODE', ['xxf86vm'])
env.PkgCheckModules('DRM', ['libdrm >= 2.4.38'])


@@ -37,7 +37,7 @@ import SCons.Errors
import SCons.Util
required_llvm_version = '3.1'
required_llvm_version = '3.3'
def generate(env):
@@ -98,7 +98,7 @@ def generate(env):
'HAVE_STDINT_H',
])
env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')])
if llvm_version >= distutils.version.LooseVersion('3.2'):
if True:
# 3.2
env.Prepend(LIBS = [
'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',
@@ -110,18 +110,6 @@ def generate(env):
'LLVMAnalysis', 'LLVMTarget', 'LLVMMC', 'LLVMCore',
'LLVMSupport', 'LLVMRuntimeDyld', 'LLVMObject'
])
else:
# 3.1
env.Prepend(LIBS = [
'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',
'LLVMX86CodeGen', 'LLVMX86Desc', 'LLVMSelectionDAG',
'LLVMAsmPrinter', 'LLVMMCParser', 'LLVMX86AsmPrinter',
'LLVMX86Utils', 'LLVMX86Info', 'LLVMMCJIT', 'LLVMJIT',
'LLVMExecutionEngine', 'LLVMCodeGen', 'LLVMScalarOpts',
'LLVMInstCombine', 'LLVMTransformUtils', 'LLVMipa',
'LLVMAnalysis', 'LLVMTarget', 'LLVMMC', 'LLVMCore',
'LLVMSupport'
])
env.Append(LIBS = [
'imagehlp',
'psapi',


@@ -543,6 +543,10 @@ dri2_setup_screen(_EGLDisplay *disp)
}
}
/* All platforms but DRM call this function to create the screen, query the
* dri extensions, setup the vtables and populate the driver_configs.
* DRM inherits all that information from its display - GBM.
*/
EGLBoolean
dri2_create_screen(_EGLDisplay *disp)
{
@@ -666,6 +670,7 @@ static EGLBoolean
dri2_terminate(_EGLDriver *drv, _EGLDisplay *disp)
{
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
unsigned i;
_eglReleaseDisplayResources(drv, disp);
_eglCleanupDisplay(disp);
@@ -706,6 +711,15 @@ dri2_terminate(_EGLDriver *drv, _EGLDisplay *disp)
break;
}
/* The drm platform does not create the screen/driver_configs but reuses
* the ones from the gbm device. As such the gbm itself is responsible
* for the cleanup.
*/
if (disp->Platform != _EGL_PLATFORM_DRM) {
for (i = 0; dri2_dpy->driver_configs[i]; i++)
free((__DRIconfig *) dri2_dpy->driver_configs[i]);
free(dri2_dpy->driver_configs);
}
free(dri2_dpy);
disp->DriverData = NULL;
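
Note on the cleanup loop: the hunk above walks `driver_configs` until it reaches a NULL entry, frees each element, then frees the array itself, and skips all of that on the DRM platform, where GBM owns the configs. A stand-alone sketch of the same NULL-terminated cleanup follows. `free_null_terminated` is a hypothetical helper, not a Mesa function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: frees every element of a NULL-terminated array of
 * heap pointers, then the array itself. Returns the number of elements
 * freed (the terminator is not counted). */
size_t free_null_terminated(void **items)
{
   size_t i;

   assert(items != NULL);

   for (i = 0; items[i]; i++)
      free(items[i]);
   free(items);
   return i;
}
```

The caller must own both the elements and the array. That ownership question is why the diff guards the loop with the platform check.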


@@ -184,6 +184,7 @@ struct dri2_egl_display
#ifdef HAVE_X11_PLATFORM
xcb_connection_t *conn;
int screen;
#endif
#ifdef HAVE_WAYLAND_PLATFORM


@@ -352,7 +352,7 @@ dri2_drm_get_buffers(__DRIdrawable * driDrawable,
const unsigned int format = 32;
int i;
attachments_with_format = calloc(count * 2, sizeof(unsigned int));
attachments_with_format = calloc(count, 2 * sizeof(unsigned int));
if (!attachments_with_format) {
*out_count = 0;
return NULL;
@@ -418,6 +418,14 @@ dri2_drm_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw)
for (i = 0; i < ARRAY_SIZE(dri2_surf->color_buffers); i++)
if (dri2_surf->color_buffers[i].age > 0)
dri2_surf->color_buffers[i].age++;
/* Make sure we have a back buffer in case we're swapping without
* ever rendering. */
if (get_back_bo(dri2_surf) < 0) {
_eglError(EGL_BAD_ALLOC, "dri2_swap_buffers");
return EGL_FALSE;
}
dri2_surf->current = dri2_surf->back;
dri2_surf->current->age = 1;
dri2_surf->back = NULL;
@@ -660,15 +668,21 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp)
for (i = 0; dri2_dpy->driver_configs[i]; i++) {
EGLint format, attr_list[3];
unsigned int mask;
unsigned int red, alpha;
dri2_dpy->core->getConfigAttrib(dri2_dpy->driver_configs[i],
__DRI_ATTRIB_RED_MASK, &mask);
if (mask == 0x3ff00000)
__DRI_ATTRIB_RED_MASK, &red);
dri2_dpy->core->getConfigAttrib(dri2_dpy->driver_configs[i],
__DRI_ATTRIB_ALPHA_MASK, &alpha);
if (red == 0x3ff00000 && alpha == 0x00000000)
format = GBM_FORMAT_XRGB2101010;
else if (mask == 0x00ff0000)
else if (red == 0x3ff00000 && alpha == 0xc0000000)
format = GBM_FORMAT_ARGB2101010;
else if (red == 0x00ff0000 && alpha == 0x00000000)
format = GBM_FORMAT_XRGB8888;
else if (mask == 0xf800)
else if (red == 0x00ff0000 && alpha == 0xff000000)
format = GBM_FORMAT_ARGB8888;
else if (red == 0xf800)
format = GBM_FORMAT_RGB565;
else
continue;
@@ -681,6 +695,7 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp)
i + 1, EGL_WINDOW_BIT, attr_list, NULL);
}
disp->Extensions.KHR_image_pixmap = EGL_TRUE;
if (dri2_dpy->dri2)
disp->Extensions.EXT_buffer_age = EGL_TRUE;
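
Note on the format selection: the `dri2_initialize_drm` hunk now keys the format on both the red and the alpha mask, so X and A variants of the same layout are no longer conflated. The sketch below restates that decision table as a stand-alone function. The `SKETCH_*` enum values are placeholders standing in for the `GBM_FORMAT_*` constants from gbm.h, and `pick_format` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values standing in for the GBM_FORMAT_* constants. */
enum sketch_format {
   SKETCH_NONE = 0,
   SKETCH_XRGB2101010,
   SKETCH_ARGB2101010,
   SKETCH_XRGB8888,
   SKETCH_ARGB8888,
   SKETCH_RGB565
};

/* Maps (red mask, alpha mask) to a format, mirroring the if/else chain in
 * the hunk. SKETCH_NONE corresponds to the `continue` branch. */
enum sketch_format pick_format(uint32_t red, uint32_t alpha)
{
   if (red == 0x3ff00000 && alpha == 0x00000000)
      return SKETCH_XRGB2101010;
   if (red == 0x3ff00000 && alpha == 0xc0000000)
      return SKETCH_ARGB2101010;
   if (red == 0x00ff0000 && alpha == 0x00000000)
      return SKETCH_XRGB8888;
   if (red == 0x00ff0000 && alpha == 0xff000000)
      return SKETCH_ARGB8888;
   if (red == 0xf800)   /* alpha is not consulted here, as in the hunk */
      return SKETCH_RGB565;
   return SKETCH_NONE;
}

/* Usage example: the same red mask yields different formats by alpha. */
void pick_format_example(void)
{
   assert(pick_format(0x00ff0000, 0x00000000) == SKETCH_XRGB8888);
   assert(pick_format(0x00ff0000, 0xff000000) == SKETCH_ARGB8888);
}
```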


@@ -468,7 +468,7 @@ dri2_wl_get_buffers(__DRIdrawable * driDrawable,
const unsigned int format = 32;
int i;
attachments_with_format = calloc(count * 2, sizeof(unsigned int));
attachments_with_format = calloc(count, 2 * sizeof(unsigned int));
if (!attachments_with_format) {
*out_count = 0;
return NULL;
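
Note on the `calloc` change: this hunk and the matching one in `dri2_drm_get_buffers` both move the factor of 2 from the element count into the element size. Presumably the point is that `count * 2` is computed by the caller and could overflow before `calloc` sees it, whereas `2 * sizeof(unsigned int)` is a small constant, which leaves the count-times-size check to the allocator (most C libraries reject a product that does not fit). A minimal sketch of the pattern follows. `alloc_pairs` and `pairs_overflow` are hypothetical helpers, and `size_t` is used for the count here, which is an assumption rather than the type in the original code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates `count` zeroed (attachment, format) pairs.
 * The element size is a compile-time constant, so no caller-side
 * multiplication of the count is needed. */
unsigned int *alloc_pairs(size_t count)
{
   return calloc(count, 2 * sizeof(unsigned int));
}

/* Hypothetical helper: nonzero when `count` pairs cannot be expressed as a
 * byte size in size_t. */
int pairs_overflow(size_t count)
{
   return count > SIZE_MAX / (2 * sizeof(unsigned int));
}

/* Usage example: two pairs give four zero-initialised entries. */
void alloc_pairs_example(void)
{
   unsigned int *p = alloc_pairs(2);
   assert(p == NULL || p[3] == 0);
   free(p);
}
```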


@@ -49,8 +49,7 @@ dri2_x11_swap_interval(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf,
static void
swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
struct dri2_egl_surface * dri2_surf,
int depth)
struct dri2_egl_surface * dri2_surf)
{
uint32_t mask;
const uint32_t function = GXcopy;
@@ -66,8 +65,7 @@ swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
valgc[0] = function;
valgc[1] = False;
xcb_create_gc(dri2_dpy->conn, dri2_surf->swapgc, dri2_surf->drawable, mask, valgc);
dri2_surf->depth = depth;
switch (depth) {
switch (dri2_surf->depth) {
case 32:
case 24:
dri2_surf->bytes_per_pixel = 4;
@@ -82,7 +80,7 @@ swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
dri2_surf->bytes_per_pixel = 0;
break;
default:
_eglLog(_EGL_WARNING, "unsupported depth %d", depth);
_eglLog(_EGL_WARNING, "unsupported depth %d", dri2_surf->depth);
}
}
@@ -178,6 +176,17 @@ swrastGetImage(__DRIdrawable * read,
}
static xcb_screen_t *
get_xcb_screen(xcb_screen_iterator_t iter, int screen)
{
for (; iter.rem; --screen, xcb_screen_next(&iter))
if (screen == 0)
return iter.data;
return NULL;
}
/**
* Called via eglCreateWindowSurface(), drv->API.CreateWindowSurface().
*/
@@ -194,6 +203,7 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
xcb_screen_iterator_t s;
xcb_generic_error_t *error;
xcb_drawable_t drawable;
xcb_screen_t *screen;
STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface));
drawable = (uintptr_t) native_surface;
@@ -211,10 +221,16 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
dri2_surf->region = XCB_NONE;
if (type == EGL_PBUFFER_BIT) {
dri2_surf->drawable = xcb_generate_id(dri2_dpy->conn);
s = xcb_setup_roots_iterator(xcb_get_setup(dri2_dpy->conn));
screen = get_xcb_screen(s, dri2_dpy->screen);
if (!screen) {
_eglError(EGL_BAD_NATIVE_WINDOW, "dri2_create_surface");
goto cleanup_surf;
}
dri2_surf->drawable = xcb_generate_id(dri2_dpy->conn);
xcb_create_pixmap(dri2_dpy->conn, conf->BufferSize,
dri2_surf->drawable, s.data->root,
dri2_surf->drawable, screen->root,
dri2_surf->base.Width, dri2_surf->base.Height);
} else {
dri2_surf->drawable = drawable;
@@ -239,12 +255,6 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
_eglError(EGL_BAD_ALLOC, "dri2->createNewDrawable");
goto cleanup_pixmap;
}
if (dri2_dpy->dri2) {
xcb_dri2_create_drawable (dri2_dpy->conn, dri2_surf->drawable);
} else {
swrastCreateDrawable(dri2_dpy, dri2_surf, _eglGetConfigKey(conf, EGL_BUFFER_SIZE));
}
if (type != EGL_PBUFFER_BIT) {
cookie = xcb_get_geometry (dri2_dpy->conn, dri2_surf->drawable);
@@ -257,9 +267,19 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
dri2_surf->base.Width = reply->width;
dri2_surf->base.Height = reply->height;
dri2_surf->depth = reply->depth;
free(reply);
}
if (dri2_dpy->dri2) {
xcb_dri2_create_drawable (dri2_dpy->conn, dri2_surf->drawable);
} else {
if (type == EGL_PBUFFER_BIT) {
dri2_surf->depth = _eglGetConfigKey(conf, EGL_BUFFER_SIZE);
}
swrastCreateDrawable(dri2_dpy, dri2_surf);
}
/* we always copy the back buffer to front */
dri2_surf->base.PostSubBufferSupportedNV = EGL_TRUE;
@@ -492,6 +512,7 @@ dri2_x11_connect(struct dri2_egl_display *dri2_dpy)
xcb_dri2_connect_cookie_t connect_cookie;
xcb_generic_error_t *error;
xcb_screen_iterator_t s;
xcb_screen_t *screen;
char *driver_name, *device_name;
const xcb_query_extension_reply_t *extension;
@@ -515,9 +536,13 @@ dri2_x11_connect(struct dri2_egl_display *dri2_dpy)
XCB_DRI2_MINOR_VERSION);
s = xcb_setup_roots_iterator(xcb_get_setup(dri2_dpy->conn));
connect_cookie = xcb_dri2_connect_unchecked (dri2_dpy->conn,
s.data->root,
XCB_DRI2_DRIVER_TYPE_DRI);
screen = get_xcb_screen(s, dri2_dpy->screen);
if (!screen) {
_eglError(EGL_BAD_NATIVE_WINDOW, "dri2_x11_connect");
return EGL_FALSE;
}
connect_cookie = xcb_dri2_connect_unchecked(dri2_dpy->conn, screen->root,
XCB_DRI2_DRIVER_TYPE_DRI);
xfixes_query =
xcb_xfixes_query_version_reply (dri2_dpy->conn,
@@ -577,11 +602,19 @@ dri2_x11_authenticate(_EGLDisplay *disp, uint32_t id)
xcb_dri2_authenticate_reply_t *authenticate;
xcb_dri2_authenticate_cookie_t authenticate_cookie;
xcb_screen_iterator_t s;
xcb_screen_t *screen;
int ret = 0;
s = xcb_setup_roots_iterator(xcb_get_setup(dri2_dpy->conn));
screen = get_xcb_screen(s, dri2_dpy->screen);
if (!screen) {
_eglError(EGL_BAD_NATIVE_WINDOW, "dri2_x11_authenticate");
return -1;
}
authenticate_cookie =
xcb_dri2_authenticate_unchecked(dri2_dpy->conn, s.data->root, id);
xcb_dri2_authenticate_unchecked(dri2_dpy->conn, screen->root, id);
authenticate =
xcb_dri2_authenticate_reply(dri2_dpy->conn, authenticate_cookie, NULL);
@@ -630,7 +663,7 @@ dri2_x11_add_configs_for_visuals(struct dri2_egl_display *dri2_dpy,
};
s = xcb_setup_roots_iterator(xcb_get_setup(dri2_dpy->conn));
d = xcb_screen_allowed_depths_iterator(s.data);
d = xcb_screen_allowed_depths_iterator(get_xcb_screen(s, dri2_dpy->screen));
id = 1;
surface_type =
@@ -1065,10 +1098,13 @@ dri2_initialize_x11_swrast(_EGLDriver *drv, _EGLDisplay *disp)
disp->DriverData = (void *) dri2_dpy;
if (disp->PlatformDisplay == NULL) {
dri2_dpy->conn = xcb_connect(0, 0);
dri2_dpy->conn = xcb_connect(0, &dri2_dpy->screen);
dri2_dpy->own_device = true;
} else {
dri2_dpy->conn = XGetXCBConnection((Display *) disp->PlatformDisplay);
Display *dpy = disp->PlatformDisplay;
dri2_dpy->conn = XGetXCBConnection(dpy);
dri2_dpy->screen = DefaultScreen(dpy);
}
if (xcb_connection_has_error(dri2_dpy->conn)) {
@@ -1185,10 +1221,13 @@ dri2_initialize_x11_dri2(_EGLDriver *drv, _EGLDisplay *disp)
disp->DriverData = (void *) dri2_dpy;
if (disp->PlatformDisplay == NULL) {
dri2_dpy->conn = xcb_connect(0, 0);
dri2_dpy->conn = xcb_connect(0, &dri2_dpy->screen);
dri2_dpy->own_device = true;
} else {
dri2_dpy->conn = XGetXCBConnection((Display *) disp->PlatformDisplay);
Display *dpy = disp->PlatformDisplay;
dri2_dpy->conn = XGetXCBConnection(dpy);
dri2_dpy->screen = DefaultScreen(dpy);
}
if (xcb_connection_has_error(dri2_dpy->conn)) {
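
Note on `get_xcb_screen`: the helper added in this file advances the roots iterator while counting the requested screen number down to zero, and returns NULL if the iterator runs out first. Callers previously used `s.data`, which is always the first screen. The sketch below models that loop without depending on libxcb. `sketch_screen`, `sketch_screen_iter` and `sketch_screen_next` are hypothetical stand-ins for `xcb_screen_t`, `xcb_screen_iterator_t` and `xcb_screen_next`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the XCB types used by the real helper. */
struct sketch_screen { int root; };
struct sketch_screen_iter { struct sketch_screen *data; int rem; };

/* Models xcb_screen_next(): one fewer item remaining, move to the next. */
static void sketch_screen_next(struct sketch_screen_iter *it)
{
   it->rem--;
   it->data++;
}

/* Same control flow as get_xcb_screen() in the diff: the iterator is passed
 * by value, so the caller's copy is left untouched. */
struct sketch_screen *
sketch_get_screen(struct sketch_screen_iter iter, int screen)
{
   assert(iter.rem >= 0);

   for (; iter.rem; --screen, sketch_screen_next(&iter))
      if (screen == 0)
         return iter.data;

   return NULL;
}
```

An out-of-range or negative screen number falls through to NULL, which is the case the new `if (!screen)` checks in the callers handle.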


@@ -143,6 +143,7 @@ LOCAL_STATIC_LIBRARIES := \
libmesa_st_egl \
$(gallium_DRIVERS) \
libmesa_st_mesa \
libmesa_util \
libmesa_glsl \
libmesa_glsl_utils \
libmesa_gallium \


@@ -28,7 +28,7 @@ AM_CFLAGS = \
$(VISIBILITY_CFLAGS) \
$(EGL_CFLAGS) \
-D_EGL_NATIVE_PLATFORM=$(EGL_NATIVE_PLATFORM) \
-D_EGL_DRIVER_SEARCH_DIR=\"$(EGL_DRIVER_INSTALL_DIR)\" \
-D_EGL_DRIVER_SEARCH_DIR=\"$(libdir)/egl\" \
-D_EGL_OS_UNIX=1
lib_LTLIBRARIES = libEGL.la


@@ -517,19 +517,6 @@ _eglAddUserDriver(void)
}
/**
* Add egl_gallium to the module array.
*/
static void
_eglAddGalliumDriver(void)
{
#ifndef _EGL_BUILT_IN_DRIVER_GALLIUM
void *external = (void *) "egl_gallium";
_eglPreloadForEach(_eglGetSearchPath(), _eglLoaderFile, external);
#endif
}
/**
* Add built-in drivers to the module array.
*/
@@ -562,7 +549,6 @@ _eglAddDrivers(void)
* Add other drivers only when EGL_DRIVER is not set. The order here
* decides the priorities.
*/
_eglAddGalliumDriver();
_eglAddBuiltInDrivers();
}


@@ -16,6 +16,7 @@ GALLIUM_DRIVER_CFLAGS = \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/gallium/drivers \
-I$(top_srcdir)/src/gallium/winsys \
$(DEFINES) \
$(VISIBILITY_CFLAGS)
@@ -26,6 +27,7 @@ GALLIUM_DRIVER_CXXFLAGS = \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/gallium/drivers \
-I$(top_srcdir)/src/gallium/winsys \
$(DEFINES) \
$(VISIBILITY_CXXFLAGS)
@@ -56,7 +58,8 @@ GALLIUM_WINSYS_CFLAGS = \
GALLIUM_PIPE_LOADER_WINSYS_LIBS = \
$(top_builddir)/src/gallium/winsys/sw/null/libws_null.la
$(top_builddir)/src/gallium/winsys/sw/null/libws_null.la \
$(top_builddir)/src/gallium/winsys/sw/wrapper/libwsw.la
if HAVE_DRISW
GALLIUM_PIPE_LOADER_WINSYS_LIBS += \


@@ -68,11 +68,11 @@ SUBDIRS += winsys/radeon/drm
endif
## swrast/softpipe
if NEED_GALLIUM_SOFTPIPE_DRIVER
if HAVE_GALLIUM_SOFTPIPE
SUBDIRS += drivers/softpipe
## swrast/llvmpipe
if NEED_GALLIUM_LLVMPIPE_DRIVER
if HAVE_GALLIUM_LLVMPIPE
SUBDIRS += drivers/llvmpipe
endif
endif
@@ -105,16 +105,23 @@ if HAVE_EGL_PLATFORM_WAYLAND
SUBDIRS += winsys/sw/wayland
endif
if NEED_WINSYS_WRAPPER
SUBDIRS += winsys/sw/wrapper
endif
##
## Don't forget to bundle the remaining (non autotools) winsys'
##
EXTRA_DIST = \
winsys/sw/android \
winsys/sw/gdi \
winsys/sw/hgl
##
## Gallium state trackers and their users (targets)
##
if NEED_GALLIUM_LOADER
if HAVE_LOADER_GALLIUM
SUBDIRS += targets/pipe-loader
endif
@@ -131,14 +138,6 @@ if HAVE_OPENVG
SUBDIRS += state_trackers/vega
endif
if HAVE_GALLIUM_EGL
SUBDIRS += state_trackers/egl targets/egl-static
endif
if HAVE_GALLIUM_GBM
SUBDIRS += state_trackers/gbm targets/gbm
endif
if HAVE_X11_DRIVER
SUBDIRS += state_trackers/glx/xlib targets/libgl-xlib
endif
@@ -151,6 +150,10 @@ if HAVE_GALLIUM_OSMESA
SUBDIRS += state_trackers/osmesa targets/osmesa
endif
if HAVE_ST_VA
SUBDIRS += state_trackers/va targets/va
endif
if HAVE_ST_VDPAU
SUBDIRS += state_trackers/vdpau targets/vdpau
endif
@@ -163,6 +166,22 @@ if HAVE_ST_XVMC
SUBDIRS += state_trackers/xvmc targets/xvmc
endif
if HAVE_ST_NINE
SUBDIRS += state_trackers/nine targets/d3dadapter9
endif
##
## Don't forget to bundle the remaining (non autotools) state-trackers/targets
##
EXTRA_DIST += \
state_trackers/README \
state_trackers/wgl targets/libgl-gdi \
targets/graw-gdi targets/graw-null targets/graw-xlib \
state_trackers/hgl targets/haiku-softpipe \
tools
##
## Gallium tests
##
@@ -172,3 +191,7 @@ SUBDIRS += \
tests/trivial \
tests/unit
endif
EXTRA_DIST += \
tests/graw \
tests/python


@@ -85,6 +85,7 @@ if not env['embedded']:
if env['platform'] == 'haiku':
SConscript([
'state_trackers/hgl/SConscript',
'targets/haiku-softpipe/SConscript',
])


@@ -30,7 +30,9 @@ include $(CLEAR_VARS)
LOCAL_SRC_FILES := $(C_SOURCES)
LOCAL_C_INCLUDES := $(GALLIUM_TOP)/auxiliary/util
LOCAL_C_INCLUDES := \
$(GALLIUM_TOP)/auxiliary/util \
$(MESA_TOP)/src
LOCAL_MODULE := libmesa_gallium


@@ -1,5 +1,9 @@
AUTOMAKE_OPTIONS = subdir-objects
if HAVE_LOADER_GALLIUM
SUBDIRS := pipe-loader
endif
include Makefile.sources
include $(top_srcdir)/src/gallium/Automake.inc


@@ -1,5 +1,3 @@
SUBDIRS := pipe-loader
C_SOURCES := \
cso_cache/cso_cache.c \
cso_cache/cso_context.c \
@@ -77,6 +75,7 @@ C_SOURCES := \
tgsi/tgsi_exec.c \
tgsi/tgsi_info.c \
tgsi/tgsi_iterate.c \
tgsi/tgsi_lowering.c \
tgsi/tgsi_parse.c \
tgsi/tgsi_sanity.c \
tgsi/tgsi_scan.c \


@@ -239,18 +239,8 @@ static void cso_init_vbuf(struct cso_context *cso)
{
struct u_vbuf_caps caps;
u_vbuf_get_caps(cso->pipe->screen, &caps);
/* Install u_vbuf if there is anything unsupported. */
if (!caps.buffer_offset_unaligned ||
!caps.buffer_stride_unaligned ||
!caps.velem_src_offset_unaligned ||
!caps.format_fixed32 ||
!caps.format_float16 ||
!caps.format_float64 ||
!caps.format_norm32 ||
!caps.format_scaled32 ||
!caps.user_vertex_buffers) {
if (u_vbuf_get_caps(cso->pipe->screen, &caps)) {
cso->vbuf = u_vbuf_create(cso->pipe, &caps,
cso->aux_vertex_buffer_index);
}


@@ -81,7 +81,8 @@ draw_get_option_use_llvm(void)
* Create new draw module context with gallivm state for LLVM JIT.
*/
static struct draw_context *
draw_create_context(struct pipe_context *pipe, boolean try_llvm)
draw_create_context(struct pipe_context *pipe, void *context,
boolean try_llvm)
{
struct draw_context *draw = CALLOC_STRUCT( draw_context );
if (draw == NULL)
@@ -92,9 +93,7 @@ draw_create_context(struct pipe_context *pipe, boolean try_llvm)
#if HAVE_LLVM
if (try_llvm && draw_get_option_use_llvm()) {
draw->llvm = draw_llvm_create(draw);
if (!draw->llvm)
goto err_destroy;
draw->llvm = draw_llvm_create(draw, (LLVMContextRef)context);
}
#endif
@@ -122,17 +121,26 @@ err_out:
struct draw_context *
draw_create(struct pipe_context *pipe)
{
return draw_create_context(pipe, TRUE);
return draw_create_context(pipe, NULL, TRUE);
}
#if HAVE_LLVM
struct draw_context *
draw_create_with_llvm_context(struct pipe_context *pipe,
void *context)
{
return draw_create_context(pipe, context, TRUE);
}
#endif
/**
* Create a new draw context, without LLVM JIT.
*/
struct draw_context *
draw_create_no_llvm(struct pipe_context *pipe)
{
return draw_create_context(pipe, FALSE);
return draw_create_context(pipe, NULL, FALSE);
}


@@ -64,6 +64,11 @@ struct draw_so_target {
struct draw_context *draw_create( struct pipe_context *pipe );
#if HAVE_LLVM
struct draw_context *draw_create_with_llvm_context(struct pipe_context *pipe,
void *context);
#endif
struct draw_context *draw_create_no_llvm(struct pipe_context *pipe);
void draw_destroy( struct draw_context *draw );


@@ -64,7 +64,7 @@ draw_gs_get_input_index(int semantic, int index,
* We execute geometry shaders in the SOA mode, so ideally we want to
* flush when the number of currently fetched primitives is equal to
* the number of elements in the SOA vector. This ensures that the
* throughput is optimized for the given vector instrunction set.
* throughput is optimized for the given vector instruction set.
*/
static INLINE boolean
draw_gs_should_flush(struct draw_geometry_shader *shader)
@@ -90,7 +90,7 @@ tgsi_fetch_gs_outputs(struct draw_geometry_shader *shader,
for (prim_idx = 0; prim_idx < num_primitives; ++prim_idx) {
unsigned num_verts_per_prim = machine->Primitives[prim_idx];
shader->primitive_lengths[prim_idx + shader->emitted_primitives] =
shader->primitive_lengths[prim_idx + shader->emitted_primitives] =
machine->Primitives[prim_idx];
shader->emitted_vertices += num_verts_per_prim;
for (j = 0; j < num_verts_per_prim; j++, current_idx++) {
@@ -110,7 +110,6 @@ tgsi_fetch_gs_outputs(struct draw_geometry_shader *shader,
output[slot][2],
output[slot][3]);
#endif
debug_assert(!util_is_inf_or_nan(output[slot][0]));
}
output = (float (*)[4])((char *)output + shader->vertex_size);
}
@@ -751,9 +750,6 @@ draw_create_geometry_shader(struct draw_context *draw,
tgsi_scan_shader(state->tokens, &gs->info);
/* setup the defaults */
gs->input_primitive = PIPE_PRIM_TRIANGLES;
gs->output_primitive = PIPE_PRIM_TRIANGLE_STRIP;
gs->max_output_vertices = 32;
gs->max_out_prims = 0;
#ifdef HAVE_LLVM
@@ -769,17 +765,15 @@ draw_create_geometry_shader(struct draw_context *draw,
gs->vector_length = 1;
}
for (i = 0; i < gs->info.num_properties; ++i) {
if (gs->info.properties[i].name ==
TGSI_PROPERTY_GS_INPUT_PRIM)
gs->input_primitive = gs->info.properties[i].data[0];
else if (gs->info.properties[i].name ==
TGSI_PROPERTY_GS_OUTPUT_PRIM)
gs->output_primitive = gs->info.properties[i].data[0];
else if (gs->info.properties[i].name ==
TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES)
gs->max_output_vertices = gs->info.properties[i].data[0];
}
gs->input_primitive =
gs->info.properties[TGSI_PROPERTY_GS_INPUT_PRIM];
gs->output_primitive =
gs->info.properties[TGSI_PROPERTY_GS_OUTPUT_PRIM];
gs->max_output_vertices =
gs->info.properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES];
if (!gs->max_output_vertices)
gs->max_output_vertices = 32;
/* Primitive boundary is bigger than max_output_vertices by one, because
* the specification says that the geometry shader should exit if the
* number of emitted vertices is bigger or equal to max_output_vertices and


@@ -480,18 +480,27 @@ get_vertex_header_ptr_type(struct draw_llvm_variant *variant)
* Create per-context LLVM info.
*/
struct draw_llvm *
draw_llvm_create(struct draw_context *draw)
draw_llvm_create(struct draw_context *draw, LLVMContextRef context)
{
struct draw_llvm *llvm;
if (!lp_build_init())
return NULL;
llvm = CALLOC_STRUCT( draw_llvm );
if (!llvm)
return NULL;
lp_build_init();
llvm->draw = draw;
llvm->context = context;
if (!llvm->context) {
llvm->context = LLVMContextCreate();
llvm->context_owned = true;
}
if (!llvm->context)
goto fail;
llvm->nr_variants = 0;
make_empty_list(&llvm->vs_variants_list);
@@ -499,6 +508,10 @@ draw_llvm_create(struct draw_context *draw)
make_empty_list(&llvm->gs_variants_list);
return llvm;
fail:
draw_llvm_destroy(llvm);
return NULL;
}
@@ -508,6 +521,10 @@ draw_llvm_create(struct draw_context *draw)
void
draw_llvm_destroy(struct draw_llvm *llvm)
{
if (llvm->context_owned)
LLVMContextDispose(llvm->context);
llvm->context = NULL;
/* XXX free other draw_llvm data? */
FREE(llvm);
}
@@ -539,7 +556,7 @@ draw_llvm_create_variant(struct draw_llvm *llvm,
util_snprintf(module_name, sizeof(module_name), "draw_llvm_vs_variant%u",
variant->shader->variants_cached);
variant->gallivm = gallivm_create(module_name);
variant->gallivm = gallivm_create(module_name, llvm->context);
create_jit_types(variant);
@@ -588,16 +605,12 @@ generate_vs(struct draw_llvm_variant *variant,
draw_jit_context_vs_constants(variant->gallivm, context_ptr);
LLVMValueRef num_consts_ptr =
draw_jit_context_num_vs_constants(variant->gallivm, context_ptr);
struct lp_build_sampler_soa *sampler = 0;
if (gallivm_debug & (GALLIVM_DEBUG_TGSI | GALLIVM_DEBUG_IR)) {
tgsi_dump(tokens, 0);
draw_llvm_dump_variant_key(&variant->key);
}
if (llvm->draw->num_sampler_views && llvm->draw->num_samplers)
sampler = draw_sampler;
lp_build_tgsi_soa(variant->gallivm,
tokens,
vs_type,
@@ -607,7 +620,7 @@ generate_vs(struct draw_llvm_variant *variant,
system_values,
inputs,
outputs,
sampler,
draw_sampler,
&llvm->draw->vs.vertex_shader->info,
NULL);
@@ -645,7 +658,8 @@ generate_fetch(struct gallivm_state *gallivm,
struct pipe_vertex_element *velem,
LLVMValueRef vbuf,
LLVMValueRef index,
LLVMValueRef instance_id)
LLVMValueRef instance_id,
LLVMValueRef start_instance)
{
const struct util_format_description *format_desc =
util_format_description(velem->src_format);
@@ -675,11 +689,11 @@ generate_fetch(struct gallivm_state *gallivm,
* index = start_instance + (instance_id / divisor)
*/
LLVMValueRef current_instance;
index = lp_build_const_int32(gallivm, draw->start_instance);
current_instance = LLVMBuildUDiv(builder, instance_id,
lp_build_const_int32(gallivm, velem->instance_divisor),
"instance_divisor");
index = lp_build_uadd_overflow(gallivm, index, current_instance, &ofbit);
index = lp_build_uadd_overflow(gallivm, start_instance,
current_instance, &ofbit);
}
stride = lp_build_umul_overflow(gallivm, vb_stride, index, &ofbit);
@@ -1473,7 +1487,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
struct gallivm_state *gallivm = variant->gallivm;
LLVMContextRef context = gallivm->context;
LLVMTypeRef int32_type = LLVMInt32TypeInContext(context);
LLVMTypeRef arg_types[10];
LLVMTypeRef arg_types[11];
unsigned num_arg_types =
elts ? Elements(arg_types) : Elements(arg_types) - 1;
LLVMTypeRef func_type;
@@ -1484,7 +1498,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
struct lp_type vs_type;
LLVMValueRef end, start;
LLVMValueRef count, fetch_elts, fetch_elt_max, fetch_count;
LLVMValueRef vertex_id_offset;
LLVMValueRef vertex_id_offset, start_instance;
LLVMValueRef stride, step, io_itr;
LLVMValueRef io_ptr, vbuffers_ptr, vb_ptr;
LLVMValueRef zero = lp_build_const_int32(gallivm, 0);
@@ -1533,6 +1547,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
arg_types[i++] = get_vb_ptr_type(variant); /* pipe_vertex_buffer's */
arg_types[i++] = int32_type; /* instance_id */
arg_types[i++] = int32_type; /* vertex_id_offset */
arg_types[i++] = int32_type; /* start_instance */
func_type = LLVMFunctionType(int32_type, arg_types, num_arg_types, 0);
@@ -1556,6 +1571,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
vb_ptr = LLVMGetParam(variant_func, 6 + (elts ? 1 : 0));
system_values.instance_id = LLVMGetParam(variant_func, 7 + (elts ? 1 : 0));
vertex_id_offset = LLVMGetParam(variant_func, 8 + (elts ? 1 : 0));
start_instance = LLVMGetParam(variant_func, 9 + (elts ? 1 : 0));
lp_build_name(context_ptr, "context");
lp_build_name(io_ptr, "io");
@@ -1564,6 +1580,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
lp_build_name(vb_ptr, "vb");
lp_build_name(system_values.instance_id, "instance_id");
lp_build_name(vertex_id_offset, "vertex_id_offset");
lp_build_name(start_instance, "start_instance");
if (elts) {
fetch_elts = LLVMGetParam(variant_func, 3);
@@ -1712,7 +1729,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
LLVMValueRef vb = LLVMBuildGEP(builder, vb_ptr, &vb_index, 1, "");
generate_fetch(gallivm, draw, vbuffers_ptr,
&aos_attribs[j][i], velem, vb, true_index,
system_values.instance_id);
system_values.instance_id, start_instance);
}
}
convert_to_soa(gallivm, aos_attribs, inputs,
@@ -2194,7 +2211,7 @@ draw_gs_llvm_create_variant(struct draw_llvm *llvm,
util_snprintf(module_name, sizeof(module_name), "draw_llvm_gs_variant%u",
variant->shader->variants_cached);
variant->gallivm = gallivm_create(module_name);
variant->gallivm = gallivm_create(module_name, llvm->context);
create_gs_jit_types(variant);
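
The generate_fetch() hunks above stop baking draw->start_instance into the generated code as a constant and pass it in as a runtime argument. The per-instance index is then start_instance + (instance_id / divisor), with an overflow bit tracked on the addition. A minimal scalar sketch of that arithmetic follows. The function name instance_fetch_index is hypothetical and is not part of the draw module; the sketch also assumes a nonzero divisor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical scalar model of the per-instance fetch index:
 *   index = start_instance + (instance_id / divisor)
 * Returns true if the unsigned addition wrapped around.
 */
static bool
instance_fetch_index(uint32_t start_instance, uint32_t instance_id,
                     uint32_t divisor, uint32_t *index_out)
{
   uint32_t current_instance;

   /* Assumption of this sketch: the caller only gets here for
    * instanced attributes, so the divisor is never zero. */
   assert(divisor != 0);

   current_instance = instance_id / divisor;
   *index_out = start_instance + current_instance;

   /* Unsigned wraparound shows up as a sum smaller than an operand. */
   return *index_out < start_instance;
}
```

Because start_instance is an ordinary argument here, the same compiled variant can presumably serve draws with different base instances, which appears to be the point of the extra JIT function parameter.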


@@ -274,7 +274,8 @@ typedef int
unsigned stride,
struct pipe_vertex_buffer *vertex_buffers,
unsigned instance_id,
unsigned vertex_id_offset);
unsigned vertex_id_offset,
unsigned start_instance);
typedef int
@@ -287,7 +288,8 @@ typedef int
unsigned stride,
struct pipe_vertex_buffer *vertex_buffers,
unsigned instance_id,
unsigned vertex_id_offset);
unsigned vertex_id_offset,
unsigned start_instance);
typedef int
@@ -459,6 +461,9 @@ struct llvm_geometry_shader {
struct draw_llvm {
struct draw_context *draw;
LLVMContextRef context;
boolean context_owned;
struct draw_jit_context jit_context;
struct draw_gs_jit_context gs_jit_context;
@@ -486,7 +491,7 @@ llvm_geometry_shader(struct draw_geometry_shader *gs)
struct draw_llvm *
draw_llvm_create(struct draw_context *draw);
draw_llvm_create(struct draw_context *draw, LLVMContextRef llvm_context);
void
draw_llvm_destroy(struct draw_llvm *llvm);
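
The new context / context_owned pair lets draw_llvm_create() either borrow a caller-supplied LLVMContextRef or create its own, and draw_llvm_destroy() disposes of the context only in the second case. The sketch below shows the same ownership pattern without LLVM. All toy_* names are hypothetical stand-ins, and calloc/free take the place of LLVMContextCreate/LLVMContextDispose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for LLVMContextRef and struct draw_llvm. */
struct toy_context { int unused; };

struct toy_llvm {
   struct toy_context *context;
   bool context_owned;   /* true only if we created the context ourselves */
};

static void
toy_llvm_destroy(struct toy_llvm *llvm)
{
   /* Only release the context if this object created it. */
   if (llvm->context_owned)
      free(llvm->context);
   llvm->context = NULL;
   free(llvm);
}

static struct toy_llvm *
toy_llvm_create(struct toy_context *context)
{
   struct toy_llvm *llvm = calloc(1, sizeof(*llvm));
   if (!llvm)
      return NULL;

   llvm->context = context;
   if (!llvm->context) {
      /* No context supplied: create a private one and remember that. */
      llvm->context = calloc(1, sizeof(*llvm->context));
      llvm->context_owned = true;
   }
   if (!llvm->context)
      goto fail;

   return llvm;

fail:
   toy_llvm_destroy(llvm);
   return NULL;
}
```

Routing the failure path through the destroy function keeps the cleanup logic in one place, as the fail: label in the diff does.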


@@ -140,7 +140,6 @@ struct aa_transform_context {
int freeSampler; /** an available sampler for the pstipple */
int maxInput, maxGeneric; /**< max input index found */
int colorTemp, texTemp; /**< temp registers */
boolean firstInstruction;
};
@@ -196,149 +195,106 @@ free_bit(uint bitfield)
}
/**
* TGSI transform prolog callback.
*/
static void
aa_transform_prolog(struct tgsi_transform_context *ctx)
{
struct aa_transform_context *aactx = (struct aa_transform_context *) ctx;
uint i;
/* find free sampler */
aactx->freeSampler = free_bit(aactx->samplersUsed);
if (aactx->freeSampler >= PIPE_MAX_SAMPLERS)
aactx->freeSampler = PIPE_MAX_SAMPLERS - 1;
/* find two free temp regs */
for (i = 0; i < 32; i++) {
if ((aactx->tempsUsed & (1 << i)) == 0) {
/* found a free temp */
if (aactx->colorTemp < 0)
aactx->colorTemp = i;
else if (aactx->texTemp < 0)
aactx->texTemp = i;
else
break;
}
}
assert(aactx->colorTemp >= 0);
assert(aactx->texTemp >= 0);
/* declare new generic input/texcoord */
tgsi_transform_input_decl(ctx, aactx->maxInput + 1,
TGSI_SEMANTIC_GENERIC, aactx->maxGeneric + 1,
TGSI_INTERPOLATE_LINEAR);
/* declare new sampler */
tgsi_transform_sampler_decl(ctx, aactx->freeSampler);
/* declare new temp regs */
tgsi_transform_temp_decl(ctx, aactx->texTemp);
tgsi_transform_temp_decl(ctx, aactx->colorTemp);
}
/**
* TGSI transform epilog callback.
*/
static void
aa_transform_epilog(struct tgsi_transform_context *ctx)
{
struct aa_transform_context *aactx = (struct aa_transform_context *) ctx;
if (aactx->colorOutput != -1) {
/* insert texture sampling code for antialiasing. */
/* TEX texTemp, input_coord, sampler */
tgsi_transform_tex_2d_inst(ctx,
TGSI_FILE_TEMPORARY, aactx->texTemp,
TGSI_FILE_INPUT, aactx->maxInput + 1,
aactx->freeSampler);
/* MOV rgb */
tgsi_transform_op1_inst(ctx, TGSI_OPCODE_MOV,
TGSI_FILE_OUTPUT, aactx->colorOutput,
TGSI_WRITEMASK_XYZ,
TGSI_FILE_TEMPORARY, aactx->colorTemp);
/* MUL alpha */
tgsi_transform_op2_inst(ctx, TGSI_OPCODE_MUL,
TGSI_FILE_OUTPUT, aactx->colorOutput,
TGSI_WRITEMASK_W,
TGSI_FILE_TEMPORARY, aactx->colorTemp,
TGSI_FILE_TEMPORARY, aactx->texTemp);
}
}
/**
* TGSI instruction transform callback.
* Replace writes to result.color w/ a temp reg.
* Upon END instruction, insert texture sampling code for antialiasing.
*/
static void
aa_transform_inst(struct tgsi_transform_context *ctx,
struct tgsi_full_instruction *inst)
{
struct aa_transform_context *aactx = (struct aa_transform_context *) ctx;
uint i;
if (aactx->firstInstruction) {
/* emit our new declarations before the first instruction */
struct tgsi_full_declaration decl;
uint i;
/* find free sampler */
aactx->freeSampler = free_bit(aactx->samplersUsed);
if (aactx->freeSampler >= PIPE_MAX_SAMPLERS)
aactx->freeSampler = PIPE_MAX_SAMPLERS - 1;
/* find two free temp regs */
for (i = 0; i < 32; i++) {
if ((aactx->tempsUsed & (1 << i)) == 0) {
/* found a free temp */
if (aactx->colorTemp < 0)
aactx->colorTemp = i;
else if (aactx->texTemp < 0)
aactx->texTemp = i;
else
break;
}
/*
* Look for writes to result.color and replace with colorTemp reg.
*/
for (i = 0; i < inst->Instruction.NumDstRegs; i++) {
struct tgsi_full_dst_register *dst = &inst->Dst[i];
if (dst->Register.File == TGSI_FILE_OUTPUT &&
dst->Register.Index == aactx->colorOutput) {
dst->Register.File = TGSI_FILE_TEMPORARY;
dst->Register.Index = aactx->colorTemp;
}
assert(aactx->colorTemp >= 0);
assert(aactx->texTemp >= 0);
/* declare new generic input/texcoord */
decl = tgsi_default_full_declaration();
decl.Declaration.File = TGSI_FILE_INPUT;
/* XXX this could be linear... */
decl.Declaration.Interpolate = 1;
decl.Declaration.Semantic = 1;
decl.Semantic.Name = TGSI_SEMANTIC_GENERIC;
decl.Semantic.Index = aactx->maxGeneric + 1;
decl.Range.First =
decl.Range.Last = aactx->maxInput + 1;
decl.Interp.Interpolate = TGSI_INTERPOLATE_PERSPECTIVE;
ctx->emit_declaration(ctx, &decl);
/* declare new sampler */
decl = tgsi_default_full_declaration();
decl.Declaration.File = TGSI_FILE_SAMPLER;
decl.Range.First =
decl.Range.Last = aactx->freeSampler;
ctx->emit_declaration(ctx, &decl);
/* declare new temp regs */
decl = tgsi_default_full_declaration();
decl.Declaration.File = TGSI_FILE_TEMPORARY;
decl.Range.First =
decl.Range.Last = aactx->texTemp;
ctx->emit_declaration(ctx, &decl);
decl = tgsi_default_full_declaration();
decl.Declaration.File = TGSI_FILE_TEMPORARY;
decl.Range.First =
decl.Range.Last = aactx->colorTemp;
ctx->emit_declaration(ctx, &decl);
aactx->firstInstruction = FALSE;
}
if (inst->Instruction.Opcode == TGSI_OPCODE_END &&
aactx->colorOutput != -1) {
struct tgsi_full_instruction newInst;
/* TEX */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_TEX;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Dst[0].Register.Index = aactx->texTemp;
newInst.Instruction.NumSrcRegs = 2;
newInst.Instruction.Texture = TRUE;
newInst.Texture.Texture = TGSI_TEXTURE_2D;
newInst.Src[0].Register.File = TGSI_FILE_INPUT;
newInst.Src[0].Register.Index = aactx->maxInput + 1;
newInst.Src[1].Register.File = TGSI_FILE_SAMPLER;
newInst.Src[1].Register.Index = aactx->freeSampler;
ctx->emit_instruction(ctx, &newInst);
/* MOV rgb */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_MOV;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_OUTPUT;
newInst.Dst[0].Register.Index = aactx->colorOutput;
newInst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_XYZ;
newInst.Instruction.NumSrcRegs = 1;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[0].Register.Index = aactx->colorTemp;
ctx->emit_instruction(ctx, &newInst);
/* MUL alpha */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_MUL;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_OUTPUT;
newInst.Dst[0].Register.Index = aactx->colorOutput;
newInst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_W;
newInst.Instruction.NumSrcRegs = 2;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[0].Register.Index = aactx->colorTemp;
newInst.Src[1].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[1].Register.Index = aactx->texTemp;
ctx->emit_instruction(ctx, &newInst);
/* END */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_END;
newInst.Instruction.NumDstRegs = 0;
newInst.Instruction.NumSrcRegs = 0;
ctx->emit_instruction(ctx, &newInst);
}
else {
/* Not an END instruction.
* Look for writes to result.color and replace with colorTemp reg.
*/
uint i;
for (i = 0; i < inst->Instruction.NumDstRegs; i++) {
struct tgsi_full_dst_register *dst = &inst->Dst[i];
if (dst->Register.File == TGSI_FILE_OUTPUT &&
dst->Register.Index == aactx->colorOutput) {
dst->Register.File = TGSI_FILE_TEMPORARY;
dst->Register.Index = aactx->colorTemp;
}
}
ctx->emit_instruction(ctx, inst);
}
ctx->emit_instruction(ctx, inst);
}
@@ -366,7 +322,8 @@ generate_aaline_fs(struct aaline_stage *aaline)
transform.maxGeneric = -1;
transform.colorTemp = -1;
transform.texTemp = -1;
transform.firstInstruction = TRUE;
transform.base.prolog = aa_transform_prolog;
transform.base.epilog = aa_transform_epilog;
transform.base.transform_instruction = aa_transform_inst;
transform.base.transform_declaration = aa_transform_decl;
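
Both the old firstInstruction branch and the new aa_transform_prolog() pick their temporaries by scanning the tempsUsed bitmask for the two lowest clear bits. A standalone sketch of that scan is given below. The function name find_two_free_temps is hypothetical, and the sketch uses an unsigned 1 for the shift, which differs slightly from the `1 << i` in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Scan a 32-bit "temps in use" mask and pick the two lowest free
 * register indices.  A slot is left at -1 if no free register exists
 * for it.
 */
static void
find_two_free_temps(uint32_t temps_used, int *first, int *second)
{
   unsigned i;

   *first = -1;
   *second = -1;

   for (i = 0; i < 32; i++) {
      if ((temps_used & (UINT32_C(1) << i)) == 0) {
         /* found a free temp */
         if (*first < 0)
            *first = (int) i;
         else if (*second < 0)
            *second = (int) i;
         else
            break;
      }
   }
}
```

A caller would check both results for -1 before declaring the registers, which is the role the assert() calls play in the prolog above.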


@@ -121,7 +121,6 @@ struct aa_transform_context {
int colorOutput; /**< which output is the primary color */
int maxInput, maxGeneric; /**< max input index found */
int tmp0, colorTemp; /**< temp registers */
boolean firstInstruction;
};
@@ -161,325 +160,188 @@ aa_transform_decl(struct tgsi_transform_context *ctx,
/**
* TGSI instruction transform callback.
* TGSI transform callback.
* Insert new declarations and instructions before first instruction.
*/
static void
aa_transform_prolog(struct tgsi_transform_context *ctx)
{
/* emit our new declarations before the first instruction */
struct aa_transform_context *aactx = (struct aa_transform_context *) ctx;
struct tgsi_full_instruction newInst;
const int texInput = aactx->maxInput + 1;
int tmp0;
uint i;
/* find two free temp regs */
for (i = 0; i < 32; i++) {
if ((aactx->tempsUsed & (1 << i)) == 0) {
/* found a free temp */
if (aactx->tmp0 < 0)
aactx->tmp0 = i;
else if (aactx->colorTemp < 0)
aactx->colorTemp = i;
else
break;
}
}
assert(aactx->colorTemp != aactx->tmp0);
tmp0 = aactx->tmp0;
/* declare new generic input/texcoord */
tgsi_transform_input_decl(ctx, texInput,
TGSI_SEMANTIC_GENERIC, aactx->maxGeneric + 1,
TGSI_INTERPOLATE_LINEAR);
/* declare new temp regs */
tgsi_transform_temp_decl(ctx, tmp0);
tgsi_transform_temp_decl(ctx, aactx->colorTemp);
/*
* Emit code to compute fragment coverage, kill if outside point radius
*
* Temp reg0 usage:
* t0.x = distance of fragment from center point
* t0.y = boolean, is t0.x > 1.0, also misc temp usage
* t0.z = temporary for computing 1/(1-k) value
* t0.w = final coverage value
*/
/* MUL t0.xy, tex, tex; # compute x^2, y^2 */
tgsi_transform_op2_inst(ctx, TGSI_OPCODE_MUL,
TGSI_FILE_TEMPORARY, tmp0, TGSI_WRITEMASK_XY,
TGSI_FILE_INPUT, texInput,
TGSI_FILE_INPUT, texInput);
/* ADD t0.x, t0.x, t0.y; # x^2 + y^2 */
tgsi_transform_op2_swz_inst(ctx, TGSI_OPCODE_ADD,
TGSI_FILE_TEMPORARY, tmp0, TGSI_WRITEMASK_X,
TGSI_FILE_TEMPORARY, tmp0, TGSI_SWIZZLE_X,
TGSI_FILE_TEMPORARY, tmp0, TGSI_SWIZZLE_Y);
#if NORMALIZE /* OPTIONAL normalization of length */
/* RSQ t0.x, t0.x; */
tgsi_transform_op1_inst(ctx, TGSI_OPCODE_RSQ,
TGSI_FILE_TEMPORARY, tmp0, TGSI_WRITEMASK_X,
TGSI_FILE_TEMPORARY, tmp0);
/* RCP t0.x, t0.x; */
tgsi_transform_op1_inst(ctx, TGSI_OPCODE_RCP,
TGSI_FILE_TEMPORARY, tmp0, TGSI_WRITEMASK_X,
TGSI_FILE_TEMPORARY, tmp0);
#endif
/* SGT t0.y, t0.xxxx, tex.wwww; # bool b = d > 1 (NOTE tex.w == 1) */
tgsi_transform_op2_swz_inst(ctx, TGSI_OPCODE_SGT,
TGSI_FILE_TEMPORARY, tmp0, TGSI_WRITEMASK_Y,
TGSI_FILE_TEMPORARY, tmp0, TGSI_SWIZZLE_X,
TGSI_FILE_INPUT, texInput, TGSI_SWIZZLE_W);
/* KILL_IF -tmp0.yyyy; # if -tmp0.y < 0, KILL */
tgsi_transform_kill_inst(ctx, TGSI_FILE_TEMPORARY, tmp0, TGSI_SWIZZLE_Y);
/* compute coverage factor = (1-d)/(1-k) */
/* SUB t0.z, tex.w, tex.z; # m = 1 - k */
tgsi_transform_op2_swz_inst(ctx, TGSI_OPCODE_SUB,
TGSI_FILE_TEMPORARY, tmp0, TGSI_WRITEMASK_Z,
TGSI_FILE_INPUT, texInput, TGSI_SWIZZLE_W,
TGSI_FILE_INPUT, texInput, TGSI_SWIZZLE_Z);
/* RCP t0.z, t0.z; # t0.z = 1 / m */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_RCP;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Dst[0].Register.Index = tmp0;
newInst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_Z;
newInst.Instruction.NumSrcRegs = 1;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[0].Register.Index = tmp0;
newInst.Src[0].Register.SwizzleX = TGSI_SWIZZLE_Z;
ctx->emit_instruction(ctx, &newInst);
/* SUB t0.y, 1, t0.x; # d = 1 - d */
tgsi_transform_op2_swz_inst(ctx, TGSI_OPCODE_SUB,
TGSI_FILE_TEMPORARY, tmp0, TGSI_WRITEMASK_Y,
TGSI_FILE_INPUT, texInput, TGSI_SWIZZLE_W,
TGSI_FILE_TEMPORARY, tmp0, TGSI_SWIZZLE_X);
/* MUL t0.w, t0.y, t0.z; # coverage = d * m */
tgsi_transform_op2_swz_inst(ctx, TGSI_OPCODE_MUL,
TGSI_FILE_TEMPORARY, tmp0, TGSI_WRITEMASK_W,
TGSI_FILE_TEMPORARY, tmp0, TGSI_SWIZZLE_Y,
TGSI_FILE_TEMPORARY, tmp0, TGSI_SWIZZLE_Z);
/* SLE t0.y, t0.x, tex.z; # bool b = distance <= k */
tgsi_transform_op2_swz_inst(ctx, TGSI_OPCODE_SLE,
TGSI_FILE_TEMPORARY, tmp0, TGSI_WRITEMASK_Y,
TGSI_FILE_TEMPORARY, tmp0, TGSI_SWIZZLE_X,
TGSI_FILE_INPUT, texInput, TGSI_SWIZZLE_Z);
/* CMP t0.w, -t0.y, tex.w, t0.w;
* # if -t0.y < 0 then
* t0.w = 1
* else
* t0.w = t0.w
*/
tgsi_transform_op3_swz_inst(ctx, TGSI_OPCODE_CMP,
TGSI_FILE_TEMPORARY, tmp0, TGSI_WRITEMASK_W,
TGSI_FILE_TEMPORARY, tmp0, TGSI_SWIZZLE_Y, 1,
TGSI_FILE_INPUT, texInput, TGSI_SWIZZLE_W,
TGSI_FILE_TEMPORARY, tmp0, TGSI_SWIZZLE_W);
}
/**
* TGSI transform callback.
* Insert new instructions before the END instruction.
*/
static void
aa_transform_epilog(struct tgsi_transform_context *ctx)
{
struct aa_transform_context *aactx = (struct aa_transform_context *) ctx;
/* add alpha modulation code at tail of program */
/* MOV result.color.xyz, colorTemp; */
tgsi_transform_op1_inst(ctx, TGSI_OPCODE_MOV,
TGSI_FILE_OUTPUT, aactx->colorOutput,
TGSI_WRITEMASK_XYZ,
TGSI_FILE_TEMPORARY, aactx->colorTemp);
/* MUL result.color.w, colorTemp, tmp0.w; */
tgsi_transform_op2_inst(ctx, TGSI_OPCODE_MUL,
TGSI_FILE_OUTPUT, aactx->colorOutput,
TGSI_WRITEMASK_W,
TGSI_FILE_TEMPORARY, aactx->colorTemp,
TGSI_FILE_TEMPORARY, aactx->tmp0);
}
/**
* TGSI transform callback.
* Called per instruction.
* Replace writes to result.color w/ a temp reg.
* Upon END instruction, insert texture sampling code for antialiasing.
*/
static void
aa_transform_inst(struct tgsi_transform_context *ctx,
struct tgsi_full_instruction *inst)
{
struct aa_transform_context *aactx = (struct aa_transform_context *) ctx;
struct tgsi_full_instruction newInst;
unsigned i;
if (aactx->firstInstruction) {
/* emit our new declarations before the first instruction */
struct tgsi_full_declaration decl;
const int texInput = aactx->maxInput + 1;
int tmp0;
uint i;
/* find two free temp regs */
for (i = 0; i < 32; i++) {
if ((aactx->tempsUsed & (1 << i)) == 0) {
/* found a free temp */
if (aactx->tmp0 < 0)
aactx->tmp0 = i;
else if (aactx->colorTemp < 0)
aactx->colorTemp = i;
else
break;
}
}
assert(aactx->colorTemp != aactx->tmp0);
tmp0 = aactx->tmp0;
/* declare new generic input/texcoord */
decl = tgsi_default_full_declaration();
decl.Declaration.File = TGSI_FILE_INPUT;
/* XXX this could be linear... */
decl.Declaration.Interpolate = 1;
decl.Declaration.Semantic = 1;
decl.Semantic.Name = TGSI_SEMANTIC_GENERIC;
decl.Semantic.Index = aactx->maxGeneric + 1;
decl.Range.First =
decl.Range.Last = texInput;
decl.Interp.Interpolate = TGSI_INTERPOLATE_PERSPECTIVE;
ctx->emit_declaration(ctx, &decl);
/* declare new temp regs */
decl = tgsi_default_full_declaration();
decl.Declaration.File = TGSI_FILE_TEMPORARY;
decl.Range.First =
decl.Range.Last = tmp0;
ctx->emit_declaration(ctx, &decl);
decl = tgsi_default_full_declaration();
decl.Declaration.File = TGSI_FILE_TEMPORARY;
decl.Range.First =
decl.Range.Last = aactx->colorTemp;
ctx->emit_declaration(ctx, &decl);
aactx->firstInstruction = FALSE;
/*
* Emit code to compute fragment coverage, kill if outside point radius
*
* Temp reg0 usage:
* t0.x = distance of fragment from center point
* t0.y = boolean, is t0.x > 1.0, also misc temp usage
* t0.z = temporary for computing 1/(1-k) value
* t0.w = final coverage value
*/
/* MUL t0.xy, tex, tex; # compute x^2, y^2 */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_MUL;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Dst[0].Register.Index = tmp0;
newInst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_XY;
newInst.Instruction.NumSrcRegs = 2;
newInst.Src[0].Register.File = TGSI_FILE_INPUT;
newInst.Src[0].Register.Index = texInput;
newInst.Src[1].Register.File = TGSI_FILE_INPUT;
newInst.Src[1].Register.Index = texInput;
ctx->emit_instruction(ctx, &newInst);
/* ADD t0.x, t0.x, t0.y; # x^2 + y^2 */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_ADD;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Dst[0].Register.Index = tmp0;
newInst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_X;
newInst.Instruction.NumSrcRegs = 2;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[0].Register.Index = tmp0;
newInst.Src[0].Register.SwizzleX = TGSI_SWIZZLE_X;
newInst.Src[1].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[1].Register.Index = tmp0;
newInst.Src[1].Register.SwizzleX = TGSI_SWIZZLE_Y;
ctx->emit_instruction(ctx, &newInst);
#if NORMALIZE /* OPTIONAL normalization of length */
/* RSQ t0.x, t0.x; */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_RSQ;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Dst[0].Register.Index = tmp0;
newInst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_X;
newInst.Instruction.NumSrcRegs = 1;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[0].Register.Index = tmp0;
ctx->emit_instruction(ctx, &newInst);
/* RCP t0.x, t0.x; */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_RCP;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Dst[0].Register.Index = tmp0;
newInst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_X;
newInst.Instruction.NumSrcRegs = 1;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[0].Register.Index = tmp0;
ctx->emit_instruction(ctx, &newInst);
#endif
/* SGT t0.y, t0.xxxx, tex.wwww; # bool b = d > 1 (NOTE tex.w == 1) */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_SGT;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Dst[0].Register.Index = tmp0;
newInst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_Y;
newInst.Instruction.NumSrcRegs = 2;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[0].Register.Index = tmp0;
newInst.Src[0].Register.SwizzleY = TGSI_SWIZZLE_X;
newInst.Src[1].Register.File = TGSI_FILE_INPUT;
newInst.Src[1].Register.Index = texInput;
newInst.Src[1].Register.SwizzleY = TGSI_SWIZZLE_W;
ctx->emit_instruction(ctx, &newInst);
/* KILL_IF -tmp0.yyyy; # if -tmp0.y < 0, KILL */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_KILL_IF;
newInst.Instruction.NumDstRegs = 0;
newInst.Instruction.NumSrcRegs = 1;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[0].Register.Index = tmp0;
newInst.Src[0].Register.SwizzleX = TGSI_SWIZZLE_Y;
newInst.Src[0].Register.SwizzleY = TGSI_SWIZZLE_Y;
newInst.Src[0].Register.SwizzleZ = TGSI_SWIZZLE_Y;
newInst.Src[0].Register.SwizzleW = TGSI_SWIZZLE_Y;
newInst.Src[0].Register.Negate = 1;
ctx->emit_instruction(ctx, &newInst);
/* compute coverage factor = (1-d)/(1-k) */
/* SUB t0.z, tex.w, tex.z; # m = 1 - k */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_SUB;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Dst[0].Register.Index = tmp0;
newInst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_Z;
newInst.Instruction.NumSrcRegs = 2;
newInst.Src[0].Register.File = TGSI_FILE_INPUT;
newInst.Src[0].Register.Index = texInput;
newInst.Src[0].Register.SwizzleZ = TGSI_SWIZZLE_W;
newInst.Src[1].Register.File = TGSI_FILE_INPUT;
newInst.Src[1].Register.Index = texInput;
newInst.Src[1].Register.SwizzleZ = TGSI_SWIZZLE_Z;
ctx->emit_instruction(ctx, &newInst);
/* RCP t0.z, t0.z; # t0.z = 1 / m */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_RCP;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Dst[0].Register.Index = tmp0;
newInst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_Z;
newInst.Instruction.NumSrcRegs = 1;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[0].Register.Index = tmp0;
newInst.Src[0].Register.SwizzleX = TGSI_SWIZZLE_Z;
ctx->emit_instruction(ctx, &newInst);
/* SUB t0.y, 1, t0.x; # d = 1 - d */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_SUB;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Dst[0].Register.Index = tmp0;
newInst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_Y;
newInst.Instruction.NumSrcRegs = 2;
newInst.Src[0].Register.File = TGSI_FILE_INPUT;
newInst.Src[0].Register.Index = texInput;
newInst.Src[0].Register.SwizzleY = TGSI_SWIZZLE_W;
newInst.Src[1].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[1].Register.Index = tmp0;
newInst.Src[1].Register.SwizzleY = TGSI_SWIZZLE_X;
ctx->emit_instruction(ctx, &newInst);
/* MUL t0.w, t0.y, t0.z; # coverage = d * m */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_MUL;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Dst[0].Register.Index = tmp0;
newInst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_W;
newInst.Instruction.NumSrcRegs = 2;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[0].Register.Index = tmp0;
newInst.Src[0].Register.SwizzleW = TGSI_SWIZZLE_Y;
newInst.Src[1].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[1].Register.Index = tmp0;
newInst.Src[1].Register.SwizzleW = TGSI_SWIZZLE_Z;
ctx->emit_instruction(ctx, &newInst);
/* SLE t0.y, t0.x, tex.z; # bool b = distance <= k */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_SLE;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Dst[0].Register.Index = tmp0;
newInst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_Y;
newInst.Instruction.NumSrcRegs = 2;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[0].Register.Index = tmp0;
newInst.Src[0].Register.SwizzleY = TGSI_SWIZZLE_X;
newInst.Src[1].Register.File = TGSI_FILE_INPUT;
newInst.Src[1].Register.Index = texInput;
newInst.Src[1].Register.SwizzleY = TGSI_SWIZZLE_Z;
ctx->emit_instruction(ctx, &newInst);
/* CMP t0.w, -t0.y, tex.w, t0.w;
* # if -t0.y < 0 then
* t0.w = 1
* else
* t0.w = t0.w
*/
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_CMP;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Dst[0].Register.Index = tmp0;
newInst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_W;
newInst.Instruction.NumSrcRegs = 3;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[0].Register.Index = tmp0;
newInst.Src[0].Register.SwizzleX = TGSI_SWIZZLE_Y;
newInst.Src[0].Register.SwizzleY = TGSI_SWIZZLE_Y;
newInst.Src[0].Register.SwizzleZ = TGSI_SWIZZLE_Y;
newInst.Src[0].Register.SwizzleW = TGSI_SWIZZLE_Y;
newInst.Src[0].Register.Negate = 1;
newInst.Src[1].Register.File = TGSI_FILE_INPUT;
newInst.Src[1].Register.Index = texInput;
newInst.Src[1].Register.SwizzleX = TGSI_SWIZZLE_W;
newInst.Src[1].Register.SwizzleY = TGSI_SWIZZLE_W;
newInst.Src[1].Register.SwizzleZ = TGSI_SWIZZLE_W;
newInst.Src[1].Register.SwizzleW = TGSI_SWIZZLE_W;
newInst.Src[2].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[2].Register.Index = tmp0;
newInst.Src[2].Register.SwizzleX = TGSI_SWIZZLE_W;
newInst.Src[2].Register.SwizzleY = TGSI_SWIZZLE_W;
newInst.Src[2].Register.SwizzleZ = TGSI_SWIZZLE_W;
newInst.Src[2].Register.SwizzleW = TGSI_SWIZZLE_W;
ctx->emit_instruction(ctx, &newInst);
}
if (inst->Instruction.Opcode == TGSI_OPCODE_END) {
/* add alpha modulation code at tail of program */
/* MOV result.color.xyz, colorTemp; */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_MOV;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_OUTPUT;
newInst.Dst[0].Register.Index = aactx->colorOutput;
newInst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_XYZ;
newInst.Instruction.NumSrcRegs = 1;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[0].Register.Index = aactx->colorTemp;
ctx->emit_instruction(ctx, &newInst);
/* MUL result.color.w, colorTemp, tmp0.w; */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_MUL;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_OUTPUT;
newInst.Dst[0].Register.Index = aactx->colorOutput;
newInst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_W;
newInst.Instruction.NumSrcRegs = 2;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[0].Register.Index = aactx->colorTemp;
newInst.Src[1].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[1].Register.Index = aactx->tmp0;
ctx->emit_instruction(ctx, &newInst);
}
else {
/* Not an END instruction.
* Look for writes to result.color and replace with colorTemp reg.
*/
uint i;
for (i = 0; i < inst->Instruction.NumDstRegs; i++) {
struct tgsi_full_dst_register *dst = &inst->Dst[i];
if (dst->Register.File == TGSI_FILE_OUTPUT &&
dst->Register.Index == aactx->colorOutput) {
dst->Register.File = TGSI_FILE_TEMPORARY;
dst->Register.Index = aactx->colorTemp;
}
/* Not an END instruction.
* Look for writes to result.color and replace with colorTemp reg.
*/
for (i = 0; i < inst->Instruction.NumDstRegs; i++) {
struct tgsi_full_dst_register *dst = &inst->Dst[i];
if (dst->Register.File == TGSI_FILE_OUTPUT &&
dst->Register.Index == aactx->colorOutput) {
dst->Register.File = TGSI_FILE_TEMPORARY;
dst->Register.Index = aactx->colorTemp;
}
}
@@ -511,7 +373,8 @@ generate_aapoint_fs(struct aapoint_stage *aapoint)
transform.maxGeneric = -1;
transform.colorTemp = -1;
transform.tmp0 = -1;
transform.firstInstruction = TRUE;
transform.base.prolog = aa_transform_prolog;
transform.base.epilog = aa_transform_epilog;
transform.base.transform_instruction = aa_transform_inst;
transform.base.transform_declaration = aa_transform_decl;
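
The TGSI sequence emitted by the aapoint prolog is easier to follow as scalar code. The sketch below restates it under the assumptions given in the shader comments: tex.xy is the fragment offset from the point center, tex.z is the threshold k, tex.w is 1, and NORMALIZE is off so the "distance" is x^2 + y^2. The function name aapoint_coverage is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Scalar restatement of the coverage code emitted in the prolog.
 * Returns false if the fragment would be killed; otherwise writes the
 * coverage factor that later scales the output alpha.
 */
static bool
aapoint_coverage(float x, float y, float k, float *coverage)
{
   /* MUL + ADD: t0.x = x^2 + y^2 */
   float d = x * x + y * y;

   /* SGT + KILL_IF: discard fragments outside the unit radius */
   if (d > 1.0f)
      return false;

   /* SLE + CMP: fully covered at or inside the threshold k.
    * Testing this first also avoids dividing by zero when k == 1. */
   if (d <= k)
      *coverage = 1.0f;
   else
      *coverage = (1.0f - d) / (1.0f - k);   /* SUB, RCP, SUB, MUL */

   return true;
}
```

The epilog then multiplies the shader's alpha by this value (MUL result.color.w, colorTemp, tmp0.w), so coverage falls off linearly in d between k and 1.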


@@ -47,10 +47,6 @@
#define DEBUG_CLIP 0
#ifndef IS_NEGATIVE
#define IS_NEGATIVE(X) ((X) < 0.0)
#endif
#ifndef DIFFERENT_SIGNS
#define DIFFERENT_SIGNS(x, y) ((x) * (y) <= 0.0F && (x) - (y) != 0.0F)
#endif
@@ -437,7 +433,7 @@ do_clip_tri( struct draw_stage *stage,
if (util_is_inf_or_nan(dp))
return; //discard nan
if (!IS_NEGATIVE(dp_prev)) {
if (dp_prev >= 0.0f) {
assert(outcount < MAX_CLIPPED_VERTICES);
if (outcount >= MAX_CLIPPED_VERTICES)
return;
@@ -461,7 +457,7 @@ do_clip_tri( struct draw_stage *stage,
new_edge = &outEdges[outcount];
outlist[outcount++] = new_vert;
if (IS_NEGATIVE(dp)) {
if (dp < 0.0f) {
/* Going out of bounds. Avoid division by zero as we
* know dp != dp_prev from DIFFERENT_SIGNS, above.
*/
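
The comment above relies on DIFFERENT_SIGNS implying dp != dp_prev, which is what makes the division in the intersection step safe. The sketch below restates the macro as a function and shows one way to compute the interpolation parameter. clip_edge_t is a hypothetical helper, not a function from the clipper, and the exact expression used in do_clip_tri() is not visible in this hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Same test as the DIFFERENT_SIGNS macro, written as a function. */
static bool
different_signs(float x, float y)
{
   return x * y <= 0.0f && x - y != 0.0f;
}

/* Hypothetical helper: parameter t in [0, 1], measured from the
 * endpoint whose plane distance is dp_from, at which the linearly
 * interpolated distance reaches zero.
 */
static float
clip_edge_t(float dp_from, float dp_to)
{
   /* different_signs() guarantees dp_from - dp_to != 0. */
   assert(different_signs(dp_from, dp_to));
   return dp_from / (dp_from - dp_to);
}
```

Note that the macro also reports a sign change when exactly one distance is zero, while two zero distances count as the same sign.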


@@ -129,7 +129,6 @@ struct pstip_transform_context {
int freeSampler; /** an available sampler for the pstipple */
int texTemp; /**< temp registers */
int numImmed;
boolean firstInstruction;
};
@@ -192,147 +191,85 @@ free_bit(uint bitfield)
/**
* TGSI instruction transform callback.
* Replace writes to result.color w/ a temp reg.
* Upon END instruction, insert texture sampling code for antialiasing.
* TGSI transform prolog callback.
*/
static void
pstip_transform_inst(struct tgsi_transform_context *ctx,
struct tgsi_full_instruction *inst)
pstip_transform_prolog(struct tgsi_transform_context *ctx)
{
struct pstip_transform_context *pctx = (struct pstip_transform_context *) ctx;
uint i;
int wincoordInput;
if (pctx->firstInstruction) {
/* emit our new declarations before the first instruction */
/* find free sampler */
pctx->freeSampler = free_bit(pctx->samplersUsed);
if (pctx->freeSampler >= PIPE_MAX_SAMPLERS)
pctx->freeSampler = PIPE_MAX_SAMPLERS - 1;
struct tgsi_full_declaration decl;
struct tgsi_full_instruction newInst;
uint i;
int wincoordInput;
if (pctx->wincoordInput < 0)
wincoordInput = pctx->maxInput + 1;
else
wincoordInput = pctx->wincoordInput;
/* find free sampler */
pctx->freeSampler = free_bit(pctx->samplersUsed);
if (pctx->freeSampler >= PIPE_MAX_SAMPLERS)
pctx->freeSampler = PIPE_MAX_SAMPLERS - 1;
if (pctx->wincoordInput < 0)
wincoordInput = pctx->maxInput + 1;
/* find one free temp reg */
for (i = 0; i < 32; i++) {
if ((pctx->tempsUsed & (1 << i)) == 0) {
/* found a free temp */
if (pctx->texTemp < 0)
pctx->texTemp = i;
else
wincoordInput = pctx->wincoordInput;
/* find one free temp reg */
for (i = 0; i < 32; i++) {
if ((pctx->tempsUsed & (1 << i)) == 0) {
/* found a free temp */
if (pctx->texTemp < 0)
pctx->texTemp = i;
else
break;
}
break;
}
assert(pctx->texTemp >= 0);
}
assert(pctx->texTemp >= 0);
if (pctx->wincoordInput < 0) {
/* declare new position input reg */
decl = tgsi_default_full_declaration();
decl.Declaration.File = TGSI_FILE_INPUT;
decl.Declaration.Interpolate = 1;
decl.Declaration.Semantic = 1;
decl.Semantic.Name = TGSI_SEMANTIC_POSITION;
decl.Semantic.Index = 0;
decl.Range.First =
decl.Range.Last = wincoordInput;
decl.Interp.Interpolate = TGSI_INTERPOLATE_LINEAR; /* XXX? */
ctx->emit_declaration(ctx, &decl);
}
/* declare new sampler */
decl = tgsi_default_full_declaration();
decl.Declaration.File = TGSI_FILE_SAMPLER;
decl.Range.First =
decl.Range.Last = pctx->freeSampler;
ctx->emit_declaration(ctx, &decl);
/* declare new temp regs */
decl = tgsi_default_full_declaration();
decl.Declaration.File = TGSI_FILE_TEMPORARY;
decl.Range.First =
decl.Range.Last = pctx->texTemp;
ctx->emit_declaration(ctx, &decl);
/* emit immediate = {1/32, 1/32, 1, 1}
* The index/position of this immediate will be pctx->numImmed
*/
{
static const float value[4] = { 1.0/32, 1.0/32, 1.0, 1.0 };
struct tgsi_full_immediate immed;
uint size = 4;
immed = tgsi_default_full_immediate();
immed.Immediate.NrTokens = 1 + size; /* one for the token itself */
immed.u[0].Float = value[0];
immed.u[1].Float = value[1];
immed.u[2].Float = value[2];
immed.u[3].Float = value[3];
ctx->emit_immediate(ctx, &immed);
}
pctx->firstInstruction = FALSE;
/*
* Insert new MUL/TEX/KILL_IF instructions at start of program
* Take gl_FragCoord, divide by 32 (stipple size), sample the
* texture and kill fragment if needed.
*
* We'd like to use non-normalized texcoords to index into a RECT
* texture, but we can only use GL_REPEAT wrap mode with normalized
* texcoords. Darn.
*/
/* MUL texTemp, INPUT[wincoord], 1/32; */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_MUL;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Dst[0].Register.Index = pctx->texTemp;
newInst.Instruction.NumSrcRegs = 2;
newInst.Src[0].Register.File = TGSI_FILE_INPUT;
newInst.Src[0].Register.Index = wincoordInput;
newInst.Src[1].Register.File = TGSI_FILE_IMMEDIATE;
newInst.Src[1].Register.Index = pctx->numImmed;
ctx->emit_instruction(ctx, &newInst);
/* TEX texTemp, texTemp, sampler; */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_TEX;
newInst.Instruction.NumDstRegs = 1;
newInst.Dst[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Dst[0].Register.Index = pctx->texTemp;
newInst.Instruction.NumSrcRegs = 2;
newInst.Instruction.Texture = TRUE;
newInst.Texture.Texture = TGSI_TEXTURE_2D;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[0].Register.Index = pctx->texTemp;
newInst.Src[1].Register.File = TGSI_FILE_SAMPLER;
newInst.Src[1].Register.Index = pctx->freeSampler;
ctx->emit_instruction(ctx, &newInst);
/* KILL_IF -texTemp; # if -texTemp < 0, KILL fragment */
newInst = tgsi_default_full_instruction();
newInst.Instruction.Opcode = TGSI_OPCODE_KILL_IF;
newInst.Instruction.NumDstRegs = 0;
newInst.Instruction.NumSrcRegs = 1;
newInst.Src[0].Register.File = TGSI_FILE_TEMPORARY;
newInst.Src[0].Register.Index = pctx->texTemp;
newInst.Src[0].Register.Negate = 1;
ctx->emit_instruction(ctx, &newInst);
if (pctx->wincoordInput < 0) {
/* declare new position input reg */
tgsi_transform_input_decl(ctx, wincoordInput,
TGSI_SEMANTIC_POSITION, 1,
TGSI_INTERPOLATE_LINEAR);
}
/* emit this instruction */
ctx->emit_instruction(ctx, inst);
/* declare new sampler */
tgsi_transform_sampler_decl(ctx, pctx->freeSampler);
/* declare new temp regs */
tgsi_transform_temp_decl(ctx, pctx->texTemp);
/* emit immediate = {1/32, 1/32, 1, 1}
* The index/position of this immediate will be pctx->numImmed
*/
tgsi_transform_immediate_decl(ctx, 1.0/32.0, 1.0/32.0, 1.0, 1.0);
/*
* Insert new MUL/TEX/KILL_IF instructions at start of program
* Take gl_FragCoord, divide by 32 (stipple size), sample the
* texture and kill fragment if needed.
*
* We'd like to use non-normalized texcoords to index into a RECT
* texture, but we can only use GL_REPEAT wrap mode with normalized
* texcoords. Darn.
*/
/* MUL texTemp, INPUT[wincoord], 1/32; */
tgsi_transform_op2_inst(ctx, TGSI_OPCODE_MUL,
TGSI_FILE_TEMPORARY, pctx->texTemp,
TGSI_WRITEMASK_XYZW,
TGSI_FILE_INPUT, wincoordInput,
TGSI_FILE_IMMEDIATE, pctx->numImmed);
/* TEX texTemp, texTemp, sampler; */
tgsi_transform_tex_2d_inst(ctx,
TGSI_FILE_TEMPORARY, pctx->texTemp,
TGSI_FILE_TEMPORARY, pctx->texTemp,
pctx->freeSampler);
/* KILL_IF -texTemp.wwww; # if -texTemp < 0, KILL fragment */
tgsi_transform_kill_inst(ctx,
TGSI_FILE_TEMPORARY, pctx->texTemp, TGSI_SWIZZLE_W);
}
/**
* Generate the frag shader we'll use for doing polygon stipple.
* This will be the user's shader prefixed with a TEX and KIL instruction.
@@ -355,8 +292,7 @@ generate_pstip_fs(struct pstip_stage *pstip)
transform.wincoordInput = -1;
transform.maxInput = -1;
transform.texTemp = -1;
transform.firstInstruction = TRUE;
transform.base.transform_instruction = pstip_transform_inst;
transform.base.prolog = pstip_transform_prolog;
transform.base.transform_declaration = pstip_transform_decl;
transform.base.transform_immediate = pstip_transform_immed;
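
For orientation, the MUL/TEX/KILL_IF sequence emitted by the prolog above amounts to a per-fragment lookup into a repeating 32x32 pattern. The sketch below restates that test on the CPU; it is not the draw module's code. The function name, the one-row-per-uint32_t layout, and the bit order (bit 31 as the leftmost texel) are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical CPU restatement of the stipple test.
 * pattern[row] holds one 32-pixel row; which bit maps to which column
 * is an assumption (bit 31 = leftmost), not taken from the diff.
 */
static bool
stipple_kills_fragment(const uint32_t pattern[32], int x, int y)
{
   assert(x >= 0 && y >= 0);          /* window coordinates */

   /* "divide by 32" with a repeating wrap mode == take coords modulo 32 */
   unsigned col = (unsigned)x & 31u;
   unsigned row = (unsigned)y & 31u;

   bool bit_set = (pattern[row] >> (31u - col)) & 1u;

   /* the fragment survives only where the pattern bit is set */
   return !bit_set;
}
```

With an all-zero pattern every fragment is killed; with only the top-left bit set, the fragments at (0, 0), (32, 32), and so on survive.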

@@ -192,18 +192,6 @@ llvm_middle_end_prepare( struct draw_pt_middle_end *middle,
*/
fpme->vertex_size = sizeof(struct vertex_header) + nr * 4 * sizeof(float);
/* Get the number of float[4] attributes per vertex.
* Note: this must be done after draw_pt_emit_prepare() since that
* can affect the vertex size.
*/
nr = MAX2(vs->info.num_inputs, draw_total_vs_outputs(draw));
/* Always leave room for the vertex header whether we need it or
* not. It's hard to get rid of it in particular because of the
* viewport code in draw_pt_post_vs.c.
*/
fpme->vertex_size = sizeof(struct vertex_header) + nr * 4 * sizeof(float);
/* return even number */
*max_vertices = *max_vertices & ~1;
@@ -387,7 +375,8 @@ llvm_pipeline_generic(struct draw_pt_middle_end *middle,
fpme->vertex_size,
draw->pt.vertex_buffer,
draw->instance_id,
draw->start_index);
draw->start_index,
draw->start_instance);
else
clipped = fpme->current_variant->jit_func_elts( &fpme->llvm->jit_context,
llvm_vert_info.verts,
@@ -398,7 +387,8 @@ llvm_pipeline_generic(struct draw_pt_middle_end *middle,
fpme->vertex_size,
draw->pt.vertex_buffer,
draw->instance_id,
draw->pt.user.eltBias);
draw->pt.user.eltBias,
draw->start_instance);
/* Finished with fetch and vs:
*/

@@ -53,8 +53,16 @@
#ifndef HAVE_LLVM
#error "HAVE_LLVM should be set with LLVM's version number, e.g. (0x0207 for 2.7)"
#endif
#if HAVE_LLVM < 0x301
#error "LLVM 3.1 or newer required"
#if HAVE_LLVM < 0x303
#error "LLVM 3.3 or newer required"
#endif
#if HAVE_LLVM <= 0x0303
/* We won't actually use LLVMMCJITMemoryManagerRef, just create a dummy
* typedef to simplify things elsewhere.
*/
typedef void *LLVMMCJITMemoryManagerRef;
#endif
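
The version guards above compare HAVE_LLVM against hex constants such as 0x0303. Going by the #error text ("0x0207 for 2.7"), the encoding appears to be the major version in the high byte and the minor version in the low byte. A minimal sketch of that encoding follows; the helper names are hypothetical and do not exist in the tree.

```c
#include <assert.h>

/* Hypothetical helper: pack major.minor the way HAVE_LLVM appears to be encoded. */
static unsigned
llvm_version_code(unsigned major, unsigned minor)
{
   assert(major < 256 && minor < 256);
   return (major << 8) | minor;
}

/* Mirrors the new "LLVM 3.3 or newer required" guard. */
static int
llvm_version_supported(unsigned code)
{
   return code >= 0x0303;
}
```

Note that 0x303 and 0x0303 are the same value, so the two spellings used in the guards are interchangeable.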

@@ -134,7 +134,8 @@ lp_build_min_simple(struct lp_build_context *bld,
}
}
else if (type.floating && util_cpu_caps.has_altivec) {
if (nan_behavior == GALLIVM_NAN_RETURN_NAN) {
if (nan_behavior == GALLIVM_NAN_RETURN_NAN ||
nan_behavior == GALLIVM_NAN_RETURN_NAN_FIRST_NONNAN) {
debug_printf("%s: altivec doesn't support nan return nan behavior\n",
__FUNCTION__);
}
@@ -202,18 +203,19 @@ lp_build_min_simple(struct lp_build_context *bld,
*/
if (util_cpu_caps.has_sse && type.floating &&
nan_behavior != GALLIVM_NAN_BEHAVIOR_UNDEFINED &&
nan_behavior != GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN) {
LLVMValueRef isnan, max;
max = lp_build_intrinsic_binary_anylength(bld->gallivm, intrinsic,
nan_behavior != GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN &&
nan_behavior != GALLIVM_NAN_RETURN_NAN_FIRST_NONNAN) {
LLVMValueRef isnan, min;
min = lp_build_intrinsic_binary_anylength(bld->gallivm, intrinsic,
type,
intr_size, a, b);
if (nan_behavior == GALLIVM_NAN_RETURN_OTHER) {
isnan = lp_build_isnan(bld, b);
return lp_build_select(bld, isnan, a, max);
return lp_build_select(bld, isnan, a, min);
} else {
assert(nan_behavior == GALLIVM_NAN_RETURN_NAN);
isnan = lp_build_isnan(bld, a);
return lp_build_select(bld, isnan, a, max);
return lp_build_select(bld, isnan, a, min);
}
} else {
return lp_build_intrinsic_binary_anylength(bld->gallivm, intrinsic,
@@ -241,6 +243,9 @@ lp_build_min_simple(struct lp_build_context *bld,
case GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN:
cond = lp_build_cmp_ordered(bld, PIPE_FUNC_LESS, a, b);
return lp_build_select(bld, cond, a, b);
case GALLIVM_NAN_RETURN_NAN_FIRST_NONNAN:
cond = lp_build_cmp(bld, PIPE_FUNC_LESS, b, a);
return lp_build_select(bld, cond, b, a);
case GALLIVM_NAN_BEHAVIOR_UNDEFINED:
cond = lp_build_cmp(bld, PIPE_FUNC_LESS, a, b);
return lp_build_select(bld, cond, a, b);
@@ -310,7 +315,8 @@ lp_build_max_simple(struct lp_build_context *bld,
}
}
else if (type.floating && util_cpu_caps.has_altivec) {
if (nan_behavior == GALLIVM_NAN_RETURN_NAN) {
if (nan_behavior == GALLIVM_NAN_RETURN_NAN ||
nan_behavior == GALLIVM_NAN_RETURN_NAN_FIRST_NONNAN) {
debug_printf("%s: altivec doesn't support nan return nan behavior\n",
__FUNCTION__);
}
@@ -373,18 +379,19 @@ lp_build_max_simple(struct lp_build_context *bld,
if(intrinsic) {
if (util_cpu_caps.has_sse && type.floating &&
nan_behavior != GALLIVM_NAN_BEHAVIOR_UNDEFINED &&
nan_behavior != GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN) {
LLVMValueRef isnan, min;
min = lp_build_intrinsic_binary_anylength(bld->gallivm, intrinsic,
nan_behavior != GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN &&
nan_behavior != GALLIVM_NAN_RETURN_NAN_FIRST_NONNAN) {
LLVMValueRef isnan, max;
max = lp_build_intrinsic_binary_anylength(bld->gallivm, intrinsic,
type,
intr_size, a, b);
if (nan_behavior == GALLIVM_NAN_RETURN_OTHER) {
isnan = lp_build_isnan(bld, b);
return lp_build_select(bld, isnan, a, min);
return lp_build_select(bld, isnan, a, max);
} else {
assert(nan_behavior == GALLIVM_NAN_RETURN_NAN);
isnan = lp_build_isnan(bld, a);
return lp_build_select(bld, isnan, a, min);
return lp_build_select(bld, isnan, a, max);
}
} else {
return lp_build_intrinsic_binary_anylength(bld->gallivm, intrinsic,
@@ -412,6 +419,9 @@ lp_build_max_simple(struct lp_build_context *bld,
case GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN:
cond = lp_build_cmp_ordered(bld, PIPE_FUNC_GREATER, a, b);
return lp_build_select(bld, cond, a, b);
case GALLIVM_NAN_RETURN_NAN_FIRST_NONNAN:
cond = lp_build_cmp(bld, PIPE_FUNC_GREATER, b, a);
return lp_build_select(bld, cond, b, a);
case GALLIVM_NAN_BEHAVIOR_UNDEFINED:
cond = lp_build_cmp(bld, PIPE_FUNC_GREATER, a, b);
return lp_build_select(bld, cond, a, b);
@@ -512,9 +522,20 @@ lp_build_add(struct lp_build_context *bld,
return lp_build_intrinsic_binary(builder, intrinsic, lp_build_vec_type(bld->gallivm, bld->type), a, b);
}
/* TODO: handle signed case */
if(type.norm && !type.floating && !type.fixed && !type.sign)
a = lp_build_min_simple(bld, a, lp_build_comp(bld, b), GALLIVM_NAN_BEHAVIOR_UNDEFINED);
if(type.norm && !type.floating && !type.fixed) {
if (type.sign) {
uint64_t sign = (uint64_t)1 << (type.width - 1);
LLVMValueRef max_val = lp_build_const_int_vec(bld->gallivm, type, sign - 1);
LLVMValueRef min_val = lp_build_const_int_vec(bld->gallivm, type, sign);
/* a_clamp_max is the maximum a for positive b,
a_clamp_min is the minimum a for negative b. */
LLVMValueRef a_clamp_max = lp_build_min_simple(bld, a, LLVMBuildSub(builder, max_val, b, ""), GALLIVM_NAN_BEHAVIOR_UNDEFINED);
LLVMValueRef a_clamp_min = lp_build_max_simple(bld, a, LLVMBuildSub(builder, min_val, b, ""), GALLIVM_NAN_BEHAVIOR_UNDEFINED);
a = lp_build_select(bld, lp_build_cmp(bld, PIPE_FUNC_GREATER, b, bld->zero), a_clamp_max, a_clamp_min);
} else {
a = lp_build_min_simple(bld, a, lp_build_comp(bld, b), GALLIVM_NAN_BEHAVIOR_UNDEFINED);
}
}
if(LLVMIsConstant(a) && LLVMIsConstant(b))
if (type.floating)
@@ -793,9 +814,20 @@ lp_build_sub(struct lp_build_context *bld,
return lp_build_intrinsic_binary(builder, intrinsic, lp_build_vec_type(bld->gallivm, bld->type), a, b);
}
/* TODO: handle signed case */
if(type.norm && !type.floating && !type.fixed && !type.sign)
a = lp_build_max_simple(bld, a, b, GALLIVM_NAN_BEHAVIOR_UNDEFINED);
if(type.norm && !type.floating && !type.fixed) {
if (type.sign) {
uint64_t sign = (uint64_t)1 << (type.width - 1);
LLVMValueRef max_val = lp_build_const_int_vec(bld->gallivm, type, sign - 1);
LLVMValueRef min_val = lp_build_const_int_vec(bld->gallivm, type, sign);
/* a_clamp_max is the maximum a for negative b,
a_clamp_min is the minimum a for positive b. */
LLVMValueRef a_clamp_max = lp_build_min_simple(bld, a, LLVMBuildAdd(builder, max_val, b, ""), GALLIVM_NAN_BEHAVIOR_UNDEFINED);
LLVMValueRef a_clamp_min = lp_build_max_simple(bld, a, LLVMBuildAdd(builder, min_val, b, ""), GALLIVM_NAN_BEHAVIOR_UNDEFINED);
a = lp_build_select(bld, lp_build_cmp(bld, PIPE_FUNC_GREATER, b, bld->zero), a_clamp_min, a_clamp_max);
} else {
a = lp_build_max_simple(bld, a, b, GALLIVM_NAN_BEHAVIOR_UNDEFINED);
}
}
if(LLVMIsConstant(a) && LLVMIsConstant(b))
if (type.floating)
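
The signed-normalized paths added to lp_build_add and lp_build_sub clamp a before the operation so that the result cannot overflow. The scalar sketch below shows the same idea for 8-bit values; it is an illustration under the assumption of an 8-bit signed type, and the function names are made up here.

```c
#include <assert.h>
#include <stdint.h>

/* Saturating add: clamp a into the range where a + b cannot overflow. */
static int8_t
sat_add_s8(int8_t a, int8_t b)
{
   int lim, clamped, sum;

   if (b > 0) {
      lim = INT8_MAX - b;               /* largest a allowed for positive b */
      clamped = a < lim ? a : lim;
   } else {
      lim = INT8_MIN - b;               /* smallest a allowed for b <= 0 */
      clamped = a > lim ? a : lim;
   }
   sum = clamped + b;
   assert(sum >= INT8_MIN && sum <= INT8_MAX);
   return (int8_t)sum;
}

/* Saturating subtract: same idea with the limits mirrored. */
static int8_t
sat_sub_s8(int8_t a, int8_t b)
{
   int lim, clamped, diff;

   if (b > 0) {
      lim = INT8_MIN + b;               /* smallest a allowed for positive b */
      clamped = a > lim ? a : lim;
   } else {
      lim = INT8_MAX + b;               /* largest a allowed for b <= 0 */
      clamped = a < lim ? a : lim;
   }
   diff = clamped - b;
   assert(diff >= INT8_MIN && diff <= INT8_MAX);
   return (int8_t)diff;
}
```

Intermediate values are kept in int so the limit computations themselves cannot wrap.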
@@ -1063,7 +1095,7 @@ lp_build_div(struct lp_build_context *bld,
if(a == bld->zero)
return bld->zero;
if(a == bld->one)
if(a == bld->one && type.floating)
return lp_build_rcp(bld, b);
if(b == bld->zero)
return bld->undef;
@@ -1850,7 +1882,7 @@ lp_build_trunc(struct lp_build_context *bld,
const struct lp_type type = bld->type;
struct lp_type inttype;
struct lp_build_context intbld;
LLVMValueRef cmpval = lp_build_const_vec(bld->gallivm, type, 2^24);
LLVMValueRef cmpval = lp_build_const_vec(bld->gallivm, type, 1<<24);
LLVMValueRef trunc, res, anosign, mask;
LLVMTypeRef int_vec_type = bld->int_vec_type;
LLVMTypeRef vec_type = bld->vec_type;
@@ -1905,7 +1937,7 @@ lp_build_round(struct lp_build_context *bld,
const struct lp_type type = bld->type;
struct lp_type inttype;
struct lp_build_context intbld;
LLVMValueRef cmpval = lp_build_const_vec(bld->gallivm, type, 2^24);
LLVMValueRef cmpval = lp_build_const_vec(bld->gallivm, type, 1<<24);
LLVMValueRef res, anosign, mask;
LLVMTypeRef int_vec_type = bld->int_vec_type;
LLVMTypeRef vec_type = bld->vec_type;
@@ -1958,7 +1990,7 @@ lp_build_floor(struct lp_build_context *bld,
const struct lp_type type = bld->type;
struct lp_type inttype;
struct lp_build_context intbld;
LLVMValueRef cmpval = lp_build_const_vec(bld->gallivm, type, 2^24);
LLVMValueRef cmpval = lp_build_const_vec(bld->gallivm, type, 1<<24);
LLVMValueRef trunc, res, anosign, mask;
LLVMTypeRef int_vec_type = bld->int_vec_type;
LLVMTypeRef vec_type = bld->vec_type;
@@ -2027,7 +2059,7 @@ lp_build_ceil(struct lp_build_context *bld,
const struct lp_type type = bld->type;
struct lp_type inttype;
struct lp_build_context intbld;
LLVMValueRef cmpval = lp_build_const_vec(bld->gallivm, type, 2^24);
LLVMValueRef cmpval = lp_build_const_vec(bld->gallivm, type, 1<<24);
LLVMValueRef trunc, res, anosign, mask, tmp;
LLVMTypeRef int_vec_type = bld->int_vec_type;
LLVMTypeRef vec_type = bld->vec_type;
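
The four cmpval changes above replace 2^24 with 1<<24. In C the caret is bitwise XOR, not exponentiation, so the old constant was 26 rather than 16777216. The short sketch below makes the difference concrete; the function names are hypothetical, and the magnitude test is only a guess at how such a threshold is typically used.

```c
#include <assert.h>

/* What the old expression evaluated to: 0b00010 XOR 0b11000 = 0b11010. */
static int
caret_result(void)
{
   int two = 2, twenty_four = 24;   /* variables sidestep XOR-as-power warnings */
   return two ^ twenty_four;
}

/* What was intended: two to the 24th power. */
static int
shift_result(void)
{
   return 1 << 24;
}

/* Hypothetical magnitude test against the corrected threshold. */
static int
at_or_above_threshold(float x)
{
   assert(x == x);                  /* NaN is not handled in this sketch */
   float mag = x < 0.0f ? -x : x;
   return mag >= (float)(1 << 24);
}
```

With the old constant, a value such as 100.0f would already have counted as "large", which is why the fix matters.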
@@ -3040,7 +3072,6 @@ lp_build_exp2(struct lp_build_context *bld,
assert(lp_check_value(bld->type, x));
/* TODO: optimize the constant case */
if (gallivm_debug & GALLIVM_DEBUG_PERF &&
LLVMIsConstant(x)) {
@@ -3053,15 +3084,14 @@ lp_build_exp2(struct lp_build_context *bld,
/* We want to preserve NaN and make sure that for exp2 if x > 128,
* the result is INF and if it's smaller than -126.9 the result is 0 */
x = lp_build_min_ext(bld, lp_build_const_vec(bld->gallivm, type, 128.0), x,
GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN);
x = lp_build_max(bld, lp_build_const_vec(bld->gallivm, type, -126.99999), x);
GALLIVM_NAN_RETURN_NAN_FIRST_NONNAN);
x = lp_build_max_ext(bld, lp_build_const_vec(bld->gallivm, type, -126.99999),
x, GALLIVM_NAN_RETURN_NAN_FIRST_NONNAN);
/* ipart = floor(x) */
/* fpart = x - ipart */
lp_build_ifloor_fract(bld, x, &ipart, &fpart);
/* expipart = (float) (1 << ipart) */
expipart = LLVMBuildAdd(builder, ipart,
lp_build_const_int_vec(bld->gallivm, type, 127), "");
@@ -3069,13 +3099,11 @@ lp_build_exp2(struct lp_build_context *bld,
lp_build_const_int_vec(bld->gallivm, type, 23), "");
expipart = LLVMBuildBitCast(builder, expipart, vec_type, "");
expfpart = lp_build_polynomial(bld, fpart, lp_build_exp2_polynomial,
Elements(lp_build_exp2_polynomial));
res = LLVMBuildFMul(builder, expipart, expfpart, "");
return res;
}
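
The "expipart" step in lp_build_exp2 builds 2^ipart by writing the biased exponent directly into the float's bit pattern. A scalar sketch of that trick follows, assuming IEEE 754 single precision; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Build 2^ipart by placing (ipart + 127) in the exponent field.
 * The clamps in the diff keep floor(x) within -127..128:
 *   ipart = -127 gives an all-zero pattern, i.e. 0.0f
 *   ipart =  128 gives exponent 255 with a zero mantissa, i.e. +INF
 */
static float
pow2_from_exponent(int ipart)
{
   uint32_t bits;
   float result;

   assert(ipart >= -127 && ipart <= 128);
   bits = (uint32_t)(ipart + 127) << 23;
   memcpy(&result, &bits, sizeof result);   /* bitcast without aliasing issues */
   return result;
}
```

This also shows why the input is clamped first: outside that range the shifted value would spill into the sign bit or go negative.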

@@ -138,7 +138,7 @@ lp_build_lerp_3d(struct lp_build_context *bld,
enum gallivm_nan_behavior {
/* Results are undefined with NaN. Results in fastest code */
GALLIVM_NAN_BEHAVIOR_UNDEFINED,
/* If input is NaN, NaN is returned */
/* If one of the inputs is NaN, NaN is returned */
GALLIVM_NAN_RETURN_NAN,
/* If one of the inputs is NaN, the other operand is returned */
GALLIVM_NAN_RETURN_OTHER,
@@ -146,7 +146,13 @@ enum gallivm_nan_behavior {
* but we guarantee the second operand is not a NaN.
* In min/max it will be as fast as undefined with sse opcodes,
* and archs having native return_other can benefit too. */
GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN
GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN,
/* If one of the inputs is NaN, NaN is returned,
* but we guarantee the first operand is not a NaN.
* In min/max it will be as fast as undefined with sse opcodes,
* and archs having native return_nan can benefit too. */
GALLIVM_NAN_RETURN_NAN_FIRST_NONNAN,
};
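
The two "one operand is known not to be NaN" modes can be served by a single compare and select, which is what the new switch cases in lp_build_min_simple do. The scalar sketch below shows why the operand order matters. It assumes that lp_build_cmp uses an unordered comparison (true when either input is NaN) and lp_build_cmp_ordered an ordered one; the function names are made up for this example.

```c
#include <assert.h>
#include <math.h>

/* RETURN_OTHER, second operand known non-NaN: ordered a < b.
 * If a is NaN the compare is false and b is returned. */
static float
min_other_second_nonnan(float a, float b)
{
   assert(b == b);                 /* precondition: b is not NaN */
   return (a < b) ? a : b;
}

/* RETURN_NAN, first operand known non-NaN: unordered b < a.
 * !(b >= a) is true when b is NaN, so the NaN is returned. */
static float
min_nan_first_nonnan(float a, float b)
{
   assert(a == a);                 /* precondition: a is not NaN */
   return !(b >= a) ? b : a;
}
```

For ordinary inputs both functions return the smaller value; they differ only in what happens to a NaN in the unguarded operand.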
LLVMValueRef

@@ -32,10 +32,11 @@
#include <llvm/Target/TargetInstrInfo.h>
#include <llvm/Support/raw_ostream.h>
#include <llvm/Support/Format.h>
#include <llvm/Support/MemoryObject.h>
#if HAVE_LLVM >= 0x0306
#include <llvm/Target/TargetSubtargetInfo.h>
#else
#include <llvm/Support/MemoryObject.h>
#endif
#include <llvm/Support/TargetRegistry.h>
@@ -43,11 +44,7 @@
#include <llvm/Support/Host.h>
#if HAVE_LLVM >= 0x0303
#include <llvm/IR/Module.h>
#else
#include <llvm/Module.h>
#endif
#include <llvm/MC/MCDisassembler.h>
#include <llvm/MC/MCAsmInfo.h>
@@ -57,7 +54,7 @@
#if HAVE_LLVM >= 0x0305
#define OwningPtr std::unique_ptr
#elif HAVE_LLVM >= 0x0303
#else
#include <llvm/ADT/OwningPtr.h>
#endif
@@ -146,6 +143,8 @@ lp_debug_dump_value(LLVMValueRef value)
}
#if HAVE_LLVM < 0x0306
/*
* MemoryObject wrapper around a buffer of memory, to be used by MC
* disassembler.
@@ -181,6 +180,8 @@ public:
}
};
#endif /* HAVE_LLVM < 0x0306 */
/*
* Disassemble a function, using the LLVM MC disassembler.
@@ -284,7 +285,11 @@ disassemble(const void* func, llvm::raw_ostream & Out)
/*
* Wrap the data in a MemoryObject
*/
#if HAVE_LLVM >= 0x0306
ArrayRef<uint8_t> memoryObject((const uint8_t *)bytes, extent);
#else
BufferMemoryObject memoryObject((const uint8_t *)bytes, extent);
#endif
uint64_t pc;
pc = 0;
@@ -414,6 +419,7 @@ disassemble(const void* func, llvm::raw_ostream & Out)
extern "C" void
lp_disassemble(LLVMValueRef func, const void *code) {
raw_debug_ostream Out;
Out << LLVMGetValueName(func) << ":\n";
disassemble(code, Out);
}

@@ -43,43 +43,19 @@
#include <llvm-c/BitWriter.h>
/**
* AVX is supported in:
* - standard JIT from LLVM 3.2 onwards
* - MC-JIT from LLVM 3.1
* - MC-JIT supports limited OSes (MacOSX and Linux)
* - standard JIT in LLVM 3.1, with backports
*/
#if defined(PIPE_ARCH_PPC_64) || defined(PIPE_ARCH_S390) || defined(PIPE_ARCH_ARM) || defined(PIPE_ARCH_AARCH64)
/* Only MCJIT is available as of LLVM SVN r216982 */
#if HAVE_LLVM >= 0x0306
# define USE_MCJIT 1
# define HAVE_AVX 0
#elif HAVE_LLVM >= 0x0302 || (HAVE_LLVM == 0x0301 && defined(HAVE_JIT_AVX_SUPPORT))
# define USE_MCJIT 0
# define HAVE_AVX 1
#elif HAVE_LLVM == 0x0301 && (defined(PIPE_OS_LINUX) || defined(PIPE_OS_APPLE))
#elif defined(PIPE_ARCH_PPC_64) || defined(PIPE_ARCH_S390) || defined(PIPE_ARCH_ARM) || defined(PIPE_ARCH_AARCH64)
# define USE_MCJIT 1
# define HAVE_AVX 1
#else
# define USE_MCJIT 0
# define HAVE_AVX 0
#endif
#if USE_MCJIT
void LLVMLinkInMCJIT();
#endif
/*
* LLVM has several global caches that point to, or are derived from, objects
* owned by the context, so freeing contexts causes
* memory leaks and false cache hits when these objects are destroyed.
*
* TODO: For thread safety on multi-threaded OpenGL we should use one LLVM
* context per thread, and put them in a pool when threads are destroyed.
*/
#define USE_GLOBAL_CONTEXT 1
#ifdef DEBUG
unsigned gallivm_debug = 0;
@@ -130,6 +106,7 @@ enum LLVM_CodeGenOpt_Level {
static boolean
create_pass_manager(struct gallivm_state *gallivm)
{
char *td_str;
assert(!gallivm->passmgr);
assert(gallivm->target);
@@ -137,8 +114,14 @@ create_pass_manager(struct gallivm_state *gallivm)
if (!gallivm->passmgr)
return FALSE;
// Old versions of LLVM get the DataLayout from the pass manager.
LLVMAddTargetData(gallivm->target, gallivm->passmgr);
// New ones from the Module.
td_str = LLVMCopyStringRepOfTargetData(gallivm->target);
LLVMSetDataLayout(gallivm->module, td_str);
free(td_str);
if ((gallivm_debug & GALLIVM_DEBUG_NO_OPT) == 0) {
/* These are the passes currently listed in llvm-c/Transforms/Scalar.h,
* but there are more on SVN.
@@ -193,8 +176,7 @@ gallivm_free_ir(struct gallivm_state *gallivm)
if (gallivm->builder)
LLVMDisposeBuilder(gallivm->builder);
if (!USE_GLOBAL_CONTEXT && gallivm->context)
LLVMContextDispose(gallivm->context);
/* The LLVMContext should be owned by the parent of gallivm. */
gallivm->engine = NULL;
gallivm->target = NULL;
@@ -215,6 +197,8 @@ gallivm_free_code(struct gallivm_state *gallivm)
assert(!gallivm->engine);
lp_free_generated_code(gallivm->code);
gallivm->code = NULL;
lp_free_memory_manager(gallivm->memorymgr);
gallivm->memorymgr = NULL;
}
@@ -236,6 +220,7 @@ init_gallivm_engine(struct gallivm_state *gallivm)
ret = lp_build_create_jit_compiler_for_module(&gallivm->engine,
&gallivm->code,
gallivm->module,
gallivm->memorymgr,
(unsigned) optlevel,
USE_MCJIT,
&error);
@@ -285,18 +270,17 @@ fail:
* \return TRUE for success, FALSE for failure
*/
static boolean
init_gallivm_state(struct gallivm_state *gallivm, const char *name)
init_gallivm_state(struct gallivm_state *gallivm, const char *name,
LLVMContextRef context)
{
assert(!gallivm->context);
assert(!gallivm->module);
lp_build_init();
if (!lp_build_init())
return FALSE;
gallivm->context = context;
if (USE_GLOBAL_CONTEXT) {
gallivm->context = LLVMGetGlobalContext();
} else {
gallivm->context = LLVMContextCreate();
}
if (!gallivm->context)
goto fail;
@@ -309,6 +293,10 @@ init_gallivm_state(struct gallivm_state *gallivm, const char *name)
if (!gallivm->builder)
goto fail;
gallivm->memorymgr = lp_get_default_memory_manager();
if (!gallivm->memorymgr)
goto fail;
/* FIXME: MC-JIT only allows compiling one module at a time, and it must be
* complete when MC-JIT is created. So defer the MC-JIT engine creation for
* now.
@@ -366,11 +354,11 @@ fail:
}
void
boolean
lp_build_init(void)
{
if (gallivm_initialized)
return;
return TRUE;
#ifdef DEBUG
gallivm_debug = debug_get_option_gallivm_debug();
@@ -393,8 +381,7 @@ lp_build_init(void)
* See also:
* - http://www.anandtech.com/show/4955/the-bulldozer-review-amd-fx8150-tested/2
*/
if (HAVE_AVX &&
util_cpu_caps.has_avx &&
if (util_cpu_caps.has_avx &&
util_cpu_caps.has_intel) {
lp_native_vector_width = 256;
} else {
@@ -419,16 +406,6 @@ lp_build_init(void)
util_cpu_caps.has_avx2 = 0;
}
if (!HAVE_AVX) {
/*
* note these instructions are VEX-only, so can only emit if we use
* avx (don't want to base it on has_avx & has_f16c later as that would
* omit it unnecessarily on amd cpus, see above).
*/
util_cpu_caps.has_f16c = 0;
util_cpu_caps.has_xop = 0;
}
#ifdef PIPE_ARCH_PPC_64
/* Set the NJ bit in VSCR to 0 so denormalized values are handled as
* specified by IEEE standard (PowerISA 2.06 - Section 6.3). This guarantees
@@ -461,6 +438,8 @@ lp_build_init(void)
util_cpu_caps.has_avx = 0;
util_cpu_caps.has_f16c = 0;
#endif
return TRUE;
}
@@ -469,13 +448,13 @@ lp_build_init(void)
* Create a new gallivm_state object.
*/
struct gallivm_state *
gallivm_create(const char *name)
gallivm_create(const char *name, LLVMContextRef context)
{
struct gallivm_state *gallivm;
gallivm = CALLOC_STRUCT(gallivm_state);
if (gallivm) {
if (!init_gallivm_state(gallivm, name)) {
if (!init_gallivm_state(gallivm, name, context)) {
FREE(gallivm);
gallivm = NULL;
}

@@ -44,17 +44,18 @@ struct gallivm_state
LLVMPassManagerRef passmgr;
LLVMContextRef context;
LLVMBuilderRef builder;
LLVMMCJITMemoryManagerRef memorymgr;
struct lp_generated_code *code;
unsigned compiled;
};
void
boolean
lp_build_init(void);
struct gallivm_state *
gallivm_create(const char *name);
gallivm_create(const char *name, LLVMContextRef context);
void
gallivm_destroy(struct gallivm_state *gallivm);

@@ -97,6 +97,8 @@ gallivm_get_shader_param(enum pipe_shader_cap param)
return LP_MAX_TGSI_NESTING;
case PIPE_SHADER_CAP_MAX_INPUTS:
return PIPE_MAX_SHADER_INPUTS;
case PIPE_SHADER_CAP_MAX_OUTPUTS:
return 32;
case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
return sizeof(float[4]) * 4096;
case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:

@@ -55,18 +55,20 @@
#include <llvm/Target/TargetOptions.h>
#include <llvm/ExecutionEngine/ExecutionEngine.h>
#include <llvm/ADT/Triple.h>
#if HAVE_LLVM < 0x0306
#include <llvm/ExecutionEngine/JITMemoryManager.h>
#else
#include <llvm/ExecutionEngine/SectionMemoryManager.h>
#endif
#include <llvm/Support/CommandLine.h>
#include <llvm/Support/Host.h>
#include <llvm/Support/PrettyStackTrace.h>
#include <llvm/Support/TargetSelect.h>
#if HAVE_LLVM >= 0x0303
#include <llvm/IR/IRBuilder.h>
#include <llvm/IR/Module.h>
#include <llvm/Support/CBindingWrapping.h>
#endif
#include "pipe/p_config.h"
#include "util/u_debug.h"
@@ -80,15 +82,9 @@ class LLVMEnsureMultithreaded {
public:
LLVMEnsureMultithreaded()
{
#if HAVE_LLVM < 0x0303
if (!llvm::llvm_is_multithreaded()) {
llvm::llvm_start_multithreaded();
}
#else
if (!LLVMIsMultithreaded()) {
LLVMStartMultithreaded();
}
#endif
}
};
@@ -138,23 +134,31 @@ lp_set_load_alignment(LLVMValueRef Inst,
extern "C"
void
lp_set_store_alignment(LLVMValueRef Inst,
unsigned Align)
unsigned Align)
{
llvm::unwrap<llvm::StoreInst>(Inst)->setAlignment(Align);
}
#if HAVE_LLVM < 0x0306
typedef llvm::JITMemoryManager BaseMemoryManager;
#else
typedef llvm::RTDyldMemoryManager BaseMemoryManager;
#endif
/*
* Delegating is tedious but the default manager class is hidden in an
* anonymous namespace in LLVM, so we cannot just derive from it to change
* its behavior.
*/
class DelegatingJITMemoryManager : public llvm::JITMemoryManager {
class DelegatingJITMemoryManager : public BaseMemoryManager {
protected:
virtual llvm::JITMemoryManager *mgr() const = 0;
virtual BaseMemoryManager *mgr() const = 0;
public:
#if HAVE_LLVM < 0x0306
/*
* From JITMemoryManager
*/
@@ -238,6 +242,7 @@ class DelegatingJITMemoryManager : public llvm::JITMemoryManager {
virtual unsigned GetNumStubSlabs() {
return mgr()->GetNumStubSlabs();
}
#endif
/*
* From RTDyldMemoryManager
@@ -257,7 +262,6 @@ class DelegatingJITMemoryManager : public llvm::JITMemoryManager {
return mgr()->allocateCodeSection(Size, Alignment, SectionID);
}
#endif
#if HAVE_LLVM >= 0x0303
virtual uint8_t *allocateDataSection(uintptr_t Size,
unsigned Alignment,
unsigned SectionID,
@@ -282,23 +286,16 @@ class DelegatingJITMemoryManager : public llvm::JITMemoryManager {
virtual void registerEHFrames(llvm::StringRef SectionData) {
mgr()->registerEHFrames(SectionData);
}
#endif
#else
virtual uint8_t *allocateDataSection(uintptr_t Size,
unsigned Alignment,
unsigned SectionID) {
return mgr()->allocateDataSection(Size, Alignment, SectionID);
}
#endif
virtual void *getPointerToNamedFunction(const std::string &Name,
bool AbortOnFailure=true) {
return mgr()->getPointerToNamedFunction(Name, AbortOnFailure);
}
#if HAVE_LLVM == 0x0303
#if HAVE_LLVM <= 0x0303
virtual bool applyPermissions(std::string *ErrMsg = 0) {
return mgr()->applyPermissions(ErrMsg);
}
#elif HAVE_LLVM > 0x0303
#else
virtual bool finalizeMemory(std::string *ErrMsg = 0) {
return mgr()->finalizeMemory(ErrMsg);
}
@@ -319,15 +316,15 @@ class DelegatingJITMemoryManager : public llvm::JITMemoryManager {
*/
class ShaderMemoryManager : public DelegatingJITMemoryManager {
static llvm::JITMemoryManager *TheMM;
static unsigned NumUsers;
BaseMemoryManager *TheMM;
struct GeneratedCode {
typedef std::vector<void *> Vec;
Vec FunctionBody, ExceptionTable;
BaseMemoryManager *TheMM;
GeneratedCode() {
++NumUsers;
GeneratedCode(BaseMemoryManager *MM) {
TheMM = MM;
}
~GeneratedCode() {
@@ -335,36 +332,31 @@ class ShaderMemoryManager : public DelegatingJITMemoryManager {
* Deallocate things as previously requested and
* free shared manager when no longer used.
*/
Vec::iterator i;
#if HAVE_LLVM < 0x0306
Vec::iterator i;
assert(TheMM);
for ( i = FunctionBody.begin(); i != FunctionBody.end(); ++i )
TheMM->deallocateFunctionBody(*i);
assert(TheMM);
for ( i = FunctionBody.begin(); i != FunctionBody.end(); ++i )
TheMM->deallocateFunctionBody(*i);
#if HAVE_LLVM < 0x0304
for ( i = ExceptionTable.begin(); i != ExceptionTable.end(); ++i )
TheMM->deallocateExceptionTable(*i);
#endif
--NumUsers;
if (NumUsers == 0) {
delete TheMM;
TheMM = 0;
}
for ( i = ExceptionTable.begin(); i != ExceptionTable.end(); ++i )
TheMM->deallocateExceptionTable(*i);
#endif /* HAVE_LLVM < 0x0304 */
#endif /* HAVE_LLVM < 0x0306 */
}
};
GeneratedCode *code;
llvm::JITMemoryManager *mgr() const {
if (!TheMM) {
TheMM = CreateDefaultMemManager();
}
BaseMemoryManager *mgr() const {
return TheMM;
}
public:
ShaderMemoryManager() {
code = new GeneratedCode;
ShaderMemoryManager(BaseMemoryManager* MM) {
TheMM = MM;
code = new GeneratedCode(MM);
}
virtual ~ShaderMemoryManager() {
@@ -395,9 +387,6 @@ class ShaderMemoryManager : public DelegatingJITMemoryManager {
}
};
llvm::JITMemoryManager *ShaderMemoryManager::TheMM = 0;
unsigned ShaderMemoryManager::NumUsers = 0;
/**
* Same as LLVMCreateJITCompilerForModule, but:
@@ -414,6 +403,7 @@ LLVMBool
lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
lp_generated_code **OutCode,
LLVMModuleRef M,
LLVMMCJITMemoryManagerRef CMM,
unsigned OptLevel,
int useMCJIT,
char **OutError)
@@ -443,7 +433,9 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
options.JITEmitDebugInfo = true;
#endif
#if defined(DEBUG) || defined(PROFILE)
/* XXX: Workaround http://llvm.org/PR21435 */
#if defined(DEBUG) || defined(PROFILE) || \
(HAVE_LLVM >= 0x0303 && (defined(PIPE_ARCH_X86) || defined(PIPE_ARCH_X86_64)))
#if HAVE_LLVM < 0x0304
options.NoFramePointerElimNonLeaf = true;
#endif
@@ -456,7 +448,9 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
.setOptLevel((CodeGenOpt::Level)OptLevel);
if (useMCJIT) {
#if HAVE_LLVM < 0x0306
builder.setUseMCJIT(true);
#endif
#ifdef _WIN32
/*
* MCJIT works on Windows, but currently only through ELF object format.
@@ -499,24 +493,30 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
builder.setMCPU(MCPU);
#endif
ShaderMemoryManager *MM = new ShaderMemoryManager();
*OutCode = MM->getGeneratedCode();
ShaderMemoryManager *MM;
if (useMCJIT) {
#if HAVE_LLVM > 0x0303
BaseMemoryManager* JMM = reinterpret_cast<BaseMemoryManager*>(CMM);
MM = new ShaderMemoryManager(JMM);
*OutCode = MM->getGeneratedCode();
builder.setJITMemoryManager(MM);
builder.setMCJITMemoryManager(MM);
#endif
} else {
#if HAVE_LLVM < 0x0306
BaseMemoryManager* JMM = reinterpret_cast<BaseMemoryManager*>(CMM);
MM = new ShaderMemoryManager(JMM);
*OutCode = MM->getGeneratedCode();
builder.setJITMemoryManager(MM);
#else
assert(0);
#endif
}
ExecutionEngine *JIT;
#if HAVE_LLVM >= 0x0302
JIT = builder.create();
#else
/*
* Workaround http://llvm.org/PR12833
*/
StringRef MArch = "";
StringRef MCPU = "";
Triple TT(unwrap(M)->getTargetTriple());
JIT = builder.create(builder.selectTarget(TT, MArch, MCPU, MAttrs));
#endif
if (JIT) {
*OutJIT = wrap(JIT);
return 0;
@@ -535,3 +535,23 @@ lp_free_generated_code(struct lp_generated_code *code)
{
ShaderMemoryManager::freeGeneratedCode(code);
}
extern "C"
LLVMMCJITMemoryManagerRef
lp_get_default_memory_manager()
{
BaseMemoryManager *mm;
#if HAVE_LLVM < 0x0306
mm = llvm::JITMemoryManager::CreateDefaultMemManager();
#else
mm = new llvm::SectionMemoryManager();
#endif
return reinterpret_cast<LLVMMCJITMemoryManagerRef>(mm);
}
extern "C"
void
lp_free_memory_manager(LLVMMCJITMemoryManagerRef memorymgr)
{
delete reinterpret_cast<BaseMemoryManager*>(memorymgr);
}


@@ -54,6 +54,7 @@ extern int
lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
struct lp_generated_code **OutCode,
LLVMModuleRef M,
LLVMMCJITMemoryManagerRef MM,
unsigned OptLevel,
int useMCJIT,
char **OutError);
@@ -61,6 +62,11 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
extern void
lp_free_generated_code(struct lp_generated_code *code);
extern LLVMMCJITMemoryManagerRef
lp_get_default_memory_manager();
extern void
lp_free_memory_manager(LLVMMCJITMemoryManagerRef memorymgr);
#ifdef __cplusplus
}


@@ -464,6 +464,7 @@ lp_build_pack2(struct gallivm_state *gallivm,
if((util_cpu_caps.has_sse2 || util_cpu_caps.has_altivec) &&
src_type.width * src_type.length >= 128) {
const char *intrinsic = NULL;
boolean swap_intrinsic_operands = FALSE;
switch(src_type.width) {
case 32:
@@ -482,6 +483,9 @@ lp_build_pack2(struct gallivm_state *gallivm,
} else {
intrinsic = "llvm.ppc.altivec.vpkuwus";
}
#ifdef PIPE_ARCH_LITTLE_ENDIAN
swap_intrinsic_operands = TRUE;
#endif
}
break;
case 16:
@@ -490,12 +494,18 @@ lp_build_pack2(struct gallivm_state *gallivm,
intrinsic = "llvm.x86.sse2.packsswb.128";
} else if (util_cpu_caps.has_altivec) {
intrinsic = "llvm.ppc.altivec.vpkshss";
#ifdef PIPE_ARCH_LITTLE_ENDIAN
swap_intrinsic_operands = TRUE;
#endif
}
} else {
if (util_cpu_caps.has_sse2) {
intrinsic = "llvm.x86.sse2.packuswb.128";
} else if (util_cpu_caps.has_altivec) {
intrinsic = "llvm.ppc.altivec.vpkshus";
#ifdef PIPE_ARCH_LITTLE_ENDIAN
swap_intrinsic_operands = TRUE;
#endif
}
}
break;
@@ -504,7 +514,11 @@ lp_build_pack2(struct gallivm_state *gallivm,
if (intrinsic) {
if (src_type.width * src_type.length == 128) {
LLVMTypeRef intr_vec_type = lp_build_vec_type(gallivm, intr_type);
res = lp_build_intrinsic_binary(builder, intrinsic, intr_vec_type, lo, hi);
if (swap_intrinsic_operands) {
res = lp_build_intrinsic_binary(builder, intrinsic, intr_vec_type, hi, lo);
} else {
res = lp_build_intrinsic_binary(builder, intrinsic, intr_vec_type, lo, hi);
}
if (dst_vec_type != intr_vec_type) {
res = LLVMBuildBitCast(builder, res, dst_vec_type, "");
}
@@ -513,6 +527,8 @@ lp_build_pack2(struct gallivm_state *gallivm,
int num_split = src_type.width * src_type.length / 128;
int i;
int nlen = 128 / src_type.width;
int lo_off = swap_intrinsic_operands ? nlen : 0;
int hi_off = swap_intrinsic_operands ? 0 : nlen;
struct lp_type ndst_type = lp_type_unorm(dst_type.width, 128);
struct lp_type nintr_type = lp_type_unorm(intr_type.width, 128);
LLVMValueRef tmpres[LP_MAX_VECTOR_WIDTH / 128];
@@ -524,9 +540,9 @@ lp_build_pack2(struct gallivm_state *gallivm,
for (i = 0; i < num_split / 2; i++) {
tmplo = lp_build_extract_range(gallivm,
lo, i*nlen*2, nlen);
lo, i*nlen*2 + lo_off, nlen);
tmphi = lp_build_extract_range(gallivm,
lo, i*nlen*2 + nlen, nlen);
lo, i*nlen*2 + hi_off, nlen);
tmpres[i] = lp_build_intrinsic_binary(builder, intrinsic,
nintr_vec_type, tmplo, tmphi);
if (ndst_vec_type != nintr_vec_type) {
@@ -535,9 +551,9 @@ lp_build_pack2(struct gallivm_state *gallivm,
}
for (i = 0; i < num_split / 2; i++) {
tmplo = lp_build_extract_range(gallivm,
hi, i*nlen*2, nlen);
hi, i*nlen*2 + lo_off, nlen);
tmphi = lp_build_extract_range(gallivm,
hi, i*nlen*2 + nlen, nlen);
hi, i*nlen*2 + hi_off, nlen);
tmpres[i+num_split/2] = lp_build_intrinsic_binary(builder, intrinsic,
nintr_vec_type,
tmplo, tmphi);
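
The operand swap in lp_build_pack2 can be illustrated with a scalar model. This is only a sketch: `sat8`, `pack_s16_to_s8` and `pack2_model` are hypothetical names, the 8-element width is chosen for brevity, and the claim that the little-endian AltiVec pack intrinsics need their operands reversed is taken from the diff above rather than verified here.

```c
#include <stdint.h>

/* Saturate a 16-bit value to the signed 8-bit range. */
static int8_t sat8(int16_t v)
{
    return v > 127 ? 127 : (v < -128 ? -128 : (int8_t)v);
}

/* Model of a signed saturating pack: first operand fills the low half. */
static void pack_s16_to_s8(const int16_t a[8], const int16_t b[8],
                           int8_t out[16])
{
    int i;
    for (i = 0; i < 8; i++) {
        out[i] = sat8(a[i]);
        out[i + 8] = sat8(b[i]);
    }
}

/* Mirrors the swap_intrinsic_operands branch: same intrinsic,
 * operands passed in reverse order when swap is non-zero. */
static void pack2_model(const int16_t lo[8], const int16_t hi[8],
                        int swap, int8_t out[16])
{
    if (swap)
        pack_s16_to_s8(hi, lo, out);
    else
        pack_s16_to_s8(lo, hi, out);
}
```

With swap set, the packed halves trade places, which is the effect the lo_off/hi_off offsets reproduce in the split (wider than 128 bit) path.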


@@ -1313,10 +1313,7 @@ lp_build_mipmap_level_sizes(struct lp_build_sample_context *bld,
bld->row_stride_array,
ilevel);
}
if (dims == 3 ||
bld->static_texture_state->target == PIPE_TEXTURE_CUBE ||
bld->static_texture_state->target == PIPE_TEXTURE_1D_ARRAY ||
bld->static_texture_state->target == PIPE_TEXTURE_2D_ARRAY) {
if (dims == 3 || has_layer_coord(bld->static_texture_state->target)) {
*img_stride_vec = lp_build_get_level_stride_vec(bld,
bld->img_stride_array,
ilevel);


@@ -356,9 +356,7 @@ texture_dims(enum pipe_texture_target tex)
case PIPE_TEXTURE_2D_ARRAY:
case PIPE_TEXTURE_RECT:
case PIPE_TEXTURE_CUBE:
return 2;
case PIPE_TEXTURE_CUBE_ARRAY:
assert(0);
return 2;
case PIPE_TEXTURE_3D:
return 3;
@@ -368,6 +366,21 @@ texture_dims(enum pipe_texture_target tex)
}
}
static INLINE boolean
has_layer_coord(enum pipe_texture_target tex)
{
switch (tex) {
case PIPE_TEXTURE_1D_ARRAY:
case PIPE_TEXTURE_2D_ARRAY:
/* cube is not layered but 3rd coord (after cube mapping) behaves the same */
case PIPE_TEXTURE_CUBE:
case PIPE_TEXTURE_CUBE_ARRAY:
return TRUE;
default:
return FALSE;
}
}
boolean
lp_sampler_wrap_mode_uses_border_color(unsigned mode,


@@ -704,9 +704,7 @@ lp_build_sample_image_nearest(struct lp_build_sample_context *bld,
offset = lp_build_add(&bld->int_coord_bld, offset, z_offset);
}
}
if (bld->static_texture_state->target == PIPE_TEXTURE_CUBE ||
bld->static_texture_state->target == PIPE_TEXTURE_1D_ARRAY ||
bld->static_texture_state->target == PIPE_TEXTURE_2D_ARRAY) {
if (has_layer_coord(bld->static_texture_state->target)) {
LLVMValueRef z_offset;
/* The r coord is the cube face in [0,5] or array layer */
z_offset = lp_build_mul(&bld->int_coord_bld, r, img_stride_vec);
@@ -781,9 +779,7 @@ lp_build_sample_image_nearest_afloat(struct lp_build_sample_context *bld,
&z_icoord);
}
}
if (bld->static_texture_state->target == PIPE_TEXTURE_CUBE ||
bld->static_texture_state->target == PIPE_TEXTURE_1D_ARRAY ||
bld->static_texture_state->target == PIPE_TEXTURE_2D_ARRAY) {
if (has_layer_coord(bld->static_texture_state->target)) {
z_icoord = r;
}
@@ -1130,9 +1126,7 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
&x_subcoord[0], &x_subcoord[1]);
/* add potential cube/array/mip offsets now as they are constant per pixel */
if (bld->static_texture_state->target == PIPE_TEXTURE_CUBE ||
bld->static_texture_state->target == PIPE_TEXTURE_1D_ARRAY ||
bld->static_texture_state->target == PIPE_TEXTURE_2D_ARRAY) {
if (has_layer_coord(bld->static_texture_state->target)) {
LLVMValueRef z_offset;
z_offset = lp_build_mul(&bld->int_coord_bld, r, img_stride_vec);
/* The r coord is the cube face in [0,5] or array layer */
@@ -1301,9 +1295,7 @@ lp_build_sample_image_linear_afloat(struct lp_build_sample_context *bld,
&x_offset1, &x_subcoord[1]);
/* add potential cube/array/mip offsets now as they are constant per pixel */
if (bld->static_texture_state->target == PIPE_TEXTURE_CUBE ||
bld->static_texture_state->target == PIPE_TEXTURE_1D_ARRAY ||
bld->static_texture_state->target == PIPE_TEXTURE_2D_ARRAY) {
if (has_layer_coord(bld->static_texture_state->target)) {
LLVMValueRef z_offset;
z_offset = lp_build_mul(&bld->int_coord_bld, r, img_stride_vec);
/* The r coord is the cube face in [0,5] or array layer */


@@ -752,10 +752,14 @@ lp_build_sample_image_nearest(struct lp_build_sample_context *bld,
lp_build_name(z, "tex.z.wrapped");
}
}
if (bld->static_texture_state->target == PIPE_TEXTURE_CUBE ||
bld->static_texture_state->target == PIPE_TEXTURE_1D_ARRAY ||
bld->static_texture_state->target == PIPE_TEXTURE_2D_ARRAY) {
z = coords[2];
if (has_layer_coord(bld->static_texture_state->target)) {
if (bld->static_texture_state->target == PIPE_TEXTURE_CUBE_ARRAY) {
/* add cube layer to face */
z = lp_build_add(&bld->int_coord_bld, coords[2], coords[3]);
}
else {
z = coords[2];
}
lp_build_name(z, "tex.z.layer");
}
@@ -868,7 +872,8 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
int chan, texel_index;
boolean seamless_cube_filter, accurate_cube_corners;
seamless_cube_filter = bld->static_texture_state->target == PIPE_TEXTURE_CUBE &&
seamless_cube_filter = (bld->static_texture_state->target == PIPE_TEXTURE_CUBE ||
bld->static_texture_state->target == PIPE_TEXTURE_CUBE_ARRAY) &&
bld->static_sampler_state->seamless_cube_map;
accurate_cube_corners = ACCURATE_CUBE_CORNERS && seamless_cube_filter;
@@ -923,10 +928,15 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
lp_build_name(z1, "tex.z1.wrapped");
}
}
if (bld->static_texture_state->target == PIPE_TEXTURE_CUBE ||
bld->static_texture_state->target == PIPE_TEXTURE_1D_ARRAY ||
bld->static_texture_state->target == PIPE_TEXTURE_2D_ARRAY) {
z00 = z01 = z10 = z11 = z1 = coords[2]; /* cube face or layer */
if (has_layer_coord(bld->static_texture_state->target)) {
if (bld->static_texture_state->target == PIPE_TEXTURE_CUBE_ARRAY) {
/* add cube layer to face */
z00 = z01 = z10 = z11 = z1 =
lp_build_add(&bld->int_coord_bld, coords[2], coords[3]);
}
else {
z00 = z01 = z10 = z11 = z1 = coords[2]; /* cube face or layer */
}
lp_build_name(z00, "tex.z0.layer");
lp_build_name(z1, "tex.z1.layer");
}
@@ -1047,6 +1057,14 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
z10 = lp_build_select(ivec_bld, fall_off_yp_notxm, new_faces[3], z10);
z11 = lp_build_select(ivec_bld, fall_off_yp_notxp, new_faces[3], z11);
if (bld->static_texture_state->target == PIPE_TEXTURE_CUBE_ARRAY) {
/* now can add cube layer to face (per sample) */
z00 = lp_build_add(ivec_bld, z00, coords[3]);
z01 = lp_build_add(ivec_bld, z01, coords[3]);
z10 = lp_build_add(ivec_bld, z10, coords[3]);
z11 = lp_build_add(ivec_bld, z11, coords[3]);
}
LLVMBuildStore(builder, x00, xs[0]);
LLVMBuildStore(builder, x01, xs[1]);
LLVMBuildStore(builder, x10, xs[2]);
@@ -1070,10 +1088,19 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
LLVMBuildStore(builder, y0, ys[1]);
LLVMBuildStore(builder, y1, ys[2]);
LLVMBuildStore(builder, y1, ys[3]);
LLVMBuildStore(builder, face, zs[0]);
LLVMBuildStore(builder, face, zs[1]);
LLVMBuildStore(builder, face, zs[2]);
LLVMBuildStore(builder, face, zs[3]);
if (bld->static_texture_state->target == PIPE_TEXTURE_CUBE_ARRAY) {
LLVMValueRef cube_layer = lp_build_add(ivec_bld, face, coords[3]);
LLVMBuildStore(builder, cube_layer, zs[0]);
LLVMBuildStore(builder, cube_layer, zs[1]);
LLVMBuildStore(builder, cube_layer, zs[2]);
LLVMBuildStore(builder, cube_layer, zs[3]);
}
else {
LLVMBuildStore(builder, face, zs[0]);
LLVMBuildStore(builder, face, zs[1]);
LLVMBuildStore(builder, face, zs[2]);
LLVMBuildStore(builder, face, zs[3]);
}
lp_build_endif(&edge_if);
@@ -1644,6 +1671,7 @@ lp_build_sample_mipmap_both(struct lp_build_sample_context *bld,
static LLVMValueRef
lp_build_layer_coord(struct lp_build_sample_context *bld,
unsigned texture_unit,
boolean is_cube_array,
LLVMValueRef layer,
LLVMValueRef *out_of_bounds)
{
@@ -1655,6 +1683,7 @@ lp_build_layer_coord(struct lp_build_sample_context *bld,
if (out_of_bounds) {
LLVMValueRef out1, out;
assert(!is_cube_array);
num_layers = lp_build_broadcast_scalar(int_coord_bld, num_layers);
out = lp_build_cmp(int_coord_bld, PIPE_FUNC_LESS, layer, int_coord_bld->zero);
out1 = lp_build_cmp(int_coord_bld, PIPE_FUNC_GEQUAL, layer, num_layers);
@@ -1663,7 +1692,9 @@ lp_build_layer_coord(struct lp_build_sample_context *bld,
}
else {
LLVMValueRef maxlayer;
maxlayer = lp_build_sub(&bld->int_bld, num_layers, bld->int_bld.one);
LLVMValueRef s = is_cube_array ? lp_build_const_int32(bld->gallivm, 6) :
bld->int_bld.one;
maxlayer = lp_build_sub(&bld->int_bld, num_layers, s);
maxlayer = lp_build_broadcast_scalar(int_coord_bld, maxlayer);
return lp_build_clamp(int_coord_bld, layer, int_coord_bld->zero, maxlayer);
}
@@ -1703,7 +1734,7 @@ lp_build_sample_common(struct lp_build_sample_context *bld,
* Choose cube face, recompute texcoords for the chosen face and
* compute rho here too (as it requires transform of derivatives).
*/
if (target == PIPE_TEXTURE_CUBE) {
if (target == PIPE_TEXTURE_CUBE || target == PIPE_TEXTURE_CUBE_ARRAY) {
boolean need_derivs;
need_derivs = ((min_filter != mag_filter ||
mip_filter != PIPE_TEX_MIPFILTER_NONE) &&
@@ -1711,11 +1742,19 @@ lp_build_sample_common(struct lp_build_sample_context *bld,
!explicit_lod);
lp_build_cube_lookup(bld, coords, derivs, &cube_rho, &cube_derivs, need_derivs);
derivs = &cube_derivs;
if (target == PIPE_TEXTURE_CUBE_ARRAY) {
/* calculate cube layer coord now */
LLVMValueRef layer = lp_build_iround(&bld->coord_bld, coords[3]);
LLVMValueRef six = lp_build_const_int_vec(bld->gallivm, bld->int_coord_type, 6);
layer = lp_build_mul(&bld->int_coord_bld, layer, six);
coords[3] = lp_build_layer_coord(bld, texture_index, TRUE, layer, NULL);
/* because of seamless filtering can't add it to face (coords[2]) here. */
}
}
else if (target == PIPE_TEXTURE_1D_ARRAY ||
target == PIPE_TEXTURE_2D_ARRAY) {
coords[2] = lp_build_iround(&bld->coord_bld, coords[2]);
coords[2] = lp_build_layer_coord(bld, texture_index, coords[2], NULL);
coords[2] = lp_build_layer_coord(bld, texture_index, FALSE, coords[2], NULL);
}
if (bld->static_sampler_state->compare_mode != PIPE_TEX_COMPARE_NONE) {
@@ -2223,11 +2262,11 @@ lp_build_fetch_texel(struct lp_build_sample_context *bld,
if (target == PIPE_TEXTURE_1D_ARRAY ||
target == PIPE_TEXTURE_2D_ARRAY) {
if (out_of_bound_ret_zero) {
z = lp_build_layer_coord(bld, texture_unit, z, &out1);
z = lp_build_layer_coord(bld, texture_unit, FALSE, z, &out1);
out_of_bounds = lp_build_or(int_coord_bld, out_of_bounds, out1);
}
else {
z = lp_build_layer_coord(bld, texture_unit, z, NULL);
z = lp_build_layer_coord(bld, texture_unit, FALSE, z, NULL);
}
}
@@ -2463,7 +2502,8 @@ lp_build_sample_soa(struct gallivm_state *gallivm,
if ((gallivm_debug & GALLIVM_DEBUG_NO_QUAD_LOD) &&
(gallivm_debug & GALLIVM_DEBUG_NO_RHO_APPROX) &&
(static_texture_state->target == PIPE_TEXTURE_CUBE) &&
(static_texture_state->target == PIPE_TEXTURE_CUBE ||
static_texture_state->target == PIPE_TEXTURE_CUBE_ARRAY) &&
(!is_fetch && mip_filter != PIPE_TEX_MIPFILTER_NONE)) {
/*
* special case for using per-pixel lod even for implicit lod,
@@ -2601,7 +2641,8 @@ lp_build_sample_soa(struct gallivm_state *gallivm,
use_aos &= lp_is_simple_wrap_mode(derived_sampler_state.wrap_r);
}
}
if (static_texture_state->target == PIPE_TEXTURE_CUBE &&
if ((static_texture_state->target == PIPE_TEXTURE_CUBE ||
static_texture_state->target == PIPE_TEXTURE_CUBE_ARRAY) &&
derived_sampler_state.seamless_cube_map &&
(derived_sampler_state.min_img_filter == PIPE_TEX_FILTER_LINEAR ||
derived_sampler_state.mag_img_filter == PIPE_TEX_FILTER_LINEAR)) {
@@ -2631,6 +2672,13 @@ lp_build_sample_soa(struct gallivm_state *gallivm,
&lod_positive, &lod_fpart,
&ilevel0, &ilevel1);
if (use_aos && static_texture_state->target == PIPE_TEXTURE_CUBE_ARRAY) {
/* The aos path doesn't do seamless filtering so simply add cube layer
* to face now.
*/
newcoords[2] = lp_build_add(&bld.int_coord_bld, newcoords[2], newcoords[3]);
}
/*
* we only try 8-wide sampling with soa as it appears to
* be a loss with aos with AVX (but it should work, except
@@ -2695,7 +2743,8 @@ lp_build_sample_soa(struct gallivm_state *gallivm,
bld4.num_mips = bld4.num_lods = 1;
if ((gallivm_debug & GALLIVM_DEBUG_NO_QUAD_LOD) &&
(gallivm_debug & GALLIVM_DEBUG_NO_RHO_APPROX) &&
(static_texture_state->target == PIPE_TEXTURE_CUBE) &&
(static_texture_state->target == PIPE_TEXTURE_CUBE ||
static_texture_state->target == PIPE_TEXTURE_CUBE_ARRAY) &&
(!is_fetch && mip_filter != PIPE_TEX_MIPFILTER_NONE)) {
bld4.num_mips = type4.length;
bld4.num_lods = type4.length;
@@ -2891,6 +2940,7 @@ lp_build_size_query_soa(struct gallivm_state *gallivm,
switch (target) {
case PIPE_TEXTURE_1D_ARRAY:
case PIPE_TEXTURE_2D_ARRAY:
case PIPE_TEXTURE_CUBE_ARRAY:
has_array = TRUE;
break;
default:
@@ -2932,10 +2982,20 @@ lp_build_size_query_soa(struct gallivm_state *gallivm,
size = lp_build_minify(&bld_int_vec4, size, lod, TRUE);
if (has_array)
size = LLVMBuildInsertElement(gallivm->builder, size,
dynamic_state->depth(dynamic_state, gallivm, texture_unit),
if (has_array) {
LLVMValueRef layers = dynamic_state->depth(dynamic_state, gallivm, texture_unit);
if (target == PIPE_TEXTURE_CUBE_ARRAY) {
/*
* It looks like GL wants number of cubes, d3d10.1 has it undefined?
* Could avoid this by passing in number of cubes instead of total
* number of layers (might make things easier elsewhere too).
*/
LLVMValueRef six = lp_build_const_int32(gallivm, 6);
layers = LLVMBuildSDiv(gallivm->builder, layers, six, "");
}
size = LLVMBuildInsertElement(gallivm->builder, size, layers,
lp_build_const_int32(gallivm, dims), "");
}
/*
* d3d10 requires zero for x/y/z values (but not w, i.e. mip levels)
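
The cube array addressing added in this file can be summarized with scalar arithmetic. The following is a minimal sketch, assuming the texture stores 6 consecutive layers per cube; `round_to_int`, `cube_array_slice` and `cube_array_num_cubes` are hypothetical helpers, and the rounding here (half away from zero) may differ from lp_build_iround on ties.

```c
/* Round to nearest integer without pulling in libm (assumption:
 * ties round away from zero, which may differ from the JIT code). */
static int round_to_int(float x)
{
    return (int)(x + (x >= 0.0f ? 0.5f : -0.5f));
}

/* Final layer index for a cube array lookup.
 * layer_coord: the 4th texture coordinate (cube index, as float)
 * face:        cube face in [0,5] chosen by the cube lookup
 * num_layers:  total number of 2D layers (6 per cube) */
static int cube_array_slice(float layer_coord, int face, int num_layers)
{
    int layer = round_to_int(layer_coord) * 6;
    int max_layer = num_layers - 6;   /* first layer of the last cube */

    if (layer < 0)
        layer = 0;
    if (layer > max_layer)
        layer = max_layer;
    return layer + face;              /* face is added last */
}

/* Size queries report cubes, not layers. */
static int cube_array_num_cubes(int num_layers)
{
    return num_layers / 6;
}
```

Clamping against num_layers - 6 instead of num_layers - 1 matches the is_cube_array case in lp_build_layer_coord, and adding the face last matches the comment that seamless filtering may still change the face per sample.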


@@ -126,6 +126,12 @@ struct lp_tgsi_info
*/
unsigned indirect_textures:1;
/*
* Whether any of the texture (sample) opcodes use different sampler
* and sampler view unit.
*/
unsigned sampler_texture_units_different:1;
/*
* Whether any immediate values are outside the range of 0 and 1
*/
@@ -538,6 +544,8 @@ struct lp_build_tgsi_aos_context
struct lp_build_sampler_aos *sampler;
struct tgsi_declaration_sampler_view sv[PIPE_MAX_SHADER_SAMPLER_VIEWS];
LLVMValueRef immediates[LP_MAX_INLINED_IMMEDIATES];
LLVMValueRef temps[LP_MAX_INLINED_TEMPS];
LLVMValueRef addr[LP_MAX_TGSI_ADDRS];


@@ -1248,8 +1248,24 @@ idiv_emit_cpu(
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
{
emit_data->output[emit_data->chan] = lp_build_div(&bld_base->int_bld,
emit_data->args[0], emit_data->args[1]);
LLVMBuilderRef builder = bld_base->base.gallivm->builder;
LLVMValueRef div_mask = lp_build_cmp(&bld_base->uint_bld,
PIPE_FUNC_EQUAL, emit_data->args[1],
bld_base->uint_bld.zero);
/* We want to make sure that we never divide/mod by zero to not
* generate sigfpe. We don't want to crash just because the
* shader is doing something weird. */
LLVMValueRef divisor = LLVMBuildOr(builder,
div_mask,
emit_data->args[1], "");
LLVMValueRef result = lp_build_div(&bld_base->int_bld,
emit_data->args[0], divisor);
LLVMValueRef not_div_mask = LLVMBuildNot(builder,
div_mask,"");
/* idiv by zero doesn't have a guaranteed return value; choose 0 for now. */
emit_data->output[emit_data->chan] = LLVMBuildAnd(builder,
not_div_mask,
result, "");
}
/* TGSI_OPCODE_INEG (CPU Only) */
@@ -1675,15 +1691,15 @@ udiv_emit_cpu(
LLVMValueRef div_mask = lp_build_cmp(&bld_base->uint_bld,
PIPE_FUNC_EQUAL, emit_data->args[1],
bld_base->uint_bld.zero);
/* We want to make sure that we never divide/mod by zero to not
* generate sigfpe. We don't want to crash just because the
/* We want to make sure that we never divide/mod by zero to not
* generate sigfpe. We don't want to crash just because the
* shader is doing something weird. */
LLVMValueRef divisor = LLVMBuildOr(builder,
div_mask,
emit_data->args[1], "");
LLVMValueRef result = lp_build_div(&bld_base->uint_bld,
emit_data->args[0], divisor);
/* udiv by zero is guaranteed to return 0xffffffff */
/* udiv by zero is guaranteed to return 0xffffffff at least with d3d10 */
emit_data->output[emit_data->chan] = LLVMBuildOr(builder,
div_mask,
result, "");
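
The mask trick used by idiv_emit_cpu and udiv_emit_cpu can be written out for a single scalar lane. This is a sketch only: `safe_udiv` and `safe_idiv` are hypothetical names, and the signed version does not address INT32_MIN divided by -1, which is a separate overflow case.

```c
#include <stdint.h>

/* Unsigned divide that never traps: a zero divisor is replaced by
 * all-ones, and the result is forced to 0xffffffff afterwards. */
static uint32_t safe_udiv(uint32_t a, uint32_t b)
{
    uint32_t zero_mask = (b == 0) ? 0xffffffffu : 0u; /* lp_build_cmp EQUAL */
    uint32_t divisor = b | zero_mask;                 /* LLVMBuildOr */
    return (a / divisor) | zero_mask;                 /* LLVMBuildOr */
}

/* Signed divide with the same guard, but the result for a zero
 * divisor is cleared to 0 (Not + And instead of Or). */
static int32_t safe_idiv(int32_t a, int32_t b)
{
    int32_t zero_mask = (b == 0) ? -1 : 0;
    int32_t divisor = b | zero_mask;   /* becomes -1 when b == 0 */
    int32_t result = a / divisor;      /* note: INT32_MIN / -1 not handled */
    return result & ~zero_mask;
}
```

Both versions avoid a branch, which suits the vectorized code generated here: the comparison mask is either all ones or all zeros per lane.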


@@ -391,6 +391,37 @@ emit_tex(struct lp_build_tgsi_aos_context *bld,
}
static LLVMValueRef
emit_sample(struct lp_build_tgsi_aos_context *bld,
const struct tgsi_full_instruction *inst,
enum lp_build_tex_modifier modifier)
{
unsigned target;
unsigned unit;
LLVMValueRef coords;
struct lp_derivatives derivs = { {NULL}, {NULL} };
if (!bld->sampler) {
_debug_printf("warning: found texture instruction but no sampler generator supplied\n");
return bld->bld_base.base.undef;
}
coords = lp_build_emit_fetch( &bld->bld_base, inst, 0 , LP_CHAN_ALL);
/* ignore modifiers, can't handle different sampler / sampler view, etc... */
unit = inst->Src[1].Register.Index;
assert(inst->Src[2].Register.Index == unit);
target = bld->sv[unit].Resource;
return bld->sampler->emit_fetch_texel(bld->sampler,
&bld->bld_base.base,
target, unit,
coords, derivs,
modifier);
}
void
lp_emit_declaration_aos(
struct lp_build_tgsi_aos_context *bld,
@@ -430,6 +461,17 @@ lp_emit_declaration_aos(
bld->preds[idx] = lp_build_alloca(gallivm, vec_type, "");
break;
case TGSI_FILE_SAMPLER_VIEW:
/*
* The target stored here MUST match whatever there actually
* is in the set sampler views (what about return type?).
*/
assert(last < PIPE_MAX_SHADER_SAMPLER_VIEWS);
for (idx = first; idx <= last; ++idx) {
bld->sv[idx] = decl->SamplerView;
}
break;
default:
/* don't need to declare other vars */
break;
@@ -782,7 +824,8 @@ lp_emit_instruction_aos(
return FALSE;
case TGSI_OPCODE_RET:
return FALSE;
/* safe to ignore at end */
break;
case TGSI_OPCODE_END:
*pc = -1;
@@ -815,7 +858,6 @@ lp_emit_instruction_aos(
return FALSE;
case TGSI_OPCODE_DIV:
/* deprecated */
assert(0);
return FALSE;
break;
@@ -874,13 +916,11 @@ lp_emit_instruction_aos(
break;
case TGSI_OPCODE_I2F:
/* deprecated? */
assert(0);
return FALSE;
break;
case TGSI_OPCODE_NOT:
/* deprecated? */
assert(0);
return FALSE;
break;
@@ -891,55 +931,46 @@ lp_emit_instruction_aos(
break;
case TGSI_OPCODE_SHL:
/* deprecated? */
assert(0);
return FALSE;
break;
case TGSI_OPCODE_ISHR:
/* deprecated? */
assert(0);
return FALSE;
break;
case TGSI_OPCODE_AND:
/* deprecated? */
assert(0);
return FALSE;
break;
case TGSI_OPCODE_OR:
/* deprecated? */
assert(0);
return FALSE;
break;
case TGSI_OPCODE_MOD:
/* deprecated? */
assert(0);
return FALSE;
break;
case TGSI_OPCODE_XOR:
/* deprecated? */
assert(0);
return FALSE;
break;
case TGSI_OPCODE_SAD:
/* deprecated? */
assert(0);
return FALSE;
break;
case TGSI_OPCODE_TXF:
/* deprecated? */
assert(0);
return FALSE;
break;
case TGSI_OPCODE_TXQ:
/* deprecated? */
assert(0);
return FALSE;
break;
@@ -958,6 +989,10 @@ lp_emit_instruction_aos(
case TGSI_OPCODE_NOP:
break;
case TGSI_OPCODE_SAMPLE:
dst0 = emit_sample(bld, inst, LP_BLD_TEX_MODIFIER_NONE);
break;
default:
return FALSE;
}


@@ -48,6 +48,7 @@ struct analysis_context
unsigned num_imms;
float imm[LP_MAX_TGSI_IMMEDIATES][4];
unsigned sample_target[PIPE_MAX_SHADER_SAMPLER_VIEWS];
struct lp_tgsi_channel_info temp[32][4];
};
@@ -129,29 +130,29 @@ analyse_tex(struct analysis_context *ctx,
case TGSI_TEXTURE_SHADOW2D:
case TGSI_TEXTURE_SHADOWRECT:
case TGSI_TEXTURE_2D_ARRAY:
case TGSI_TEXTURE_2D_MSAA:
case TGSI_TEXTURE_3D:
case TGSI_TEXTURE_CUBE:
readmask = TGSI_WRITEMASK_XYZ;
break;
case TGSI_TEXTURE_SHADOW2D_ARRAY:
case TGSI_TEXTURE_SHADOWCUBE:
readmask = TGSI_WRITEMASK_XYZW;
break;
case TGSI_TEXTURE_2D_ARRAY_MSAA:
case TGSI_TEXTURE_CUBE_ARRAY:
readmask = TGSI_WRITEMASK_XYZW;
/* modifier would be in another not analyzed reg so just say indirect */
if (modifier != LP_BLD_TEX_MODIFIER_NONE) {
indirect = TRUE;
}
break;
case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
readmask = TGSI_WRITEMASK_XYZW;
indirect = TRUE;
break;
default:
assert(0);
return;
}
/* XXX
* For cube map arrays, this will not analyze lod or shadow argument.
* For shadow cube, this will not analyze lod bias argument.
* "Indirect" really has no meaning for such textures anyway though.
*/
if (modifier == LP_BLD_TEX_MODIFIER_EXPLICIT_DERIV) {
/* We don't track explicit derivatives, although we could */
@@ -207,20 +208,45 @@ analyse_sample(struct analysis_context *ctx,
if (info->num_texs < Elements(info->tex)) {
struct lp_tgsi_texture_info *tex_info = &info->tex[info->num_texs];
unsigned target = ctx->sample_target[inst->Src[1].Register.Index];
boolean indirect = FALSE;
boolean shadow = FALSE;
unsigned readmask;
/*
* We don't really get much information here, in particular not
* the target info, hence no useful writemask neither. Maybe should just
* forget the whole function.
*/
readmask = TGSI_WRITEMASK_XYZW;
switch (target) {
/* note no shadow targets here */
case TGSI_TEXTURE_BUFFER:
case TGSI_TEXTURE_1D:
readmask = TGSI_WRITEMASK_X;
break;
case TGSI_TEXTURE_1D_ARRAY:
case TGSI_TEXTURE_2D:
case TGSI_TEXTURE_RECT:
readmask = TGSI_WRITEMASK_XY;
break;
case TGSI_TEXTURE_2D_ARRAY:
case TGSI_TEXTURE_2D_MSAA:
case TGSI_TEXTURE_3D:
case TGSI_TEXTURE_CUBE:
readmask = TGSI_WRITEMASK_XYZ;
break;
case TGSI_TEXTURE_CUBE_ARRAY:
case TGSI_TEXTURE_2D_ARRAY_MSAA:
readmask = TGSI_WRITEMASK_XYZW;
break;
default:
assert(0);
return;
}
tex_info->target = target;
tex_info->texture_unit = inst->Src[1].Register.Index;
tex_info->sampler_unit = inst->Src[2].Register.Index;
if (tex_info->texture_unit != tex_info->sampler_unit) {
info->sampler_texture_units_different = TRUE;
}
if (modifier == LP_BLD_TEX_MODIFIER_EXPLICIT_DERIV ||
modifier == LP_BLD_TEX_MODIFIER_EXPLICIT_LOD ||
modifier == LP_BLD_TEX_MODIFIER_LOD_BIAS || shadow) {
@@ -524,7 +550,14 @@ lp_build_tgsi_info(const struct tgsi_token *tokens,
tgsi_parse_token(&parse);
switch (parse.FullToken.Token.Type) {
case TGSI_TOKEN_TYPE_DECLARATION:
case TGSI_TOKEN_TYPE_DECLARATION: {
struct tgsi_full_declaration *decl = &parse.FullToken.FullDeclaration;
if (decl->Declaration.File == TGSI_FILE_SAMPLER_VIEW) {
for (index = decl->Range.First; index <= decl->Range.Last; index++) {
ctx->sample_target[index] = decl->SamplerView.Resource;
}
}
}
break;
case TGSI_TOKEN_TYPE_INSTRUCTION:


@@ -2333,7 +2333,7 @@ emit_fetch_texels( struct lp_build_tgsi_soa_context *bld,
unsigned unit, target;
LLVMValueRef coord_undef = LLVMGetUndef(bld->bld_base.base.int_vec_type);
LLVMValueRef explicit_lod = NULL;
LLVMValueRef coords[3];
LLVMValueRef coords[5];
LLVMValueRef offsets[3] = { NULL };
enum lp_sampler_lod_property lod_property = LP_SAMPLER_LOD_SCALAR;
unsigned dims, i;
@@ -2395,7 +2395,8 @@ emit_fetch_texels( struct lp_build_tgsi_soa_context *bld,
for (i = 0; i < dims; i++) {
coords[i] = lp_build_emit_fetch(&bld->bld_base, inst, 0, i);
}
for (i = dims; i < 3; i++) {
/* never use more than 3 coords here but emit_fetch_texel copies all 5 anyway */
for (i = dims; i < 5; i++) {
coords[i] = coord_undef;
}
if (layer_coord)
@@ -3854,8 +3855,8 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
* were forgetting so we're using MAX_VERTEX_VARYING from
* that spec even though we could debug_assert if it's not
* set, but that's a lot uglier. */
uint max_output_vertices = 32;
uint i = 0;
uint max_output_vertices;
/* inputs are always indirect with gs */
bld.indirect_files |= (1 << TGSI_FILE_INPUT);
bld.gs_iface = gs_iface;
@@ -3863,12 +3864,11 @@ lp_build_tgsi_soa(struct gallivm_state *gallivm,
bld.bld_base.op_actions[TGSI_OPCODE_EMIT].emit = emit_vertex;
bld.bld_base.op_actions[TGSI_OPCODE_ENDPRIM].emit = end_primitive;
for (i = 0; i < info->num_properties; ++i) {
if (info->properties[i].name ==
TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES) {
max_output_vertices = info->properties[i].data[0];
}
}
max_output_vertices =
info->properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES];
if (!max_output_vertices)
max_output_vertices = 32;
bld.max_output_vertices_vec =
lp_build_const_int_vec(gallivm, bld.bld_base.int_bld.type,
max_output_vertices);


@@ -44,6 +44,7 @@
#include "util/u_inlines.h"
#include "util/u_memory.h"
#include "util/u_math.h"
#include "util/u_sampler.h"
#include "util/u_simple_shaders.h"
#include "util/u_string.h"
#include "util/u_upload_mgr.h"
@@ -1050,12 +1051,8 @@ hud_create(struct pipe_context *pipe, struct cso_context *cso)
}
/* sampler view */
memset(&view_templ, 0, sizeof(view_templ));
view_templ.format = hud->font.texture->format;
view_templ.swizzle_r = PIPE_SWIZZLE_RED;
view_templ.swizzle_g = PIPE_SWIZZLE_GREEN;
view_templ.swizzle_b = PIPE_SWIZZLE_BLUE;
view_templ.swizzle_a = PIPE_SWIZZLE_ALPHA;
u_sampler_view_default_template(
&view_templ, hud->font.texture, hud->font.texture->format);
hud->font_sampler_view = pipe->create_sampler_view(pipe, hud->font.texture,
&view_templ);


@@ -193,7 +193,7 @@ def lineloop(intype, outtype, inpv, outpv):
print ' for (i = start, j = 0; j < nr - 2; j+=2, i++) { '
do_line( intype, outtype, 'out+j', 'i', 'i+1', inpv, outpv );
print ' }'
do_line( intype, outtype, 'out+j', 'i', '0', inpv, outpv );
do_line( intype, outtype, 'out+j', 'i', 'start', inpv, outpv );
postamble()
def tris(intype, outtype, inpv, outpv):
@@ -218,7 +218,7 @@ def tristrip(intype, outtype, inpv, outpv):
def trifan(intype, outtype, inpv, outpv):
preamble(intype, outtype, inpv, outpv, prim='trifan')
print ' for (i = start, j = 0; j < nr; j+=3, i++) { '
do_tri( intype, outtype, 'out+j', '0', 'i+1', 'i+2', inpv, outpv );
do_tri( intype, outtype, 'out+j', 'start', 'i+1', 'i+2', inpv, outpv );
print ' }'
postamble()
@@ -228,9 +228,9 @@ def polygon(intype, outtype, inpv, outpv):
preamble(intype, outtype, inpv, outpv, prim='polygon')
print ' for (i = start, j = 0; j < nr; j+=3, i++) { '
if inpv == FIRST:
do_tri( intype, outtype, 'out+j', '0', 'i+1', 'i+2', inpv, outpv );
do_tri( intype, outtype, 'out+j', 'start', 'i+1', 'i+2', inpv, outpv );
else:
do_tri( intype, outtype, 'out+j', 'i+1', 'i+2', '0', inpv, outpv );
do_tri( intype, outtype, 'out+j', 'i+1', 'i+2', 'start', inpv, outpv );
print ' }'
postamble()
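
The effect of replacing the literal '0' with 'start' is easiest to see in the C that the generator would emit for a triangle fan. The function below is a hand-written approximation, not the generated output; `trifan_to_tris` and its parameter names are hypothetical, and 16-bit output indices are assumed.

```c
/* Expand a triangle fan beginning at vertex `start` into a triangle
 * list. out_nr is the number of output indices (3 per triangle). */
static void trifan_to_tris(unsigned start, unsigned out_nr,
                           unsigned short *out)
{
    unsigned i, j;

    for (i = start, j = 0; j < out_nr; j += 3, i++) {
        out[j + 0] = (unsigned short)start;   /* fan centre; was 0 */
        out[j + 1] = (unsigned short)(i + 1);
        out[j + 2] = (unsigned short)(i + 2);
    }
}
```

With the old code the centre vertex was always index 0, so any draw with a non-zero start produced triangles anchored at the wrong vertex.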


@@ -45,6 +45,7 @@
#include "util/u_draw.h"
#include "util/u_inlines.h"
#include "util/u_memory.h"
#include "util/u_upload_mgr.h"
#include "indices/u_indices.h"
#include "indices/u_primconvert.h"
@@ -55,7 +56,7 @@ struct primconvert_context
struct pipe_index_buffer saved_ib;
uint32_t primtypes_mask;
unsigned api_pv;
// TODO we could cache/recycle the indexbuf created to translate prims..
struct u_upload_mgr *upload;
};
@@ -112,10 +113,10 @@ util_primconvert_draw_vbo(struct primconvert_context *pc,
struct pipe_index_buffer *ib = &pc->saved_ib;
struct pipe_index_buffer new_ib;
struct pipe_draw_info new_info;
struct pipe_transfer *src_transfer = NULL, *dst_transfer = NULL;
struct pipe_transfer *src_transfer = NULL;
u_translate_func trans_func;
u_generate_func gen_func;
const void *src;
const void *src = NULL;
void *dst;
memset(&new_ib, 0, sizeof(new_ib));
@@ -123,6 +124,9 @@ util_primconvert_draw_vbo(struct primconvert_context *pc,
new_info.indexed = true;
new_info.min_index = info->min_index;
new_info.max_index = info->max_index;
new_info.index_bias = info->index_bias;
new_info.start_instance = info->start_instance;
new_info.instance_count = info->instance_count;
if (info->indexed) {
u_index_translator(pc->primtypes_mask,
@@ -135,6 +139,7 @@ util_primconvert_draw_vbo(struct primconvert_context *pc,
src = pipe_buffer_map(pc->pipe, ib->buffer,
PIPE_TRANSFER_READ, &src_transfer);
}
src = (const uint8_t *)src + ib->offset;
}
else {
u_index_generator(pc->primtypes_mask,
@@ -144,14 +149,12 @@ util_primconvert_draw_vbo(struct primconvert_context *pc,
&gen_func);
}
if (!pc->upload) {
pc->upload = u_upload_create(pc->pipe, 4096, 4, PIPE_BIND_INDEX_BUFFER);
}
new_ib.buffer = pipe_buffer_create(pc->pipe->screen,
PIPE_BIND_INDEX_BUFFER,
PIPE_USAGE_IMMUTABLE,
new_ib.index_size * new_info.count);
dst =
pipe_buffer_map(pc->pipe, new_ib.buffer, PIPE_TRANSFER_WRITE,
&dst_transfer);
u_upload_alloc(pc->upload, 0, new_ib.index_size * new_info.count,
&new_ib.offset, &new_ib.buffer, &dst);
if (info->indexed) {
trans_func(src, info->start, new_info.count, dst);
@@ -163,8 +166,7 @@ util_primconvert_draw_vbo(struct primconvert_context *pc,
if (src_transfer)
pipe_buffer_unmap(pc->pipe, src_transfer);
if (dst_transfer)
pipe_buffer_unmap(pc->pipe, dst_transfer);
u_upload_unmap(pc->upload);
/* bind new index buffer: */
pc->pipe->set_index_buffer(pc->pipe, &new_ib);


@@ -47,7 +47,7 @@
#endif
#if defined(PIPE_OS_LINUX) || defined(PIPE_OS_CYGWIN)
#if defined(PIPE_OS_LINUX) || defined(PIPE_OS_CYGWIN) || defined(PIPE_OS_SOLARIS)
# include <unistd.h>
#elif defined(PIPE_OS_APPLE) || defined(PIPE_OS_BSD)
# include <sys/sysctl.h>
@@ -111,14 +111,14 @@ os_get_option(const char *name)
bool
os_get_total_physical_memory(uint64_t *size)
{
#if defined(PIPE_OS_LINUX) || defined(PIPE_OS_CYGWIN)
#if defined(PIPE_OS_LINUX) || defined(PIPE_OS_CYGWIN) || defined(PIPE_OS_SOLARIS)
const long phys_pages = sysconf(_SC_PHYS_PAGES);
const long page_size = sysconf(_SC_PAGE_SIZE);
*size = phys_pages * page_size;
return (phys_pages > 0 && page_size > 0);
#elif defined(PIPE_OS_APPLE) || defined(PIPE_OS_BSD)
size_t len = sizeof(size);
size_t len = sizeof(*size);
int mib[2];
mib[0] = CTL_HW;
@@ -128,11 +128,13 @@ os_get_total_physical_memory(uint64_t *size)
mib[1] = HW_PHYSMEM64;
#elif defined(PIPE_OS_FREEBSD)
mib[1] = HW_REALMEM;
#elif defined(PIPE_OS_DRAGONFLY)
mib[1] = HW_PHYSMEM;
#else
#error Unsupported *BSD
#endif
return (sysctl(mib, 2, &size, &len, NULL, 0) == 0);
return (sysctl(mib, 2, size, &len, NULL, 0) == 0);
#elif defined(PIPE_OS_HAIKU)
system_info info;
status_t ret;
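
Note on the BSD hunk above: it changes `sizeof(size)` to `sizeof(*size)` and `&size` to `size`. Since `size` is already a `uint64_t *`, the old code described and passed the pointer variable itself rather than the caller's storage. Below is a minimal sketch of the corrected out-parameter pattern. `fake_sysctl` and `get_total_memory` are hypothetical stand-ins (not the real `sysctl(3)` and not Mesa code), used so the example runs on any platform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a sysctl-style query: copies a 64-bit value
 * into the caller's buffer if the buffer is large enough. */
static int
fake_sysctl(void *oldp, size_t *oldlenp)
{
   const uint64_t value = UINT64_C(8) << 30; /* pretend 8 GiB */

   if (*oldlenp < sizeof(value))
      return -1;
   memcpy(oldp, &value, sizeof(value));
   *oldlenp = sizeof(value);
   return 0;
}

/* Corrected pattern: describe and pass the pointed-to object,
 * not the pointer variable. */
static bool
get_total_memory(uint64_t *size)
{
   size_t len = sizeof(*size); /* size of the uint64_t, not of the pointer */

   assert(size != NULL);
   return fake_sysctl(size, &len) == 0; /* pass size, not &size */
}
```

On LP64 targets `sizeof(size)` and `sizeof(*size)` are both 8, so the length bug alone would go unnoticed there; passing `&size` is wrong on every target.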


@@ -40,9 +40,6 @@
#include "pipe/p_compiler.h"
#if defined(PIPE_OS_UNIX)
# ifndef _FILE_OFFSET_BITS
# error _FILE_OFFSET_BITS must be defined to 64
# endif
# include <sys/mman.h>
#else
# error Unsupported OS
@@ -61,7 +58,8 @@ extern "C" {
extern void *__mmap2(void *, size_t, int, int, int, size_t);
static INLINE void *os_mmap(void *addr, size_t length, int prot, int flags, int fd, loff_t offset)
static INLINE void *os_mmap(void *addr, size_t length, int prot, int flags,
int fd, loff_t offset)
{
/* offset must be aligned to 4096 (not necessarily the page size) */
if (unlikely(offset & 4095)) {
@@ -72,12 +70,26 @@ static INLINE void *os_mmap(void *addr, size_t length, int prot, int flags, int
return __mmap2(addr, length, prot, flags, fd, (size_t) (offset >> 12));
}
# define os_munmap(addr, length) \
munmap(addr, length)
#else
/* assume large file support exists */
# define os_mmap(addr, length, prot, flags, fd, offset) mmap(addr, length, prot, flags, fd, offset)
#endif
# define os_mmap(addr, length, prot, flags, fd, offset) \
mmap(addr, length, prot, flags, fd, offset)
#define os_munmap(addr, length) munmap(addr, length)
static INLINE int os_munmap(void *addr, size_t length)
{
/* Copied from configure code generated by AC_SYS_LARGEFILE */
#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + \
(((off_t) 1 << 31) << 31))
STATIC_ASSERT(LARGE_OFF_T % 2147483629 == 721 &&
LARGE_OFF_T % 2147483647 == 1);
#undef LARGE_OFF_T
return munmap(addr, length);
}
#endif
#ifdef __cplusplus
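
Note on the hunk above: the `STATIC_ASSERT` in the new `os_munmap` takes the place of the removed `_FILE_OFFSET_BITS` preprocessor check, testing at compile time that `off_t` can hold 2^63 - 1. Below is a rough standalone equivalent, assuming a C11 compiler and a POSIX `<sys/types.h>`. `off_t_bits` is a hypothetical helper, and `_FILE_OFFSET_BITS` is defined here only so that the sketch also builds on 32-bit glibc targets.

```c
#define _FILE_OFFSET_BITS 64 /* assumption: request 64-bit off_t on 32-bit glibc */
#include <assert.h>
#include <limits.h>
#include <sys/types.h>

/* 2^63 - 1, built in two halves so that no intermediate value overflows
 * a 64-bit signed off_t (same construction as AC_SYS_LARGEFILE). */
#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + \
                     (((off_t) 1 << 31) << 31))

/* Compilation fails here if off_t is narrower than 64 bits. */
_Static_assert(LARGE_OFF_T % 2147483629 == 721 &&
               LARGE_OFF_T % 2147483647 == 1,
               "off_t must be at least 64 bits wide");

/* Hypothetical helper: width of off_t in bits. */
static int
off_t_bits(void)
{
   int bits = (int) (sizeof(off_t) * CHAR_BIT);

   assert(bits >= 64); /* already guaranteed by the static assertion */
   return bits;
}
```

The two moduli work because 2^31 is congruent to 19 modulo 2147483629 and to 1 modulo 2147483647, which gives remainders of 721 and 1 for 2^63 - 1.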


@@ -8,10 +8,7 @@ AM_CPPFLAGS = $(DEFINES) \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/gallium/winsys
noinst_LTLIBRARIES =
if HAVE_LOADER_GALLIUM
noinst_LTLIBRARIES += libpipe_loader.la
noinst_LTLIBRARIES = libpipe_loader.la
noinst_LTLIBRARIES += libpipe_loader_client.la
COMMON_SOURCES = \
@@ -43,5 +40,3 @@ libpipe_loader_client_la_CFLAGS = \
libpipe_loader_client_la_SOURCES = $(COMMON_SOURCES)
libpipe_loader_client_la_LIBADD = $(COMMON_LIBADD) \
$(GALLIUM_PIPE_LOADER_CLIENT_LIBS)
endif


@@ -166,6 +166,17 @@ pipe_loader_sw_probe_null(struct pipe_loader_device **devs);
int
pipe_loader_sw_probe(struct pipe_loader_device **devs, int ndev);
/**
* Get a software device wrapped atop another device.
*
* This function is platform-specific.
*
* \sa pipe_loader_probe
*/
boolean
pipe_loader_sw_probe_wrapped(struct pipe_loader_device **dev,
struct pipe_screen *screen);
#ifdef HAVE_PIPE_LOADER_DRM
/**


@@ -33,6 +33,7 @@
#include <fcntl.h>
#include <stdio.h>
#include <xf86drm.h>
#include <unistd.h>
#ifdef HAVE_PIPE_LOADER_XCB
@@ -63,6 +64,20 @@ struct pipe_loader_drm_device {
static struct pipe_loader_ops pipe_loader_drm_ops;
#ifdef HAVE_PIPE_LOADER_XCB
static xcb_screen_t *
get_xcb_screen(xcb_screen_iterator_t iter, int screen)
{
for (; iter.rem; --screen, xcb_screen_next(&iter))
if (screen == 0)
return iter.data;
return NULL;
}
#endif
static void
pipe_loader_drm_x_auth(int fd)
{
@@ -77,8 +92,9 @@ pipe_loader_drm_x_auth(int fd)
drm_magic_t magic;
xcb_dri2_authenticate_cookie_t authenticate_cookie;
xcb_dri2_authenticate_reply_t *authenticate;
int screen;
xcb_conn = xcb_connect(NULL, NULL);
xcb_conn = xcb_connect(NULL, &screen);
if(!xcb_conn)
return;
@@ -89,7 +105,8 @@ pipe_loader_drm_x_auth(int fd)
goto disconnect;
s = xcb_setup_roots_iterator(xcb_setup);
connect_cookie = xcb_dri2_connect_unchecked(xcb_conn, s.data->root,
connect_cookie = xcb_dri2_connect_unchecked(xcb_conn,
get_xcb_screen(s, screen)->root,
XCB_DRI2_DRIVER_TYPE_DRI);
connect = xcb_dri2_connect_reply(xcb_conn, connect_cookie, NULL);


@@ -31,6 +31,7 @@
#include "util/u_dl.h"
#include "sw/dri/dri_sw_winsys.h"
#include "sw/null/null_sw_winsys.h"
#include "sw/wrapper/wrapper_sw_winsys.h"
#ifdef HAVE_PIPE_LOADER_XLIB
/* Explicitly wrap the header to ease build without X11 headers */
#include "sw/xlib/xlib_sw_winsys.h"
@@ -140,6 +141,28 @@ pipe_loader_sw_probe(struct pipe_loader_device **devs, int ndev)
return i;
}
boolean
pipe_loader_sw_probe_wrapped(struct pipe_loader_device **dev,
struct pipe_screen *screen)
{
struct pipe_loader_sw_device *sdev = CALLOC_STRUCT(pipe_loader_sw_device);
if (!sdev)
return false;
sdev->base.type = PIPE_LOADER_DEVICE_SOFTWARE;
sdev->base.driver_name = "swrast";
sdev->base.ops = &pipe_loader_sw_ops;
sdev->ws = wrapper_sw_winsys_wrap_pipe_screen(screen);
if (!sdev->ws) {
FREE(sdev);
return false;
}
*dev = &sdev->base;
return true;
}
static void
pipe_loader_sw_release(struct pipe_loader_device **dev)
{


@@ -163,7 +163,8 @@ struct pb_manager *
pb_cache_manager_create(struct pb_manager *provider,
unsigned usecs,
float size_factor,
unsigned bypass_usage);
unsigned bypass_usage,
uint64_t maximum_cache_size);
struct pb_fence_ops;


@@ -84,6 +84,7 @@ struct pb_cache_manager
pb_size numDelayed;
float size_factor;
unsigned bypass_usage;
uint64_t cache_size, max_cache_size;
};
@@ -114,6 +115,7 @@ _pb_cache_buffer_destroy(struct pb_cache_buffer *buf)
LIST_DEL(&buf->head);
assert(mgr->numDelayed);
--mgr->numDelayed;
mgr->cache_size -= buf->base.size;
assert(!pipe_is_referenced(&buf->base.reference));
pb_reference(&buf->buffer, NULL);
FREE(buf);
@@ -158,11 +160,20 @@ pb_cache_buffer_destroy(struct pb_buffer *_buf)
assert(!pipe_is_referenced(&buf->base.reference));
_pb_cache_buffer_list_check_free(mgr);
/* Directly release any buffer that exceeds the limit. */
if (mgr->cache_size + buf->base.size > mgr->max_cache_size) {
pb_reference(&buf->buffer, NULL);
FREE(buf);
pipe_mutex_unlock(mgr->mutex);
return;
}
buf->start = os_time_get();
buf->end = buf->start + mgr->usecs;
LIST_ADDTAIL(&buf->head, &mgr->delayed);
++mgr->numDelayed;
mgr->cache_size += buf->base.size;
pipe_mutex_unlock(mgr->mutex);
}
@@ -314,6 +325,7 @@ pb_cache_manager_create_buffer(struct pb_manager *_mgr,
}
if(buf) {
mgr->cache_size -= buf->base.size;
LIST_DEL(&buf->head);
--mgr->numDelayed;
pipe_mutex_unlock(mgr->mutex);
@@ -400,12 +412,15 @@ pb_cache_manager_destroy(struct pb_manager *mgr)
* the requested size as cache hits.
* @param bypass_usage Bitmask. If (requested usage & bypass_usage) != 0,
* buffer allocation requests are redirected to the provider.
* @param maximum_cache_size Maximum size of all unused buffers the cache can
* hold.
*/
struct pb_manager *
pb_cache_manager_create(struct pb_manager *provider,
unsigned usecs,
float size_factor,
unsigned bypass_usage)
unsigned bypass_usage,
uint64_t maximum_cache_size)
{
struct pb_cache_manager *mgr;
@@ -425,6 +440,7 @@ pb_cache_manager_create(struct pb_manager *provider,
mgr->bypass_usage = bypass_usage;
LIST_INITHEAD(&mgr->delayed);
mgr->numDelayed = 0;
mgr->max_cache_size = maximum_cache_size;
pipe_mutex_init(mgr->mutex);
return &mgr->base;
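
Note on the hunks above: they add a byte budget to the buffer cache. `cache_size` tracks the total size of idle buffers, and a buffer whose admission would push that total past `max_cache_size` is released directly instead of being queued. Below is a simplified sketch of that accounting. `toy_cache`, `toy_cache_put`, and `toy_cache_take` are hypothetical names, and locking, expiry timestamps, and the buffer list itself are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_cache {
   uint64_t cache_size;     /* bytes currently held by idle buffers */
   uint64_t max_cache_size; /* upper bound on cache_size */
   unsigned num_delayed;    /* number of idle buffers held */
};

/* Offer an idle buffer of `size` bytes to the cache. Returns true if it
 * was kept, false if the caller should release it immediately. */
static bool
toy_cache_put(struct toy_cache *c, uint64_t size)
{
   if (c->cache_size + size > c->max_cache_size)
      return false;
   c->cache_size += size;
   ++c->num_delayed;
   return true;
}

/* Remove a previously kept buffer of `size` bytes (reuse or expiry). */
static void
toy_cache_take(struct toy_cache *c, uint64_t size)
{
   assert(c->num_delayed > 0 && c->cache_size >= size);
   c->cache_size -= size;
   --c->num_delayed;
}
```

Every path that removes a buffer from the idle list has to subtract its size, which is why the diff touches both `_pb_cache_buffer_destroy` and the cache-hit path in `pb_cache_manager_create_buffer`.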


@@ -19,7 +19,7 @@
#endif
#if GALLIUM_ILO
#include "intel/intel_winsys.h"
#include "intel/drm/intel_drm_public.h"
#include "ilo/ilo_public.h"
#endif
@@ -408,7 +408,7 @@ static const struct drm_conf_ret share_fd_ret = {
{true},
};
static const struct drm_conf_ret *
static inline const struct drm_conf_ret *
configuration_query(enum drm_conf conf)
{
switch (conf) {
@@ -465,7 +465,7 @@ dd_configuration(enum drm_conf conf)
#endif
#if defined(GALLIUM_FREEDRENO)
if ((strcmp(driver_name, "kgsl") == 0) || (strcmp(driver_name, "msm") == 0))
return NULL;
return configuration_query(conf);
else
#endif
return NULL;


@@ -91,6 +91,34 @@ drisw_create_screen(struct drisw_loader_funcs *lf)
return screen;
}
#endif // DRI_TARGET
#if defined(NINE_TARGET)
#include "sw/wrapper/wrapper_sw_winsys.h"
#include "target-helpers/inline_debug_helper.h"
extern struct pipe_screen *ninesw_create_screen(struct pipe_screen *screen);
INLINE struct pipe_screen *
ninesw_create_screen(struct pipe_screen *pscreen)
{
struct sw_winsys *winsys = NULL;
struct pipe_screen *screen = NULL;
winsys = wrapper_sw_winsys_wrap_pipe_screen(pscreen);
if (winsys == NULL)
return NULL;
screen = sw_screen_create(winsys);
if (screen == NULL) {
winsys->destroy(winsys);
return NULL;
}
screen = debug_screen_wrap(screen);
return screen;
}
#endif // NINE_TARGET
#endif // GALLIUM_SOFTPIPE


@@ -297,10 +297,10 @@ tgsi_default_declaration_sampler_view(void)
struct tgsi_declaration_sampler_view dsv;
dsv.Resource = TGSI_TEXTURE_BUFFER;
dsv.ReturnTypeX = PIPE_TYPE_UNORM;
dsv.ReturnTypeY = PIPE_TYPE_UNORM;
dsv.ReturnTypeZ = PIPE_TYPE_UNORM;
dsv.ReturnTypeW = PIPE_TYPE_UNORM;
dsv.ReturnTypeX = TGSI_RETURN_TYPE_UNORM;
dsv.ReturnTypeY = TGSI_RETURN_TYPE_UNORM;
dsv.ReturnTypeZ = TGSI_RETURN_TYPE_UNORM;
dsv.ReturnTypeW = TGSI_RETURN_TYPE_UNORM;
return dsv;
}

Some files were not shown because too many files have changed in this diff