Compare commits

187 Commits

Author SHA1 Message Date
Emil Velikov
cb154bb221 docs: Add sha256 sums for the 10.4.7 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-21 00:50:13 +00:00
Emil Velikov
d26f3c1f86 Add release notes for the 10.4.7 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-21 00:26:27 +00:00
Emil Velikov
b7b218f3f6 Update version to 10.4.7
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-21 00:19:39 +00:00
Marek Olšák
832c94a55c radeonsi: increase coords array size for radeon_llvm_emit_prepare_cube_coords
radeon_llvm_emit_prepare_cube_coords uses coords[4] in some cases (TXB2 etc.)

Discovered by Coverity. Reported by Ilia Mirkin.

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit a984abdad3)
2015-03-18 21:49:33 +00:00
Mario Kleiner
70832be2f1 glx: Handle out-of-sequence swap completion events correctly. (v2)
The code for emitting INTEL_swap_events swap completion
events needs to translate from 32-Bit sbc on the wire to
64-Bit sbc for the events and handle wraparound accordingly.

It assumed that events would be sent by the server in the
order their corresponding swap requests were emitted from
the client, iow. sbc count should be always increasing. This
was correct for DRI2.

This is not always the case under the DRI3/Present backend,
where the Present extension can execute presents and send out
completion events in a different order than the submission
order of the present requests, due to client code specifying
targetMSC target vblank counts which are not strictly
monotonically increasing. This confused the wraparound
handling. This patch fixes the problem by handling 32-Bit
wraparound in both directions. As long as the real 64-Bit sbc's of
successive swap completion events don't differ by more than 2^30,
this should be able to do the right thing.

How this is supposed to work:

awire->sbc contains the low 32-Bits of the true 64-Bit sbc
of the current swap event, transmitted over the wire.

glxDraw->lastEventSbc contains the low 32-Bits of the 64-Bit
sbc of the most recently processed swap event.

glxDraw->eventSbcWrap is a 64-Bit offset which tracks the upper
32-Bits of the current sbc. The final 64-Bit output sbc
aevent->sbc is computed from the sum of awire->sbc and
glxDraw->eventSbcWrap.

Under DRI3/Present, swap completion events can be received
slightly out of order due to non-monotonic targetMsc specified
by client code, e.g., present request submission:

Submission sbc:   1   2   3
targetMsc:        10  11  9

Reception of completion events:
Completion sbc:   3   1   2

The completion sequence 3, 1, 2 would confuse the old wraparound
handling made for DRI2 as 1 < 3 --> Assumes a 32-Bit wraparound
has happened when it hasn't.

The client can queue multiple present requests, in the case of
Mesa up to n requests for n-buffered rendering, e.g., n = 2-4 in
the current Mesa GLX DRI3/Present implementation. In the case of
direct Pixmap presents via xcb_present_pixmap() the number n is
limited by the amount of memory available.

We reasonably assume that the number of outstanding requests n is
much less than 2 billion due to memory constraints and common sense.
Therefore while the order of received sbc's can be a bit scrambled,
successive 64-Bit sbc's won't deviate by much, a given sbc may be
a few counts lower or higher than the previous received sbc.

Therefore any large difference between the incoming awire->sbc and
the last recorded glxDraw->lastEventSbc will be due to 32-Bit
wraparound and we need to adapt glxDraw->eventSbcWrap accordingly
to adjust the upper 32-Bits of the sbc.

Two cases, corresponding to the two if-statements in the patch:

a) Previous sbc event was below the last 2^32 boundary, in the previous
glxDraw->eventSbcWrap epoch, the new sbc event is in the next 2^32
epoch, therefore the low 32-Bit awire->sbc wrapped around to zero,
or close to zero --> awire->sbc is apparently much lower than the
glxDraw->lastEventSbc recorded for the previous epoch

--> We need to increment glxDraw->eventSbcWrap by 2^32 to adjust
the current epoch to be one higher than the previous one.

--> Case a) also handles the old DRI2 behaviour.

b) Previous sbc event was above closest 2^32 boundary, but now a
late event from the previous 2^32 epoch arrives, with a true sbc
that belongs to the previous 2^32 segment, so the awire->sbc of
this late event has a high count close to 2^32, whereas
glxDraw->lastEventSbc is closer to zero --> awire->sbc is much
greater than glxDraw->lastEventSbc.

--> We need to decrement glxDraw->eventSbcWrap by 2^32 to adjust
the current epoch back to the previous lower epoch of this late
completion event.

We assume such a wraparound to a higher (a) epoch or lower (b)
epoch has happened if awire->sbc and glxDraw->lastEventSbc differ
by more than 2^30 counts, as such a difference can only happen
on wraparound, or if somehow 2^30 present requests would be pending
for a given drawable inside the server, which is rather unlikely.
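
The two cases above can be sketched as a small standalone helper. This is
only an illustration under stated assumptions, not the patch itself: the
struct sbc_tracker and the function sbc_unwrap() are hypothetical names
that stand in for glxDraw->lastEventSbc, glxDraw->eventSbcWrap and the
event translation code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-drawable state described above. */
struct sbc_tracker {
    uint32_t last_event_sbc;  /* low 32 bits of the most recently processed event */
    int64_t  event_sbc_wrap;  /* current epoch offset, always a multiple of 2^32 */
};

/* Translate a 32-bit wire sbc into a 64-bit sbc, handling wraparound
 * in both directions with a 2^30 threshold. */
static int64_t sbc_unwrap(struct sbc_tracker *t, uint32_t wire_sbc)
{
    int64_t wire = (int64_t)wire_sbc;
    int64_t last = (int64_t)t->last_event_sbc;

    /* Case a): wire value is much lower than the last one, so the
     * counter wrapped forward into the next 2^32 epoch. */
    if (wire < last - 0x40000000LL)
        t->event_sbc_wrap += 0x100000000LL;

    /* Case b): wire value is much higher than the last one, so this is
     * a late event from the previous 2^32 epoch. */
    if (wire > last + 0x40000000LL)
        t->event_sbc_wrap -= 0x100000000LL;

    t->last_event_sbc = wire_sbc;
    return wire + t->event_sbc_wrap;
}
```

With a zero-initialised tracker, feeding the completion sequence 3, 1, 2
returns 3, 1, 2 unchanged, because none of the differences exceeds 2^30.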

v2: Explain the reason for this patch and the new wraparound handling
    much more extensively in the commit message, no code change wrt. the
    initial version.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit cc5ddd584d)
2015-03-18 21:49:25 +00:00
Emil Velikov
ad259df2e0 auxiliary/os: fix the android build - s/drm_munmap/os_munmap/
Squash this silly typo introduced with commit c63eb5dd5ec (auxiliary/os: get
the mmap/munmap wrappers working with android)

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 55f0c0a29f)
2015-03-18 21:49:18 +00:00
Emil Velikov
df2db2a55f loader: include <sys/stat.h> for non-sysfs builds
Required by fstat(), otherwise we'll error out due to implicit function
declaration.

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89530
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reported-by: Vadim Rutkovsky <vrutkovs@redhat.com>
Tested-by: Vadim Rutkovsky <vrutkovs@redhat.com>
(cherry picked from commit 771cd266b9)
2015-03-18 21:49:05 +00:00
Rob Clark
0506f69f08 freedreno: update generated headers
Fix a3xx texture layer-size.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e92bc6b38e)
[Emil Velikov: squash trivial conflicts, drop the a4xx.xml.h changes]

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/gallium/drivers/freedreno/a2xx/a2xx.xml.h
	src/gallium/drivers/freedreno/a3xx/a3xx.xml.h
	src/gallium/drivers/freedreno/a4xx/a4xx.xml.h
	src/gallium/drivers/freedreno/adreno_common.xml.h
	src/gallium/drivers/freedreno/adreno_pm4.xml.h
2015-03-18 21:48:40 +00:00
Ilia Mirkin
a563045009 freedreno: fix slice pitch calculations
For example if width were 65, the first slice would get 96 while the
second would get 32. However the hardware appears to expect the second
pitch to be 64, based on halving the 96 (and aligning up to 32).

This fixes texelFetch piglit tests on a3xx below a certain size. Going
higher they break again, but most likely due to unrelated reasons.
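
The width-65 example can be reproduced with a short sketch. The helper
names below (align_up, minify, pitch_from_width, pitch_from_prev) are
hypothetical and only model the arithmetic described above, assuming a
32-pixel pitch alignment; they are not the driver's functions.

```c
#include <stdint.h>

/* Round v up to a multiple of a; a must be a power of two. */
static uint32_t align_up(uint32_t v, uint32_t a)
{
    return (v + a - 1) & ~(a - 1);
}

/* Halve v 'level' times, never going below 1. */
static uint32_t minify(uint32_t v, unsigned level)
{
    uint32_t m = v >> level;
    return m ? m : 1;
}

/* Old behaviour as described: minify the original width, then align. */
static uint32_t pitch_from_width(uint32_t width, unsigned level)
{
    return align_up(minify(width, level), 32);
}

/* Behaviour the hardware appears to expect: halve the previous
 * (already aligned) pitch, then align up to 32 again. */
static uint32_t pitch_from_prev(uint32_t width, unsigned level)
{
    uint32_t pitch = align_up(width, 32);
    for (unsigned i = 0; i < level; i++)
        pitch = align_up(minify(pitch, 1), 32);
    return pitch;
}
```

For width 65, both functions give 96 for the first slice, while the second
slice is 32 with pitch_from_width() and 64 with pitch_from_prev().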

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 620e29b748)
2015-03-18 21:32:21 +00:00
Samuel Iglesias Gonsalvez
b2e243f70c glsl: optimize (0 cmp x + y) into (-x cmp y).
The optimization done by commit 34ec1a24d did not take it into account.

Fixes:

dEQP-GLES3.functional.shaders.random.all_features.fragment.20

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b43bbfa90a)
2015-03-18 21:15:35 +00:00
Iago Toral Quiroga
8c25b0f2d1 i965: Fix out-of-bounds accesses into pull_constant_loc array
The piglit test glsl-fs-uniform-array-loop-unroll.shader_test was designed
to do an out-of-bounds access into a uniform array to make sure that we
handle that situation gracefully inside the driver. However, as Ken describes
in bug 79202, Valgrind reports that this is leading to an out-of-bounds access
in fs_visitor::demote_pull_constants().

Before accessing the pull_constant_loc array we should make sure that
the uniform we are trying to access is valid.
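
A minimal sketch of that kind of guard, assuming a plain int array and a
known element count. The function lookup_pull_constant() and its
parameters are hypothetical, and -1 is used here to mean "no pull
constant slot":

```c
#include <stddef.h>

/* Return the pull-constant location for a uniform, or -1 when the
 * index is outside the array (hypothetical helper, not driver code). */
static int lookup_pull_constant(const int *pull_constant_loc,
                                size_t num_uniforms,
                                size_t uniform_index)
{
    /* Validate the index before touching the array. */
    if (uniform_index >= num_uniforms)
        return -1;

    return pull_constant_loc[uniform_index];
}
```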

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79202
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 6ac1bc90c4)
Nominated-by: Matt Turner <mattst88@gmail.com>
2015-03-11 17:46:03 +00:00
Rob Clark
a91ee1e187 freedreno/ir3: fix silly typo for binning pass shaders
Was resulting in gl_PointSize write being optimized out, causing
particle system type shaders to hang if hw binning enabled.

Fixes neverball, OGLES2ParticleSystem, etc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 60096ed906)
2015-03-11 17:44:38 +00:00
Marek Olšák
977626f10a r300g: fix sRGB->sRGB blits
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c939231e72)
2015-03-11 17:42:52 +00:00
Marek Olšák
b451a2ffbf r300g: fix a crash when resolving into an sRGB texture
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9953586af2)
2015-03-11 17:42:38 +00:00
Marek Olšák
a561eee82c r300g: fix RGTC1 and LATC1 SNORM formats
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 74a757f92f)
2015-03-11 17:42:07 +00:00
Stefan Dösinger
80ef80d087 r300g: Fix the ATI1N swizzle (RGTC1 and LATC1)
This fixes the GL_COMPRESSED_RED_RGTC1 part of piglit's rgtc-teximage-01
test as well as the precision part of Wine's 3dc format test (fd.o bug
89156).

The Z component seems to contain a lower precision version of the
result, probably a temporary value from the decompression computation.
The Y and W component contain different data that depends on the input
values as well, but I could not make sense of them (Not that I tried
very hard).

GL_COMPRESSED_SIGNED_RED_RGTC1 still seems to have precision problems in
piglit, and both formats are affected by a compiler bug if they're
sampled by the shader with a swizzle other than .xyzw. Wine uses .xxxx,
which returns random garbage.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89156
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f710b99071)
2015-03-11 17:41:43 +00:00
Ilia Mirkin
fa8bfb3ed1 freedreno/ir3: get the # of miplevels from getinfo
This fixes ARB_texture_query_levels to actually return the desired
value.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cb3eb43ad6)
2015-03-11 17:41:32 +00:00
Ilia Mirkin
025cf8cb3f freedreno/ir3: fix array count returned by TXQ
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8ac957a51c)
2015-03-11 17:41:20 +00:00
Ilia Mirkin
4db4f70546 freedreno: move fb state copy after checking for size change
Fixes: 1f3ca56b ("freedreno: use util_copy_framebuffer_state()")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f3dfe6513c)
2015-03-11 17:40:59 +00:00
Andrey Sudnik
d4a95ffcda i965/vec4: Don't lose the saturate modifier in copy propagation.
Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89224
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0dfec59a27)
2015-03-07 16:41:16 +00:00
Emil Velikov
97b0219ed5 mesa: rename format_info.c to format_info.h
The file is auto-generated, and #included by formats.c. Let's rename it
to reflect the latter. This will also help us fix the dependency
tracking by adding it to the _SOURCES variable, without the side effect
of it being compiled (twice).

v2: Update .gitignore to reflect the rename.

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 3f6c28f2a9)

Conflicts:
	src/mesa/Makefile.am
	src/mesa/main/.gitignore
2015-03-07 16:40:27 +00:00
Matt Turner
93273f16af r300g: Check return value of snprintf().
Would have at least prevented the crash the previous patch fixed.
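
As an illustration of the check, here is a sketch that detects truncation
via snprintf()'s return value. The helper build_path() is hypothetical
and not the r300g code:

```c
#include <stddef.h>
#include <stdio.h>

/* Join dir and file into buf. Returns 0 on success, -1 if snprintf()
 * failed or the result did not fit (i.e. it was truncated). */
static int build_path(char *buf, size_t size, const char *dir, const char *file)
{
    int n = snprintf(buf, size, "%s/%s", dir, file);

    /* snprintf() returns the length it wanted to write, excluding the
     * terminating NUL; a value >= size means the output was cut short. */
    if (n < 0 || (size_t)n >= size)
        return -1;

    return 0;
}
```

A caller can then bail out early instead of opening a silently truncated
path.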

Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540970
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit ade0b580e7)
2015-03-07 16:37:22 +00:00
Matt Turner
8e8d215cae r300g: Use PATH_MAX instead of limiting ourselves to 100 chars.
When built with Gentoo's package manager, the Mesa source directory
exists seven directories deep. The path to the .test file is too long
and is silently truncated, leading to a crash. Just use PATH_MAX.

Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540970
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit f5e2aa1324)
2015-03-07 16:37:15 +00:00
Daniel Stone
1a929baa0b egl: Take alpha bits into account when selecting GBM formats
This fixes piglit when using PIGLIT_PLATFORM=gbm

Tom Stellard:
  - Fix ARGB2101010 format

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
(cherry picked from commit 65c8965d03)
2015-03-07 16:37:04 +00:00
Marc-Andre Lureau
3a625d0b3f gallium/auxiliary/indices: fix start param
Since commit 28f3f8d, the index generators take a start parameter. However,
some index values were still left starting at 0.
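
To illustrate the class of bug, here is a sketch of a generator that
turns a triangle fan into a triangle list. The function
trifan_to_trilist() is a hypothetical example, not the generated
u_indices code; the point is that the fan centre must be emitted as
start rather than 0.

```c
/* Expand a triangle fan of 'count' vertices beginning at 'start' into
 * triangle-list indices. 'out' must hold at least 3 * (count - 2)
 * entries. Returns the number of indices written. */
static unsigned trifan_to_trilist(unsigned start, unsigned count, unsigned *out)
{
    unsigned n = 0;

    for (unsigned i = 0; i + 2 < count; i++) {
        out[n++] = start;          /* fan centre: start, not a literal 0 */
        out[n++] = start + i + 1;
        out[n++] = start + i + 2;
    }
    return n;
}
```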

This fixes the glean/fbo test with the virgl driver, and copytexsubimage
with freedreno.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 073a5d2e84)
2015-03-07 16:36:47 +00:00
Emil Velikov
944ef59b2f cherry-ignore: add not applicable/rejected commits
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-07 16:36:05 +00:00
Emil Velikov
fc9dd495b2 docs: Add sha256 sums for the 10.4.6 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 19:44:55 +00:00
Emil Velikov
542a754524 Add release notes for the 10.4.6 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 19:23:34 +00:00
Emil Velikov
e559d126f9 Update version to 10.4.6
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 19:16:58 +00:00
Emil Velikov
fc5881ad73 Revert "gallivm: Update for RTDyldMemoryManager becoming an unique_ptr."
This reverts commit 66a3f104a5.

The commit is likely insufficient for normal work with LLVM 3.6.
The full discussion and reason can be found at
http://lists.freedesktop.org/archives/mesa-dev/2015-March/078795.html
2015-03-06 19:16:28 +00:00
Emil Velikov
9508ca24f1 mesa: cherry-pick the second half of commit 2aa71e9485
Missed out by commit 39ae85732d2 (mesa: Fix error validating args for
TexSubImage3D)

Reported-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 19:16:19 +00:00
Matt Turner
644bbf88ec mesa: Correct backwards NULL check.
Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 491d42135a)
2015-03-06 18:45:13 +00:00
Ian Romanick
a369361f9e mesa: Always generate GL_INVALID_OPERATION in _mesa_GetProgramBinary
There are no binary formats supported, so what are you doing?  At least
this gives the application developer some feedback about what's going
on.  The spec gives no guidance about what to do in this scenario.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Leight Bade <leith@mapbox.com>
(cherry picked from commit f591712efe)
2015-03-06 18:44:52 +00:00
Ian Romanick
f1663a5236 mesa: Ensure that length is set to zero in _mesa_GetProgramBinary
v2: Fix assignment of length.  Noticed by Julien Cristau.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Leight Bade <leith@mapbox.com>
(cherry picked from commit 4fd8b30123)
2015-03-06 18:44:37 +00:00
Ian Romanick
e1b5bc9330 mesa: Add missing error checks in _mesa_ProgramBinary
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Leight Bade <leith@mapbox.com>
(cherry picked from commit 201b9c1818)

Conflicts:
	src/mesa/main/shaderapi.c
2015-03-06 18:42:51 +00:00
Emil Velikov
93edf3e7dc Revert "mesa: Correct backwards NULL check."
This reverts commit a598a9bdfe.

The patch was applied without the required dependencies.
2015-03-06 18:40:09 +00:00
José Fonseca
66a3f104a5 gallivm: Update for RTDyldMemoryManager becoming an unique_ptr.
Trivial.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=86958

(cherry picked from commit ef7e0b39a2)
Nominated-by: Sedat Dilek <sedat.dilek@gmail.com>
2015-03-04 01:51:36 +00:00
Abdiel Janulgue
afa7a851da st/mesa: For vertex shaders, don't emit saturate when SM 3.0 is unsupported
There is a bug in the current lowering pass implementation where we lower saturate
to clamp only for vertex shaders on drivers supporting SM 3.0. The correct behavior
is to actually lower to clamp only when we don't support saturate which happens
on drivers that don't support SM 3.0

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit 49e0431211)
Nominated-by: Matt Turner <mattst88@gmail.com>
2015-03-04 01:51:36 +00:00
Abdiel Janulgue
d880aa573c glsl: Don't optimize min/max into saturate when EmitNoSat is set
v3: Fix multi-line comment format (Ian)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit 4ea8c8d56c)
2015-03-04 01:51:36 +00:00
Matt Turner
741aeba26f i965/fs: Don't use backend_visitor::instructions after creating the CFG.
This is a fix for a regression introduced in commit a9f8296d ("i965/fs:
Preserve the CFG in a few more places.").

The errata this code works around is described in a comment before the function:

   "[DevBW, DevCL] Errata: A destination register from a send can not be
    used as a destination register until after it has been sourced by an
    instruction with a different destination register."

The framebuffer write's sources must be in message registers, which SEND
instructions cannot have as a destination. There's no way for this
errata to affect anything at the end of the program. Just remove the
code.

Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84613
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit e214000f25)
2015-03-04 01:51:36 +00:00
Matt Turner
a598a9bdfe mesa: Correct backwards NULL check.
Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 491d42135a)
[Emil Velikov: the patch hunk has a different offset.]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/mesa/main/shaderapi.c
2015-03-04 01:51:36 +00:00
Chris Forbes
0c46d850d9 i965/gs: Check newly-generated GS-out VUE map against correct stage
Previously, we compared our new GS-out VUE map to the existing *VS*-out
VUE map, which is bogus.

This would mostly manifest as redundant dirty flagging where the GS is
in use but the VS and GS output layouts differ; but there is a scary
case where we would fail to flag a GS-out layout change if it happened
to match the VS-out layout.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.5, 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88885
(cherry picked from commit b51ff50a76)
2015-03-04 01:51:36 +00:00
Jonathan Gray
da46b1b160 auxilary/os: correct sysctl use in os_get_total_physical_memory()
The length argument passed to sysctl was the size of the pointer
not the type.  The result of this is sysctl calls would fail on
32 bit BSD/Mac OS X.

Additionally the wrong pointer was passed as an argument to store
the result of the sysctl call.
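
The sizeof mistake can be shown without calling the real sysctl(). In the
sketch below, fake_sysctl_memsize() is a hypothetical stand-in that
rejects buffers smaller than the 64-bit result, and get_total_memory() is
an illustrative caller that passes the size of the type and the address
of the local result variable.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a sysctl query returning a 64-bit value.
 * Fails when the caller's buffer is smaller than the value. */
static int fake_sysctl_memsize(void *oldp, size_t *oldlenp)
{
    const int64_t value = 8LL * 1024 * 1024 * 1024;  /* pretend 8 GiB */

    if (*oldlenp < sizeof(value))
        return -1;

    memcpy(oldp, &value, sizeof(value));
    *oldlenp = sizeof(value);
    return 0;
}

static int get_total_memory(uint64_t *size)
{
    int64_t mem = 0;
    /* Size of the type. sizeof(size) would be the size of a pointer,
     * which is only 4 bytes on 32-bit systems. */
    size_t len = sizeof(mem);

    /* Store into the local variable, not into some other pointer. */
    if (fake_sysctl_memsize(&mem, &len) != 0)
        return -1;

    *size = (uint64_t)mem;
    return 0;
}
```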

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 7983a3d2e0)
2015-03-04 01:51:36 +00:00
Matt Turner
7e723c98ce glsl: Rewrite and fix min/max to saturate optimization.
There were some bugs, and the code was really difficult to follow. We
would optimize

   min(max(x, b), 1.0) into max(sat(x), b)

but not pay attention to the order of min/max and also do

   max(min(x, b), 1.0) into max(sat(x), b)

Corrects four shaders from Champions of Regnum that do

   min(max(x, 1), 10)

and corrects rendering of Mass Effect under VMware Workstation.
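
A small numeric sketch shows why the order of min and max matters. All
helper names here are hypothetical, sat(x) is taken to mean clamping x to
[0, 1], and the comparison only exercises values of b within [0, 1]:

```c
static float minf(float a, float b) { return a < b ? a : b; }
static float maxf(float a, float b) { return a > b ? a : b; }

/* Saturate: clamp x to the range [0, 1]. */
static float sat(float x) { return minf(maxf(x, 0.0f), 1.0f); }

/* min(max(x, b), 1.0): the form that may be rewritten. */
static float min_of_max(float x, float b) { return minf(maxf(x, b), 1.0f); }

/* max(min(x, b), 1.0): the form that must not get the same rewrite. */
static float max_of_min(float x, float b) { return maxf(minf(x, b), 1.0f); }

/* max(sat(x), b): the rewritten expression. */
static float rewritten(float x, float b) { return maxf(sat(x), b); }
```

For x = 0.25 and b = 0.5, min_of_max() and rewritten() both give 0.5,
while max_of_min() gives 1.0. Likewise min(max(x, 1), 10) with x = 5 is
5, whereas sat(5) is 1.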

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89180
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit cb25087c7b)
2015-03-04 01:51:36 +00:00
Andreas Boll
0a51529a28 glx: Fix returned values of GLX_RENDERER_PREFERRED_PROFILE_MESA
If the renderer supports the core profile, the query incorrectly returned
0x8 as the value, because it was using (1U << __DRI_API_OPENGL_CORE) for
the returned value.

The same happened with the compatibility profile. It returned 0x1
(1U << __DRI_API_OPENGL) instead of 0x2.

Internal DRI defines:
   dri_interface.h: #define __DRI_API_OPENGL       0
   dri_interface.h: #define __DRI_API_OPENGL_CORE  3

Those two bits are intended for internal use only and should be
translated to GLX_CONTEXT_CORE_PROFILE_BIT_ARB (0x1) for a preferred
core context profile and GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB (0x2)
for a preferred compatibility context profile.

This patch implements the above translation in the glx module.
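
The translation described above could look roughly like the sketch below.
The defines repeat the values quoted in this message; the function
translate_preferred_profile() is a hypothetical name, and passing unknown
values through unchanged is an assumption of this example.

```c
/* Internal DRI API numbers, as quoted from dri_interface.h above. */
#define __DRI_API_OPENGL       0
#define __DRI_API_OPENGL_CORE  3

/* GLX profile bits, with the values given above. */
#define GLX_CONTEXT_CORE_PROFILE_BIT_ARB           0x00000001
#define GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB  0x00000002

/* Map the internal DRI bit to the GLX profile bit. */
static unsigned translate_preferred_profile(unsigned dri_value)
{
    if (dri_value == (1U << __DRI_API_OPENGL_CORE))
        return GLX_CONTEXT_CORE_PROFILE_BIT_ARB;

    if (dri_value == (1U << __DRI_API_OPENGL))
        return GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB;

    /* Assumption: anything else is returned unchanged. */
    return dri_value;
}
```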

v2: Fix the incorrect behavior in the glx module

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 6d164f65c5)
2015-03-04 01:51:36 +00:00
Leo Liu
2a9e9b5aeb st/omx/dec/h264: fix picture out-of-order with poc type 0 v2
The poc counter should be reset on an IDR frame,
otherwise frames before and after the IDR
are re-ordered incorrectly.

v2: add commit message

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9c7b343bc0)
2015-03-04 01:51:36 +00:00
Emil Velikov
120792fa04 install-lib-links: remove the .install-lib-links file
With the earlier commit (install-lib-links: don't depend on .libs directory)
we moved the location of the file from .libs/ to the current dir.
However, we did not notice that in the former case autotools was
doing us a favour and removing the file. Explicitly remove the file at
clean-local time, otherwise we'll end up with dangling files.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit fece147be5)
2015-03-04 01:51:35 +00:00
Eduardo Lima Mitev
39ae85732d mesa: Fix error validating args for TexSubImage3D
The zoffset and depth values were not being considered when calling
error_check_subtexture_dimensions().
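
A simplified sketch of a check that covers all three axes follows. The
helpers region_ok() and subimage3d_ok() are hypothetical, ignore texture
borders, and are not the Mesa validation code; they only show that
zoffset and depth need the same treatment as the x and y values.

```c
/* One axis: the region [offset, offset + size) must lie inside
 * [0, extent). Borders are ignored in this sketch. */
static int region_ok(int offset, int size, int extent)
{
    return offset >= 0 && size >= 0 && size <= extent &&
           offset <= extent - size;
}

/* All three axes, including zoffset/depth. */
static int subimage3d_ok(int xoffset, int yoffset, int zoffset,
                         int width, int height, int depth,
                         int tex_width, int tex_height, int tex_depth)
{
    return region_ok(xoffset, width, tex_width) &&
           region_ok(yoffset, height, tex_height) &&
           region_ok(zoffset, depth, tex_depth);
}
```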

Fixes 2 dEQP tests:
* dEQP-GLES3.functional.negative_api.texture.texsubimage3d_neg_offset
* dEQP-GLES3.functional.negative_api.texture.texsubimage3d_invalid_offset

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2aa71e9485)
[Emil Velikov: Resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/mesa/main/teximage.c
2015-03-04 01:51:35 +00:00
Marek Olšák
61c1aabb9f radeonsi: fix point sprites
Broken by a27b74819a.

This fix is critical and should be ported to stable ASAP.

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7820a11e3d)

Squashed with commit

radeonsi: fix a warning caused by previous commit

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 050bf75c8b)

[Emil Velikov: The file was renamed si_state_{shaders,draw}.c]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_state_shader.c
2015-03-04 01:51:16 +00:00
Marek Olšák
6da4e66d4e vbo: fix an unitialized-variable warning
It looks like a bug to me.

Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 0feb0b7373)
2015-03-04 00:39:01 +00:00
Brian Paul
7e57411b9a st/mesa: fix sampler view reference counting bug in glDraw/CopyPixels
Use pipe_sampler_view_reference() instead of ordinary assignment.
Also add a new sanity check assertion.
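
The difference between a reference helper and plain assignment can be
sketched with a toy refcounted type. struct view, view_reference() and
views_destroyed are hypothetical names for illustration only; the sketch
is not thread-safe and does not reproduce pipe_sampler_view_reference()
itself.

```c
#include <stddef.h>

/* Toy refcounted object (hypothetical). */
struct view {
    int refcount;
};

/* Counts objects whose refcount dropped to zero; a real helper would
 * destroy the object at that point. */
static int views_destroyed;

/* Make *dst point at src, keeping both reference counts correct.
 * A plain "*dst = src" would skip both adjustments. */
static void view_reference(struct view **dst, struct view *src)
{
    if (*dst == src)
        return;

    if (src)
        src->refcount++;               /* new holder of src */

    if (*dst && --(*dst)->refcount == 0)
        views_destroyed++;             /* last holder of the old view */

    *dst = src;
}
```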

Fixes piglit gl-1.0-drawpixels-color-index test crash.  But note
that the test still fails.

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 62a8883f32)
2015-03-04 00:38:31 +00:00
Brian Paul
1e6735ead1 swrast: fix multiple color buffer writing
If a fragment program wrote to more than one color buffer, the
first fragment color got replicated to all dest buffers.  This
fixes 5 piglit FBO tests, including fbo-drawbuffers-arbfp.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45348
Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 89c96afe3c)
2015-03-04 00:38:23 +00:00
Lucas Stach
deea686c71 install-lib-links: don't depend on .libs directory
This snippet can be included in Makefiles that may, depending on the
project configuration, not actually build any installable libraries.

In that case we don't have anything to depend on and this part of
the makefile may be executed before the .libs directory is created,
so do not depend on it being there.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
(cherry picked from commit 5c1aac17ad)
2015-03-04 00:38:11 +00:00
Emil Velikov
41bdeda102 docs: Add sha256 sums for the 10.4.5 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-21 12:31:51 +00:00
Emil Velikov
a5c608e951 Add release notes for the 10.4.5 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-21 12:22:08 +00:00
Emil Velikov
e0276bc297 Update version to 10.4.5
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-21 12:17:35 +00:00
Michel Dänzer
dc16fb1969 Revert "radeon/llvm: enable unsafe math for graphics shaders"
This reverts commit 0e9cdedd2e.

It caused the grass to disappear in The Talos Principle.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89069
Cc: "10.5 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 4db985a5fa)
2015-02-18 12:17:44 +00:00
Kenneth Graunke
aaa823569b glsl: Reduce memory consumption of copy propagation passes.
opt_copy_propagation and opt_copy_propagation_elements create new ACP
and Kill sets each time they enter a new control flow block.  For if
blocks, they also copy the entire existing ACP set contents into the
new set.

When we exit the control flow block, we discard the new sets.  However,
we weren't freeing them - so they lived on until the pass finished.
This can waste a lot of memory (57MB on one pessimal shader).

This patch makes the pass allocate ACP entries using this->acp as the
memory context, and Kill entries out of this->kill.  It also steals
kill entries when moving them from the inner kill list to the parent.

It then frees the lists, including their contents.

v2: Move ralloc_free(this->acp) just before this->acp = orig_acp
    (suggested by Eric Anholt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.5 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 76960a55e6)
2015-02-18 12:17:44 +00:00
Laura Ekstrand
f57b41758d main: Fixed _mesa_GetCompressedTexImage_sw to copy slices correctly.
Previously array textures were not working with GetCompressedTextureImage,
leading to failures in the test
arb_direct_state_access/getcompressedtextureimage.c.

Tested-by: Laura Ekstrand <laura@jlekstrand.net>
Reviewed-by: Brian Paul <brianp@vmware.com>

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 92163482bd)
2015-02-18 12:17:44 +00:00
Marek Olšák
67ac6a3951 radeonsi: fix a crash if a stencil ref state is set before a DSA state
+ minor indentation fixes

Discovered by Axel Davy.

This can't be reproduced with any app, because all state trackers set a DSA
state first.

Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 2ead74888a)
2015-02-18 12:17:44 +00:00
Marek Olšák
5d04b9eeed mesa: fix AtomicBuffer typo in _mesa_DeleteBuffers
Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit e8625a29fe)
2015-02-18 12:17:43 +00:00
Marek Olšák
53041aecef radeonsi: small fix in SPI state
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

(cherry picked from commit a27b74819a)
[Emil Velikov: The file was renamed si_state_{shaders,draw}.c]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
        src/gallium/drivers/radeonsi/si_state_shader.c
2015-02-18 12:14:04 +00:00
Ilia Mirkin
f76bcbb4cd nvc0: allow holes in xfb target lists
Tested with a modified xfb-streams test which outputs to streams 0, 2,
and 3.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 854eb06bee)
2015-02-18 12:09:55 +00:00
Ilia Mirkin
89289934fc st/mesa: treat resource-less xfb buffers as if they weren't there
If a transform feedback buffer's size is 0, st_bufferobj_data doesn't
end up creating a buffer for it. There's no point in trying to write to
such a buffer, so just pretend as if it's not really there.

This fixes arb_gpu_shader5-xfb-streams-without-invocations on nvc0.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 80d373ed5b)
2015-02-18 12:09:54 +00:00
Ilia Mirkin
dbf82d753b nvc0: bail out of 2d blits with non-A8_UNORM alpha formats
This fixes the teximage-colors uploads with GL_ALPHA format and
non-GL_UNSIGNED_BYTE type.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 68e4f3f572)
2015-02-18 12:09:54 +00:00
Emil Velikov
b786e6332b get-pick-list.sh: Require explicit "10.4" for nominating stable patches
A nomination unadorned with a specific version is now interpreted as
being aimed at the 10.5 branch, which was recently opened.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-18 12:09:54 +00:00
Carl Worth
c0ce908a90 Revert use of Mesa IR optimizer for ARB_fragment_programs
Commit f82f2fb3dc added use of the Mesa
IR optimizer for both ARB_fragment_program and ARB_vertex_program, but
only justified the vertex-program portions with measured performance
improvements.

Meanwhile, the optimizer was seen to generate hundreds of unused
immediates without discarding them, causing failures.

Discard the use of the optimizer for now to fix the regression. (In
the future, we anticipate things moving from Mesa IR to NIR for better
optimization anyway.)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82477

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

CC: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 55a57834bf)
2015-02-18 12:09:54 +00:00
Kenneth Graunke
c83c5f4b69 i965: Fix integer border color on Haswell.
+82 Piglits - 100% of border color tests now pass on Haswell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 08a06b6b89)
2015-02-18 12:09:54 +00:00
Kenneth Graunke
f2663112f6 i965: Use a gl_color_union for sampler border color.
This should have no effect, but will make it easier to implement other
bug fixes.

v2: Eliminate "unsigned one" local; just use the value where necessary.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit e1e73443c5)
2015-02-18 12:09:54 +00:00
Kenneth Graunke
2ad93851ff i965: Override swizzles for integer luminance formats.
The hardware's integer luminance formats are completely unusable;
currently we fall back to RGBA.  This means we need to override
the texture swizzle to obtain the XXX1 values expected for luminance
formats.

Fixes spec/EXT_texture_integer/texwrap formats bordercolor [swizzled]
on Broadwell - 100% of border color tests now pass on Broadwell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8cb18760cc)
2015-02-18 12:09:54 +00:00
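
The swizzle override described in this commit can be illustrated in plain C. This is only a sketch under stated assumptions: the type and function names (swz, texel, apply_swizzle, luminance_swizzle) are hypothetical and are not the i965 driver's API. It shows how reading an RGBA fallback texel through an XXX1 swizzle yields the values expected for a luminance format.

```c
#include <stdint.h>

/* Hypothetical swizzle selectors: four channels plus constant 0 and 1. */
enum swz { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_ZERO, SWZ_ONE };

/* An integer texel as stored in the RGBA fallback format. */
struct texel { uint32_t c[4]; };

/* Resolve one selector against a source texel. */
static uint32_t pick(const struct texel *t, enum swz s)
{
    switch (s) {
    case SWZ_ZERO: return 0;
    case SWZ_ONE:  return 1;
    default:       return t->c[s];   /* SWZ_X..SWZ_W index channels 0..3 */
    }
}

/* Apply a four-component swizzle to a texel. */
static struct texel apply_swizzle(const struct texel *in, const enum swz s[4])
{
    struct texel out;
    for (int i = 0; i < 4; i++)
        out.c[i] = pick(in, s[i]);
    return out;
}

/* Luminance reads as (L, L, L, 1): the "XXX1" swizzle. */
static const enum swz luminance_swizzle[4] = { SWZ_X, SWZ_X, SWZ_X, SWZ_ONE };
```

With a source texel of (7, 99, 99, 99), apply_swizzle() with luminance_swizzle returns (7, 7, 7, 1), regardless of what the fallback format stores in the other channels.
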
Michel Dänzer
e35e6773c2 st/mesa: Don't use PIPE_USAGE_STREAM for GL_PIXEL_UNPACK_BUFFER_ARB
The latter currently implies CPU read access, so only PIPE_USAGE_STAGING
can be expected to be fast.

Mesa demos src/tests/streaming_rect on Kaveri (radeonsi):

Unpatched:  42 frames in  1.023 seconds = 41.056 FPS
Patched:   615 frames in  1.000 seconds = 615.000 FPS

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88658
Cc: "10.3 10.4" <mesa-stable@lists.freedestkop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a338dc0186)
2015-02-18 12:09:54 +00:00
Marek Olšák
51bdd19c97 radeonsi: fix instanced arrays with non-zero start instance
Fixes piglit ARB_base_instance/arb_base_instance-drawarrays.

Cc: 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 50908a8918)
2015-02-18 12:09:54 +00:00
Marek Olšák
5c623ff071 r600g,radeonsi: don't append to streamout buffers that haven't been used yet
The FILLED_SIZE counter is uninitialized at the beginning, so we can't use it.
Instead, use offset = 0, which is what we always do when not appending.

This unexpectedly fixes spec/ARB_texture_multisample/sample-position/*.
Yes, the test does use transform feedback.

Cc: 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 658f1d4cfe)
2015-02-18 12:09:53 +00:00
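
The rule in this commit (only trust the FILLED_SIZE counter once the buffer has actually been used) can be sketched as follows. The struct, its filled_size_valid flag, and the function name are hypothetical illustrations, not the r600g/radeonsi code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical streamout target state. */
struct so_target {
    bool     filled_size_valid;  /* false until the buffer has been written once */
    uint32_t filled_size;        /* meaningless while filled_size_valid is false */
};

/* Choose the starting offset for a streamout write.
 * Appending only makes sense if the counter has been initialized;
 * otherwise fall back to offset 0, as in the non-append case. */
static uint32_t so_start_offset(const struct so_target *t, bool append)
{
    if (append && t->filled_size_valid)
        return t->filled_size;
    return 0;
}
```
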
Jeremy Huddleston Sequoia
654f197f19 darwin: build fix
xfont.c:237:14: error: implicit declaration of function 'GetGLXDRIDrawable' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
   glxdraw = GetGLXDRIDrawable(CC->currentDpy, CC->currentDrawable);
             ^
Fixes regression from 291be28476

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
(cherry picked from commit e68b67b53f)
2015-02-11 00:24:04 -08:00
Jeremy Huddleston Sequoia
162cee83ba darwin: build fix
../../../src/mesa/main/compiler.h:47:10: fatal error: 'util/macros.h' file not found

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
(cherry picked from commit 1c67a5687a)
2015-02-10 20:35:33 -08:00
Emil Velikov
54da987bae docs: Add sha256 sums for the 10.4.4 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-07 00:47:18 +00:00
Emil Velikov
62eb27ac8b Add release notes for the 10.4.4 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-07 00:17:09 +00:00
Emil Velikov
a824179af5 Update version to 10.4.4
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-07 00:12:04 +00:00
Park, Jeongmin
fecedb6c43 st/osmesa: Fix osbuffer->textures indexing
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88930
Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 6fd4a61ad6)
2015-02-04 01:37:33 +00:00
Matt Turner
9d1d1f46c7 gallium/util: Don't use __builtin_clrsb in util_last_bit().
Unclear circumstances lead to undefined symbols on x86.

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=536916
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 32e98e8ef0)
2015-02-04 01:37:20 +00:00
José Fonseca
b51d369690 egl: Pass the correct X visual depth to xcb_put_image().
The dri2_x11_add_configs_for_visuals() function happily matches a 32-bit
EGLconfig with a 24-bit X visual.  However, it was passing a 32-bit
depth to xcb_put_image(), making the X server unhappy:

  https://github.com/apitrace/apitrace/issues/313#issuecomment-70571911

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 11a955aef4)
2015-02-02 00:12:04 +00:00
Niels Ole Salscheider
eab8dc28ed configure: Link against all LLVM targets when building clover
Since 8e7df519bd, we initialise all targets in
clover. This fixes bug 85380.

v2: Mention correct bug in commit message

Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4b94c3fc31)
2015-02-02 00:12:04 +00:00
Ville Syrjälä
cc580045a8 i965: Fix max_wm_threads for CHV
Change max_wm_threads to match the spec on CHV. The max number of
threads in 3DSTATE_PS is always programmed to 64 and the hardware
internally scales that depending on the GT SKU. So this doesn't
change the max number of threads actually used, but it does affect
the scratch space calculation.

On CHV the old value was too small, so the amount of scratch space
allocated wasn't sufficient to satisfy the actual max number of
threads used.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit 99754446ab)
2015-02-02 00:12:04 +00:00
Mario Kleiner
0d721fa1d6 glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)
Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>

v2: Add Frank Binns' Signed-off-by for his earlier original
patch from April 2014, which is identical to this one, and
Chris Wilson's Reviewed-by tag from May 2014 for that patch, ergo
also for this one.

v3: Incorporate comment about triple buffering as suggested
by Axel Davy, and reference to relevant spec provided by
Eric Anholt.

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 455d3036fa)
2015-02-02 00:12:04 +00:00
Brian Paul
c96ed76b3d mesa: fix display list 8-byte alignment issue
The _mesa_dlist_alloc() function is only guaranteed to return a pointer
with 4-byte alignment.  On 64-bit systems which don't support unaligned
loads (e.g. SPARC or MIPS) this could lead to a bus error in the VBO code.

The solution is to add a new _mesa_dlist_alloc_aligned() function which
will return a pointer to an 8-byte aligned address on 64-bit systems.
This is accomplished by inserting a 4-byte NOP instruction in the display
list when needed.

The only place this actually matters is the VBO code where we need to
allocate a 'struct vbo_save_vertex_list' which needs to be 8-byte
aligned (just as if it were malloc'd).

The gears demo and others hit this bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88662
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 53b01938ed)
2015-01-30 08:51:51 -07:00
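
The NOP-padding idea from this commit can be sketched in a simplified form. The assumptions here are loud ones: the list is modelled as a flat array of 4-byte words, each node is one opcode word followed by its payload, and the names (struct dlist, OP_NOP, dlist_alloc_aligned) are hypothetical. Unlike the commit, which only aligns on 64-bit systems, this sketch always aligns.

```c
#include <stddef.h>
#include <stdint.h>

enum { DLIST_WORDS = 64 };
enum { OP_NOP = 1, OP_PAYLOAD = 2 };   /* hypothetical opcodes */

struct dlist {
    _Alignas(8) uint32_t words[DLIST_WORDS];  /* storage in 4-byte units */
    size_t pos;                               /* index of the next free word */
};

/* Reserve one opcode word plus payload_words words and return a pointer
 * to the payload, guaranteed to be 8-byte aligned. Returns NULL when the
 * list is full. */
static void *dlist_alloc_aligned(struct dlist *dl, uint32_t opcode,
                                 size_t payload_words)
{
    /* The payload starts at word pos + 1. It is 8-byte aligned only when
     * pos + 1 is even, so pad with a one-word NOP when pos is even. */
    if (dl->pos % 2 == 0) {
        if (dl->pos >= DLIST_WORDS)
            return NULL;
        dl->words[dl->pos++] = OP_NOP;
    }
    if (dl->pos + 1 + payload_words > DLIST_WORDS)
        return NULL;

    dl->words[dl->pos] = opcode;
    void *payload = &dl->words[dl->pos + 1];
    dl->pos += 1 + payload_words;
    return payload;
}
```

The padding costs at most one word per allocation and keeps the rest of the list format unchanged, since a NOP is just another node for the interpreter to skip.
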
Emil Velikov
49a5bce780 docs: Add sha256 sums for the 10.4.3 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-24 12:54:33 +00:00
Emil Velikov
e92bfa3f95 Add release notes for the 10.4.3 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-24 12:49:17 +00:00
Emil Velikov
f70e4d4afd Update version to 10.4.3
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-24 12:44:46 +00:00
Axel Davy
42806f12a9 st/nine: Allocate vs constbuf buffer for indirect addressing once.
When the shader does indirect addressing on the constants,
we allocate a temporary constant buffer to which we copy
the constants from the app given user constants and
the constants filled in the shader.

This patch makes this buffer be allocated once.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f8a74410f1)
2015-01-23 00:47:26 +00:00
Axel Davy
4c9b64fc44 st/nine: Allocate the correct size for the user constant buffer
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e0f75044c8)
2015-01-23 00:47:26 +00:00
Axel Davy
69c7cf70e7 st/nine: Add variables containing the size of the constant buffers
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b9cbea9dbc)
2015-01-23 00:47:26 +00:00
Axel Davy
4d04fd0871 st/nine: Fix sm3 relative addressing for non-debug build
Relative addressing needs the constant buffer to get all
the correct constants, even those defined by the shader.

The code to copy the shader constants to the constant buffer
was enabled only for debug build. Enable it always.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit a721987077)
2015-01-23 00:47:25 +00:00
Axel Davy
0727ab961c st/nine: Remove unused code for ps
Since constant indirect addressing is not allowed for ps,
we can remove our code to handle that.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4b7a9cfddb)
2015-01-23 00:47:25 +00:00
Axel Davy
7280ddea9d st/nine: Correct rules for relative addressing and constants.
Relative addressing for constants is possible only for vs float
constants.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9690bf33d7)
2015-01-23 00:47:25 +00:00
Axel Davy
425bc89720 st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGB
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bce94ce831)
2015-01-23 00:47:25 +00:00
Axel Davy
0b3f8c72f7 st/nine: Implement TEXDP3TEX
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9e23b64c15)
2015-01-23 00:47:25 +00:00
Axel Davy
63e668eb18 st/nine: Implement TEXDP3
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 09eb1e901f)
2015-01-23 00:47:24 +00:00
Axel Davy
2b4c577730 st/nine: Implement TEXDEPTH
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f19e699368)
2015-01-23 00:47:24 +00:00
Axel Davy
e3a393b4c3 st/nine: Implement TEXM3x3SPEC
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3676ab02fb)
2015-01-23 00:47:24 +00:00
Axel Davy
7ecd0f9528 st/nine: Implement TEXM3x2TEX
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2b9f079ae3)
2015-01-23 00:47:24 +00:00
Axel Davy
336887bca1 st/nine: implement TEXM3x2DEPTH
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fdff111dc8)
2015-01-23 00:47:24 +00:00
Axel Davy
8e08ba6f96 st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC
The fix is that this line:
"src[s] = tx->regs.vT[s];" is wrong if s doesn't start from 0.
Instead access tx->regs.vT directly when needed.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7865210670)

Conflicts:
	src/gallium/state_trackers/nine/nine_shader.c
2015-01-23 00:47:09 +00:00
Axel Davy
77e1136f44 st/nine: Fill missing dst and src number for some instructions.
Not filling them correctly results in bad padding and later crash.

Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b1259544e3)
2015-01-23 00:44:42 +00:00
Axel Davy
22c75f9f5a st/nine: Implement TEXCOORD special behaviours
texcoord for ps < 1_4 should clamp the values between 0 and 1.

texcrd (texcoord ps 1_4) does not clamp and can be used with
two modifiers, _dw and _dz, which mean the channels are divided
by w or z.
Implement those in shared code, since the same modifiers can be used
for texld ps 1_4.

v2: replace DIV by RCP + MUL
v3: Remove a useless MOV

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5399119fb1)

Conflicts:
	src/gallium/state_trackers/nine/nine_shader.c
2015-01-23 00:43:57 +00:00
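
The two behaviours described in this commit can be modelled in plain C. This is a sketch with hypothetical names (vec4, coord_modifier, texcoord_pre_1_4, texcrd_1_4), not the nine shader translator. It assumes the _dz and _dw modifiers divide only the x and y channels, and it mirrors the v2 note by computing one reciprocal and multiplying instead of dividing per channel.

```c
struct vec4 { float x, y, z, w; };

enum coord_modifier { MOD_NONE, MOD_DZ, MOD_DW };

/* Clamp a value to [0, 1]. */
static float saturate(float v)
{
    if (v < 0.0f) return 0.0f;
    if (v > 1.0f) return 1.0f;
    return v;
}

/* texcoord for ps < 1_4: every channel is clamped to [0, 1]. */
static struct vec4 texcoord_pre_1_4(struct vec4 t)
{
    struct vec4 r = { saturate(t.x), saturate(t.y), saturate(t.z), saturate(t.w) };
    return r;
}

/* texcrd for ps 1_4: no clamping, optional division by z or w.
 * Assumption: only x and y are divided. */
static struct vec4 texcrd_1_4(struct vec4 t, enum coord_modifier mod)
{
    if (mod != MOD_NONE) {
        float rcp = 1.0f / (mod == MOD_DW ? t.w : t.z);  /* RCP */
        t.x *= rcp;                                      /* MUL */
        t.y *= rcp;                                      /* MUL */
    }
    return t;
}
```
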
Axel Davy
4b65be8860 st/nine: Fix some fixed function pipeline operation
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6378d74937)
2015-01-22 23:43:28 +00:00
Axel Davy
9ea8e7f0df st/nine: Clamp ps 1.X constants
This is Wine (and Windows) behaviour.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 018407b5d8)
2015-01-22 23:43:28 +00:00
Axel Davy
d0d09a4eee st/nine: Fix CND implementation
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3ca67f8810)
2015-01-22 23:43:27 +00:00
Axel Davy
75f39e45f0 st/nine: Rewrite LOOP implementation, and a0 aL handling
Previous implementation didn't work well with nested loops.

Instead of using several address registers, put a0 and aL
into normal registers, and copy them to one address register when
we need to use them.

Wine tests loop_index_test() and nested_loop_test() now pass correctly.

Fixes r600g crash while loading Bioshock -
bug https://bugs.freedesktop.org/show_bug.cgi?id=85696

Tested-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6a8e5e48be)
2015-01-22 23:43:27 +00:00
Axel Davy
553089093f st/nine: Correct LOG on negative values
We should take the absolute value of the input.

Also return -FLT_MAX instead of -Inf for an input of 0.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c9aa9a0add)
2015-01-22 23:43:27 +00:00
Axel Davy
add30f01ef st/nine: Handle NRM with input of null norm
When the input's xyz are 0.0, the output
should be 0.0. This is due to the fact that
Inf * 0 = 0 for dx9. To handle this case,
cap the result of RSQ to FLT_MAX. We have
FLT_MAX * 0 = 0.

Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f5e8e3fb80)
2015-01-22 23:43:27 +00:00
Axel Davy
0dfb9c9e86 st/nine: Handle RSQ special cases
We should use the absolute value of the input as input to ureg_RSQ.

Moreover, an input of 0.0 should return FLT_MAX.

Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2487f73574)
2015-01-22 23:43:27 +00:00
Axel Davy
7e26cf83ba st/nine: Fix POW implementation
POW doesn't map directly to TGSI, since we should
take the absolute value of src0.

Fixes black textures in some games

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c12f8c2088)
2015-01-22 23:43:27 +00:00
Axel Davy
00d22ce0fa st/nine: Fix typo for M4x4
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit e0dd9ca985)
2015-01-22 23:43:26 +00:00
Axel Davy
7f700cc35b st/nine: Correctly declare NineTranslateInstruction_Mkxn inputs
Let's say we have c1 and c2 declared in the shader and c0 given by the app.

Then here we would have read c0, c1 and c2 as given by the app, instead
of the correct c0 (from the app) and c1, c2 (from the shader).

This correction fixes several issues in some games.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 53dc992f20)
2015-01-22 23:43:26 +00:00
Axel Davy
e6167e749c st/nine: Saturate oFog and oPts vs outputs
According to docs and Wine, these two vs outputs have
to be saturated.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9fb58a74a0)
2015-01-22 23:43:26 +00:00
Axel Davy
bce0058333 st/nine: Remove some shader unused code
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a214838181)
2015-01-22 23:43:26 +00:00
Axel Davy
9a0647ba7f st/nine: Convert integer constants to floats before storing them when cards don't support integers
The shader code is already behaving as if they are floats when the card doesn't support integers.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d08c7b0b88)
2015-01-22 23:43:26 +00:00
Axel Davy
669c5d6d44 st/nine: Rework of boolean constants
Convert them to shader booleans at an earlier stage.
The previous code is fine, but a later patch will make
integers be converted at an earlier stage, so do
the same for booleans.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d9d18fe39f)
2015-01-22 23:43:26 +00:00
Axel Davy
87ac37074f st/nine: Add ATI1 and ATI2 support
Adds ATI1 and ATI2 support to nine.

They map to PIPE_FORMAT_RGTC1_UNORM and PIPE_FORMAT_RGTC2_UNORM,
but need special handling.

Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 77f0ecf9ce)
2015-01-22 23:43:25 +00:00
Axel Davy
e1bcca4f13 st/nine: Check if srgb format is supported before trying to use it.
According to MSDN, we must act as if the user didn't ask for sRGB if we
don't support it.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b0b5430322)
2015-01-22 23:43:25 +00:00
Stanislaw Halik
50ea1c1f5f st/nine: Hack to generate resource if it doesn't exist when getting view
Buffers in the MANAGED pool are supposed to keep their content in a RAM buffer,
with a copy in VRAM if there is enough memory (the driver manages memory and
decides when to delete the buffer in VRAM).

This is not implemented properly in nine: a VRAM copy is created when the RAM
buffer is filled, and the VRAM copy is kept in sync with updates to the RAM
buffer.

Due to some issues (in the implementation or in app logic), it can happen
that we try to create a sampler view of the resource before we have created the
VRAM resource. This hack creates the resource when we hit this case, which prevents
crashing, but doesn't help with the resource content.

This fixes several games crashing at launch.

Acked-by: Axel Davy <axel.davy@ens.fr>
Acked-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Stanislaw Halik <sthalik@misaki.pl>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 82810d3b66)
2015-01-22 23:43:25 +00:00
Axel Davy
3ca8b93476 st/nine: NineBaseTexture9: update sampler view creation
While the previous code generally behaved correctly,
this new code is more readable (without checking all gallium formats
manually) and has a more defined behaviour for depth stencil resources.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 47280d777d)
2015-01-22 23:43:25 +00:00
Axel Davy
d06b403377 st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of bad format
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 0abfb80dac)
2015-01-22 23:43:12 +00:00
Axel Davy
481af42f28 st/nine: Fix crash when deleting non-implicit swapchain
The implicit swapchains are destroyed when the device instance is
destroyed. However, for non-implicit swapchains this is not the case,
and the application may have kept a reference to the swapchain
buffers to reuse them.

Fixes problems with battle.net launcher.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 0d2c22e648)
2015-01-22 23:41:09 +00:00
Axel Davy
393fffd07d st/nine: CubeTexture: fix GetLevelDesc
This->surfaces contains the surfaces associated to the levels
and faces. This->surfaces[6*Level] is what we want here,
since it gives us a face descriptor for the level 'Level'.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9232161178)
2015-01-22 23:41:08 +00:00
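
The indexing described in this commit can be written out as a small helper. This is a sketch that assumes the surfaces array is laid out level-major with six faces per level; the function name cube_surface_index is hypothetical.

```c
#include <stddef.h>

enum { CUBE_FACES = 6 };

/* Index of a (level, face) surface in a flat array that stores the six
 * faces of level 0 first, then the six faces of level 1, and so on. */
static size_t cube_surface_index(unsigned level, unsigned face)
{
    return (size_t)level * CUBE_FACES + face;
}
```

Under that layout, a level descriptor can be taken from face 0 of the level, that is cube_surface_index(Level, 0), which equals 6*Level.
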
Axel Davy
c159b4095c st/nine: NineBaseTexture9: fix setting of last_layer
Use settings similar to u_sampler_view_default_template

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 18c7e70226)
2015-01-22 23:41:08 +00:00
Axel Davy
b80b5b35a3 st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS
The cap means D3DFVF_XYZRHW vertices will see clipping.
This is not the case when
PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION is supported, since
it'll disable clipping.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 05e20e1045)
2015-01-22 23:41:08 +00:00
Xavier Bouchoux
41ca03a7b4 st/nine: Fix D3DRS_POINTSPRITE support
It's done by testing the existence of the point sprite output register *after* parsing the vertex shader.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dc88989189)
2015-01-22 23:41:08 +00:00
Axel Davy
18ac34825b st/nine: Add new texture format strings
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d2f2a550cf)
2015-01-22 23:41:07 +00:00
Xavier Bouchoux
15ef84ccfb st/nine: Add missing c++ declaration for IDirect3DVolumeTexture9
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 072e2ba8e1)
2015-01-22 23:41:07 +00:00
Xavier Bouchoux
44ee59d300 st/nine: Additional defines to d3dtypes.h
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8bb550b958)
2015-01-22 23:41:07 +00:00
Jose Fonseca
1e0ab5b826 nine: Drop use of TGSI_OPCODE_CND.
This was the only state tracker emitting it, and hardware was just having
to lower it anyway (or failing to lower it at all).

v2: Extracted from a larger patch by Jose (which also dropped DP2A), fixed
    to actually not reference TGSI_OPCODE_CND.  Change by anholt.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
(cherry picked from commit 925cb75f89)
2015-01-22 23:40:09 +00:00
Jonathan Gray
a3381286d8 glsl: Link glsl_test with pthreads library.
Otherwise pthread_mutex_lock will be an undefined reference
on OpenBSD.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88219
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c5be9c126d)
2015-01-22 22:27:12 +00:00
Kenneth Graunke
882f702441 i965: Work around mysterious Gen4 GPU hangs with minimal state changes.
Gen4 hardware appears to GPU hang frequently when using Chromium, and
also when running 'glmark2 -b ideas'.  Most of the error states contain
3DPRIMITIVE commands in quick succession, with very few state packets
between them - usually VERTEX_BUFFERS/ELEMENTS and CONSTANT_BUFFER.

I trimmed an apitrace of the glmark2 hang down to two draw calls with a
glUniformMatrix4fv call between the two.  Either draw by itself works
fine, but together, they hang the GPU.  Removing the glUniform call
makes the hangs disappear.  In the hardware state, this translates to
removing the CONSTANT_BUFFER packet between the two 3DPRIMITIVE packets.

Flushing before emitting CONSTANT_BUFFER packets also appears to make
the hangs disappear.  I observed a slowdown in glxgears by doing it all
the time, so I've chosen to only do it when BRW_NEW_BATCH and
BRW_NEW_PSP are unset (i.e. we haven't done a CS_URB_STATE change or
already flushed the whole pipeline).

I'd much rather understand the problem, but at this point, I don't see
how we'd ever be able to track it down further.  We have no real tools,
and the hardware people moved on years ago.  I've analyzed 20+ error
states and read every scrap of documentation I could find.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80568
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85367
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c4fd0c9052)
2015-01-22 16:11:03 +00:00
Jason Ekstrand
a25e26f67f mesa: Fix clamping to -1.0 in snorm_to_float
This patch fixes the return of a wrong value when x is lower than
-MAX_INT(src_bits), as the result would not be in the range [-1.0, 1.0].

v2 by Samuel Iglesias <siglesias@igalia.com>:
    - Modify snorm_to_float() to avoid doing the division when
      x == -MAX_INT(src_bits)

Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 7d1b08ac44)
2015-01-17 14:59:56 +00:00
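
The clamping described in this commit can be sketched as follows. The names max_int and snorm_to_float_sketch are hypothetical stand-ins for the MAX_INT macro and snorm_to_float() mentioned in the message, and the sketch assumes 2 <= src_bits <= 32. Without the clamp, an 8-bit input of -128 would convert to -128/127, which lies below -1.0.

```c
#include <stdint.h>

/* Largest positive value of a signed integer with `bits` bits.
 * Assumes 2 <= bits <= 32. */
static int32_t max_int(unsigned bits)
{
    return (int32_t)((UINT32_C(1) << (bits - 1)) - 1);
}

/* Convert a signed normalized integer to a float in [-1.0, 1.0]. */
static float snorm_to_float_sketch(int32_t x, unsigned src_bits)
{
    int32_t max = max_int(src_bits);

    /* Both -max and the extra most-negative value map to -1.0,
     * and the multiplication is skipped for them. */
    if (x <= -max)
        return -1.0f;
    return (float)x * (1.0f / (float)max);
}
```
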
Kenneth Graunke
021d71b848 i965: Respect the no_8 flag on Gen6, not just Gen7+.
When doing repclears, we only want to use the SIMD16 program, not the
SIMD8 one.  Kristian added this to the Gen7+ code, but apparently we
missed it in the Gen6 code.  This patch copies that code over.

Approximately doubles the performance in a clear microbenchmark from
mesa-demos (clearspd -width 500 -height 500 +color) on Sandybridge.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
References: https://code.google.com/p/chrome-os-partner/issues/detail?id=34681
(cherry picked from commit f95733ddb7)

Conflicts:
	src/mesa/drivers/dri/i965/gen6_wm_state.c
2015-01-17 14:59:08 +00:00
Emil Velikov
14f1659b43 docs: Add sha256 sums for the 10.4.2 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-12 10:37:09 +00:00
Emil Velikov
02f2e97c3e Add release notes for the 10.4.2 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-12 10:30:28 +00:00
Emil Velikov
5906dd6c99 Update version to 10.4.2
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-12 10:24:59 +00:00
Dave Airlie
2d05942b74 r600g/sb: implement r600 gpr index workaround. (v3.1)
r600, rv610 and rv630 all have a bug in their GPR indexing
and how the hw inserts access to PV.

If the base index for the src is the same as the dst gpr
in a previous group, then it will use PV instead of using
the indexed gpr correctly.

The workaround is to insert a NOP when you detect this.

v2: add second part of fix detecting DST rel writes followed
by same src base index reads.

v3: forget adding stuff to structs, just iterate over the
previous node group again, makes it more obvious.
v3.1: drop local_nop.

Fixes ~200 piglit regressions on rv635 since SB was introduced.

Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 3c8ef3a74b)
2015-01-07 17:39:52 +00:00
Dave Airlie
099ed78a04 r600g: fix regression since UCMP change
Since d8da6decea where the
state tracker started using UCMP on cayman a number of tests
regressed.

This seems to be because r600g is doing CNDGE_INT for UCMP, which tests >= 0;
we should be doing CNDE_INT with reversed arguments.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0d4272cd8e)
2015-01-07 17:35:39 +00:00
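The opcode mix-up in the UCMP commit above can be modelled in plain C. The three helper functions below are assumptions that encode my reading of the opcodes (UCMP selects on "non-zero", CNDE_INT on "equal to zero", CNDGE_INT on ">= 0"); they are not taken from the r600g sources.

```c
#include <stdint.h>

/* UCMP as assumed here: src1 when src0 is non-zero, otherwise src2. */
static int32_t ucmp(int32_t src0, int32_t src1, int32_t src2)
{
    return src0 != 0 ? src1 : src2;
}

/* CNDE_INT as assumed here: src1 when src0 == 0, otherwise src2. */
static int32_t cnde_int(int32_t src0, int32_t src1, int32_t src2)
{
    return src0 == 0 ? src1 : src2;
}

/* CNDGE_INT as assumed here: src1 when src0 >= 0, otherwise src2. */
static int32_t cndge_int(int32_t src0, int32_t src1, int32_t src2)
{
    return src0 >= 0 ? src1 : src2;
}

/* UCMP expressed through CNDE_INT by swapping the two value operands. */
static int32_t ucmp_via_cnde(int32_t src0, int32_t src1, int32_t src2)
{
    return cnde_int(src0, src2, src1);
}
```

Under these assumptions, cndge_int disagrees with ucmp for a zero or negative condition, whereas ucmp_via_cnde matches it for every input.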
Vadim Girlin
91c5770ba1 r600g/sb: fix issues with loops created for switch
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit de0fd375f6)
2015-01-07 17:31:12 +00:00
Dave Airlie
3306ed6fd7 Revert "r600g/sb: fix issues cause by GLSL switching to loops for switch"
This reverts commit 7b0067d23a.

Vadim's patch fixes this a lot better.

(cherry picked from commit 34e512d9ea)
2015-01-07 17:29:01 +00:00
Marek Olšák
81f8006f7d radeonsi: fix VertexID for OpenGL
This fixes all failing piglit VertexID tests.

Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit d7c6f397f4)
2015-01-07 17:25:06 +00:00
Marek Olšák
1b498cf5b7 st/mesa: fix GL_PRIMITIVE_RESTART_FIXED_INDEX
Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit eaae92a349)
2015-01-07 17:04:21 +00:00
Marek Olšák
8c77be7ef9 vbo: ignore primitive restart if FixedIndex is enabled in DrawArrays
From GL 4.4 Core profile:

  If both PRIMITIVE_RESTART and PRIMITIVE_RESTART_FIXED_INDEX are
  enabled, the index value determined by PRIMITIVE_RESTART_FIXED_INDEX is
  used. If PRIMITIVE_RESTART_FIXED_INDEX is enabled, primitive restart is not
  performed for array elements transferred by any drawing command not taking a
  type parameter, including all of the *Draw* commands other than
  *DrawElements*.

Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 8f5d309521)
2015-01-07 16:51:02 +00:00
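The spec wording quoted in the vbo commit above boils down to a small decision, sketched here in C. The function and parameter names are hypothetical; only the rule itself (fixed-index restart never applies to draws without a type parameter) comes from the quoted text.

```c
#include <stdbool.h>

/* Decide whether primitive restart should be performed for a draw call.
 *
 * restart_enabled:      PRIMITIVE_RESTART is enabled
 * fixed_index_enabled:  PRIMITIVE_RESTART_FIXED_INDEX is enabled
 * draw_has_index_type:  true for *DrawElements* style calls, false for
 *                       calls such as DrawArrays that take no type
 */
static bool restart_applies(bool restart_enabled,
                            bool fixed_index_enabled,
                            bool draw_has_index_type)
{
    /* The fixed-index mode wins when both are enabled, and it only
     * makes sense for draws that carry an index type. */
    if (fixed_index_enabled)
        return draw_has_index_type;

    return restart_enabled;
}
```

So a DrawArrays call with both enables set performs no restart, which is the case the commit addresses.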
Leonid Shatz
ef43d21bbc gallium/util: make sure cache line size is not zero
The "normal" detection (querying clflush size) already made sure it is
non-zero, however another method did not. This led to crashes if this
value happened to be zero (apparently can happen in virtualized environments
at least).
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=87913

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5fea39ace3)
2015-01-06 16:21:03 +00:00
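A defensive pattern along the lines of the cache line commit above might look like the following C sketch. The fallback of 64 bytes and both function names are assumptions made for illustration; the commit message does not say which default Mesa uses.

```c
#include <stddef.h>

/* Assumed fallback; the value Mesa actually uses is not stated above. */
#define FALLBACK_CACHELINE 64u

/* Never hand a zero cache line size to callers, whatever detection says. */
static unsigned sanitize_cacheline(unsigned detected)
{
    return detected != 0 ? detected : FALLBACK_CACHELINE;
}

/* Example consumer: rounding a size up to a multiple of the cache line
 * would divide by zero if an unchecked zero slipped through. */
static size_t round_up_to_cacheline(size_t size, unsigned detected)
{
    size_t line = sanitize_cacheline(detected);
    return (size + line - 1) / line * line;
}
```

The second function shows why a zero value is dangerous: any alignment arithmetic based on it would otherwise fault.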
Roland Scheidegger
ac3ca98a1b gallium/util: fix crash with daz detection on x86
The code used PIPE_ALIGN_VAR for the variable used by fxsave, however this
does not work if the stack isn't aligned. Hence use PIPE_ALIGN_STACK function
decoration to fix the segfault which can happen if stack alignment is only
4 bytes.
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=87658.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b59c7ed0ab)
2015-01-06 16:02:10 +00:00
Ilia Mirkin
af1a690075 nv50/ir: fix texture offsets in release builds
asserts get compiled out in release builds, so they can't be relied
upon to perform logic.

Reported-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Roy Spliet <rspliet@eclipso.eu>
Cc: "10.2 10.3 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fb1afd1ea5)
2015-01-06 15:52:12 +00:00
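The pitfall named in the nv50/ir commit above is general C behaviour: when NDEBUG is defined, assert() expands to nothing, so any work done inside its argument is skipped. The sketch below is a generic illustration with hypothetical names and an assumed offset range of -8..7; it is not the nv50 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper with a side effect: validates and stores an offset. */
static bool parse_offset(int raw, int *out)
{
    if (raw < -8 || raw > 7)
        return false;
    *out = raw;
    return true;
}

/* Fragile: with -DNDEBUG the call disappears together with the assert,
 * so `value` keeps its initial 0 in release builds. */
static int offset_fragile(int raw)
{
    int value = 0;
    assert(parse_offset(raw, &value));
    return value;
}

/* Robust: the call always runs; only the check is compiled out. */
static int offset_robust(int raw)
{
    int value = 0;
    bool ok = parse_offset(raw, &value);
    assert(ok);
    (void)ok; /* avoid an unused-variable warning when NDEBUG is set */
    return value;
}
```

Building this once with and once without -DNDEBUG shows offset_fragile changing its result while offset_robust does not.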
Chad Versace
fffe533f08 i965: Use safer pointer arithmetic in gather_oa_results()
This patch reduces the likelihood of pointer arithmetic overflow bugs in
gather_oa_results(), like the one fixed by b69c7c5dac.

I haven't yet encountered any overflow bugs in the wild along this
patch's codepath. But I get nervous when I see code patterns like this:

   (void*) + (int) * (int)

I smell 32-bit overflow all over this code.

This patch retypes 'snapshot_size' to 'ptrdiff_t', which should fix any
potential overflow.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 414be86c96)
2015-01-04 21:39:10 +00:00
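The "(void*) + (int) * (int)" concern from the gather_oa_results() commit above can be illustrated with a small C sketch. The function names are hypothetical; the only idea borrowed from the commit is giving the size operand the type ptrdiff_t so that the multiplication is carried out in that type rather than in int.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of snapshot `index`. Because snapshot_size is ptrdiff_t,
 * `index` is converted to ptrdiff_t before the multiplication, so on
 * platforms where ptrdiff_t is wider than int the product cannot wrap
 * at 32 bits. */
static ptrdiff_t snapshot_offset(int index, ptrdiff_t snapshot_size)
{
    return index * snapshot_size;
}

/* Address of snapshot `index` inside a buffer starting at `base`. */
static uint8_t *snapshot_at(void *base, int index, ptrdiff_t snapshot_size)
{
    return (uint8_t *)base + snapshot_offset(index, snapshot_size);
}
```

On a target where ptrdiff_t has the same width as int this changes nothing, so the benefit is limited to platforms with a wider ptrdiff_t, such as typical 64-bit ones.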
Chad Versace
4d5e0f78b7 i965: Use safer pointer arithmetic in intel_texsubimage_tiled_memcpy()
This patch reduces the likelihood of pointer arithmetic overflow bugs in
intel_texsubimage_tiled_memcpy() , like the one fixed by b69c7c5dac.

I haven't yet encountered any overflow bugs in the wild along this
patch's codepath. But I recently solved, in commit b69c7c5dac, an overflow
bug in a line of code that looks very similar to pointer arithmetic in
this function.

This patch conceptually applies the same fix as in b69c7c5dac. Instead
of retyping the variables, though, this patch adds some casts. (I tried
to retype the variables as ptrdiff_t, but it quickly got very messy. The
casts are cleaner).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 225a09790d)
2015-01-04 21:39:00 +00:00
Marek Olšák
b9e56ea151 glsl_to_tgsi: fix a bug in copy propagation
This fixes the new piglit test: arb_uniform_buffer_object/2-buffers-bug

Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 48094d0e65)
2015-01-04 21:38:26 +00:00
Kenneth Graunke
e05c595acd i965: Fix start/base_vertex_location for >1 prims but !BRW_NEW_VERTICES.
This is a partial revert of c89306983c.
It split the {start,base}_vertex_location handling into several steps:

1. Set brw->draw.start_vertex_location = prim[i].start
   and brw->draw.base_vertex_location = prim[i].basevertex.
   (This happened once per _mesa_prim, in the main drawing loop.)
2. Add brw->vb.start_vertex_bias and brw->ib.start_vertex_offset
   appropriately.  (This happened in brw_prepare_shader_draw_parameters,
   which was called just after brw_prepare_vertices, as part of state
   upload, and only happened when BRW_NEW_VERTICES was flagged.)
3. Use those values when emitting 3DPRIMITIVE (once per _mesa_prim).

If we drew multiple _mesa_prims, but didn't flag BRW_NEW_VERTICES on
the second (or later) primitives, we would do step #1, but not #2.
The first _mesa_prim would get correct values, but subsequent ones
would only get the first half of the summation.

The reason I originally did this was because I needed the value of
gl_BaseVertexARB to exist in a buffer object prior to uploading
3DSTATE_VERTEX_BUFFERS.  I believed I wanted to upload the value
of 3DPRIMITIVE's "Base Vertex Location" field, which was computed
as: (prims[i].indexed ? prims[i].start : prims[i].basevertex) +
brw->vb.start_vertex_bias.  The latter value wasn't available until
after brw_prepare_vertices, and the former weren't available in the
state upload code at all.  Hence the awkward split.

However, I believe that including brw->vb.start_vertex_bias was a
mistake.  It's an extra bias we apply when uploading vertex data into
VBOs, to move [min_index, max_index] to [0, max_index - min_index].

From the GL_ARB_shader_draw_parameters specification:
"<gl_BaseVertexARB> holds the integer value passed to the <baseVertex>
 parameter to the command that resulted in the current shader
 invocation.  In the case where the command has no <baseVertex>
 parameter, the value of <gl_BaseVertexARB> is zero."

I conclude that gl_BaseVertexARB should only include the baseVertex
parameter from glDraw*Elements*, not any internal biases we add for
optimization purposes.

With that in mind, gl_BaseVertexARB only needs prim[i].start or
prim[i].basevertex.  We can simply store that, and go back to computing
start_vertex_location and base_vertex_location in brw_emit_prim(), like
we used to.  This is much simpler, and should actually fix two bugs.

Fixes missing geometry in Unvanquished.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85529
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit c633528cba)
2015-01-04 21:38:16 +00:00
Ilia Mirkin
c48d0d8dd2 nv50,nvc0: set vertex id base to index_bias
Fixes the piglits which check that gl_VertexID includes the base vertex
offset:
  arb_draw_indirect-vertexid elements
  gl-3.2-basevertex-vertexid

Note that this leaves out the original G80, for which this will continue
to fail. It could be fixed by passing a driver constbuf value in, but
that's beyond the scope of this change.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit be0311c962)
2015-01-04 21:37:51 +00:00
Tiziano Bacocco
aafd13027a nv50,nvc0: implement half_pixel_center
LAST_LINE_PIXEL has actually been renamed to PIXEL_CENTER_INTEGER in
rnndb; use that method to implement the rasterizer setting, used for
st/nine.

Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 609c3e51f5)
2015-01-04 21:37:32 +00:00
Michel Dänzer
1f42230fa7 radeonsi: Don't modify PA_SC_RASTER_CONFIG register value if rb_mask == 0
E.g. this could happen on older kernels which don't support the
RADEON_INFO_SI_BACKEND_ENABLED_MASK query yet. The code in
si_write_harvested_raster_configs() doesn't deal with this correctly and
would probably mangle the value badly.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit b3057f8097)
2015-01-04 21:34:08 +00:00
Kenneth Graunke
2b85ed72db i965: Add missing BRW_NEW_*_PROG_DATA to texture/renderbuffer atoms.
This was probably missed when moving from a fixed binding table layout
to a dynamic one that changes based on the shader.

Fixes newly proposed Piglit test fbo-mrt-new-bind.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87619
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Mike Stroyan <mike@LunarG.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4616b2ef85)
2015-01-04 21:33:26 +00:00
Emil Velikov
4cd38a592e docs: Add sha256 sums for the 10.4.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-30 02:38:02 +00:00
Emil Velikov
60e2e04fe8 Add release notes for the 10.4.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-30 02:11:34 +00:00
Emil Velikov
1a3df8cc77 Update version to 10.4.1
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-30 02:07:33 +00:00
Emil Velikov
45416a255f Revert "glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)"
This reverts commit ee241a6889.

May not be the correct fix. Discussion is ongoing.

http://lists.freedesktop.org/archives/mesa-dev/2014-December/072969.html
2014-12-30 01:03:14 +00:00
Cody Northrop
fb3f7c0bc5 i965: Require pixel alignment for GPU copy blit
The blitter will start at a pixel's natural alignment. For PBOs, if the
provided offset is not aligned, bits will get dropped.

This change adds offset alignment check for src and dst, kicking back if
the requirements are not met.

The change is based on following verbiage from BSPEC:
 Color pixel sizes supported are 8, 16, and 32 bits per pixel (bpp).
 All pixels are naturally aligned.

Found in the following locations:
page 35 of intel-gfx-prm-osrc-hsw-blitter.pdf
page 29 of ivb_ihd_os_vol1_part4.pdf
page 29 of snb_ihd_os_vol1_part5.pdf

This behavior was observed with Steam Big Picture rendering incorrect
icon colors.  The fix has been tested on Ubuntu and SteamOS on Haswell.

Signed-off-by: Cody Northrop <cody@lunarg.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83908
Reviewed-by: Neil Roberts <neil@linux.intel.com>
(cherry picked from commit 83e8bb5b1a)
Nominated-by: Matt Turner <mattst88@gmail.com>
2014-12-21 21:19:31 +00:00
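The alignment requirement from the GPU copy blit commit above amounts to a modulo check on both offsets. The C sketch below uses a hypothetical function name; the supported pixel sizes (8, 16 and 32 bpp, i.e. 1, 2 or 4 bytes per pixel) come from the quoted BSPEC text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true when both byte offsets sit on a pixel boundary, meaning
 * the blitter can be used; otherwise the caller should fall back to
 * another path. `cpp` is the pixel size in bytes. */
static bool blit_offsets_aligned(uint32_t src_offset,
                                 uint32_t dst_offset,
                                 uint32_t cpp)
{
    assert(cpp == 1 || cpp == 2 || cpp == 4);
    return src_offset % cpp == 0 && dst_offset % cpp == 0;
}
```

For example, a 32 bpp copy from byte offset 2 is rejected, while a 16 bpp copy from the same offset is accepted.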
Ian Romanick
4f570f2fb3 linker: Assign varying locations geometry shader inputs for SSO
Previously only geometry shader outputs would be assigned locations if
the geometry shader was the only stage in the linked program.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: pavol@klacansky.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82585
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit a909b995d9)
Nominated-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-21 21:18:09 +00:00
Ian Romanick
a4c8348597 linker: Wrap access of producer_var with a NULL check
producer_var could be NULL if consumer_var is not NULL and
consumer_is_fs is false.  This will occur when the producer is NULL and
the consumer is the geometry shader for a program that contains only a
geometry shader.  This will occur starting with the next patch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: pavol@klacansky.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82585
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 5eca78a00a)
Nominated-by: Ian Romanick <ian.d.romanick@intel.com>
2014-12-21 21:17:45 +00:00
Maxence Le Doré
893583776e glsl: Add gl_MaxViewports to available builtin constants
It seems to have been forgotten when viewport arrays were implemented.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 19e05d6898)
2014-12-21 21:17:24 +00:00
Andres Gomez
2d669f6583 i965/brw_reg: struct constructor now needs explicit negate and abs values.
We were assuming, when constructing a new brw_reg struct, that the
negate and abs register modifiers would not be present by default in
the new register.

Now, we force explicitly setting these values when constructing a new
register.

This will avoid problems like forgetting to properly set them when we
are using a previous register to generate this new register, as it was
happening in the dFdx and dFdy generation functions.

Fixes piglit test shaders/glsl-deriv-varyings

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82991
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 8517e665bc)
2014-12-21 21:17:16 +00:00
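The idea behind the brw_reg commit above, making the constructor demand the negate and abs modifiers instead of silently defaulting them, can be sketched in C like this. The struct and functions are simplified, hypothetical stand-ins and not the real brw_reg API.

```c
#include <stdbool.h>

/* Simplified register description; the real struct has many more fields. */
struct reg_sketch {
    unsigned nr;
    bool negate;
    bool abs;
};

/* Every caller has to state both source modifiers explicitly. */
static struct reg_sketch make_reg(unsigned nr, bool negate, bool absolute)
{
    struct reg_sketch r = { nr, negate, absolute };
    return r;
}

/* Deriving a register from an existing one: because make_reg() requires
 * the modifiers, carrying them over cannot be forgotten by accident. */
static struct reg_sketch reg_like(struct reg_sketch src, unsigned nr)
{
    return make_reg(nr, src.negate, src.abs);
}
```

The design choice is to trade a slightly longer call for a compile-time reminder at every construction site.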
Mario Kleiner
bccfe7ae0f glx/dri3: Don't fail on glXSwapBuffersMscOML(dpy, window, 0, 0, 0) (v2)
glXSwapBuffersMscOML() with target_msc=divisor=remainder=0 gets
translated into target_msc=divisor=0 but remainder=1 by the mesa
api. This is done for server DRI2 where there needs to be a way
to tell the server-side DRI2ScheduleSwap implementation if a call
to glXSwapBuffers() or glXSwapBuffersMscOML(dpy,window,0,0,0) was
done. remainder = 1 was (ab)used as a flag to tell the server to
select proper semantic. The DRI3/Present backend ignored this
signalling, treated any target_msc=0 as glXSwapBuffers() request,
and called xcb_present_pixmap with invalid divisor=0, remainder=1
combo. The present extension responded kindly to this with a
BadValue error and dropped the request, but mesa's DRI3/Present
backend doesn't check for error codes. From there on stuff went
downhill quickly for the calling OpenGL client...

This patch fixes the problem.

v2: Change comments to be more clear, with reference to
relevant spec, as suggested by Eric Anholt.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 0d7f4c8658)
2014-12-14 15:45:27 +00:00
Mario Kleiner
ee241a6889 glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)
Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>

v2: Add Frank Binns' Signed-off-by for his original earlier
patch from April 2014, which is identical to this one, and
Chris Wilson's Reviewed-by tag from May 2014 for that patch, ergo
also for this one.

v3: Incorporate comment about triple buffering as suggested
by Axel Davy, and reference to relevant spec provided by
Eric Anholt.

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 455d3036fa)
2014-12-14 15:45:21 +00:00
Mario Kleiner
4b37a18da5 glx/dri3: Track separate (ust, msc) for PresentPixmap vs. PresentNotifyMsc (v2)
Prevent calls to glXGetSyncValuesOML() and glXWaitForMscOML()
from overwriting the (ust,msc) values of the last successful
swapbuffers call (PresentPixmapCompleteNotify event), as
glXWaitForSbcOML() relies on those values corresponding to
the most recent completed swap, not to whatever was last
returned from the server.

Problematic call sequence without this patch would have been, e.g.,

glXSwapBuffers()
... wait ...
swap completes -> PresentPixmapComplete event -> (ust,msc)
updated to reflect swap completion time and count.
... wait for at least 1 video refresh cycle/vblank increment.

glXGetSyncValuesOML()
-> PresentNotifyMsc event overwrites (ust,msc) of swap
completion with (ust,msc) of most recent vblank

glXWaitForSbcOML()
-> Returns sbc of last completed swap but (ust,msc) of last
completed vblank, not of last completed swap.
-> Client is confused.

Do this by tracking a separate set of (ust, msc) for the
dri3_wait_for_msc() call than for the dri3_wait_for_sbc()
call.

This makes the glXWaitForSbcOML() call robust again and restores
consistent behaviour with the DRI2 implementation.

Fixes applications originally written and tested against
DRI2 which also rely on this not regressing under DRI3/Present,
e.g., Neuro-Science software like Psychtoolbox-3.

This patch fixes the problem.

v2: Rename vblank_msc/ust to notify_msc/ust as suggested by
Axel Davy for better clarity.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit ad8b0e8bf6)
2014-12-14 15:45:15 +00:00
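The separation described in the (ust, msc) tracking commit above can be sketched as two independent pairs of fields. The notify_ust/notify_msc names follow the v2 note; the struct, the handler functions and the sample values are hypothetical and only replay the problematic call sequence from the commit message.

```c
#include <stdint.h>

struct drawable_timing_sketch {
    int64_t recv_sbc;               /* count of the last completed swap */
    int64_t ust, msc;               /* timing of the last completed swap */
    int64_t notify_ust, notify_msc; /* timing from the last MSC query */
};

/* Swap completion event: updates the swap timing only. */
static void on_pixmap_complete(struct drawable_timing_sketch *d,
                               int64_t sbc, int64_t ust, int64_t msc)
{
    d->recv_sbc = sbc;
    d->ust = ust;
    d->msc = msc;
}

/* MSC notification event: updates the separate notify pair only. */
static void on_notify_msc(struct drawable_timing_sketch *d,
                          int64_t ust, int64_t msc)
{
    d->notify_ust = ust;
    d->notify_msc = msc;
}

/* Replays: swap completes at msc 60, then a later vblank query at 61.
 * Returns the msc that a wait-for-sbc style call would report. */
static int64_t swap_msc_after_vblank_query(void)
{
    struct drawable_timing_sketch d = {0};
    on_pixmap_complete(&d, 1, 1000, 60);
    on_notify_msc(&d, 2000, 61);
    return d.msc;
}
```

With a single shared pair the last function would report the later vblank count instead of the swap's own count.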
Mario Kleiner
93f6f55983 glx/dri3: Fix glXWaitForSbcOML() to handle targetSBC==0 correctly. (v2)
targetSBC == 0 is a special case, which asks the function
to block until all pending OpenGL bufferswap requests have
completed.

Currently the function just falls through for targetSBC == 0,
returning bogus results.

This breaks applications originally written and tested against
DRI2 which also rely on this not regressing under DRI3/Present,
e.g., Neuro-Science software like Psychtoolbox-3.

This patch fixes the problem.

v2: Simplify as suggested by Axel Davy. Add comments proposed
by Eric Anholt.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 8cab54de16)
2014-12-14 15:45:10 +00:00
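The targetSBC == 0 special case from the glXWaitForSbcOML() commit above can be expressed as a small helper. This is a sketch under the assumption that the driver keeps a count of submitted swaps (called send_sbc here) and of completed swaps (recv_sbc); the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* A target of 0 means "all swaps submitted so far", so substitute the
 * count of submitted swaps; any other target is used as given. */
static int64_t effective_target_sbc(int64_t target_sbc, int64_t send_sbc)
{
    return target_sbc == 0 ? send_sbc : target_sbc;
}

/* True while the caller still has to block waiting for completions. */
static bool must_keep_waiting(int64_t target_sbc,
                              int64_t send_sbc,
                              int64_t recv_sbc)
{
    return recv_sbc < effective_target_sbc(target_sbc, send_sbc);
}
```

With five swaps submitted and three completed, a target of 0 keeps waiting until the completed count reaches five.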
Emil Velikov
af0c82099b docs: Add 10.4 sha256 sums, news item and link release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-14 13:57:54 +00:00
Emil Velikov
5fe79b0b12 docs: Update 10.4.0 release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-14 13:45:54 +00:00
Emil Velikov
45f3aa0bc7 Bump version to 10.4.0 (final)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-14 13:32:44 +00:00
Alexander von Gluck IV
90239276ff mesa/drivers: Add missing mesautil lib to Haiku swrast
* Resolves missing util_format_linear_to_srgb_8unorm_table symbol.

(cherry picked from commit ad2ffd3bc6)
2014-12-11 13:54:54 +00:00
Roland Scheidegger
57868b1ee4 llvmpipe: fix lp_test_arit denorm handling
llvmpipe disables denorms on purpose (on x86/sse only), because denorms are
generally neither required nor desired for graphic apis (and in case of d3d10,
they are forbidden).
However, this caused some arithmetic tests using denorms to fail on some
systems, because the reference did not generate the same results anymore.
(It did not fail on all systems - behavior of these math functions is sort
of undefined when called with non-standard floating point mode, hence the
result differing depending on implementation and in particular the sse
capabilities.)
So, for the reference, simply flush all (input/output) denorms manually
to zero in this case.

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=67672.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 8148a06b8f)
Nominated-by: Matt Turner <mattst88@gmail.com>
2014-12-11 13:54:54 +00:00
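Flushing denorms in a reference computation, as described in the lp_test_arit commit above, can be sketched with the standard fpclassify() macro. This is an illustration with hypothetical names; the actual test code may detect denorms differently, for instance by inspecting the exponent bits.

```c
#include <math.h>

/* Replace a denormal (subnormal) value with a zero of the same sign. */
static float flush_denorm_to_zero(float x)
{
    if (fpclassify(x) == FP_SUBNORMAL)
        return x < 0.0f ? -0.0f : 0.0f;
    return x;
}

/* Reference multiply that flushes both inputs and the output, so it
 * agrees with hardware running in flush-to-zero mode. */
static float reference_mul(float a, float b)
{
    a = flush_denorm_to_zero(a);
    b = flush_denorm_to_zero(b);
    return flush_denorm_to_zero(a * b);
}
```

A product such as 2e-38f * 0.25f, which would normally be a denormal result, comes out as zero here.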
Marek Olšák
fe2eac2237 docs/relnotes: document the removal of GALLIUM_MSAA
Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ac319d94d3)
2014-12-11 13:54:54 +00:00
Matt Turner
db784a09f1 i965: Disable unlit-centroid workaround on Gen < 6.
Ever since the original commit (8313f444) added the workaround, we have been
enabling it on gens <= 7, even though gens <= 5 can't do multisampling.

I cannot find documentation that says that Sandybridge needs this
workaround but in practice disabling it causes these piglit tests to
fail:

EXT_framebuffer_multisample/interpolation {2,4} centroid-deriv{,-disabled}

On Ironlake:

total instructions in shared programs: 4358478 -> 4349671 (-0.20%)
instructions in affected programs:     117680 -> 108873 (-7.48%)

A bunch of shaders in TF2, Portal 2, and L4D2 are cut by 25~30%.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit 1a2de7dce8)
2014-12-11 13:54:53 +00:00
Dave Airlie
d9f4aaa095 r600g: only init GS_VERT_ITEMSIZE on r600
On evergreen there are 4 regs, on r600/700 there is only one.

Don't initialise regs and trash someone else's state.

Not sure this fixes anything, but hey one less stupid.

Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.3 10.4" mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 7f21cf7198)
2014-12-11 13:54:53 +00:00
Timothy Arceri
e340a28dba mesa: use build flag to ensure stack is realigned on x86
Nowadays GCC assumes stack pointer is 16-byte aligned even on 32-bits, but that is an assumption OpenGL drivers (or any dynamic library for that matter) can't afford to make as there are many closed- and open- source application binaries out there that only assume 4-byte stack alignment.

V4: fix comment and indentation

V3: move all sse4.1 build flag config to the same location
 and add comment as to why we need to do the realign

V2: use $target_cpu rather than $host_cpu
  and setup build flags in config rather than makefile

https://bugs.freedesktop.org/show_bug.cgi?id=86788
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Matt Turner <mattst88@gmail.com>
CC: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f1b5f2b157)
2014-12-11 13:54:53 +00:00
Tom Stellard
6b908efd58 radeonsi: Program RASTER_CONFIG for harvested GPUs v5
Harvested GPUs have some of their render backends disabled, so
in order to prevent the hardware from trying to render things
with these disabled backends we need to correctly program
the PA_SC_RASTER_CONFIG register.

v2:
  - Write RASTER_CONFIG for all SEs.

v3:
  - Set GRBM_GFX_INDEX.INSTANCE_BROADCAST_WRITES bit.
  - Set GRBM_GFX_INDEX.SH_BROADCAST_WRITES bit when done setting
    PA_SC_RASTER_CONFIG.
  - Get num_se and num_sh_per_se from kernel.

v4:
  - Get correct value for num_se
  - Remove loop for setting PA_SC_RASTER_CONFIG
  - Only compute raster config when a backend has been disabled.

v5: Michel Dänzer
  - Fix computation for chips with multiple SEs

https://bugs.freedesktop.org/show_bug.cgi?id=60879

CC: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 67dcbcd92c)
2014-12-11 13:54:53 +00:00
Abdiel Janulgue
65f03e6733 ir_to_mesa: Remove sat to clamp lowering pass
Fixes an infinite loop in swrast where the lowering pass unpacks saturate into
clamp but the opt_algebraic pass tries to do the opposite.

v3 (Ian):
This is a revert of commit cfa8c1cb "ir_to_mesa: lower ir_unop_saturate" on
the ir_to_mesa.cpp portion. prog_execute.c can handle saturates in vertex
shaders, so classic swrast shouldn't need this lowering pass.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83463
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit 39f7b72428)
2014-12-11 13:54:53 +00:00
Chris Forbes
ffaf58e7d0 i965/Gen6-7: Fix point sprites with PolygonMode(GL_POINT)
This was an oversight in the original patch. When PolygonMode is
used, then front faces, back faces, or both may be rendered as
points and are affected by point sprite state.

Note that SNB/IVB can't actually be fully conformant here, for
a legacy context -- we don't have separate sets of pointsprite
enables for front and back faces. Haswell ignores pointsprite
state correctly in hardware for non-point rasterization, so can
do this correctly, but it doesn't seem worth it.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86764
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit ed56c16820)
2014-12-11 13:54:53 +00:00
Ben Widawsky
bb9dea8a29 i965/gs: Avoid DW * DW mul
The GS has an interesting use for mul. Because the GS can emit multiple
vertices per input vertex, and it also has a unique count at the top of the URB
payload, the GS unit needs to be able to dynamically specify URB write offsets
(relative to the global offset). The documentation in the function has a very
good explanation from Paul on the mechanics.

This fixes around 2000 piglit tests on BSW.

v2:
Reworded commit message (Ben) no mention of CHV (Matt)
Change SHRT_MAX to USHRT_MAX (Ken, and Matt)
Update comment in code to reflect the use of UW (Ben)
Add Gen7+ assertion for the relevant GS code, since it won't work on Gen6- (Ken)
Drop the bogus hunk in emit_control_data_bits() (Ken)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84777 (with many dupes)
Cc: "10.4 10.3 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit f13870db09)
2014-12-11 13:54:53 +00:00
José Fonseca
be59440b53 util/primconvert: Avoid point arithmetic; apply offset on all cases.
Matches what u_vbuf_get_minmax_index() does.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit f9098f0972)
2014-12-11 13:54:52 +00:00
Ilia Mirkin
ac8d596498 util/primconvert: take ib offset into account
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit c3bed13604)
2014-12-11 13:54:52 +00:00
Ilia Mirkin
112d2fdb17 util/primconvert: support instanced rendering
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit fb434e675f)
2014-12-11 13:54:52 +00:00
Ilia Mirkin
c6353cee0c util/primconvert: pass index bias through
The index_bias (aka base_vertex) applies to the downstream draw just as
much, since the actual index values are never modified.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 1dfa039168)
2014-12-11 13:54:52 +00:00
158 changed files with 2923 additions and 742 deletions

View File

@@ -1 +1 @@
10.4.0-rc4
10.4.7

View File

@@ -1,2 +1,18 @@
# No whitespace commits in stable.
a10bf5c10caf27232d4df8da74d5c35c23eb883d
a10bf5c10caf27232d4df8da74d5c35c23eb883d
# The following patches address code which is missing in 10.4
# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078515.html
06084652fefe49c3d6bf1b476ff74ff602fdc22a common: Correct texture init for meta pbo uploads and downloads.
# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078547.html
ccc5ce6f72c1ec86be4dfcef96c0b51fba0faa6d common: Correct PBO 2D_ARRAY handling.
# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078549.html
546aba143d13ba3f993ead4cc30b2404abfc0202 common: Fix PBOs for 1D_ARRAY.
# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078501.html
2b2fa1865248c6e3b7baec81c4f92774759b201f mesa: Indent break statements and add a missing one.
# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078502.html
87109acbed9c9b52f33d58ca06d9048d0ac7a215 mesa: Free memory allocated for luminance in readpixels.

View File

@@ -14,7 +14,7 @@ git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)' HEAD..origin/master |\
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*10\.4.*mesa-stable\)' HEAD..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.

View File

@@ -252,8 +252,16 @@ AC_SUBST([VISIBILITY_CXXFLAGS])
dnl
dnl Optional flags, check for compiler support
dnl
SSE41_CFLAGS="-msse4.1"
dnl Code compiled by GCC with -msse* assumes a 16 byte aligned
dnl stack, but on x86-32 such alignment is not guaranteed.
case "$target_cpu" in
i?86)
SSE41_CFLAGS="$SSE41_CFLAGS -mstackrealign"
;;
esac
save_CFLAGS="$CFLAGS"
CFLAGS="-msse4.1 $CFLAGS"
CFLAGS="$SSE41_CFLAGS $CFLAGS"
AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
#include <smmintrin.h>
int main () {
@@ -266,6 +274,7 @@ if test "x$SSE41_SUPPORTED" = x1; then
DEFINES="$DEFINES -DUSE_SSE41"
fi
AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
dnl Can't have static and shared libraries, default to static if user
dnl explicitly requested. If both disabled, set to static since shared
@@ -1707,7 +1716,7 @@ if test "x$enable_gallium_llvm" = xyes; then
fi
if test "x$enable_opencl" = xyes; then
-LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
+LLVM_COMPONENTS="${LLVM_COMPONENTS} all-targets ipo linker instrumentation"
# LLVM 3.3 >= 177971 requires IRReader
if $LLVM_CONFIG --components | grep -qw 'irreader'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} irreader"
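The two configure checks above can be tried outside autoconf as plain shell. This is a minimal sketch under stated assumptions: `sse41_cflags_for` and `has_irreader` are hypothetical helper names, the CPU name is passed as an argument instead of coming from `$target_cpu`, and the component list is passed as a string instead of being read from `$LLVM_CONFIG --components`.

```shell
#!/bin/sh
# Hypothetical helper mirroring the SSE4.1 stanza: start from -msse4.1 and
# add -mstackrealign only for 32-bit x86 targets (i386, i486, i586, i686).
sse41_cflags_for() {
    flags="-msse4.1"
    case "$1" in
    i?86)
        flags="$flags -mstackrealign"
        ;;
    esac
    echo "$flags"
}

# Hypothetical helper mirroring the irreader check: succeed only if the
# whitespace-separated component list contains "irreader" as a whole word.
has_irreader() {
    printf '%s\n' "$1" | grep -qw 'irreader'
}

sse41_cflags_for i686     # 32-bit x86: realign flag is added
sse41_cflags_for x86_64   # 64-bit x86: only -msse4.1

if has_irreader "ipo linker instrumentation irreader"; then
    echo "irreader available"
fi
```

The `i?86` glob matches exactly one character between `i` and `86`, so `x86_64` falls through the `case` and keeps the plain `-msse4.1` flag.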

View File

@@ -16,6 +16,13 @@
<h1>News</h1>
<h2>December 14, 2014</h2>
<p>
<a href="relnotes/10.4.html">Mesa 10.4</a> is released. This is a new
development release. See the release notes for more information about
the release.
</p>
<h2>November 8, 2014</h2>
<p>
<a href="relnotes/10.3.3.html">Mesa 10.3.3</a> is released.

View File

@@ -21,6 +21,7 @@ The release notes summarize what's new or changed in each Mesa release.
</p>
<ul>
<li><a href="relnotes/10.4.html">10.4 release notes</a>
<li><a href="relnotes/10.3.3.html">10.3.3 release notes</a>
<li><a href="relnotes/10.3.2.html">10.3.2 release notes</a>
<li><a href="relnotes/10.3.1.html">10.3.1 release notes</a>

View File

@@ -88,6 +88,8 @@ following options during configure, if you would like support for svga driver
Note: The files are installed in $(libdir)/gallium-pipe/ and the interface
between them and libxatracker.so is <strong>not</strong> stable.
</p>
<li>The environment variable GALLIUM_MSAA that forced a multisample GLX visual was removed.</li>
</ul>
</div>

View File

@@ -327,6 +327,7 @@ DRM drivers that don't have a full-fledged GEM (such as qxl or simpledrm)</li>
<li>Removed support for the GL_ATI_envmap_bumpmap extension</li>
<li>The hacky --enable-32/64-bit is no longer available in configure. To build
32/64 bit mesa refer to the default method recommended by your distribution</li>
<li>The environment variable GALLIUM_MSAA that forced a multisample GLX visual was removed.</li>
</ul>
</div>

97
docs/relnotes/10.4.1.html Normal file

@@ -0,0 +1,97 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.1 Release Notes / December 29, 2014</h1>
<p>
Mesa 10.4.1 is a bug fix release which fixes bugs found since the 10.4.0 release.
</p>
<p>
Mesa 10.4.1 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
5311285e791a6bfaa468ad002bd1e1164acb3eaa040b5a1bf958bdb7c27e0a9d MesaLib-10.4.1.tar.gz
91e8b71c8aff4cb92022a09a872b1c5d1ae5bfec8c6c84dbc4221333da5bf1ca MesaLib-10.4.1.tar.bz2
e09c8135f5a86ecb21182c6f8959aafd39ae2f98858fdf7c0e25df65b5abcdb8 MesaLib-10.4.1.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82585">Bug 82585</a> - geometry shader with optional out variable segfaults</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82991">Bug 82991</a> - Inverted bumpmap in webgl applications</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83908">Bug 83908</a> - [i965] Incorrect icon colors in Steam Big Picture</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (1):</p>
<ul>
<li>i965/brw_reg: struct constructor now needs explicit negate and abs values.</li>
</ul>
<p>Cody Northrop (1):</p>
<ul>
<li>i965: Require pixel alignment for GPU copy blit</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: Add 10.4 sha256 sums, news item and link release notes</li>
<li>Revert "glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)"</li>
<li>Update version to 10.4.1</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>linker: Wrap access of producer_var with a NULL check</li>
<li>linker: Assign varying locations geometry shader inputs for SSO</li>
</ul>
<p>Mario Kleiner (4):</p>
<ul>
<li>glx/dri3: Fix glXWaitForSbcOML() to handle targetSBC==0 correctly. (v2)</li>
<li>glx/dri3: Track separate (ust, msc) for PresentPixmap vs. PresentNotifyMsc (v2)</li>
<li>glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)</li>
<li>glx/dri3: Don't fail on glXSwapBuffersMscOML(dpy, window, 0, 0, 0) (v2)</li>
</ul>
<p>Maxence Le Doré (1):</p>
<ul>
<li>glsl: Add gl_MaxViewports to available builtin constants</li>
</ul>
</div>
</body>
</html>
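The SHA256 checksums published in these release notes can be checked against a downloaded tarball. The sketch below is one way to do that, assuming GNU coreutils `sha256sum`; `verify_sha256` is a hypothetical helper name, and the tarball path in the usage note is wherever the file was downloaded.

```shell
#!/bin/sh
# Hypothetical helper: verify_sha256 EXPECTED_HEX FILE
# Succeeds only if FILE's SHA-256 digest equals EXPECTED_HEX.
verify_sha256() {
    # sha256sum -c reads "HASH  FILENAME" lines (two spaces) from stdin;
    # --status suppresses output so only the exit code reports the result.
    printf '%s  %s\n' "$1" "$2" | sha256sum -c --status -
}
```

For example, `verify_sha256 5311285e791a6bfaa468ad002bd1e1164acb3eaa040b5a1bf958bdb7c27e0a9d MesaLib-10.4.1.tar.gz && echo OK` would print `OK` only when the local file matches the checksum listed above for the 10.4.1 gzip tarball.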

127
docs/relnotes/10.4.2.html Normal file

@@ -0,0 +1,127 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.2 Release Notes / January 12, 2015</h1>
<p>
Mesa 10.4.2 is a bug fix release which fixes bugs found since the 10.4.1 release.
</p>
<p>
Mesa 10.4.2 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
e303e77dd774df0d051b2870b165f98c97084a55980f884731df89c1b56a6146 MesaLib-10.4.2.tar.gz
08a119937d9f2aa2f66dd5de97baffc2a6e675f549e40e699a31f5485d15327f MesaLib-10.4.2.tar.bz2
c2c2921a80a3395824f02bee4572a6a17d6a12a928a3e497618eeea04fb06490 MesaLib-10.4.2.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85529">Bug 85529</a> - Surfaces not drawn in Unvanquished</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87619">Bug 87619</a> - Changes to state such as render targets change fragment shader without marking it dirty.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87658">Bug 87658</a> - [llvmpipe] SEGV in sse2_has_daz on ancient Pentium4-M</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87913">Bug 87913</a> - CPU cacheline size of 0 can be returned by CPUID leaf 0x80000006 in some virtual machines</li>
</ul>
<h2>Changes</h2>
<p>Chad Versace (2):</p>
<ul>
<li>i965: Use safer pointer arithmetic in intel_texsubimage_tiled_memcpy()</li>
<li>i965: Use safer pointer arithmetic in gather_oa_results()</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>Revert "r600g/sb: fix issues cause by GLSL switching to loops for switch"</li>
<li>r600g: fix regression since UCMP change</li>
<li>r600g/sb: implement r600 gpr index workaround. (v3.1)</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.1 release</li>
<li>Update version to 10.4.2</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>nv50,nvc0: set vertex id base to index_bias</li>
<li>nv50/ir: fix texture offsets in release builds</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Add missing BRW_NEW_*_PROG_DATA to texture/renderbuffer atoms.</li>
<li>i965: Fix start/base_vertex_location for &gt;1 prims but !BRW_NEW_VERTICES.</li>
</ul>
<p>Leonid Shatz (1):</p>
<ul>
<li>gallium/util: make sure cache line size is not zero</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>glsl_to_tgsi: fix a bug in copy propagation</li>
<li>vbo: ignore primitive restart if FixedIndex is enabled in DrawArrays</li>
<li>st/mesa: fix GL_PRIMITIVE_RESTART_FIXED_INDEX</li>
<li>radeonsi: fix VertexID for OpenGL</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>radeonsi: Don't modify PA_SC_RASTER_CONFIG register value if rb_mask == 0</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>gallium/util: fix crash with daz detection on x86</li>
</ul>
<p>Tiziano Bacocco (1):</p>
<ul>
<li>nv50,nvc0: implement half_pixel_center</li>
</ul>
<p>Vadim Girlin (1):</p>
<ul>
<li>r600g/sb: fix issues with loops created for switch</li>
</ul>
</div>
</body>
</html>

145
docs/relnotes/10.4.3.html Normal file

@@ -0,0 +1,145 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.3 Release Notes / January 24, 2015</h1>
<p>
Mesa 10.4.3 is a bug fix release which fixes bugs found since the 10.4.2 release.
</p>
<p>
Mesa 10.4.3 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
c53eaafc83d9c6315f63e0904d9954d929b841b0b2be7a328eeb6e14f1376129 MesaLib-10.4.3.tar.gz
ef6ecc9c2f36c9f78d1662382a69ae961f38f03af3a0c3268e53f351aa1978ad MesaLib-10.4.3.tar.bz2
179325fc8ec66529d3b0d0c43ef61a33a44d91daa126c3bbdd1efdfd25a7db1d MesaLib-10.4.3.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80568">Bug 80568</a> - [gen4] GPU Crash During Google Chrome Operation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85367">Bug 85367</a> - [gen4] GPU hang in glmark-es2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85696">Bug 85696</a> - r600g+nine: Bioshock shader failure after 7b1c0cbc90d456384b0950ad21faa3c61a6b43ff</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88219">Bug 88219</a> - include/c11/threads_posix.h:197: undefined reference to `pthread_mutex_lock'</li>
</ul>
<h2>Changes</h2>
<p>Axel Davy (39):</p>
<ul>
<li>st/nine: Add new texture format strings</li>
<li>st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS</li>
<li>st/nine: NineBaseTexture9: fix setting of last_layer</li>
<li>st/nine: CubeTexture: fix GetLevelDesc</li>
<li>st/nine: Fix crash when deleting non-implicit swapchain</li>
<li>st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of bad format</li>
<li>st/nine: NineBaseTexture9: update sampler view creation</li>
<li>st/nine: Check if srgb format is supported before trying to use it.</li>
<li>st/nine: Add ATI1 and ATI2 support</li>
<li>st/nine: Rework of boolean constants</li>
<li>st/nine: Convert integer constants to floats before storing them when cards don't support integers</li>
<li>st/nine: Remove some shader unused code</li>
<li>st/nine: Saturate oFog and oPts vs outputs</li>
<li>st/nine: Correctly declare NineTranslateInstruction_Mkxn inputs</li>
<li>st/nine: Fix typo for M4x4</li>
<li>st/nine: Fix POW implementation</li>
<li>st/nine: Handle RSQ special cases</li>
<li>st/nine: Handle NRM with input of null norm</li>
<li>st/nine: Correct LOG on negative values</li>
<li>st/nine: Rewrite LOOP implementation, and a0 aL handling</li>
<li>st/nine: Fix CND implementation</li>
<li>st/nine: Clamp ps 1.X constants</li>
<li>st/nine: Fix some fixed function pipeline operation</li>
<li>st/nine: Implement TEXCOORD special behaviours</li>
<li>st/nine: Fill missing dst and src number for some instructions.</li>
<li>st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC</li>
<li>st/nine: implement TEXM3x2DEPTH</li>
<li>st/nine: Implement TEXM3x2TEX</li>
<li>st/nine: Implement TEXM3x3SPEC</li>
<li>st/nine: Implement TEXDEPTH</li>
<li>st/nine: Implement TEXDP3</li>
<li>st/nine: Implement TEXDP3TEX</li>
<li>st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGB</li>
<li>st/nine: Correct rules for relative adressing and constants.</li>
<li>st/nine: Remove unused code for ps</li>
<li>st/nine: Fix sm3 relative addressing for non-debug build</li>
<li>st/nine: Add variables containing the size of the constant buffers</li>
<li>st/nine: Allocate the correct size for the user constant buffer</li>
<li>st/nine: Allocate vs constbuf buffer for indirect addressing once.</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.2 release</li>
<li>Update version to 10.4.3</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>mesa: Fix clamping to -1.0 in snorm_to_float</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>glsl: Link glsl_test with pthreads library.</li>
</ul>
<p>Jose Fonseca (1):</p>
<ul>
<li>nine: Drop use of TGSI_OPCODE_CND.</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Respect the no_8 flag on Gen6, not just Gen7+.</li>
<li>i965: Work around mysterious Gen4 GPU hangs with minimal state changes.</li>
</ul>
<p>Stanislaw Halik (1):</p>
<ul>
<li>st/nine: Hack to generate resource if it doesn't exist when getting view</li>
</ul>
<p>Xavier Bouchoux (3):</p>
<ul>
<li>st/nine: Additional defines to d3dtypes.h</li>
<li>st/nine: Add missing c++ declaration for IDirect3DVolumeTexture9</li>
<li>st/nine: Fix D3DRS_POINTSPRITE support</li>
</ul>
</div>
</body>
</html>

100
docs/relnotes/10.4.4.html Normal file

@@ -0,0 +1,100 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.4 Release Notes / February 06, 2015</h1>
<p>
Mesa 10.4.4 is a bug fix release which fixes bugs found since the 10.4.3 release.
</p>
<p>
Mesa 10.4.4 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
5cb427eaf980cb8555953e9928f5797979ed783e277745d5f8cbae8bc5364086 MesaLib-10.4.4.tar.gz
f18a967e9c4d80e054b2fdff8c130ce6e6d1f8eecfc42c9f354f8628d8b4df1c MesaLib-10.4.4.tar.bz2
86baad73b77920c80fe58402a905e7dd17e3ea10ead6ea7d3afdc0a56c860bd7 MesaLib-10.4.4.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88662">Bug 88662</a> - unaligned access to gl_dlist_node</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88930">Bug 88930</a> - [osmesa] osbuffer-&gt;textures should be indexed by attachment type</li>
</ul>
<h2>Changes</h2>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: fix display list 8-byte alignment issue</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.3 release</li>
<li>Update version to 10.4.4</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>egl: Pass the correct X visual depth to xcb_put_image().</li>
</ul>
<p>Mario Kleiner (1):</p>
<ul>
<li>glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>gallium/util: Don't use __builtin_clrsb in util_last_bit().</li>
</ul>
<p>Niels Ole Salscheider (1):</p>
<ul>
<li>configure: Link against all LLVM targets when building clover</li>
</ul>
<p>Park, Jeongmin (1):</p>
<ul>
<li>st/osmesa: Fix osbuffer-&gt;textures indexing</li>
</ul>
<p>Ville Syrjälä (1):</p>
<ul>
<li>i965: Fix max_wm_threads for CHV</li>
</ul>
</div>
</body>
</html>

114
docs/relnotes/10.4.5.html Normal file

@@ -0,0 +1,114 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.5 Release Notes / February 21, 2015</h1>
<p>
Mesa 10.4.5 is a bug fix release which fixes bugs found since the 10.4.4 release.
</p>
<p>
Mesa 10.4.5 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
e12bbdaee9a758617e8ebd0bb0e987f72addd11db2e4da25ba695e386cd63843 MesaLib-10.4.5.tar.gz
bf60000700a9d58e3aca2bfeee7e781053b0d839e61a95b1883e05a2dee247a0 MesaLib-10.4.5.tar.bz2
3b926de8eee500bb67cf85332c51292f826cc539b8636382aadbb8e70c76527a MesaLib-10.4.5.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82477">Bug 82477</a> - [softpipe] piglit fp-long-alu regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88658">Bug 88658</a> - (bisected) Slow video playback on Kabini</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89069">Bug 89069</a> - Lack of grass in The Talos Principle on radeonsi (native\wine\nine)</li>
</ul>
<h2>Changes</h2>
<p>Carl Worth (1):</p>
<ul>
<li>Revert use of Mesa IR optimizer for ARB_fragment_programs</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.4 release</li>
<li>get-pick-list.sh: Require explicit "10.4" for nominating stable patches</li>
<li>Update version to 10.4.5</li>
</ul>
<p>Ilia Mirkin (3):</p>
<ul>
<li>nvc0: bail out of 2d blits with non-A8_UNORM alpha formats</li>
<li>st/mesa: treat resource-less xfb buffers as if they weren't there</li>
<li>nvc0: allow holes in xfb target lists</li>
</ul>
<p>Jeremy Huddleston Sequoia (2):</p>
<ul>
<li>darwin: build fix</li>
<li>darwin: build fix</li>
</ul>
<p>Kenneth Graunke (4):</p>
<ul>
<li>i965: Override swizzles for integer luminance formats.</li>
<li>i965: Use a gl_color_union for sampler border color.</li>
<li>i965: Fix integer border color on Haswell.</li>
<li>glsl: Reduce memory consumption of copy propagation passes.</li>
</ul>
<p>Laura Ekstrand (1):</p>
<ul>
<li>main: Fixed _mesa_GetCompressedTexImage_sw to copy slices correctly.</li>
</ul>
<p>Marek Olšák (5):</p>
<ul>
<li>r600g,radeonsi: don't append to streamout buffers that haven't been used yet</li>
<li>radeonsi: fix instanced arrays with non-zero start instance</li>
<li>radeonsi: small fix in SPI state</li>
<li>mesa: fix AtomicBuffer typo in _mesa_DeleteBuffers</li>
<li>radeonsi: fix a crash if a stencil ref state is set before a DSA state</li>
</ul>
<p>Michel Dänzer (2):</p>
<ul>
<li>st/mesa: Don't use PIPE_USAGE_STREAM for GL_PIXEL_UNPACK_BUFFER_ARB</li>
<li>Revert "radeon/llvm: enable unsafe math for graphics shaders"</li>
</ul>
</div>
</body>
</html>

143
docs/relnotes/10.4.6.html Normal file

@@ -0,0 +1,143 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.6 Release Notes / March 06, 2015</h1>
<p>
Mesa 10.4.6 is a bug fix release which fixes bugs found since the 10.4.5 release.
</p>
<p>
Mesa 10.4.6 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
46c9082142e811c01e49a2c332a9ac0a1eb98f2908985fb9df216539d7eaeaf4 MesaLib-10.4.6.tar.gz
d8baedd20e79ccd98a5a7b05e23d59a30892e68de1fcc057ca6873dafca02735 MesaLib-10.4.6.tar.bz2
6aded6eac7f0d4d55117b8b581d8424710bbb4c768fc90f7b881f29311a751aa MesaLib-10.4.6.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=45348">Bug 45348</a> - [swrast] piglit fbo-drawbuffers-arbfp regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84613">Bug 84613</a> - [G965, bisected] piglit regressions : glslparsertest.glsl2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87516">Bug 87516</a> - glProgramBinary violates spec</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88885">Bug 88885</a> - Transform feedback uses incorrect interleaving if a previous draw did not write gl_Position</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89180">Bug 89180</a> - [IVB regression] Rendering issues in Mass Effect through VMware Workstation</li>
</ul>
<h2>Changes</h2>
<p>Abdiel Janulgue (2):</p>
<ul>
<li>glsl: Don't optimize min/max into saturate when EmitNoSat is set</li>
<li>st/mesa: For vertex shaders, don't emit saturate when SM 3.0 is unsupported</li>
</ul>
<p>Andreas Boll (1):</p>
<ul>
<li>glx: Fix returned values of GLX_RENDERER_PREFERRED_PROFILE_MESA</li>
</ul>
<p>Brian Paul (2):</p>
<ul>
<li>swrast: fix multiple color buffer writing</li>
<li>st/mesa: fix sampler view reference counting bug in glDraw/CopyPixels</li>
</ul>
<p>Chris Forbes (1):</p>
<ul>
<li>i965/gs: Check newly-generated GS-out VUE map against correct stage</li>
</ul>
<p>Eduardo Lima Mitev (1):</p>
<ul>
<li>mesa: Fix error validating args for TexSubImage3D</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.5 release</li>
<li>install-lib-links: remove the .install-lib-links file</li>
<li>Revert "mesa: Correct backwards NULL check."</li>
<li>mesa: cherry-pick the second half of commit 2aa71e9485a</li>
<li>Revert "gallivm: Update for RTDyldMemoryManager becoming an unique_ptr."</li>
<li>Update version to 10.4.6</li>
</ul>
<p>Ian Romanick (3):</p>
<ul>
<li>mesa: Add missing error checks in _mesa_ProgramBinary</li>
<li>mesa: Ensure that length is set to zero in _mesa_GetProgramBinary</li>
<li>mesa: Always generate GL_INVALID_OPERATION in _mesa_GetProgramBinary</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>auxilary/os: correct sysctl use in os_get_total_physical_memory()</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>gallivm: Update for RTDyldMemoryManager becoming an unique_ptr.</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>st/omx/dec/h264: fix picture out-of-order with poc type 0 v2</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>install-lib-links: don't depend on .libs directory</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>vbo: fix an unitialized-variable warning</li>
<li>radeonsi: fix point sprites</li>
</ul>
<p>Matt Turner (4):</p>
<ul>
<li>glsl: Rewrite and fix min/max to saturate optimization.</li>
<li>mesa: Correct backwards NULL check.</li>
<li>i965/fs: Don't use backend_visitor::instructions after creating the CFG.</li>
<li>mesa: Correct backwards NULL check.</li>
</ul>
</div>
</body>
</html>

134
docs/relnotes/10.4.7.html Normal file

@@ -0,0 +1,134 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.7 Release Notes / March 20, 2015</h1>
<p>
Mesa 10.4.7 is a bug fix release which fixes bugs found since the 10.4.6 release.
</p>
<p>
Mesa 10.4.7 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
9e7b59267199658808f8b33e0410b86fbafbdcd52378658b9df65fac9d24947f MesaLib-10.4.7.tar.gz
2c351c98671f9a7ab3fd9c601bb7a255801b1580f5dd0992639f99152801b0d2 MesaLib-10.4.7.tar.bz2
d14ac578b5ce16560757b53fbd1cb4d6b34652f8e110e4b10a019adc82e67ffd MesaLib-10.4.7.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79202">Bug 79202</a> - valgrind errors in glsl-fs-uniform-array-loop-unroll.shader_test; random code generation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89156">Bug 89156</a> - r300g: GL_COMPRESSED_RED_RGTC1 / ATI1N support broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89224">Bug 89224</a> - Incorrect rendering of Unigine Valley running in VM on VMware Workstation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89530">Bug 89530</a> - FTBFS in loader: missing fstat</li>
</ul>
<h2>Changes</h2>
<p>Andrey Sudnik (1):</p>
<ul>
<li>i965/vec4: Don't lose the saturate modifier in copy propagation.</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>egl: Take alpha bits into account when selecting GBM formats</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.6 release</li>
<li>cherry-ignore: add not applicable/rejected commits</li>
<li>mesa: rename format_info.c to format_info.h</li>
<li>loader: include &lt;sys/stat.h&gt; for non-sysfs builds</li>
<li>auxiliary/os: fix the android build - s/drm_munmap/os_munmap/</li>
<li>Update version to 10.4.7</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>i965: Fix out-of-bounds accesses into pull_constant_loc array</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>freedreno: move fb state copy after checking for size change</li>
<li>freedreno/ir3: fix array count returned by TXQ</li>
<li>freedreno/ir3: get the # of miplevels from getinfo</li>
<li>freedreno: fix slice pitch calculations</li>
</ul>
<p>Marc-Andre Lureau (1):</p>
<ul>
<li>gallium/auxiliary/indices: fix start param</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>r300g: fix RGTC1 and LATC1 SNORM formats</li>
<li>r300g: fix a crash when resolving into an sRGB texture</li>
<li>r300g: fix sRGB-&gt;sRGB blits</li>
<li>radeonsi: increase coords array size for radeon_llvm_emit_prepare_cube_coords</li>
</ul>
<p>Mario Kleiner (1):</p>
<ul>
<li>glx: Handle out-of-sequence swap completion events correctly. (v2)</li>
</ul>
<p>Matt Turner (2):</p>
<ul>
<li>r300g: Use PATH_MAX instead of limiting ourselves to 100 chars.</li>
<li>r300g: Check return value of snprintf().</li>
</ul>
<p>Rob Clark (2):</p>
<ul>
<li>freedreno/ir3: fix silly typo for binning pass shaders</li>
<li>freedreno: update generated headers</li>
</ul>
<p>Samuel Iglesias Gonsalvez (1):</p>
<ul>
<li>glsl: optimize (0 cmp x + y) into (-x cmp y).</li>
</ul>
<p>Stefan Dösinger (1):</p>
<ul>
<li>r300g: Fix the ATI1N swizzle (RGTC1 and LATC1)</li>
</ul>
</div>
</body>
</html>

View File

@@ -14,7 +14,7 @@
<iframe src="../contents.html"></iframe>
<div class="content">
-<h1>Mesa 10.4 Release Notes / TBD</h1>
+<h1>Mesa 10.4 Release Notes / December 14, 2014</h1>
<p>
Mesa 10.4 is a new development release.
@@ -31,9 +31,11 @@ because compatibility contexts are not supported.
</p>
-<h2>MD5 checksums</h2>
+<h2>SHA256 checksums</h2>
<pre>
-TBD.
+abfbfd2d91ce81491c5bb6923ae649212ad5f82d0bee277de8704cc948dc221e MesaLib-10.4.0.tar.gz
+98a7dff3a1a6708c79789de8b9a05d8042e867067f70e8f30387c15026233219 MesaLib-10.4.0.tar.bz2
+443a6d46d0691b5ac811d8d30091b1716c365689b16d49c57cf273c2b76086fe MesaLib-10.4.0.zip
</pre>
@@ -54,11 +56,202 @@ Note: some of the new features are only available with certain drivers.
<h2>Bug fixes</h2>
TBD.
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79963">Bug 79963</a> - [ILK Bisected]some piglit and ogles2conform cases fail </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=29661">Bug 29661</a> - MSVC built u_format_test fails on Windows</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=38873">Bug 38873</a> - [855gm] gnome-shell misrendered</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54372">Bug 54372</a> - GLX_INTEL_swap_event crashes driver when swapping window buffers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60879">Bug 60879</a> - [radeonsi] X11 can't start with acceleration enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61415">Bug 61415</a> - Clover ignores --with-opencl-libdir path</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64471">Bug 64471</a> - Radeon HD6570 lockup in Brütal Legend with HyperZ</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66184">Bug 66184</a> - src/mesa/state_tracker/st_glsl_to_tgsi.cpp:3216:simplify_cmp: Assertion `inst-&gt;dst.index &lt; 4096' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=67672">Bug 67672</a> - [llvmpipe] lp_test_arit fails on old CPUs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69200">Bug 69200</a> - [Bisected]Piglit glx/glx-multithread-shader-compile aborted</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70410">Bug 70410</a> - egl-static/Makefile: linking fails with llvm &gt;= 3.4</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72685">Bug 72685</a> - [radeonsi hyperz] Artifacts in Unigine Sanctuary</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72819">Bug 72819</a> - [855GM] Incorrect drop shadow color on windows and strange white rectangle when showing/hiding GLX-dock...</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74563">Bug 74563</a> - Surfaceless contexts are not properly released by DRI drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74863">Bug 74863</a> - [r600g] HyperZ broken on RV770 and CYPRESS (Left 4 Dead 2 trees corruption) bisected!</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75011">Bug 75011</a> - [hyperz] Performance drop since git-01e6371 (disable hyperz by default) with radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75112">Bug 75112</a> - Meta Bug for HyperZ issues on r600g and radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76252">Bug 76252</a> - Dynamic loading/unloading of opengl32.dll results in a deadlock</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76861">Bug 76861</a> - mid3 generates slow code for constant arguments</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77957">Bug 77957</a> - Variably-indexed constant arrays result in terrible shader code</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78468">Bug 78468</a> - Compiling of shader gets stuck in infinite loop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78770">Bug 78770</a> - [SNB bisected]Webglc conformance/textures/texture-size-limit.html fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79155">Bug 79155</a> - [Tesseract Game] Global Illumination: Medium Causes Color Distortion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79462">Bug 79462</a> - [NVC0/Codegen] Shader compilation fails in spill logic</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80011">Bug 80011</a> - [softpipe] tgsi/tgsi_exec.c:2023:exec_txf: Assertion `0' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80012">Bug 80012</a> - [softpipe] draw/draw_gs.c:113:tgsi_fetch_gs_outputs: Assertion `!util_is_inf_or_nan(output[slot][0])' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80050">Bug 80050</a> - [855GM] Incorrect drop shadow color under windows in Cinnamon persists with MESA 10.1.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80247">Bug 80247</a> - Khronos conformance test ES3-CTS.gtf.GL3Tests.transform_feedback.transform_feedback_vertex_id fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80561">Bug 80561</a> - Incorrect implementation of some VDPAU APIs.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80615">Bug 80615</a> - Files in bellagio directory [omx tracker] don't respect installation folder</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80848">Bug 80848</a> - [dri3] Building mesa fails with dri3 enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81680">Bug 81680</a> - [r600g] Firefox crashes with hardware acceleration turned on</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82255">Bug 82255</a> - [VP2] Chroma planes are vertically stretched during VDPAU playback</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82472">Bug 82472</a> - piglit 16385-consecutive-chars regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82537">Bug 82537</a> - Stunt Rally GLSL compiler assertion failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82538">Bug 82538</a> - Super Maryo Chronicles fails with st/mesa assertion failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82539">Bug 82539</a> - vmw_screen_dri.lo In file included from vmw_screen_dri.c:41: vmwgfx_drm.h:32:17: error: drm.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82796">Bug 82796</a> - [IVB/BYT-M/HSW/BDW Bisected]Synmark2_v6.0_OglTerrainFlyInst/OglTerrainPanInst cannot run as image validation failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82804">Bug 82804</a> - unreal engine 4 rendering errors</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82828">Bug 82828</a> - Regression: Crash in 3Dmark2001</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82846">Bug 82846</a> - [BDW Bisected] Gpu hang when running Lightsmark v2008/Warsow v1.0/Xonotic v0.7/unigine-demos</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82881">Bug 82881</a> - test_vec4_register_coalesce regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82882">Bug 82882</a> - [swrast] piglit glsl-fs-uniform-bool-1 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82921">Bug 82921</a> - layout(location=0) emits error &gt;= MAX_UNIFORM_LOCATIONS due to integer underflow</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82929">Bug 82929</a> - [BDW Bisected]glxgears causes X hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82932">Bug 82932</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.shaders.indexing.vector_subscript.vec3_static_loop_subscript_write_direct_read_vertex fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83079">Bug 83079</a> - [NVC0] Dota 2 (Linux native and Wine) crash with Nouveau Drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83080">Bug 83080</a> - [SNB+ Bisected]ES3-CTS.shaders.loops.do_while_constant_iterations.mixed_break_continue_fragment fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83081">Bug 83081</a> - [BDW Bisected]Piglit spec_ARB_sample_shading_builtin-gl-sample-mask_2 is core dumped</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83127">Bug 83127</a> - [ILK Bisected]Piglit glean_texCombine fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83148">Bug 83148</a> - Unity invisible under Ubuntu 14.04 and 14.10</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83355">Bug 83355</a> - FTBFS: src/mesa/program/program_lexer.l:122:64: error: unknown type name 'YYSTYPE'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83380">Bug 83380</a> - Linking fails when not writing gl_Position.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83418">Bug 83418</a> - EU IV is incorrectly rendered after git1409011930.d571f2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83432">Bug 83432</a> - r600_query.c:269:r600_emit_query_end: Assertion `ctx-&gt;num_pipelinestat_queries &gt; 0' failed [Gallium HUD]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83463">Bug 83463</a> - [swrast] piglit glsl-vs-clamp-1 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83468">Bug 83468</a> - [UBO] Using bool from UBO as if-statement condition asserts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83500">Bug 83500</a> - si_dma_copy_tile causes GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83506">Bug 83506</a> - [UBO] row_major layout ignored inside structures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83533">Bug 83533</a> - [UBO] nested structures don't get appropriate padding</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83573">Bug 83573</a> - [swrast] piglit fs-op-not-bool-using-if regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83574">Bug 83574</a> - [llvmpipe] [softpipe] piglit arb_explicit_uniform_location-use-of-unused-loc regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83741">Bug 83741</a> - [UBO] row_major layout partially ignored for arrays of structures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83777">Bug 83777</a> - [regression] ilo fails to build</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83934">Bug 83934</a> - Structures must have same name to be considered same type.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84140">Bug 84140</a> - mplayer crashes playing some files using vdpau output</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84145">Bug 84145</a> - UE4: Realistic Rendering Demo render blue</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84178">Bug 84178</a> - Big glamor regression in Xorg server 1.6.99.1 GIT: x11perf 1.5 Test: PutImage XY 500x500 Square</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84355">Bug 84355</a> - texture2DProjLod and textureCubeLod are not supported when using GLES.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84529">Bug 84529</a> - [IVB bisected] glean fragProg1 CMP test failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84538">Bug 84538</a> - lp_test_format.c:226:4: error: too few arguments to function gallivm_create</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84539">Bug 84539</a> - brw_fs_register_coalesce.cpp:183: bool fs_visitor::register_coalesce(): Assertion `src_size &lt;= 11' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84557">Bug 84557</a> - [HSW] &quot;Emit ELSE/ENDIF JIP with type D on Gen 7&quot; causes Atomic Afterlife and GPU hangs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84651">Bug 84651</a> - Distorted graphics or black window when running Battle.net app on Intel hardware via wine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84662">Bug 84662</a> - Long pauses with Unreal demo Elemental on R9270X since : Always flush the HDP cache before submitting a CS to the GPU</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84777">Bug 84777</a> - [BSW]Piglit spec_glsl-1.50_execution_geometry-basic fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84807">Bug 84807</a> - Build issue starting between bf4aecfb2acc8d0dc815105d2f36eccbc97c284b and a3e9582f09249ad27716ba82c7dfcee685b65d51</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85189">Bug 85189</a> - llvm/invocation.cpp: In function 'void {anonymous}::optimize(llvm::Module*, unsigned int, const std::vector&lt;llvm::Function*&gt;&amp;)': llvm/invocation.cpp:324:18: error: expected type-specifier</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85267">Bug 85267</a> - vlc crashes with vdpau (Radeon 3850HD) [r600]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85377">Bug 85377</a> - lp_test_format failure with llvm-3.6</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85425">Bug 85425</a> - [bisected] Compiler error in clip control operations in meta</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85429">Bug 85429</a> - indirect.c:296: multiple definition of `__indirect_glNewList'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85454">Bug 85454</a> - Unigine Sanctuary with Wine crashes on Mesa Git</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85647">Bug 85647</a> - Random radeonsi crashes with mesa 10.3.x</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85683">Bug 85683</a> - [i965 Bisected]Piglit shaders_glsl-vs-raytrace-bug26691 segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85691">Bug 85691</a> - 'glsl: Drop constant 0.0 components from dot products.' broke piglit shaders/glsl-gnome-shell-dim-window and a few others with Gallium</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86025">Bug 86025</a> - src\glsl\list.h(535) : error C2143: syntax error : missing ';' before 'type'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86089">Bug 86089</a> - [r600g][mesa 10.4.0-dev] shader failure - r600_sb::bc_finalizer::cf_peephole() when starting Second Life</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86145">Bug 86145</a> - Pipeline statistic counter values for VF always 0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86618">Bug 86618</a> - [NV96] neg modifiers not working in MIN and MAX operations</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86760">Bug 86760</a> - mesa doesn't build: recipe for target 'r600_llvm.lo' failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86764">Bug 86764</a> - [SNB+ Bisected]Piglit glean/pointSprite fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86788">Bug 86788</a> - (bisected) 32bit UrbanTerror 4.1 timedemo sse4.1 segfault...</li>
</ul>
<h2>Changes</h2>
<ul>
<li>The environment variable GALLIUM_MSAA that forced a multisample GLX visual was removed.</li>
</ul>
</div>

@@ -399,6 +399,16 @@ struct IDirect3DVolume9 : public IUnknown
virtual HRESULT WINAPI UnlockBox() = 0;
};
struct IDirect3DVolumeTexture9 : public IDirect3DBaseTexture9
{
virtual HRESULT WINAPI GetLevelDesc(UINT Level, D3DVOLUME_DESC *pDesc) = 0;
virtual HRESULT WINAPI GetVolumeLevel(UINT Level, IDirect3DVolume9 **ppVolumeLevel) = 0;
virtual HRESULT WINAPI LockBox(UINT Level, D3DLOCKED_BOX *pLockedVolume, const D3DBOX *pBox, DWORD Flags) = 0;
virtual HRESULT WINAPI UnlockBox(UINT Level) = 0;
virtual HRESULT WINAPI AddDirtyBox(const D3DBOX *pDirtyBox) = 0;
};
#else /* __cplusplus */
extern const GUID IID_IDirect3D9;

@@ -224,6 +224,8 @@ typedef struct _RGNDATA {
#define D3DERR_INVALIDDEVICE MAKE_D3DHRESULT(2155)
#define D3DERR_INVALIDCALL MAKE_D3DHRESULT(2156)
#define D3DERR_DRIVERINVALIDCALL MAKE_D3DHRESULT(2157)
#define D3DERR_DEVICEREMOVED MAKE_D3DHRESULT(2160)
#define D3DERR_DEVICEHUNG MAKE_D3DHRESULT(2164)
/********************************************************
* Bitmasks *
@@ -331,6 +333,7 @@ typedef struct _RGNDATA {
#define D3DPRESENT_DONOTWAIT 0x00000001
#define D3DPRESENT_LINEAR_CONTENT 0x00000002
#define D3DPRESENT_RATE_DEFAULT 0
#define D3DCREATE_FPU_PRESERVE 0x00000002
#define D3DCREATE_MULTITHREADED 0x00000004
@@ -344,6 +347,13 @@ typedef struct _RGNDATA {
#define D3DSTREAMSOURCE_INDEXEDDATA (1 << 30)
#define D3DSTREAMSOURCE_INSTANCEDATA (2 << 30)
/* D3DRS_COLORWRITEENABLE */
#define D3DCOLORWRITEENABLE_RED (1L << 0)
#define D3DCOLORWRITEENABLE_GREEN (1L << 1)
#define D3DCOLORWRITEENABLE_BLUE (1L << 2)
#define D3DCOLORWRITEENABLE_ALPHA (1L << 3)
/********************************************************
* Function macros *
*******************************************************/
@@ -639,10 +649,13 @@ typedef enum _D3DFORMAT {
D3DFMT_A1 = 118,
D3DFMT_A2B10G10R10_XR_BIAS = 119,
D3DFMT_BINARYBUFFER = 199,
D3DFMT_ATI1 = MAKEFOURCC('A', 'T', 'I', '1'),
D3DFMT_ATI2 = MAKEFOURCC('A', 'T', 'I', '2'),
D3DFMT_DF16 = MAKEFOURCC('D', 'F', '1', '6'),
D3DFMT_DF24 = MAKEFOURCC('D', 'F', '2', '4'),
D3DFMT_INTZ = MAKEFOURCC('I', 'N', 'T', 'Z'),
D3DFMT_NULL = MAKEFOURCC('N', 'U', 'L', 'L'),
D3DFMT_NVDB = MAKEFOURCC('N', 'V', 'D', 'B'),
D3DFMT_NV11 = MAKEFOURCC('N', 'V', '1', '1'),
D3DFMT_NV12 = MAKEFOURCC('N', 'V', '1', '2'),
D3DFMT_Y210 = MAKEFOURCC('Y', '2', '1', '0'),

@@ -3,9 +3,9 @@
if BUILD_SHARED
if HAVE_COMPAT_SYMLINKS
all-local : .libs/install-mesa-links
all-local : .install-mesa-links
.libs/install-mesa-links : $(lib_LTLIBRARIES)
.install-mesa-links : $(lib_LTLIBRARIES)
$(AM_V_GEN)$(MKDIR_P) $(top_builddir)/$(LIB_DIR); \
for f in $(join $(addsuffix .libs/,$(dir $(lib_LTLIBRARIES))),$(notdir $(lib_LTLIBRARIES:%.la=%.$(LIB_EXT)*))); do \
if test -h .libs/$$f; then \
@@ -14,5 +14,9 @@ all-local : .libs/install-mesa-links
ln -f $$f $(top_builddir)/$(LIB_DIR); \
fi; \
done && touch $@
clean-local:
$(RM) .install-mesa-links
endif
endif

@@ -668,15 +668,21 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp)
for (i = 0; dri2_dpy->driver_configs[i]; i++) {
EGLint format, attr_list[3];
unsigned int mask;
unsigned int red, alpha;
dri2_dpy->core->getConfigAttrib(dri2_dpy->driver_configs[i],
__DRI_ATTRIB_RED_MASK, &mask);
if (mask == 0x3ff00000)
__DRI_ATTRIB_RED_MASK, &red);
dri2_dpy->core->getConfigAttrib(dri2_dpy->driver_configs[i],
__DRI_ATTRIB_ALPHA_MASK, &alpha);
if (red == 0x3ff00000 && alpha == 0x00000000)
format = GBM_FORMAT_XRGB2101010;
else if (mask == 0x00ff0000)
else if (red == 0x3ff00000 && alpha == 0xc0000000)
format = GBM_FORMAT_ARGB2101010;
else if (red == 0x00ff0000 && alpha == 0x00000000)
format = GBM_FORMAT_XRGB8888;
else if (mask == 0xf800)
else if (red == 0x00ff0000 && alpha == 0xff000000)
format = GBM_FORMAT_ARGB8888;
else if (red == 0xf800)
format = GBM_FORMAT_RGB565;
else
continue;

@@ -49,8 +49,7 @@ dri2_x11_swap_interval(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf,
static void
swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
struct dri2_egl_surface * dri2_surf,
int depth)
struct dri2_egl_surface * dri2_surf)
{
uint32_t mask;
const uint32_t function = GXcopy;
@@ -66,8 +65,7 @@ swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
valgc[0] = function;
valgc[1] = False;
xcb_create_gc(dri2_dpy->conn, dri2_surf->swapgc, dri2_surf->drawable, mask, valgc);
dri2_surf->depth = depth;
switch (depth) {
switch (dri2_surf->depth) {
case 32:
case 24:
dri2_surf->bytes_per_pixel = 4;
@@ -82,7 +80,7 @@ swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
dri2_surf->bytes_per_pixel = 0;
break;
default:
_eglLog(_EGL_WARNING, "unsupported depth %d", depth);
_eglLog(_EGL_WARNING, "unsupported depth %d", dri2_surf->depth);
}
}
@@ -257,12 +255,6 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
_eglError(EGL_BAD_ALLOC, "dri2->createNewDrawable");
goto cleanup_pixmap;
}
if (dri2_dpy->dri2) {
xcb_dri2_create_drawable (dri2_dpy->conn, dri2_surf->drawable);
} else {
swrastCreateDrawable(dri2_dpy, dri2_surf, _eglGetConfigKey(conf, EGL_BUFFER_SIZE));
}
if (type != EGL_PBUFFER_BIT) {
cookie = xcb_get_geometry (dri2_dpy->conn, dri2_surf->drawable);
@@ -275,9 +267,19 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
dri2_surf->base.Width = reply->width;
dri2_surf->base.Height = reply->height;
dri2_surf->depth = reply->depth;
free(reply);
}
if (dri2_dpy->dri2) {
xcb_dri2_create_drawable (dri2_dpy->conn, dri2_surf->drawable);
} else {
if (type == EGL_PBUFFER_BIT) {
dri2_surf->depth = _eglGetConfigKey(conf, EGL_BUFFER_SIZE);
}
swrastCreateDrawable(dri2_dpy, dri2_surf);
}
/* we always copy the back buffer to front */
dri2_surf->base.PostSubBufferSupportedNV = EGL_TRUE;

@@ -193,7 +193,7 @@ def lineloop(intype, outtype, inpv, outpv):
print ' for (i = start, j = 0; j < nr - 2; j+=2, i++) { '
do_line( intype, outtype, 'out+j', 'i', 'i+1', inpv, outpv );
print ' }'
do_line( intype, outtype, 'out+j', 'i', '0', inpv, outpv );
do_line( intype, outtype, 'out+j', 'i', 'start', inpv, outpv );
postamble()
def tris(intype, outtype, inpv, outpv):
@@ -218,7 +218,7 @@ def tristrip(intype, outtype, inpv, outpv):
def trifan(intype, outtype, inpv, outpv):
preamble(intype, outtype, inpv, outpv, prim='trifan')
print ' for (i = start, j = 0; j < nr; j+=3, i++) { '
do_tri( intype, outtype, 'out+j', '0', 'i+1', 'i+2', inpv, outpv );
do_tri( intype, outtype, 'out+j', 'start', 'i+1', 'i+2', inpv, outpv );
print ' }'
postamble()
@@ -228,9 +228,9 @@ def polygon(intype, outtype, inpv, outpv):
preamble(intype, outtype, inpv, outpv, prim='polygon')
print ' for (i = start, j = 0; j < nr; j+=3, i++) { '
if inpv == FIRST:
do_tri( intype, outtype, 'out+j', '0', 'i+1', 'i+2', inpv, outpv );
do_tri( intype, outtype, 'out+j', 'start', 'i+1', 'i+2', inpv, outpv );
else:
do_tri( intype, outtype, 'out+j', 'i+1', 'i+2', '0', inpv, outpv );
do_tri( intype, outtype, 'out+j', 'i+1', 'i+2', 'start', inpv, outpv );
print ' }'
postamble()
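The generator change above replaces the hard-coded hub index `0` with `start` when a fan or polygon is expanded into triangles. A minimal C sketch of the kind of loop the generator emits for a triangle fan follows; the function name `trifan_to_tris` and its parameters are hypothetical and are not taken from the generated Mesa code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Expand a triangle fan into an explicit triangle list.
 * `start` is the index of the fan's first (hub) vertex and `nr_tris`
 * the number of triangles; `out` must hold 3 * nr_tris entries.
 * Names and signature are illustrative only.
 */
static void trifan_to_tris(unsigned start, unsigned nr_tris, unsigned *out)
{
    unsigned i, j;

    assert(out != NULL);
    for (i = start, j = 0; j < nr_tris * 3; j += 3, i++) {
        out[j + 0] = start;  /* hub is the fan's first vertex, not index 0 */
        out[j + 1] = i + 1;
        out[j + 2] = i + 2;
    }
}
```

With a non-zero `start`, using `0` as the hub would pull in a vertex that does not belong to the draw, which is the failure the change addresses.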

@@ -124,6 +124,9 @@ util_primconvert_draw_vbo(struct primconvert_context *pc,
new_info.indexed = true;
new_info.min_index = info->min_index;
new_info.max_index = info->max_index;
new_info.index_bias = info->index_bias;
new_info.start_instance = info->start_instance;
new_info.instance_count = info->instance_count;
if (info->indexed) {
u_index_translator(pc->primtypes_mask,
@@ -136,6 +139,7 @@ util_primconvert_draw_vbo(struct primconvert_context *pc,
src = pipe_buffer_map(pc->pipe, ib->buffer,
PIPE_TRANSFER_READ, &src_transfer);
}
src = (const uint8_t *)src + ib->offset;
}
else {
u_index_generator(pc->primtypes_mask,

@@ -118,7 +118,7 @@ os_get_total_physical_memory(uint64_t *size)
*size = phys_pages * page_size;
return (phys_pages > 0 && page_size > 0);
#elif defined(PIPE_OS_APPLE) || defined(PIPE_OS_BSD)
size_t len = sizeof(size);
size_t len = sizeof(*size);
int mib[2];
mib[0] = CTL_HW;
@@ -134,7 +134,7 @@ os_get_total_physical_memory(uint64_t *size)
#error Unsupported *BSD
#endif
return (sysctl(mib, 2, &size, &len, NULL, 0) == 0);
return (sysctl(mib, 2, size, &len, NULL, 0) == 0);
#elif defined(PIPE_OS_HAIKU)
system_info info;
status_t ret;
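The fix above concerns an out-parameter: `sizeof(size)` is the size of the pointer, and `&size` is the address of the pointer variable rather than of the caller's storage. The sketch below reproduces the corrected pattern against a stand-in for `sysctl`; `fake_query` and `get_total_memory` are hypothetical names, and the 8 GiB value is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a sysctl-style query: copies a 64-bit value
 * into buf if the caller's buffer (*len bytes) is large enough. */
static int fake_query(void *buf, size_t *len)
{
    const uint64_t value = 8ULL << 30; /* pretend the machine has 8 GiB */

    if (*len < sizeof(value))
        return -1;
    memcpy(buf, &value, sizeof(value));
    *len = sizeof(value);
    return 0;
}

/* Returns 1 on success and stores the result through `size`. */
static int get_total_memory(uint64_t *size)
{
    size_t len = sizeof(*size); /* size of the pointee, not of the pointer */

    assert(size != NULL);
    return fake_query(size, &len) == 0; /* pass the pointer itself, not &size */
}
```

On a 32-bit target the original `sizeof(size)` would be 4, so a real query for a 64-bit value could fail or truncate; passing `&size` would additionally write into the local pointer instead of the caller's variable.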

@@ -70,8 +70,8 @@ static INLINE void *os_mmap(void *addr, size_t length, int prot, int flags,
return __mmap2(addr, length, prot, flags, fd, (size_t) (offset >> 12));
}
# define drm_munmap(addr, length) \
munmap(addr, length)
# define os_munmap(addr, length) \
munmap(addr, length)
#else
/* assume large file support exists */

@@ -272,7 +272,7 @@ static INLINE uint64_t xgetbv(void)
#if defined(PIPE_ARCH_X86)
static INLINE boolean sse2_has_daz(void)
PIPE_ALIGN_STACK static INLINE boolean sse2_has_daz(void)
{
struct {
uint32_t pad1[7];
@@ -409,8 +409,12 @@ util_cpu_detect(void)
}
if (regs[0] >= 0x80000006) {
/* should we really do this if the clflush size above worked? */
unsigned int cacheline;
cpuid(0x80000006, regs2);
util_cpu_caps.cacheline = regs2[2] & 0xFF;
cacheline = regs2[2] & 0xFF;
if (cacheline > 0)
util_cpu_caps.cacheline = cacheline;
}
if (!util_cpu_caps.has_sse) {

@@ -561,14 +561,10 @@ util_last_bit(unsigned u)
static INLINE unsigned
util_last_bit_signed(int i)
{
#if defined(__GNUC__) && ((__GNUC__ * 100 + __GNUC_MINOR__) >= 407) && !defined(__INTEL_COMPILER)
return 31 - __builtin_clrsb(i);
#else
if (i >= 0)
return util_last_bit(i);
else
return util_last_bit(~(unsigned)i);
#endif
}
/* Destructively loop over all of the bits in a mask as in:

@@ -13,7 +13,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10347 bytes, from 2014-10-01 18:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14960 bytes, from 2014-07-27 17:22:13)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 60533 bytes, from 2014-10-15 18:32:43)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 64771 bytes, from 2015-03-15 21:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 41068 bytes, from 2014-08-01 12:22:48)
Copyright (C) 2013-2014 by the following authors:

@@ -13,7 +13,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10347 bytes, from 2014-10-01 18:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14960 bytes, from 2014-07-27 17:22:13)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 60533 bytes, from 2014-10-15 18:32:43)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 64771 bytes, from 2015-03-15 21:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 41068 bytes, from 2014-08-01 12:22:48)
Copyright (C) 2013-2014 by the following authors:
@@ -2572,7 +2572,7 @@ static inline uint32_t A3XX_TEX_CONST_2_SWAP(enum a3xx_color_swap val)
}
#define REG_A3XX_TEX_CONST_3 0x00000003
#define A3XX_TEX_CONST_3_LAYERSZ1__MASK 0x0000000f
#define A3XX_TEX_CONST_3_LAYERSZ1__MASK 0x00001fff
#define A3XX_TEX_CONST_3_LAYERSZ1__SHIFT 0
static inline uint32_t A3XX_TEX_CONST_3_LAYERSZ1(uint32_t val)
{

@@ -13,7 +13,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10347 bytes, from 2014-10-01 18:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14960 bytes, from 2014-07-27 17:22:13)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 60533 bytes, from 2014-10-15 18:32:43)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 64771 bytes, from 2015-03-15 21:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 41068 bytes, from 2014-08-01 12:22:48)
Copyright (C) 2013-2014 by the following authors:

@@ -13,7 +13,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10347 bytes, from 2014-10-01 18:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14960 bytes, from 2014-07-27 17:22:13)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 60533 bytes, from 2014-10-15 18:32:43)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 64771 bytes, from 2015-03-15 21:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 41068 bytes, from 2014-08-01 12:22:48)
Copyright (C) 2013-2014 by the following authors:

@@ -199,7 +199,7 @@ setup_slices(struct fd_resource *rsc)
for (level = 0; level <= prsc->last_level; level++) {
struct fd_resource_slice *slice = fd_resource_slice(rsc, level);
slice->pitch = align(width, 32);
slice->pitch = width = align(width, 32);
slice->offset = size;
slice->size0 = slice->pitch * height * rsc->cpp;

@@ -123,12 +123,12 @@ fd_set_framebuffer_state(struct pipe_context *pctx,
fd_context_render(pctx);
util_copy_framebuffer_state(cso, framebuffer);
if ((cso->width != framebuffer->width) ||
(cso->height != framebuffer->height))
ctx->needs_rb_fbd = true;
util_copy_framebuffer_state(cso, framebuffer);
ctx->dirty |= FD_DIRTY_FRAMEBUFFER;
ctx->disabled_scissor.minx = 0;

@@ -1421,6 +1421,7 @@ trans_txq(const struct instr_translater *t,
struct tgsi_dst_register *dst = &inst->Dst[0].Register;
struct tgsi_src_register *level = &inst->Src[0].Register;
struct tgsi_src_register *samp = &inst->Src[1].Register;
const struct target_info *tgt = &tex_targets[inst->Texture.Texture];
struct tex_info tinf;
memset(&tinf, 0, sizeof(tinf));
@@ -1434,8 +1435,67 @@ trans_txq(const struct instr_translater *t,
instr->cat5.tex = samp->Index;
instr->flags |= tinf.flags;
add_dst_reg_wrmask(ctx, instr, dst, 0, dst->WriteMask);
add_src_reg_wrmask(ctx, instr, level, level->SwizzleX, 0x1);
if (tgt->array && (dst->WriteMask & (1 << tgt->dims))) {
/* Array size actually ends up in .w rather than .z. This doesn't
* matter for miplevel 0, but for higher mips the value in z is
* minified whereas w stays. Also, the value in TEX_CONST_3_DEPTH is
* returned, which means that we have to add 1 to it for arrays.
*/
struct tgsi_dst_register tmp_dst;
struct tgsi_src_register *tmp_src;
type_t type_mov = get_utype(ctx);
tmp_src = get_internal_temp(ctx, &tmp_dst);
add_dst_reg_wrmask(ctx, instr, &tmp_dst, 0,
dst->WriteMask | TGSI_WRITEMASK_W);
add_src_reg_wrmask(ctx, instr, level, level->SwizzleX, 0x1);
if (dst->WriteMask & TGSI_WRITEMASK_X) {
instr = instr_create(ctx, 1, 0);
instr->cat1.src_type = type_mov;
instr->cat1.dst_type = type_mov;
add_dst_reg(ctx, instr, dst, 0);
add_src_reg(ctx, instr, tmp_src, src_swiz(tmp_src, 0));
}
if (tgt->dims == 2) {
if (dst->WriteMask & TGSI_WRITEMASK_Y) {
instr = instr_create(ctx, 1, 0);
instr->cat1.src_type = type_mov;
instr->cat1.dst_type = type_mov;
add_dst_reg(ctx, instr, dst, 1);
add_src_reg(ctx, instr, tmp_src, src_swiz(tmp_src, 1));
}
}
instr = instr_create(ctx, 2, OPC_ADD_U);
add_dst_reg(ctx, instr, dst, tgt->dims);
add_src_reg(ctx, instr, tmp_src, src_swiz(tmp_src, 3));
ir3_reg_create(instr, 0, IR3_REG_IMMED)->iim_val = 1;
} else {
add_dst_reg_wrmask(ctx, instr, dst, 0, dst->WriteMask);
add_src_reg_wrmask(ctx, instr, level, level->SwizzleX, 0x1);
}
if (dst->WriteMask & TGSI_WRITEMASK_W) {
/* The # of levels comes from getinfo.z. We need to add 1 to it, since
* the value in TEX_CONST_0 is zero-based.
*/
struct tgsi_dst_register tmp_dst;
struct tgsi_src_register *tmp_src;
tmp_src = get_internal_temp(ctx, &tmp_dst);
instr = instr_create(ctx, 5, OPC_GETINFO);
instr->cat5.type = get_utype(ctx);
instr->cat5.samp = samp->Index;
instr->cat5.tex = samp->Index;
add_dst_reg_wrmask(ctx, instr, &tmp_dst, 0, TGSI_WRITEMASK_Z);
instr = instr_create(ctx, 2, OPC_ADD_U);
add_dst_reg(ctx, instr, dst, 3);
add_src_reg(ctx, instr, tmp_src, src_swiz(tmp_src, 2));
ir3_reg_create(instr, 0, IR3_REG_IMMED)->iim_val = 1;
}
}
/* DDX/DDY */
@@ -3094,7 +3154,7 @@ ir3_compile_shader(struct ir3_shader_variant *so,
if (key.binning_pass) {
for (i = 0, j = 0; i < so->outputs_count; i++) {
unsigned name = sem2name(so->outputs[i].semantic);
unsigned idx = sem2name(so->outputs[i].semantic);
unsigned idx = sem2idx(so->outputs[i].semantic);
/* throw away everything but first position/psize */
if ((idx == 0) && ((name == TGSI_SEMANTIC_POSITION) ||

@@ -33,6 +33,7 @@
#include "util/u_pointer.h"
#include "util/u_memory.h"
#include "util/u_math.h"
#include "util/u_cpu_detect.h"
#include "gallivm/lp_bld.h"
#include "gallivm/lp_bld_debug.h"
@@ -332,6 +333,38 @@ build_unary_test_func(struct gallivm_state *gallivm,
}
/*
* Flush denorms to zero.
*/
static float
flush_denorm_to_zero(float val)
{
/*
* If we have a denorm manually set it to (+-)0.
* This is because the reference may or may not do the right thing
* otherwise because we want the result according to treating all
* denormals as zero (FTZ/DAZ). Not using fpclassify because
* a) some compilers are stuck at c89 (msvc)
* b) not sure it reliably works with non-standard ftz/daz mode
* And, right now we only disable denorms with jited code on x86/sse
* (albeit this should be classified as a bug) so to get results which
* match we must only flush them to zero here in that case too.
*/
union fi fi_val;
fi_val.f = val;
#if defined(PIPE_ARCH_SSE)
if (util_cpu_caps.has_sse) {
if ((fi_val.ui & 0x7f800000) == 0) {
fi_val.ui &= 0xff800000;
}
}
#endif
return fi_val.f;
}
/*
* Test one LLVM unary arithmetic builder function.
*/
@@ -374,10 +407,13 @@ test_unary(unsigned verbose, FILE *fp, const struct unary_test_t *test)
test_func_jit(out, in);
for (i = 0; i < num_vals; ++i) {
float ref = test->ref(in[i]);
float testval, ref;
double error, precision;
bool pass;
testval = flush_denorm_to_zero(in[i]);
ref = flush_denorm_to_zero(test->ref(testval));
if (util_inf_sign(ref) && util_inf_sign(out[i]) == util_inf_sign(ref)) {
error = 0;
} else {
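The `flush_denorm_to_zero` helper added in this file treats any float whose exponent field is zero as a signed zero. A standalone sketch of the same bit test is given below; it uses `memcpy` instead of the `union fi` type from the Mesa headers and omits the `PIPE_ARCH_SSE` / `util_cpu_caps.has_sse` guard, so it always flushes. The name `flush_denorm_sketch` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Flush a single-precision denormal to zero while keeping its sign.
 * An IEEE 754 binary32 value is zero or denormal exactly when its
 * 8 exponent bits (mask 0x7f800000) are all zero.
 */
static float flush_denorm_sketch(float val)
{
    uint32_t bits;

    assert(sizeof(bits) == sizeof(val));
    memcpy(&bits, &val, sizeof(bits));
    if ((bits & 0x7f800000u) == 0)
        bits &= 0xff800000u; /* keep sign and (zero) exponent, clear mantissa */
    memcpy(&val, &bits, sizeof(val));
    return val;
}
```

In the test loop the helper is applied to both the input and the reference result, so the reference is compared as if denormals were treated as zero on both sides.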

@@ -772,7 +772,8 @@ NV50LoweringPreSSA::handleTEX(TexInstruction *i)
if (i->tex.useOffsets) {
for (int c = 0; c < 3; ++c) {
ImmediateValue val;
assert(i->offset[0][c].getImmediate(val));
if (!i->offset[0][c].getImmediate(val))
assert(!"non-immediate offset");
i->tex.offset[c] = val.reg.data.u32;
i->offset[0][c].set(NULL);
}

@@ -754,7 +754,8 @@ NVC0LoweringPass::handleTEX(TexInstruction *i)
assert(i->tex.useOffsets == 1);
for (c = 0; c < 3; ++c) {
ImmediateValue val;
assert(i->offset[0][c].getImmediate(val));
if (!i->offset[0][c].getImmediate(val))
assert(!"non-immediate offset passed to non-TXG");
imm |= (val.reg.data.u32 & 0xf) << (c * 4);
}
if (i->op == OP_TXD && chipset >= NVISA_GK104_CHIPSET) {

@@ -1708,7 +1708,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#define NV50_3D_CULL_FACE_BACK 0x00000405
#define NV50_3D_CULL_FACE_FRONT_AND_BACK 0x00000408
#define NV50_3D_LINE_LAST_PIXEL 0x00001924
#define NV50_3D_PIXEL_CENTER_INTEGER 0x00001924
#define NVA3_3D_FP_MULTISAMPLE 0x00001928
#define NVA3_3D_FP_MULTISAMPLE_EXPORT_SAMPLE_MASK 0x00000001

@@ -461,8 +461,6 @@ nv50_screen_init_hwctx(struct nv50_screen *screen)
PUSH_DATA (push, 0);
BEGIN_NV04(push, NV50_3D(PRIM_RESTART_WITH_DRAW_ARRAYS), 1);
PUSH_DATA (push, 1);
BEGIN_NV04(push, NV50_3D(LINE_LAST_PIXEL), 1);
PUSH_DATA (push, 0);
BEGIN_NV04(push, NV50_3D(BLEND_SEPARATE_ALPHA), 1);
PUSH_DATA (push, 1);
@@ -609,6 +607,13 @@ nv50_screen_init_hwctx(struct nv50_screen *screen)
BEGIN_NV04(push, NV50_3D(EDGEFLAG), 1);
PUSH_DATA (push, 1);
BEGIN_NV04(push, NV50_3D(VB_ELEMENT_BASE), 1);
PUSH_DATA (push, 0);
if (screen->base.class_3d >= NV84_3D_CLASS) {
BEGIN_NV04(push, SUBC_3D(NV84_3D_VERTEX_ID_BASE), 1);
PUSH_DATA (push, 0);
}
PUSH_KICK (push);
}

@@ -57,10 +57,6 @@
* ! pipe_rasterizer_state.flatshade_first also applies to QUADS
* (There's a GL query for that, forcing an exception is just ridiculous.)
*
* ! pipe_rasterizer_state.half_pixel_center is ignored - pixel centers
* are always at half integer coordinates and the top-left rule applies
* (There does not seem to be a hardware switch for this.)
*
* ! pipe_rasterizer_state.sprite_coord_enable is masked with 0xff on NVC0
* (The hardware only has 8 slots meant for TexCoord and we have to assign
* in advance to maintain elegant separate shader objects.)
@@ -221,7 +217,7 @@ nv50_blend_state_delete(struct pipe_context *pipe, void *hwcso)
FREE(hwcso);
}
/* NOTE: ignoring line_last_pixel, using FALSE (set on screen init) */
/* NOTE: ignoring line_last_pixel */
static void *
nv50_rasterizer_state_create(struct pipe_context *pipe,
const struct pipe_rasterizer_state *cso)
@@ -336,6 +332,9 @@ nv50_rasterizer_state_create(struct pipe_context *pipe,
SB_BEGIN_3D(so, DEPTH_CLIP_NEGATIVE_Z, 1);
SB_DATA (so, cso->clip_halfz);
SB_BEGIN_3D(so, PIXEL_CENTER_INTEGER, 1);
SB_DATA (so, !cso->half_pixel_center);
assert(so->size <= (sizeof(so->state) / sizeof(so->state[0])));
return (void *)so;
}

@@ -25,7 +25,7 @@ struct nv50_blend_stateobj {
struct nv50_rasterizer_stateobj {
struct pipe_rasterizer_state pipe;
int size;
uint32_t state[48];
uint32_t state[49];
};
struct nv50_zsa_stateobj {

@@ -472,6 +472,10 @@ nv50_draw_arrays(struct nv50_context *nv50,
if (nv50->state.index_bias) {
BEGIN_NV04(push, NV50_3D(VB_ELEMENT_BASE), 1);
PUSH_DATA (push, 0);
if (nv50->screen->base.class_3d >= NV84_3D_CLASS) {
BEGIN_NV04(push, SUBC_3D(NV84_3D_VERTEX_ID_BASE), 1);
PUSH_DATA (push, 0);
}
nv50->state.index_bias = 0;
}
@@ -594,6 +598,10 @@ nv50_draw_elements(struct nv50_context *nv50, boolean shorten,
if (index_bias != nv50->state.index_bias) {
BEGIN_NV04(push, NV50_3D(VB_ELEMENT_BASE), 1);
PUSH_DATA (push, index_bias);
if (nv50->screen->base.class_3d >= NV84_3D_CLASS) {
BEGIN_NV04(push, SUBC_3D(NV84_3D_VERTEX_ID_BASE), 1);
PUSH_DATA (push, index_bias);
}
nv50->state.index_bias = index_bias;
}

@@ -227,6 +227,7 @@ locn_0f_ts:
/* NVC0_3D_MACRO_DRAW_ELEMENTS_INDIRECT
*
* NOTE: Saves and restores VB_ELEMENT,INSTANCE_BASE.
* Forcefully sets VERTEX_ID_BASE to the value of VB_ELEMENT_BASE.
*
* arg = mode
* parm[0] = count
@@ -247,6 +248,8 @@ locn_0f_ts:
maddr 0x150d /* VB_ELEMENT,INSTANCE_BASE */
send $r4
send $r5
maddr 0x446
send $r4
mov $r4 0x1
dei_again:
maddr 0x586 /* VERTEX_BEGIN_GL */
@@ -258,8 +261,10 @@ dei_again:
branz $r2 #dei_again
mov $r1 (extrinsrt $r1 $r4 0 1 26) /* set INSTANCE_NEXT */
maddr 0x150d /* VB_ELEMENT,INSTANCE_BASE */
exit send $r6
send $r6
send $r7
exit maddr 0x446
send $r6
dei_end:
exit
nop

@@ -128,16 +128,18 @@ uint32_t mme9097_draw_elts_indirect[] = {
0x00000301,
0x00000201,
0x017dc451,
/* 0x000c: dei_again */
/* 0x000e: dei_again */
0x00002431,
0x0004d007,
/* 0x0017: dei_end */
0x0005d007,
0x00000501,
/* 0x001b: dei_end */
0x01434615,
0x01438715,
0x05434021,
0x00002041,
0x00002841,
0x01118021,
0x00002041,
0x00004411,
0x01618021,
0x00000841,
@@ -148,8 +150,10 @@ uint32_t mme9097_draw_elts_indirect[] = {
0xfffe9017,
0xd0410912,
0x05434021,
0x000030c1,
0x00003041,
0x00003841,
0x011180a1,
0x00003041,
0x00000091,
0x00000011,
};

@@ -1041,7 +1041,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#define NVC0_3D_CULL_FACE_BACK 0x00000405
#define NVC0_3D_CULL_FACE_FRONT_AND_BACK 0x00000408
#define NVC0_3D_LINE_LAST_PIXEL 0x00001924
#define NVC0_3D_PIXEL_CENTER_INTEGER 0x00001924
#define NVC0_3D_VIEWPORT_TRANSFORM_EN 0x0000192c

@@ -786,8 +786,6 @@ nvc0_screen_create(struct nouveau_device *dev)
PUSH_DATA (push, 0);
BEGIN_NVC0(push, NVC0_3D(LINE_WIDTH_SEPARATE), 1);
PUSH_DATA (push, 1);
BEGIN_NVC0(push, NVC0_3D(LINE_LAST_PIXEL), 1);
PUSH_DATA (push, 0);
BEGIN_NVC0(push, NVC0_3D(PRIM_RESTART_WITH_DRAW_ARRAYS), 1);
PUSH_DATA (push, 1);
BEGIN_NVC0(push, NVC0_3D(BLEND_SEPARATE_ALPHA), 1);

@@ -252,7 +252,12 @@ nvc0_tfb_validate(struct nvc0_context *nvc0)
for (b = 0; b < nvc0->num_tfbbufs; ++b) {
struct nvc0_so_target *targ = nvc0_so_target(nvc0->tfbbuf[b]);
struct nv04_resource *buf = nv04_resource(targ->pipe.buffer);
struct nv04_resource *buf;
if (!targ) {
IMMED_NVC0(push, NVC0_3D(TFB_BUFFER_ENABLE(b)), 0);
continue;
}
if (tfb)
targ->stride = tfb->stride[b];
@@ -260,6 +265,8 @@ nvc0_tfb_validate(struct nvc0_context *nvc0)
if (!(nvc0->tfbbuf_dirty & (1 << b)))
continue;
buf = nv04_resource(targ->pipe.buffer);
if (!targ->clean)
nvc0_query_fifo_wait(push, targ->pq);
BEGIN_NVC0(push, NVC0_3D(TFB_BUFFER_ENABLE(b)), 5);

@@ -204,7 +204,7 @@ nvc0_blend_state_delete(struct pipe_context *pipe, void *hwcso)
FREE(hwcso);
}
/* NOTE: ignoring line_last_pixel, using FALSE (set on screen init) */
/* NOTE: ignoring line_last_pixel */
static void *
nvc0_rasterizer_state_create(struct pipe_context *pipe,
const struct pipe_rasterizer_state *cso)
@@ -315,6 +315,8 @@ nvc0_rasterizer_state_create(struct pipe_context *pipe,
SB_IMMED_3D(so, DEPTH_CLIP_NEGATIVE_Z, cso->clip_halfz);
SB_IMMED_3D(so, PIXEL_CENTER_INTEGER, !cso->half_pixel_center);
assert(so->size <= (sizeof(so->state) / sizeof(so->state[0])));
return (void *)so;
}
@@ -1087,9 +1089,11 @@ nvc0_set_transform_feedback_targets(struct pipe_context *pipe,
pipe_so_target_reference(&nvc0->tfbbuf[i], targets[i]);
}
for (; i < nvc0->num_tfbbufs; ++i) {
nvc0->tfbbuf_dirty |= 1 << i;
nvc0_so_target_save_offset(pipe, nvc0->tfbbuf[i], i, &serialize);
pipe_so_target_reference(&nvc0->tfbbuf[i], NULL);
if (nvc0->tfbbuf[i]) {
nvc0->tfbbuf_dirty |= 1 << i;
nvc0_so_target_save_offset(pipe, nvc0->tfbbuf[i], i, &serialize);
pipe_so_target_reference(&nvc0->tfbbuf[i], NULL);
}
}
nvc0->num_tfbbufs = num_targets;

@@ -23,7 +23,7 @@ struct nvc0_blend_stateobj {
struct nvc0_rasterizer_stateobj {
struct pipe_rasterizer_state pipe;
int size;
uint32_t state[43];
uint32_t state[44];
};
struct nvc0_zsa_stateobj {

@@ -1401,11 +1401,14 @@ nvc0_blit(struct pipe_context *pipe, const struct pipe_blit_info *info)
} else
if (!nv50_2d_src_format_faithful(info->src.format)) {
if (!util_format_is_luminance(info->src.format)) {
if (!nv50_2d_dst_format_ops_supported(info->dst.format))
eng3d = TRUE;
else
if (util_format_is_intensity(info->src.format))
eng3d = info->src.format != PIPE_FORMAT_I8_UNORM;
else
if (!nv50_2d_dst_format_ops_supported(info->dst.format))
eng3d = TRUE;
if (util_format_is_alpha(info->src.format))
eng3d = info->src.format != PIPE_FORMAT_A8_UNORM;
else
eng3d = !nv50_2d_format_supported(info->src.format);
}

@@ -575,8 +575,9 @@ nvc0_draw_arrays(struct nvc0_context *nvc0,
if (nvc0->state.index_bias) {
/* index_bias is implied 0 if !info->indexed (really ?) */
/* TODO: can we deactivate it for the VERTEX_BUFFER_FIRST command ? */
PUSH_SPACE(push, 1);
PUSH_SPACE(push, 2);
IMMED_NVC0(push, NVC0_3D(VB_ELEMENT_BASE), 0);
IMMED_NVC0(push, NVC0_3D(VERTEX_ID), 0);
nvc0->state.index_bias = 0;
}
@@ -705,9 +706,11 @@ nvc0_draw_elements(struct nvc0_context *nvc0, boolean shorten,
prim = nvc0_prim_gl(mode);
if (index_bias != nvc0->state.index_bias) {
PUSH_SPACE(push, 2);
PUSH_SPACE(push, 4);
BEGIN_NVC0(push, NVC0_3D(VB_ELEMENT_BASE), 1);
PUSH_DATA (push, index_bias);
BEGIN_NVC0(push, NVC0_3D(VERTEX_ID), 1);
PUSH_DATA (push, index_bias);
nvc0->state.index_bias = index_bias;
}
@@ -818,6 +821,7 @@ nvc0_draw_indirect(struct nvc0_context *nvc0, const struct pipe_draw_info *info)
if (nvc0->state.index_bias) {
/* index_bias is implied 0 if !info->indexed (really ?) */
IMMED_NVC0(push, NVC0_3D(VB_ELEMENT_BASE), 0);
IMMED_NVC0(push, NVC0_3D(VERTEX_ID), 0);
nvc0->state.index_bias = 0;
}
size = 4 * 4;

@@ -28,6 +28,7 @@
*/
#include <errno.h>
#include <limits.h>
#include <regex.h>
#include <stdlib.h>
#include <stdio.h>
@@ -528,7 +529,6 @@ void init_compiler(
}
#define MAX_LINE_LENGTH 100
#define MAX_PATH_LENGTH 100
unsigned load_program(
struct radeon_compiler *c,
@@ -536,14 +536,19 @@ unsigned load_program(
const char *filename)
{
char line[MAX_LINE_LENGTH];
char path[MAX_PATH_LENGTH];
char path[PATH_MAX];
FILE *file;
unsigned *count;
char **string_store;
unsigned i = 0;
int n;
memset(line, 0, sizeof(line));
snprintf(path, MAX_PATH_LENGTH, TEST_PATH "/%s", filename);
n = snprintf(path, PATH_MAX, TEST_PATH "/%s", filename);
if (n < 0 || n >= PATH_MAX) {
return 0;
}
file = fopen(path, "r");
if (!file) {
return 0;
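
The load_program change above stops ignoring the return value of snprintf and bails out when the path would not fit. A minimal standalone sketch of the same check follows; build_path and its parameters are hypothetical names, not part of the r300 test helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: join dir and name into out.
 * Returns 1 on success, 0 on an encoding error or if the result
 * (including the terminating NUL) does not fit in out_size bytes. */
static int build_path(char *out, size_t out_size,
                      const char *dir, const char *name)
{
    int n;

    assert(out != NULL && out_size > 0);
    n = snprintf(out, out_size, "%s/%s", dir, name);
    /* snprintf returns the length it wanted to write, so a value
     * >= out_size means the output was truncated. */
    return n >= 0 && (size_t)n < out_size;
}
```

Checking both `n < 0` and `n >= size` matches the pattern in the diff; a caller that only tests for a negative return would silently open a truncated path.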

@@ -803,6 +803,15 @@ static void r300_blit(struct pipe_context *pipe,
(struct pipe_framebuffer_state*)r300->fb_state.state;
struct pipe_blit_info info = *blit;
/* The driver supports sRGB textures but not framebuffers. Blitting
* from sRGB to sRGB should be the same as blitting from linear
* to linear, so use that. This avoids incorrect linearization.
*/
if (util_format_is_srgb(info.src.format)) {
info.src.format = util_format_linear(info.src.format);
info.dst.format = util_format_linear(info.dst.format);
}
/* MSAA resolve. */
if (info.src.resource->nr_samples > 1 &&
!util_format_is_depth_or_stencil(info.src.resource->format)) {

@@ -170,24 +170,10 @@ static void get_external_state(
}
state->unit[i].non_normalized_coords = !s->state.normalized_coords;
state->unit[i].convert_unorm_to_snorm =
v->base.format == PIPE_FORMAT_RGTC1_SNORM ||
v->base.format == PIPE_FORMAT_LATC1_SNORM;
state->unit[i].convert_unorm_to_snorm = 0;
/* Pass texture swizzling to the compiler, some lowering passes need it. */
if (v->base.format == PIPE_FORMAT_RGTC1_SNORM ||
v->base.format == PIPE_FORMAT_LATC1_SNORM) {
unsigned char swizzle[4];
util_format_compose_swizzles(
util_format_description(v->base.format)->swizzle,
v->swizzle,
swizzle);
state->unit[i].texture_swizzle =
RC_MAKE_SWIZZLE(swizzle[0], swizzle[1],
swizzle[2], swizzle[3]);
} else if (state->unit[i].compare_mode_enabled) {
if (state->unit[i].compare_mode_enabled) {
state->unit[i].texture_swizzle =
RC_MAKE_SWIZZLE(v->swizzle[0], v->swizzle[1],
v->swizzle[2], v->swizzle[3]);

@@ -169,20 +169,21 @@ uint32_t r300_translate_texformat(enum pipe_format format,
/* Add swizzling. */
/* The RGTC1_SNORM and LATC1_SNORM swizzle is done in the shader. */
if (format != PIPE_FORMAT_RGTC1_SNORM &&
if (util_format_is_compressed(format) &&
dxtc_swizzle &&
format != PIPE_FORMAT_RGTC2_UNORM &&
format != PIPE_FORMAT_RGTC2_SNORM &&
format != PIPE_FORMAT_LATC2_UNORM &&
format != PIPE_FORMAT_LATC2_SNORM &&
format != PIPE_FORMAT_RGTC1_UNORM &&
format != PIPE_FORMAT_RGTC1_SNORM &&
format != PIPE_FORMAT_LATC1_UNORM &&
format != PIPE_FORMAT_LATC1_SNORM) {
if (util_format_is_compressed(format) &&
dxtc_swizzle &&
format != PIPE_FORMAT_RGTC2_UNORM &&
format != PIPE_FORMAT_RGTC2_SNORM &&
format != PIPE_FORMAT_LATC2_UNORM &&
format != PIPE_FORMAT_LATC2_SNORM) {
result |= r300_get_swizzle_combined(desc->swizzle, swizzle_view,
TRUE);
} else {
result |= r300_get_swizzle_combined(desc->swizzle, swizzle_view,
FALSE);
}
result |= r300_get_swizzle_combined(desc->swizzle, swizzle_view,
TRUE);
} else {
result |= r300_get_swizzle_combined(desc->swizzle, swizzle_view,
FALSE);
}
/* S3TC formats. */
@@ -213,6 +214,7 @@ uint32_t r300_translate_texformat(enum pipe_format format,
switch (format) {
case PIPE_FORMAT_RGTC1_SNORM:
case PIPE_FORMAT_LATC1_SNORM:
result |= sign_bit[0];
case PIPE_FORMAT_LATC1_UNORM:
case PIPE_FORMAT_RGTC1_UNORM:
return R500_TX_FORMAT_ATI1N | result;
@@ -936,14 +938,16 @@ static void r300_texture_setup_fb_state(struct r300_surface *surf)
surf->pitch_zmask = tex->tex.zmask_stride_in_pixels[level];
surf->pitch_hiz = tex->tex.hiz_stride_in_pixels[level];
} else {
enum pipe_format format = util_format_linear(surf->base.format);
surf->pitch =
stride |
r300_translate_colorformat(surf->base.format) |
r300_translate_colorformat(format) |
R300_COLOR_TILE(tex->tex.macrotile[level]) |
R300_COLOR_MICROTILE(tex->tex.microtile);
surf->format = r300_translate_out_fmt(surf->base.format);
surf->format = r300_translate_out_fmt(format);
surf->colormask_swizzle =
r300_translate_colormask_swizzle(surf->base.format);
r300_translate_colormask_swizzle(format);
surf->pitch_cmask = tex->tex.cmask_stride_in_pixels;
}
}

@@ -6071,7 +6071,7 @@ static int tgsi_ucmp(struct r600_shader_ctx *ctx)
continue;
memset(&alu, 0, sizeof(struct r600_bytecode_alu));
alu.op = ALU_OP3_CNDGE_INT;
alu.op = ALU_OP3_CNDE_INT;
r600_bytecode_src(&alu.src[0], &ctx->src[0], i);
r600_bytecode_src(&alu.src[1], &ctx->src[2], i);
r600_bytecode_src(&alu.src[2], &ctx->src[1], i);

@@ -2659,11 +2659,8 @@ void r600_update_gs_state(struct pipe_context *ctx, struct r600_pipe_shader *sha
r600_store_context_reg(cb, R_028A6C_VGT_GS_OUT_PRIM_TYPE,
r600_conv_prim_to_gs_out(rshader->gs_output_prim));
r600_store_context_reg_seq(cb, R_0288C8_SQ_GS_VERT_ITEMSIZE, 4);
r600_store_value(cb, cp_shader->ring_item_size >> 2);
r600_store_value(cb, 0);
r600_store_value(cb, 0);
r600_store_value(cb, 0);
r600_store_context_reg(cb, R_0288C8_SQ_GS_VERT_ITEMSIZE,
cp_shader->ring_item_size >> 2);
r600_store_context_reg(cb, R_0288A8_SQ_ESGS_RING_ITEMSIZE,
(rshader->ring_item_size) >> 2);

@@ -616,6 +616,8 @@ public:
unsigned num_slots;
bool uses_mova_gpr;
bool r6xx_gpr_index_workaround;
bool stack_workaround_8xx;
bool stack_workaround_9xx;

@@ -38,6 +38,18 @@
namespace r600_sb {
void bc_finalizer::insert_rv6xx_load_ar_workaround(alu_group_node *b4) {
alu_group_node *g = sh.create_alu_group();
alu_node *a = sh.create_alu();
a->bc.set_op(ALU_OP0_NOP);
a->bc.last = 1;
g->push_back(a);
b4->insert_before(g);
}
int bc_finalizer::run() {
run_on(sh.root);
@@ -46,22 +58,15 @@ int bc_finalizer::run() {
for (regions_vec::reverse_iterator I = rv.rbegin(), E = rv.rend(); I != E;
++I) {
region_node *r = *I;
bool is_if = false;
assert(r);
assert(r->first);
if (r->first->is_container()) {
container_node *repdep1 = static_cast<container_node*>(r->first);
assert(repdep1->is_depart() || repdep1->is_repeat());
if_node *n_if = static_cast<if_node*>(repdep1->first);
if (n_if && n_if->is_if())
is_if = true;
}
bool loop = r->is_loop();
if (is_if)
finalize_if(r);
else
if (loop)
finalize_loop(r);
else
finalize_if(r);
r->expand();
}
@@ -117,35 +122,20 @@ int bc_finalizer::run() {
void bc_finalizer::finalize_loop(region_node* r) {
update_nstack(r);
cf_node *loop_start = sh.create_cf(CF_OP_LOOP_START_DX10);
cf_node *loop_end = sh.create_cf(CF_OP_LOOP_END);
bool has_instr = false;
if (!r->is_loop()) {
for (depart_vec::iterator I = r->departs.begin(), E = r->departs.end();
I != E; ++I) {
depart_node *dep = *I;
if (!dep->empty()) {
has_instr = true;
break;
}
}
} else
has_instr = true;
if (has_instr) {
loop_start->jump_after(loop_end);
loop_end->jump_after(loop_start);
}
loop_start->jump_after(loop_end);
loop_end->jump_after(loop_start);
for (depart_vec::iterator I = r->departs.begin(), E = r->departs.end();
I != E; ++I) {
depart_node *dep = *I;
if (has_instr) {
cf_node *loop_break = sh.create_cf(CF_OP_LOOP_BREAK);
loop_break->jump(loop_end);
dep->push_back(loop_break);
}
cf_node *loop_break = sh.create_cf(CF_OP_LOOP_BREAK);
loop_break->jump(loop_end);
dep->push_back(loop_break);
dep->expand();
}
@@ -161,10 +151,8 @@ void bc_finalizer::finalize_loop(region_node* r) {
rep->expand();
}
if (has_instr) {
r->push_front(loop_start);
r->push_back(loop_end);
}
r->push_front(loop_start);
r->push_back(loop_end);
}
void bc_finalizer::finalize_if(region_node* r) {
@@ -235,12 +223,12 @@ void bc_finalizer::finalize_if(region_node* r) {
}
void bc_finalizer::run_on(container_node* c) {
node *prev_node = NULL;
for (node_iterator I = c->begin(), E = c->end(); I != E; ++I) {
node *n = *I;
if (n->is_alu_group()) {
finalize_alu_group(static_cast<alu_group_node*>(n));
finalize_alu_group(static_cast<alu_group_node*>(n), prev_node);
} else {
if (n->is_alu_clause()) {
cf_node *c = static_cast<cf_node*>(n);
@@ -275,17 +263,22 @@ void bc_finalizer::run_on(container_node* c) {
if (n->is_container())
run_on(static_cast<container_node*>(n));
}
prev_node = n;
}
}
void bc_finalizer::finalize_alu_group(alu_group_node* g) {
void bc_finalizer::finalize_alu_group(alu_group_node* g, node *prev_node) {
alu_node *last = NULL;
alu_group_node *prev_g = NULL;
bool add_nop = false;
if (prev_node && prev_node->is_alu_group()) {
prev_g = static_cast<alu_group_node*>(prev_node);
}
for (node_iterator I = g->begin(), E = g->end(); I != E; ++I) {
alu_node *n = static_cast<alu_node*>(*I);
unsigned slot = n->bc.slot;
value *d = n->dst.empty() ? NULL : n->dst[0];
if (d && d->is_special_reg()) {
@@ -323,17 +316,22 @@ void bc_finalizer::finalize_alu_group(alu_group_node* g) {
update_ngpr(n->bc.dst_gpr);
finalize_alu_src(g, n);
add_nop |= finalize_alu_src(g, n, prev_g);
last = n;
}
if (add_nop) {
if (sh.get_ctx().r6xx_gpr_index_workaround) {
insert_rv6xx_load_ar_workaround(g);
}
}
last->bc.last = 1;
}
void bc_finalizer::finalize_alu_src(alu_group_node* g, alu_node* a) {
bool bc_finalizer::finalize_alu_src(alu_group_node* g, alu_node* a, alu_group_node *prev) {
vvec &sv = a->src;
bool add_nop = false;
FBC_DUMP(
sblog << "finalize_alu_src: ";
dump::dump_op(a);
@@ -360,6 +358,15 @@ void bc_finalizer::finalize_alu_src(alu_group_node* g, alu_node* a) {
if (!v->rel->is_const()) {
src.rel = 1;
update_ngpr(v->array->gpr.sel() + v->array->array_size -1);
if (prev && !add_nop) {
for (node_iterator pI = prev->begin(), pE = prev->end(); pI != pE; ++pI) {
alu_node *pn = static_cast<alu_node*>(*pI);
if (pn->bc.dst_gpr == src.sel) {
add_nop = true;
break;
}
}
}
} else
src.rel = 0;
@@ -417,11 +424,23 @@ void bc_finalizer::finalize_alu_src(alu_group_node* g, alu_node* a) {
assert(!"unknown value kind");
break;
}
if (prev && !add_nop) {
for (node_iterator pI = prev->begin(), pE = prev->end(); pI != pE; ++pI) {
alu_node *pn = static_cast<alu_node*>(*pI);
if (pn->bc.dst_rel) {
if (pn->bc.dst_gpr == src.sel) {
add_nop = true;
break;
}
}
}
}
}
while (si < 3) {
a->bc.src[si++].sel = 0;
}
return add_nop;
}
void bc_finalizer::copy_fetch_src(fetch_node &dst, fetch_node &src, unsigned arg_start)

@@ -758,6 +758,8 @@ int bc_parser::prepare_loop(cf_node* c) {
c->insert_before(reg);
rep->move(c, end->next);
reg->src_loop = true;
loop_stack.push(reg);
return 0;
}

@@ -61,6 +61,8 @@ int sb_context::init(r600_isa *isa, sb_hw_chip chip, sb_hw_class cclass) {
uses_mova_gpr = is_r600() && chip != HW_CHIP_RV670;
r6xx_gpr_index_workaround = is_r600() && chip != HW_CHIP_RV670 && chip != HW_CHIP_RS780 && chip != HW_CHIP_RS880;
switch (chip) {
case HW_CHIP_RV610:
case HW_CHIP_RS780:

@@ -115,13 +115,13 @@ void if_conversion::convert_kill_instructions(region_node *r,
bool if_conversion::check_and_convert(region_node *r) {
depart_node *nd1 = static_cast<depart_node*>(r->first);
if (!nd1->is_depart())
if (!nd1->is_depart() || nd1->target != r)
return false;
if_node *nif = static_cast<if_node*>(nd1->first);
if (!nif->is_if())
return false;
depart_node *nd2 = static_cast<depart_node*>(nif->first);
if (!nd2->is_depart())
if (!nd2->is_depart() || nd2->target != r)
return false;
value* &em = nif->cond;

@@ -1089,7 +1089,8 @@ typedef std::vector<repeat_node*> repeat_vec;
class region_node : public container_node {
protected:
region_node(unsigned id) : container_node(NT_REGION, NST_LIST), region_id(id),
loop_phi(), phi(), vars_defined(), departs(), repeats() {}
loop_phi(), phi(), vars_defined(), departs(), repeats(), src_loop()
{}
public:
unsigned region_id;
@@ -1101,12 +1102,16 @@ public:
depart_vec departs;
repeat_vec repeats;
// true if region was created for loop in the parser, sometimes repeat_node
// may be optimized away so we need to remember this information
bool src_loop;
virtual bool accept(vpass &p, bool enter);
unsigned dep_count() { return departs.size(); }
unsigned rep_count() { return repeats.size() + 1; }
bool is_loop() { return !repeats.empty(); }
bool is_loop() { return src_loop || !repeats.empty(); }
container_node* get_entry_code_location() {
node *p = first;

@@ -695,8 +695,9 @@ public:
void run_on(container_node *c);
void finalize_alu_group(alu_group_node *g);
void finalize_alu_src(alu_group_node *g, alu_node *a);
void insert_rv6xx_load_ar_workaround(alu_group_node *b4);
void finalize_alu_group(alu_group_node *g, node *prev_node);
bool finalize_alu_src(alu_group_node *g, alu_node *a, alu_group_node *prev_node);
void emit_set_grad(fetch_node* f);
void finalize_fetch(fetch_node *f);

@@ -1527,6 +1527,9 @@ bool post_scheduler::check_copy(node *n) {
if (!s->is_prealloc()) {
recolor_local(s);
if (!s->chunk || s->chunk != d->chunk)
return false;
}
if (s->gpr == d->gpr) {

@@ -294,6 +294,7 @@ struct r600_so_target {
/* The buffer where BUFFER_FILLED_SIZE is stored. */
struct r600_resource *buf_filled_size;
unsigned buf_filled_size_offset;
bool buf_filled_size_valid;
unsigned stride_in_dw;
};

@@ -237,7 +237,7 @@ static void r600_emit_streamout_begin(struct r600_common_context *rctx, struct r
}
}
if (rctx->streamout.append_bitmask & (1 << i)) {
if (rctx->streamout.append_bitmask & (1 << i) && t[i]->buf_filled_size_valid) {
uint64_t va = t[i]->buf_filled_size->gpu_address +
t[i]->buf_filled_size_offset;
@@ -302,6 +302,8 @@ void r600_emit_streamout_end(struct r600_common_context *rctx)
* buffer bound. This ensures that the primitives-emitted query
* won't increment. */
r600_write_context_reg(cs, R_028AD0_VGT_STRMOUT_BUFFER_SIZE_0 + 16*i, 0);
t[i]->buf_filled_size_valid = true;
}
rctx->streamout.begin_emitted = false;

@@ -80,10 +80,6 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned type)
sprintf(Str, "%1d", llvm_type);
LLVMAddTargetDependentFunctionAttr(F, "ShaderType", Str);
if (type != TGSI_PROCESSOR_COMPUTE) {
LLVMAddTargetDependentFunctionAttr(F, "unsafe-fp-math", "true");
}
}
static void init_r600_target()

@@ -748,7 +748,7 @@ static void txp_fetch_args(
const struct tgsi_full_instruction * inst = emit_data->inst;
LLVMValueRef src_w;
unsigned chan;
LLVMValueRef coords[4];
LLVMValueRef coords[5];
emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4);
src_w = lp_build_emit_fetch(bld_base, emit_data->inst, 0, TGSI_CHAN_W);

@@ -228,14 +228,14 @@ static LLVMValueRef get_instance_index_for_fetch(
LLVMValueRef result = LLVMGetParam(radeon_bld->main_fn,
si_shader_ctx->param_instance_id);
result = LLVMBuildAdd(gallivm->builder, result, LLVMGetParam(
radeon_bld->main_fn, SI_PARAM_START_INSTANCE), "");
/* The division must be done before START_INSTANCE is added. */
if (divisor > 1)
result = LLVMBuildUDiv(gallivm->builder, result,
lp_build_const_int32(gallivm, divisor), "");
return result;
return LLVMBuildAdd(gallivm->builder, result, LLVMGetParam(
radeon_bld->main_fn, SI_PARAM_START_INSTANCE), "");
}
static void declare_input_vs(
@@ -590,8 +590,11 @@ static void declare_system_value(
break;
case TGSI_SEMANTIC_VERTEXID:
value = LLVMGetParam(radeon_bld->main_fn,
si_shader_ctx->param_vertex_id);
value = LLVMBuildAdd(gallivm->builder,
LLVMGetParam(radeon_bld->main_fn,
si_shader_ctx->param_vertex_id),
LLVMGetParam(radeon_bld->main_fn,
SI_PARAM_BASE_VERTEX), "");
break;
case TGSI_SEMANTIC_SAMPLEID:
@@ -1502,7 +1505,7 @@ static void tex_fetch_args(
const struct tgsi_full_instruction * inst = emit_data->inst;
unsigned opcode = inst->Instruction.Opcode;
unsigned target = inst->Texture.Texture;
LLVMValueRef coords[4];
LLVMValueRef coords[5];
LLVMValueRef address[16];
int ref_pos;
unsigned num_coords = tgsi_util_get_texture_coord_dim(target, &ref_pos);
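
The get_instance_index_for_fetch hunk above reorders two operations: the instance ID is now divided by the divisor first, and START_INSTANCE is added afterwards. A scalar sketch of the corrected order, written without LLVM so the arithmetic is easy to check by hand; fetch_instance_index is a hypothetical name and the no-wraparound assertion is an assumption of this sketch only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of the instance index used for vertex fetch. */
static uint32_t fetch_instance_index(uint32_t instance_id,
                                     uint32_t divisor,
                                     uint32_t start_instance)
{
    uint32_t idx = instance_id;

    /* The division must happen before the start offset is added. */
    if (divisor > 1)
        idx /= divisor;

    /* Assumption for this sketch: the sum does not wrap. */
    assert(idx <= UINT32_MAX - start_instance);
    return idx + start_instance;
}
```

With instance_id = 5, divisor = 2 and start_instance = 3 this gives 5 / 2 + 3 = 5, whereas the old order would have produced (5 + 3) / 2 = 4.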

@@ -697,12 +697,16 @@ static void si_delete_rs_state(struct pipe_context *ctx, void *state)
*/
static void si_update_dsa_stencil_ref(struct si_context *sctx)
{
struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
struct si_pm4_state *pm4;
struct pipe_stencil_ref *ref = &sctx->stencil_ref;
struct si_state_dsa *dsa = sctx->queued.named.dsa;
struct si_state_dsa *dsa = sctx->queued.named.dsa;
if (pm4 == NULL)
return;
if (!dsa)
return;
pm4 = CALLOC_STRUCT(si_pm4_state);
if (pm4 == NULL)
return;
si_pm4_set_reg(pm4, R_028430_DB_STENCILREFMASK,
S_028430_STENCILTESTVAL(ref->ref_value[0]) |
@@ -3081,6 +3085,110 @@ void si_init_state_functions(struct si_context *sctx)
sctx->b.b.draw_vbo = si_draw_vbo;
}
static void
si_write_harvested_raster_configs(struct si_context *sctx,
struct si_pm4_state *pm4,
unsigned raster_config)
{
unsigned sh_per_se = MAX2(sctx->screen->b.info.max_sh_per_se, 1);
unsigned num_se = MAX2(sctx->screen->b.info.max_se, 1);
unsigned rb_mask = sctx->screen->b.info.si_backend_enabled_mask;
unsigned num_rb = sctx->screen->b.info.r600_num_backends;
unsigned rb_per_pkr = num_rb / num_se / sh_per_se;
unsigned rb_per_se = num_rb / num_se;
unsigned se0_mask = (1 << rb_per_se) - 1;
unsigned se1_mask = se0_mask << rb_per_se;
unsigned se;
assert(num_se == 1 || num_se == 2);
assert(sh_per_se == 1 || sh_per_se == 2);
assert(rb_per_pkr == 1 || rb_per_pkr == 2);
/* XXX: I can't figure out what the *_XSEL and *_YSEL
* fields are for, so I'm leaving them as their default
* values. */
se0_mask &= rb_mask;
se1_mask &= rb_mask;
if (num_se == 2 && (!se0_mask || !se1_mask)) {
raster_config &= C_028350_SE_MAP;
if (!se0_mask) {
raster_config |=
S_028350_SE_MAP(V_028350_RASTER_CONFIG_SE_MAP_3);
} else {
raster_config |=
S_028350_SE_MAP(V_028350_RASTER_CONFIG_SE_MAP_0);
}
}
for (se = 0; se < num_se; se++) {
unsigned raster_config_se = raster_config;
unsigned pkr0_mask = ((1 << rb_per_pkr) - 1) << (se * rb_per_se);
unsigned pkr1_mask = pkr0_mask << rb_per_pkr;
pkr0_mask &= rb_mask;
pkr1_mask &= rb_mask;
if (sh_per_se == 2 && (!pkr0_mask || !pkr1_mask)) {
raster_config_se &= C_028350_PKR_MAP;
if (!pkr0_mask) {
raster_config_se |=
S_028350_PKR_MAP(V_028350_RASTER_CONFIG_PKR_MAP_3);
} else {
raster_config_se |=
S_028350_PKR_MAP(V_028350_RASTER_CONFIG_PKR_MAP_0);
}
}
if (rb_per_pkr == 2) {
unsigned rb0_mask = 1 << (se * rb_per_se);
unsigned rb1_mask = rb0_mask << 1;
rb0_mask &= rb_mask;
rb1_mask &= rb_mask;
if (!rb0_mask || !rb1_mask) {
raster_config_se &= C_028350_RB_MAP_PKR0;
if (!rb0_mask) {
raster_config_se |=
S_028350_RB_MAP_PKR0(V_028350_RASTER_CONFIG_RB_MAP_3);
} else {
raster_config_se |=
S_028350_RB_MAP_PKR0(V_028350_RASTER_CONFIG_RB_MAP_0);
}
}
if (sh_per_se == 2) {
rb0_mask = 1 << (se * rb_per_se + rb_per_pkr);
rb1_mask = rb0_mask << 1;
rb0_mask &= rb_mask;
rb1_mask &= rb_mask;
if (!rb0_mask || !rb1_mask) {
raster_config_se &= C_028350_RB_MAP_PKR1;
if (!rb0_mask) {
raster_config_se |=
S_028350_RB_MAP_PKR1(V_028350_RASTER_CONFIG_RB_MAP_3);
} else {
raster_config_se |=
S_028350_RB_MAP_PKR1(V_028350_RASTER_CONFIG_RB_MAP_0);
}
}
}
}
si_pm4_set_reg(pm4, GRBM_GFX_INDEX,
SE_INDEX(se) | SH_BROADCAST_WRITES |
INSTANCE_BROADCAST_WRITES);
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, raster_config_se);
}
si_pm4_set_reg(pm4, GRBM_GFX_INDEX,
SE_BROADCAST_WRITES | SH_BROADCAST_WRITES |
INSTANCE_BROADCAST_WRITES);
}
void si_init_config(struct si_context *sctx)
{
struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
@@ -3152,24 +3260,40 @@ void si_init_config(struct si_context *sctx)
break;
}
} else {
unsigned rb_mask = sctx->screen->b.info.si_backend_enabled_mask;
unsigned num_rb = sctx->screen->b.info.r600_num_backends;
unsigned raster_config;
switch (sctx->screen->b.family) {
case CHIP_TAHITI:
case CHIP_PITCAIRN:
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x2a00126a);
raster_config = 0x2a00126a;
break;
case CHIP_VERDE:
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x0000124a);
raster_config = 0x0000124a;
break;
case CHIP_OLAND:
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x00000082);
raster_config = 0x00000082;
break;
case CHIP_HAINAN:
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x00000000);
raster_config = 0x00000000;
break;
default:
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG, 0x00000000);
fprintf(stderr,
"radeonsi: Unknown GPU, using 0 for raster_config\n");
raster_config = 0x00000000;
break;
}
/* Always use the default config when all backends are enabled
* (or when we failed to determine the enabled backends).
*/
if (!rb_mask || util_bitcount(rb_mask) >= num_rb) {
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG,
raster_config);
} else {
si_write_harvested_raster_configs(sctx, pm4, raster_config);
}
}
si_pm4_set_reg(pm4, R_028204_PA_SC_WINDOW_SCISSOR_TL, S_028204_WINDOW_OFFSET_DISABLE(1));
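
The harvesting logic above starts by slicing the enabled render-backend mask into one sub-mask per shader engine, and si_init_config only takes that path when some backends are missing. A minimal sketch of those two decisions follows, assuming at most two shader engines as the asserts in the diff do; split_rb_mask, can_use_default_config and popcount32 (a stand-in for util_bitcount) are hypothetical names.

```c
#include <assert.h>

/* Per-shader-engine slices of the enabled render-backend mask. */
struct se_masks {
    unsigned se0;
    unsigned se1;
};

/* Portable population count, standing in for Mesa's util_bitcount. */
static unsigned popcount32(unsigned v)
{
    unsigned n = 0;
    for (; v; v &= v - 1)
        ++n;
    return n;
}

/* Split rb_mask into the backends belonging to SE0 and SE1. */
static struct se_masks split_rb_mask(unsigned num_rb, unsigned num_se,
                                     unsigned rb_mask)
{
    struct se_masks m;
    unsigned rb_per_se;

    assert(num_se == 1 || num_se == 2);
    rb_per_se = num_rb / num_se;
    assert(rb_per_se >= 1 && rb_per_se <= 8); /* keeps the shifts in range */

    m.se0 = ((1u << rb_per_se) - 1) & rb_mask;
    m.se1 = (((1u << rb_per_se) - 1) << rb_per_se) & rb_mask;
    return m;
}

/* Nonzero when the per-chip default raster config can be written as-is:
 * either the mask is unknown (0) or every backend is enabled. */
static int can_use_default_config(unsigned num_rb, unsigned rb_mask)
{
    return !rb_mask || popcount32(rb_mask) >= num_rb;
}
```

For an 8-backend part with rb_mask = 0xf0, SE0 ends up with no enabled backends, which is the case where the diff remaps SE_MAP so all work goes to the other engine.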

@@ -544,9 +544,11 @@ bcolor:
}
}
if (j == vsinfo->num_outputs) {
/* No corresponding output found, load defaults into input */
tmp |= S_028644_OFFSET(0x20);
if (j == vsinfo->num_outputs && !G_028644_PT_SPRITE_TEX(tmp)) {
/* No corresponding output found, load defaults into input.
* Don't set any other bits.
* (FLAT_SHADE=1 completely changes behavior) */
tmp = S_028644_OFFSET(0x20);
}
si_pm4_set_reg(pm4,

@@ -204,7 +204,13 @@
* 6. COMMAND [29:22] | BYTE_COUNT [20:0]
*/
#define GRBM_GFX_INDEX 0x802C
#define INSTANCE_INDEX(x) ((x) << 0)
#define SH_INDEX(x) ((x) << 8)
#define SE_INDEX(x) ((x) << 16)
#define SH_BROADCAST_WRITES (1 << 29)
#define INSTANCE_BROADCAST_WRITES (1 << 30)
#define SE_BROADCAST_WRITES (1 << 31)
#define R_0084FC_CP_STRMOUT_CNTL 0x0084FC
#define S_0084FC_OFFSET_UPDATE_DONE(x) (((x) & 0x1) << 0)
#define R_0085F0_CP_COHER_CNTL 0x0085F0
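
The GRBM_GFX_INDEX bit definitions added above are combined in two ways by the raster-config code: one value that targets a single shader engine, and one that restores broadcast to all of them. A small sketch of both values follows. The GRBM_-prefixed macro names and the helper functions are hypothetical, the bit positions are taken from the diff, and unsigned constants are used here so that the shift into bit 31 stays well defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout as shown in the diff; names prefixed to mark them as local. */
#define GRBM_SE_INDEX(x)               ((uint32_t)(x) << 16)
#define GRBM_SH_BROADCAST_WRITES       (UINT32_C(1) << 29)
#define GRBM_INSTANCE_BROADCAST_WRITES (UINT32_C(1) << 30)
#define GRBM_SE_BROADCAST_WRITES       (UINT32_C(1) << 31)

/* Route subsequent register writes to one shader engine only. */
static uint32_t grbm_select_se(unsigned se)
{
    assert(se < 2); /* the diff asserts at most two shader engines */
    return GRBM_SE_INDEX(se) |
           GRBM_SH_BROADCAST_WRITES |
           GRBM_INSTANCE_BROADCAST_WRITES;
}

/* Restore broadcast to every shader engine, array and instance. */
static uint32_t grbm_select_all(void)
{
    return GRBM_SE_BROADCAST_WRITES |
           GRBM_SH_BROADCAST_WRITES |
           GRBM_INSTANCE_BROADCAST_WRITES;
}
```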

@@ -302,6 +302,9 @@ NineAdapter9_CheckDeviceFormat( struct NineAdapter9 *This,
return D3DERR_NOTAVAILABLE;
}
/* we support ATI1 and ATI2 hack only for 2D textures */
if (RType != D3DRTYPE_TEXTURE && (CheckFormat == D3DFMT_ATI1 || CheckFormat == D3DFMT_ATI2))
return D3DERR_NOTAVAILABLE;
/* if (Usage & D3DUSAGE_NONSECURE) { don't know the implications of this } */
/* if (Usage & D3DUSAGE_SOFTWAREPROCESSING) { we can always support this } */
@@ -549,7 +552,7 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
D3DPMISCCAPS_CULLCCW |
D3DPMISCCAPS_COLORWRITEENABLE |
D3DPMISCCAPS_CLIPPLANESCALEDPOINTS |
D3DPMISCCAPS_CLIPTLVERTS |
/*D3DPMISCCAPS_CLIPTLVERTS |*/
D3DPMISCCAPS_TSSARGTEMP |
D3DPMISCCAPS_BLENDOP |
D3DPIPECAP(INDEP_BLEND_ENABLE, D3DPMISCCAPS_INDEPENDENTWRITEMASKS) |
@@ -560,6 +563,8 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
D3DPIPECAP(MIXED_COLORBUFFER_FORMATS, D3DPMISCCAPS_MRTINDEPENDENTBITDEPTHS) |
D3DPMISCCAPS_MRTPOSTPIXELSHADERBLENDING |
/*D3DPMISCCAPS_FOGVERTEXCLAMPED*/0;
if (!screen->get_param(screen, PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION))
pCaps->PrimitiveMiscCaps |= D3DPMISCCAPS_CLIPTLVERTS;
pCaps->RasterCaps =
D3DPIPECAP(ANISOTROPIC_FILTER, D3DPRASTERCAPS_ANISOTROPY) |

@@ -436,14 +436,21 @@ NineBaseTexture9_CreatePipeResource( struct NineBaseTexture9 *This,
return D3D_OK;
}
#define SWIZZLE_TO_REPLACE(s) (s == UTIL_FORMAT_SWIZZLE_0 || \
s == UTIL_FORMAT_SWIZZLE_1 || \
s == UTIL_FORMAT_SWIZZLE_NONE)
HRESULT
NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 *This,
const int sRGB )
{
const struct util_format_description *desc;
struct pipe_context *pipe = This->pipe;
struct pipe_screen *screen = pipe->screen;
struct pipe_resource *resource = This->base.resource;
struct pipe_sampler_view templ;
enum pipe_format srgb_format;
unsigned i;
uint8_t swizzle[4];
DBG("This=%p sRGB=%d\n", This, sRGB);
@@ -452,6 +459,9 @@ NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 *This,
if (unlikely(This->format == D3DFMT_NULL))
return D3D_OK;
NineBaseTexture9_Dump(This);
/* hack due to incorrect POOL_MANAGED handling */
NineBaseTexture9_GenerateMipSubLevels(This);
resource = This->base.resource;
}
assert(resource);
@@ -463,25 +473,49 @@ NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 *This,
swizzle[3] = PIPE_SWIZZLE_ALPHA;
desc = util_format_description(resource->format);
if (desc->colorspace == UTIL_FORMAT_COLORSPACE_ZS) {
/* ZZZ1 -> 0Z01 (see end of docs/source/tgsi.rst)
* XXX: but it's wrong
swizzle[0] = PIPE_SWIZZLE_ZERO;
swizzle[2] = PIPE_SWIZZLE_ZERO; */
} else
if (desc->swizzle[0] == UTIL_FORMAT_SWIZZLE_X &&
desc->swizzle[3] == UTIL_FORMAT_SWIZZLE_1) {
/* R001/RG01 -> R111/RG11 */
if (desc->swizzle[1] == UTIL_FORMAT_SWIZZLE_0)
swizzle[1] = PIPE_SWIZZLE_ONE;
if (desc->swizzle[2] == UTIL_FORMAT_SWIZZLE_0)
swizzle[2] = PIPE_SWIZZLE_ONE;
/* The MSDN documentation is incomplete and partly wrong here.
* The only formats that can be read directly are
* DF16, DF24 and INTZ.
* Tested on Windows, the swizzle is:
* R = depth, G = B = 0, A = 1 for DF16 and DF24
* R = G = B = A = depth for INTZ
* For the other ZS formats, which cannot be read directly
* but can be used as shadow maps, the result is duplicated on
* all channels */
if (This->format == D3DFMT_DF16 ||
This->format == D3DFMT_DF24) {
swizzle[1] = PIPE_SWIZZLE_ZERO;
swizzle[2] = PIPE_SWIZZLE_ZERO;
swizzle[3] = PIPE_SWIZZLE_ONE;
} else {
swizzle[1] = PIPE_SWIZZLE_RED;
swizzle[2] = PIPE_SWIZZLE_RED;
swizzle[3] = PIPE_SWIZZLE_RED;
}
} else if (resource->format != PIPE_FORMAT_A8_UNORM &&
resource->format != PIPE_FORMAT_RGTC1_UNORM) {
/* exceptions:
* A8 should have 0.0 as default values for RGB.
* ATI1/RGTC1 should be r 0 0 1 (tested on windows).
* It is already what gallium does. All the other ones
* should have 1.0 for non-defined values */
for (i = 0; i < 4; i++) {
if (SWIZZLE_TO_REPLACE(desc->swizzle[i]))
swizzle[i] = PIPE_SWIZZLE_ONE;
}
}
/* but 000A remains unchanged */
templ.format = sRGB ? util_format_srgb(resource->format) : resource->format;
/* if requested and supported, convert to the sRGB format */
srgb_format = util_format_srgb(resource->format);
if (sRGB && srgb_format != PIPE_FORMAT_NONE &&
screen->is_format_supported(screen, srgb_format,
resource->target, 0, resource->bind))
templ.format = srgb_format;
else
templ.format = resource->format;
templ.u.tex.first_layer = 0;
templ.u.tex.last_layer = (resource->target == PIPE_TEXTURE_CUBE) ?
5 : (This->base.info.depth0 - 1);
templ.u.tex.last_layer = resource->target == PIPE_TEXTURE_3D ?
resource->depth0 - 1 : resource->array_size - 1;
templ.u.tex.first_level = 0;
templ.u.tex.last_level = resource->last_level;
templ.swizzle_r = swizzle[0];
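The generic branch of this hunk can be read as: for formats other than A8 and RGTC1, any channel the format description marks as constant 0, constant 1, or undefined is sampled as 1.0. The sketch below restates that rule in isolation. The `SWZ_*` enum and both function names are hypothetical stand-ins for the gallium `UTIL_FORMAT_SWIZZLE_*` and `PIPE_SWIZZLE_*` constants, not Mesa identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the gallium swizzle constants. */
enum swz { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_0, SWZ_1, SWZ_NONE };

/* Same test as SWIZZLE_TO_REPLACE: the channel is constant or undefined. */
static int swizzle_to_replace(enum swz s)
{
    return s == SWZ_0 || s == SWZ_1 || s == SWZ_NONE;
}

/* Start from the identity view swizzle and force every
 * constant/undefined channel of the format to read as 1. */
static void fill_undefined_with_one(const enum swz desc[4], enum swz out[4])
{
    static const enum swz identity[4] = { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W };
    size_t i;

    assert(desc != NULL && out != NULL);
    for (i = 0; i < 4; i++)
        out[i] = swizzle_to_replace(desc[i]) ? SWZ_1 : identity[i];
}
```

For a format described as R001, this yields R111, which matches the "should have 1.0 for non-defined values" comment in the hunk.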


@@ -38,6 +38,8 @@ NineCubeTexture9_ctor( struct NineCubeTexture9 *This,
HANDLE *pSharedHandle )
{
struct pipe_resource *info = &This->base.base.info;
struct pipe_screen *screen = pParams->device->screen;
enum pipe_format pf;
unsigned i;
D3DSURFACE_DESC sfdesc;
HRESULT hr;
@@ -55,9 +57,19 @@ NineCubeTexture9_ctor( struct NineCubeTexture9 *This,
if (Usage & D3DUSAGE_AUTOGENMIPMAP)
Levels = 0;
pf = d3d9_to_pipe_format(Format);
if (pf == PIPE_FORMAT_NONE ||
!screen->is_format_supported(screen, pf, PIPE_TEXTURE_CUBE, 0, PIPE_BIND_SAMPLER_VIEW)) {
return D3DERR_INVALIDCALL;
}
/* We support ATI1 and ATI2 hacks only for 2D textures */
if (Format == D3DFMT_ATI1 || Format == D3DFMT_ATI2)
return D3DERR_INVALIDCALL;
info->screen = pParams->device->screen;
info->target = PIPE_TEXTURE_CUBE;
info->format = d3d9_to_pipe_format(Format);
info->format = pf;
info->width0 = EdgeLength;
info->height0 = EdgeLength;
info->depth0 = 1;
@@ -146,7 +158,7 @@ NineCubeTexture9_GetLevelDesc( struct NineCubeTexture9 *This,
user_assert(Level == 0 || !(This->base.base.usage & D3DUSAGE_AUTOGENMIPMAP),
D3DERR_INVALIDCALL);
*pDesc = This->surfaces[Level]->desc;
*pDesc = This->surfaces[Level * 6]->desc;
return D3D_OK;
}
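The `GetLevelDesc` fix indexes `surfaces[Level * 6]`, which suggests the cube texture keeps its surfaces in one flat array with six faces per mip level. A minimal sketch of that indexing, assuming a level-major layout (the layout is inferred from the fix, and `cube_surface_index` is a hypothetical helper):

```c
#include <assert.h>

#define CUBE_FACES 6

/* Index of (level, face) in a flat array that stores the six faces
 * of level 0 first, then the six faces of level 1, and so on. */
static unsigned cube_surface_index(unsigned level, unsigned face)
{
    assert(face < CUBE_FACES);
    return level * CUBE_FACES + face;
}
```

Under this layout the old code's `surfaces[Level]` would return a face of level 0 for any `Level` below 6, which is why the description came back with the wrong dimensions.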


@@ -62,7 +62,7 @@ NineDevice9_SetDefaultState( struct NineDevice9 *This, boolean is_reset )
assert(!This->is_recording);
nine_state_set_defaults(&This->state, &This->caps, is_reset);
nine_state_set_defaults(This, &This->caps, is_reset);
This->state.viewport.X = 0;
This->state.viewport.Y = 0;
@@ -109,7 +109,7 @@ NineDevice9_RestoreNonCSOState( struct NineDevice9 *This, unsigned mask )
cb.buffer = This->constbuf_vs;
cb.user_buffer = NULL;
}
cb.buffer_size = This->constbuf_vs->width0;
cb.buffer_size = This->vs_const_size;
pipe->set_constant_buffer(pipe, PIPE_SHADER_VERTEX, 0, &cb);
if (This->prefer_user_constbuf) {
@@ -117,7 +117,7 @@ NineDevice9_RestoreNonCSOState( struct NineDevice9 *This, unsigned mask )
} else {
cb.buffer = This->constbuf_ps;
}
cb.buffer_size = This->constbuf_ps->width0;
cb.buffer_size = This->ps_const_size;
pipe->set_constant_buffer(pipe, PIPE_SHADER_FRAGMENT, 0, &cb);
}
@@ -262,10 +262,14 @@ NineDevice9_ctor( struct NineDevice9 *This,
This->max_ps_const_f = max_const_ps -
(NINE_MAX_CONST_I + NINE_MAX_CONST_B / 4);
This->vs_const_size = max_const_vs * sizeof(float[4]);
This->ps_const_size = max_const_ps * sizeof(float[4]);
/* Include space for I,B constants for user constbuf. */
This->state.vs_const_f = CALLOC(NINE_MAX_CONST_ALL, sizeof(float[4]));
This->state.ps_const_f = CALLOC(NINE_MAX_CONST_ALL, sizeof(float[4]));
if (!This->state.vs_const_f || !This->state.ps_const_f)
This->state.vs_const_f = CALLOC(This->vs_const_size, 1);
This->state.ps_const_f = CALLOC(This->ps_const_size, 1);
This->state.vs_lconstf_temp = CALLOC(This->vs_const_size,1);
if (!This->state.vs_const_f || !This->state.ps_const_f ||
!This->state.vs_lconstf_temp)
return E_OUTOFMEMORY;
if (strstr(pScreen->get_name(pScreen), "AMD") ||
@@ -283,23 +287,16 @@ NineDevice9_ctor( struct NineDevice9 *This,
tmpl.bind = PIPE_BIND_CONSTANT_BUFFER;
tmpl.flags = 0;
tmpl.width0 = max_const_vs * sizeof(float[4]);
tmpl.width0 = This->vs_const_size;
This->constbuf_vs = pScreen->resource_create(pScreen, &tmpl);
tmpl.width0 = max_const_ps * sizeof(float[4]);
tmpl.width0 = This->ps_const_size;
This->constbuf_ps = pScreen->resource_create(pScreen, &tmpl);
if (!This->constbuf_vs || !This->constbuf_ps)
return E_OUTOFMEMORY;
}
This->vs_bool_true = pScreen->get_shader_param(pScreen,
PIPE_SHADER_VERTEX,
PIPE_SHADER_CAP_INTEGERS) ? 0xFFFFFFFF : fui(1.0f);
This->ps_bool_true = pScreen->get_shader_param(pScreen,
PIPE_SHADER_FRAGMENT,
PIPE_SHADER_CAP_INTEGERS) ? 0xFFFFFFFF : fui(1.0f);
/* Allocate upload helper for drivers that suck (from st pov ;). */
{
unsigned bind = 0;
@@ -314,6 +311,8 @@ NineDevice9_ctor( struct NineDevice9 *This,
}
This->driver_caps.window_space_position_support = GET_PCAP(TGSI_VS_WINDOW_SPACE_POSITION);
This->driver_caps.vs_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_VERTEX, PIPE_SHADER_CAP_INTEGERS);
This->driver_caps.ps_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_FRAGMENT, PIPE_SHADER_CAP_INTEGERS);
nine_ff_init(This); /* initialize fixed function code */
@@ -350,6 +349,7 @@ NineDevice9_dtor( struct NineDevice9 *This )
pipe_resource_reference(&This->constbuf_ps, NULL);
FREE(This->state.vs_const_f);
FREE(This->state.ps_const_f);
FREE(This->state.vs_lconstf_temp);
if (This->swapchains) {
for (i = 0; i < This->nswapchains; ++i)
@@ -2938,6 +2938,7 @@ NineDevice9_SetVertexShaderConstantI( struct NineDevice9 *This,
UINT Vector4iCount )
{
struct nine_state *state = This->update;
int i;
DBG("This=%p StartRegister=%u pConstantData=%p Vector4iCount=%u\n",
This, StartRegister, pConstantData, Vector4iCount);
@@ -2946,9 +2947,18 @@ NineDevice9_SetVertexShaderConstantI( struct NineDevice9 *This,
user_assert(StartRegister + Vector4iCount <= NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(&state->vs_const_i[StartRegister][0],
pConstantData,
Vector4iCount * sizeof(state->vs_const_i[0]));
if (This->driver_caps.vs_integer) {
memcpy(&state->vs_const_i[StartRegister][0],
pConstantData,
Vector4iCount * sizeof(state->vs_const_i[0]));
} else {
for (i = 0; i < Vector4iCount; i++) {
state->vs_const_i[StartRegister+i][0] = fui((float)(pConstantData[4*i]));
state->vs_const_i[StartRegister+i][1] = fui((float)(pConstantData[4*i+1]));
state->vs_const_i[StartRegister+i][2] = fui((float)(pConstantData[4*i+2]));
state->vs_const_i[StartRegister+i][3] = fui((float)(pConstantData[4*i+3]));
}
}
state->changed.vs_const_i |= ((1 << Vector4iCount) - 1) << StartRegister;
state->changed.group |= NINE_STATE_VS_CONST;
@@ -2963,14 +2973,24 @@ NineDevice9_GetVertexShaderConstantI( struct NineDevice9 *This,
UINT Vector4iCount )
{
const struct nine_state *state = &This->state;
int i;
user_assert(StartRegister < NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(StartRegister + Vector4iCount <= NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(pConstantData,
&state->vs_const_i[StartRegister][0],
Vector4iCount * sizeof(state->vs_const_i[0]));
if (This->driver_caps.vs_integer) {
memcpy(pConstantData,
&state->vs_const_i[StartRegister][0],
Vector4iCount * sizeof(state->vs_const_i[0]));
} else {
for (i = 0; i < Vector4iCount; i++) {
pConstantData[4*i] = (int32_t) uif(state->vs_const_i[StartRegister+i][0]);
pConstantData[4*i+1] = (int32_t) uif(state->vs_const_i[StartRegister+i][1]);
pConstantData[4*i+2] = (int32_t) uif(state->vs_const_i[StartRegister+i][2]);
pConstantData[4*i+3] = (int32_t) uif(state->vs_const_i[StartRegister+i][3]);
}
}
return D3D_OK;
}
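When the driver reports no integer support, the setter above stores each integer constant as the bit pattern of its float value, and the getter converts it back. A minimal sketch of that round trip follows. `float_to_bits` and `bits_to_float` are hypothetical stand-ins for Mesa's `fui` and `uif` helpers, written here with `memcpy`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern (stand-in for fui). */
static uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Reinterpret a 32-bit pattern as a float (stand-in for uif). */
static float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Setter side: convert the integer to float, keep the raw bits. */
static uint32_t store_int_as_float(int32_t v)
{
    return float_to_bits((float)v);
}

/* Getter side: read the bits as a float, convert back to integer. */
static int32_t load_int_from_float(uint32_t bits)
{
    return (int32_t)bits_to_float(bits);
}
```

The round trip is exact only for integers a 32-bit float can represent, which covers every value of magnitude up to 2^24. Larger values may come back rounded.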
@@ -2982,6 +3002,8 @@ NineDevice9_SetVertexShaderConstantB( struct NineDevice9 *This,
UINT BoolCount )
{
struct nine_state *state = This->update;
int i;
uint32_t bool_true = This->driver_caps.vs_integer ? 0xFFFFFFFF : fui(1.0f);
DBG("This=%p StartRegister=%u pConstantData=%p BoolCount=%u\n",
This, StartRegister, pConstantData, BoolCount);
@@ -2990,9 +3012,8 @@ NineDevice9_SetVertexShaderConstantB( struct NineDevice9 *This,
user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(&state->vs_const_b[StartRegister],
pConstantData,
BoolCount * sizeof(state->vs_const_b[0]));
for (i = 0; i < BoolCount; i++)
state->vs_const_b[StartRegister + i] = pConstantData[i] ? bool_true : 0;
state->changed.vs_const_b |= ((1 << BoolCount) - 1) << StartRegister;
state->changed.group |= NINE_STATE_VS_CONST;
@@ -3007,14 +3028,14 @@ NineDevice9_GetVertexShaderConstantB( struct NineDevice9 *This,
UINT BoolCount )
{
const struct nine_state *state = &This->state;
int i;
user_assert(StartRegister < NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(pConstantData,
&state->vs_const_b[StartRegister],
BoolCount * sizeof(state->vs_const_b[0]));
for (i = 0; i < BoolCount; i++)
pConstantData[i] = state->vs_const_b[StartRegister + i] != 0 ? TRUE : FALSE;
return D3D_OK;
}
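The boolean setter and getter above no longer copy the application's values verbatim. Any non-zero input is normalised to one "true" encoding chosen from the driver's integer support, and the getter maps any non-zero stored word back to `TRUE`. A minimal sketch, with hypothetical function names and the bit pattern of `1.0f` written out as a literal:

```c
#include <assert.h>
#include <stdint.h>

/* Encode a D3D BOOL for the shader: all ones when the driver has
 * native integers, otherwise the bit pattern of 1.0f. */
static uint32_t encode_bool(int value, int has_native_integers)
{
    const uint32_t bool_true =
        has_native_integers ? 0xFFFFFFFFu : 0x3F800000u; /* bits of 1.0f */
    return value ? bool_true : 0;
}

/* Decode for the getter: any non-zero stored word means TRUE (1). */
static int decode_bool(uint32_t stored)
{
    return stored != 0;
}
```

Normalising in the setter means an application that passes, say, 2 as a boolean still gets a value the shader treats as true, and reads back 1.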
@@ -3243,6 +3264,7 @@ NineDevice9_SetPixelShaderConstantI( struct NineDevice9 *This,
UINT Vector4iCount )
{
struct nine_state *state = This->update;
int i;
DBG("This=%p StartRegister=%u pConstantData=%p Vector4iCount=%u\n",
This, StartRegister, pConstantData, Vector4iCount);
@@ -3251,10 +3273,18 @@ NineDevice9_SetPixelShaderConstantI( struct NineDevice9 *This,
user_assert(StartRegister + Vector4iCount <= NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(&state->ps_const_i[StartRegister][0],
pConstantData,
Vector4iCount * sizeof(state->ps_const_i[0]));
if (This->driver_caps.ps_integer) {
memcpy(&state->ps_const_i[StartRegister][0],
pConstantData,
Vector4iCount * sizeof(state->ps_const_i[0]));
} else {
for (i = 0; i < Vector4iCount; i++) {
state->ps_const_i[StartRegister+i][0] = fui((float)(pConstantData[4*i]));
state->ps_const_i[StartRegister+i][1] = fui((float)(pConstantData[4*i+1]));
state->ps_const_i[StartRegister+i][2] = fui((float)(pConstantData[4*i+2]));
state->ps_const_i[StartRegister+i][3] = fui((float)(pConstantData[4*i+3]));
}
}
state->changed.ps_const_i |= ((1 << Vector4iCount) - 1) << StartRegister;
state->changed.group |= NINE_STATE_PS_CONST;
@@ -3268,14 +3298,24 @@ NineDevice9_GetPixelShaderConstantI( struct NineDevice9 *This,
UINT Vector4iCount )
{
const struct nine_state *state = &This->state;
int i;
user_assert(StartRegister < NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(StartRegister + Vector4iCount <= NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(pConstantData,
&state->ps_const_i[StartRegister][0],
Vector4iCount * sizeof(state->ps_const_i[0]));
if (This->driver_caps.ps_integer) {
memcpy(pConstantData,
&state->ps_const_i[StartRegister][0],
Vector4iCount * sizeof(state->ps_const_i[0]));
} else {
for (i = 0; i < Vector4iCount; i++) {
pConstantData[4*i] = (int32_t) uif(state->ps_const_i[StartRegister+i][0]);
pConstantData[4*i+1] = (int32_t) uif(state->ps_const_i[StartRegister+i][1]);
pConstantData[4*i+2] = (int32_t) uif(state->ps_const_i[StartRegister+i][2]);
pConstantData[4*i+3] = (int32_t) uif(state->ps_const_i[StartRegister+i][3]);
}
}
return D3D_OK;
}
@@ -3287,6 +3327,8 @@ NineDevice9_SetPixelShaderConstantB( struct NineDevice9 *This,
UINT BoolCount )
{
struct nine_state *state = This->update;
int i;
uint32_t bool_true = This->driver_caps.ps_integer ? 0xFFFFFFFF : fui(1.0f);
DBG("This=%p StartRegister=%u pConstantData=%p BoolCount=%u\n",
This, StartRegister, pConstantData, BoolCount);
@@ -3295,9 +3337,8 @@ NineDevice9_SetPixelShaderConstantB( struct NineDevice9 *This,
user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(&state->ps_const_b[StartRegister],
pConstantData,
BoolCount * sizeof(state->ps_const_b[0]));
for (i = 0; i < BoolCount; i++)
state->ps_const_b[StartRegister + i] = pConstantData[i] ? bool_true : 0;
state->changed.ps_const_b |= ((1 << BoolCount) - 1) << StartRegister;
state->changed.group |= NINE_STATE_PS_CONST;
@@ -3312,14 +3353,14 @@ NineDevice9_GetPixelShaderConstantB( struct NineDevice9 *This,
UINT BoolCount )
{
const struct nine_state *state = &This->state;
int i;
user_assert(StartRegister < NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(pConstantData,
&state->ps_const_b[StartRegister],
BoolCount * sizeof(state->ps_const_b[0]));
for (i = 0; i < BoolCount; i++)
pConstantData[i] = state->ps_const_b[StartRegister + i] ? TRUE : FALSE;
return D3D_OK;
}
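Each setter in this file marks the touched registers dirty with `((1 << Count) - 1) << StartRegister`. A minimal sketch of that mask computation, with a hypothetical helper name and explicit bounds checks that the hunks themselves enforce through `user_assert`:

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask with `count` consecutive bits set, starting at bit `start`. */
static uint32_t dirty_range_mask(unsigned start, unsigned count)
{
    /* Keep both shifts below the width of the type. */
    assert(start < 32 && count < 32 && start + count <= 32);
    return ((UINT32_C(1) << count) - 1) << start;
}
```

For example, setting three registers starting at register 2 marks bits 2, 3 and 4.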


@@ -77,10 +77,10 @@ struct NineDevice9
struct pipe_resource *constbuf_vs;
struct pipe_resource *constbuf_ps;
uint16_t vs_const_size;
uint16_t ps_const_size;
uint16_t max_vs_const_f;
uint16_t max_ps_const_f;
uint32_t vs_bool_true;
uint32_t ps_bool_true;
struct gen_mipmap_state *gen_mipmap;
@@ -111,6 +111,8 @@ struct NineDevice9
boolean user_vbufs;
boolean user_ibufs;
boolean window_space_position_support;
boolean vs_integer;
boolean ps_integer;
} driver_caps;
struct u_upload_mgr *upload;


@@ -1151,10 +1151,10 @@ ps_do_ts_op(struct ps_build_ctx *ps, unsigned top, struct ureg_dst dst, struct u
ureg_MUL(ureg, ureg_saturate(dst), ureg_src(tmp), ureg_imm4f(ureg,4.0,4.0,4.0,4.0));
break;
case D3DTOP_MULTIPLYADD:
ureg_MAD(ureg, dst, arg[2], arg[0], arg[1]);
ureg_MAD(ureg, dst, arg[1], arg[2], arg[0]);
break;
case D3DTOP_LERP:
ureg_LRP(ureg, dst, arg[1], arg[2], arg[0]);
ureg_LRP(ureg, dst, arg[0], arg[1], arg[2]);
break;
case D3DTOP_DISABLE:
/* no-op ? */
@@ -1278,6 +1278,8 @@ nine_ff_build_ps(struct NineDevice9 *device, struct nine_ff_ps_key *key)
(key->ts[0].resultarg != 0 /* not current */ ||
key->ts[0].colorop == D3DTOP_DISABLE ||
key->ts[0].alphaop == D3DTOP_DISABLE ||
key->ts[0].colorop == D3DTOP_BLENDCURRENTALPHA ||
key->ts[0].alphaop == D3DTOP_BLENDCURRENTALPHA ||
key->ts[0].colorarg0 == D3DTA_CURRENT ||
key->ts[0].colorarg1 == D3DTA_CURRENT ||
key->ts[0].colorarg2 == D3DTA_CURRENT ||
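The `D3DTOP_MULTIPLYADD` and `D3DTOP_LERP` changes only reorder operands, so their effect is easiest to see against the TGSI opcode semantics: `MAD` computes `src0 * src1 + src2`, and `LRP` computes `src0 * src1 + (1 - src0) * src2`. The scalar model below is a sketch with hypothetical function names. It evaluates the new operand order for a three-element `arg` array indexed as in `ps_do_ts_op`.

```c
#include <assert.h>

/* Scalar models of the TGSI opcodes. */
static float tgsi_mad(float a, float b, float c) { return a * b + c; }
static float tgsi_lrp(float t, float a, float b) { return t * a + (1.0f - t) * b; }

/* New operand order for D3DTOP_MULTIPLYADD: arg[1] * arg[2] + arg[0]. */
static float multiplyadd_new(const float arg[3])
{
    assert(arg != 0);
    return tgsi_mad(arg[1], arg[2], arg[0]);
}

/* New operand order for D3DTOP_LERP: arg[0] is the blend factor. */
static float lerp_new(const float arg[3])
{
    assert(arg != 0);
    return tgsi_lrp(arg[0], arg[1], arg[2]);
}
```

With `arg = {0.5, 2, 4}`, the multiply-add gives `2 * 4 + 0.5` and the lerp gives the midpoint of 2 and 4.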


@@ -185,6 +185,8 @@ d3d9_to_pipe_format(D3DFORMAT format)
case D3DFMT_DXT3: return PIPE_FORMAT_DXT3_RGBA;
case D3DFMT_DXT4: return PIPE_FORMAT_DXT5_RGBA; /* XXX */
case D3DFMT_DXT5: return PIPE_FORMAT_DXT5_RGBA;
case D3DFMT_ATI1: return PIPE_FORMAT_RGTC1_UNORM;
case D3DFMT_ATI2: return PIPE_FORMAT_RGTC2_UNORM;
case D3DFMT_UYVY: return PIPE_FORMAT_UYVY;
case D3DFMT_YUY2: return PIPE_FORMAT_YUYV; /* XXX check */
case D3DFMT_NV12: return PIPE_FORMAT_NV12;
@@ -249,6 +251,8 @@ d3dformat_to_string(D3DFORMAT fmt)
case D3DFMT_DXT3: return "D3DFMT_DXT3";
case D3DFMT_DXT4: return "D3DFMT_DXT4";
case D3DFMT_DXT5: return "D3DFMT_DXT5";
case D3DFMT_ATI1: return "D3DFMT_ATI1";
case D3DFMT_ATI2: return "D3DFMT_ATI2";
case D3DFMT_D16_LOCKABLE: return "D3DFMT_D16_LOCKABLE";
case D3DFMT_D32: return "D3DFMT_D32";
case D3DFMT_D15S1: return "D3DFMT_D15S1";
@@ -279,6 +283,7 @@ d3dformat_to_string(D3DFORMAT fmt)
case D3DFMT_DF16: return "D3DFMT_DF16";
case D3DFMT_DF24: return "D3DFMT_DF24";
case D3DFMT_INTZ: return "D3DFMT_INTZ";
case D3DFMT_NVDB: return "D3DFMT_NVDB";
case D3DFMT_NULL: return "D3DFMT_NULL";
default:
break;


@@ -35,11 +35,6 @@
#define DBG_CHANNEL DBG_SHADER
#if 1
#define NINE_TGSI_LAZY_DEVS /* don't use TGSI_OPCODE_BREAKC */
#endif
#define NINE_TGSI_LAZY_R600 /* don't use TGSI_OPCODE_DP2A */
#define DUMP(args...) _nine_debug_printf(DBG_CHANNEL, NULL, args)
@@ -471,14 +466,14 @@ struct shader_translator
struct ureg_src vFace;
struct ureg_src s;
struct ureg_dst p;
struct ureg_dst a;
struct ureg_dst address;
struct ureg_dst a0;
struct ureg_dst tS[8]; /* texture stage registers */
struct ureg_dst tdst; /* scratch dst if we need extra modifiers */
struct ureg_dst t[5]; /* scratch TEMPs */
struct ureg_src vC[2]; /* PS color in */
struct ureg_src vT[8]; /* PS texcoord in */
struct ureg_dst rL[NINE_MAX_LOOP_DEPTH]; /* loop ctr */
struct ureg_dst aL[NINE_MAX_LOOP_DEPTH]; /* loop ctr ADDR register */
} regs;
unsigned num_temp; /* Elements(regs.r) */
unsigned num_scratch;
@@ -487,6 +482,7 @@ struct shader_translator
unsigned cond_depth;
unsigned loop_labels[NINE_MAX_LOOP_DEPTH];
unsigned cond_labels[NINE_MAX_COND_DEPTH];
boolean loop_or_rep[NINE_MAX_LOOP_DEPTH]; /* true: loop, false: rep */
unsigned *inst_labels; /* LABEL op */
unsigned num_inst_labels;
@@ -664,8 +660,10 @@ static INLINE void
tx_addr_alloc(struct shader_translator *tx, INT idx)
{
assert(idx == 0);
if (ureg_dst_is_undef(tx->regs.a))
tx->regs.a = ureg_DECL_address(tx->ureg);
if (ureg_dst_is_undef(tx->regs.address))
tx->regs.address = ureg_DECL_address(tx->ureg);
if (ureg_dst_is_undef(tx->regs.a0))
tx->regs.a0 = ureg_DECL_temporary(tx->ureg);
}
static INLINE void
@@ -707,7 +705,7 @@ tx_endloop(struct shader_translator *tx)
}
static struct ureg_dst
tx_get_loopctr(struct shader_translator *tx)
tx_get_loopctr(struct shader_translator *tx, boolean loop_or_rep)
{
const unsigned l = tx->loop_depth - 1;
@@ -717,26 +715,32 @@ tx_get_loopctr(struct shader_translator *tx)
return ureg_dst_undef();
}
if (ureg_dst_is_undef(tx->regs.aL[l]))
{
struct ureg_dst rreg = ureg_DECL_local_temporary(tx->ureg);
struct ureg_dst areg = ureg_DECL_address(tx->ureg);
unsigned c;
assert(l % 4 == 0);
for (c = l; c < (l + 4) && c < Elements(tx->regs.aL); ++c) {
tx->regs.rL[c] = ureg_writemask(rreg, 1 << (c & 3));
tx->regs.aL[c] = ureg_writemask(areg, 1 << (c & 3));
}
if (ureg_dst_is_undef(tx->regs.rL[l])) {
/* loop or rep ctr creation */
tx->regs.rL[l] = ureg_DECL_local_temporary(tx->ureg);
tx->loop_or_rep[l] = loop_or_rep;
}
/* loop - rep - endloop - endrep not allowed */
assert(tx->loop_or_rep[l] == loop_or_rep);
return tx->regs.rL[l];
}
static struct ureg_dst
tx_get_aL(struct shader_translator *tx)
static struct ureg_src
tx_get_loopal(struct shader_translator *tx)
{
if (!ureg_dst_is_undef(tx_get_loopctr(tx)))
return tx->regs.aL[tx->loop_depth - 1];
return ureg_dst_undef();
int loop_level = tx->loop_depth - 1;
while (loop_level >= 0) {
/* handle loop - rep - endrep - endloop case */
if (tx->loop_or_rep[loop_level])
/* the value is in the loop counter y component (nine implementation) */
return ureg_scalar(ureg_src(tx->regs.rL[loop_level]), TGSI_SWIZZLE_Y);
loop_level--;
}
DBG("aL counter requested outside of loop\n");
return ureg_src_undef();
}
static INLINE unsigned *
@@ -787,8 +791,12 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
case D3DSPR_ADDR:
assert(!param->rel);
if (IS_VS) {
tx_addr_alloc(tx, param->idx);
src = ureg_src(tx->regs.a);
assert(param->idx == 0);
/* the address register (vs only) must be
* assigned before use */
assert(!ureg_dst_is_undef(tx->regs.a0));
ureg_ARR(ureg, tx->regs.address, ureg_src(tx->regs.a0));
src = ureg_src(tx->regs.address);
} else {
if (tx->version.major < 2 && tx->version.minor < 4) {
/* no subroutines, so should be defined */
@@ -827,6 +835,7 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
src = ureg_src_register(TGSI_FILE_SAMPLER, param->idx);
break;
case D3DSPR_CONST:
assert(!param->rel || IS_VS);
if (param->rel)
tx->indirect_const_access = TRUE;
if (param->rel || !tx_lconstf(tx, &src, param->idx)) {
@@ -834,6 +843,13 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
nine_info_mark_const_f_used(tx->info, param->idx);
src = ureg_src_register(TGSI_FILE_CONSTANT, param->idx);
}
if (!IS_VS && tx->version.major < 2) {
/* ps 1.X clamps constants */
tmp = tx_scratch(tx);
ureg_MIN(ureg, tmp, src, ureg_imm1f(ureg, 1.0f));
ureg_MAX(ureg, tmp, ureg_src(tmp), ureg_imm1f(ureg, -1.0f));
src = ureg_src(tmp);
}
break;
case D3DSPR_CONST2:
case D3DSPR_CONST3:
@@ -843,26 +859,33 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
src = ureg_imm1f(ureg, 0.0f);
break;
case D3DSPR_CONSTINT:
if (param->rel || !tx_lconsti(tx, &src, param->idx)) {
if (!param->rel)
nine_info_mark_const_i_used(tx->info, param->idx);
/* relative addressing is only possible for float constants in vs */
assert(!param->rel);
if (!tx_lconsti(tx, &src, param->idx)) {
nine_info_mark_const_i_used(tx->info, param->idx);
src = ureg_src_register(TGSI_FILE_CONSTANT,
tx->info->const_i_base + param->idx);
}
break;
case D3DSPR_CONSTBOOL:
if (param->rel || !tx_lconstb(tx, &src, param->idx)) {
assert(!param->rel);
if (!tx_lconstb(tx, &src, param->idx)) {
char r = param->idx / 4;
char s = param->idx & 3;
if (!param->rel)
nine_info_mark_const_b_used(tx->info, param->idx);
nine_info_mark_const_b_used(tx->info, param->idx);
src = ureg_src_register(TGSI_FILE_CONSTANT,
tx->info->const_b_base + r);
src = ureg_swizzle(src, s, s, s, s);
}
break;
case D3DSPR_LOOP:
src = tx_src_scalar(tx_get_aL(tx));
if (ureg_dst_is_undef(tx->regs.address))
tx->regs.address = ureg_DECL_address(ureg);
if (!tx->native_integers)
ureg_ARR(ureg, tx->regs.address, tx_get_loopal(tx));
else
ureg_UARL(ureg, tx->regs.address, tx_get_loopal(tx));
src = ureg_src(tx->regs.address);
break;
case D3DSPR_MISCTYPE:
switch (param->idx) {
@@ -904,6 +927,25 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
if (param->rel)
src = ureg_src_indirect(src, tx_src_param(tx, param->rel));
switch (param->mod) {
case NINED3DSPSM_DW:
tmp = tx_scratch(tx);
/* NOTE: app is not allowed to read w with this modifier */
ureg_RCP(ureg, ureg_writemask(tmp, NINED3DSP_WRITEMASK_3), src);
ureg_MUL(ureg, tmp, src, ureg_swizzle(ureg_src(tmp), NINE_SWIZZLE4(W,W,W,W)));
src = ureg_src(tmp);
break;
case NINED3DSPSM_DZ:
tmp = tx_scratch(tx);
/* NOTE: app is not allowed to read z with this modifier */
ureg_RCP(ureg, ureg_writemask(tmp, NINED3DSP_WRITEMASK_2), src);
ureg_MUL(ureg, tmp, src, ureg_swizzle(ureg_src(tmp), NINE_SWIZZLE4(Z,Z,Z,Z)));
src = ureg_src(tmp);
break;
default:
break;
}
if (param->swizzle != NINED3DSP_NOSWIZZLE)
src = ureg_swizzle(src,
(param->swizzle >> 0) & 0x3,
@@ -946,7 +988,7 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
break;
case NINED3DSPSM_DZ:
case NINED3DSPSM_DW:
/* handled in instruction */
/* Already handled */
break;
case NINED3DSPSM_SIGN:
tmp = tx_scratch(tx);
@@ -1001,7 +1043,7 @@ _tx_dst_param(struct shader_translator *tx, const struct sm1_dst_param *param)
dst = ureg_dst(tx->regs.vT[param->idx]);
} else {
tx_addr_alloc(tx, param->idx);
dst = tx->regs.a;
dst = tx->regs.a0;
}
break;
case D3DSPR_RASTOUT:
@@ -1016,13 +1058,13 @@ _tx_dst_param(struct shader_translator *tx, const struct sm1_dst_param *param)
case 1:
if (ureg_dst_is_undef(tx->regs.oFog))
tx->regs.oFog =
ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_FOG, 0);
ureg_saturate(ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_FOG, 0));
dst = tx->regs.oFog;
break;
case 2:
if (ureg_dst_is_undef(tx->regs.oPts))
tx->regs.oPts =
ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_PSIZE, 0);
ureg_saturate(ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_PSIZE, 0));
dst = tx->regs.oPts;
break;
default:
@@ -1163,16 +1205,19 @@ NineTranslateInstruction_Mkxn(struct shader_translator *tx, const unsigned k, co
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst;
struct ureg_src src[2];
struct sm1_src_param *src_mat = &tx->insn.src[1];
unsigned i;
dst = tx_dst_param(tx, &tx->insn.dst[0]);
src[0] = tx_src_param(tx, &tx->insn.src[0]);
src[1] = tx_src_param(tx, &tx->insn.src[1]);
for (i = 0; i < n; i++, src[1].Index++)
for (i = 0; i < n; i++)
{
const unsigned m = (1 << i);
src[1] = tx_src_param(tx, src_mat);
src_mat->idx++;
if (!(dst.WriteMask & m))
continue;
@@ -1329,7 +1374,7 @@ NineTranslateInstruction_Generic(struct shader_translator *);
DECL_SPECIAL(M4x4)
{
return NineTranslateInstruction_Mkxn(tx, 4, 3);
return NineTranslateInstruction_Mkxn(tx, 4, 4);
}
DECL_SPECIAL(M4x3)
@@ -1367,33 +1412,29 @@ DECL_SPECIAL(CND)
struct ureg_dst cgt;
struct ureg_src cnd;
if (tx->insn.coissue && tx->version.major == 1 && tx->version.minor < 4) {
/* The coissue flag was a hint telling compilers to
* execute two operations at the same time when both
* wrote to the same dst through different channels.
* It has no effect on current hw, but CND
* appears to be affected. The handling of this very
* specific case below mimics wine's behaviour */
if (tx->insn.coissue && tx->version.major == 1 && tx->version.minor < 4 && tx->insn.dst[0].mask != NINED3DSP_WRITEMASK_3) {
ureg_MOV(tx->ureg,
dst, tx_src_param(tx, &tx->insn.src[1]));
return D3D_OK;
}
cnd = tx_src_param(tx, &tx->insn.src[0]);
#ifdef NINE_TGSI_LAZY_R600
cgt = tx_scratch(tx);
if (tx->version.major == 1 && tx->version.minor < 4) {
cgt.WriteMask = TGSI_WRITEMASK_W;
ureg_SGT(tx->ureg, cgt, cnd, ureg_imm1f(tx->ureg, 0.5f));
cnd = ureg_scalar(cnd, TGSI_SWIZZLE_W);
} else {
ureg_SGT(tx->ureg, cgt, cnd, ureg_imm1f(tx->ureg, 0.5f));
}
ureg_CMP(tx->ureg, dst,
tx_src_param(tx, &tx->insn.src[1]),
tx_src_param(tx, &tx->insn.src[2]), ureg_negate(cnd));
#else
if (tx->version.major == 1 && tx->version.minor < 4)
cnd = ureg_scalar(cnd, TGSI_SWIZZLE_W);
ureg_CND(tx->ureg, dst,
ureg_SGT(tx->ureg, cgt, cnd, ureg_imm1f(tx->ureg, 0.5f));
ureg_CMP(tx->ureg, dst, ureg_negate(ureg_src(cgt)),
tx_src_param(tx, &tx->insn.src[1]),
tx_src_param(tx, &tx->insn.src[2]), cnd);
#endif
tx_src_param(tx, &tx->insn.src[2]));
return D3D_OK;
}
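The rewritten `CND` path builds the selection from two TGSI opcodes: `SGT` yields 1.0 or 0.0 for `src0 > 0.5`, and `CMP` picks its second operand when its first is negative. Negating the `SGT` result therefore turns "greater than 0.5" into "negative". A scalar sketch of that chain, with hypothetical function names:

```c
#include <assert.h>

/* Scalar models of the TGSI opcodes used by the hunk. */
static float tgsi_sgt(float a, float b) { return a > b ? 1.0f : 0.0f; }
static float tgsi_cmp(float c, float a, float b) { return c < 0.0f ? a : b; }

/* cnd: select s1 when s0 > 0.5, otherwise s2. */
static float cnd(float s0, float s1, float s2)
{
    /* -1.0 is negative and selects s1; -0.0 is not less than 0 and selects s2. */
    return tgsi_cmp(-tgsi_sgt(s0, 0.5f), s1, s2);
}
```

The boundary case matters: an input of exactly 0.5 is not greater than 0.5, so the second value is selected.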
@@ -1427,9 +1468,17 @@ DECL_SPECIAL(CALLNZ)
DECL_SPECIAL(MOV_vs1x)
{
if (tx->insn.dst[0].file == D3DSPR_ADDR) {
ureg_ARL(tx->ureg,
/* Implementation note: We don't write directly
* to the addr register, but to an intermediate
* float register.
* Contrary to the doc, when writing to ADDR here,
* the rounding is not to nearest, but to lowest
* (wine test).
* Since we use ARR next, subtract 0.5. */
ureg_SUB(tx->ureg,
tx_dst_param(tx, &tx->insn.dst[0]),
tx_src_param(tx, &tx->insn.src[0]));
tx_src_param(tx, &tx->insn.src[0]),
ureg_imm1f(tx->ureg, 0.5f));
return D3D_OK;
}
return NineTranslateInstruction_Generic(tx);
@@ -1440,46 +1489,36 @@ DECL_SPECIAL(LOOP)
struct ureg_program *ureg = tx->ureg;
unsigned *label;
struct ureg_src src = tx_src_param(tx, &tx->insn.src[1]);
struct ureg_src iter = ureg_scalar(src, TGSI_SWIZZLE_X);
struct ureg_src init = ureg_scalar(src, TGSI_SWIZZLE_Y);
struct ureg_src step = ureg_scalar(src, TGSI_SWIZZLE_Z);
struct ureg_dst ctr;
struct ureg_dst tmp = tx_scratch_scalar(tx);
struct ureg_dst tmp;
struct ureg_src ctrx;
label = tx_bgnloop(tx);
ctr = tx_get_loopctr(tx);
ctr = tx_get_loopctr(tx, TRUE);
ctrx = ureg_scalar(ureg_src(ctr), TGSI_SWIZZLE_X);
ureg_MOV(tx->ureg, ctr, init);
/* src: num_iterations - start_value of al - step for al - 0 */
ureg_MOV(ureg, ctr, src);
ureg_BGNLOOP(tx->ureg, label);
if (tx->native_integers) {
/* we'll let the backend pull up that MAD ... */
ureg_UMAD(ureg, tmp, iter, step, init);
ureg_USEQ(ureg, tmp, ureg_src(ctr), tx_src_scalar(tmp));
#ifdef NINE_TGSI_LAZY_DEVS
ureg_UIF(ureg, tx_src_scalar(tmp), tx_cond(tx));
#endif
} else {
/* can't simply use SGE for precision because step might be negative */
ureg_MAD(ureg, tmp, iter, step, init);
ureg_SEQ(ureg, tmp, ureg_src(ctr), tx_src_scalar(tmp));
#ifdef NINE_TGSI_LAZY_DEVS
tmp = tx_scratch_scalar(tx);
/* Initially ctr.x contains the number of iterations.
* ctr.y will contain the updated value of al.
* We decrease ctr.x at the end of every iteration,
* and stop when it reaches 0. */
if (!tx->native_integers) {
/* case src and ctr contain floats */
/* to avoid precision issue, we stop when ctr <= 0.5 */
ureg_SGE(ureg, tmp, ureg_imm1f(ureg, 0.5f), ctrx);
ureg_IF(ureg, tx_src_scalar(tmp), tx_cond(tx));
#endif
} else {
/* case src and ctr contain integers */
ureg_ISGE(ureg, tmp, ureg_imm1i(ureg, 0), ctrx);
ureg_UIF(ureg, tx_src_scalar(tmp), tx_cond(tx));
}
#ifdef NINE_TGSI_LAZY_DEVS
ureg_BRK(ureg);
tx_endcond(tx);
ureg_ENDIF(ureg);
#else
ureg_BREAKC(ureg, tx_src_scalar(tmp));
#endif
if (tx->native_integers) {
ureg_UARL(ureg, tx_get_aL(tx), tx_src_scalar(ctr));
ureg_UADD(ureg, ctr, tx_src_scalar(ctr), step);
} else {
ureg_ARL(ureg, tx_get_aL(tx), tx_src_scalar(ctr));
ureg_ADD(ureg, ctr, tx_src_scalar(ctr), step);
}
return D3D_OK;
}
@@ -1491,6 +1530,25 @@ DECL_SPECIAL(RET)
DECL_SPECIAL(ENDLOOP)
{
struct ureg_program *ureg = tx->ureg;
struct ureg_dst ctr = tx_get_loopctr(tx, TRUE);
struct ureg_dst dst_ctrx, dst_al;
struct ureg_src src_ctr, al_counter;
dst_ctrx = ureg_writemask(ctr, NINED3DSP_WRITEMASK_0);
dst_al = ureg_writemask(ctr, NINED3DSP_WRITEMASK_1);
src_ctr = ureg_src(ctr);
al_counter = ureg_scalar(src_ctr, TGSI_SWIZZLE_Z);
/* ctr.x -= 1
* ctr.y (aL) += step */
if (!tx->native_integers) {
ureg_ADD(ureg, dst_ctrx, src_ctr, ureg_imm1f(ureg, -1.0f));
ureg_ADD(ureg, dst_al, src_ctr, al_counter);
} else {
ureg_UADD(ureg, dst_ctrx, src_ctr, ureg_imm1i(ureg, -1));
ureg_UADD(ureg, dst_al, src_ctr, al_counter);
}
ureg_ENDLOOP(tx->ureg, tx_endloop(tx));
return D3D_OK;
}
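Taken together, the new `LOOP` and `ENDLOOP` code keeps one register per loop: `x` holds the remaining iteration count, `y` holds the current `aL` value, and `z` holds the step. The loop breaks once `x` is no longer above 0.5, and `ENDLOOP` decrements `x` and adds the step to `y`. The sketch below simulates the float path on the CPU. `run_loop` and its trace buffer are hypothetical and exist only to make the behaviour observable.

```c
#include <assert.h>
#include <stddef.h>

/* Simulate the float loop counter: ctr[0] = iterations left,
 * ctr[1] = aL, ctr[2] = step. Records aL for each executed iteration
 * (up to trace_cap entries) and returns the iteration count. */
static unsigned run_loop(int iterations, int start, int step,
                         int *al_trace, unsigned trace_cap)
{
    float ctr[3];
    unsigned n = 0;

    assert(al_trace != NULL || trace_cap == 0);
    ctr[0] = (float)iterations;
    ctr[1] = (float)start;
    ctr[2] = (float)step;

    for (;;) {
        /* LOOP: break when 0.5 >= ctr.x (avoids float precision issues). */
        if (0.5f >= ctr[0])
            break;
        if (n < trace_cap)
            al_trace[n] = (int)ctr[1]; /* loop body reads aL from ctr.y */
        n++;
        /* ENDLOOP: ctr.x -= 1, ctr.y += step. */
        ctr[0] -= 1.0f;
        ctr[1] += ctr[2];
    }
    return n;
}
```

Counting down in `x` instead of comparing `aL` against a computed end value means a negative or zero step cannot change the number of iterations.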
@@ -1540,7 +1598,7 @@ DECL_SPECIAL(REP)
tx->native_integers ? ureg_imm1u(ureg, 0) : ureg_imm1f(ureg, 0.0f);
label = tx_bgnloop(tx);
ctr = tx_get_loopctr(tx);
ctr = tx_get_loopctr(tx, FALSE);
/* NOTE: rep must be constant, so we don't have to save the count */
assert(rep.File == TGSI_FILE_CONSTANT || rep.File == TGSI_FILE_IMMEDIATE);
@@ -1550,24 +1608,16 @@ DECL_SPECIAL(REP)
if (tx->native_integers)
{
ureg_USGE(ureg, tmp, tx_src_scalar(ctr), rep);
#ifdef NINE_TGSI_LAZY_DEVS
ureg_UIF(ureg, tx_src_scalar(tmp), tx_cond(tx));
#endif
}
else
{
ureg_SGE(ureg, tmp, tx_src_scalar(ctr), rep);
#ifdef NINE_TGSI_LAZY_DEVS
ureg_IF(ureg, tx_src_scalar(tmp), tx_cond(tx));
#endif
}
#ifdef NINE_TGSI_LAZY_DEVS
ureg_BRK(ureg);
tx_endcond(tx);
ureg_ENDIF(ureg);
#else
ureg_BREAKC(ureg, tx_src_scalar(tmp));
#endif
if (tx->native_integers) {
ureg_UADD(ureg, ctr, tx_src_scalar(ctr), ureg_imm1u(ureg, 1));
@@ -1645,14 +1695,10 @@ DECL_SPECIAL(BREAKC)
src[0] = tx_src_param(tx, &tx->insn.src[0]);
src[1] = tx_src_param(tx, &tx->insn.src[1]);
ureg_insn(tx->ureg, cmp_op, &tmp, 1, src, 2);
#ifdef NINE_TGSI_LAZY_DEVS
ureg_IF(tx->ureg, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), tx_cond(tx));
ureg_BRK(tx->ureg);
tx_endcond(tx);
ureg_ENDIF(tx->ureg);
#else
ureg_BREAKC(tx->ureg, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
#endif
return D3D_OK;
}
@@ -1958,21 +2004,55 @@ DECL_SPECIAL(DEFI)
return D3D_OK;
}
DECL_SPECIAL(POW)
{
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src src[2] = {
tx_src_param(tx, &tx->insn.src[0]),
tx_src_param(tx, &tx->insn.src[1])
};
ureg_POW(tx->ureg, dst, ureg_abs(src[0]), src[1]);
return D3D_OK;
}
DECL_SPECIAL(RSQ)
{
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
struct ureg_dst tmp = tx_scratch(tx);
ureg_RSQ(ureg, tmp, ureg_abs(src));
ureg_MIN(ureg, dst, ureg_imm1f(ureg, FLT_MAX), ureg_src(tmp));
return D3D_OK;
}
DECL_SPECIAL(LOG)
{
struct ureg_program *ureg = tx->ureg;
struct ureg_dst tmp = tx_scratch_scalar(tx);
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
ureg_LG2(ureg, tmp, ureg_abs(src));
ureg_MAX(ureg, dst, ureg_imm1f(ureg, -FLT_MAX), tx_src_scalar(tmp));
return D3D_OK;
}
DECL_SPECIAL(NRM)
{
struct ureg_program *ureg = tx->ureg;
struct ureg_dst tmp = tx_scratch_scalar(tx);
struct ureg_src nrm = tx_src_scalar(tmp);
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
ureg_DP3(ureg, tmp, src, src);
ureg_RSQ(ureg, tmp, nrm);
ureg_MUL(ureg, tx_dst_param(tx, &tx->insn.dst[0]), src, nrm);
ureg_MIN(ureg, tmp, ureg_imm1f(ureg, FLT_MAX), nrm);
ureg_MUL(ureg, dst, src, nrm);
return D3D_OK;
}
DECL_SPECIAL(DP2ADD)
{
#ifdef NINE_TGSI_LAZY_R600
struct ureg_dst tmp = tx_scratch_scalar(tx);
struct ureg_src dp2 = tx_src_scalar(tmp);
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
@@ -1986,9 +2066,6 @@ DECL_SPECIAL(DP2ADD)
ureg_ADD(tx->ureg, dst, src[2], dp2);
return D3D_OK;
#else
return NineTranslateInstruction_Generic(tx);
#endif
}
DECL_SPECIAL(TEXCOORD)
@@ -1997,9 +2074,9 @@ DECL_SPECIAL(TEXCOORD)
const unsigned s = tx->insn.dst[0].idx;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
if (ureg_src_is_undef(tx->regs.vT[s]))
tx->regs.vT[s] = ureg_DECL_fs_input(ureg, tx->texcoord_sn, s, TGSI_INTERPOLATE_PERSPECTIVE);
ureg_MOV(ureg, dst, tx->regs.vT[s]); /* XXX is this sufficient ? */
tx_texcoord_alloc(tx, s);
ureg_MOV(ureg, ureg_writemask(ureg_saturate(dst), TGSI_WRITEMASK_XYZ), tx->regs.vT[s]);
ureg_MOV(ureg, ureg_writemask(dst, TGSI_WRITEMASK_W), ureg_imm1f(tx->ureg, 1.0f));
return D3D_OK;
}
@@ -2007,12 +2084,12 @@ DECL_SPECIAL(TEXCOORD)
DECL_SPECIAL(TEXCOORD_ps14)
{
struct ureg_program *ureg = tx->ureg;
const unsigned s = tx->insn.src[0].idx;
struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
if (ureg_src_is_undef(tx->regs.vT[s]))
tx->regs.vT[s] = ureg_DECL_fs_input(ureg, tx->texcoord_sn, s, TGSI_INTERPOLATE_PERSPECTIVE);
ureg_MOV(ureg, dst, tx->regs.vT[s]); /* XXX is this sufficient ? */
assert(tx->insn.src[0].file == D3DSPR_TEXTURE);
ureg_MOV(ureg, dst, src);
return D3D_OK;
}
@@ -2046,22 +2123,62 @@ DECL_SPECIAL(TEXBEML)
DECL_SPECIAL(TEXREG2AR)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src sample;
const int m = tx->insn.dst[0].idx;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
sample = ureg_DECL_sampler(ureg, m);
tx->info->sampler_mask |= 1 << m;
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m), ureg_swizzle(ureg_src(tx->regs.tS[n]), NINE_SWIZZLE4(W,X,X,X)), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXREG2GB)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src sample;
const int m = tx->insn.dst[0].idx;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
sample = ureg_DECL_sampler(ureg, m);
tx->info->sampler_mask |= 1 << m;
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m), ureg_swizzle(ureg_src(tx->regs.tS[n]), NINE_SWIZZLE4(Y,Z,Z,Z)), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXM3x2PAD)
{
STUB(D3DERR_INVALIDCALL);
return D3D_OK; /* this is just padding */
}
DECL_SPECIAL(TEXM3x2TEX)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src sample;
const int m = tx->insn.dst[0].idx - 1;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
tx_texcoord_alloc(tx, m);
tx_texcoord_alloc(tx, m+1);
/* performs the matrix multiplication */
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Y), tx->regs.vT[m+1], ureg_src(tx->regs.tS[n]));
sample = ureg_DECL_sampler(ureg, m + 1);
tx->info->sampler_mask |= 1 << (m + 1);
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 1), ureg_src(dst), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXM3x3PAD)
@@ -2071,61 +2188,180 @@ DECL_SPECIAL(TEXM3x3PAD)
DECL_SPECIAL(TEXM3x3SPEC)
{
STUB(D3DERR_INVALIDCALL);
}
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src E = tx_src_param(tx, &tx->insn.src[1]);
struct ureg_src sample;
struct ureg_dst tmp;
const int m = tx->insn.dst[0].idx - 2;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
DECL_SPECIAL(TEXM3x3VSPEC)
{
STUB(D3DERR_INVALIDCALL);
tx_texcoord_alloc(tx, m);
tx_texcoord_alloc(tx, m+1);
tx_texcoord_alloc(tx, m+2);
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Y), tx->regs.vT[m+1], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Z), tx->regs.vT[m+2], ureg_src(tx->regs.tS[n]));
sample = ureg_DECL_sampler(ureg, m + 2);
tx->info->sampler_mask |= 1 << (m + 2);
tmp = ureg_writemask(tx_scratch(tx), TGSI_WRITEMASK_XYZ);
/* At this step, dst = N = (u', w', z').
* We want dst to be the texture sampled at (u'', w'', z''), with
* (u'', w'', z'') = 2 * (N.E / N.N) * N - E */
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_src(dst), ureg_src(dst));
ureg_RCP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
/* at this step tmp.x = 1/N.N */
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Y), ureg_src(dst), E);
/* at this step tmp.y = N.E */
ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Y));
/* at this step tmp.x = N.E/N.N */
ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_imm1f(ureg, 2.0f));
ureg_MUL(ureg, tmp, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_src(dst));
/* at this step tmp.xyz = 2 * (N.E / N.N) * N */
ureg_SUB(ureg, tmp, ureg_src(tmp), E);
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(tmp), sample);
return D3D_OK;
}
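Note on TEXM3x3SPEC above: the comments describe the lookup coordinate as 2 * (N.E / N.N) * N - E. A minimal scalar sketch of that formula follows, in the same order as the emitted instructions (DP3, RCP, DP3, MUL, MUL, MUL, SUB). The `vec3` type and the function names are assumptions made for illustration only.

```c
/* Hypothetical 3-component vector, for illustration only. */
struct vec3 { float x, y, z; };

static float dot3(struct vec3 a, struct vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Computes 2 * (N.E / N.N) * N - E without normalizing N. */
static struct vec3 reflect_unnormalized(struct vec3 n, struct vec3 e)
{
    float inv_nn = 1.0f / dot3(n, n);  /* tmp.x = 1 / N.N  */
    float ne = dot3(n, e);             /* tmp.y = N.E      */
    float k = (inv_nn * ne) * 2.0f;    /* tmp.x = 2 * N.E / N.N */
    struct vec3 r = { k * n.x - e.x, k * n.y - e.y, k * n.z - e.z };
    return r;
}
```

Dividing by N.N rather than normalizing N first gives the same reflection direction for any non-zero N; as in the patch, a zero-length N is not guarded against here.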
DECL_SPECIAL(TEXREG2RGB)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src sample;
const int m = tx->insn.dst[0].idx;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
sample = ureg_DECL_sampler(ureg, m);
tx->info->sampler_mask |= 1 << m;
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m), ureg_src(tx->regs.tS[n]), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXDP3TEX)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_dst tmp;
struct ureg_src sample;
const int m = tx->insn.dst[0].idx;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
tx_texcoord_alloc(tx, m);
tmp = tx_scratch(tx);
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
ureg_MOV(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_YZ), ureg_imm1f(ureg, 0.0f));
sample = ureg_DECL_sampler(ureg, m);
tx->info->sampler_mask |= 1 << m;
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m), ureg_src(tmp), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXM3x2DEPTH)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst tmp;
const int m = tx->insn.dst[0].idx - 1;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
tx_texcoord_alloc(tx, m);
tx_texcoord_alloc(tx, m+1);
tmp = tx_scratch(tx);
/* performs the matrix multiplication */
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Y), tx->regs.vT[m+1], ureg_src(tx->regs.tS[n]));
ureg_RCP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Z), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Y));
/* tmp.x = 'z', tmp.y = 'w', tmp.z = 1/'w'. */
ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Z));
/* res = 'w' == 0 ? 1.0 : z/w */
ureg_CMP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_negate(ureg_abs(ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Y))),
ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_imm1f(ureg, 1.0f));
/* replace the depth for depth testing with the result */
tx->regs.oDepth = ureg_DECL_output_masked(ureg, TGSI_SEMANTIC_POSITION, 0, TGSI_WRITEMASK_Z);
ureg_MOV(ureg, tx->regs.oDepth, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
/* note that we write nothing to the destination, since it's disallowed to use it afterward */
return D3D_OK;
}
DECL_SPECIAL(TEXDP3)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
const int m = tx->insn.dst[0].idx;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
tx_texcoord_alloc(tx, m);
ureg_DP3(ureg, dst, tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
return D3D_OK;
}
DECL_SPECIAL(TEXM3x3)
{
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src src[4];
int s;
struct ureg_src sample;
struct ureg_dst E, tmp;
const int m = tx->insn.dst[0].idx - 2;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
for (s = m; s <= (m + 2); ++s) {
if (ureg_src_is_undef(tx->regs.vT[s]))
tx->regs.vT[s] = ureg_DECL_fs_input(ureg, tx->texcoord_sn, s, TGSI_INTERPOLATE_PERSPECTIVE);
src[s] = tx->regs.vT[s];
}
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), src[0], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Y), src[1], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Z), src[2], ureg_src(tx->regs.tS[n]));
tx_texcoord_alloc(tx, m);
tx_texcoord_alloc(tx, m+1);
tx_texcoord_alloc(tx, m+2);
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Y), tx->regs.vT[m+1], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Z), tx->regs.vT[m+2], ureg_src(tx->regs.tS[n]));
switch (tx->insn.opcode) {
case D3DSIO_TEXM3x3:
ureg_MOV(ureg, ureg_writemask(dst, TGSI_WRITEMASK_W), ureg_imm1f(ureg, 1.0f));
break;
case D3DSIO_TEXM3x3TEX:
src[3] = ureg_DECL_sampler(ureg, m + 2);
sample = ureg_DECL_sampler(ureg, m + 2);
tx->info->sampler_mask |= 1 << (m + 2);
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(dst), src[3]);
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(dst), sample);
break;
case D3DSIO_TEXM3x3VSPEC:
sample = ureg_DECL_sampler(ureg, m + 2);
tx->info->sampler_mask |= 1 << (m + 2);
E = tx_scratch(tx);
tmp = ureg_writemask(tx_scratch(tx), TGSI_WRITEMASK_XYZ);
ureg_MOV(ureg, ureg_writemask(E, TGSI_WRITEMASK_X), ureg_scalar(tx->regs.vT[m], TGSI_SWIZZLE_W));
ureg_MOV(ureg, ureg_writemask(E, TGSI_WRITEMASK_Y), ureg_scalar(tx->regs.vT[m+1], TGSI_SWIZZLE_W));
ureg_MOV(ureg, ureg_writemask(E, TGSI_WRITEMASK_Z), ureg_scalar(tx->regs.vT[m+2], TGSI_SWIZZLE_W));
/* At this step, dst = N = (u', w', z').
* We want dst to be the texture sampled at (u'', w'', z''), with
* (u'', w'', z'') = 2 * (N.E / N.N) * N - E */
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_src(dst), ureg_src(dst));
ureg_RCP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
/* at this step tmp.x = 1/N.N */
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Y), ureg_src(dst), ureg_src(E));
/* at this step tmp.y = N.E */
ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Y));
/* at this step tmp.x = N.E/N.N */
ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_imm1f(ureg, 2.0f));
ureg_MUL(ureg, tmp, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_src(dst));
/* at this step tmp.xyz = 2 * (N.E / N.N) * N */
ureg_SUB(ureg, tmp, ureg_src(tmp), ureg_src(E));
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(tmp), sample);
break;
default:
return D3DERR_INVALIDCALL;
@@ -2135,7 +2371,28 @@ DECL_SPECIAL(TEXM3x3)
DECL_SPECIAL(TEXDEPTH)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst r5;
struct ureg_src r5r, r5g;
assert(tx->insn.dst[0].idx == 5); /* instruction must get r5 here */
/* we must replace the depth by r5.g == 0 ? 1.0f : r5.r/r5.g.
* r5 won't be used afterward, thus we can use r5.ba */
r5 = tx->regs.r[5];
r5r = ureg_scalar(ureg_src(r5), TGSI_SWIZZLE_X);
r5g = ureg_scalar(ureg_src(r5), TGSI_SWIZZLE_Y);
ureg_RCP(ureg, ureg_writemask(r5, TGSI_WRITEMASK_Z), r5g);
ureg_MUL(ureg, ureg_writemask(r5, TGSI_WRITEMASK_X), r5r, ureg_scalar(ureg_src(r5), TGSI_SWIZZLE_Z));
/* r5.r = r/g */
ureg_CMP(ureg, ureg_writemask(r5, TGSI_WRITEMASK_X), ureg_negate(ureg_abs(r5g)),
r5r, ureg_imm1f(ureg, 1.0f));
/* replace the depth for depth testing with the result */
tx->regs.oDepth = ureg_DECL_output_masked(ureg, TGSI_SEMANTIC_POSITION, 0, TGSI_WRITEMASK_Z);
ureg_MOV(ureg, tx->regs.oDepth, r5r);
return D3D_OK;
}
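Note on TEXDEPTH (and TEXM3x2DEPTH) above: the depth written is `g == 0 ? 1.0 : r/g`, selected with CMP on `-|g|`. TGSI CMP picks its second operand when the first is strictly negative and its third otherwise, so `-|g|` is negative exactly when `g` is non-zero. The sketch below models this in scalar C; `tgsi_cmp` and `replaced_depth` are hypothetical names used only here.

```c
#include <math.h>

/* Scalar model of TGSI CMP: dst = (src0 < 0) ? src1 : src2. */
static float tgsi_cmp(float src0, float src1, float src2)
{
    return src0 < 0.0f ? src1 : src2;
}

/* Hypothetical helper, not from the patch: follows the RCP, MUL, CMP
 * sequence emitted for the depth replacement. */
static float replaced_depth(float r, float g)
{
    float ratio = r * (1.0f / g);            /* inf or NaN when g == 0 */
    return tgsi_cmp(-fabsf(g), ratio, 1.0f); /* falls back to 1.0 when g == 0 */
}
```

The division result is computed unconditionally and discarded when `g` is zero, which matches the branch-free form used in the shader translation.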
DECL_SPECIAL(BEM)
@@ -2275,7 +2532,7 @@ struct sm1_op_info inst_table[] =
_OPI(MAD, MAD, V(0,0), V(3,0), V(0,0), V(3,0), 1, 3, NULL), /* 4 */
_OPI(MUL, MUL, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 5 */
_OPI(RCP, RCP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 6 */
_OPI(RSQ, RSQ, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 7 */
_OPI(RSQ, RSQ, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(RSQ)), /* 7 */
_OPI(DP3, DP3, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 8 */
_OPI(DP4, DP4, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 9 */
_OPI(MIN, MIN, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 10 */
@@ -2283,7 +2540,7 @@ struct sm1_op_info inst_table[] =
_OPI(SLT, SLT, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 12 */
_OPI(SGE, SGE, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 13 */
_OPI(EXP, EX2, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 14 */
_OPI(LOG, LG2, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 15 */
_OPI(LOG, LG2, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(LOG)), /* 15 */
_OPI(LIT, LIT, V(0,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL), /* 16 */
_OPI(DST, DST, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 17 */
_OPI(LRP, LRP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 3, NULL), /* 18 */
@@ -2295,16 +2552,16 @@ struct sm1_op_info inst_table[] =
_OPI(M3x3, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(M3x3)),
_OPI(M3x2, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(M3x2)),
_OPI(CALL, CAL, V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(CALL)),
_OPI(CALLNZ, CAL, V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(CALLNZ)),
_OPI(CALL, CAL, V(2,0), V(3,0), V(2,1), V(3,0), 0, 1, SPECIAL(CALL)),
_OPI(CALLNZ, CAL, V(2,0), V(3,0), V(2,1), V(3,0), 0, 2, SPECIAL(CALLNZ)),
_OPI(LOOP, BGNLOOP, V(2,0), V(3,0), V(3,0), V(3,0), 0, 2, SPECIAL(LOOP)),
_OPI(RET, RET, V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(RET)),
_OPI(ENDLOOP, ENDLOOP, V(2,0), V(3,0), V(3,0), V(3,0), 0, 0, SPECIAL(ENDLOOP)),
_OPI(LABEL, NOP, V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(LABEL)),
_OPI(LABEL, NOP, V(2,0), V(3,0), V(2,1), V(3,0), 0, 1, SPECIAL(LABEL)),
_OPI(DCL, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 0, 0, SPECIAL(DCL)),
_OPI(POW, POW, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL),
_OPI(POW, POW, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(POW)),
_OPI(CRS, XPD, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* XXX: .w */
_OPI(SGN, SSG, V(2,0), V(3,0), V(0,0), V(0,0), 1, 3, SPECIAL(SGN)), /* ignore src1,2 */
_OPI(ABS, ABS, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL),
@@ -2322,8 +2579,9 @@ struct sm1_op_info inst_table[] =
_OPI(ENDIF, ENDIF, V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(ENDIF)),
_OPI(BREAK, BRK, V(2,1), V(3,0), V(2,1), V(3,0), 0, 0, NULL),
_OPI(BREAKC, BREAKC, V(2,1), V(3,0), V(2,1), V(3,0), 0, 2, SPECIAL(BREAKC)),
_OPI(MOVA, ARR, V(2,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
/* we don't write to the address register, but a normal register (copied
* when needed to the address register), thus we don't use ARR */
_OPI(MOVA, MOV, V(2,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
_OPI(DEFB, NOP, V(0,0), V(3,0) , V(0,0), V(3,0) , 1, 0, SPECIAL(DEFB)),
_OPI(DEFI, NOP, V(0,0), V(3,0) , V(0,0), V(3,0) , 1, 0, SPECIAL(DEFI)),
@@ -2334,42 +2592,42 @@ struct sm1_op_info inst_table[] =
_OPI(TEX, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 0, SPECIAL(TEX)),
_OPI(TEX, TEX, V(0,0), V(0,0), V(1,4), V(1,4), 1, 1, SPECIAL(TEXLD_14)),
_OPI(TEX, TEX, V(0,0), V(0,0), V(2,0), V(3,0), 1, 2, SPECIAL(TEXLD)),
_OPI(TEXBEM, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXBEM)),
_OPI(TEXBEML, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXBEML)),
_OPI(TEXREG2AR, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXREG2AR)),
_OPI(TEXREG2GB, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXREG2GB)),
_OPI(TEXM3x2PAD, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x2PAD)),
_OPI(TEXM3x2TEX, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x2TEX)),
_OPI(TEXM3x3PAD, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x3PAD)),
_OPI(TEXM3x3TEX, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x3)),
_OPI(TEXM3x3SPEC, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x3SPEC)),
_OPI(TEXM3x3VSPEC, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x3VSPEC)),
_OPI(TEXBEM, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXBEM)),
_OPI(TEXBEML, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXBEML)),
_OPI(TEXREG2AR, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXREG2AR)),
_OPI(TEXREG2GB, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXREG2GB)),
_OPI(TEXM3x2PAD, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x2PAD)),
_OPI(TEXM3x2TEX, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x2TEX)),
_OPI(TEXM3x3PAD, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x3PAD)),
_OPI(TEXM3x3TEX, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x3)),
_OPI(TEXM3x3SPEC, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 2, SPECIAL(TEXM3x3SPEC)),
_OPI(TEXM3x3VSPEC, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x3)),
_OPI(EXPP, EXP, V(0,0), V(1,1), V(0,0), V(0,0), 1, 1, NULL),
_OPI(EXPP, EX2, V(2,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
_OPI(LOGP, LG2, V(0,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
_OPI(CND, CND, V(0,0), V(0,0), V(0,0), V(1,4), 1, 3, SPECIAL(CND)),
_OPI(LOGP, LG2, V(0,0), V(3,0), V(0,0), V(0,0), 1, 1, SPECIAL(LOG)),
_OPI(CND, NOP, V(0,0), V(0,0), V(0,0), V(1,4), 1, 3, SPECIAL(CND)),
_OPI(DEF, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 0, SPECIAL(DEF)),
/* More tex stuff */
_OPI(TEXREG2RGB, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 0, 0, SPECIAL(TEXREG2RGB)),
_OPI(TEXDP3TEX, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 0, 0, SPECIAL(TEXDP3TEX)),
_OPI(TEXM3x2DEPTH, TEX, V(0,0), V(0,0), V(1,3), V(1,3), 0, 0, SPECIAL(TEXM3x2DEPTH)),
_OPI(TEXDP3, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 0, 0, SPECIAL(TEXDP3)),
_OPI(TEXM3x3, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 0, 0, SPECIAL(TEXM3x3)),
_OPI(TEXDEPTH, TEX, V(0,0), V(0,0), V(1,4), V(1,4), 0, 0, SPECIAL(TEXDEPTH)),
_OPI(TEXREG2RGB, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 1, 1, SPECIAL(TEXREG2RGB)),
_OPI(TEXDP3TEX, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 1, 1, SPECIAL(TEXDP3TEX)),
_OPI(TEXM3x2DEPTH, TEX, V(0,0), V(0,0), V(1,3), V(1,3), 1, 1, SPECIAL(TEXM3x2DEPTH)),
_OPI(TEXDP3, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 1, 1, SPECIAL(TEXDP3)),
_OPI(TEXM3x3, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 1, 1, SPECIAL(TEXM3x3)),
_OPI(TEXDEPTH, TEX, V(0,0), V(0,0), V(1,4), V(1,4), 1, 0, SPECIAL(TEXDEPTH)),
/* Misc */
_OPI(CMP, CMP, V(0,0), V(0,0), V(1,2), V(3,0), 1, 3, SPECIAL(CMP)), /* reversed */
_OPI(BEM, NOP, V(0,0), V(0,0), V(1,4), V(1,4), 0, 0, SPECIAL(BEM)),
_OPI(DP2ADD, DP2A, V(0,0), V(0,0), V(2,0), V(3,0), 1, 3, SPECIAL(DP2ADD)), /* for radeons */
_OPI(BEM, NOP, V(0,0), V(0,0), V(1,4), V(1,4), 1, 2, SPECIAL(BEM)),
_OPI(DP2ADD, NOP, V(0,0), V(0,0), V(2,0), V(3,0), 1, 3, SPECIAL(DP2ADD)),
_OPI(DSX, DDX, V(0,0), V(0,0), V(2,1), V(3,0), 1, 1, NULL),
_OPI(DSY, DDY, V(0,0), V(0,0), V(2,1), V(3,0), 1, 1, NULL),
_OPI(TEXLDD, TXD, V(0,0), V(0,0), V(2,1), V(3,0), 1, 4, SPECIAL(TEXLDD)),
_OPI(SETP, NOP, V(0,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(SETP)),
_OPI(SETP, NOP, V(0,0), V(3,0), V(2,1), V(3,0), 1, 2, SPECIAL(SETP)),
_OPI(TEXLDL, TXL, V(3,0), V(3,0), V(3,0), V(3,0), 1, 2, SPECIAL(TEXLDL)),
_OPI(BREAKP, BRK, V(0,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(BREAKP))
_OPI(BREAKP, BRK, V(0,0), V(3,0), V(2,1), V(3,0), 0, 1, SPECIAL(BREAKP))
};
struct sm1_op_info inst_phase =
@@ -2740,11 +2998,11 @@ tx_ctor(struct shader_translator *tx, struct nine_shader_info *info)
info->lconstf.data = NULL;
info->lconstf.ranges = NULL;
for (i = 0; i < Elements(tx->regs.aL); ++i) {
tx->regs.aL[i] = ureg_dst_undef();
for (i = 0; i < Elements(tx->regs.rL); ++i) {
tx->regs.rL[i] = ureg_dst_undef();
}
tx->regs.a = ureg_dst_undef();
tx->regs.address = ureg_dst_undef();
tx->regs.a0 = ureg_dst_undef();
tx->regs.p = ureg_dst_undef();
tx->regs.oDepth = ureg_dst_undef();
tx->regs.vPos = ureg_src_undef();
@@ -2852,9 +3110,6 @@ nine_translate_shader(struct NineDevice9 *device, struct nine_shader_info *info)
ureg_property_fs_coord_pixel_center(tx->ureg, TGSI_FS_COORD_PIXEL_CENTER_INTEGER);
}
if (!ureg_dst_is_undef(tx->regs.oPts))
info->point_size = TRUE;
while (!sm1_parse_eof(tx))
sm1_parse_instruction(tx);
tx->parse++; /* for byte_size */
@@ -2870,6 +3125,9 @@ nine_translate_shader(struct NineDevice9 *device, struct nine_shader_info *info)
ureg_END(tx->ureg);
if (IS_VS && !ureg_dst_is_undef(tx->regs.oPts))
info->point_size = TRUE;
if (debug_get_bool_option("NINE_TGSI_DUMP", FALSE)) {
unsigned count;
const struct tgsi_token *toks = ureg_get_tokens(tx->ureg, &count);


@@ -347,14 +347,13 @@ update_constants(struct NineDevice9 *device, unsigned shader_type)
const int *const_i;
const BOOL *const_b;
uint32_t data_b[NINE_MAX_CONST_B];
uint32_t b_true;
uint16_t dirty_i;
uint16_t dirty_b;
const unsigned usage = PIPE_TRANSFER_WRITE | PIPE_TRANSFER_DISCARD_RANGE;
unsigned x = 0; /* silence warning */
unsigned i, c, n;
const struct nine_lconstf *lconstf;
struct nine_range *r, *p;
unsigned i, c;
struct nine_range *r, *p, *lconstf_ranges;
float *lconstf_data;
box.y = 0;
box.z = 0;
@@ -381,9 +380,10 @@ update_constants(struct NineDevice9 *device, unsigned shader_type)
dirty_b = device->state.changed.vs_const_b;
device->state.changed.vs_const_b = 0;
const_b = device->state.vs_const_b;
b_true = device->vs_bool_true;
lconstf = &device->state.vs->lconstf;
lconstf_ranges = device->state.vs->lconstf.ranges;
lconstf_data = device->state.vs->lconstf.data;
device->state.ff.clobber.vs_const = TRUE;
device->state.changed.group &= ~NINE_STATE_VS_CONST;
} else {
@@ -406,9 +406,10 @@ update_constants(struct NineDevice9 *device, unsigned shader_type)
dirty_b = device->state.changed.ps_const_b;
device->state.changed.ps_const_b = 0;
const_b = device->state.ps_const_b;
b_true = device->ps_bool_true;
lconstf = &device->state.ps->lconstf;
lconstf_ranges = NULL;
lconstf_data = NULL;
device->state.ff.clobber.ps_const = TRUE;
device->state.changed.group &= ~NINE_STATE_PS_CONST;
}
@@ -420,11 +421,10 @@ update_constants(struct NineDevice9 *device, unsigned shader_type)
i = ffs(dirty_b) - 1;
x = buf->width0 - (NINE_MAX_CONST_B - i) * 4;
c -= i;
for (n = 0; n < c; ++n, ++i)
data_b[n] = const_b[i] ? b_true : 0;
memcpy(data_b, &(const_b[i]), c * sizeof(uint32_t));
box.x = x;
box.width = n * 4;
DBG("upload ConstantB [%u .. %u]\n", x, x + n - 1);
box.width = c * 4;
DBG("upload ConstantB [%u .. %u]\n", x, x + c - 1);
pipe->transfer_inline_write(pipe, buf, 0, usage, &box, data_b, 0, 0);
}
@@ -455,14 +455,14 @@ update_constants(struct NineDevice9 *device, unsigned shader_type)
}
/* TODO: only upload these when shader itself changes */
if (lconstf->ranges) {
if (lconstf_ranges) {
unsigned n = 0;
struct nine_range *r = lconstf->ranges;
struct nine_range *r = lconstf_ranges;
while (r) {
box.x = r->bgn * 4 * sizeof(float);
n += r->end - r->bgn;
box.width = (r->end - r->bgn) * 4 * sizeof(float);
data = &lconstf->data[4 * n];
data = &lconstf_data[4 * n];
pipe->transfer_inline_write(pipe, buf, 0, usage, &box, data, 0, 0);
r = r->next;
}
@@ -491,19 +491,16 @@ update_vs_constants_userbuf(struct NineDevice9 *device)
if (state->changed.vs_const_b) {
int *idst = (int *)&state->vs_const_f[4 * device->max_vs_const_f];
uint32_t *bdst = (uint32_t *)&idst[4 * NINE_MAX_CONST_I];
int i;
for (i = 0; i < NINE_MAX_CONST_B; ++i)
bdst[i] = state->vs_const_b[i] ? device->vs_bool_true : 0;
memcpy(bdst, state->vs_const_b, sizeof(state->vs_const_b));
state->changed.vs_const_b = 0;
}
#ifdef DEBUG
if (device->state.vs->lconstf.ranges) {
/* TODO: Can we make it so that we don't have to copy everything ? */
const struct nine_lconstf *lconstf = &device->state.vs->lconstf;
const struct nine_range *r = lconstf->ranges;
unsigned n = 0;
float *dst = (float *)MALLOC(cb.buffer_size);
float *dst = device->state.vs_lconstf_temp;
float *src = (float *)cb.user_buffer;
memcpy(dst, src, cb.buffer_size);
while (r) {
@@ -515,15 +512,9 @@ update_vs_constants_userbuf(struct NineDevice9 *device)
}
cb.user_buffer = dst;
}
#endif
pipe->set_constant_buffer(pipe, PIPE_SHADER_VERTEX, 0, &cb);
#ifdef DEBUG
if (device->state.vs->lconstf.ranges)
FREE((void *)cb.user_buffer);
#endif
if (device->state.changed.vs_const_f) {
struct nine_range *r = device->state.changed.vs_const_f;
struct nine_range *p = r;
@@ -557,39 +548,12 @@ update_ps_constants_userbuf(struct NineDevice9 *device)
if (state->changed.ps_const_b) {
int *idst = (int *)&state->ps_const_f[4 * device->max_ps_const_f];
uint32_t *bdst = (uint32_t *)&idst[4 * NINE_MAX_CONST_I];
int i;
for (i = 0; i < NINE_MAX_CONST_B; ++i)
bdst[i] = state->ps_const_b[i] ? device->ps_bool_true : 0;
memcpy(bdst, state->ps_const_b, sizeof(state->ps_const_b));
state->changed.ps_const_b = 0;
}
#ifdef DEBUG
if (device->state.ps->lconstf.ranges) {
/* TODO: Can we make it so that we don't have to copy everything ? */
const struct nine_lconstf *lconstf = &device->state.ps->lconstf;
const struct nine_range *r = lconstf->ranges;
unsigned n = 0;
float *dst = (float *)MALLOC(cb.buffer_size);
float *src = (float *)cb.user_buffer;
memcpy(dst, src, cb.buffer_size);
while (r) {
unsigned p = r->bgn;
unsigned c = r->end - r->bgn;
memcpy(&dst[p * 4], &lconstf->data[n * 4], c * 4 * sizeof(float));
n += c;
r = r->next;
}
cb.user_buffer = dst;
}
#endif
pipe->set_constant_buffer(pipe, PIPE_SHADER_FRAGMENT, 0, &cb);
#ifdef DEBUG
if (device->state.ps->lconstf.ranges)
FREE((void *)cb.user_buffer);
#endif
if (device->state.changed.ps_const_f) {
struct nine_range *r = device->state.changed.ps_const_f;
struct nine_range *p = r;
@@ -1030,9 +994,10 @@ static const DWORD nine_samp_state_defaults[NINED3DSAMP_LAST + 1] =
[NINED3DSAMP_SHADOW] = 0
};
void
nine_state_set_defaults(struct nine_state *state, const D3DCAPS9 *caps,
nine_state_set_defaults(struct NineDevice9 *device, const D3DCAPS9 *caps,
boolean is_reset)
{
struct nine_state *state = &device->state;
unsigned s;
/* Initialize defaults.
@@ -1053,9 +1018,9 @@ nine_state_set_defaults(struct nine_state *state, const D3DCAPS9 *caps,
}
if (state->vs_const_f)
memset(state->vs_const_f, 0, NINE_MAX_CONST_F * 4 * sizeof(float));
memset(state->vs_const_f, 0, device->vs_const_size);
if (state->ps_const_f)
memset(state->ps_const_f, 0, NINE_MAX_CONST_F * 4 * sizeof(float));
memset(state->ps_const_f, 0, device->ps_const_size);
/* Cap dependent initial state:
*/


@@ -144,6 +144,7 @@ struct nine_state
float *vs_const_f;
int vs_const_i[NINE_MAX_CONST_I][4];
BOOL vs_const_b[NINE_MAX_CONST_B];
float *vs_lconstf_temp;
uint32_t vs_key;
struct NinePixelShader9 *ps;
@@ -218,7 +219,7 @@ struct NineDevice9;
boolean nine_update_state(struct NineDevice9 *, uint32_t group_mask);
void nine_state_set_defaults(struct nine_state *, const D3DCAPS9 *,
void nine_state_set_defaults(struct NineDevice9 *, const D3DCAPS9 *,
boolean is_reset);
void nine_state_clear(struct nine_state *, const boolean device);


@@ -72,9 +72,10 @@ NinePixelShader9_ctor( struct NinePixelShader9 *This,
This->sampler_mask = info.sampler_mask;
This->rt_mask = info.rt_mask;
This->const_used_size = info.const_used_size;
if (info.const_used_size == ~0)
This->const_used_size = NINE_CONSTBUF_SIZE(device->max_ps_const_f);
This->lconstf = info.lconstf;
/* no constant relative addressing for ps */
assert(info.const_used_size != ~0);
assert(info.lconstf.data == NULL);
assert(info.lconstf.ranges == NULL);
return D3D_OK;
}
@@ -101,9 +102,6 @@ NinePixelShader9_dtor( struct NinePixelShader9 *This )
if (This->byte_code.tokens)
FREE((void *)This->byte_code.tokens); /* const_cast */
FREE(This->lconstf.data);
FREE(This->lconstf.ranges);
NineUnknown_dtor(&This->base);
}


@@ -41,8 +41,6 @@ struct NinePixelShader9
unsigned const_used_size; /* in bytes */
struct nine_lconstf lconstf;
uint16_t sampler_mask;
uint16_t sampler_mask_shadow;
uint8_t rt_mask;


@@ -43,8 +43,8 @@ NineStateBlock9_ctor( struct NineStateBlock9 *This,
This->type = type;
This->state.vs_const_f = MALLOC(pParams->device->constbuf_vs->width0);
This->state.ps_const_f = MALLOC(pParams->device->constbuf_ps->width0);
This->state.vs_const_f = MALLOC(This->base.device->vs_const_size);
This->state.ps_const_f = MALLOC(This->base.device->ps_const_size);
if (!This->state.vs_const_f || !This->state.ps_const_f)
return E_OUTOFMEMORY;


@@ -38,6 +38,8 @@
#define DBG_CHANNEL DBG_SURFACE
#define is_ATI1_ATI2(format) (format == PIPE_FORMAT_RGTC1_UNORM || format == PIPE_FORMAT_RGTC2_UNORM)
HRESULT
NineSurface9_ctor( struct NineSurface9 *This,
struct NineUnknownParams *pParams,
@@ -150,14 +152,22 @@ struct pipe_surface *
NineSurface9_CreatePipeSurface( struct NineSurface9 *This, const int sRGB )
{
struct pipe_context *pipe = This->pipe;
struct pipe_screen *screen = pipe->screen;
struct pipe_resource *resource = This->base.resource;
struct pipe_surface templ;
enum pipe_format srgb_format;
assert(This->desc.Pool == D3DPOOL_DEFAULT ||
This->desc.Pool == D3DPOOL_MANAGED);
assert(resource);
templ.format = sRGB ? util_format_srgb(resource->format) : resource->format;
srgb_format = util_format_srgb(resource->format);
if (sRGB && srgb_format != PIPE_FORMAT_NONE &&
screen->is_format_supported(screen, srgb_format,
resource->target, 0, resource->bind))
templ.format = srgb_format;
else
templ.format = resource->format;
templ.u.tex.level = This->level;
templ.u.tex.first_layer = This->layer;
templ.u.tex.last_layer = This->layer;
@@ -374,10 +384,19 @@ NineSurface9_LockRect( struct NineSurface9 *This,
if (This->data) {
DBG("returning system memory\n");
pLockedRect->Pitch = This->stride;
pLockedRect->pBits = NineSurface9_GetSystemMemPointer(This,
box.x, box.y);
/* ATI1 and ATI2 need special handling, because of d3d9 bug.
* We must advertise to the application as if it is uncompressed
* and bpp 8, and the app has a workaround to work with the fact
* that it is actually compressed. */
if (is_ATI1_ATI2(This->base.info.format)) {
pLockedRect->Pitch = This->desc.Height;
pLockedRect->pBits = This->data + box.y * This->desc.Height + box.x;
} else {
pLockedRect->Pitch = This->stride;
pLockedRect->pBits = NineSurface9_GetSystemMemPointer(This,
box.x,
box.y);
}
} else {
DBG("mapping pipe_resource %p (level=%u usage=%x)\n",
resource, This->level, usage);


@@ -467,7 +467,7 @@ NineSwapChain9_dtor( struct NineSwapChain9 *This )
if (This->buffers) {
for (i = 0; i < This->params.BackBufferCount; i++) {
NineUnknown_Destroy(NineUnknown(This->buffers[i]));
NineUnknown_Release(NineUnknown(This->buffers[i]));
ID3DPresent_DestroyD3DWindowBuffer(This->present, This->present_handles[i]);
if (This->present_buffers)
pipe_resource_reference(&(This->present_buffers[i]), NULL);


@@ -47,6 +47,7 @@ NineTexture9_ctor( struct NineTexture9 *This,
struct pipe_screen *screen = pParams->device->screen;
struct pipe_resource *info = &This->base.base.info;
struct pipe_resource *resource;
enum pipe_format pf;
unsigned l;
D3DSURFACE_DESC sfdesc;
HRESULT hr;
@@ -92,9 +93,15 @@ NineTexture9_ctor( struct NineTexture9 *This,
if (Usage & D3DUSAGE_AUTOGENMIPMAP)
Levels = 0;
pf = d3d9_to_pipe_format(Format);
if (Format != D3DFMT_NULL && (pf == PIPE_FORMAT_NONE ||
!screen->is_format_supported(screen, pf, PIPE_TEXTURE_2D, 0, PIPE_BIND_SAMPLER_VIEW))) {
return D3DERR_INVALIDCALL;
}
info->screen = screen;
info->target = PIPE_TEXTURE_2D;
info->format = d3d9_to_pipe_format(Format);
info->format = pf;
info->width0 = Width;
info->height0 = Height;
info->depth0 = 1;
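
The format check added in this hunk follows a simple pattern: translate the format once, reject it early if the translation fails or the screen cannot sample from it, and let the NULL format through. A rough sketch of that pattern, with every type and name below (`enum fmt`, `struct screen`, `validate_format`, the result codes) being a hypothetical stand-in rather than the Gallium or D3D9 definitions:

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the values are illustrative only. */
enum fmt { FMT_NONE = 0, FMT_RGBA8, FMT_EXOTIC };
#define RESULT_OK 0
#define RESULT_INVALIDCALL (-1)

struct screen {
    /* Simplified callback; the real hook takes more parameters. */
    bool (*is_format_supported)(struct screen *s, enum fmt f);
};

/* Example backend that can only sample from FMT_RGBA8. */
static bool only_rgba8(struct screen *s, enum fmt f)
{
    (void)s;
    return f == FMT_RGBA8;
}

/* Reject untranslatable or unsupported formats, unless the caller asked
 * for the NULL format, which is exempt from the check. */
static int validate_format(struct screen *s, enum fmt pf, bool is_null_format)
{
    if (!is_null_format &&
        (pf == FMT_NONE || !s->is_format_supported(s, pf)))
        return RESULT_INVALIDCALL;
    return RESULT_OK;
}
```

With a screen whose callback is `only_rgba8`, `validate_format` accepts `FMT_RGBA8`, rejects `FMT_EXOTIC` and `FMT_NONE`, and accepts `FMT_NONE` when `is_null_format` is true.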


@@ -37,6 +37,8 @@ NineVolumeTexture9_ctor( struct NineVolumeTexture9 *This,
HANDLE *pSharedHandle )
{
struct pipe_resource *info = &This->base.base.info;
struct pipe_screen *screen = pParams->device->screen;
enum pipe_format pf;
unsigned l;
D3DVOLUME_DESC voldesc;
HRESULT hr;
@@ -57,9 +59,19 @@ NineVolumeTexture9_ctor( struct NineVolumeTexture9 *This,
if (Usage & D3DUSAGE_AUTOGENMIPMAP)
Levels = 0;
pf = d3d9_to_pipe_format(Format);
if (pf == PIPE_FORMAT_NONE ||
!screen->is_format_supported(screen, pf, PIPE_TEXTURE_3D, 0, PIPE_BIND_SAMPLER_VIEW)) {
return D3DERR_INVALIDCALL;
}
/* We support ATI1 and ATI2 hacks only for 2D textures */
if (Format == D3DFMT_ATI1 || Format == D3DFMT_ATI2)
return D3DERR_INVALIDCALL;
info->screen = pParams->device->screen;
info->target = PIPE_TEXTURE_3D;
info->format = d3d9_to_pipe_format(Format);
info->format = pf;
info->width0 = Width;
info->height0 = Height;
info->depth0 = Depth;


@@ -706,6 +706,11 @@ static void slice_header(vid_dec_PrivateType *priv, struct vl_rbsp *rbsp,
if (pic_order_cnt_lsb != priv->codec_data.h264.pic_order_cnt_lsb)
vid_dec_h264_EndFrame(priv);
if (IdrPicFlag) {
priv->codec_data.h264.pic_order_cnt_msb = 0;
priv->codec_data.h264.pic_order_cnt_lsb = 0;
}
if ((pic_order_cnt_lsb < priv->codec_data.h264.pic_order_cnt_lsb) &&
(priv->codec_data.h264.pic_order_cnt_lsb - pic_order_cnt_lsb) >= (max_pic_order_cnt_lsb / 2))
pic_order_cnt_msb = priv->codec_data.h264.pic_order_cnt_msb + max_pic_order_cnt_lsb;
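
The hunk is cut off after the first branch of the picture order count MSB derivation. A self-contained sketch of the whole derivation is given below, assuming the remaining two branches follow section 8.2.1.1 of the H.264 specification; `struct poc_state` and `derive_poc_msb` are hypothetical names, not identifiers from the decoder.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the previous picture's POC state. */
struct poc_state {
    int32_t msb;
    uint32_t lsb;
};

/* Derive pic_order_cnt_msb for the current slice. On an IDR picture the
 * previous state is reset first, as the hunk above does. */
static int32_t
derive_poc_msb(struct poc_state *prev, bool idr, uint32_t lsb, uint32_t max_lsb)
{
    if (idr) {
        prev->msb = 0;
        prev->lsb = 0;
    }

    /* LSB wrapped forwards: move to the next MSB period. */
    if (lsb < prev->lsb && (prev->lsb - lsb) >= max_lsb / 2)
        return prev->msb + (int32_t)max_lsb;

    /* LSB wrapped backwards: move to the previous MSB period. */
    if (lsb > prev->lsb && (lsb - prev->lsb) > max_lsb / 2)
        return prev->msb - (int32_t)max_lsb;

    /* No wrap: stay in the same MSB period. */
    return prev->msb;
}
```

Without the IDR reset, stale `msb` and `lsb` values from the previous sequence would feed into the comparison and could trigger a spurious wrap, which is the case the added lines guard against.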


@@ -431,7 +431,7 @@ osmesa_st_framebuffer_validate(struct st_context_iface *stctx,
templat.format = format;
templat.bind = bind;
out[i] = osbuffer->textures[i] =
out[i] = osbuffer->textures[statts[i]] =
screen->resource_create(screen, &templat);
}


@@ -325,6 +325,9 @@ static boolean do_winsys_init(struct radeon_drm_winsys *ws)
&ws->info.max_sclk);
ws->info.max_sclk /= 1000;
radeon_get_drm_value(ws->fd, RADEON_INFO_SI_BACKEND_ENABLED_MASK, NULL,
&ws->info.si_backend_enabled_mask);
ws->num_cpus = sysconf(_SC_NPROCESSORS_ONLN);
/* Generation-specific queries. */


@@ -231,6 +231,7 @@ struct radeon_info {
boolean si_tile_mode_array_valid;
uint32_t si_tile_mode_array[32];
uint32_t si_backend_enabled_mask;
boolean cik_macrotile_mode_array_valid;
uint32_t cik_macrotile_mode_array[16];


@@ -137,7 +137,9 @@ glsl_test_SOURCES = \
test.cpp \
test_optpass.cpp
glsl_test_LDADD = libglsl.la
glsl_test_LDADD = \
libglsl.la \
$(PTHREAD_LIBS)
# We write our own rules for yacc and lex below. We'd rather use automake,
# but automake makes it especially difficult for a number of reasons:


@@ -724,6 +724,10 @@ builtin_variable_generator::generate_constants()
add_const("gl_MaxCombinedImageUniforms",
state->Const.MaxCombinedImageUniforms);
}
if (state->is_version(410, 0) ||
state->ARB_viewport_array_enable)
add_const("gl_MaxViewports", state->Const.MaxViewports);
}


@@ -134,6 +134,9 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct gl_context *_ctx,
this->Const.MaxFragmentImageUniforms = ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms;
this->Const.MaxCombinedImageUniforms = ctx->Const.MaxCombinedImageUniforms;
/* ARB_viewport_array */
this->Const.MaxViewports = ctx->Const.MaxViewports;
this->current_function = NULL;
this->toplevel_ir = NULL;
this->found_return = false;


@@ -343,6 +343,9 @@ struct _mesa_glsl_parse_state {
unsigned MaxGeometryImageUniforms;
unsigned MaxFragmentImageUniforms;
unsigned MaxCombinedImageUniforms;
/* ARB_viewport_array */
unsigned MaxViewports;
} Const;
/**


@@ -835,9 +835,11 @@ varying_matches::record(ir_variable *producer_var, ir_variable *consumer_var)
* regardless of where they appear. We can trivially satisfy that
* requirement by changing the interpolation type to flat here.
*/
producer_var->data.centroid = false;
producer_var->data.sample = false;
producer_var->data.interpolation = INTERP_QUALIFIER_FLAT;
if (producer_var) {
producer_var->data.centroid = false;
producer_var->data.sample = false;
producer_var->data.interpolation = INTERP_QUALIFIER_FLAT;
}
if (consumer_var) {
consumer_var->data.centroid = false;


@@ -2746,6 +2746,21 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
if (last >= 0 && last < MESA_SHADER_FRAGMENT) {
gl_shader *const sh = prog->_LinkedShaders[last];
if (first == MESA_SHADER_GEOMETRY) {
/* There was no vertex shader, but we still have to assign varying
* locations for use by geometry shader inputs in SSO.
*
* If the shader is not separable (i.e., prog->SeparateShader is
* false), linking will have already failed when first is
* MESA_SHADER_GEOMETRY.
*/
if (!assign_varying_locations(ctx, mem_ctx, prog,
NULL, sh,
num_tfeedback_decls, tfeedback_decls,
prog->Geom.VerticesIn))
goto done;
}
if (num_tfeedback_decls != 0 || prog->SeparateShader) {
/* There was no fragment shader, but we still have to assign varying
* locations for use by transform feedback.

Some files were not shown because too many files have changed in this diff.