Compare commits

...

158 Commits

Author SHA1 Message Date
Emil Velikov
cb154bb221 docs: Add sha256 sums for the 10.4.7 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-21 00:50:13 +00:00
Emil Velikov
d26f3c1f86 Add release notes for the 10.4.7 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-21 00:26:27 +00:00
Emil Velikov
b7b218f3f6 Update version to 10.4.7
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-21 00:19:39 +00:00
Marek Olšák
832c94a55c radeonsi: increase coords array size for radeon_llvm_emit_prepare_cube_coords
radeon_llvm_emit_prepare_cube_coords uses coords[4] in some cases (TXB2 etc.)

Discovered by Coverity. Reported by Ilia Mirkin.

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit a984abdad3)
2015-03-18 21:49:33 +00:00
Mario Kleiner
70832be2f1 glx: Handle out-of-sequence swap completion events correctly. (v2)
The code for emitting INTEL_swap_events swap completion
events needs to translate from 32-Bit sbc on the wire to
64-Bit sbc for the events and handle wraparound accordingly.

It assumed that events would be sent by the server in the
order their corresponding swap requests were emitted from
the client, iow. sbc count should be always increasing. This
was correct for DRI2.

This is not always the case under the DRI3/Present backend,
where the Present extension can execute presents and send out
completion events in a different order than the submission
order of the present requests, due to client code specifying
targetMSC target vblank counts which are not strictly
monotonically increasing. This confused the wraparound
handling. This patch fixes the problem by handling 32-Bit
wraparound in both directions. As long as successive swap
completion events real 64-Bit sbc's don't differ by more
than 2^30, this should be able to do the right thing.

How this is supposed to work:

awire->sbc contains the low 32-Bits of the true 64-Bit sbc
of the current swap event, transmitted over the wire.

glxDraw->lastEventSbc contains the low 32-Bits of the 64-Bit
sbc of the most recently processed swap event.

glxDraw->eventSbcWrap is a 64-Bit offset which tracks the upper
32-Bits of the current sbc. The final 64-Bit output sbc
aevent->sbc is computed from the sum of awire->sbc and
glxDraw->eventSbcWrap.

Under DRI3/Present, swap completion events can be received
slightly out of order due to non-monotic targetMsc specified
by client code, e.g., present request submission:

Submission sbc:   1   2   3
targetMsc:        10  11  9

Reception of completion events:
Completion sbc:   3   1   2

The completion sequence 3, 1, 2 would confuse the old wraparound
handling made for DRI2 as 1 < 3 --> Assumes a 32-Bit wraparound
has happened when it hasn't.

The client can queue multiple present requests, in the case of
Mesa up to n requests for n-buffered rendering, e.g., n =  2-4 in
the current Mesa GLX DRI3/Present implementation. In the case of
direct Pixmap presents via xcb_present_pixmap() the number n is
limited by the amount of memory available.

We reasonably assume that the number of outstanding requests n is
much less than 2 billion due to memory contraints and common sense.
Therefore while the order of received sbc's can be a bit scrambled,
successive 64-Bit sbc's won't deviate by much, a given sbc may be
a few counts lower or higher than the previous received sbc.

Therefore any large difference between the incoming awire->sbc and
the last recorded glxDraw->lastEventSbc will be due to 32-Bit
wraparound and we need to adapt glxDraw->eventSbcWrap accordingly
to adjust the upper 32-Bits of the sbc.

Two cases, correponding to the two if-statements in the patch:

a) Previous sbc event was below the last 2^32 boundary, in the previous
glxDraw->eventSbcWrap epoch, the new sbc event is in the next 2^32
epoch, therefore the low 32-Bit awire->sbc wrapped around to zero,
or close to zero --> awire->sbc is apparently much lower than the
glxDraw->lastEventSbc recorded for the previous epoch

--> We need to increment glxDraw->eventSbcWrap by 2^32 to adjust
the current epoch to be one higher than the previous one.

--> Case a) also handles the old DRI2 behaviour.

b) Previous sbc event was above closest 2^32 boundary, but now a
late event from the previous 2^32 epoch arrives, with a true sbc
that belongs to the previous 2^32 segment, so the awire->sbc of
this late event has a high count close to 2^32, whereas
glxDraw->lastEventSbc is closer to zero --> awire->sbc is much
greater than glXDraw->lastEventSbc.

--> We need to decrement glxDraw->eventSbcWrap by 2^32 to adjust
the current epoch back to the previous lower epoch of this late
completion event.

We assume such a wraparound to a higher (a) epoch or lower (b)
epoch has happened if awire->sbc and glxDraw->lastEventSbc differ
by more than 2^30 counts, as such a difference can only happen
on wraparound, or if somehow 2^30 present requests would be pending
for a given drawable inside the server, which is rather unlikely.

v2: Explain the reason for this patch and the new wraparound handling
    much more extensive in commit message, no code change wrt. initial
    version.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit cc5ddd584d)
2015-03-18 21:49:25 +00:00
Emil Velikov
ad259df2e0 auxiliary/os: fix the android build - s/drm_munmap/os_munmap/
Squash this silly typo introduced with commit c63eb5dd5ec(auxiliary/os: get
the mmap/munmap wrappers working with android)

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 55f0c0a29f)
2015-03-18 21:49:18 +00:00
Emil Velikov
df2db2a55f loader: include <sys/stat.h> for non-sysfs builds
Required by fstat(), otherwise we'll error out due to implicit function
declaration.

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89530
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reported-by: Vadim Rutkovsky <vrutkovs@redhat.com>
Tested-by: Vadim Rutkovsky <vrutkovs@redhat.com>
(cherry picked from commit 771cd266b9)
2015-03-18 21:49:05 +00:00
Rob Clark
0506f69f08 freedreno: update generated headers
Fix a3xx texture layer-size.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e92bc6b38e)
[Emil Velikov: sqush trivial conflicts, drop the a4xx.xml.h changes]

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/gallium/drivers/freedreno/a2xx/a2xx.xml.h
	src/gallium/drivers/freedreno/a3xx/a3xx.xml.h
	src/gallium/drivers/freedreno/a4xx/a4xx.xml.h
	src/gallium/drivers/freedreno/adreno_common.xml.h
	src/gallium/drivers/freedreno/adreno_pm4.xml.h
2015-03-18 21:48:40 +00:00
Ilia Mirkin
a563045009 freedreno: fix slice pitch calculations
For example if width were 65, the first slice would get 96 while the
second would get 32. However the hardware appears to expect the second
pitch to be 64, based on halving the 96 (and aligning up to 32).

This fixes texelFetch piglit tests on a3xx below a certain size. Going
higher they break again, but most likely due to unrelated reasons.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 620e29b748)
2015-03-18 21:32:21 +00:00
Samuel Iglesias Gonsalvez
b2e243f70c glsl: optimize (0 cmp x + y) into (-x cmp y).
The optimization done by commit 34ec1a24d did not take it into account.

Fixes:

dEQP-GLES3.functional.shaders.random.all_features.fragment.20

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b43bbfa90a)
2015-03-18 21:15:35 +00:00
Iago Toral Quiroga
8c25b0f2d1 i965: Fix out-of-bounds accesses into pull_constant_loc array
The piglit test glsl-fs-uniform-array-loop-unroll.shader_test was designed
to do an out of bounds access into an uniform array to make sure that we
handle that situation gracefully inside the driver, however, as Ken describes
in bug 79202, Valgrind reports that this is leading to an out-of-bounds access
in fs_visitor::demote_pull_constants().

Before accessing the pull_constant_loc array we should make sure that
the uniform we are trying to access is valid.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79202
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 6ac1bc90c4)
Nominated-by: Matt Turner <mattst88@gmail.com>
2015-03-11 17:46:03 +00:00
Rob Clark
a91ee1e187 freedreno/ir3: fix silly typo for binning pass shaders
Was resulting in gl_PointSize write being optimized out, causing
particle system type shaders to hang if hw binning enabled.

Fixes neverball, OGLES2ParticleSystem, etc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 60096ed906)
2015-03-11 17:44:38 +00:00
Marek Olšák
977626f10a r300g: fix sRGB->sRGB blits
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c939231e72)
2015-03-11 17:42:52 +00:00
Marek Olšák
b451a2ffbf r300g: fix a crash when resolving into an sRGB texture
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9953586af2)
2015-03-11 17:42:38 +00:00
Marek Olšák
a561eee82c r300g: fix RGTC1 and LATC1 SNORM formats
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 74a757f92f)
2015-03-11 17:42:07 +00:00
Stefan Dösinger
80ef80d087 r300g: Fix the ATI1N swizzle (RGTC1 and LATC1)
This fixes the GL_COMPRESSED_RED_RGTC1 part of piglit's rgtc-teximage-01
test as well as the precision part of Wine's 3dc format test (fd.o bug
89156).

The Z component seems to contain a lower precision version of the
result, probably a temporary value from the decompression computation.
The Y and W component contain different data that depends on the input
values as well, but I could not make sense of them (Not that I tried
very hard).

GL_COMPRESSED_SIGNED_RED_RGTC1 still seems to have precision problems in
piglit, and both formats are affected by a compiler bug if they're
sampled by the shader with a swizzle other than .xyzw. Wine uses .xxxx,
which returns random garbage.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89156
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f710b99071)
2015-03-11 17:41:43 +00:00
Ilia Mirkin
fa8bfb3ed1 freedreno/ir3: get the # of miplevels from getinfo
This fixes ARB_texture_query_levels to actually return the desired
value.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cb3eb43ad6)
2015-03-11 17:41:32 +00:00
Ilia Mirkin
025cf8cb3f freedreno/ir3: fix array count returned by TXQ
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8ac957a51c)
2015-03-11 17:41:20 +00:00
Ilia Mirkin
4db4f70546 freedreno: move fb state copy after checking for size change
Fixes: 1f3ca56b ("freedreno: use util_copy_framebuffer_state()")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f3dfe6513c)
2015-03-11 17:40:59 +00:00
Andrey Sudnik
d4a95ffcda i965/vec4: Don't lose the saturate modifier in copy propagation.
Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89224
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0dfec59a27)
2015-03-07 16:41:16 +00:00
Emil Velikov
97b0219ed5 mesa: rename format_info.c to format_info.h
The file is auto-generated, and #included by formats.c. Let's rename it
to reflect the latter. This will also help up fix the dependency
tracking by adding it to the _SOURCES variable, without the side effect
of it being compiled (twice).

v2: Update .gitignore to reflect the rename.

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 3f6c28f2a9)

Conflicts:
	src/mesa/Makefile.am
	src/mesa/main/.gitignore
2015-03-07 16:40:27 +00:00
Matt Turner
93273f16af r300g: Check return value of snprintf().
Would have at least prevented the crash the previous patch fixed.

Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540970
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit ade0b580e7)
2015-03-07 16:37:22 +00:00
Matt Turner
8e8d215cae r300g: Use PATH_MAX instead of limiting ourselves to 100 chars.
When built with Gentoo's package manager, the Mesa source directory
exists seven directories deep. The path to the .test file is too long
and is silently truncated, leading to a crash. Just use PATH_MAX.

Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540970
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit f5e2aa1324)
2015-03-07 16:37:15 +00:00
Daniel Stone
1a929baa0b egl: Take alpha bits into account when selecting GBM formats
This fixes piglit when using PIGLIT_PLATFORM=gbm

Tom Stellard:
  - Fix ARGB2101010 format

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
(cherry picked from commit 65c8965d03)
2015-03-07 16:37:04 +00:00
Marc-Andre Lureau
3a625d0b3f gallium/auxiliary/indices: fix start param
Since commit 28f3f8d, indices generator take a start parameter. However, some
index values have been left to start at 0.

This fixes the glean/fbo test with the virgl driver, and copytexsubimage
with freedreno.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 073a5d2e84)
2015-03-07 16:36:47 +00:00
Emil Velikov
944ef59b2f cherry-ignore: add not applicable/rejected commits
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-07 16:36:05 +00:00
Emil Velikov
fc9dd495b2 docs: Add sha256 sums for the 10.4.6 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 19:44:55 +00:00
Emil Velikov
542a754524 Add release notes for the 10.4.6 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 19:23:34 +00:00
Emil Velikov
e559d126f9 Update version to 10.4.6
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 19:16:58 +00:00
Emil Velikov
fc5881ad73 Revert "gallivm: Update for RTDyldMemoryManager becoming an unique_ptr."
This reverts commit 66a3f104a5.

The commit is likely insufficient for normal work with LLVM 3.6.
The full discussion and reason can be found at
http://lists.freedesktop.org/archives/mesa-dev/2015-March/078795.html
2015-03-06 19:16:28 +00:00
Emil Velikov
9508ca24f1 mesa: cherry-pick the second half of commit 2aa71e9485
Missed out by commit 39ae85732d2(mesa: Fix error validating args for
TexSubImage3D)

Reported-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 19:16:19 +00:00
Matt Turner
644bbf88ec mesa: Correct backwards NULL check.
Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 491d42135a)
2015-03-06 18:45:13 +00:00
Ian Romanick
a369361f9e mesa: Always generate GL_INVALID_OPERATION in _mesa_GetProgramBinary
There are no binary formats supported, so what are you doing?  At least
this gives the application developer some feedback about what's going
on.  The spec gives no guidance about what to do in this scenario.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Leight Bade <leith@mapbox.com>
(cherry picked from commit f591712efe)
2015-03-06 18:44:52 +00:00
Ian Romanick
f1663a5236 mesa: Ensure that length is set to zero in _mesa_GetProgramBinary
v2: Fix assignment of length.  Noticed by Julien Cristau.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Leight Bade <leith@mapbox.com>
(cherry picked from commit 4fd8b30123)
2015-03-06 18:44:37 +00:00
Ian Romanick
e1b5bc9330 mesa: Add missing error checks in _mesa_ProgramBinary
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Leight Bade <leith@mapbox.com>
(cherry picked from commit 201b9c1818)

Conflicts:
	src/mesa/main/shaderapi.c
2015-03-06 18:42:51 +00:00
Emil Velikov
93edf3e7dc Revert "mesa: Correct backwards NULL check."
This reverts commit a598a9bdfe.

The patch was applied without the required dependencies.
2015-03-06 18:40:09 +00:00
José Fonseca
66a3f104a5 gallivm: Update for RTDyldMemoryManager becoming an unique_ptr.
Trivial.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=86958

(cherry picked from commit ef7e0b39a2)
Nominated-by: Sedat Dilek <sedat.dilek@gmail.com>
2015-03-04 01:51:36 +00:00
Abdiel Janulgue
afa7a851da st/mesa: For vertex shaders, don't emit saturate when SM 3.0 is unsupported
There is a bug in the current lowering pass implementation where we lower saturate
to clamp only for vertex shaders on drivers supporting SM 3.0. The correct behavior
is to actually lower to clamp only when we don't support saturate which happens
on drivers that don't support SM 3.0

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit 49e0431211)
Nominated-by: Matt Turner <mattst88@gmail.com>
2015-03-04 01:51:36 +00:00
Abdiel Janulgue
d880aa573c glsl: Don't optimize min/max into saturate when EmitNoSat is set
v3: Fix multi-line comment format (Ian)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit 4ea8c8d56c)
2015-03-04 01:51:36 +00:00
Matt Turner
741aeba26f i965/fs: Don't use backend_visitor::instructions after creating the CFG.
This is a fix for a regression introduced in commit a9f8296d ("i965/fs:
Preserve the CFG in a few more places.").

The errata this code works around is described in a comment before the function:

   "[DevBW, DevCL] Errata: A destination register from a send can not be
    used as a destination register until after it has been sourced by an
    instruction with a different destination register.

The framebuffer write's sources must be in message registers, which SEND
instructions cannot have as a destination. There's no way for this
errata to affect anything at the end of the program. Just remove the
code.

Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84613
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit e214000f25)
2015-03-04 01:51:36 +00:00
Matt Turner
a598a9bdfe mesa: Correct backwards NULL check.
Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 491d42135a)
[Emil Velikov: the patch hunk has a different offset.]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/mesa/main/shaderapi.c
2015-03-04 01:51:36 +00:00
Chris Forbes
0c46d850d9 i965/gs: Check newly-generated GS-out VUE map against correct stage
Previously, we compared our new GS-out VUE map to the existing *VS*-out
VUE map, which is bogus.

This would mostly manifest as redundant dirty flagging where the GS is
in use but the VS and GS output layouts differ; but there is a scary
case where we would fail to flag a GS-out layout change if it happened
to match the VS-out layout.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.5, 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88885
(cherry picked from commit b51ff50a76)
2015-03-04 01:51:36 +00:00
Jonathan Gray
da46b1b160 auxilary/os: correct sysctl use in os_get_total_physical_memory()
The length argument passed to sysctl was the size of the pointer
not the type.  The result of this is sysctl calls would fail on
32 bit BSD/Mac OS X.

Additionally the wrong pointer was passed as an argument to store
the result of the sysctl call.

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 7983a3d2e0)
2015-03-04 01:51:36 +00:00
Matt Turner
7e723c98ce glsl: Rewrite and fix min/max to saturate optimization.
There were some bugs, and the code was really difficult to follow. We
would optimize

   min(max(x, b), 1.0) into max(sat(x), b)

but not pay attention to the order of min/max and also do

   max(min(x, b), 1.0) into max(sat(x), b)

Corrects four shaders from Champions of Regnum that do

   min(max(x, 1), 10)

and corrects rendering of Mass Effect under VMware Workstation.

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89180
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit cb25087c7b)
2015-03-04 01:51:36 +00:00
Andreas Boll
0a51529a28 glx: Fix returned values of GLX_RENDERER_PREFERRED_PROFILE_MESA
If the renderer supports the core profile the query returned incorrectly
0x8 as value, because it was using (1U << __DRI_API_OPENGL_CORE) for the
returned value.

The same happened with the compatibility profile. It returned 0x1
(1U << __DRI_API_OPENGL) instead of 0x2.

Internal DRI defines:
   dri_interface.h: #define __DRI_API_OPENGL       0
   dri_interface.h: #define __DRI_API_OPENGL_CORE  3

Those two bits are supposed for internal usage only and should be
translated to GLX_CONTEXT_CORE_PROFILE_BIT_ARB (0x1) for a preferred
core context profile and GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB (0x2)
for a preferred compatibility context profile.

This patch implements the above translation in the glx module.

v2: Fix the incorrect behavior in the glx module

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 6d164f65c5)
2015-03-04 01:51:36 +00:00
Leo Liu
2a9e9b5aeb st/omx/dec/h264: fix picture out-of-order with poc type 0 v2
poc counter should be reset with IDR frame,
otherwise there would be a re-order issue with
frames before and after IDR

v2: add commit message

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9c7b343bc0)
2015-03-04 01:51:36 +00:00
Emil Velikov
120792fa04 install-lib-links: remove the .install-lib-links file
With earlier commit (install-lib-links: don't depend on .libs directory)
we moved the location of the file from .libs/ to the current dir.
Although we did not attribute that in the former case autotools was
doing us a favour and removing the file. Explicitly remove the file at
clean-local time, otherwise we'll end up with dangling files.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit fece147be5)
2015-03-04 01:51:35 +00:00
Eduardo Lima Mitev
39ae85732d mesa: Fix error validating args for TexSubImage3D
The zoffset and depth values were not being considered when calling
error_check_subtexture_dimensions().

Fixes 2 dEQP tests:
* dEQP-GLES3.functional.negative_api.texture.texsubimage3d_neg_offset
* dEQP-GLES3.functional.negative_api.texture.texsubimage3d_invalid_offset

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedestkop.org>
(cherry picked from commit 2aa71e9485)
[Emil Velikov: Resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/mesa/main/teximage.c
2015-03-04 01:51:35 +00:00
Marek Olšák
61c1aabb9f radeonsi: fix point sprites
Broken by a27b74819a.

This fix is critical and should be ported to stable ASAP.

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7820a11e3d)

Squashed with commit

radeonsi: fix a warning caused by previous commit

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 050bf75c8b)

[Emil Velikov: The file was renamed si_state_{shaders,draw}.c]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_state_shader.c
2015-03-04 01:51:16 +00:00
Marek Olšák
6da4e66d4e vbo: fix an unitialized-variable warning
It looks like a bug to me.

Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 0feb0b7373)
2015-03-04 00:39:01 +00:00
Brian Paul
7e57411b9a st/mesa: fix sampler view reference counting bug in glDraw/CopyPixels
Use pipe_sampler_view_reference() instead of ordinary assignment.
Also add a new sanity check assertion.

Fixes piglit gl-1.0-drawpixels-color-index test crash.  But note
that the test still fails.

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 62a8883f32)
2015-03-04 00:38:31 +00:00
Brian Paul
1e6735ead1 swrast: fix multiple color buffer writing
If a fragment program wrote to more than one color buffer, the
first fragment color got replicated to all dest buffers.  This
fixes 5 piglit FBO tests, including fbo-drawbuffers-arbfp.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45348
Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 89c96afe3c)
2015-03-04 00:38:23 +00:00
Lucas Stach
deea686c71 install-lib-links: don't depend on .libs directory
This snippet can be included in Makefiles that may, depending on the
project configuration, not actually build any installable libraries.

In that case we don't have anything to depend on and this part of
the makefile may be executed before the .libs directory is created,
so do not depend on it being there.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
(cherry picked from commit 5c1aac17ad)
2015-03-04 00:38:11 +00:00
Emil Velikov
41bdeda102 docs: Add sha256 sums for the 10.4.5 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-21 12:31:51 +00:00
Emil Velikov
a5c608e951 Add release notes for the 10.4.5 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-21 12:22:08 +00:00
Emil Velikov
e0276bc297 Update version to 10.4.5
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-21 12:17:35 +00:00
Michel Dänzer
dc16fb1969 Revert "radeon/llvm: enable unsafe math for graphics shaders"
This reverts commit 0e9cdedd2e.

It caused the grass to disappear in The Talos Principle.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89069
Cc: "10.5 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 4db985a5fa)
2015-02-18 12:17:44 +00:00
Kenneth Graunke
aaa823569b glsl: Reduce memory consumption of copy propagation passes.
opt_copy_propagation and opt_copy_propagation_elements create new ACP
and Kill sets each time they enter a new control flow block.  For if
blocks, they also copy the entire existing ACP set contents into the
new set.

When we exit the control flow block, we discard the new sets.  However,
we weren't freeing them - so they lived on until the pass finished.
This can waste a lot of memory (57MB on one pessimal shader).

This patch makes the pass allocate ACP entries using this->acp as the
memory context, and Kill entries out of this->kill.  It also steals
kill entries when moving them from the inner kill list to the parent.

It then frees the lists, including their contents.

v2: Move ralloc_free(this->acp) just before this->acp = orig_acp
    (suggested by Eric Anholt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.5 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 76960a55e6)
2015-02-18 12:17:44 +00:00
Laura Ekstrand
f57b41758d main: Fixed _mesa_GetCompressedTexImage_sw to copy slices correctly.
Previously array textures were not working with GetCompressedTextureImage,
leading to failures in the test
arb_direct_state_access/getcompressedtextureimage.c.

Tested-by: Laura Ekstrand <laura@jlekstrand.net>
Reviewed-by: Brian Paul <brianp@vmware.com>

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 92163482bd)
2015-02-18 12:17:44 +00:00
Marek Olšák
67ac6a3951 radeonsi: fix a crash if a stencil ref state is set before a DSA state
+ minor indentation fixes

Discovered by Axel Davy.

This can't be reproduced with any app, because all state trackers set a DSA
state first.

Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 2ead74888a)
2015-02-18 12:17:44 +00:00
Marek Olšák
5d04b9eeed mesa: fix AtomicBuffer typo in _mesa_DeleteBuffers
Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit e8625a29fe)
2015-02-18 12:17:43 +00:00
Marek Olšák
53041aecef radeonsi: small fix in SPI state
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

(cherry picked from commit a27b74819a)
[Emil Velikov: The file was renamed si_state_{shaders,draw}.c]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
        src/gallium/drivers/radeonsi/si_state_shader.c
2015-02-18 12:14:04 +00:00
Ilia Mirkin
f76bcbb4cd nvc0: allow holes in xfb target lists
Tested with a modified xfb-streams test which outputs to streams 0, 2,
and 3.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 854eb06bee)
2015-02-18 12:09:55 +00:00
Ilia Mirkin
89289934fc st/mesa: treat resource-less xfb buffers as if they weren't there
If a transform feedback buffer's size is 0, st_bufferobj_data doesn't
end up creating a buffer for it. There's no point in trying to write to
such a buffer, so just pretend as if it's not really there.

This fixes arb_gpu_shader5-xfb-streams-without-invocations on nvc0.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 80d373ed5b)
2015-02-18 12:09:54 +00:00
Ilia Mirkin
dbf82d753b nvc0: bail out of 2d blits with non-A8_UNORM alpha formats
This fixes the teximage-colors uploads with GL_ALPHA format and
non-GL_UNSIGNED_BYTE type.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 68e4f3f572)
2015-02-18 12:09:54 +00:00
Emil Velikov
b786e6332b get-pick-list.sh: Require explicit "10.4" for nominating stable patches
A nomination unadorned with a specific version is now interpreted as
being aimed at the 10.5 branch, which was recently opened.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-18 12:09:54 +00:00
Carl Worth
c0ce908a90 Revert use of Mesa IR optimizer for ARB_fragment_programs
Commit f82f2fb3dc added use of the Mesa
IR optimizer for both ARB_fragment_program and ARB_vertex_program, but
only justified the vertex-program portions with measured performance
improvements.

Meanwhile, the optimizer was seen to generate hundreds of unused
immediates without discarding them, causing failures.

Discard the use of the optimizer for now to fix the regression. (In
the future, we anticpate things moving from Mesa IR to NIR for better
optimization anyway.)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82477

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

CC: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 55a57834bf)
2015-02-18 12:09:54 +00:00
Kenneth Graunke
c83c5f4b69 i965: Fix integer border color on Haswell.
+82 Piglits - 100% of border color tests now pass on Haswell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 08a06b6b89)
2015-02-18 12:09:54 +00:00
Kenneth Graunke
f2663112f6 i965: Use a gl_color_union for sampler border color.
This should have no effect, but will make it easier to implement other
bug fixes.

v2: Eliminate "unsigned one" local; just use the value where necessary.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit e1e73443c5)
2015-02-18 12:09:54 +00:00
Kenneth Graunke
2ad93851ff i965: Override swizzles for integer luminance formats.
The hardware's integer luminance formats are completely unusable;
currently we fall back to RGBA.  This means we need to override
the texture swizzle to obtain the XXX1 values expected for luminance
formats.

Fixes spec/EXT_texture_integer/texwrap formats bordercolor [swizzled]
on Broadwell - 100% of border color tests now pass on Broadwell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8cb18760cc)
2015-02-18 12:09:54 +00:00
Michel Dänzer
e35e6773c2 st/mesa: Don't use PIPE_USAGE_STREAM for GL_PIXEL_UNPACK_BUFFER_ARB
The latter currently implies CPU read access, so only PIPE_USAGE_STAGING
can be expected to be fast.

Mesa demos src/tests/streaming_rect on Kaveri (radeonsi):

Unpatched:  42 frames in  1.023 seconds = 41.056 FPS
Patched:   615 frames in  1.000 seconds = 615.000 FPS

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88658
Cc: "10.3 10.4" <mesa-stable@lists.freedestkop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a338dc0186)
2015-02-18 12:09:54 +00:00
Marek Olšák
51bdd19c97 radeonsi: fix instanced arrays with non-zero start instance
Fixes piglit ARB_base_instance/arb_base_instance-drawarrays.

Cc: 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 50908a8918)
2015-02-18 12:09:54 +00:00
Marek Olšák
5c623ff071 r600g,radeonsi: don't append to streamout buffers that haven't been used yet
The FILLED_SIZE counter is uninitialized at the beginning, so we can't use it.
Instead, use offset = 0, which is what we always do when not appending.

This unexpectedly fixes spec/ARB_texture_multisample/sample-position/*.
Yes, the test does use transform feedback.

Cc: 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 658f1d4cfe)
2015-02-18 12:09:53 +00:00
Jeremy Huddleston Sequoia
654f197f19 darwin: build fix
xfont.c:237:14: error: implicit declaration of function 'GetGLXDRIDrawable' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
   glxdraw = GetGLXDRIDrawable(CC->currentDpy, CC->currentDrawable);
             ^
Fixes regression from 291be28476

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
(cherry picked from commit e68b67b53f)
2015-02-11 00:24:04 -08:00
Jeremy Huddleston Sequoia
162cee83ba darwin: build fix
../../../src/mesa/main/compiler.h:47:10: fatal error: 'util/macros.h' file not found

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
(cherry picked from commit 1c67a5687a)
2015-02-10 20:35:33 -08:00
Emil Velikov
54da987bae docs: Add sha256 sums for the 10.4.4 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-07 00:47:18 +00:00
Emil Velikov
62eb27ac8b Add release notes for the 10.4.4 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-07 00:17:09 +00:00
Emil Velikov
a824179af5 Update version to 10.4.4
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-07 00:12:04 +00:00
Park, Jeongmin
fecedb6c43 st/osmesa: Fix osbuffer->textures indexing
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88930
Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 6fd4a61ad6)
2015-02-04 01:37:33 +00:00
Matt Turner
9d1d1f46c7 gallium/util: Don't use __builtin_clrsb in util_last_bit().
Unclear circumstances lead to undefined symbols on x86.

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=536916
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 32e98e8ef0)
2015-02-04 01:37:20 +00:00
José Fonseca
b51d369690 egl: Pass the correct X visual depth to xcb_put_image().
The dri2_x11_add_configs_for_visuals() function happily matches a 32
bits EGLconfig with a 24 bits X visual.  However it was passing 32bits
depth to xcb_put_image(), making X server unhappy:

  https://github.com/apitrace/apitrace/issues/313#issuecomment-70571911

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 11a955aef4)
2015-02-02 00:12:04 +00:00
Niels Ole Salscheider
eab8dc28ed configure: Link against all LLVM targets when building clover
Since 8e7df519bd, we initialise all targets in
clover. This fixes bug 85380.

v2: Mention correct bug in commit message

Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4b94c3fc31)
2015-02-02 00:12:04 +00:00
Ville Syrjälä
cc580045a8 i965: Fix max_wm_threads for CHV
Change max_wm_threads to match the spec on CHV. The max number of
threads in 3DSTATE_PS is always programmed to 64 and the hardware
internally scales that depending on the GT SKU. So this doesn't
change the max number of threads actually used, but it does affect
the scratch space calculation.

On CHV the old value was too small, so the amount of scratch space
allocated wasn't sufficient to satisfy the actual max number of
threads used.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit 99754446ab)
2015-02-02 00:12:04 +00:00
Mario Kleiner
0d721fa1d6 glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)
Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>

v2: Add Frank Binns signed off by for his original earlier
patch from April 2014, which is identical to this one, and
Chris Wilsons reviewed tag from May 2014 for that patch, ergo
also for this one.

v3: Incorporate comment about triple buffering as suggested
by Axel Davy, and reference to relevant spec provided by
Eric Anholt.

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 455d3036fa)
2015-02-02 00:12:04 +00:00
Brian Paul
c96ed76b3d mesa: fix display list 8-byte alignment issue
The _mesa_dlist_alloc() function is only guaranteed to return a pointer
with 4-byte alignment.  On 64-bit systems which don't support unaligned
loads (e.g. SPARC or MIPS) this could lead to a bus error in the VBO code.

The solution is to add a new  _mesa_dlist_alloc_aligned() function which
will return a pointer to an 8-byte aligned address on 64-bit systems.
This is accomplished by inserting a 4-byte NOP instruction in the display
list when needed.

The only place this actually matters is the VBO code where we need to
allocate a 'struct vbo_save_vertex_list' which needs to be 8-byte
aligned (just as if it were malloc'd).

The gears demo and others hit this bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88662
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 53b01938ed)
2015-01-30 08:51:51 -07:00
Emil Velikov
49a5bce780 docs: Add sha256 sums for the 10.4.3 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-24 12:54:33 +00:00
Emil Velikov
e92bfa3f95 Add release notes for the 10.4.3 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-24 12:49:17 +00:00
Emil Velikov
f70e4d4afd Update version to 10.4.3
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-24 12:44:46 +00:00
Axel Davy
42806f12a9 st/nine: Allocate vs constbuf buffer for indirect addressing once.
When the shader does indirect addressing on the constants,
we allocate a temporary constant buffer to which we copy
the constants from the app given user constants and
the constants filled in the shader.

This patch makes this buffer be allocated once.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f8a74410f1)
2015-01-23 00:47:26 +00:00
Axel Davy
4c9b64fc44 st/nine: Allocate the correct size for the user constant buffer
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e0f75044c8)
2015-01-23 00:47:26 +00:00
Axel Davy
69c7cf70e7 st/nine: Add variables containing the size of the constant buffers
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b9cbea9dbc)
2015-01-23 00:47:26 +00:00
Axel Davy
4d04fd0871 st/nine: Fix sm3 relative addressing for non-debug build
Relative addressing needs the constant buffer to get all
the correct constants, even those defined by the shader.

The code to copy the shader constants to the constant buffer
was enabled only for debug build. Enable it always.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit a721987077)
2015-01-23 00:47:25 +00:00
Axel Davy
0727ab961c st/nine: Remove unused code for ps
Since constant indirect adressing is not allowed for ps,
we can remove our code to handle that.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4b7a9cfddb)
2015-01-23 00:47:25 +00:00
Axel Davy
7280ddea9d st/nine: Correct rules for relative adressing and constants.
relative adressing for constants is possible only for vs float
constants.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9690bf33d7)
2015-01-23 00:47:25 +00:00
Axel Davy
425bc89720 st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGB
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bce94ce831)
2015-01-23 00:47:25 +00:00
Axel Davy
0b3f8c72f7 st/nine: Implement TEXDP3TEX
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9e23b64c15)
2015-01-23 00:47:25 +00:00
Axel Davy
63e668eb18 st/nine: Implement TEXDP3
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 09eb1e901f)
2015-01-23 00:47:24 +00:00
Axel Davy
2b4c577730 st/nine: Implement TEXDEPTH
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f19e699368)
2015-01-23 00:47:24 +00:00
Axel Davy
e3a393b4c3 st/nine: Implement TEXM3x3SPEC
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3676ab02fb)
2015-01-23 00:47:24 +00:00
Axel Davy
7ecd0f9528 st/nine: Implement TEXM3x2TEX
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2b9f079ae3)
2015-01-23 00:47:24 +00:00
Axel Davy
336887bca1 st/nine: implement TEXM3x2DEPTH
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fdff111dc8)
2015-01-23 00:47:24 +00:00
Axel Davy
8e08ba6f96 st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC
The fix is that this line:
"src[s] = tx->regs.vT[s];" is wrong if s doesn't start from 0.
Instead access tx->regs.vT directly when needed.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7865210670)

Conflicts:
	src/gallium/state_trackers/nine/nine_shader.c
2015-01-23 00:47:09 +00:00
Axel Davy
77e1136f44 st/nine: Fill missing dst and src number for some instructions.
Not filling them correctly results in bad padding and later crash.

Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b1259544e3)
2015-01-23 00:44:42 +00:00
Axel Davy
22c75f9f5a st/nine: Implement TEXCOORD special behaviours
texcoord for ps < 1_4 should clamp between 0 and 1 the values.

texcrd (texcoord ps 1_4) does not clamp and can be used with
two modifiers _dw and _dz that means the channels are divided
by w or z.
Implement those in shared code, since the same modifiers can be used
for texld ps 1_4.

v2: replace DIV by RCP + MUL
v3: Remove an useless MOV

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5399119fb1)

Conflicts:
	src/gallium/state_trackers/nine/nine_shader.c
2015-01-23 00:43:57 +00:00
Axel Davy
4b65be8860 st/nine: Fix some fixed function pipeline operation
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6378d74937)
2015-01-22 23:43:28 +00:00
Axel Davy
9ea8e7f0df st/nine: Clamp ps 1.X constants
This is wine (and windows) behaviour.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 018407b5d8)
2015-01-22 23:43:28 +00:00
Axel Davy
d0d09a4eee st/nine: Fix CND implementation
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3ca67f8810)
2015-01-22 23:43:27 +00:00
Axel Davy
75f39e45f0 st/nine: Rewrite LOOP implementation, and a0 aL handling
Previous implementation didn't work well with nested loops.

Instead of using several address registers, put a0 and aL
into normal registers, and copy them to one address register when
we need to use them.

Wine tests loop_index_test() and nested_loop_test() now pass correctly.

Fixes r600g crash while loading Bioshock -
bug https://bugs.freedesktop.org/show_bug.cgi?id=85696

Tested-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6a8e5e48be)
2015-01-22 23:43:27 +00:00
Axel Davy
553089093f st/nine: Correct LOG on negative values
We should take the absolute value of the input.

Also return -FLT_MAX instead of -Inf for an input of 0.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c9aa9a0add)
2015-01-22 23:43:27 +00:00
Axel Davy
add30f01ef st/nine: Handle NRM with input of null norm
When the input's xyz are 0.0, the output
should be 0.0. This is due to the fact that
Inf * 0 = 0 for dx9. To handle this case,
cap the result of RSQ to FLT_MAX. We have
FLT_MAX * 0 = 0.

Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f5e8e3fb80)
2015-01-22 23:43:27 +00:00
Axel Davy
0dfb9c9e86 st/nine: Handle RSQ special cases
We should use the absolute value of the input as input to ureg_RSQ.

Moreover, an input of 0.0 should return FLT_MAX.

Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2487f73574)
2015-01-22 23:43:27 +00:00
Axel Davy
7e26cf83ba st/nine: Fix POW implementation
POW doesn't match directly TGSI, since we should
take the absolute value of src0.

Fixes black textures in some games

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c12f8c2088)
2015-01-22 23:43:27 +00:00
Axel Davy
00d22ce0fa st/nine: Fix typo for M4x4
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit e0dd9ca985)
2015-01-22 23:43:26 +00:00
Axel Davy
7f700cc35b st/nine: Correctly declare NineTranslateInstruction_Mkxn inputs
Let's say we have c1 and c2 declared in the shader and c0 given by the app

Then here we would have read c0, c1 and c2 given by the app, instead
of the correct c0, c1, c2.

This correction fixes several issues in some games.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 53dc992f20)
2015-01-22 23:43:26 +00:00
Axel Davy
e6167e749c st/nine: Saturate oFog and oPts vs outputs
According to docs and Wine, these two vs outputs have
to be saturated.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9fb58a74a0)
2015-01-22 23:43:26 +00:00
Axel Davy
bce0058333 st/nine: Remove some shader unused code
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a214838181)
2015-01-22 23:43:26 +00:00
Axel Davy
9a0647ba7f st/nine: Convert integer constants to floats before storing them when cards don't support integers
The shader code is already behaving as if they are floats when the the card doesn't support integers

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d08c7b0b88)
2015-01-22 23:43:26 +00:00
Axel Davy
669c5d6d44 st/nine: Rework of boolean constants
Convert them to shader booleans at earlier stage.
Previous code is fine, but later patch will make
integers being converted at earlier stage, so do
the same for booleans

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d9d18fe39f)
2015-01-22 23:43:26 +00:00
Axel Davy
87ac37074f st/nine: Add ATI1 and ATI2 support
Adds ATI1 and ATI2 support to nine.

They map to PIPE_FORMAT_RGTC1_UNORM and PIPE_FORMAT_RGTC2_UNORM,
but need special handling.

Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 77f0ecf9ce)
2015-01-22 23:43:25 +00:00
Axel Davy
e1bcca4f13 st/nine: Check if srgb format is supported before trying to use it.
According to msdn, we must act as if user didn't ask srgb if we don't
support it.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b0b5430322)
2015-01-22 23:43:25 +00:00
Stanislaw Halik
50ea1c1f5f st/nine: Hack to generate resource if it doesn't exist when getting view
Buffers in the MANAGED pool are supposed to have the content in a ram buffer,
a copy in VRAM if there is enough memory (driver manages memory and decide when
to delete the buffer in VRAM).

This is not implemented properly in nine, and a VRAM copy is going to be created
when the RAM memory is filled, and the VRAM copy will get synced with the RAM
memory updates.

Due to some issues (in the implementation or in app logic), it can happen
we try to create a sampler view of the resource while we haven't created the
VRAM resource. This hack creates the resource when we hit this case, which prevents
crashing, but doesn't help with the resource content.

This fixes several games crashing at launch.

Acked-by: Axel Davy <axel.davy@ens.fr>
Acked-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Stanislaw Halik <sthalik@misaki.pl>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 82810d3b66)
2015-01-22 23:43:25 +00:00
Axel Davy
3ca8b93476 st/nine: NineBaseTexture9: update sampler view creation
While previous code was having the correct behaviour in general,
this new code is more readable (without checking all gallium formats
manually) and has a more defined behaviour for depth stencil resources.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 47280d777d)
2015-01-22 23:43:25 +00:00
Axel Davy
d06b403377 st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of bad format
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 0abfb80dac)
2015-01-22 23:43:12 +00:00
Axel Davy
481af42f28 st/nine: Fix crash when deleting non-implicit swapchain
The implicit swapchains are destroyed when the device instance is
destroyed. However for non-implicit swapchains, it is not the case,
and the application can have kept an reference on the swapchain
buffers to reuse them.

Fixes problems with battle.net launcher.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 0d2c22e648)
2015-01-22 23:41:09 +00:00
Axel Davy
393fffd07d st/nine: CubeTexture: fix GetLevelDesc
This->surfaces contains the surfaces associated to the levels
and faces. This->surfaces[6*Level] is what we want here,
since it gives us a face descriptor for the level 'Level'.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9232161178)
2015-01-22 23:41:08 +00:00
Axel Davy
c159b4095c st/nine: NineBaseTexture9: fix setting of last_layer
Use same similar settings as u_sampler_view_default_template

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 18c7e70226)
2015-01-22 23:41:08 +00:00
Axel Davy
b80b5b35a3 st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS
The cap means D3DFVF_XYZRHW vertices will see clipping.
This is not the case when
PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION is supported, since
it'll disable clipping.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 05e20e1045)
2015-01-22 23:41:08 +00:00
Xavier Bouchoux
41ca03a7b4 st/nine: Fix D3DRS_POINTSPRITE support
It's done by testing the existence of the point sprite output register *after* parsing the vertex shader.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dc88989189)
2015-01-22 23:41:08 +00:00
Axel Davy
18ac34825b st/nine: Add new texture format strings
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d2f2a550cf)
2015-01-22 23:41:07 +00:00
Xavier Bouchoux
15ef84ccfb st/nine: Add missing c++ declaration for IDirect3DVolumeTexture9
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 072e2ba8e1)
2015-01-22 23:41:07 +00:00
Xavier Bouchoux
44ee59d300 st/nine: Additional defines to d3dtypes.h
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8bb550b958)
2015-01-22 23:41:07 +00:00
Jose Fonseca
1e0ab5b826 nine: Drop use of TGSI_OPCODE_CND.
This was the only state tracker emitting it, and hardware was just having
to lower it anyway (or failing to lower it at all).

v2: Extracted from a larger patch by Jose (which also dropped DP2A), fixed
    to actually not reference TGSI_OPCODE_CND.  Change by anholt.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
(cherry picked from commit 925cb75f89)
2015-01-22 23:40:09 +00:00
Jonathan Gray
a3381286d8 glsl: Link glsl_test with pthreads library.
Otherwise pthread_mutex_lock will be an undefined reference
on OpenBSD.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88219
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c5be9c126d)
2015-01-22 22:27:12 +00:00
Kenneth Graunke
882f702441 i965: Work around mysterious Gen4 GPU hangs with minimal state changes.
Gen4 hardware appears to GPU hang frequently when using Chromium, and
also when running 'glmark2 -b ideas'.  Most of the error states contain
3DPRIMITIVE commands in quick succession, with very few state packets
between them - usually VERTEX_BUFFERS/ELEMENTS and CONSTANT_BUFFER.

I trimmed an apitrace of the glmark2 hang down to two draw calls with a
glUniformMatrix4fv call between the two.  Either draw by itself works
fine, but together, they hang the GPU.  Removing the glUniform call
makes the hangs disappear.  In the hardware state, this translates to
removing the CONSTANT_BUFFER packet between the two 3DPRIMITIVE packets.

Flushing before emitting CONSTANT_BUFFER packets also appears to make
the hangs disappear.  I observed a slowdown in glxgears by doing it all
the time, so I've chosen to only do it when BRW_NEW_BATCH and
BRW_NEW_PSP are unset (i.e. we haven't done a CS_URB_STATE change or
already flushed the whole pipeline).

I'd much rather understand the problem, but at this point, I don't see
how we'd ever be able to track it down further.  We have no real tools,
and the hardware people moved on years ago.  I've analyzed 20+ error
states and read every scrap of documentation I could find.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80568
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85367
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c4fd0c9052)
2015-01-22 16:11:03 +00:00
Jason Ekstrand
a25e26f67f mesa: Fix clamping to -1.0 in snorm_to_float
This patch fixes the return of a wrong value when x is lower than
-MAX_INT(src_bits) as the result would not be between [-1.0 1.0].

v2 by Samuel Iglesias <siglesias@igalia.com>:
    - Modify snorm_to_float() to avoid doing the division when
      x == -MAX_INT(src_bits)

Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 7d1b08ac44)
2015-01-17 14:59:56 +00:00
Kenneth Graunke
021d71b848 i965: Respect the no_8 flag on Gen6, not just Gen7+.
When doing repclears, we only want to use the SIMD16 program, not the
SIMD8 one.  Kristian added this to the Gen7+ code, but apparently we
missed it in the Gen6 code.  This patch copies that code over.

Approximately doubles the performance in a clear microbenchmark from
mesa-demos (clearspd -width 500 -height 500 +color) on Sandybridge.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
References: https://code.google.com/p/chrome-os-partner/issues/detail?id=34681
(cherry picked from commit f95733ddb7)

Conflicts:
	src/mesa/drivers/dri/i965/gen6_wm_state.c
2015-01-17 14:59:08 +00:00
Emil Velikov
14f1659b43 docs: Add sha256 sums for the 10.4.2 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-12 10:37:09 +00:00
Emil Velikov
02f2e97c3e Add release notes for the 10.4.2 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-12 10:30:28 +00:00
Emil Velikov
5906dd6c99 Update version to 10.4.2
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-12 10:24:59 +00:00
Dave Airlie
2d05942b74 r600g/sb: implement r600 gpr index workaround. (v3.1)
r600, rv610 and rv630 all have a bug in their GPR indexing
and how the hw inserts access to PV.

If the base index for the src is the same as the dst gpr
in a previous group, then it will use PV instead of using
the indexed gpr correctly.

The workaround is to insert a NOP when you detect this.

v2: add second part of fix detecting DST rel writes followed
by same src base index reads.

v3: forget adding stuff to structs, just iterate over the
previous node group again, makes it more obvious.
v3.1: drop local_nop.

Fixes ~200 piglit regressions on rv635 since SB was introduced.

Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 3c8ef3a74b)
2015-01-07 17:39:52 +00:00
Dave Airlie
099ed78a04 r600g: fix regression since UCMP change
Since d8da6decea where the
state tracker started using UCMP on cayman a number of tests
regressed.

this seems to be r600g is doing CNDGE_INT for UCMP which is >= 0,
we should be doing CNDE_INT with reverse arguments.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0d4272cd8e)
2015-01-07 17:35:39 +00:00
Vadim Girlin
91c5770ba1 r600g/sb: fix issues with loops created for switch
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit de0fd375f6)
2015-01-07 17:31:12 +00:00
Dave Airlie
3306ed6fd7 Revert "r600g/sb: fix issues cause by GLSL switching to loops for switch"
This reverts commit 7b0067d23a.

Vadim's patch fixes this a lot better.

(cherry picked from commit 34e512d9ea)
2015-01-07 17:29:01 +00:00
Marek Olšák
81f8006f7d radeonsi: fix VertexID for OpenGL
This fixes all failing piglit VertexID tests.

Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit d7c6f397f4)
2015-01-07 17:25:06 +00:00
Marek Olšák
1b498cf5b7 st/mesa: fix GL_PRIMITIVE_RESTART_FIXED_INDEX
Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit eaae92a349)
2015-01-07 17:04:21 +00:00
Marek Olšák
8c77be7ef9 vbo: ignore primitive restart if FixedIndex is enabled in DrawArrays
From GL 4.4 Core profile:

  If both PRIMITIVE_RESTART and PRIMITIVE_RESTART_FIXED_INDEX are
  enabled, the index value determined by PRIMITIVE_RESTART_FIXED_INDEX is
  used. If PRIMITIVE_RESTART_FIXED_INDEX is enabled, primitive restart is not
  performed for array elements transferred by any drawing command not taking a
  type parameter, including all of the *Draw* commands other than *DrawEle-
  ments*.

Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 8f5d309521)
2015-01-07 16:51:02 +00:00
Leonid Shatz
ef43d21bbc gallium/util: make sure cache line size is not zero
The "normal" detection (querying clflush size) already made sure it is
non-zero, however another method did not. This lead to crashes if this
value happened to be zero (apparently can happen in virtualized environments
at least).
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=87913

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5fea39ace3)
2015-01-06 16:21:03 +00:00
Roland Scheidegger
ac3ca98a1b gallium/util: fix crash with daz detection on x86
The code used PIPE_ALIGN_VAR for the variable used by fxsave, however this
does not work if the stack isn't aligned. Hence use PIPE_ALIGN_STACK function
decoration to fix the segfault which can happen if stack alignment is only
4 bytes.
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=87658.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b59c7ed0ab)
2015-01-06 16:02:10 +00:00
Ilia Mirkin
af1a690075 nv50/ir: fix texture offsets in release builds
assert's get compiled out in release builds, so they can't be relied
upon to perform logic.

Reported-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Roy Spliet <rspliet@eclipso.eu>
Cc: "10.2 10.3 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fb1afd1ea5)
2015-01-06 15:52:12 +00:00
Chad Versace
fffe533f08 i965: Use safer pointer arithmetic in gather_oa_results()
This patch reduces the likelihood of pointer arithmetic overflow bugs in
gather_oa_results(), like the one fixed by b69c7c5dac.

I haven't yet encountered any overflow bugs in the wild along this
patch's codepath. But I get nervous when I see code patterns like this:

   (void*) + (int) * (int)

I smell 32-bit overflow all over this code.

This patch retypes 'snapshot_size' to 'ptrdiff_t', which should fix any
potential overflow.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 414be86c96)
2015-01-04 21:39:10 +00:00
Chad Versace
4d5e0f78b7 i965: Use safer pointer arithmetic in intel_texsubimage_tiled_memcpy()
This patch reduces the likelihood of pointer arithmetic overflow bugs in
intel_texsubimage_tiled_memcpy() , like the one fixed by b69c7c5dac.

I haven't yet encountered any overflow bugs in the wild along this
patch's codepath. But I recently solved, in commit b69c7c5dac, an overflow
bug in a line of code that looks very similar to pointer arithmetic in
this function.

This patch conceptually applies the same fix as in b69c7c5dac. Instead
of retyping the variables, though, this patch adds some casts. (I tried
to retype the variables as ptrdiff_t, but it quickly got very messy. The
casts are cleaner).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 225a09790d)
2015-01-04 21:39:00 +00:00
Marek Olšák
b9e56ea151 glsl_to_tgsi: fix a bug in copy propagation
This fixes the new piglit test: arb_uniform_buffer_object/2-buffers-bug

Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 48094d0e65)
2015-01-04 21:38:26 +00:00
Kenneth Graunke
e05c595acd i965: Fix start/base_vertex_location for >1 prims but !BRW_NEW_VERTICES.
This is a partial revert of c89306983c.
It split the {start,base}_vertex_location handling into several steps:

1. Set brw->draw.start_vertex_location = prim[i].start
   and brw->draw.base_vertex_location = prim[i].basevertex.
   (This happened once per _mesa_prim, in the main drawing loop.)
2. Add brw->vb.start_vertex_bias and brw->ib.start_vertex_offset
   appropriately.  (This happened in brw_prepare_shader_draw_parameters,
   which was called just after brw_prepare_vertices, as part of state
   upload, and only happened when BRW_NEW_VERTICES was flagged.)
3. Use those values when emitting 3DPRIMITIVE (once per _mesa_prim).

If we drew multiple _mesa_prims, but didn't flag BRW_NEW_VERTICES on
the second (or later) primitives, we would do step #1, but not #2.
The first _mesa_prim would get correct values, but subsequent ones
would only get the first half of the summation.

The reason I originally did this was because I needed the value of
gl_BaseVertexARB to exist in a buffer object prior to uploading
3DSTATE_VERTEX_BUFFERS.  I believed I wanted to upload the value
of 3DPRIMITIVE's "Base Vertex Location" field, which was computed
as: (prims[i].indexed ? prims[i].start : prims[i].basevertex) +
brw->vb.start_vertex_bias.  The latter value wasn't available until
after brw_prepare_vertices, and the former weren't available in the
state upload code at all.  Hence the awkward split.

However, I believe that including brw->vb.start_vertex_bias was a
mistake.  It's an extra bias we apply when uploading vertex data into
VBOs, to move [min_index, max_index] to [0, max_index - min_index].

>From the GL_ARB_shader_draw_parameters specification:
"<gl_BaseVertexARB> holds the integer value passed to the <baseVertex>
 parameter to the command that resulted in the current shader
 invocation.  In the case where the command has no <baseVertex>
 parameter, the value of <gl_BaseVertexARB> is zero."

I conclude that gl_BaseVertexARB should only include the baseVertex
parameter from glDraw*Elements*, not any internal biases we add for
optimization purposes.

With that in mind, gl_BaseVertexARB only needs prim[i].start or
prim[i].basevertex.  We can simply store that, and go back to computing
start_vertex_location and base_vertex_location in brw_emit_prim(), like
we used to.  This is much simpler, and should actually fix two bugs.

Fixes missing geometry in Unvanquished.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85529
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit c633528cba)
2015-01-04 21:38:16 +00:00
Ilia Mirkin
c48d0d8dd2 nv50,nvc0: set vertex id base to index_bias
Fixes the piglits which check that gl_VertexID includes the base vertex
offset:
  arb_draw_indirect-vertexid elements
  gl-3.2-basevertex-vertexid

Note that this leaves out the original G80, for which this will continue
to fail. It could be fixed by passing a driver constbuf value in, but
that's beyond the scope of this change.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit be0311c962)
2015-01-04 21:37:51 +00:00
Tiziano Bacocco
aafd13027a nv50,nvc0: implement half_pixel_center
LAST_LINE_PIXEL has actually been renamed to PIXEL_CENTER_INTEGER in
rnndb; use that method to implement the rasterizer setting, used for
st/nine.

Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 609c3e51f5)
2015-01-04 21:37:32 +00:00
Michel Dänzer
1f42230fa7 radeonsi: Don't modify PA_SC_RASTER_CONFIG register value if rb_mask == 0
E.g. this could happen on older kernels which don't support the
RADEON_INFO_SI_BACKEND_ENABLED_MASK query yet. The code in
si_write_harvested_raster_configs() doesn't deal with this correctly and
would probably mangle the value badly.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit b3057f8097)
2015-01-04 21:34:08 +00:00
Kenneth Graunke
2b85ed72db i965: Add missing BRW_NEW_*_PROG_DATA to texture/renderbuffer atoms.
This was probably missed when moving from a fixed binding table layout
to a dynamic one that changes based on the shader.

Fixes newly proposed Piglit test fbo-mrt-new-bind.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87619
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Mike Stroyan <mike@LunarG.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4616b2ef85)
2015-01-04 21:33:26 +00:00
Emil Velikov
4cd38a592e docs: Add sha256 sums for the 10.4.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-12-30 02:38:02 +00:00
132 changed files with 2322 additions and 703 deletions

View File

@@ -1 +1 @@
10.4.1
10.4.7

View File

@@ -1,2 +1,18 @@
# No whitespace commits in stable.
a10bf5c10caf27232d4df8da74d5c35c23eb883d
a10bf5c10caf27232d4df8da74d5c35c23eb883d
# The following patches address code which is missing in 10.4
# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078515.html
06084652fefe49c3d6bf1b476ff74ff602fdc22a common: Correct texture init for meta pbo uploads and downloads.
# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078547.html
ccc5ce6f72c1ec86be4dfcef96c0b51fba0faa6d common: Correct PBO 2D_ARRAY handling.
# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078549.html
546aba143d13ba3f993ead4cc30b2404abfc0202 common: Fix PBOs for 1D_ARRAY.
# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078501.html
2b2fa1865248c6e3b7baec81c4f92774759b201f mesa: Indent break statements and add a missing one.
# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078502.html
87109acbed9c9b52f33d58ca06d9048d0ac7a215 mesa: Free memory allocated for luminance in readpixels.

View File

@@ -14,7 +14,7 @@ git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)' HEAD..origin/master |\
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*10\.4.*mesa-stable\)' HEAD..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.

View File

@@ -1716,7 +1716,7 @@ if test "x$enable_gallium_llvm" = xyes; then
fi
if test "x$enable_opencl" = xyes; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
LLVM_COMPONENTS="${LLVM_COMPONENTS} all-targets ipo linker instrumentation"
# LLVM 3.3 >= 177971 requires IRReader
if $LLVM_CONFIG --components | grep -qw 'irreader'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} irreader"

View File

@@ -30,7 +30,9 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
TBD
5311285e791a6bfaa468ad002bd1e1164acb3eaa040b5a1bf958bdb7c27e0a9d MesaLib-10.4.1.tar.gz
91e8b71c8aff4cb92022a09a872b1c5d1ae5bfec8c6c84dbc4221333da5bf1ca MesaLib-10.4.1.tar.bz2
e09c8135f5a86ecb21182c6f8959aafd39ae2f98858fdf7c0e25df65b5abcdb8 MesaLib-10.4.1.zip
</pre>
<h2>New features</h2>

127
docs/relnotes/10.4.2.html Normal file
View File

@@ -0,0 +1,127 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.2 Release Notes / January 12, 2015</h1>
<p>
Mesa 10.4.2 is a bug fix release which fixes bugs found since the 10.4.1 release.
</p>
<p>
Mesa 10.4.2 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
e303e77dd774df0d051b2870b165f98c97084a55980f884731df89c1b56a6146 MesaLib-10.4.2.tar.gz
08a119937d9f2aa2f66dd5de97baffc2a6e675f549e40e699a31f5485d15327f MesaLib-10.4.2.tar.bz2
c2c2921a80a3395824f02bee4572a6a17d6a12a928a3e497618eeea04fb06490 MesaLib-10.4.2.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85529">Bug 85529</a> - Surfaces not drawn in Unvanquished</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87619">Bug 87619</a> - Changes to state such as render targets change fragment shader without marking it dirty.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87658">Bug 87658</a> - [llvmpipe] SEGV in sse2_has_daz on ancient Pentium4-M</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87913">Bug 87913</a> - CPU cacheline size of 0 can be returned by CPUID leaf 0x80000006 in some virtual machines</li>
</ul>
<h2>Changes</h2>
<p>Chad Versace (2):</p>
<ul>
<li>i965: Use safer pointer arithmetic in intel_texsubimage_tiled_memcpy()</li>
<li>i965: Use safer pointer arithmetic in gather_oa_results()</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>Revert "r600g/sb: fix issues cause by GLSL switching to loops for switch"</li>
<li>r600g: fix regression since UCMP change</li>
<li>r600g/sb: implement r600 gpr index workaround. (v3.1)</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.1 release</li>
<li>Update version to 10.4.2</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>nv50,nvc0: set vertex id base to index_bias</li>
<li>nv50/ir: fix texture offsets in release builds</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Add missing BRW_NEW_*_PROG_DATA to texture/renderbuffer atoms.</li>
<li>i965: Fix start/base_vertex_location for &gt;1 prims but !BRW_NEW_VERTICES.</li>
</ul>
<p>Leonid Shatz (1):</p>
<ul>
<li>gallium/util: make sure cache line size is not zero</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>glsl_to_tgsi: fix a bug in copy propagation</li>
<li>vbo: ignore primitive restart if FixedIndex is enabled in DrawArrays</li>
<li>st/mesa: fix GL_PRIMITIVE_RESTART_FIXED_INDEX</li>
<li>radeonsi: fix VertexID for OpenGL</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>radeonsi: Don't modify PA_SC_RASTER_CONFIG register value if rb_mask == 0</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>gallium/util: fix crash with daz detection on x86</li>
</ul>
<p>Tiziano Bacocco (1):</p>
<ul>
<li>nv50,nvc0: implement half_pixel_center</li>
</ul>
<p>Vadim Girlin (1):</p>
<ul>
<li>r600g/sb: fix issues with loops created for switch</li>
</ul>
</div>
</body>
</html>

145
docs/relnotes/10.4.3.html Normal file
View File

@@ -0,0 +1,145 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.3 Release Notes / January 24, 2015</h1>
<p>
Mesa 10.4.3 is a bug fix release which fixes bugs found since the 10.4.2 release.
</p>
<p>
Mesa 10.4.3 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
c53eaafc83d9c6315f63e0904d9954d929b841b0b2be7a328eeb6e14f1376129 MesaLib-10.4.3.tar.gz
ef6ecc9c2f36c9f78d1662382a69ae961f38f03af3a0c3268e53f351aa1978ad MesaLib-10.4.3.tar.bz2
179325fc8ec66529d3b0d0c43ef61a33a44d91daa126c3bbdd1efdfd25a7db1d MesaLib-10.4.3.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80568">Bug 80568</a> - [gen4] GPU Crash During Google Chrome Operation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85367">Bug 85367</a> - [gen4] GPU hang in glmark-es2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85696">Bug 85696</a> - r600g+nine: Bioshock shader failure after 7b1c0cbc90d456384b0950ad21faa3c61a6b43ff</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88219">Bug 88219</a> - include/c11/threads_posix.h:197: undefined reference to `pthread_mutex_lock'</li>
</ul>
<h2>Changes</h2>
<p>Axel Davy (39):</p>
<ul>
<li>st/nine: Add new texture format strings</li>
<li>st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS</li>
<li>st/nine: NineBaseTexture9: fix setting of last_layer</li>
<li>st/nine: CubeTexture: fix GetLevelDesc</li>
<li>st/nine: Fix crash when deleting non-implicit swapchain</li>
<li>st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of bad format</li>
<li>st/nine: NineBaseTexture9: update sampler view creation</li>
<li>st/nine: Check if srgb format is supported before trying to use it.</li>
<li>st/nine: Add ATI1 and ATI2 support</li>
<li>st/nine: Rework of boolean constants</li>
<li>st/nine: Convert integer constants to floats before storing them when cards don't support integers</li>
<li>st/nine: Remove some shader unused code</li>
<li>st/nine: Saturate oFog and oPts vs outputs</li>
<li>st/nine: Correctly declare NineTranslateInstruction_Mkxn inputs</li>
<li>st/nine: Fix typo for M4x4</li>
<li>st/nine: Fix POW implementation</li>
<li>st/nine: Handle RSQ special cases</li>
<li>st/nine: Handle NRM with input of null norm</li>
<li>st/nine: Correct LOG on negative values</li>
<li>st/nine: Rewrite LOOP implementation, and a0 aL handling</li>
<li>st/nine: Fix CND implementation</li>
<li>st/nine: Clamp ps 1.X constants</li>
<li>st/nine: Fix some fixed function pipeline operation</li>
<li>st/nine: Implement TEXCOORD special behaviours</li>
<li>st/nine: Fill missing dst and src number for some instructions.</li>
<li>st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC</li>
<li>st/nine: implement TEXM3x2DEPTH</li>
<li>st/nine: Implement TEXM3x2TEX</li>
<li>st/nine: Implement TEXM3x3SPEC</li>
<li>st/nine: Implement TEXDEPTH</li>
<li>st/nine: Implement TEXDP3</li>
<li>st/nine: Implement TEXDP3TEX</li>
<li>st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGB</li>
<li>st/nine: Correct rules for relative adressing and constants.</li>
<li>st/nine: Remove unused code for ps</li>
<li>st/nine: Fix sm3 relative addressing for non-debug build</li>
<li>st/nine: Add variables containing the size of the constant buffers</li>
<li>st/nine: Allocate the correct size for the user constant buffer</li>
<li>st/nine: Allocate vs constbuf buffer for indirect addressing once.</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.2 release</li>
<li>Update version to 10.4.3</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>mesa: Fix clamping to -1.0 in snorm_to_float</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>glsl: Link glsl_test with pthreads library.</li>
</ul>
<p>Jose Fonseca (1):</p>
<ul>
<li>nine: Drop use of TGSI_OPCODE_CND.</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Respect the no_8 flag on Gen6, not just Gen7+.</li>
<li>i965: Work around mysterious Gen4 GPU hangs with minimal state changes.</li>
</ul>
<p>Stanislaw Halik (1):</p>
<ul>
<li>st/nine: Hack to generate resource if it doesn't exist when getting view</li>
</ul>
<p>Xavier Bouchoux (3):</p>
<ul>
<li>st/nine: Additional defines to d3dtypes.h</li>
<li>st/nine: Add missing c++ declaration for IDirect3DVolumeTexture9</li>
<li>st/nine: Fix D3DRS_POINTSPRITE support</li>
</ul>
</div>
</body>
</html>

100
docs/relnotes/10.4.4.html Normal file
View File

@@ -0,0 +1,100 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.4 Release Notes / February 06, 2015</h1>
<p>
Mesa 10.4.4 is a bug fix release which fixes bugs found since the 10.4.3 release.
</p>
<p>
Mesa 10.4.4 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
5cb427eaf980cb8555953e9928f5797979ed783e277745d5f8cbae8bc5364086 MesaLib-10.4.4.tar.gz
f18a967e9c4d80e054b2fdff8c130ce6e6d1f8eecfc42c9f354f8628d8b4df1c MesaLib-10.4.4.tar.bz2
86baad73b77920c80fe58402a905e7dd17e3ea10ead6ea7d3afdc0a56c860bd7 MesaLib-10.4.4.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88662">Bug 88662</a> - unaligned access to gl_dlist_node</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88930">Bug 88930</a> - [osmesa] osbuffer-&gt;textures should be indexed by attachment type</li>
</ul>
<h2>Changes</h2>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: fix display list 8-byte alignment issue</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.3 release</li>
<li>Update version to 10.4.4</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>egl: Pass the correct X visual depth to xcb_put_image().</li>
</ul>
<p>Mario Kleiner (1):</p>
<ul>
<li>glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>gallium/util: Don't use __builtin_clrsb in util_last_bit().</li>
</ul>
<p>Niels Ole Salscheider (1):</p>
<ul>
<li>configure: Link against all LLVM targets when building clover</li>
</ul>
<p>Park, Jeongmin (1):</p>
<ul>
<li>st/osmesa: Fix osbuffer-&gt;textures indexing</li>
</ul>
<p>Ville Syrjälä (1):</p>
<ul>
<li>i965: Fix max_wm_threads for CHV</li>
</ul>
</div>
</body>
</html>

114
docs/relnotes/10.4.5.html Normal file
View File

@@ -0,0 +1,114 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.5 Release Notes / February 21, 2015</h1>
<p>
Mesa 10.4.5 is a bug fix release which fixes bugs found since the 10.4.4 release.
</p>
<p>
Mesa 10.4.5 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
e12bbdaee9a758617e8ebd0bb0e987f72addd11db2e4da25ba695e386cd63843 MesaLib-10.4.5.tar.gz
bf60000700a9d58e3aca2bfeee7e781053b0d839e61a95b1883e05a2dee247a0 MesaLib-10.4.5.tar.bz2
3b926de8eee500bb67cf85332c51292f826cc539b8636382aadbb8e70c76527a MesaLib-10.4.5.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82477">Bug 82477</a> - [softpipe] piglit fp-long-alu regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88658">Bug 88658</a> - (bisected) Slow video playback on Kabini</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89069">Bug 89069</a> - Lack of grass in The Talos Principle on radeonsi (native\wine\nine)</li>
</ul>
<h2>Changes</h2>
<p>Carl Worth (1):</p>
<ul>
<li>Revert use of Mesa IR optimizer for ARB_fragment_programs</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.4 release</li>
<li>get-pick-list.sh: Require explicit "10.4" for nominating stable patches</li>
<li>Update version to 10.4.5</li>
</ul>
<p>Ilia Mirkin (3):</p>
<ul>
<li>nvc0: bail out of 2d blits with non-A8_UNORM alpha formats</li>
<li>st/mesa: treat resource-less xfb buffers as if they weren't there</li>
<li>nvc0: allow holes in xfb target lists</li>
</ul>
<p>Jeremy Huddleston Sequoia (2):</p>
<ul>
<li>darwin: build fix</li>
<li>darwin: build fix</li>
</ul>
<p>Kenneth Graunke (4):</p>
<ul>
<li>i965: Override swizzles for integer luminance formats.</li>
<li>i965: Use a gl_color_union for sampler border color.</li>
<li>i965: Fix integer border color on Haswell.</li>
<li>glsl: Reduce memory consumption of copy propagation passes.</li>
</ul>
<p>Laura Ekstrand (1):</p>
<ul>
<li>main: Fixed _mesa_GetCompressedTexImage_sw to copy slices correctly.</li>
</ul>
<p>Marek Olšák (5):</p>
<ul>
<li>r600g,radeonsi: don't append to streamout buffers that haven't been used yet</li>
<li>radeonsi: fix instanced arrays with non-zero start instance</li>
<li>radeonsi: small fix in SPI state</li>
<li>mesa: fix AtomicBuffer typo in _mesa_DeleteBuffers</li>
<li>radeonsi: fix a crash if a stencil ref state is set before a DSA state</li>
</ul>
<p>Michel Dänzer (2):</p>
<ul>
<li>st/mesa: Don't use PIPE_USAGE_STREAM for GL_PIXEL_UNPACK_BUFFER_ARB</li>
<li>Revert "radeon/llvm: enable unsafe math for graphics shaders"</li>
</ul>
</div>
</body>
</html>

143
docs/relnotes/10.4.6.html Normal file
View File

@@ -0,0 +1,143 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.6 Release Notes / March 06, 2015</h1>
<p>
Mesa 10.4.6 is a bug fix release which fixes bugs found since the 10.4.5 release.
</p>
<p>
Mesa 10.4.6 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
46c9082142e811c01e49a2c332a9ac0a1eb98f2908985fb9df216539d7eaeaf4 MesaLib-10.4.6.tar.gz
d8baedd20e79ccd98a5a7b05e23d59a30892e68de1fcc057ca6873dafca02735 MesaLib-10.4.6.tar.bz2
6aded6eac7f0d4d55117b8b581d8424710bbb4c768fc90f7b881f29311a751aa MesaLib-10.4.6.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=45348">Bug 45348</a> - [swrast] piglit fbo-drawbuffers-arbfp regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84613">Bug 84613</a> - [G965, bisected] piglit regressions : glslparsertest.glsl2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87516">Bug 87516</a> - glProgramBinary violates spec</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88885">Bug 88885</a> - Transform feedback uses incorrect interleaving if a previous draw did not write gl_Position</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89180">Bug 89180</a> - [IVB regression] Rendering issues in Mass Effect through VMware Workstation</li>
</ul>
<h2>Changes</h2>
<p>Abdiel Janulgue (2):</p>
<ul>
<li>glsl: Don't optimize min/max into saturate when EmitNoSat is set</li>
<li>st/mesa: For vertex shaders, don't emit saturate when SM 3.0 is unsupported</li>
</ul>
<p>Andreas Boll (1):</p>
<ul>
<li>glx: Fix returned values of GLX_RENDERER_PREFERRED_PROFILE_MESA</li>
</ul>
<p>Brian Paul (2):</p>
<ul>
<li>swrast: fix multiple color buffer writing</li>
<li>st/mesa: fix sampler view reference counting bug in glDraw/CopyPixels</li>
</ul>
<p>Chris Forbes (1):</p>
<ul>
<li>i965/gs: Check newly-generated GS-out VUE map against correct stage</li>
</ul>
<p>Eduardo Lima Mitev (1):</p>
<ul>
<li>mesa: Fix error validating args for TexSubImage3D</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.5 release</li>
<li>install-lib-links: remove the .install-lib-links file</li>
<li>Revert "mesa: Correct backwards NULL check."</li>
<li>mesa: cherry-pick the second half of commit 2aa71e9485a</li>
<li>Revert "gallivm: Update for RTDyldMemoryManager becoming an unique_ptr."</li>
<li>Update version to 10.4.6</li>
</ul>
<p>Ian Romanick (3):</p>
<ul>
<li>mesa: Add missing error checks in _mesa_ProgramBinary</li>
<li>mesa: Ensure that length is set to zero in _mesa_GetProgramBinary</li>
<li>mesa: Always generate GL_INVALID_OPERATION in _mesa_GetProgramBinary</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>auxilary/os: correct sysctl use in os_get_total_physical_memory()</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>gallivm: Update for RTDyldMemoryManager becoming an unique_ptr.</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>st/omx/dec/h264: fix picture out-of-order with poc type 0 v2</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>install-lib-links: don't depend on .libs directory</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>vbo: fix an unitialized-variable warning</li>
<li>radeonsi: fix point sprites</li>
</ul>
<p>Matt Turner (4):</p>
<ul>
<li>glsl: Rewrite and fix min/max to saturate optimization.</li>
<li>mesa: Correct backwards NULL check.</li>
<li>i965/fs: Don't use backend_visitor::instructions after creating the CFG.</li>
<li>mesa: Correct backwards NULL check.</li>
</ul>
</div>
</body>
</html>

134
docs/relnotes/10.4.7.html Normal file
View File

@@ -0,0 +1,134 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.7 Release Notes / March 20, 2015</h1>
<p>
Mesa 10.4.7 is a bug fix release which fixes bugs found since the 10.4.6 release.
</p>
<p>
Mesa 10.4.7 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
9e7b59267199658808f8b33e0410b86fbafbdcd52378658b9df65fac9d24947f MesaLib-10.4.7.tar.gz
2c351c98671f9a7ab3fd9c601bb7a255801b1580f5dd0992639f99152801b0d2 MesaLib-10.4.7.tar.bz2
d14ac578b5ce16560757b53fbd1cb4d6b34652f8e110e4b10a019adc82e67ffd MesaLib-10.4.7.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79202">Bug 79202</a> - valgrind errors in glsl-fs-uniform-array-loop-unroll.shader_test; random code generation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89156">Bug 89156</a> - r300g: GL_COMPRESSED_RED_RGTC1 / ATI1N support broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89224">Bug 89224</a> - Incorrect rendering of Unigine Valley running in VM on VMware Workstation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89530">Bug 89530</a> - FTBFS in loader: missing fstat</li>
</ul>
<h2>Changes</h2>
<p>Andrey Sudnik (1):</p>
<ul>
<li>i965/vec4: Don't lose the saturate modifier in copy propagation.</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>egl: Take alpha bits into account when selecting GBM formats</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.6 release</li>
<li>cherry-ignore: add not applicable/rejected commits</li>
<li>mesa: rename format_info.c to format_info.h</li>
<li>loader: include &lt;sys/stat.h&gt; for non-sysfs builds</li>
<li>auxiliary/os: fix the android build - s/drm_munmap/os_munmap/</li>
<li>Update version to 10.4.7</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>i965: Fix out-of-bounds accesses into pull_constant_loc array</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>freedreno: move fb state copy after checking for size change</li>
<li>freedreno/ir3: fix array count returned by TXQ</li>
<li>freedreno/ir3: get the # of miplevels from getinfo</li>
<li>freedreno: fix slice pitch calculations</li>
</ul>
<p>Marc-Andre Lureau (1):</p>
<ul>
<li>gallium/auxiliary/indices: fix start param</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>r300g: fix RGTC1 and LATC1 SNORM formats</li>
<li>r300g: fix a crash when resolving into an sRGB texture</li>
<li>r300g: fix sRGB-&gt;sRGB blits</li>
<li>radeonsi: increase coords array size for radeon_llvm_emit_prepare_cube_coords</li>
</ul>
<p>Mario Kleiner (1):</p>
<ul>
<li>glx: Handle out-of-sequence swap completion events correctly. (v2)</li>
</ul>
<p>Matt Turner (2):</p>
<ul>
<li>r300g: Use PATH_MAX instead of limiting ourselves to 100 chars.</li>
<li>r300g: Check return value of snprintf().</li>
</ul>
<p>Rob Clark (2):</p>
<ul>
<li>freedreno/ir3: fix silly typo for binning pass shaders</li>
<li>freedreno: update generated headers</li>
</ul>
<p>Samuel Iglesias Gonsalvez (1):</p>
<ul>
<li>glsl: optimize (0 cmp x + y) into (-x cmp y).</li>
</ul>
<p>Stefan Dösinger (1):</p>
<ul>
<li>r300g: Fix the ATI1N swizzle (RGTC1 and LATC1)</li>
</ul>
</div>
</body>
</html>

View File

@@ -399,6 +399,16 @@ struct IDirect3DVolume9 : public IUnknown
virtual HRESULT WINAPI UnlockBox() = 0;
};
struct IDirect3DVolumeTexture9 : public IDirect3DBaseTexture9
{
virtual HRESULT WINAPI GetLevelDesc(UINT Level, D3DVOLUME_DESC *pDesc) = 0;
virtual HRESULT WINAPI GetVolumeLevel(UINT Level, IDirect3DVolume9 **ppVolumeLevel) = 0;
virtual HRESULT WINAPI LockBox(UINT Level, D3DLOCKED_BOX *pLockedVolume, const D3DBOX *pBox, DWORD Flags) = 0;
virtual HRESULT WINAPI UnlockBox(UINT Level) = 0;
virtual HRESULT WINAPI AddDirtyBox(const D3DBOX *pDirtyBox) = 0;
};
#else /* __cplusplus */
extern const GUID IID_IDirect3D9;

View File

@@ -224,6 +224,8 @@ typedef struct _RGNDATA {
#define D3DERR_INVALIDDEVICE MAKE_D3DHRESULT(2155)
#define D3DERR_INVALIDCALL MAKE_D3DHRESULT(2156)
#define D3DERR_DRIVERINVALIDCALL MAKE_D3DHRESULT(2157)
#define D3DERR_DEVICEREMOVED MAKE_D3DHRESULT(2160)
#define D3DERR_DEVICEHUNG MAKE_D3DHRESULT(2164)
/********************************************************
* Bitmasks *
@@ -331,6 +333,7 @@ typedef struct _RGNDATA {
#define D3DPRESENT_DONOTWAIT 0x00000001
#define D3DPRESENT_LINEAR_CONTENT 0x00000002
#define D3DPRESENT_RATE_DEFAULT 0
#define D3DCREATE_FPU_PRESERVE 0x00000002
#define D3DCREATE_MULTITHREADED 0x00000004
@@ -344,6 +347,13 @@ typedef struct _RGNDATA {
#define D3DSTREAMSOURCE_INDEXEDDATA (1 << 30)
#define D3DSTREAMSOURCE_INSTANCEDATA (2 << 30)
/* D3DRS_COLORWRITEENABLE */
#define D3DCOLORWRITEENABLE_RED (1L << 0)
#define D3DCOLORWRITEENABLE_GREEN (1L << 1)
#define D3DCOLORWRITEENABLE_BLUE (1L << 2)
#define D3DCOLORWRITEENABLE_ALPHA (1L << 3)
/********************************************************
* Function macros *
*******************************************************/
@@ -639,10 +649,13 @@ typedef enum _D3DFORMAT {
D3DFMT_A1 = 118,
D3DFMT_A2B10G10R10_XR_BIAS = 119,
D3DFMT_BINARYBUFFER = 199,
D3DFMT_ATI1 = MAKEFOURCC('A', 'T', 'I', '1'),
D3DFMT_ATI2 = MAKEFOURCC('A', 'T', 'I', '2'),
D3DFMT_DF16 = MAKEFOURCC('D', 'F', '1', '6'),
D3DFMT_DF24 = MAKEFOURCC('D', 'F', '2', '4'),
D3DFMT_INTZ = MAKEFOURCC('I', 'N', 'T', 'Z'),
D3DFMT_NULL = MAKEFOURCC('N', 'U', 'L', 'L'),
D3DFMT_NVDB = MAKEFOURCC('N', 'V', 'D', 'B'),
D3DFMT_NV11 = MAKEFOURCC('N', 'V', '1', '1'),
D3DFMT_NV12 = MAKEFOURCC('N', 'V', '1', '2'),
D3DFMT_Y210 = MAKEFOURCC('Y', '2', '1', '0'),

View File

@@ -3,9 +3,9 @@
if BUILD_SHARED
if HAVE_COMPAT_SYMLINKS
all-local : .libs/install-mesa-links
all-local : .install-mesa-links
.libs/install-mesa-links : $(lib_LTLIBRARIES)
.install-mesa-links : $(lib_LTLIBRARIES)
$(AM_V_GEN)$(MKDIR_P) $(top_builddir)/$(LIB_DIR); \
for f in $(join $(addsuffix .libs/,$(dir $(lib_LTLIBRARIES))),$(notdir $(lib_LTLIBRARIES:%.la=%.$(LIB_EXT)*))); do \
if test -h .libs/$$f; then \
@@ -14,5 +14,9 @@ all-local : .libs/install-mesa-links
ln -f $$f $(top_builddir)/$(LIB_DIR); \
fi; \
done && touch $@
clean-local:
$(RM) .install-mesa-links
endif
endif

View File

@@ -668,15 +668,21 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp)
for (i = 0; dri2_dpy->driver_configs[i]; i++) {
EGLint format, attr_list[3];
unsigned int mask;
unsigned int red, alpha;
dri2_dpy->core->getConfigAttrib(dri2_dpy->driver_configs[i],
__DRI_ATTRIB_RED_MASK, &mask);
if (mask == 0x3ff00000)
__DRI_ATTRIB_RED_MASK, &red);
dri2_dpy->core->getConfigAttrib(dri2_dpy->driver_configs[i],
__DRI_ATTRIB_ALPHA_MASK, &alpha);
if (red == 0x3ff00000 && alpha == 0x00000000)
format = GBM_FORMAT_XRGB2101010;
else if (mask == 0x00ff0000)
else if (red == 0x3ff00000 && alpha == 0xc0000000)
format = GBM_FORMAT_ARGB2101010;
else if (red == 0x00ff0000 && alpha == 0x00000000)
format = GBM_FORMAT_XRGB8888;
else if (mask == 0xf800)
else if (red == 0x00ff0000 && alpha == 0xff000000)
format = GBM_FORMAT_ARGB8888;
else if (red == 0xf800)
format = GBM_FORMAT_RGB565;
else
continue;

View File

@@ -49,8 +49,7 @@ dri2_x11_swap_interval(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf,
static void
swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
struct dri2_egl_surface * dri2_surf,
int depth)
struct dri2_egl_surface * dri2_surf)
{
uint32_t mask;
const uint32_t function = GXcopy;
@@ -66,8 +65,7 @@ swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
valgc[0] = function;
valgc[1] = False;
xcb_create_gc(dri2_dpy->conn, dri2_surf->swapgc, dri2_surf->drawable, mask, valgc);
dri2_surf->depth = depth;
switch (depth) {
switch (dri2_surf->depth) {
case 32:
case 24:
dri2_surf->bytes_per_pixel = 4;
@@ -82,7 +80,7 @@ swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
dri2_surf->bytes_per_pixel = 0;
break;
default:
_eglLog(_EGL_WARNING, "unsupported depth %d", depth);
_eglLog(_EGL_WARNING, "unsupported depth %d", dri2_surf->depth);
}
}
@@ -257,12 +255,6 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
_eglError(EGL_BAD_ALLOC, "dri2->createNewDrawable");
goto cleanup_pixmap;
}
if (dri2_dpy->dri2) {
xcb_dri2_create_drawable (dri2_dpy->conn, dri2_surf->drawable);
} else {
swrastCreateDrawable(dri2_dpy, dri2_surf, _eglGetConfigKey(conf, EGL_BUFFER_SIZE));
}
if (type != EGL_PBUFFER_BIT) {
cookie = xcb_get_geometry (dri2_dpy->conn, dri2_surf->drawable);
@@ -275,9 +267,19 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
dri2_surf->base.Width = reply->width;
dri2_surf->base.Height = reply->height;
dri2_surf->depth = reply->depth;
free(reply);
}
if (dri2_dpy->dri2) {
xcb_dri2_create_drawable (dri2_dpy->conn, dri2_surf->drawable);
} else {
if (type == EGL_PBUFFER_BIT) {
dri2_surf->depth = _eglGetConfigKey(conf, EGL_BUFFER_SIZE);
}
swrastCreateDrawable(dri2_dpy, dri2_surf);
}
/* we always copy the back buffer to front */
dri2_surf->base.PostSubBufferSupportedNV = EGL_TRUE;

View File

@@ -193,7 +193,7 @@ def lineloop(intype, outtype, inpv, outpv):
print ' for (i = start, j = 0; j < nr - 2; j+=2, i++) { '
do_line( intype, outtype, 'out+j', 'i', 'i+1', inpv, outpv );
print ' }'
do_line( intype, outtype, 'out+j', 'i', '0', inpv, outpv );
do_line( intype, outtype, 'out+j', 'i', 'start', inpv, outpv );
postamble()
def tris(intype, outtype, inpv, outpv):
@@ -218,7 +218,7 @@ def tristrip(intype, outtype, inpv, outpv):
def trifan(intype, outtype, inpv, outpv):
preamble(intype, outtype, inpv, outpv, prim='trifan')
print ' for (i = start, j = 0; j < nr; j+=3, i++) { '
do_tri( intype, outtype, 'out+j', '0', 'i+1', 'i+2', inpv, outpv );
do_tri( intype, outtype, 'out+j', 'start', 'i+1', 'i+2', inpv, outpv );
print ' }'
postamble()
@@ -228,9 +228,9 @@ def polygon(intype, outtype, inpv, outpv):
preamble(intype, outtype, inpv, outpv, prim='polygon')
print ' for (i = start, j = 0; j < nr; j+=3, i++) { '
if inpv == FIRST:
do_tri( intype, outtype, 'out+j', '0', 'i+1', 'i+2', inpv, outpv );
do_tri( intype, outtype, 'out+j', 'start', 'i+1', 'i+2', inpv, outpv );
else:
do_tri( intype, outtype, 'out+j', 'i+1', 'i+2', '0', inpv, outpv );
do_tri( intype, outtype, 'out+j', 'i+1', 'i+2', 'start', inpv, outpv );
print ' }'
postamble()

View File

@@ -118,7 +118,7 @@ os_get_total_physical_memory(uint64_t *size)
*size = phys_pages * page_size;
return (phys_pages > 0 && page_size > 0);
#elif defined(PIPE_OS_APPLE) || defined(PIPE_OS_BSD)
size_t len = sizeof(size);
size_t len = sizeof(*size);
int mib[2];
mib[0] = CTL_HW;
@@ -134,7 +134,7 @@ os_get_total_physical_memory(uint64_t *size)
#error Unsupported *BSD
#endif
return (sysctl(mib, 2, &size, &len, NULL, 0) == 0);
return (sysctl(mib, 2, size, &len, NULL, 0) == 0);
#elif defined(PIPE_OS_HAIKU)
system_info info;
status_t ret;

View File

@@ -70,8 +70,8 @@ static INLINE void *os_mmap(void *addr, size_t length, int prot, int flags,
return __mmap2(addr, length, prot, flags, fd, (size_t) (offset >> 12));
}
# define drm_munmap(addr, length) \
munmap(addr, length)
# define os_munmap(addr, length) \
munmap(addr, length)
#else
/* assume large file support exists */

View File

@@ -272,7 +272,7 @@ static INLINE uint64_t xgetbv(void)
#if defined(PIPE_ARCH_X86)
static INLINE boolean sse2_has_daz(void)
PIPE_ALIGN_STACK static INLINE boolean sse2_has_daz(void)
{
struct {
uint32_t pad1[7];
@@ -409,8 +409,12 @@ util_cpu_detect(void)
}
if (regs[0] >= 0x80000006) {
/* should we really do this if the clflush size above worked? */
unsigned int cacheline;
cpuid(0x80000006, regs2);
util_cpu_caps.cacheline = regs2[2] & 0xFF;
cacheline = regs2[2] & 0xFF;
if (cacheline > 0)
util_cpu_caps.cacheline = cacheline;
}
if (!util_cpu_caps.has_sse) {

View File

@@ -561,14 +561,10 @@ util_last_bit(unsigned u)
static INLINE unsigned
util_last_bit_signed(int i)
{
#if defined(__GNUC__) && ((__GNUC__ * 100 + __GNUC_MINOR__) >= 407) && !defined(__INTEL_COMPILER)
return 31 - __builtin_clrsb(i);
#else
if (i >= 0)
return util_last_bit(i);
else
return util_last_bit(~(unsigned)i);
#endif
}
/* Destructively loop over all of the bits in a mask as in:

View File

@@ -13,7 +13,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10347 bytes, from 2014-10-01 18:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14960 bytes, from 2014-07-27 17:22:13)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 60533 bytes, from 2014-10-15 18:32:43)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 64771 bytes, from 2015-03-15 21:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 41068 bytes, from 2014-08-01 12:22:48)
Copyright (C) 2013-2014 by the following authors:

View File

@@ -13,7 +13,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10347 bytes, from 2014-10-01 18:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14960 bytes, from 2014-07-27 17:22:13)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 60533 bytes, from 2014-10-15 18:32:43)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 64771 bytes, from 2015-03-15 21:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 41068 bytes, from 2014-08-01 12:22:48)
Copyright (C) 2013-2014 by the following authors:
@@ -2572,7 +2572,7 @@ static inline uint32_t A3XX_TEX_CONST_2_SWAP(enum a3xx_color_swap val)
}
#define REG_A3XX_TEX_CONST_3 0x00000003
#define A3XX_TEX_CONST_3_LAYERSZ1__MASK 0x0000000f
#define A3XX_TEX_CONST_3_LAYERSZ1__MASK 0x00001fff
#define A3XX_TEX_CONST_3_LAYERSZ1__SHIFT 0
static inline uint32_t A3XX_TEX_CONST_3_LAYERSZ1(uint32_t val)
{

View File

@@ -13,7 +13,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10347 bytes, from 2014-10-01 18:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14960 bytes, from 2014-07-27 17:22:13)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 60533 bytes, from 2014-10-15 18:32:43)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 64771 bytes, from 2015-03-15 21:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 41068 bytes, from 2014-08-01 12:22:48)
Copyright (C) 2013-2014 by the following authors:

View File

@@ -13,7 +13,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10347 bytes, from 2014-10-01 18:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14960 bytes, from 2014-07-27 17:22:13)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 60533 bytes, from 2014-10-15 18:32:43)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 64771 bytes, from 2015-03-15 21:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 41068 bytes, from 2014-08-01 12:22:48)
Copyright (C) 2013-2014 by the following authors:

View File

@@ -199,7 +199,7 @@ setup_slices(struct fd_resource *rsc)
for (level = 0; level <= prsc->last_level; level++) {
struct fd_resource_slice *slice = fd_resource_slice(rsc, level);
slice->pitch = align(width, 32);
slice->pitch = width = align(width, 32);
slice->offset = size;
slice->size0 = slice->pitch * height * rsc->cpp;

View File

@@ -123,12 +123,12 @@ fd_set_framebuffer_state(struct pipe_context *pctx,
fd_context_render(pctx);
util_copy_framebuffer_state(cso, framebuffer);
if ((cso->width != framebuffer->width) ||
(cso->height != framebuffer->height))
ctx->needs_rb_fbd = true;
util_copy_framebuffer_state(cso, framebuffer);
ctx->dirty |= FD_DIRTY_FRAMEBUFFER;
ctx->disabled_scissor.minx = 0;

View File

@@ -1421,6 +1421,7 @@ trans_txq(const struct instr_translater *t,
struct tgsi_dst_register *dst = &inst->Dst[0].Register;
struct tgsi_src_register *level = &inst->Src[0].Register;
struct tgsi_src_register *samp = &inst->Src[1].Register;
const struct target_info *tgt = &tex_targets[inst->Texture.Texture];
struct tex_info tinf;
memset(&tinf, 0, sizeof(tinf));
@@ -1434,8 +1435,67 @@ trans_txq(const struct instr_translater *t,
instr->cat5.tex = samp->Index;
instr->flags |= tinf.flags;
add_dst_reg_wrmask(ctx, instr, dst, 0, dst->WriteMask);
add_src_reg_wrmask(ctx, instr, level, level->SwizzleX, 0x1);
if (tgt->array && (dst->WriteMask & (1 << tgt->dims))) {
/* Array size actually ends up in .w rather than .z. This doesn't
* matter for miplevel 0, but for higher mips the value in z is
* minified whereas w stays. Also, the value in TEX_CONST_3_DEPTH is
* returned, which means that we have to add 1 to it for arrays.
*/
struct tgsi_dst_register tmp_dst;
struct tgsi_src_register *tmp_src;
type_t type_mov = get_utype(ctx);
tmp_src = get_internal_temp(ctx, &tmp_dst);
add_dst_reg_wrmask(ctx, instr, &tmp_dst, 0,
dst->WriteMask | TGSI_WRITEMASK_W);
add_src_reg_wrmask(ctx, instr, level, level->SwizzleX, 0x1);
if (dst->WriteMask & TGSI_WRITEMASK_X) {
instr = instr_create(ctx, 1, 0);
instr->cat1.src_type = type_mov;
instr->cat1.dst_type = type_mov;
add_dst_reg(ctx, instr, dst, 0);
add_src_reg(ctx, instr, tmp_src, src_swiz(tmp_src, 0));
}
if (tgt->dims == 2) {
if (dst->WriteMask & TGSI_WRITEMASK_Y) {
instr = instr_create(ctx, 1, 0);
instr->cat1.src_type = type_mov;
instr->cat1.dst_type = type_mov;
add_dst_reg(ctx, instr, dst, 1);
add_src_reg(ctx, instr, tmp_src, src_swiz(tmp_src, 1));
}
}
instr = instr_create(ctx, 2, OPC_ADD_U);
add_dst_reg(ctx, instr, dst, tgt->dims);
add_src_reg(ctx, instr, tmp_src, src_swiz(tmp_src, 3));
ir3_reg_create(instr, 0, IR3_REG_IMMED)->iim_val = 1;
} else {
add_dst_reg_wrmask(ctx, instr, dst, 0, dst->WriteMask);
add_src_reg_wrmask(ctx, instr, level, level->SwizzleX, 0x1);
}
if (dst->WriteMask & TGSI_WRITEMASK_W) {
/* The # of levels comes from getinfo.z. We need to add 1 to it, since
* the value in TEX_CONST_0 is zero-based.
*/
struct tgsi_dst_register tmp_dst;
struct tgsi_src_register *tmp_src;
tmp_src = get_internal_temp(ctx, &tmp_dst);
instr = instr_create(ctx, 5, OPC_GETINFO);
instr->cat5.type = get_utype(ctx);
instr->cat5.samp = samp->Index;
instr->cat5.tex = samp->Index;
add_dst_reg_wrmask(ctx, instr, &tmp_dst, 0, TGSI_WRITEMASK_Z);
instr = instr_create(ctx, 2, OPC_ADD_U);
add_dst_reg(ctx, instr, dst, 3);
add_src_reg(ctx, instr, tmp_src, src_swiz(tmp_src, 2));
ir3_reg_create(instr, 0, IR3_REG_IMMED)->iim_val = 1;
}
}
/* DDX/DDY */
@@ -3094,7 +3154,7 @@ ir3_compile_shader(struct ir3_shader_variant *so,
if (key.binning_pass) {
for (i = 0, j = 0; i < so->outputs_count; i++) {
unsigned name = sem2name(so->outputs[i].semantic);
unsigned idx = sem2name(so->outputs[i].semantic);
unsigned idx = sem2idx(so->outputs[i].semantic);
/* throw away everything but first position/psize */
if ((idx == 0) && ((name == TGSI_SEMANTIC_POSITION) ||

View File

@@ -772,7 +772,8 @@ NV50LoweringPreSSA::handleTEX(TexInstruction *i)
if (i->tex.useOffsets) {
for (int c = 0; c < 3; ++c) {
ImmediateValue val;
assert(i->offset[0][c].getImmediate(val));
if (!i->offset[0][c].getImmediate(val))
assert(!"non-immediate offset");
i->tex.offset[c] = val.reg.data.u32;
i->offset[0][c].set(NULL);
}

View File

@@ -754,7 +754,8 @@ NVC0LoweringPass::handleTEX(TexInstruction *i)
assert(i->tex.useOffsets == 1);
for (c = 0; c < 3; ++c) {
ImmediateValue val;
assert(i->offset[0][c].getImmediate(val));
if (!i->offset[0][c].getImmediate(val))
assert(!"non-immediate offset passed to non-TXG");
imm |= (val.reg.data.u32 & 0xf) << (c * 4);
}
if (i->op == OP_TXD && chipset >= NVISA_GK104_CHIPSET) {

View File

@@ -1708,7 +1708,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#define NV50_3D_CULL_FACE_BACK 0x00000405
#define NV50_3D_CULL_FACE_FRONT_AND_BACK 0x00000408
#define NV50_3D_LINE_LAST_PIXEL 0x00001924
#define NV50_3D_PIXEL_CENTER_INTEGER 0x00001924
#define NVA3_3D_FP_MULTISAMPLE 0x00001928
#define NVA3_3D_FP_MULTISAMPLE_EXPORT_SAMPLE_MASK 0x00000001

View File

@@ -461,8 +461,6 @@ nv50_screen_init_hwctx(struct nv50_screen *screen)
PUSH_DATA (push, 0);
BEGIN_NV04(push, NV50_3D(PRIM_RESTART_WITH_DRAW_ARRAYS), 1);
PUSH_DATA (push, 1);
BEGIN_NV04(push, NV50_3D(LINE_LAST_PIXEL), 1);
PUSH_DATA (push, 0);
BEGIN_NV04(push, NV50_3D(BLEND_SEPARATE_ALPHA), 1);
PUSH_DATA (push, 1);
@@ -609,6 +607,13 @@ nv50_screen_init_hwctx(struct nv50_screen *screen)
BEGIN_NV04(push, NV50_3D(EDGEFLAG), 1);
PUSH_DATA (push, 1);
BEGIN_NV04(push, NV50_3D(VB_ELEMENT_BASE), 1);
PUSH_DATA (push, 0);
if (screen->base.class_3d >= NV84_3D_CLASS) {
BEGIN_NV04(push, SUBC_3D(NV84_3D_VERTEX_ID_BASE), 1);
PUSH_DATA (push, 0);
}
PUSH_KICK (push);
}

View File

@@ -57,10 +57,6 @@
* ! pipe_rasterizer_state.flatshade_first also applies to QUADS
* (There's a GL query for that, forcing an exception is just ridiculous.)
*
* ! pipe_rasterizer_state.half_pixel_center is ignored - pixel centers
* are always at half integer coordinates and the top-left rule applies
* (There does not seem to be a hardware switch for this.)
*
* ! pipe_rasterizer_state.sprite_coord_enable is masked with 0xff on NVC0
* (The hardware only has 8 slots meant for TexCoord and we have to assign
* in advance to maintain elegant separate shader objects.)
@@ -221,7 +217,7 @@ nv50_blend_state_delete(struct pipe_context *pipe, void *hwcso)
FREE(hwcso);
}
/* NOTE: ignoring line_last_pixel, using FALSE (set on screen init) */
/* NOTE: ignoring line_last_pixel */
static void *
nv50_rasterizer_state_create(struct pipe_context *pipe,
const struct pipe_rasterizer_state *cso)
@@ -336,6 +332,9 @@ nv50_rasterizer_state_create(struct pipe_context *pipe,
SB_BEGIN_3D(so, DEPTH_CLIP_NEGATIVE_Z, 1);
SB_DATA (so, cso->clip_halfz);
SB_BEGIN_3D(so, PIXEL_CENTER_INTEGER, 1);
SB_DATA (so, !cso->half_pixel_center);
assert(so->size <= (sizeof(so->state) / sizeof(so->state[0])));
return (void *)so;
}

View File

@@ -25,7 +25,7 @@ struct nv50_blend_stateobj {
struct nv50_rasterizer_stateobj {
struct pipe_rasterizer_state pipe;
int size;
uint32_t state[48];
uint32_t state[49];
};
struct nv50_zsa_stateobj {

View File

@@ -472,6 +472,10 @@ nv50_draw_arrays(struct nv50_context *nv50,
if (nv50->state.index_bias) {
BEGIN_NV04(push, NV50_3D(VB_ELEMENT_BASE), 1);
PUSH_DATA (push, 0);
if (nv50->screen->base.class_3d >= NV84_3D_CLASS) {
BEGIN_NV04(push, SUBC_3D(NV84_3D_VERTEX_ID_BASE), 1);
PUSH_DATA (push, 0);
}
nv50->state.index_bias = 0;
}
@@ -594,6 +598,10 @@ nv50_draw_elements(struct nv50_context *nv50, boolean shorten,
if (index_bias != nv50->state.index_bias) {
BEGIN_NV04(push, NV50_3D(VB_ELEMENT_BASE), 1);
PUSH_DATA (push, index_bias);
if (nv50->screen->base.class_3d >= NV84_3D_CLASS) {
BEGIN_NV04(push, SUBC_3D(NV84_3D_VERTEX_ID_BASE), 1);
PUSH_DATA (push, index_bias);
}
nv50->state.index_bias = index_bias;
}

View File

@@ -227,6 +227,7 @@ locn_0f_ts:
/* NVC0_3D_MACRO_DRAW_ELEMENTS_INDIRECT
*
* NOTE: Saves and restores VB_ELEMENT,INSTANCE_BASE.
* Forcefully sets VERTEX_ID_BASE to the value of VB_ELEMENT_BASE.
*
* arg = mode
* parm[0] = count
@@ -247,6 +248,8 @@ locn_0f_ts:
maddr 0x150d /* VB_ELEMENT,INSTANCE_BASE */
send $r4
send $r5
maddr 0x446
send $r4
mov $r4 0x1
dei_again:
maddr 0x586 /* VERTEX_BEGIN_GL */
@@ -258,8 +261,10 @@ dei_again:
branz $r2 #dei_again
mov $r1 (extrinsrt $r1 $r4 0 1 26) /* set INSTANCE_NEXT */
maddr 0x150d /* VB_ELEMENT,INSTANCE_BASE */
exit send $r6
send $r6
send $r7
exit maddr 0x446
send $r6
dei_end:
exit
nop

View File

@@ -128,16 +128,18 @@ uint32_t mme9097_draw_elts_indirect[] = {
0x00000301,
0x00000201,
0x017dc451,
/* 0x000c: dei_again */
/* 0x000e: dei_again */
0x00002431,
0x0004d007,
/* 0x0017: dei_end */
0x0005d007,
0x00000501,
/* 0x001b: dei_end */
0x01434615,
0x01438715,
0x05434021,
0x00002041,
0x00002841,
0x01118021,
0x00002041,
0x00004411,
0x01618021,
0x00000841,
@@ -148,8 +150,10 @@ uint32_t mme9097_draw_elts_indirect[] = {
0xfffe9017,
0xd0410912,
0x05434021,
0x000030c1,
0x00003041,
0x00003841,
0x011180a1,
0x00003041,
0x00000091,
0x00000011,
};

View File

@@ -1041,7 +1041,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#define NVC0_3D_CULL_FACE_BACK 0x00000405
#define NVC0_3D_CULL_FACE_FRONT_AND_BACK 0x00000408
#define NVC0_3D_LINE_LAST_PIXEL 0x00001924
#define NVC0_3D_PIXEL_CENTER_INTEGER 0x00001924
#define NVC0_3D_VIEWPORT_TRANSFORM_EN 0x0000192c

View File

@@ -786,8 +786,6 @@ nvc0_screen_create(struct nouveau_device *dev)
PUSH_DATA (push, 0);
BEGIN_NVC0(push, NVC0_3D(LINE_WIDTH_SEPARATE), 1);
PUSH_DATA (push, 1);
BEGIN_NVC0(push, NVC0_3D(LINE_LAST_PIXEL), 1);
PUSH_DATA (push, 0);
BEGIN_NVC0(push, NVC0_3D(PRIM_RESTART_WITH_DRAW_ARRAYS), 1);
PUSH_DATA (push, 1);
BEGIN_NVC0(push, NVC0_3D(BLEND_SEPARATE_ALPHA), 1);

View File

@@ -252,7 +252,12 @@ nvc0_tfb_validate(struct nvc0_context *nvc0)
for (b = 0; b < nvc0->num_tfbbufs; ++b) {
struct nvc0_so_target *targ = nvc0_so_target(nvc0->tfbbuf[b]);
struct nv04_resource *buf = nv04_resource(targ->pipe.buffer);
struct nv04_resource *buf;
if (!targ) {
IMMED_NVC0(push, NVC0_3D(TFB_BUFFER_ENABLE(b)), 0);
continue;
}
if (tfb)
targ->stride = tfb->stride[b];
@@ -260,6 +265,8 @@ nvc0_tfb_validate(struct nvc0_context *nvc0)
if (!(nvc0->tfbbuf_dirty & (1 << b)))
continue;
buf = nv04_resource(targ->pipe.buffer);
if (!targ->clean)
nvc0_query_fifo_wait(push, targ->pq);
BEGIN_NVC0(push, NVC0_3D(TFB_BUFFER_ENABLE(b)), 5);

View File

@@ -204,7 +204,7 @@ nvc0_blend_state_delete(struct pipe_context *pipe, void *hwcso)
FREE(hwcso);
}
/* NOTE: ignoring line_last_pixel, using FALSE (set on screen init) */
/* NOTE: ignoring line_last_pixel */
static void *
nvc0_rasterizer_state_create(struct pipe_context *pipe,
const struct pipe_rasterizer_state *cso)
@@ -315,6 +315,8 @@ nvc0_rasterizer_state_create(struct pipe_context *pipe,
SB_IMMED_3D(so, DEPTH_CLIP_NEGATIVE_Z, cso->clip_halfz);
SB_IMMED_3D(so, PIXEL_CENTER_INTEGER, !cso->half_pixel_center);
assert(so->size <= (sizeof(so->state) / sizeof(so->state[0])));
return (void *)so;
}
@@ -1087,9 +1089,11 @@ nvc0_set_transform_feedback_targets(struct pipe_context *pipe,
pipe_so_target_reference(&nvc0->tfbbuf[i], targets[i]);
}
for (; i < nvc0->num_tfbbufs; ++i) {
nvc0->tfbbuf_dirty |= 1 << i;
nvc0_so_target_save_offset(pipe, nvc0->tfbbuf[i], i, &serialize);
pipe_so_target_reference(&nvc0->tfbbuf[i], NULL);
if (nvc0->tfbbuf[i]) {
nvc0->tfbbuf_dirty |= 1 << i;
nvc0_so_target_save_offset(pipe, nvc0->tfbbuf[i], i, &serialize);
pipe_so_target_reference(&nvc0->tfbbuf[i], NULL);
}
}
nvc0->num_tfbbufs = num_targets;

View File

@@ -23,7 +23,7 @@ struct nvc0_blend_stateobj {
struct nvc0_rasterizer_stateobj {
struct pipe_rasterizer_state pipe;
int size;
uint32_t state[43];
uint32_t state[44];
};
struct nvc0_zsa_stateobj {

View File

@@ -1401,11 +1401,14 @@ nvc0_blit(struct pipe_context *pipe, const struct pipe_blit_info *info)
} else
if (!nv50_2d_src_format_faithful(info->src.format)) {
if (!util_format_is_luminance(info->src.format)) {
if (!nv50_2d_dst_format_ops_supported(info->dst.format))
eng3d = TRUE;
else
if (util_format_is_intensity(info->src.format))
eng3d = info->src.format != PIPE_FORMAT_I8_UNORM;
else
if (!nv50_2d_dst_format_ops_supported(info->dst.format))
eng3d = TRUE;
if (util_format_is_alpha(info->src.format))
eng3d = info->src.format != PIPE_FORMAT_A8_UNORM;
else
eng3d = !nv50_2d_format_supported(info->src.format);
}

View File

@@ -575,8 +575,9 @@ nvc0_draw_arrays(struct nvc0_context *nvc0,
if (nvc0->state.index_bias) {
/* index_bias is implied 0 if !info->indexed (really ?) */
/* TODO: can we deactivate it for the VERTEX_BUFFER_FIRST command ? */
PUSH_SPACE(push, 1);
PUSH_SPACE(push, 2);
IMMED_NVC0(push, NVC0_3D(VB_ELEMENT_BASE), 0);
IMMED_NVC0(push, NVC0_3D(VERTEX_ID), 0);
nvc0->state.index_bias = 0;
}
@@ -705,9 +706,11 @@ nvc0_draw_elements(struct nvc0_context *nvc0, boolean shorten,
prim = nvc0_prim_gl(mode);
if (index_bias != nvc0->state.index_bias) {
PUSH_SPACE(push, 2);
PUSH_SPACE(push, 4);
BEGIN_NVC0(push, NVC0_3D(VB_ELEMENT_BASE), 1);
PUSH_DATA (push, index_bias);
BEGIN_NVC0(push, NVC0_3D(VERTEX_ID), 1);
PUSH_DATA (push, index_bias);
nvc0->state.index_bias = index_bias;
}
@@ -818,6 +821,7 @@ nvc0_draw_indirect(struct nvc0_context *nvc0, const struct pipe_draw_info *info)
if (nvc0->state.index_bias) {
/* index_bias is implied 0 if !info->indexed (really ?) */
IMMED_NVC0(push, NVC0_3D(VB_ELEMENT_BASE), 0);
IMMED_NVC0(push, NVC0_3D(VERTEX_ID), 0);
nvc0->state.index_bias = 0;
}
size = 4 * 4;

View File

@@ -28,6 +28,7 @@
*/
#include <errno.h>
#include <limits.h>
#include <regex.h>
#include <stdlib.h>
#include <stdio.h>
@@ -528,7 +529,6 @@ void init_compiler(
}
#define MAX_LINE_LENGTH 100
#define MAX_PATH_LENGTH 100
unsigned load_program(
struct radeon_compiler *c,
@@ -536,14 +536,19 @@ unsigned load_program(
const char *filename)
{
char line[MAX_LINE_LENGTH];
char path[MAX_PATH_LENGTH];
char path[PATH_MAX];
FILE *file;
unsigned *count;
char **string_store;
unsigned i = 0;
int n;
memset(line, 0, sizeof(line));
snprintf(path, MAX_PATH_LENGTH, TEST_PATH "/%s", filename);
n = snprintf(path, PATH_MAX, TEST_PATH "/%s", filename);
if (n < 0 || n >= PATH_MAX) {
return 0;
}
file = fopen(path, "r");
if (!file) {
return 0;

View File

@@ -803,6 +803,15 @@ static void r300_blit(struct pipe_context *pipe,
(struct pipe_framebuffer_state*)r300->fb_state.state;
struct pipe_blit_info info = *blit;
/* The driver supports sRGB textures but not framebuffers. Blitting
* from sRGB to sRGB should be the same as blitting from linear
* to linear, so use that, This avoids incorrect linearization.
*/
if (util_format_is_srgb(info.src.format)) {
info.src.format = util_format_linear(info.src.format);
info.dst.format = util_format_linear(info.dst.format);
}
/* MSAA resolve. */
if (info.src.resource->nr_samples > 1 &&
!util_format_is_depth_or_stencil(info.src.resource->format)) {

View File

@@ -170,24 +170,10 @@ static void get_external_state(
}
state->unit[i].non_normalized_coords = !s->state.normalized_coords;
state->unit[i].convert_unorm_to_snorm =
v->base.format == PIPE_FORMAT_RGTC1_SNORM ||
v->base.format == PIPE_FORMAT_LATC1_SNORM;
state->unit[i].convert_unorm_to_snorm = 0;
/* Pass texture swizzling to the compiler, some lowering passes need it. */
if (v->base.format == PIPE_FORMAT_RGTC1_SNORM ||
v->base.format == PIPE_FORMAT_LATC1_SNORM) {
unsigned char swizzle[4];
util_format_compose_swizzles(
util_format_description(v->base.format)->swizzle,
v->swizzle,
swizzle);
state->unit[i].texture_swizzle =
RC_MAKE_SWIZZLE(swizzle[0], swizzle[1],
swizzle[2], swizzle[3]);
} else if (state->unit[i].compare_mode_enabled) {
if (state->unit[i].compare_mode_enabled) {
state->unit[i].texture_swizzle =
RC_MAKE_SWIZZLE(v->swizzle[0], v->swizzle[1],
v->swizzle[2], v->swizzle[3]);

View File

@@ -169,20 +169,21 @@ uint32_t r300_translate_texformat(enum pipe_format format,
/* Add swizzling. */
/* The RGTC1_SNORM and LATC1_SNORM swizzle is done in the shader. */
if (format != PIPE_FORMAT_RGTC1_SNORM &&
if (util_format_is_compressed(format) &&
dxtc_swizzle &&
format != PIPE_FORMAT_RGTC2_UNORM &&
format != PIPE_FORMAT_RGTC2_SNORM &&
format != PIPE_FORMAT_LATC2_UNORM &&
format != PIPE_FORMAT_LATC2_SNORM &&
format != PIPE_FORMAT_RGTC1_UNORM &&
format != PIPE_FORMAT_RGTC1_SNORM &&
format != PIPE_FORMAT_LATC1_UNORM &&
format != PIPE_FORMAT_LATC1_SNORM) {
if (util_format_is_compressed(format) &&
dxtc_swizzle &&
format != PIPE_FORMAT_RGTC2_UNORM &&
format != PIPE_FORMAT_RGTC2_SNORM &&
format != PIPE_FORMAT_LATC2_UNORM &&
format != PIPE_FORMAT_LATC2_SNORM) {
result |= r300_get_swizzle_combined(desc->swizzle, swizzle_view,
TRUE);
} else {
result |= r300_get_swizzle_combined(desc->swizzle, swizzle_view,
FALSE);
}
result |= r300_get_swizzle_combined(desc->swizzle, swizzle_view,
TRUE);
} else {
result |= r300_get_swizzle_combined(desc->swizzle, swizzle_view,
FALSE);
}
/* S3TC formats. */
@@ -213,6 +214,7 @@ uint32_t r300_translate_texformat(enum pipe_format format,
switch (format) {
case PIPE_FORMAT_RGTC1_SNORM:
case PIPE_FORMAT_LATC1_SNORM:
result |= sign_bit[0];
case PIPE_FORMAT_LATC1_UNORM:
case PIPE_FORMAT_RGTC1_UNORM:
return R500_TX_FORMAT_ATI1N | result;
@@ -936,14 +938,16 @@ static void r300_texture_setup_fb_state(struct r300_surface *surf)
surf->pitch_zmask = tex->tex.zmask_stride_in_pixels[level];
surf->pitch_hiz = tex->tex.hiz_stride_in_pixels[level];
} else {
enum pipe_format format = util_format_linear(surf->base.format);
surf->pitch =
stride |
r300_translate_colorformat(surf->base.format) |
r300_translate_colorformat(format) |
R300_COLOR_TILE(tex->tex.macrotile[level]) |
R300_COLOR_MICROTILE(tex->tex.microtile);
surf->format = r300_translate_out_fmt(surf->base.format);
surf->format = r300_translate_out_fmt(format);
surf->colormask_swizzle =
r300_translate_colormask_swizzle(surf->base.format);
r300_translate_colormask_swizzle(format);
surf->pitch_cmask = tex->tex.cmask_stride_in_pixels;
}
}

View File

@@ -6071,7 +6071,7 @@ static int tgsi_ucmp(struct r600_shader_ctx *ctx)
continue;
memset(&alu, 0, sizeof(struct r600_bytecode_alu));
alu.op = ALU_OP3_CNDGE_INT;
alu.op = ALU_OP3_CNDE_INT;
r600_bytecode_src(&alu.src[0], &ctx->src[0], i);
r600_bytecode_src(&alu.src[1], &ctx->src[2], i);
r600_bytecode_src(&alu.src[2], &ctx->src[1], i);

View File

@@ -616,6 +616,8 @@ public:
unsigned num_slots;
bool uses_mova_gpr;
bool r6xx_gpr_index_workaround;
bool stack_workaround_8xx;
bool stack_workaround_9xx;

View File

@@ -38,6 +38,18 @@
namespace r600_sb {
void bc_finalizer::insert_rv6xx_load_ar_workaround(alu_group_node *b4) {
alu_group_node *g = sh.create_alu_group();
alu_node *a = sh.create_alu();
a->bc.set_op(ALU_OP0_NOP);
a->bc.last = 1;
g->push_back(a);
b4->insert_before(g);
}
int bc_finalizer::run() {
run_on(sh.root);
@@ -46,22 +58,15 @@ int bc_finalizer::run() {
for (regions_vec::reverse_iterator I = rv.rbegin(), E = rv.rend(); I != E;
++I) {
region_node *r = *I;
bool is_if = false;
assert(r);
assert(r->first);
if (r->first->is_container()) {
container_node *repdep1 = static_cast<container_node*>(r->first);
assert(repdep1->is_depart() || repdep1->is_repeat());
if_node *n_if = static_cast<if_node*>(repdep1->first);
if (n_if && n_if->is_if())
is_if = true;
}
bool loop = r->is_loop();
if (is_if)
finalize_if(r);
else
if (loop)
finalize_loop(r);
else
finalize_if(r);
r->expand();
}
@@ -117,35 +122,20 @@ int bc_finalizer::run() {
void bc_finalizer::finalize_loop(region_node* r) {
update_nstack(r);
cf_node *loop_start = sh.create_cf(CF_OP_LOOP_START_DX10);
cf_node *loop_end = sh.create_cf(CF_OP_LOOP_END);
bool has_instr = false;
if (!r->is_loop()) {
for (depart_vec::iterator I = r->departs.begin(), E = r->departs.end();
I != E; ++I) {
depart_node *dep = *I;
if (!dep->empty()) {
has_instr = true;
break;
}
}
} else
has_instr = true;
if (has_instr) {
loop_start->jump_after(loop_end);
loop_end->jump_after(loop_start);
}
loop_start->jump_after(loop_end);
loop_end->jump_after(loop_start);
for (depart_vec::iterator I = r->departs.begin(), E = r->departs.end();
I != E; ++I) {
depart_node *dep = *I;
if (has_instr) {
cf_node *loop_break = sh.create_cf(CF_OP_LOOP_BREAK);
loop_break->jump(loop_end);
dep->push_back(loop_break);
}
cf_node *loop_break = sh.create_cf(CF_OP_LOOP_BREAK);
loop_break->jump(loop_end);
dep->push_back(loop_break);
dep->expand();
}
@@ -161,10 +151,8 @@ void bc_finalizer::finalize_loop(region_node* r) {
rep->expand();
}
if (has_instr) {
r->push_front(loop_start);
r->push_back(loop_end);
}
r->push_front(loop_start);
r->push_back(loop_end);
}
void bc_finalizer::finalize_if(region_node* r) {
@@ -235,12 +223,12 @@ void bc_finalizer::finalize_if(region_node* r) {
}
void bc_finalizer::run_on(container_node* c) {
node *prev_node = NULL;
for (node_iterator I = c->begin(), E = c->end(); I != E; ++I) {
node *n = *I;
if (n->is_alu_group()) {
finalize_alu_group(static_cast<alu_group_node*>(n));
finalize_alu_group(static_cast<alu_group_node*>(n), prev_node);
} else {
if (n->is_alu_clause()) {
cf_node *c = static_cast<cf_node*>(n);
@@ -275,17 +263,22 @@ void bc_finalizer::run_on(container_node* c) {
if (n->is_container())
run_on(static_cast<container_node*>(n));
}
prev_node = n;
}
}
void bc_finalizer::finalize_alu_group(alu_group_node* g) {
void bc_finalizer::finalize_alu_group(alu_group_node* g, node *prev_node) {
alu_node *last = NULL;
alu_group_node *prev_g = NULL;
bool add_nop = false;
if (prev_node && prev_node->is_alu_group()) {
prev_g = static_cast<alu_group_node*>(prev_node);
}
for (node_iterator I = g->begin(), E = g->end(); I != E; ++I) {
alu_node *n = static_cast<alu_node*>(*I);
unsigned slot = n->bc.slot;
value *d = n->dst.empty() ? NULL : n->dst[0];
if (d && d->is_special_reg()) {
@@ -323,17 +316,22 @@ void bc_finalizer::finalize_alu_group(alu_group_node* g) {
update_ngpr(n->bc.dst_gpr);
finalize_alu_src(g, n);
add_nop |= finalize_alu_src(g, n, prev_g);
last = n;
}
if (add_nop) {
if (sh.get_ctx().r6xx_gpr_index_workaround) {
insert_rv6xx_load_ar_workaround(g);
}
}
last->bc.last = 1;
}
void bc_finalizer::finalize_alu_src(alu_group_node* g, alu_node* a) {
bool bc_finalizer::finalize_alu_src(alu_group_node* g, alu_node* a, alu_group_node *prev) {
vvec &sv = a->src;
bool add_nop = false;
FBC_DUMP(
sblog << "finalize_alu_src: ";
dump::dump_op(a);
@@ -360,6 +358,15 @@ void bc_finalizer::finalize_alu_src(alu_group_node* g, alu_node* a) {
if (!v->rel->is_const()) {
src.rel = 1;
update_ngpr(v->array->gpr.sel() + v->array->array_size -1);
if (prev && !add_nop) {
for (node_iterator pI = prev->begin(), pE = prev->end(); pI != pE; ++pI) {
alu_node *pn = static_cast<alu_node*>(*pI);
if (pn->bc.dst_gpr == src.sel) {
add_nop = true;
break;
}
}
}
} else
src.rel = 0;
@@ -417,11 +424,23 @@ void bc_finalizer::finalize_alu_src(alu_group_node* g, alu_node* a) {
assert(!"unknown value kind");
break;
}
if (prev && !add_nop) {
for (node_iterator pI = prev->begin(), pE = prev->end(); pI != pE; ++pI) {
alu_node *pn = static_cast<alu_node*>(*pI);
if (pn->bc.dst_rel) {
if (pn->bc.dst_gpr == src.sel) {
add_nop = true;
break;
}
}
}
}
}
while (si < 3) {
a->bc.src[si++].sel = 0;
}
return add_nop;
}
void bc_finalizer::copy_fetch_src(fetch_node &dst, fetch_node &src, unsigned arg_start)

View File

@@ -758,6 +758,8 @@ int bc_parser::prepare_loop(cf_node* c) {
c->insert_before(reg);
rep->move(c, end->next);
reg->src_loop = true;
loop_stack.push(reg);
return 0;
}

View File

@@ -61,6 +61,8 @@ int sb_context::init(r600_isa *isa, sb_hw_chip chip, sb_hw_class cclass) {
uses_mova_gpr = is_r600() && chip != HW_CHIP_RV670;
r6xx_gpr_index_workaround = is_r600() && chip != HW_CHIP_RV670 && chip != HW_CHIP_RS780 && chip != HW_CHIP_RS880;
switch (chip) {
case HW_CHIP_RV610:
case HW_CHIP_RS780:

View File

@@ -115,13 +115,13 @@ void if_conversion::convert_kill_instructions(region_node *r,
bool if_conversion::check_and_convert(region_node *r) {
depart_node *nd1 = static_cast<depart_node*>(r->first);
if (!nd1->is_depart())
if (!nd1->is_depart() || nd1->target != r)
return false;
if_node *nif = static_cast<if_node*>(nd1->first);
if (!nif->is_if())
return false;
depart_node *nd2 = static_cast<depart_node*>(nif->first);
if (!nd2->is_depart())
if (!nd2->is_depart() || nd2->target != r)
return false;
value* &em = nif->cond;

View File

@@ -1089,7 +1089,8 @@ typedef std::vector<repeat_node*> repeat_vec;
class region_node : public container_node {
protected:
region_node(unsigned id) : container_node(NT_REGION, NST_LIST), region_id(id),
loop_phi(), phi(), vars_defined(), departs(), repeats() {}
loop_phi(), phi(), vars_defined(), departs(), repeats(), src_loop()
{}
public:
unsigned region_id;
@@ -1101,12 +1102,16 @@ public:
depart_vec departs;
repeat_vec repeats;
// true if region was created for loop in the parser, sometimes repeat_node
// may be optimized away so we need to remember this information
bool src_loop;
virtual bool accept(vpass &p, bool enter);
unsigned dep_count() { return departs.size(); }
unsigned rep_count() { return repeats.size() + 1; }
bool is_loop() { return !repeats.empty(); }
bool is_loop() { return src_loop || !repeats.empty(); }
container_node* get_entry_code_location() {
node *p = first;

View File

@@ -695,8 +695,9 @@ public:
void run_on(container_node *c);
void finalize_alu_group(alu_group_node *g);
void finalize_alu_src(alu_group_node *g, alu_node *a);
void insert_rv6xx_load_ar_workaround(alu_group_node *b4);
void finalize_alu_group(alu_group_node *g, node *prev_node);
bool finalize_alu_src(alu_group_node *g, alu_node *a, alu_group_node *prev_node);
void emit_set_grad(fetch_node* f);
void finalize_fetch(fetch_node *f);

View File

@@ -1527,6 +1527,9 @@ bool post_scheduler::check_copy(node *n) {
if (!s->is_prealloc()) {
recolor_local(s);
if (!s->chunk || s->chunk != d->chunk)
return false;
}
if (s->gpr == d->gpr) {

View File

@@ -294,6 +294,7 @@ struct r600_so_target {
/* The buffer where BUFFER_FILLED_SIZE is stored. */
struct r600_resource *buf_filled_size;
unsigned buf_filled_size_offset;
bool buf_filled_size_valid;
unsigned stride_in_dw;
};

View File

@@ -237,7 +237,7 @@ static void r600_emit_streamout_begin(struct r600_common_context *rctx, struct r
}
}
if (rctx->streamout.append_bitmask & (1 << i)) {
if (rctx->streamout.append_bitmask & (1 << i) && t[i]->buf_filled_size_valid) {
uint64_t va = t[i]->buf_filled_size->gpu_address +
t[i]->buf_filled_size_offset;
@@ -302,6 +302,8 @@ void r600_emit_streamout_end(struct r600_common_context *rctx)
* buffer bound. This ensures that the primitives-emitted query
* won't increment. */
r600_write_context_reg(cs, R_028AD0_VGT_STRMOUT_BUFFER_SIZE_0 + 16*i, 0);
t[i]->buf_filled_size_valid = true;
}
rctx->streamout.begin_emitted = false;

View File

@@ -80,10 +80,6 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned type)
sprintf(Str, "%1d", llvm_type);
LLVMAddTargetDependentFunctionAttr(F, "ShaderType", Str);
if (type != TGSI_PROCESSOR_COMPUTE) {
LLVMAddTargetDependentFunctionAttr(F, "unsafe-fp-math", "true");
}
}
static void init_r600_target()

View File

@@ -748,7 +748,7 @@ static void txp_fetch_args(
const struct tgsi_full_instruction * inst = emit_data->inst;
LLVMValueRef src_w;
unsigned chan;
LLVMValueRef coords[4];
LLVMValueRef coords[5];
emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4);
src_w = lp_build_emit_fetch(bld_base, emit_data->inst, 0, TGSI_CHAN_W);

View File

@@ -228,14 +228,14 @@ static LLVMValueRef get_instance_index_for_fetch(
LLVMValueRef result = LLVMGetParam(radeon_bld->main_fn,
si_shader_ctx->param_instance_id);
result = LLVMBuildAdd(gallivm->builder, result, LLVMGetParam(
radeon_bld->main_fn, SI_PARAM_START_INSTANCE), "");
/* The division must be done before START_INSTANCE is added. */
if (divisor > 1)
result = LLVMBuildUDiv(gallivm->builder, result,
lp_build_const_int32(gallivm, divisor), "");
return result;
return LLVMBuildAdd(gallivm->builder, result, LLVMGetParam(
radeon_bld->main_fn, SI_PARAM_START_INSTANCE), "");
}
static void declare_input_vs(
@@ -590,8 +590,11 @@ static void declare_system_value(
break;
case TGSI_SEMANTIC_VERTEXID:
value = LLVMGetParam(radeon_bld->main_fn,
si_shader_ctx->param_vertex_id);
value = LLVMBuildAdd(gallivm->builder,
LLVMGetParam(radeon_bld->main_fn,
si_shader_ctx->param_vertex_id),
LLVMGetParam(radeon_bld->main_fn,
SI_PARAM_BASE_VERTEX), "");
break;
case TGSI_SEMANTIC_SAMPLEID:
@@ -1502,7 +1505,7 @@ static void tex_fetch_args(
const struct tgsi_full_instruction * inst = emit_data->inst;
unsigned opcode = inst->Instruction.Opcode;
unsigned target = inst->Texture.Texture;
LLVMValueRef coords[4];
LLVMValueRef coords[5];
LLVMValueRef address[16];
int ref_pos;
unsigned num_coords = tgsi_util_get_texture_coord_dim(target, &ref_pos);

View File

@@ -697,12 +697,16 @@ static void si_delete_rs_state(struct pipe_context *ctx, void *state)
*/
static void si_update_dsa_stencil_ref(struct si_context *sctx)
{
struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
struct si_pm4_state *pm4;
struct pipe_stencil_ref *ref = &sctx->stencil_ref;
struct si_state_dsa *dsa = sctx->queued.named.dsa;
struct si_state_dsa *dsa = sctx->queued.named.dsa;
if (pm4 == NULL)
return;
if (!dsa)
return;
pm4 = CALLOC_STRUCT(si_pm4_state);
if (pm4 == NULL)
return;
si_pm4_set_reg(pm4, R_028430_DB_STENCILREFMASK,
S_028430_STENCILTESTVAL(ref->ref_value[0]) |
@@ -3281,8 +3285,10 @@ void si_init_config(struct si_context *sctx)
break;
}
/* Always use the default config when all backends are enabled. */
if (rb_mask && util_bitcount(rb_mask) >= num_rb) {
/* Always use the default config when all backends are enabled
* (or when we failed to determine the enabled backends).
*/
if (!rb_mask || util_bitcount(rb_mask) >= num_rb) {
si_pm4_set_reg(pm4, R_028350_PA_SC_RASTER_CONFIG,
raster_config);
} else {

View File

@@ -544,9 +544,11 @@ bcolor:
}
}
if (j == vsinfo->num_outputs) {
/* No corresponding output found, load defaults into input */
tmp |= S_028644_OFFSET(0x20);
if (j == vsinfo->num_outputs && !G_028644_PT_SPRITE_TEX(tmp)) {
/* No corresponding output found, load defaults into input.
* Don't set any other bits.
* (FLAT_SHADE=1 completely changes behavior) */
tmp = S_028644_OFFSET(0x20);
}
si_pm4_set_reg(pm4,

View File

@@ -302,6 +302,9 @@ NineAdapter9_CheckDeviceFormat( struct NineAdapter9 *This,
return D3DERR_NOTAVAILABLE;
}
/* we support ATI1 and ATI2 hack only for 2D textures */
if (RType != D3DRTYPE_TEXTURE && (CheckFormat == D3DFMT_ATI1 || CheckFormat == D3DFMT_ATI2))
return D3DERR_NOTAVAILABLE;
/* if (Usage & D3DUSAGE_NONSECURE) { don't know the implications of this } */
/* if (Usage & D3DUSAGE_SOFTWAREPROCESSING) { we can always support this } */
@@ -549,7 +552,7 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
D3DPMISCCAPS_CULLCCW |
D3DPMISCCAPS_COLORWRITEENABLE |
D3DPMISCCAPS_CLIPPLANESCALEDPOINTS |
D3DPMISCCAPS_CLIPTLVERTS |
/*D3DPMISCCAPS_CLIPTLVERTS |*/
D3DPMISCCAPS_TSSARGTEMP |
D3DPMISCCAPS_BLENDOP |
D3DPIPECAP(INDEP_BLEND_ENABLE, D3DPMISCCAPS_INDEPENDENTWRITEMASKS) |
@@ -560,6 +563,8 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
D3DPIPECAP(MIXED_COLORBUFFER_FORMATS, D3DPMISCCAPS_MRTINDEPENDENTBITDEPTHS) |
D3DPMISCCAPS_MRTPOSTPIXELSHADERBLENDING |
/*D3DPMISCCAPS_FOGVERTEXCLAMPED*/0;
if (!screen->get_param(screen, PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION))
pCaps->PrimitiveMiscCaps |= D3DPMISCCAPS_CLIPTLVERTS;
pCaps->RasterCaps =
D3DPIPECAP(ANISOTROPIC_FILTER, D3DPRASTERCAPS_ANISOTROPY) |

View File

@@ -436,14 +436,21 @@ NineBaseTexture9_CreatePipeResource( struct NineBaseTexture9 *This,
return D3D_OK;
}
#define SWIZZLE_TO_REPLACE(s) (s == UTIL_FORMAT_SWIZZLE_0 || \
s == UTIL_FORMAT_SWIZZLE_1 || \
s == UTIL_FORMAT_SWIZZLE_NONE)
HRESULT
NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 *This,
const int sRGB )
{
const struct util_format_description *desc;
struct pipe_context *pipe = This->pipe;
struct pipe_screen *screen = pipe->screen;
struct pipe_resource *resource = This->base.resource;
struct pipe_sampler_view templ;
enum pipe_format srgb_format;
unsigned i;
uint8_t swizzle[4];
DBG("This=%p sRGB=%d\n", This, sRGB);
@@ -452,6 +459,9 @@ NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 *This,
if (unlikely(This->format == D3DFMT_NULL))
return D3D_OK;
NineBaseTexture9_Dump(This);
/* hack due to incorrect POOL_MANAGED handling */
NineBaseTexture9_GenerateMipSubLevels(This);
resource = This->base.resource;
}
assert(resource);
@@ -463,25 +473,49 @@ NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 *This,
swizzle[3] = PIPE_SWIZZLE_ALPHA;
desc = util_format_description(resource->format);
if (desc->colorspace == UTIL_FORMAT_COLORSPACE_ZS) {
/* ZZZ1 -> 0Z01 (see end of docs/source/tgsi.rst)
* XXX: but it's wrong
swizzle[0] = PIPE_SWIZZLE_ZERO;
swizzle[2] = PIPE_SWIZZLE_ZERO; */
} else
if (desc->swizzle[0] == UTIL_FORMAT_SWIZZLE_X &&
desc->swizzle[3] == UTIL_FORMAT_SWIZZLE_1) {
/* R001/RG01 -> R111/RG11 */
if (desc->swizzle[1] == UTIL_FORMAT_SWIZZLE_0)
swizzle[1] = PIPE_SWIZZLE_ONE;
if (desc->swizzle[2] == UTIL_FORMAT_SWIZZLE_0)
swizzle[2] = PIPE_SWIZZLE_ONE;
/* msdn doc is incomplete here and wrong.
* The only formats that can be read directly here
* are DF16, DF24 and INTZ.
* Tested on win the swizzle is
* R = depth, G = B = 0, A = 1 for DF16 and DF24
* R = G = B = A = depth for INTZ
* For the other ZS formats that can't be read directly
* but can be used as shadow map, the result is duplicated on
* all channel */
if (This->format == D3DFMT_DF16 ||
This->format == D3DFMT_DF24) {
swizzle[1] = PIPE_SWIZZLE_ZERO;
swizzle[2] = PIPE_SWIZZLE_ZERO;
swizzle[3] = PIPE_SWIZZLE_ONE;
} else {
swizzle[1] = PIPE_SWIZZLE_RED;
swizzle[2] = PIPE_SWIZZLE_RED;
swizzle[3] = PIPE_SWIZZLE_RED;
}
} else if (resource->format != PIPE_FORMAT_A8_UNORM &&
resource->format != PIPE_FORMAT_RGTC1_UNORM) {
/* exceptions:
* A8 should have 0.0 as default values for RGB.
* ATI1/RGTC1 should be r 0 0 1 (tested on windows).
* It is already what gallium does. All the other ones
* should have 1.0 for non-defined values */
for (i = 0; i < 4; i++) {
if (SWIZZLE_TO_REPLACE(desc->swizzle[i]))
swizzle[i] = PIPE_SWIZZLE_ONE;
}
}
/* but 000A remains unchanged */
templ.format = sRGB ? util_format_srgb(resource->format) : resource->format;
/* if requested and supported, convert to the sRGB format */
srgb_format = util_format_srgb(resource->format);
if (sRGB && srgb_format != PIPE_FORMAT_NONE &&
screen->is_format_supported(screen, srgb_format,
resource->target, 0, resource->bind))
templ.format = srgb_format;
else
templ.format = resource->format;
templ.u.tex.first_layer = 0;
templ.u.tex.last_layer = (resource->target == PIPE_TEXTURE_CUBE) ?
5 : (This->base.info.depth0 - 1);
templ.u.tex.last_layer = resource->target == PIPE_TEXTURE_3D ?
resource->depth0 - 1 : resource->array_size - 1;
templ.u.tex.first_level = 0;
templ.u.tex.last_level = resource->last_level;
templ.swizzle_r = swizzle[0];

View File

@@ -38,6 +38,8 @@ NineCubeTexture9_ctor( struct NineCubeTexture9 *This,
HANDLE *pSharedHandle )
{
struct pipe_resource *info = &This->base.base.info;
struct pipe_screen *screen = pParams->device->screen;
enum pipe_format pf;
unsigned i;
D3DSURFACE_DESC sfdesc;
HRESULT hr;
@@ -55,9 +57,19 @@ NineCubeTexture9_ctor( struct NineCubeTexture9 *This,
if (Usage & D3DUSAGE_AUTOGENMIPMAP)
Levels = 0;
pf = d3d9_to_pipe_format(Format);
if (pf == PIPE_FORMAT_NONE ||
!screen->is_format_supported(screen, pf, PIPE_TEXTURE_CUBE, 0, PIPE_BIND_SAMPLER_VIEW)) {
return D3DERR_INVALIDCALL;
}
/* We support ATI1 and ATI2 hacks only for 2D textures */
if (Format == D3DFMT_ATI1 || Format == D3DFMT_ATI2)
return D3DERR_INVALIDCALL;
info->screen = pParams->device->screen;
info->target = PIPE_TEXTURE_CUBE;
info->format = d3d9_to_pipe_format(Format);
info->format = pf;
info->width0 = EdgeLength;
info->height0 = EdgeLength;
info->depth0 = 1;
@@ -146,7 +158,7 @@ NineCubeTexture9_GetLevelDesc( struct NineCubeTexture9 *This,
user_assert(Level == 0 || !(This->base.base.usage & D3DUSAGE_AUTOGENMIPMAP),
D3DERR_INVALIDCALL);
*pDesc = This->surfaces[Level]->desc;
*pDesc = This->surfaces[Level * 6]->desc;
return D3D_OK;
}

View File

@@ -62,7 +62,7 @@ NineDevice9_SetDefaultState( struct NineDevice9 *This, boolean is_reset )
assert(!This->is_recording);
nine_state_set_defaults(&This->state, &This->caps, is_reset);
nine_state_set_defaults(This, &This->caps, is_reset);
This->state.viewport.X = 0;
This->state.viewport.Y = 0;
@@ -109,7 +109,7 @@ NineDevice9_RestoreNonCSOState( struct NineDevice9 *This, unsigned mask )
cb.buffer = This->constbuf_vs;
cb.user_buffer = NULL;
}
cb.buffer_size = This->constbuf_vs->width0;
cb.buffer_size = This->vs_const_size;
pipe->set_constant_buffer(pipe, PIPE_SHADER_VERTEX, 0, &cb);
if (This->prefer_user_constbuf) {
@@ -117,7 +117,7 @@ NineDevice9_RestoreNonCSOState( struct NineDevice9 *This, unsigned mask )
} else {
cb.buffer = This->constbuf_ps;
}
cb.buffer_size = This->constbuf_ps->width0;
cb.buffer_size = This->ps_const_size;
pipe->set_constant_buffer(pipe, PIPE_SHADER_FRAGMENT, 0, &cb);
}
@@ -262,10 +262,14 @@ NineDevice9_ctor( struct NineDevice9 *This,
This->max_ps_const_f = max_const_ps -
(NINE_MAX_CONST_I + NINE_MAX_CONST_B / 4);
This->vs_const_size = max_const_vs * sizeof(float[4]);
This->ps_const_size = max_const_ps * sizeof(float[4]);
/* Include space for I,B constants for user constbuf. */
This->state.vs_const_f = CALLOC(NINE_MAX_CONST_ALL, sizeof(float[4]));
This->state.ps_const_f = CALLOC(NINE_MAX_CONST_ALL, sizeof(float[4]));
if (!This->state.vs_const_f || !This->state.ps_const_f)
This->state.vs_const_f = CALLOC(This->vs_const_size, 1);
This->state.ps_const_f = CALLOC(This->ps_const_size, 1);
This->state.vs_lconstf_temp = CALLOC(This->vs_const_size,1);
if (!This->state.vs_const_f || !This->state.ps_const_f ||
!This->state.vs_lconstf_temp)
return E_OUTOFMEMORY;
if (strstr(pScreen->get_name(pScreen), "AMD") ||
@@ -283,23 +287,16 @@ NineDevice9_ctor( struct NineDevice9 *This,
tmpl.bind = PIPE_BIND_CONSTANT_BUFFER;
tmpl.flags = 0;
tmpl.width0 = max_const_vs * sizeof(float[4]);
tmpl.width0 = This->vs_const_size;
This->constbuf_vs = pScreen->resource_create(pScreen, &tmpl);
tmpl.width0 = max_const_ps * sizeof(float[4]);
tmpl.width0 = This->ps_const_size;
This->constbuf_ps = pScreen->resource_create(pScreen, &tmpl);
if (!This->constbuf_vs || !This->constbuf_ps)
return E_OUTOFMEMORY;
}
This->vs_bool_true = pScreen->get_shader_param(pScreen,
PIPE_SHADER_VERTEX,
PIPE_SHADER_CAP_INTEGERS) ? 0xFFFFFFFF : fui(1.0f);
This->ps_bool_true = pScreen->get_shader_param(pScreen,
PIPE_SHADER_FRAGMENT,
PIPE_SHADER_CAP_INTEGERS) ? 0xFFFFFFFF : fui(1.0f);
/* Allocate upload helper for drivers that suck (from st pov ;). */
{
unsigned bind = 0;
@@ -314,6 +311,8 @@ NineDevice9_ctor( struct NineDevice9 *This,
}
This->driver_caps.window_space_position_support = GET_PCAP(TGSI_VS_WINDOW_SPACE_POSITION);
This->driver_caps.vs_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_VERTEX, PIPE_SHADER_CAP_INTEGERS);
This->driver_caps.ps_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_FRAGMENT, PIPE_SHADER_CAP_INTEGERS);
nine_ff_init(This); /* initialize fixed function code */
@@ -350,6 +349,7 @@ NineDevice9_dtor( struct NineDevice9 *This )
pipe_resource_reference(&This->constbuf_ps, NULL);
FREE(This->state.vs_const_f);
FREE(This->state.ps_const_f);
FREE(This->state.vs_lconstf_temp);
if (This->swapchains) {
for (i = 0; i < This->nswapchains; ++i)
@@ -2938,6 +2938,7 @@ NineDevice9_SetVertexShaderConstantI( struct NineDevice9 *This,
UINT Vector4iCount )
{
struct nine_state *state = This->update;
int i;
DBG("This=%p StartRegister=%u pConstantData=%p Vector4iCount=%u\n",
This, StartRegister, pConstantData, Vector4iCount);
@@ -2946,9 +2947,18 @@ NineDevice9_SetVertexShaderConstantI( struct NineDevice9 *This,
user_assert(StartRegister + Vector4iCount <= NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(&state->vs_const_i[StartRegister][0],
pConstantData,
Vector4iCount * sizeof(state->vs_const_i[0]));
if (This->driver_caps.vs_integer) {
memcpy(&state->vs_const_i[StartRegister][0],
pConstantData,
Vector4iCount * sizeof(state->vs_const_i[0]));
} else {
for (i = 0; i < Vector4iCount; i++) {
state->vs_const_i[StartRegister+i][0] = fui((float)(pConstantData[4*i]));
state->vs_const_i[StartRegister+i][1] = fui((float)(pConstantData[4*i+1]));
state->vs_const_i[StartRegister+i][2] = fui((float)(pConstantData[4*i+2]));
state->vs_const_i[StartRegister+i][3] = fui((float)(pConstantData[4*i+3]));
}
}
state->changed.vs_const_i |= ((1 << Vector4iCount) - 1) << StartRegister;
state->changed.group |= NINE_STATE_VS_CONST;
@@ -2963,14 +2973,24 @@ NineDevice9_GetVertexShaderConstantI( struct NineDevice9 *This,
UINT Vector4iCount )
{
const struct nine_state *state = &This->state;
int i;
user_assert(StartRegister < NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(StartRegister + Vector4iCount <= NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(pConstantData,
&state->vs_const_i[StartRegister][0],
Vector4iCount * sizeof(state->vs_const_i[0]));
if (This->driver_caps.vs_integer) {
memcpy(pConstantData,
&state->vs_const_i[StartRegister][0],
Vector4iCount * sizeof(state->vs_const_i[0]));
} else {
for (i = 0; i < Vector4iCount; i++) {
pConstantData[4*i] = (int32_t) uif(state->vs_const_i[StartRegister+i][0]);
pConstantData[4*i+1] = (int32_t) uif(state->vs_const_i[StartRegister+i][1]);
pConstantData[4*i+2] = (int32_t) uif(state->vs_const_i[StartRegister+i][2]);
pConstantData[4*i+3] = (int32_t) uif(state->vs_const_i[StartRegister+i][3]);
}
}
return D3D_OK;
}
@@ -2982,6 +3002,8 @@ NineDevice9_SetVertexShaderConstantB( struct NineDevice9 *This,
UINT BoolCount )
{
struct nine_state *state = This->update;
int i;
uint32_t bool_true = This->driver_caps.vs_integer ? 0xFFFFFFFF : fui(1.0f);
DBG("This=%p StartRegister=%u pConstantData=%p BoolCount=%u\n",
This, StartRegister, pConstantData, BoolCount);
@@ -2990,9 +3012,8 @@ NineDevice9_SetVertexShaderConstantB( struct NineDevice9 *This,
user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(&state->vs_const_b[StartRegister],
pConstantData,
BoolCount * sizeof(state->vs_const_b[0]));
for (i = 0; i < BoolCount; i++)
state->vs_const_b[StartRegister + i] = pConstantData[i] ? bool_true : 0;
state->changed.vs_const_b |= ((1 << BoolCount) - 1) << StartRegister;
state->changed.group |= NINE_STATE_VS_CONST;
@@ -3007,14 +3028,14 @@ NineDevice9_GetVertexShaderConstantB( struct NineDevice9 *This,
UINT BoolCount )
{
const struct nine_state *state = &This->state;
int i;
user_assert(StartRegister < NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(pConstantData,
&state->vs_const_b[StartRegister],
BoolCount * sizeof(state->vs_const_b[0]));
for (i = 0; i < BoolCount; i++)
pConstantData[i] = state->vs_const_b[StartRegister + i] != 0 ? TRUE : FALSE;
return D3D_OK;
}
@@ -3243,6 +3264,7 @@ NineDevice9_SetPixelShaderConstantI( struct NineDevice9 *This,
UINT Vector4iCount )
{
struct nine_state *state = This->update;
int i;
DBG("This=%p StartRegister=%u pConstantData=%p Vector4iCount=%u\n",
This, StartRegister, pConstantData, Vector4iCount);
@@ -3251,10 +3273,18 @@ NineDevice9_SetPixelShaderConstantI( struct NineDevice9 *This,
user_assert(StartRegister + Vector4iCount <= NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(&state->ps_const_i[StartRegister][0],
pConstantData,
Vector4iCount * sizeof(state->ps_const_i[0]));
if (This->driver_caps.ps_integer) {
memcpy(&state->ps_const_i[StartRegister][0],
pConstantData,
Vector4iCount * sizeof(state->ps_const_i[0]));
} else {
for (i = 0; i < Vector4iCount; i++) {
state->ps_const_i[StartRegister+i][0] = fui((float)(pConstantData[4*i]));
state->ps_const_i[StartRegister+i][1] = fui((float)(pConstantData[4*i+1]));
state->ps_const_i[StartRegister+i][2] = fui((float)(pConstantData[4*i+2]));
state->ps_const_i[StartRegister+i][3] = fui((float)(pConstantData[4*i+3]));
}
}
state->changed.ps_const_i |= ((1 << Vector4iCount) - 1) << StartRegister;
state->changed.group |= NINE_STATE_PS_CONST;
@@ -3268,14 +3298,24 @@ NineDevice9_GetPixelShaderConstantI( struct NineDevice9 *This,
UINT Vector4iCount )
{
const struct nine_state *state = &This->state;
int i;
user_assert(StartRegister < NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(StartRegister + Vector4iCount <= NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(pConstantData,
&state->ps_const_i[StartRegister][0],
Vector4iCount * sizeof(state->ps_const_i[0]));
if (This->driver_caps.ps_integer) {
memcpy(pConstantData,
&state->ps_const_i[StartRegister][0],
Vector4iCount * sizeof(state->ps_const_i[0]));
} else {
for (i = 0; i < Vector4iCount; i++) {
pConstantData[4*i] = (int32_t) uif(state->ps_const_i[StartRegister+i][0]);
pConstantData[4*i+1] = (int32_t) uif(state->ps_const_i[StartRegister+i][1]);
pConstantData[4*i+2] = (int32_t) uif(state->ps_const_i[StartRegister+i][2]);
pConstantData[4*i+3] = (int32_t) uif(state->ps_const_i[StartRegister+i][3]);
}
}
return D3D_OK;
}
@@ -3287,6 +3327,8 @@ NineDevice9_SetPixelShaderConstantB( struct NineDevice9 *This,
UINT BoolCount )
{
struct nine_state *state = This->update;
int i;
uint32_t bool_true = This->driver_caps.ps_integer ? 0xFFFFFFFF : fui(1.0f);
DBG("This=%p StartRegister=%u pConstantData=%p BoolCount=%u\n",
This, StartRegister, pConstantData, BoolCount);
@@ -3295,9 +3337,8 @@ NineDevice9_SetPixelShaderConstantB( struct NineDevice9 *This,
user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(&state->ps_const_b[StartRegister],
pConstantData,
BoolCount * sizeof(state->ps_const_b[0]));
for (i = 0; i < BoolCount; i++)
state->ps_const_b[StartRegister + i] = pConstantData[i] ? bool_true : 0;
state->changed.ps_const_b |= ((1 << BoolCount) - 1) << StartRegister;
state->changed.group |= NINE_STATE_PS_CONST;
@@ -3312,14 +3353,14 @@ NineDevice9_GetPixelShaderConstantB( struct NineDevice9 *This,
UINT BoolCount )
{
const struct nine_state *state = &This->state;
int i;
user_assert(StartRegister < NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(pConstantData,
&state->ps_const_b[StartRegister],
BoolCount * sizeof(state->ps_const_b[0]));
for (i = 0; i < BoolCount; i++)
pConstantData[i] = state->ps_const_b[StartRegister + i] ? TRUE : FALSE;
return D3D_OK;
}

View File

@@ -77,10 +77,10 @@ struct NineDevice9
struct pipe_resource *constbuf_vs;
struct pipe_resource *constbuf_ps;
uint16_t vs_const_size;
uint16_t ps_const_size;
uint16_t max_vs_const_f;
uint16_t max_ps_const_f;
uint32_t vs_bool_true;
uint32_t ps_bool_true;
struct gen_mipmap_state *gen_mipmap;
@@ -111,6 +111,8 @@ struct NineDevice9
boolean user_vbufs;
boolean user_ibufs;
boolean window_space_position_support;
boolean vs_integer;
boolean ps_integer;
} driver_caps;
struct u_upload_mgr *upload;

View File

@@ -1151,10 +1151,10 @@ ps_do_ts_op(struct ps_build_ctx *ps, unsigned top, struct ureg_dst dst, struct u
ureg_MUL(ureg, ureg_saturate(dst), ureg_src(tmp), ureg_imm4f(ureg,4.0,4.0,4.0,4.0));
break;
case D3DTOP_MULTIPLYADD:
ureg_MAD(ureg, dst, arg[2], arg[0], arg[1]);
ureg_MAD(ureg, dst, arg[1], arg[2], arg[0]);
break;
case D3DTOP_LERP:
ureg_LRP(ureg, dst, arg[1], arg[2], arg[0]);
ureg_LRP(ureg, dst, arg[0], arg[1], arg[2]);
break;
case D3DTOP_DISABLE:
/* no-op ? */
@@ -1278,6 +1278,8 @@ nine_ff_build_ps(struct NineDevice9 *device, struct nine_ff_ps_key *key)
(key->ts[0].resultarg != 0 /* not current */ ||
key->ts[0].colorop == D3DTOP_DISABLE ||
key->ts[0].alphaop == D3DTOP_DISABLE ||
key->ts[0].colorop == D3DTOP_BLENDCURRENTALPHA ||
key->ts[0].alphaop == D3DTOP_BLENDCURRENTALPHA ||
key->ts[0].colorarg0 == D3DTA_CURRENT ||
key->ts[0].colorarg1 == D3DTA_CURRENT ||
key->ts[0].colorarg2 == D3DTA_CURRENT ||

View File

@@ -185,6 +185,8 @@ d3d9_to_pipe_format(D3DFORMAT format)
case D3DFMT_DXT3: return PIPE_FORMAT_DXT3_RGBA;
case D3DFMT_DXT4: return PIPE_FORMAT_DXT5_RGBA; /* XXX */
case D3DFMT_DXT5: return PIPE_FORMAT_DXT5_RGBA;
case D3DFMT_ATI1: return PIPE_FORMAT_RGTC1_UNORM;
case D3DFMT_ATI2: return PIPE_FORMAT_RGTC2_UNORM;
case D3DFMT_UYVY: return PIPE_FORMAT_UYVY;
case D3DFMT_YUY2: return PIPE_FORMAT_YUYV; /* XXX check */
case D3DFMT_NV12: return PIPE_FORMAT_NV12;
@@ -249,6 +251,8 @@ d3dformat_to_string(D3DFORMAT fmt)
case D3DFMT_DXT3: return "D3DFMT_DXT3";
case D3DFMT_DXT4: return "D3DFMT_DXT4";
case D3DFMT_DXT5: return "D3DFMT_DXT5";
case D3DFMT_ATI1: return "D3DFMT_ATI1";
case D3DFMT_ATI2: return "D3DFMT_ATI2";
case D3DFMT_D16_LOCKABLE: return "D3DFMT_D16_LOCKABLE";
case D3DFMT_D32: return "D3DFMT_D32";
case D3DFMT_D15S1: return "D3DFMT_D15S1";
@@ -279,6 +283,7 @@ d3dformat_to_string(D3DFORMAT fmt)
case D3DFMT_DF16: return "D3DFMT_DF16";
case D3DFMT_DF24: return "D3DFMT_DF24";
case D3DFMT_INTZ: return "D3DFMT_INTZ";
case D3DFMT_NVDB: return "D3DFMT_NVDB";
case D3DFMT_NULL: return "D3DFMT_NULL";
default:
break;

View File

@@ -35,11 +35,6 @@
#define DBG_CHANNEL DBG_SHADER
#if 1
#define NINE_TGSI_LAZY_DEVS /* don't use TGSI_OPCODE_BREAKC */
#endif
#define NINE_TGSI_LAZY_R600 /* don't use TGSI_OPCODE_DP2A */
#define DUMP(args...) _nine_debug_printf(DBG_CHANNEL, NULL, args)
@@ -471,14 +466,14 @@ struct shader_translator
struct ureg_src vFace;
struct ureg_src s;
struct ureg_dst p;
struct ureg_dst a;
struct ureg_dst address;
struct ureg_dst a0;
struct ureg_dst tS[8]; /* texture stage registers */
struct ureg_dst tdst; /* scratch dst if we need extra modifiers */
struct ureg_dst t[5]; /* scratch TEMPs */
struct ureg_src vC[2]; /* PS color in */
struct ureg_src vT[8]; /* PS texcoord in */
struct ureg_dst rL[NINE_MAX_LOOP_DEPTH]; /* loop ctr */
struct ureg_dst aL[NINE_MAX_LOOP_DEPTH]; /* loop ctr ADDR register */
} regs;
unsigned num_temp; /* Elements(regs.r) */
unsigned num_scratch;
@@ -487,6 +482,7 @@ struct shader_translator
unsigned cond_depth;
unsigned loop_labels[NINE_MAX_LOOP_DEPTH];
unsigned cond_labels[NINE_MAX_COND_DEPTH];
boolean loop_or_rep[NINE_MAX_LOOP_DEPTH]; /* true: loop, false: rep */
unsigned *inst_labels; /* LABEL op */
unsigned num_inst_labels;
@@ -664,8 +660,10 @@ static INLINE void
tx_addr_alloc(struct shader_translator *tx, INT idx)
{
assert(idx == 0);
if (ureg_dst_is_undef(tx->regs.a))
tx->regs.a = ureg_DECL_address(tx->ureg);
if (ureg_dst_is_undef(tx->regs.address))
tx->regs.address = ureg_DECL_address(tx->ureg);
if (ureg_dst_is_undef(tx->regs.a0))
tx->regs.a0 = ureg_DECL_temporary(tx->ureg);
}
static INLINE void
@@ -707,7 +705,7 @@ tx_endloop(struct shader_translator *tx)
}
static struct ureg_dst
tx_get_loopctr(struct shader_translator *tx)
tx_get_loopctr(struct shader_translator *tx, boolean loop_or_rep)
{
const unsigned l = tx->loop_depth - 1;
@@ -717,26 +715,32 @@ tx_get_loopctr(struct shader_translator *tx)
return ureg_dst_undef();
}
if (ureg_dst_is_undef(tx->regs.aL[l]))
{
struct ureg_dst rreg = ureg_DECL_local_temporary(tx->ureg);
struct ureg_dst areg = ureg_DECL_address(tx->ureg);
unsigned c;
assert(l % 4 == 0);
for (c = l; c < (l + 4) && c < Elements(tx->regs.aL); ++c) {
tx->regs.rL[c] = ureg_writemask(rreg, 1 << (c & 3));
tx->regs.aL[c] = ureg_writemask(areg, 1 << (c & 3));
}
if (ureg_dst_is_undef(tx->regs.rL[l])) {
/* loop or rep ctr creation */
tx->regs.rL[l] = ureg_DECL_local_temporary(tx->ureg);
tx->loop_or_rep[l] = loop_or_rep;
}
/* loop - rep - endloop - endrep not allowed */
assert(tx->loop_or_rep[l] == loop_or_rep);
return tx->regs.rL[l];
}
static struct ureg_dst
tx_get_aL(struct shader_translator *tx)
static struct ureg_src
tx_get_loopal(struct shader_translator *tx)
{
if (!ureg_dst_is_undef(tx_get_loopctr(tx)))
return tx->regs.aL[tx->loop_depth - 1];
return ureg_dst_undef();
int loop_level = tx->loop_depth - 1;
while (loop_level >= 0) {
/* handle loop - rep - endrep - endloop case */
if (tx->loop_or_rep[loop_level])
/* the value is in the loop counter y component (nine implementation) */
return ureg_scalar(ureg_src(tx->regs.rL[loop_level]), TGSI_SWIZZLE_Y);
loop_level--;
}
DBG("aL counter requested outside of loop\n");
return ureg_src_undef();
}
static INLINE unsigned *
@@ -787,8 +791,12 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
case D3DSPR_ADDR:
assert(!param->rel);
if (IS_VS) {
tx_addr_alloc(tx, param->idx);
src = ureg_src(tx->regs.a);
assert(param->idx == 0);
/* the address register (vs only) must be
* assigned before use */
assert(!ureg_dst_is_undef(tx->regs.a0));
ureg_ARR(ureg, tx->regs.address, ureg_src(tx->regs.a0));
src = ureg_src(tx->regs.address);
} else {
if (tx->version.major < 2 && tx->version.minor < 4) {
/* no subroutines, so should be defined */
@@ -827,6 +835,7 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
src = ureg_src_register(TGSI_FILE_SAMPLER, param->idx);
break;
case D3DSPR_CONST:
assert(!param->rel || IS_VS);
if (param->rel)
tx->indirect_const_access = TRUE;
if (param->rel || !tx_lconstf(tx, &src, param->idx)) {
@@ -834,6 +843,13 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
nine_info_mark_const_f_used(tx->info, param->idx);
src = ureg_src_register(TGSI_FILE_CONSTANT, param->idx);
}
if (!IS_VS && tx->version.major < 2) {
/* ps 1.X clamps constants */
tmp = tx_scratch(tx);
ureg_MIN(ureg, tmp, src, ureg_imm1f(ureg, 1.0f));
ureg_MAX(ureg, tmp, ureg_src(tmp), ureg_imm1f(ureg, -1.0f));
src = ureg_src(tmp);
}
break;
case D3DSPR_CONST2:
case D3DSPR_CONST3:
@@ -843,26 +859,33 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
src = ureg_imm1f(ureg, 0.0f);
break;
case D3DSPR_CONSTINT:
if (param->rel || !tx_lconsti(tx, &src, param->idx)) {
if (!param->rel)
nine_info_mark_const_i_used(tx->info, param->idx);
/* relative adressing only possible for float constants in vs */
assert(!param->rel);
if (!tx_lconsti(tx, &src, param->idx)) {
nine_info_mark_const_i_used(tx->info, param->idx);
src = ureg_src_register(TGSI_FILE_CONSTANT,
tx->info->const_i_base + param->idx);
}
break;
case D3DSPR_CONSTBOOL:
if (param->rel || !tx_lconstb(tx, &src, param->idx)) {
assert(!param->rel);
if (!tx_lconstb(tx, &src, param->idx)) {
char r = param->idx / 4;
char s = param->idx & 3;
if (!param->rel)
nine_info_mark_const_b_used(tx->info, param->idx);
nine_info_mark_const_b_used(tx->info, param->idx);
src = ureg_src_register(TGSI_FILE_CONSTANT,
tx->info->const_b_base + r);
src = ureg_swizzle(src, s, s, s, s);
}
break;
case D3DSPR_LOOP:
src = tx_src_scalar(tx_get_aL(tx));
if (ureg_dst_is_undef(tx->regs.address))
tx->regs.address = ureg_DECL_address(ureg);
if (!tx->native_integers)
ureg_ARR(ureg, tx->regs.address, tx_get_loopal(tx));
else
ureg_UARL(ureg, tx->regs.address, tx_get_loopal(tx));
src = ureg_src(tx->regs.address);
break;
case D3DSPR_MISCTYPE:
switch (param->idx) {
@@ -904,6 +927,25 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
if (param->rel)
src = ureg_src_indirect(src, tx_src_param(tx, param->rel));
switch (param->mod) {
case NINED3DSPSM_DW:
tmp = tx_scratch(tx);
/* NOTE: app is not allowed to read w with this modifier */
ureg_RCP(ureg, ureg_writemask(tmp, NINED3DSP_WRITEMASK_3), src);
ureg_MUL(ureg, tmp, src, ureg_swizzle(ureg_src(tmp), NINE_SWIZZLE4(W,W,W,W)));
src = ureg_src(tmp);
break;
case NINED3DSPSM_DZ:
tmp = tx_scratch(tx);
/* NOTE: app is not allowed to read z with this modifier */
ureg_RCP(ureg, ureg_writemask(tmp, NINED3DSP_WRITEMASK_2), src);
ureg_MUL(ureg, tmp, src, ureg_swizzle(ureg_src(tmp), NINE_SWIZZLE4(Z,Z,Z,Z)));
src = ureg_src(tmp);
break;
default:
break;
}
if (param->swizzle != NINED3DSP_NOSWIZZLE)
src = ureg_swizzle(src,
(param->swizzle >> 0) & 0x3,
@@ -946,7 +988,7 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
break;
case NINED3DSPSM_DZ:
case NINED3DSPSM_DW:
/* handled in instruction */
/* Already handled*/
break;
case NINED3DSPSM_SIGN:
tmp = tx_scratch(tx);
@@ -1001,7 +1043,7 @@ _tx_dst_param(struct shader_translator *tx, const struct sm1_dst_param *param)
dst = ureg_dst(tx->regs.vT[param->idx]);
} else {
tx_addr_alloc(tx, param->idx);
dst = tx->regs.a;
dst = tx->regs.a0;
}
break;
case D3DSPR_RASTOUT:
@@ -1016,13 +1058,13 @@ _tx_dst_param(struct shader_translator *tx, const struct sm1_dst_param *param)
case 1:
if (ureg_dst_is_undef(tx->regs.oFog))
tx->regs.oFog =
ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_FOG, 0);
ureg_saturate(ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_FOG, 0));
dst = tx->regs.oFog;
break;
case 2:
if (ureg_dst_is_undef(tx->regs.oPts))
tx->regs.oPts =
ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_PSIZE, 0);
ureg_saturate(ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_PSIZE, 0));
dst = tx->regs.oPts;
break;
default:
@@ -1163,16 +1205,19 @@ NineTranslateInstruction_Mkxn(struct shader_translator *tx, const unsigned k, co
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst;
struct ureg_src src[2];
struct sm1_src_param *src_mat = &tx->insn.src[1];
unsigned i;
dst = tx_dst_param(tx, &tx->insn.dst[0]);
src[0] = tx_src_param(tx, &tx->insn.src[0]);
src[1] = tx_src_param(tx, &tx->insn.src[1]);
for (i = 0; i < n; i++, src[1].Index++)
for (i = 0; i < n; i++)
{
const unsigned m = (1 << i);
src[1] = tx_src_param(tx, src_mat);
src_mat->idx++;
if (!(dst.WriteMask & m))
continue;
@@ -1329,7 +1374,7 @@ NineTranslateInstruction_Generic(struct shader_translator *);
DECL_SPECIAL(M4x4)
{
return NineTranslateInstruction_Mkxn(tx, 4, 3);
return NineTranslateInstruction_Mkxn(tx, 4, 4);
}
DECL_SPECIAL(M4x3)
@@ -1367,33 +1412,29 @@ DECL_SPECIAL(CND)
struct ureg_dst cgt;
struct ureg_src cnd;
if (tx->insn.coissue && tx->version.major == 1 && tx->version.minor < 4) {
/* the coissue flag was a tip for compilers to advise to
* execute two operations at the same time, in cases
* the two executions had same dst with different channels.
* It has no effect on current hw. However it seems CND
* is affected. The handling of this very specific case
* handled below mimick wine behaviour */
if (tx->insn.coissue && tx->version.major == 1 && tx->version.minor < 4 && tx->insn.dst[0].mask != NINED3DSP_WRITEMASK_3) {
ureg_MOV(tx->ureg,
dst, tx_src_param(tx, &tx->insn.src[1]));
return D3D_OK;
}
cnd = tx_src_param(tx, &tx->insn.src[0]);
#ifdef NINE_TGSI_LAZY_R600
cgt = tx_scratch(tx);
if (tx->version.major == 1 && tx->version.minor < 4) {
cgt.WriteMask = TGSI_WRITEMASK_W;
ureg_SGT(tx->ureg, cgt, cnd, ureg_imm1f(tx->ureg, 0.5f));
cnd = ureg_scalar(cnd, TGSI_SWIZZLE_W);
} else {
ureg_SGT(tx->ureg, cgt, cnd, ureg_imm1f(tx->ureg, 0.5f));
}
ureg_CMP(tx->ureg, dst,
tx_src_param(tx, &tx->insn.src[1]),
tx_src_param(tx, &tx->insn.src[2]), ureg_negate(cnd));
#else
if (tx->version.major == 1 && tx->version.minor < 4)
cnd = ureg_scalar(cnd, TGSI_SWIZZLE_W);
ureg_CND(tx->ureg, dst,
ureg_SGT(tx->ureg, cgt, cnd, ureg_imm1f(tx->ureg, 0.5f));
ureg_CMP(tx->ureg, dst, ureg_negate(ureg_src(cgt)),
tx_src_param(tx, &tx->insn.src[1]),
tx_src_param(tx, &tx->insn.src[2]), cnd);
#endif
tx_src_param(tx, &tx->insn.src[2]));
return D3D_OK;
}
@@ -1427,9 +1468,17 @@ DECL_SPECIAL(CALLNZ)
DECL_SPECIAL(MOV_vs1x)
{
if (tx->insn.dst[0].file == D3DSPR_ADDR) {
ureg_ARL(tx->ureg,
/* Implementation note: We don't write directly
* to the addr register, but to an intermediate
* float register.
* Contrary to the doc, when writing to ADDR here,
* the rounding is not to nearest, but to lowest
* (wine test).
* Since we use ARR next, substract 0.5. */
ureg_SUB(tx->ureg,
tx_dst_param(tx, &tx->insn.dst[0]),
tx_src_param(tx, &tx->insn.src[0]));
tx_src_param(tx, &tx->insn.src[0]),
ureg_imm1f(tx->ureg, 0.5f));
return D3D_OK;
}
return NineTranslateInstruction_Generic(tx);
@@ -1440,46 +1489,36 @@ DECL_SPECIAL(LOOP)
struct ureg_program *ureg = tx->ureg;
unsigned *label;
struct ureg_src src = tx_src_param(tx, &tx->insn.src[1]);
struct ureg_src iter = ureg_scalar(src, TGSI_SWIZZLE_X);
struct ureg_src init = ureg_scalar(src, TGSI_SWIZZLE_Y);
struct ureg_src step = ureg_scalar(src, TGSI_SWIZZLE_Z);
struct ureg_dst ctr;
struct ureg_dst tmp = tx_scratch_scalar(tx);
struct ureg_dst tmp;
struct ureg_src ctrx;
label = tx_bgnloop(tx);
ctr = tx_get_loopctr(tx);
ctr = tx_get_loopctr(tx, TRUE);
ctrx = ureg_scalar(ureg_src(ctr), TGSI_SWIZZLE_X);
ureg_MOV(tx->ureg, ctr, init);
/* src: num_iterations - start_value of al - step for al - 0 */
ureg_MOV(ureg, ctr, src);
ureg_BGNLOOP(tx->ureg, label);
if (tx->native_integers) {
/* we'll let the backend pull up that MAD ... */
ureg_UMAD(ureg, tmp, iter, step, init);
ureg_USEQ(ureg, tmp, ureg_src(ctr), tx_src_scalar(tmp));
#ifdef NINE_TGSI_LAZY_DEVS
ureg_UIF(ureg, tx_src_scalar(tmp), tx_cond(tx));
#endif
} else {
/* can't simply use SGE for precision because step might be negative */
ureg_MAD(ureg, tmp, iter, step, init);
ureg_SEQ(ureg, tmp, ureg_src(ctr), tx_src_scalar(tmp));
#ifdef NINE_TGSI_LAZY_DEVS
tmp = tx_scratch_scalar(tx);
/* Initially ctr.x contains the number of iterations.
* ctr.y will contain the updated value of al.
* We decrease ctr.x at the end of every iteration,
* and stop when it reaches 0. */
if (!tx->native_integers) {
/* case src and ctr contain floats */
/* to avoid precision issue, we stop when ctr <= 0.5 */
ureg_SGE(ureg, tmp, ureg_imm1f(ureg, 0.5f), ctrx);
ureg_IF(ureg, tx_src_scalar(tmp), tx_cond(tx));
#endif
} else {
/* case src and ctr contain integers */
ureg_ISGE(ureg, tmp, ureg_imm1i(ureg, 0), ctrx);
ureg_UIF(ureg, tx_src_scalar(tmp), tx_cond(tx));
}
#ifdef NINE_TGSI_LAZY_DEVS
ureg_BRK(ureg);
tx_endcond(tx);
ureg_ENDIF(ureg);
#else
ureg_BREAKC(ureg, tx_src_scalar(tmp));
#endif
if (tx->native_integers) {
ureg_UARL(ureg, tx_get_aL(tx), tx_src_scalar(ctr));
ureg_UADD(ureg, ctr, tx_src_scalar(ctr), step);
} else {
ureg_ARL(ureg, tx_get_aL(tx), tx_src_scalar(ctr));
ureg_ADD(ureg, ctr, tx_src_scalar(ctr), step);
}
return D3D_OK;
}
@@ -1491,6 +1530,25 @@ DECL_SPECIAL(RET)
DECL_SPECIAL(ENDLOOP)
{
struct ureg_program *ureg = tx->ureg;
struct ureg_dst ctr = tx_get_loopctr(tx, TRUE);
struct ureg_dst dst_ctrx, dst_al;
struct ureg_src src_ctr, al_counter;
dst_ctrx = ureg_writemask(ctr, NINED3DSP_WRITEMASK_0);
dst_al = ureg_writemask(ctr, NINED3DSP_WRITEMASK_1);
src_ctr = ureg_src(ctr);
al_counter = ureg_scalar(src_ctr, TGSI_SWIZZLE_Z);
/* ctr.x -= 1
* ctr.y (aL) += step */
if (!tx->native_integers) {
ureg_ADD(ureg, dst_ctrx, src_ctr, ureg_imm1f(ureg, -1.0f));
ureg_ADD(ureg, dst_al, src_ctr, al_counter);
} else {
ureg_UADD(ureg, dst_ctrx, src_ctr, ureg_imm1i(ureg, -1));
ureg_UADD(ureg, dst_al, src_ctr, al_counter);
}
ureg_ENDLOOP(tx->ureg, tx_endloop(tx));
return D3D_OK;
}
@@ -1540,7 +1598,7 @@ DECL_SPECIAL(REP)
tx->native_integers ? ureg_imm1u(ureg, 0) : ureg_imm1f(ureg, 0.0f);
label = tx_bgnloop(tx);
ctr = tx_get_loopctr(tx);
ctr = tx_get_loopctr(tx, FALSE);
/* NOTE: rep must be constant, so we don't have to save the count */
assert(rep.File == TGSI_FILE_CONSTANT || rep.File == TGSI_FILE_IMMEDIATE);
@@ -1550,24 +1608,16 @@ DECL_SPECIAL(REP)
if (tx->native_integers)
{
ureg_USGE(ureg, tmp, tx_src_scalar(ctr), rep);
#ifdef NINE_TGSI_LAZY_DEVS
ureg_UIF(ureg, tx_src_scalar(tmp), tx_cond(tx));
#endif
}
else
{
ureg_SGE(ureg, tmp, tx_src_scalar(ctr), rep);
#ifdef NINE_TGSI_LAZY_DEVS
ureg_IF(ureg, tx_src_scalar(tmp), tx_cond(tx));
#endif
}
#ifdef NINE_TGSI_LAZY_DEVS
ureg_BRK(ureg);
tx_endcond(tx);
ureg_ENDIF(ureg);
#else
ureg_BREAKC(ureg, tx_src_scalar(tmp));
#endif
if (tx->native_integers) {
ureg_UADD(ureg, ctr, tx_src_scalar(ctr), ureg_imm1u(ureg, 1));
@@ -1645,14 +1695,10 @@ DECL_SPECIAL(BREAKC)
src[0] = tx_src_param(tx, &tx->insn.src[0]);
src[1] = tx_src_param(tx, &tx->insn.src[1]);
ureg_insn(tx->ureg, cmp_op, &tmp, 1, src, 2);
#ifdef NINE_TGSI_LAZY_DEVS
ureg_IF(tx->ureg, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), tx_cond(tx));
ureg_BRK(tx->ureg);
tx_endcond(tx);
ureg_ENDIF(tx->ureg);
#else
ureg_BREAKC(tx->ureg, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
#endif
return D3D_OK;
}
@@ -1958,21 +2004,55 @@ DECL_SPECIAL(DEFI)
return D3D_OK;
}
DECL_SPECIAL(POW)
{
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src src[2] = {
tx_src_param(tx, &tx->insn.src[0]),
tx_src_param(tx, &tx->insn.src[1])
};
ureg_POW(tx->ureg, dst, ureg_abs(src[0]), src[1]);
return D3D_OK;
}
DECL_SPECIAL(RSQ)
{
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
struct ureg_dst tmp = tx_scratch(tx);
ureg_RSQ(ureg, tmp, ureg_abs(src));
ureg_MIN(ureg, dst, ureg_imm1f(ureg, FLT_MAX), ureg_src(tmp));
return D3D_OK;
}
DECL_SPECIAL(LOG)
{
struct ureg_program *ureg = tx->ureg;
struct ureg_dst tmp = tx_scratch_scalar(tx);
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
ureg_LG2(ureg, tmp, ureg_abs(src));
ureg_MAX(ureg, dst, ureg_imm1f(ureg, -FLT_MAX), tx_src_scalar(tmp));
return D3D_OK;
}
DECL_SPECIAL(NRM)
{
struct ureg_program *ureg = tx->ureg;
struct ureg_dst tmp = tx_scratch_scalar(tx);
struct ureg_src nrm = tx_src_scalar(tmp);
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
ureg_DP3(ureg, tmp, src, src);
ureg_RSQ(ureg, tmp, nrm);
ureg_MUL(ureg, tx_dst_param(tx, &tx->insn.dst[0]), src, nrm);
ureg_MIN(ureg, tmp, ureg_imm1f(ureg, FLT_MAX), nrm);
ureg_MUL(ureg, dst, src, nrm);
return D3D_OK;
}
DECL_SPECIAL(DP2ADD)
{
#ifdef NINE_TGSI_LAZY_R600
struct ureg_dst tmp = tx_scratch_scalar(tx);
struct ureg_src dp2 = tx_src_scalar(tmp);
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
@@ -1986,9 +2066,6 @@ DECL_SPECIAL(DP2ADD)
ureg_ADD(tx->ureg, dst, src[2], dp2);
return D3D_OK;
#else
return NineTranslateInstruction_Generic(tx);
#endif
}
DECL_SPECIAL(TEXCOORD)
@@ -1997,9 +2074,9 @@ DECL_SPECIAL(TEXCOORD)
const unsigned s = tx->insn.dst[0].idx;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
if (ureg_src_is_undef(tx->regs.vT[s]))
tx->regs.vT[s] = ureg_DECL_fs_input(ureg, tx->texcoord_sn, s, TGSI_INTERPOLATE_PERSPECTIVE);
ureg_MOV(ureg, dst, tx->regs.vT[s]); /* XXX is this sufficient ? */
tx_texcoord_alloc(tx, s);
ureg_MOV(ureg, ureg_writemask(ureg_saturate(dst), TGSI_WRITEMASK_XYZ), tx->regs.vT[s]);
ureg_MOV(ureg, ureg_writemask(dst, TGSI_WRITEMASK_W), ureg_imm1f(tx->ureg, 1.0f));
return D3D_OK;
}
@@ -2007,12 +2084,12 @@ DECL_SPECIAL(TEXCOORD)
DECL_SPECIAL(TEXCOORD_ps14)
{
struct ureg_program *ureg = tx->ureg;
const unsigned s = tx->insn.src[0].idx;
struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
if (ureg_src_is_undef(tx->regs.vT[s]))
tx->regs.vT[s] = ureg_DECL_fs_input(ureg, tx->texcoord_sn, s, TGSI_INTERPOLATE_PERSPECTIVE);
ureg_MOV(ureg, dst, tx->regs.vT[s]); /* XXX is this sufficient ? */
assert(tx->insn.src[0].file == D3DSPR_TEXTURE);
ureg_MOV(ureg, dst, src);
return D3D_OK;
}
@@ -2046,22 +2123,62 @@ DECL_SPECIAL(TEXBEML)
DECL_SPECIAL(TEXREG2AR)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src sample;
const int m = tx->insn.dst[0].idx;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
sample = ureg_DECL_sampler(ureg, m);
tx->info->sampler_mask |= 1 << m;
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m), ureg_swizzle(ureg_src(tx->regs.tS[n]), NINE_SWIZZLE4(W,X,X,X)), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXREG2GB)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src sample;
const int m = tx->insn.dst[0].idx;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
sample = ureg_DECL_sampler(ureg, m);
tx->info->sampler_mask |= 1 << m;
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m), ureg_swizzle(ureg_src(tx->regs.tS[n]), NINE_SWIZZLE4(Y,Z,Z,Z)), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXM3x2PAD)
{
STUB(D3DERR_INVALIDCALL);
return D3D_OK; /* this is just padding */
}
DECL_SPECIAL(TEXM3x2TEX)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src sample;
const int m = tx->insn.dst[0].idx - 1;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
tx_texcoord_alloc(tx, m);
tx_texcoord_alloc(tx, m+1);
/* performs the matrix multiplication */
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Y), tx->regs.vT[m+1], ureg_src(tx->regs.tS[n]));
sample = ureg_DECL_sampler(ureg, m + 1);
tx->info->sampler_mask |= 1 << (m + 1);
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 1), ureg_src(dst), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXM3x3PAD)
@@ -2071,61 +2188,180 @@ DECL_SPECIAL(TEXM3x3PAD)
DECL_SPECIAL(TEXM3x3SPEC)
{
STUB(D3DERR_INVALIDCALL);
}
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src E = tx_src_param(tx, &tx->insn.src[1]);
struct ureg_src sample;
struct ureg_dst tmp;
const int m = tx->insn.dst[0].idx - 2;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
DECL_SPECIAL(TEXM3x3VSPEC)
{
STUB(D3DERR_INVALIDCALL);
tx_texcoord_alloc(tx, m);
tx_texcoord_alloc(tx, m+1);
tx_texcoord_alloc(tx, m+2);
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Y), tx->regs.vT[m+1], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Z), tx->regs.vT[m+2], ureg_src(tx->regs.tS[n]));
sample = ureg_DECL_sampler(ureg, m + 2);
tx->info->sampler_mask |= 1 << (m + 2);
tmp = ureg_writemask(tx_scratch(tx), TGSI_WRITEMASK_XYZ);
/* At this step, dst = N = (u', w', z').
* We want dst to be the texture sampled at (u'', w'', z''), with
* (u'', w'', z'') = 2 * (N.E / N.N) * N - E */
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_src(dst), ureg_src(dst));
ureg_RCP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
/* at this step tmp.x = 1/N.N */
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Y), ureg_src(dst), E);
/* at this step tmp.y = N.E */
ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Y));
/* at this step tmp.x = N.E/N.N */
ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_imm1f(ureg, 2.0f));
ureg_MUL(ureg, tmp, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_src(dst));
/* at this step tmp.xyz = 2 * (N.E / N.N) * N */
ureg_SUB(ureg, tmp, ureg_src(tmp), E);
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(tmp), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXREG2RGB)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src sample;
const int m = tx->insn.dst[0].idx;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
sample = ureg_DECL_sampler(ureg, m);
tx->info->sampler_mask |= 1 << m;
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m), ureg_src(tx->regs.tS[n]), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXDP3TEX)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_dst tmp;
struct ureg_src sample;
const int m = tx->insn.dst[0].idx;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
tx_texcoord_alloc(tx, m);
tmp = tx_scratch(tx);
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
ureg_MOV(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_YZ), ureg_imm1f(ureg, 0.0f));
sample = ureg_DECL_sampler(ureg, m);
tx->info->sampler_mask |= 1 << m;
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m), ureg_src(tmp), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXM3x2DEPTH)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst tmp;
const int m = tx->insn.dst[0].idx - 1;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
tx_texcoord_alloc(tx, m);
tx_texcoord_alloc(tx, m+1);
tmp = tx_scratch(tx);
/* performs the matrix multiplication */
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Y), tx->regs.vT[m+1], ureg_src(tx->regs.tS[n]));
ureg_RCP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Z), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Y));
/* tmp.x = 'z', tmp.y = 'w', tmp.z = 1/'w'. */
ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Z));
/* res = 'w' == 0 ? 1.0 : z/w */
ureg_CMP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_negate(ureg_abs(ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Y))),
ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_imm1f(ureg, 1.0f));
/* replace the depth for depth testing with the result */
tx->regs.oDepth = ureg_DECL_output_masked(ureg, TGSI_SEMANTIC_POSITION, 0, TGSI_WRITEMASK_Z);
ureg_MOV(ureg, tx->regs.oDepth, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
/* note that we write nothing to the destination, since it's disallowed to use it afterward */
return D3D_OK;
}
DECL_SPECIAL(TEXDP3)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
const int m = tx->insn.dst[0].idx;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
tx_texcoord_alloc(tx, m);
ureg_DP3(ureg, dst, tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
return D3D_OK;
}
DECL_SPECIAL(TEXM3x3)
{
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src src[4];
int s;
struct ureg_src sample;
struct ureg_dst E, tmp;
const int m = tx->insn.dst[0].idx - 2;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
for (s = m; s <= (m + 2); ++s) {
if (ureg_src_is_undef(tx->regs.vT[s]))
tx->regs.vT[s] = ureg_DECL_fs_input(ureg, tx->texcoord_sn, s, TGSI_INTERPOLATE_PERSPECTIVE);
src[s] = tx->regs.vT[s];
}
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), src[0], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Y), src[1], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Z), src[2], ureg_src(tx->regs.tS[n]));
tx_texcoord_alloc(tx, m);
tx_texcoord_alloc(tx, m+1);
tx_texcoord_alloc(tx, m+2);
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Y), tx->regs.vT[m+1], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Z), tx->regs.vT[m+2], ureg_src(tx->regs.tS[n]));
switch (tx->insn.opcode) {
case D3DSIO_TEXM3x3:
ureg_MOV(ureg, ureg_writemask(dst, TGSI_WRITEMASK_W), ureg_imm1f(ureg, 1.0f));
break;
case D3DSIO_TEXM3x3TEX:
src[3] = ureg_DECL_sampler(ureg, m + 2);
sample = ureg_DECL_sampler(ureg, m + 2);
tx->info->sampler_mask |= 1 << (m + 2);
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(dst), src[3]);
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(dst), sample);
break;
case D3DSIO_TEXM3x3VSPEC:
sample = ureg_DECL_sampler(ureg, m + 2);
tx->info->sampler_mask |= 1 << (m + 2);
E = tx_scratch(tx);
tmp = ureg_writemask(tx_scratch(tx), TGSI_WRITEMASK_XYZ);
ureg_MOV(ureg, ureg_writemask(E, TGSI_WRITEMASK_X), ureg_scalar(tx->regs.vT[m], TGSI_SWIZZLE_W));
ureg_MOV(ureg, ureg_writemask(E, TGSI_WRITEMASK_Y), ureg_scalar(tx->regs.vT[m+1], TGSI_SWIZZLE_W));
ureg_MOV(ureg, ureg_writemask(E, TGSI_WRITEMASK_Z), ureg_scalar(tx->regs.vT[m+2], TGSI_SWIZZLE_W));
/* At this step, dst = N = (u', w', z').
* We want dst to be the texture sampled at (u'', w'', z''), with
* (u'', w'', z'') = 2 * (N.E / N.N) * N - E */
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_src(dst), ureg_src(dst));
ureg_RCP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
/* at this step tmp.x = 1/N.N */
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Y), ureg_src(dst), ureg_src(E));
/* at this step tmp.y = N.E */
ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Y));
/* at this step tmp.x = N.E/N.N */
ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_imm1f(ureg, 2.0f));
ureg_MUL(ureg, tmp, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_src(dst));
/* at this step tmp.xyz = 2 * (N.E / N.N) * N */
ureg_SUB(ureg, tmp, ureg_src(tmp), ureg_src(E));
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(tmp), sample);
break;
default:
return D3DERR_INVALIDCALL;
@@ -2135,7 +2371,28 @@ DECL_SPECIAL(TEXM3x3)
DECL_SPECIAL(TEXDEPTH)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst r5;
struct ureg_src r5r, r5g;
assert(tx->insn.dst[0].idx == 5); /* instruction must get r5 here */
/* we must replace the depth by r5.g == 0 ? 1.0f : r5.r/r5.g.
* r5 won't be used afterward, thus we can use r5.ba */
r5 = tx->regs.r[5];
r5r = ureg_scalar(ureg_src(r5), TGSI_SWIZZLE_X);
r5g = ureg_scalar(ureg_src(r5), TGSI_SWIZZLE_Y);
ureg_RCP(ureg, ureg_writemask(r5, TGSI_WRITEMASK_Z), r5g);
ureg_MUL(ureg, ureg_writemask(r5, TGSI_WRITEMASK_X), r5r, ureg_scalar(ureg_src(r5), TGSI_SWIZZLE_Z));
/* r5.r = r/g */
ureg_CMP(ureg, ureg_writemask(r5, TGSI_WRITEMASK_X), ureg_negate(ureg_abs(r5g)),
r5r, ureg_imm1f(ureg, 1.0f));
/* replace the depth for depth testing with the result */
tx->regs.oDepth = ureg_DECL_output_masked(ureg, TGSI_SEMANTIC_POSITION, 0, TGSI_WRITEMASK_Z);
ureg_MOV(ureg, tx->regs.oDepth, r5r);
return D3D_OK;
}
DECL_SPECIAL(BEM)
@@ -2275,7 +2532,7 @@ struct sm1_op_info inst_table[] =
_OPI(MAD, MAD, V(0,0), V(3,0), V(0,0), V(3,0), 1, 3, NULL), /* 4 */
_OPI(MUL, MUL, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 5 */
_OPI(RCP, RCP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 6 */
_OPI(RSQ, RSQ, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 7 */
_OPI(RSQ, RSQ, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(RSQ)), /* 7 */
_OPI(DP3, DP3, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 8 */
_OPI(DP4, DP4, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 9 */
_OPI(MIN, MIN, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 10 */
@@ -2283,7 +2540,7 @@ struct sm1_op_info inst_table[] =
_OPI(SLT, SLT, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 12 */
_OPI(SGE, SGE, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 13 */
_OPI(EXP, EX2, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 14 */
_OPI(LOG, LG2, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 15 */
_OPI(LOG, LG2, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(LOG)), /* 15 */
_OPI(LIT, LIT, V(0,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL), /* 16 */
_OPI(DST, DST, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 17 */
_OPI(LRP, LRP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 3, NULL), /* 18 */
@@ -2295,16 +2552,16 @@ struct sm1_op_info inst_table[] =
_OPI(M3x3, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(M3x3)),
_OPI(M3x2, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(M3x2)),
_OPI(CALL, CAL, V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(CALL)),
_OPI(CALLNZ, CAL, V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(CALLNZ)),
_OPI(CALL, CAL, V(2,0), V(3,0), V(2,1), V(3,0), 0, 1, SPECIAL(CALL)),
_OPI(CALLNZ, CAL, V(2,0), V(3,0), V(2,1), V(3,0), 0, 2, SPECIAL(CALLNZ)),
_OPI(LOOP, BGNLOOP, V(2,0), V(3,0), V(3,0), V(3,0), 0, 2, SPECIAL(LOOP)),
_OPI(RET, RET, V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(RET)),
_OPI(ENDLOOP, ENDLOOP, V(2,0), V(3,0), V(3,0), V(3,0), 0, 0, SPECIAL(ENDLOOP)),
_OPI(LABEL, NOP, V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(LABEL)),
_OPI(LABEL, NOP, V(2,0), V(3,0), V(2,1), V(3,0), 0, 1, SPECIAL(LABEL)),
_OPI(DCL, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 0, 0, SPECIAL(DCL)),
_OPI(POW, POW, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL),
_OPI(POW, POW, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(POW)),
_OPI(CRS, XPD, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* XXX: .w */
_OPI(SGN, SSG, V(2,0), V(3,0), V(0,0), V(0,0), 1, 3, SPECIAL(SGN)), /* ignore src1,2 */
_OPI(ABS, ABS, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL),
@@ -2322,8 +2579,9 @@ struct sm1_op_info inst_table[] =
_OPI(ENDIF, ENDIF, V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(ENDIF)),
_OPI(BREAK, BRK, V(2,1), V(3,0), V(2,1), V(3,0), 0, 0, NULL),
_OPI(BREAKC, BREAKC, V(2,1), V(3,0), V(2,1), V(3,0), 0, 2, SPECIAL(BREAKC)),
_OPI(MOVA, ARR, V(2,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
/* we don't write to the address register, but a normal register (copied
* when needed to the address register), thus we don't use ARR */
_OPI(MOVA, MOV, V(2,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
_OPI(DEFB, NOP, V(0,0), V(3,0) , V(0,0), V(3,0) , 1, 0, SPECIAL(DEFB)),
_OPI(DEFI, NOP, V(0,0), V(3,0) , V(0,0), V(3,0) , 1, 0, SPECIAL(DEFI)),
@@ -2334,42 +2592,42 @@ struct sm1_op_info inst_table[] =
_OPI(TEX, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 0, SPECIAL(TEX)),
_OPI(TEX, TEX, V(0,0), V(0,0), V(1,4), V(1,4), 1, 1, SPECIAL(TEXLD_14)),
_OPI(TEX, TEX, V(0,0), V(0,0), V(2,0), V(3,0), 1, 2, SPECIAL(TEXLD)),
_OPI(TEXBEM, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXBEM)),
_OPI(TEXBEML, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXBEML)),
_OPI(TEXREG2AR, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXREG2AR)),
_OPI(TEXREG2GB, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXREG2GB)),
_OPI(TEXM3x2PAD, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x2PAD)),
_OPI(TEXM3x2TEX, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x2TEX)),
_OPI(TEXM3x3PAD, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x3PAD)),
_OPI(TEXM3x3TEX, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x3)),
_OPI(TEXM3x3SPEC, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x3SPEC)),
_OPI(TEXM3x3VSPEC, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x3VSPEC)),
_OPI(TEXBEM, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXBEM)),
_OPI(TEXBEML, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXBEML)),
_OPI(TEXREG2AR, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXREG2AR)),
_OPI(TEXREG2GB, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXREG2GB)),
_OPI(TEXM3x2PAD, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x2PAD)),
_OPI(TEXM3x2TEX, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x2TEX)),
_OPI(TEXM3x3PAD, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x3PAD)),
_OPI(TEXM3x3TEX, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x3)),
_OPI(TEXM3x3SPEC, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 2, SPECIAL(TEXM3x3SPEC)),
_OPI(TEXM3x3VSPEC, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x3)),
_OPI(EXPP, EXP, V(0,0), V(1,1), V(0,0), V(0,0), 1, 1, NULL),
_OPI(EXPP, EX2, V(2,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
_OPI(LOGP, LG2, V(0,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
_OPI(CND, CND, V(0,0), V(0,0), V(0,0), V(1,4), 1, 3, SPECIAL(CND)),
_OPI(LOGP, LG2, V(0,0), V(3,0), V(0,0), V(0,0), 1, 1, SPECIAL(LOG)),
_OPI(CND, NOP, V(0,0), V(0,0), V(0,0), V(1,4), 1, 3, SPECIAL(CND)),
_OPI(DEF, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 0, SPECIAL(DEF)),
/* More tex stuff */
_OPI(TEXREG2RGB, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 0, 0, SPECIAL(TEXREG2RGB)),
_OPI(TEXDP3TEX, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 0, 0, SPECIAL(TEXDP3TEX)),
_OPI(TEXM3x2DEPTH, TEX, V(0,0), V(0,0), V(1,3), V(1,3), 0, 0, SPECIAL(TEXM3x2DEPTH)),
_OPI(TEXDP3, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 0, 0, SPECIAL(TEXDP3)),
_OPI(TEXM3x3, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 0, 0, SPECIAL(TEXM3x3)),
_OPI(TEXDEPTH, TEX, V(0,0), V(0,0), V(1,4), V(1,4), 0, 0, SPECIAL(TEXDEPTH)),
_OPI(TEXREG2RGB, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 1, 1, SPECIAL(TEXREG2RGB)),
_OPI(TEXDP3TEX, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 1, 1, SPECIAL(TEXDP3TEX)),
_OPI(TEXM3x2DEPTH, TEX, V(0,0), V(0,0), V(1,3), V(1,3), 1, 1, SPECIAL(TEXM3x2DEPTH)),
_OPI(TEXDP3, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 1, 1, SPECIAL(TEXDP3)),
_OPI(TEXM3x3, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 1, 1, SPECIAL(TEXM3x3)),
_OPI(TEXDEPTH, TEX, V(0,0), V(0,0), V(1,4), V(1,4), 1, 0, SPECIAL(TEXDEPTH)),
/* Misc */
_OPI(CMP, CMP, V(0,0), V(0,0), V(1,2), V(3,0), 1, 3, SPECIAL(CMP)), /* reversed */
_OPI(BEM, NOP, V(0,0), V(0,0), V(1,4), V(1,4), 0, 0, SPECIAL(BEM)),
_OPI(DP2ADD, DP2A, V(0,0), V(0,0), V(2,0), V(3,0), 1, 3, SPECIAL(DP2ADD)), /* for radeons */
_OPI(BEM, NOP, V(0,0), V(0,0), V(1,4), V(1,4), 1, 2, SPECIAL(BEM)),
_OPI(DP2ADD, NOP, V(0,0), V(0,0), V(2,0), V(3,0), 1, 3, SPECIAL(DP2ADD)),
_OPI(DSX, DDX, V(0,0), V(0,0), V(2,1), V(3,0), 1, 1, NULL),
_OPI(DSY, DDY, V(0,0), V(0,0), V(2,1), V(3,0), 1, 1, NULL),
_OPI(TEXLDD, TXD, V(0,0), V(0,0), V(2,1), V(3,0), 1, 4, SPECIAL(TEXLDD)),
_OPI(SETP, NOP, V(0,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(SETP)),
_OPI(SETP, NOP, V(0,0), V(3,0), V(2,1), V(3,0), 1, 2, SPECIAL(SETP)),
_OPI(TEXLDL, TXL, V(3,0), V(3,0), V(3,0), V(3,0), 1, 2, SPECIAL(TEXLDL)),
_OPI(BREAKP, BRK, V(0,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(BREAKP))
_OPI(BREAKP, BRK, V(0,0), V(3,0), V(2,1), V(3,0), 0, 1, SPECIAL(BREAKP))
};
struct sm1_op_info inst_phase =
@@ -2740,11 +2998,11 @@ tx_ctor(struct shader_translator *tx, struct nine_shader_info *info)
info->lconstf.data = NULL;
info->lconstf.ranges = NULL;
for (i = 0; i < Elements(tx->regs.aL); ++i) {
tx->regs.aL[i] = ureg_dst_undef();
for (i = 0; i < Elements(tx->regs.rL); ++i) {
tx->regs.rL[i] = ureg_dst_undef();
}
tx->regs.a = ureg_dst_undef();
tx->regs.address = ureg_dst_undef();
tx->regs.a0 = ureg_dst_undef();
tx->regs.p = ureg_dst_undef();
tx->regs.oDepth = ureg_dst_undef();
tx->regs.vPos = ureg_src_undef();
@@ -2852,9 +3110,6 @@ nine_translate_shader(struct NineDevice9 *device, struct nine_shader_info *info)
ureg_property_fs_coord_pixel_center(tx->ureg, TGSI_FS_COORD_PIXEL_CENTER_INTEGER);
}
if (!ureg_dst_is_undef(tx->regs.oPts))
info->point_size = TRUE;
while (!sm1_parse_eof(tx))
sm1_parse_instruction(tx);
tx->parse++; /* for byte_size */
@@ -2870,6 +3125,9 @@ nine_translate_shader(struct NineDevice9 *device, struct nine_shader_info *info)
ureg_END(tx->ureg);
if (IS_VS && !ureg_dst_is_undef(tx->regs.oPts))
info->point_size = TRUE;
if (debug_get_bool_option("NINE_TGSI_DUMP", FALSE)) {
unsigned count;
const struct tgsi_token *toks = ureg_get_tokens(tx->ureg, &count);

View File

@@ -347,14 +347,13 @@ update_constants(struct NineDevice9 *device, unsigned shader_type)
const int *const_i;
const BOOL *const_b;
uint32_t data_b[NINE_MAX_CONST_B];
uint32_t b_true;
uint16_t dirty_i;
uint16_t dirty_b;
const unsigned usage = PIPE_TRANSFER_WRITE | PIPE_TRANSFER_DISCARD_RANGE;
unsigned x = 0; /* silence warning */
unsigned i, c, n;
const struct nine_lconstf *lconstf;
struct nine_range *r, *p;
unsigned i, c;
struct nine_range *r, *p, *lconstf_ranges;
float *lconstf_data;
box.y = 0;
box.z = 0;
@@ -381,9 +380,10 @@ update_constants(struct NineDevice9 *device, unsigned shader_type)
dirty_b = device->state.changed.vs_const_b;
device->state.changed.vs_const_b = 0;
const_b = device->state.vs_const_b;
b_true = device->vs_bool_true;
lconstf = &device->state.vs->lconstf;
lconstf_ranges = device->state.vs->lconstf.ranges;
lconstf_data = device->state.vs->lconstf.data;
device->state.ff.clobber.vs_const = TRUE;
device->state.changed.group &= ~NINE_STATE_VS_CONST;
} else {
@@ -406,9 +406,10 @@ update_constants(struct NineDevice9 *device, unsigned shader_type)
dirty_b = device->state.changed.ps_const_b;
device->state.changed.ps_const_b = 0;
const_b = device->state.ps_const_b;
b_true = device->ps_bool_true;
lconstf = &device->state.ps->lconstf;
lconstf_ranges = NULL;
lconstf_data = NULL;
device->state.ff.clobber.ps_const = TRUE;
device->state.changed.group &= ~NINE_STATE_PS_CONST;
}
@@ -420,11 +421,10 @@ update_constants(struct NineDevice9 *device, unsigned shader_type)
i = ffs(dirty_b) - 1;
x = buf->width0 - (NINE_MAX_CONST_B - i) * 4;
c -= i;
for (n = 0; n < c; ++n, ++i)
data_b[n] = const_b[i] ? b_true : 0;
memcpy(data_b, &(const_b[i]), c * sizeof(uint32_t));
box.x = x;
box.width = n * 4;
DBG("upload ConstantB [%u .. %u]\n", x, x + n - 1);
box.width = c * 4;
DBG("upload ConstantB [%u .. %u]\n", x, x + c - 1);
pipe->transfer_inline_write(pipe, buf, 0, usage, &box, data_b, 0, 0);
}
@@ -455,14 +455,14 @@ update_constants(struct NineDevice9 *device, unsigned shader_type)
}
/* TODO: only upload these when shader itself changes */
if (lconstf->ranges) {
if (lconstf_ranges) {
unsigned n = 0;
struct nine_range *r = lconstf->ranges;
struct nine_range *r = lconstf_ranges;
while (r) {
box.x = r->bgn * 4 * sizeof(float);
n += r->end - r->bgn;
box.width = (r->end - r->bgn) * 4 * sizeof(float);
data = &lconstf->data[4 * n];
data = &lconstf_data[4 * n];
pipe->transfer_inline_write(pipe, buf, 0, usage, &box, data, 0, 0);
r = r->next;
}
@@ -491,19 +491,16 @@ update_vs_constants_userbuf(struct NineDevice9 *device)
if (state->changed.vs_const_b) {
int *idst = (int *)&state->vs_const_f[4 * device->max_vs_const_f];
uint32_t *bdst = (uint32_t *)&idst[4 * NINE_MAX_CONST_I];
int i;
for (i = 0; i < NINE_MAX_CONST_B; ++i)
bdst[i] = state->vs_const_b[i] ? device->vs_bool_true : 0;
memcpy(bdst, state->vs_const_b, sizeof(state->vs_const_b));
state->changed.vs_const_b = 0;
}
#ifdef DEBUG
if (device->state.vs->lconstf.ranges) {
/* TODO: Can we make it so that we don't have to copy everything ? */
const struct nine_lconstf *lconstf = &device->state.vs->lconstf;
const struct nine_range *r = lconstf->ranges;
unsigned n = 0;
float *dst = (float *)MALLOC(cb.buffer_size);
float *dst = device->state.vs_lconstf_temp;
float *src = (float *)cb.user_buffer;
memcpy(dst, src, cb.buffer_size);
while (r) {
@@ -515,15 +512,9 @@ update_vs_constants_userbuf(struct NineDevice9 *device)
}
cb.user_buffer = dst;
}
#endif
pipe->set_constant_buffer(pipe, PIPE_SHADER_VERTEX, 0, &cb);
#ifdef DEBUG
if (device->state.vs->lconstf.ranges)
FREE((void *)cb.user_buffer);
#endif
if (device->state.changed.vs_const_f) {
struct nine_range *r = device->state.changed.vs_const_f;
struct nine_range *p = r;
@@ -557,39 +548,12 @@ update_ps_constants_userbuf(struct NineDevice9 *device)
if (state->changed.ps_const_b) {
int *idst = (int *)&state->ps_const_f[4 * device->max_ps_const_f];
uint32_t *bdst = (uint32_t *)&idst[4 * NINE_MAX_CONST_I];
int i;
for (i = 0; i < NINE_MAX_CONST_B; ++i)
bdst[i] = state->ps_const_b[i] ? device->ps_bool_true : 0;
memcpy(bdst, state->ps_const_b, sizeof(state->ps_const_b));
state->changed.ps_const_b = 0;
}
#ifdef DEBUG
if (device->state.ps->lconstf.ranges) {
/* TODO: Can we make it so that we don't have to copy everything ? */
const struct nine_lconstf *lconstf = &device->state.ps->lconstf;
const struct nine_range *r = lconstf->ranges;
unsigned n = 0;
float *dst = (float *)MALLOC(cb.buffer_size);
float *src = (float *)cb.user_buffer;
memcpy(dst, src, cb.buffer_size);
while (r) {
unsigned p = r->bgn;
unsigned c = r->end - r->bgn;
memcpy(&dst[p * 4], &lconstf->data[n * 4], c * 4 * sizeof(float));
n += c;
r = r->next;
}
cb.user_buffer = dst;
}
#endif
pipe->set_constant_buffer(pipe, PIPE_SHADER_FRAGMENT, 0, &cb);
#ifdef DEBUG
if (device->state.ps->lconstf.ranges)
FREE((void *)cb.user_buffer);
#endif
if (device->state.changed.ps_const_f) {
struct nine_range *r = device->state.changed.ps_const_f;
struct nine_range *p = r;
@@ -1030,9 +994,10 @@ static const DWORD nine_samp_state_defaults[NINED3DSAMP_LAST + 1] =
[NINED3DSAMP_SHADOW] = 0
};
void
nine_state_set_defaults(struct nine_state *state, const D3DCAPS9 *caps,
nine_state_set_defaults(struct NineDevice9 *device, const D3DCAPS9 *caps,
boolean is_reset)
{
struct nine_state *state = &device->state;
unsigned s;
/* Initialize defaults.
@@ -1053,9 +1018,9 @@ nine_state_set_defaults(struct nine_state *state, const D3DCAPS9 *caps,
}
if (state->vs_const_f)
memset(state->vs_const_f, 0, NINE_MAX_CONST_F * 4 * sizeof(float));
memset(state->vs_const_f, 0, device->vs_const_size);
if (state->ps_const_f)
memset(state->ps_const_f, 0, NINE_MAX_CONST_F * 4 * sizeof(float));
memset(state->ps_const_f, 0, device->ps_const_size);
/* Cap dependent initial state:
*/

View File

@@ -144,6 +144,7 @@ struct nine_state
float *vs_const_f;
int vs_const_i[NINE_MAX_CONST_I][4];
BOOL vs_const_b[NINE_MAX_CONST_B];
float *vs_lconstf_temp;
uint32_t vs_key;
struct NinePixelShader9 *ps;
@@ -218,7 +219,7 @@ struct NineDevice9;
boolean nine_update_state(struct NineDevice9 *, uint32_t group_mask);
void nine_state_set_defaults(struct nine_state *, const D3DCAPS9 *,
void nine_state_set_defaults(struct NineDevice9 *, const D3DCAPS9 *,
boolean is_reset);
void nine_state_clear(struct nine_state *, const boolean device);

View File

@@ -72,9 +72,10 @@ NinePixelShader9_ctor( struct NinePixelShader9 *This,
This->sampler_mask = info.sampler_mask;
This->rt_mask = info.rt_mask;
This->const_used_size = info.const_used_size;
if (info.const_used_size == ~0)
This->const_used_size = NINE_CONSTBUF_SIZE(device->max_ps_const_f);
This->lconstf = info.lconstf;
/* no constant relative addressing for ps */
assert(info.const_used_size != ~0);
assert(info.lconstf.data == NULL);
assert(info.lconstf.ranges == NULL);
return D3D_OK;
}
@@ -101,9 +102,6 @@ NinePixelShader9_dtor( struct NinePixelShader9 *This )
if (This->byte_code.tokens)
FREE((void *)This->byte_code.tokens); /* const_cast */
FREE(This->lconstf.data);
FREE(This->lconstf.ranges);
NineUnknown_dtor(&This->base);
}

View File

@@ -41,8 +41,6 @@ struct NinePixelShader9
unsigned const_used_size; /* in bytes */
struct nine_lconstf lconstf;
uint16_t sampler_mask;
uint16_t sampler_mask_shadow;
uint8_t rt_mask;

View File

@@ -43,8 +43,8 @@ NineStateBlock9_ctor( struct NineStateBlock9 *This,
This->type = type;
This->state.vs_const_f = MALLOC(pParams->device->constbuf_vs->width0);
This->state.ps_const_f = MALLOC(pParams->device->constbuf_ps->width0);
This->state.vs_const_f = MALLOC(This->base.device->vs_const_size);
This->state.ps_const_f = MALLOC(This->base.device->ps_const_size);
if (!This->state.vs_const_f || !This->state.ps_const_f)
return E_OUTOFMEMORY;

View File

@@ -38,6 +38,8 @@
#define DBG_CHANNEL DBG_SURFACE
#define is_ATI1_ATI2(format) (format == PIPE_FORMAT_RGTC1_UNORM || format == PIPE_FORMAT_RGTC2_UNORM)
HRESULT
NineSurface9_ctor( struct NineSurface9 *This,
struct NineUnknownParams *pParams,
@@ -150,14 +152,22 @@ struct pipe_surface *
NineSurface9_CreatePipeSurface( struct NineSurface9 *This, const int sRGB )
{
struct pipe_context *pipe = This->pipe;
struct pipe_screen *screen = pipe->screen;
struct pipe_resource *resource = This->base.resource;
struct pipe_surface templ;
enum pipe_format srgb_format;
assert(This->desc.Pool == D3DPOOL_DEFAULT ||
This->desc.Pool == D3DPOOL_MANAGED);
assert(resource);
templ.format = sRGB ? util_format_srgb(resource->format) : resource->format;
srgb_format = util_format_srgb(resource->format);
if (sRGB && srgb_format != PIPE_FORMAT_NONE &&
screen->is_format_supported(screen, srgb_format,
resource->target, 0, resource->bind))
templ.format = srgb_format;
else
templ.format = resource->format;
templ.u.tex.level = This->level;
templ.u.tex.first_layer = This->layer;
templ.u.tex.last_layer = This->layer;
@@ -374,10 +384,19 @@ NineSurface9_LockRect( struct NineSurface9 *This,
if (This->data) {
DBG("returning system memory\n");
pLockedRect->Pitch = This->stride;
pLockedRect->pBits = NineSurface9_GetSystemMemPointer(This,
box.x, box.y);
/* ATI1 and ATI2 need special handling, because of d3d9 bug.
* We must advertise to the application as if it is uncompressed
* and bpp 8, and the app has a workaround to work with the fact
* that it is actually compressed. */
if (is_ATI1_ATI2(This->base.info.format)) {
pLockedRect->Pitch = This->desc.Height;
pLockedRect->pBits = This->data + box.y * This->desc.Height + box.x;
} else {
pLockedRect->Pitch = This->stride;
pLockedRect->pBits = NineSurface9_GetSystemMemPointer(This,
box.x,
box.y);
}
} else {
DBG("mapping pipe_resource %p (level=%u usage=%x)\n",
resource, This->level, usage);

View File

@@ -467,7 +467,7 @@ NineSwapChain9_dtor( struct NineSwapChain9 *This )
if (This->buffers) {
for (i = 0; i < This->params.BackBufferCount; i++) {
NineUnknown_Destroy(NineUnknown(This->buffers[i]));
NineUnknown_Release(NineUnknown(This->buffers[i]));
ID3DPresent_DestroyD3DWindowBuffer(This->present, This->present_handles[i]);
if (This->present_buffers)
pipe_resource_reference(&(This->present_buffers[i]), NULL);

View File

@@ -47,6 +47,7 @@ NineTexture9_ctor( struct NineTexture9 *This,
struct pipe_screen *screen = pParams->device->screen;
struct pipe_resource *info = &This->base.base.info;
struct pipe_resource *resource;
enum pipe_format pf;
unsigned l;
D3DSURFACE_DESC sfdesc;
HRESULT hr;
@@ -92,9 +93,15 @@ NineTexture9_ctor( struct NineTexture9 *This,
if (Usage & D3DUSAGE_AUTOGENMIPMAP)
Levels = 0;
pf = d3d9_to_pipe_format(Format);
if (Format != D3DFMT_NULL && (pf == PIPE_FORMAT_NONE ||
!screen->is_format_supported(screen, pf, PIPE_TEXTURE_2D, 0, PIPE_BIND_SAMPLER_VIEW))) {
return D3DERR_INVALIDCALL;
}
info->screen = screen;
info->target = PIPE_TEXTURE_2D;
info->format = d3d9_to_pipe_format(Format);
info->format = pf;
info->width0 = Width;
info->height0 = Height;
info->depth0 = 1;

View File

@@ -37,6 +37,8 @@ NineVolumeTexture9_ctor( struct NineVolumeTexture9 *This,
HANDLE *pSharedHandle )
{
struct pipe_resource *info = &This->base.base.info;
struct pipe_screen *screen = pParams->device->screen;
enum pipe_format pf;
unsigned l;
D3DVOLUME_DESC voldesc;
HRESULT hr;
@@ -57,9 +59,19 @@ NineVolumeTexture9_ctor( struct NineVolumeTexture9 *This,
if (Usage & D3DUSAGE_AUTOGENMIPMAP)
Levels = 0;
pf = d3d9_to_pipe_format(Format);
if (pf == PIPE_FORMAT_NONE ||
!screen->is_format_supported(screen, pf, PIPE_TEXTURE_3D, 0, PIPE_BIND_SAMPLER_VIEW)) {
return D3DERR_INVALIDCALL;
}
/* We support ATI1 and ATI2 hacks only for 2D textures */
if (Format == D3DFMT_ATI1 || Format == D3DFMT_ATI2)
return D3DERR_INVALIDCALL;
info->screen = pParams->device->screen;
info->target = PIPE_TEXTURE_3D;
info->format = d3d9_to_pipe_format(Format);
info->format = pf;
info->width0 = Width;
info->height0 = Height;
info->depth0 = Depth;

View File

@@ -706,6 +706,11 @@ static void slice_header(vid_dec_PrivateType *priv, struct vl_rbsp *rbsp,
if (pic_order_cnt_lsb != priv->codec_data.h264.pic_order_cnt_lsb)
vid_dec_h264_EndFrame(priv);
if (IdrPicFlag) {
priv->codec_data.h264.pic_order_cnt_msb = 0;
priv->codec_data.h264.pic_order_cnt_lsb = 0;
}
if ((pic_order_cnt_lsb < priv->codec_data.h264.pic_order_cnt_lsb) &&
(priv->codec_data.h264.pic_order_cnt_lsb - pic_order_cnt_lsb) >= (max_pic_order_cnt_lsb / 2))
pic_order_cnt_msb = priv->codec_data.h264.pic_order_cnt_msb + max_pic_order_cnt_lsb;

View File

@@ -431,7 +431,7 @@ osmesa_st_framebuffer_validate(struct st_context_iface *stctx,
templat.format = format;
templat.bind = bind;
out[i] = osbuffer->textures[i] =
out[i] = osbuffer->textures[statts[i]] =
screen->resource_create(screen, &templat);
}

View File

@@ -137,7 +137,9 @@ glsl_test_SOURCES = \
test.cpp \
test_optpass.cpp
glsl_test_LDADD = libglsl.la
glsl_test_LDADD = \
libglsl.la \
$(PTHREAD_LIBS)
# We write our own rules for yacc and lex below. We'd rather use automake,
# but automake makes it especially difficult for a number of reasons:

View File

@@ -578,9 +578,18 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
if (!is_vec_zero(zero))
continue;
return new(mem_ctx) ir_expression(ir->operation,
add->operands[0],
neg(add->operands[1]));
/* Depending of the zero position we want to optimize
* (0 cmp x+y) into (-x cmp y) or (x+y cmp 0) into (x cmp -y)
*/
if (add_pos == 1) {
return new(mem_ctx) ir_expression(ir->operation,
neg(add->operands[0]),
add->operands[1]);
} else {
return new(mem_ctx) ir_expression(ir->operation,
add->operands[0],
neg(add->operands[1]));
}
}
break;
@@ -679,55 +688,72 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
case ir_binop_min:
case ir_binop_max:
if (ir->type->base_type != GLSL_TYPE_FLOAT)
if (ir->type->base_type != GLSL_TYPE_FLOAT || options->EmitNoSat)
break;
/* Replace min(max) operations and its commutative combinations with
* a saturate operation
*/
for (int op = 0; op < 2; op++) {
ir_expression *minmax = op_expr[op];
ir_expression *inner_expr = op_expr[op];
ir_constant *outer_const = op_const[1 - op];
ir_expression_operation op_cond = (ir->operation == ir_binop_max) ?
ir_binop_min : ir_binop_max;
if (!minmax || !outer_const || (minmax->operation != op_cond))
if (!inner_expr || !outer_const || (inner_expr->operation != op_cond))
continue;
/* One of these has to be a constant */
if (!inner_expr->operands[0]->as_constant() &&
!inner_expr->operands[1]->as_constant())
break;
/* Found a min(max) combination. Now try to see if its operands
* meet our conditions that we can do just a single saturate operation
*/
for (int minmax_op = 0; minmax_op < 2; minmax_op++) {
ir_rvalue *inner_val_a = minmax->operands[minmax_op];
ir_rvalue *inner_val_b = minmax->operands[1 - minmax_op];
ir_rvalue *x = inner_expr->operands[minmax_op];
ir_rvalue *y = inner_expr->operands[1 - minmax_op];
if (!inner_val_a || !inner_val_b)
ir_constant *inner_const = y->as_constant();
if (!inner_const)
continue;
/* Found a {min|max} ({max|min} (x, 0.0), 1.0) operation and its variations */
if ((outer_const->is_one() && inner_val_a->is_zero()) ||
(inner_val_a->is_one() && outer_const->is_zero()))
return saturate(inner_val_b);
/* min(max(x, 0.0), 1.0) is sat(x) */
if (ir->operation == ir_binop_min &&
inner_const->is_zero() &&
outer_const->is_one())
return saturate(x);
/* Found a {min|max} ({max|min} (x, 0.0), b) where b < 1.0
* and its variations
*/
if (is_less_than_one(outer_const) && inner_val_b->is_zero())
return expr(ir_binop_min, saturate(inner_val_a), outer_const);
/* max(min(x, 1.0), 0.0) is sat(x) */
if (ir->operation == ir_binop_max &&
inner_const->is_one() &&
outer_const->is_zero())
return saturate(x);
if (!inner_val_b->as_constant())
continue;
/* min(max(x, 0.0), b) where b < 1.0 is sat(min(x, b)) */
if (ir->operation == ir_binop_min &&
inner_const->is_zero() &&
is_less_than_one(outer_const))
return saturate(expr(ir_binop_min, x, outer_const));
if (is_less_than_one(inner_val_b->as_constant()) && outer_const->is_zero())
return expr(ir_binop_min, saturate(inner_val_a), inner_val_b);
/* max(min(x, b), 0.0) where b < 1.0 is sat(min(x, b)) */
if (ir->operation == ir_binop_max &&
is_less_than_one(inner_const) &&
outer_const->is_zero())
return saturate(expr(ir_binop_min, x, inner_const));
/* Found a {min|max} ({max|min} (x, b), 1.0), where b > 0.0
* and its variations
*/
if (outer_const->is_one() && is_greater_than_zero(inner_val_b->as_constant()))
return expr(ir_binop_max, saturate(inner_val_a), inner_val_b);
if (inner_val_b->as_constant()->is_one() && is_greater_than_zero(outer_const))
return expr(ir_binop_max, saturate(inner_val_a), outer_const);
/* max(min(x, 1.0), b) where b > 0.0 is sat(max(x, b)) */
if (ir->operation == ir_binop_max &&
inner_const->is_one() &&
is_greater_than_zero(outer_const))
return saturate(expr(ir_binop_max, x, outer_const));
/* min(max(x, b), 1.0) where b > 0.0 is sat(max(x, b)) */
if (ir->operation == ir_binop_min &&
is_greater_than_zero(inner_const) &&
outer_const->is_one())
return saturate(expr(ir_binop_max, x, inner_const));
}
}

View File

@@ -128,6 +128,9 @@ ir_copy_propagation_visitor::visit_enter(ir_function_signature *ir)
visit_list_elements(this, &ir->body);
ralloc_free(this->acp);
ralloc_free(this->kills);
this->kills = orig_kills;
this->acp = orig_acp;
this->killed_all = orig_killed_all;
@@ -215,7 +218,7 @@ ir_copy_propagation_visitor::handle_if_block(exec_list *instructions)
/* Populate the initial acp with a copy of the original */
foreach_in_list(acp_entry, a, orig_acp) {
this->acp->push_tail(new(this->mem_ctx) acp_entry(a->lhs, a->rhs));
this->acp->push_tail(new(this->acp) acp_entry(a->lhs, a->rhs));
}
visit_list_elements(this, instructions);
@@ -226,12 +229,15 @@ ir_copy_propagation_visitor::handle_if_block(exec_list *instructions)
exec_list *new_kills = this->kills;
this->kills = orig_kills;
ralloc_free(this->acp);
this->acp = orig_acp;
this->killed_all = this->killed_all || orig_killed_all;
foreach_in_list(kill_entry, k, new_kills) {
kill(k->var);
}
ralloc_free(new_kills);
}
ir_visitor_status
@@ -269,6 +275,7 @@ ir_copy_propagation_visitor::visit_enter(ir_loop *ir)
exec_list *new_kills = this->kills;
this->kills = orig_kills;
ralloc_free(this->acp);
this->acp = orig_acp;
this->killed_all = this->killed_all || orig_killed_all;
@@ -276,6 +283,8 @@ ir_copy_propagation_visitor::visit_enter(ir_loop *ir)
kill(k->var);
}
ralloc_free(new_kills);
/* already descended into the children. */
return visit_continue_with_parent;
}
@@ -294,7 +303,7 @@ ir_copy_propagation_visitor::kill(ir_variable *var)
/* Add the LHS variable to the list of killed variables in this block.
*/
this->kills->push_tail(new(this->mem_ctx) kill_entry(var));
this->kills->push_tail(new(this->kills) kill_entry(var));
}
/**
@@ -322,7 +331,7 @@ ir_copy_propagation_visitor::add_copy(ir_assignment *ir)
ir->condition = new(ralloc_parent(ir)) ir_constant(false);
this->progress = true;
} else {
entry = new(this->mem_ctx) acp_entry(lhs_var, rhs_var);
entry = new(this->acp) acp_entry(lhs_var, rhs_var);
this->acp->push_tail(entry);
}
}

View File

@@ -156,6 +156,9 @@ ir_copy_propagation_elements_visitor::visit_enter(ir_function_signature *ir)
visit_list_elements(this, &ir->body);
ralloc_free(this->acp);
ralloc_free(this->kills);
this->kills = orig_kills;
this->acp = orig_acp;
this->killed_all = orig_killed_all;
@@ -173,9 +176,9 @@ ir_copy_propagation_elements_visitor::visit_leave(ir_assignment *ir)
kill_entry *k;
if (lhs)
k = new(mem_ctx) kill_entry(var, ir->write_mask);
k = new(this->kills) kill_entry(var, ir->write_mask);
else
k = new(mem_ctx) kill_entry(var, ~0);
k = new(this->kills) kill_entry(var, ~0);
kill(k);
}
@@ -334,7 +337,7 @@ ir_copy_propagation_elements_visitor::handle_if_block(exec_list *instructions)
/* Populate the initial acp with a copy of the original */
foreach_in_list(acp_entry, a, orig_acp) {
this->acp->push_tail(new(this->mem_ctx) acp_entry(a));
this->acp->push_tail(new(this->acp) acp_entry(a));
}
visit_list_elements(this, instructions);
@@ -345,6 +348,7 @@ ir_copy_propagation_elements_visitor::handle_if_block(exec_list *instructions)
exec_list *new_kills = this->kills;
this->kills = orig_kills;
ralloc_free(this->acp);
this->acp = orig_acp;
this->killed_all = this->killed_all || orig_killed_all;
@@ -354,6 +358,8 @@ ir_copy_propagation_elements_visitor::handle_if_block(exec_list *instructions)
foreach_in_list_safe(kill_entry, k, new_kills) {
kill(k);
}
ralloc_free(new_kills);
}
ir_visitor_status
@@ -391,6 +397,7 @@ ir_copy_propagation_elements_visitor::visit_enter(ir_loop *ir)
exec_list *new_kills = this->kills;
this->kills = orig_kills;
ralloc_free(this->acp);
this->acp = orig_acp;
this->killed_all = this->killed_all || orig_killed_all;
@@ -398,6 +405,8 @@ ir_copy_propagation_elements_visitor::visit_enter(ir_loop *ir)
kill(k);
}
ralloc_free(new_kills);
/* already descended into the children. */
return visit_continue_with_parent;
}
@@ -423,6 +432,7 @@ ir_copy_propagation_elements_visitor::kill(kill_entry *k)
if (k->next)
k->remove();
ralloc_steal(this->kills, k);
this->kills->push_tail(k);
}

View File

@@ -1,6 +1,7 @@
noinst_LTLIBRARIES = libappleglx.la
AM_CFLAGS = \
-I$(top_srcdir)/src \
-I$(top_srcdir)/include \
-I$(top_srcdir)/src/glx \
-I$(top_srcdir)/src/mesa \

View File

@@ -1526,6 +1526,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor,
xcb_connection_t *c = XGetXCBConnection(dpy);
struct dri3_buffer *back;
int64_t ret = 0;
uint32_t options = XCB_PRESENT_OPTION_NONE;
unsigned flags = __DRI2_FLUSH_DRAWABLE;
if (flush)
@@ -1578,6 +1579,17 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor,
remainder = 0;
}
/* From the GLX_EXT_swap_control spec:
*
* "If <interval> is set to a value of 0, buffer swaps are not
* synchronized to a video frame."
*
* Implementation note: It is possible to enable triple buffering behaviour
* by not using XCB_PRESENT_OPTION_ASYNC, but this should not be the default.
*/
if (priv->swap_interval == 0)
options |= XCB_PRESENT_OPTION_ASYNC;
back->busy = 1;
back->last_swap = priv->send_sbc;
xcb_present_pixmap(c,
@@ -1591,7 +1603,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor,
None, /* target_crtc */
None,
back->sync_fence,
XCB_PRESENT_OPTION_NONE,
options,
target_msc,
divisor,
remainder, 0, NULL);

View File

@@ -65,10 +65,23 @@ dri2_convert_glx_query_renderer_attribs(int attribute)
return -1;
}
/* Convert internal dri context profile bits into GLX context profile bits */
static inline void
dri_convert_context_profile_bits(int attribute, unsigned int *value)
{
if (attribute == GLX_RENDERER_PREFERRED_PROFILE_MESA) {
if (value[0] == (1U << __DRI_API_OPENGL_CORE))
value[0] = GLX_CONTEXT_CORE_PROFILE_BIT_ARB;
else if (value[0] == (1U << __DRI_API_OPENGL))
value[0] = GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB;
}
}
_X_HIDDEN int
dri2_query_renderer_integer(struct glx_screen *base, int attribute,
unsigned int *value)
{
int ret;
struct dri2_screen *const psc = (struct dri2_screen *) base;
/* Even though there are invalid values (and
@@ -81,8 +94,11 @@ dri2_query_renderer_integer(struct glx_screen *base, int attribute,
if (psc->rendererQuery == NULL)
return -1;
return psc->rendererQuery->queryInteger(psc->driScreen, dri_attribute,
value);
ret = psc->rendererQuery->queryInteger(psc->driScreen, dri_attribute,
value);
dri_convert_context_profile_bits(attribute, value);
return ret;
}
_X_HIDDEN int
@@ -108,6 +124,7 @@ _X_HIDDEN int
dri3_query_renderer_integer(struct glx_screen *base, int attribute,
unsigned int *value)
{
int ret;
struct dri3_screen *const psc = (struct dri3_screen *) base;
/* Even though there are invalid values (and
@@ -120,8 +137,11 @@ dri3_query_renderer_integer(struct glx_screen *base, int attribute,
if (psc->rendererQuery == NULL)
return -1;
return psc->rendererQuery->queryInteger(psc->driScreen, dri_attribute,
value);
ret = psc->rendererQuery->queryInteger(psc->driScreen, dri_attribute,
value);
dri_convert_context_profile_bits(attribute, value);
return ret;
}
_X_HIDDEN int
@@ -147,6 +167,7 @@ _X_HIDDEN int
drisw_query_renderer_integer(struct glx_screen *base, int attribute,
unsigned int *value)
{
int ret;
struct drisw_screen *const psc = (struct drisw_screen *) base;
/* Even though there are invalid values (and
@@ -159,8 +180,11 @@ drisw_query_renderer_integer(struct glx_screen *base, int attribute,
if (psc->rendererQuery == NULL)
return -1;
return psc->rendererQuery->queryInteger(psc->driScreen, dri_attribute,
value);
ret = psc->rendererQuery->queryInteger(psc->driScreen, dri_attribute,
value);
dri_convert_context_profile_bits(attribute, value);
return ret;
}
_X_HIDDEN int

View File

@@ -143,8 +143,13 @@ __glXWireToEvent(Display *dpy, XEvent *event, xEvent *wire)
aevent->ust = ((CARD64)awire->ust_hi << 32) | awire->ust_lo;
aevent->msc = ((CARD64)awire->msc_hi << 32) | awire->msc_lo;
if (awire->sbc < glxDraw->lastEventSbc)
glxDraw->eventSbcWrap += 0x100000000;
/* Handle 32-Bit wire sbc wraparound in both directions to cope with out
* of sequence 64-Bit sbc's
*/
if ((int64_t) awire->sbc < ((int64_t) glxDraw->lastEventSbc - 0x40000000))
glxDraw->eventSbcWrap += 0x100000000;
if ((int64_t) awire->sbc > ((int64_t) glxDraw->lastEventSbc + 0x40000000))
glxDraw->eventSbcWrap -= 0x100000000;
glxDraw->lastEventSbc = awire->sbc;
aevent->sbc = awire->sbc + glxDraw->eventSbcWrap;
return True;

View File

@@ -221,7 +221,10 @@ DRI_glXUseXFont(struct glx_context *CC, Font font, int first, int count, int lis
XGCValues values;
unsigned long valuemask;
XFontStruct *fs;
#if !defined(GLX_USE_APPLEGL)
__GLXDRIdrawable *glxdraw;
#endif
GLint swapbytes, lsbfirst, rowlength;
GLint skiprows, skippixels, alignment;
@@ -234,9 +237,11 @@ DRI_glXUseXFont(struct glx_context *CC, Font font, int first, int count, int lis
dpy = CC->currentDpy;
win = CC->currentDrawable;
#if !defined(GLX_USE_APPLEGL)
glxdraw = GetGLXDRIDrawable(CC->currentDpy, CC->currentDrawable);
if (glxdraw)
win = glxdraw->xDrawable;
#endif
fs = XQueryFont(dpy, font);
if (!fs) {

View File

@@ -64,6 +64,7 @@
* Rob Clark <robclark@freedesktop.org>
*/
#include <sys/stat.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
@@ -80,7 +81,6 @@
#endif
#endif
#ifdef HAVE_SYSFS
#include <sys/stat.h>
#include <sys/types.h>
#endif
#include "loader.h"

View File

@@ -122,5 +122,5 @@ format_info_deps := \
$(LOCAL_PATH)/main/format_parser.py \
$(FORMAT_INFO)
$(intermediates)/main/format_info.c: $(format_info_deps)
$(intermediates)/main/format_info.h: $(format_info_deps)
@$(MESA_PYTHON2) $(FORMAT_INFO) $< > $@

View File

@@ -64,7 +64,7 @@ include Makefile.sources
BUILT_SOURCES = \
main/get_hash.h \
main/format_info.c \
main/format_info.h \
$(BUILDDIR)main/git_sha1.h \
$(BUILDDIR)program/program_parse.tab.c \
$(BUILDDIR)program/lex.yy.c
@@ -82,14 +82,14 @@ main/get_hash.h: $(GLAPI)/gl_and_es_API.xml main/get_hash_params.py \
-f $< > $@.tmp; \
mv $@.tmp $@;
main/format_info.c: main/formats.csv \
main/format_info.h: main/formats.csv \
main/format_parser.py main/format_info.py
$(AM_V_GEN)set -e; \
$(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/main/format_info.py \
$< > $@.tmp; \
mv $@.tmp $@;
main/formats.c: main/format_info.c
main/formats.c: main/format_info.h
noinst_LTLIBRARIES = $(ARCH_LIBS)
if NEED_LIBMESA

View File

@@ -60,7 +60,7 @@ get_hash_header = env.CodeGenerate(
)
format_info = env.CodeGenerate(
target = 'main/format_info.c',
target = 'main/format_info.h',
script = 'main/format_info.py',
source = 'main/formats.csv',
command = python_cmd + ' $SCRIPT ' + ' $SOURCE > $TARGET'

View File

@@ -1096,11 +1096,8 @@ struct brw_context
uint32_t pma_stall_bits;
struct {
/** Does the current draw use the index buffer? */
bool indexed;
int start_vertex_location;
int base_vertex_location;
/** The value of gl_BaseVertex for the current _mesa_prim. */
int gl_basevertex;
/**
* Buffer and offset used for GL_ARB_shader_draw_parameters

View File

@@ -280,6 +280,19 @@ brw_upload_constant_buffer(struct brw_context *brw)
*/
emit:
/* Work around mysterious 965 hangs that appear to happen if you do
* two 3DPRIMITIVEs with only a CONSTANT_BUFFER inbetween. If we
* haven't already flushed for some other reason, explicitly do so.
*
* We've found no documented reason why this should be necessary.
*/
if (brw->gen == 4 && !brw->is_g4x &&
(brw->state.dirty.brw & (BRW_NEW_BATCH | BRW_NEW_PSP)) == 0) {
BEGIN_BATCH(1);
OUT_BATCH(MI_FLUSH);
ADVANCE_BATCH();
}
/* BRW_NEW_URB_FENCE: From the gen4 PRM, volume 1, section 3.9.8
* (CONSTANT_BUFFER (CURBE Load)):
*

View File

@@ -551,6 +551,7 @@
#define BRW_SURFACE_PITCH_MASK INTEL_MASK(19, 3)
#define BRW_SURFACE_TILED (1 << 1)
#define BRW_SURFACE_TILED_Y (1 << 0)
#define HSW_SURFACE_IS_INTEGER_FORMAT (1 << 18)
/* Surface state DW4 */
#define BRW_SURFACE_MIN_LOD_SHIFT 28

View File

@@ -239,7 +239,7 @@ static const struct brw_device_info brw_device_info_chv = {
.has_llc = false,
.max_vs_threads = 80,
.max_gs_threads = 80,
.max_wm_threads = 102,
.max_wm_threads = 128,
.urb = {
.size = 128,
.min_vs_entries = 64,

Some files were not shown because too many files have changed in this diff Show More