Compare commits

137 Commits

Author SHA1 Message Date
Emil Velikov
cb154bb221 docs: Add sha256 sums for the 10.4.7 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-21 00:50:13 +00:00
Emil Velikov
d26f3c1f86 Add release notes for the 10.4.7 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-21 00:26:27 +00:00
Emil Velikov
b7b218f3f6 Update version to 10.4.7
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-21 00:19:39 +00:00
Marek Olšák
832c94a55c radeonsi: increase coords array size for radeon_llvm_emit_prepare_cube_coords
radeon_llvm_emit_prepare_cube_coords uses coords[4] in some cases (TXB2 etc.)

Discovered by Coverity. Reported by Ilia Mirkin.

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit a984abdad3)
2015-03-18 21:49:33 +00:00
Mario Kleiner
70832be2f1 glx: Handle out-of-sequence swap completion events correctly. (v2)
The code for emitting INTEL_swap_events swap completion
events needs to translate from 32-Bit sbc on the wire to
64-Bit sbc for the events and handle wraparound accordingly.

It assumed that events would be sent by the server in the
order their corresponding swap requests were emitted from
the client, in other words the sbc count should always be increasing. This
was correct for DRI2.

This is not always the case under the DRI3/Present backend,
where the Present extension can execute presents and send out
completion events in a different order than the submission
order of the present requests, due to client code specifying
targetMSC target vblank counts which are not strictly
monotonically increasing. This confused the wraparound
handling. This patch fixes the problem by handling 32-Bit
wraparound in both directions. As long as successive swap
completion events' real 64-Bit sbc's don't differ by more
than 2^30, this should be able to do the right thing.

How this is supposed to work:

awire->sbc contains the low 32-Bits of the true 64-Bit sbc
of the current swap event, transmitted over the wire.

glxDraw->lastEventSbc contains the low 32-Bits of the 64-Bit
sbc of the most recently processed swap event.

glxDraw->eventSbcWrap is a 64-Bit offset which tracks the upper
32-Bits of the current sbc. The final 64-Bit output sbc
aevent->sbc is computed from the sum of awire->sbc and
glxDraw->eventSbcWrap.

Under DRI3/Present, swap completion events can be received
slightly out of order due to non-monotonic targetMsc specified
by client code, e.g., present request submission:

Submission sbc:   1   2   3
targetMsc:        10  11  9

Reception of completion events:
Completion sbc:   3   1   2

The completion sequence 3, 1, 2 would confuse the old wraparound
handling made for DRI2 as 1 < 3 --> Assumes a 32-Bit wraparound
has happened when it hasn't.

The client can queue multiple present requests, in the case of
Mesa up to n requests for n-buffered rendering, e.g., n = 2-4 in
the current Mesa GLX DRI3/Present implementation. In the case of
direct Pixmap presents via xcb_present_pixmap() the number n is
limited by the amount of memory available.

We reasonably assume that the number of outstanding requests n is
much less than 2 billion due to memory constraints and common sense.
Therefore while the order of received sbc's can be a bit scrambled,
successive 64-Bit sbc's won't deviate by much, a given sbc may be
a few counts lower or higher than the previous received sbc.

Therefore any large difference between the incoming awire->sbc and
the last recorded glxDraw->lastEventSbc will be due to 32-Bit
wraparound and we need to adapt glxDraw->eventSbcWrap accordingly
to adjust the upper 32-Bits of the sbc.

Two cases, corresponding to the two if-statements in the patch:

a) Previous sbc event was below the last 2^32 boundary, in the previous
glxDraw->eventSbcWrap epoch, the new sbc event is in the next 2^32
epoch, therefore the low 32-Bit awire->sbc wrapped around to zero,
or close to zero --> awire->sbc is apparently much lower than the
glxDraw->lastEventSbc recorded for the previous epoch

--> We need to increment glxDraw->eventSbcWrap by 2^32 to adjust
the current epoch to be one higher than the previous one.

--> Case a) also handles the old DRI2 behaviour.

b) Previous sbc event was above closest 2^32 boundary, but now a
late event from the previous 2^32 epoch arrives, with a true sbc
that belongs to the previous 2^32 segment, so the awire->sbc of
this late event has a high count close to 2^32, whereas
glxDraw->lastEventSbc is closer to zero --> awire->sbc is much
greater than glxDraw->lastEventSbc.

--> We need to decrement glxDraw->eventSbcWrap by 2^32 to adjust
the current epoch back to the previous lower epoch of this late
completion event.

We assume such a wraparound to a higher (a) epoch or lower (b)
epoch has happened if awire->sbc and glxDraw->lastEventSbc differ
by more than 2^30 counts, as such a difference can only happen
on wraparound, or if somehow 2^30 present requests would be pending
for a given drawable inside the server, which is rather unlikely.
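
The two cases above can be sketched in C roughly as follows. This is a minimal illustration of the described logic, not the patch itself: `struct sbc_tracker` and `sbc_widen()` are hypothetical names standing in for the glxDraw state and the event-translation code, and only the field names `lastEventSbc` and `eventSbcWrap` come from the message.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-drawable state named in the message. */
struct sbc_tracker {
    uint32_t lastEventSbc;  /* low 32 bits of the most recently processed sbc */
    int64_t  eventSbcWrap;  /* multiple of 2^32 carrying the upper 32 bits */
};

/* Widen a 32-bit wire sbc to 64 bits, tolerating slightly reordered events. */
static int64_t sbc_widen(struct sbc_tracker *t, uint32_t wire_sbc)
{
    const int64_t threshold = INT64_C(1) << 30;  /* max plausible reordering */
    const int64_t epoch     = INT64_C(1) << 32;  /* one 32-bit wraparound */

    /* Case a): the wire value dropped far below the last one, so the
     * counter wrapped forward into the next epoch. */
    if ((int64_t)wire_sbc < (int64_t)t->lastEventSbc - threshold)
        t->eventSbcWrap += epoch;

    /* Case b): the wire value jumped far above the last one, so this is
     * a late event that belongs to the previous epoch. */
    if ((int64_t)wire_sbc > (int64_t)t->lastEventSbc + threshold)
        t->eventSbcWrap -= epoch;

    t->lastEventSbc = wire_sbc;
    return (int64_t)wire_sbc + t->eventSbcWrap;
}
```

With the tracker sitting at 0xFFFFFFFE, a wire value of 1 moves to the next epoch, a late 0xFFFFFFFF moves back to the previous one, and small reorderings such as 3, 1, 2 leave the epoch untouched.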

v2: Explain the reason for this patch and the new wraparound handling
    much more extensively in the commit message, no code change wrt. the
    initial version.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit cc5ddd584d)
2015-03-18 21:49:25 +00:00
Emil Velikov
ad259df2e0 auxiliary/os: fix the android build - s/drm_munmap/os_munmap/
Squash this silly typo introduced with commit c63eb5dd5ec (auxiliary/os: get
the mmap/munmap wrappers working with android)

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 55f0c0a29f)
2015-03-18 21:49:18 +00:00
Emil Velikov
df2db2a55f loader: include <sys/stat.h> for non-sysfs builds
Required by fstat(), otherwise we'll error out due to implicit function
declaration.

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89530
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reported-by: Vadim Rutkovsky <vrutkovs@redhat.com>
Tested-by: Vadim Rutkovsky <vrutkovs@redhat.com>
(cherry picked from commit 771cd266b9)
2015-03-18 21:49:05 +00:00
Rob Clark
0506f69f08 freedreno: update generated headers
Fix a3xx texture layer-size.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e92bc6b38e)
[Emil Velikov: squash trivial conflicts, drop the a4xx.xml.h changes]

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/gallium/drivers/freedreno/a2xx/a2xx.xml.h
	src/gallium/drivers/freedreno/a3xx/a3xx.xml.h
	src/gallium/drivers/freedreno/a4xx/a4xx.xml.h
	src/gallium/drivers/freedreno/adreno_common.xml.h
	src/gallium/drivers/freedreno/adreno_pm4.xml.h
2015-03-18 21:48:40 +00:00
Ilia Mirkin
a563045009 freedreno: fix slice pitch calculations
For example if width were 65, the first slice would get 96 while the
second would get 32. However the hardware appears to expect the second
pitch to be 64, based on halving the 96 (and aligning up to 32).
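
The arithmetic in the width-65 example can be reproduced with a small sketch. The helper names below are hypothetical and the 32-unit alignment is taken from the message; the real driver code may be structured differently.

```c
#include <stdint.h>

/* Round v up to the next multiple of a (a must be a power of two). */
static uint32_t align_pot(uint32_t v, uint32_t a)
{
    return (v + a - 1) & ~(a - 1);
}

/* Halve a dimension for the next mip level, never going below 1. */
static uint32_t minify(uint32_t v)
{
    return v > 1 ? v >> 1 : 1;
}

/* Pitch derived from each level's own minified width. */
static uint32_t pitch_from_width(uint32_t width, unsigned level)
{
    while (level--)
        width = minify(width);
    return align_pot(width, 32);
}

/* Pitch derived by halving the previous level's pitch, then aligning. */
static uint32_t pitch_from_prev_pitch(uint32_t width, unsigned level)
{
    uint32_t pitch = align_pot(width, 32);
    while (level--)
        pitch = align_pot(minify(pitch), 32);
    return pitch;
}
```

For width 65 both helpers give 96 at level 0. At level 1 the first gives 32 (65 halves to 32), while the second gives 64 (96 halves to 48, which aligns up to 64), matching what the hardware appears to expect.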

This fixes texelFetch piglit tests on a3xx below a certain size. Going
higher they break again, but most likely due to unrelated reasons.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 620e29b748)
2015-03-18 21:32:21 +00:00
Samuel Iglesias Gonsalvez
b2e243f70c glsl: optimize (0 cmp x + y) into (-x cmp y).
The optimization done by commit 34ec1a24d did not take it into account.

Fixes:

dEQP-GLES3.functional.shaders.random.all_features.fragment.20

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b43bbfa90a)
2015-03-18 21:15:35 +00:00
Iago Toral Quiroga
8c25b0f2d1 i965: Fix out-of-bounds accesses into pull_constant_loc array
The piglit test glsl-fs-uniform-array-loop-unroll.shader_test was designed
to do an out of bounds access into a uniform array to make sure that we
handle that situation gracefully inside the driver, however, as Ken describes
in bug 79202, Valgrind reports that this is leading to an out-of-bounds access
in fs_visitor::demote_pull_constants().

Before accessing the pull_constant_loc array we should make sure that
the uniform we are trying to access is valid.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79202
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 6ac1bc90c4)
Nominated-by: Matt Turner <mattst88@gmail.com>
2015-03-11 17:46:03 +00:00
Rob Clark
a91ee1e187 freedreno/ir3: fix silly typo for binning pass shaders
Was resulting in gl_PointSize write being optimized out, causing
particle system type shaders to hang if hw binning enabled.

Fixes neverball, OGLES2ParticleSystem, etc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 60096ed906)
2015-03-11 17:44:38 +00:00
Marek Olšák
977626f10a r300g: fix sRGB->sRGB blits
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c939231e72)
2015-03-11 17:42:52 +00:00
Marek Olšák
b451a2ffbf r300g: fix a crash when resolving into an sRGB texture
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9953586af2)
2015-03-11 17:42:38 +00:00
Marek Olšák
a561eee82c r300g: fix RGTC1 and LATC1 SNORM formats
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 74a757f92f)
2015-03-11 17:42:07 +00:00
Stefan Dösinger
80ef80d087 r300g: Fix the ATI1N swizzle (RGTC1 and LATC1)
This fixes the GL_COMPRESSED_RED_RGTC1 part of piglit's rgtc-teximage-01
test as well as the precision part of Wine's 3dc format test (fd.o bug
89156).

The Z component seems to contain a lower precision version of the
result, probably a temporary value from the decompression computation.
The Y and W component contain different data that depends on the input
values as well, but I could not make sense of them (Not that I tried
very hard).

GL_COMPRESSED_SIGNED_RED_RGTC1 still seems to have precision problems in
piglit, and both formats are affected by a compiler bug if they're
sampled by the shader with a swizzle other than .xyzw. Wine uses .xxxx,
which returns random garbage.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89156
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f710b99071)
2015-03-11 17:41:43 +00:00
Ilia Mirkin
fa8bfb3ed1 freedreno/ir3: get the # of miplevels from getinfo
This fixes ARB_texture_query_levels to actually return the desired
value.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cb3eb43ad6)
2015-03-11 17:41:32 +00:00
Ilia Mirkin
025cf8cb3f freedreno/ir3: fix array count returned by TXQ
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8ac957a51c)
2015-03-11 17:41:20 +00:00
Ilia Mirkin
4db4f70546 freedreno: move fb state copy after checking for size change
Fixes: 1f3ca56b ("freedreno: use util_copy_framebuffer_state()")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f3dfe6513c)
2015-03-11 17:40:59 +00:00
Andrey Sudnik
d4a95ffcda i965/vec4: Don't lose the saturate modifier in copy propagation.
Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89224
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0dfec59a27)
2015-03-07 16:41:16 +00:00
Emil Velikov
97b0219ed5 mesa: rename format_info.c to format_info.h
The file is auto-generated, and #included by formats.c. Let's rename it
to reflect the latter. This will also help us fix the dependency
tracking by adding it to the _SOURCES variable, without the side effect
of it being compiled (twice).

v2: Update .gitignore to reflect the rename.

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 3f6c28f2a9)

Conflicts:
	src/mesa/Makefile.am
	src/mesa/main/.gitignore
2015-03-07 16:40:27 +00:00
Matt Turner
93273f16af r300g: Check return value of snprintf().
Would have at least prevented the crash the previous patch fixed.

Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540970
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit ade0b580e7)
2015-03-07 16:37:22 +00:00
Matt Turner
8e8d215cae r300g: Use PATH_MAX instead of limiting ourselves to 100 chars.
When built with Gentoo's package manager, the Mesa source directory
exists seven directories deep. The path to the .test file is too long
and is silently truncated, leading to a crash. Just use PATH_MAX.
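
A minimal sketch of the pattern these two r300g commits describe: size the buffer generously and treat a truncated snprintf() result as an error. `build_test_path()` is a hypothetical helper, not the function touched by the patches.

```c
#include <stdio.h>
#include <stddef.h>

/* Join dir and name into buf.
 * Returns 0 on success, -1 if formatting failed or the result was truncated. */
static int build_test_path(char *buf, size_t size,
                           const char *dir, const char *name)
{
    int n = snprintf(buf, size, "%s/%s", dir, name);

    /* snprintf returns the length it wanted to write; a value that does
     * not fit in the buffer means the output was cut short. */
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

A caller would typically pass a buffer of PATH_MAX bytes (from `<limits.h>` on POSIX systems) and bail out on -1 instead of opening a truncated path.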

Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540970
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit f5e2aa1324)
2015-03-07 16:37:15 +00:00
Daniel Stone
1a929baa0b egl: Take alpha bits into account when selecting GBM formats
This fixes piglit when using PIGLIT_PLATFORM=gbm

Tom Stellard:
  - Fix ARGB2101010 format

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
(cherry picked from commit 65c8965d03)
2015-03-07 16:37:04 +00:00
Marc-Andre Lureau
3a625d0b3f gallium/auxiliary/indices: fix start param
Since commit 28f3f8d, the indices generators take a start parameter. However,
some index values have been left to start at 0.

This fixes the glean/fbo test with the virgl driver, and copytexsubimage
with freedreno.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 073a5d2e84)
2015-03-07 16:36:47 +00:00
Emil Velikov
944ef59b2f cherry-ignore: add not applicable/rejected commits
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-07 16:36:05 +00:00
Emil Velikov
fc9dd495b2 docs: Add sha256 sums for the 10.4.6 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 19:44:55 +00:00
Emil Velikov
542a754524 Add release notes for the 10.4.6 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 19:23:34 +00:00
Emil Velikov
e559d126f9 Update version to 10.4.6
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 19:16:58 +00:00
Emil Velikov
fc5881ad73 Revert "gallivm: Update for RTDyldMemoryManager becoming an unique_ptr."
This reverts commit 66a3f104a5.

The commit is likely insufficient for normal work with LLVM 3.6.
The full discussion and reason can be found at
http://lists.freedesktop.org/archives/mesa-dev/2015-March/078795.html
2015-03-06 19:16:28 +00:00
Emil Velikov
9508ca24f1 mesa: cherry-pick the second half of commit 2aa71e9485
Missed out by commit 39ae85732d2 (mesa: Fix error validating args for
TexSubImage3D)

Reported-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-06 19:16:19 +00:00
Matt Turner
644bbf88ec mesa: Correct backwards NULL check.
Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 491d42135a)
2015-03-06 18:45:13 +00:00
Ian Romanick
a369361f9e mesa: Always generate GL_INVALID_OPERATION in _mesa_GetProgramBinary
There are no binary formats supported, so what are you doing?  At least
this gives the application developer some feedback about what's going
on.  The spec gives no guidance about what to do in this scenario.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Leight Bade <leith@mapbox.com>
(cherry picked from commit f591712efe)
2015-03-06 18:44:52 +00:00
Ian Romanick
f1663a5236 mesa: Ensure that length is set to zero in _mesa_GetProgramBinary
v2: Fix assignment of length.  Noticed by Julien Cristau.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Leight Bade <leith@mapbox.com>
(cherry picked from commit 4fd8b30123)
2015-03-06 18:44:37 +00:00
Ian Romanick
e1b5bc9330 mesa: Add missing error checks in _mesa_ProgramBinary
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Leight Bade <leith@mapbox.com>
(cherry picked from commit 201b9c1818)

Conflicts:
	src/mesa/main/shaderapi.c
2015-03-06 18:42:51 +00:00
Emil Velikov
93edf3e7dc Revert "mesa: Correct backwards NULL check."
This reverts commit a598a9bdfe.

The patch was applied without the required dependencies.
2015-03-06 18:40:09 +00:00
José Fonseca
66a3f104a5 gallivm: Update for RTDyldMemoryManager becoming an unique_ptr.
Trivial.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=86958

(cherry picked from commit ef7e0b39a2)
Nominated-by: Sedat Dilek <sedat.dilek@gmail.com>
2015-03-04 01:51:36 +00:00
Abdiel Janulgue
afa7a851da st/mesa: For vertex shaders, don't emit saturate when SM 3.0 is unsupported
There is a bug in the current lowering pass implementation where we lower saturate
to clamp only for vertex shaders on drivers supporting SM 3.0. The correct behavior
is to actually lower to clamp only when we don't support saturate which happens
on drivers that don't support SM 3.0

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit 49e0431211)
Nominated-by: Matt Turner <mattst88@gmail.com>
2015-03-04 01:51:36 +00:00
Abdiel Janulgue
d880aa573c glsl: Don't optimize min/max into saturate when EmitNoSat is set
v3: Fix multi-line comment format (Ian)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit 4ea8c8d56c)
2015-03-04 01:51:36 +00:00
Matt Turner
741aeba26f i965/fs: Don't use backend_visitor::instructions after creating the CFG.
This is a fix for a regression introduced in commit a9f8296d ("i965/fs:
Preserve the CFG in a few more places.").

The errata this code works around is described in a comment before the function:

   "[DevBW, DevCL] Errata: A destination register from a send can not be
    used as a destination register until after it has been sourced by an
    instruction with a different destination register."

The framebuffer write's sources must be in message registers, which SEND
instructions cannot have as a destination. There's no way for this
errata to affect anything at the end of the program. Just remove the
code.

Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84613
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit e214000f25)
2015-03-04 01:51:36 +00:00
Matt Turner
a598a9bdfe mesa: Correct backwards NULL check.
Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 491d42135a)
[Emil Velikov: the patch hunk has a different offset.]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/mesa/main/shaderapi.c
2015-03-04 01:51:36 +00:00
Chris Forbes
0c46d850d9 i965/gs: Check newly-generated GS-out VUE map against correct stage
Previously, we compared our new GS-out VUE map to the existing *VS*-out
VUE map, which is bogus.

This would mostly manifest as redundant dirty flagging where the GS is
in use but the VS and GS output layouts differ; but there is a scary
case where we would fail to flag a GS-out layout change if it happened
to match the VS-out layout.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.5, 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88885
(cherry picked from commit b51ff50a76)
2015-03-04 01:51:36 +00:00
Jonathan Gray
da46b1b160 auxilary/os: correct sysctl use in os_get_total_physical_memory()
The length argument passed to sysctl was the size of the pointer
not the type.  The result of this is sysctl calls would fail on
32 bit BSD/Mac OS X.

Additionally the wrong pointer was passed as an argument to store
the result of the sysctl call.
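
The class of mistake described here can be illustrated with a portable sketch. `fake_query_physmem()` is a hypothetical stand-in that only mimics the old-value/length convention of a sysctl-style call so the example builds anywhere; it is not the real sysctl(), and the comments show one plausible form of the bug rather than the exact original code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a sysctl-style query: writes a 64-bit value
 * into oldp and fails when the caller's buffer length is too small. */
static int fake_query_physmem(void *oldp, size_t *oldlenp)
{
    const uint64_t value = UINT64_C(8) << 30;  /* pretend 8 GiB */

    if (*oldlenp < sizeof(value))
        return -1;
    memcpy(oldp, &value, sizeof(value));
    *oldlenp = sizeof(value);
    return 0;
}

static int get_total_physical_memory(uint64_t *size)
{
    /* Use the size of the pointed-to type. sizeof(size) would be the
     * pointer size, which is only 4 bytes on a 32-bit system. */
    size_t len = sizeof(*size);

    /* Pass the caller's pointer itself so the result lands in *size. */
    return fake_query_physmem(size, &len);
}
```

With `sizeof(size)` in place of `sizeof(*size)`, the length check in the stand-in would fail wherever pointers are narrower than 64 bits, which is the 32-bit failure the message describes.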

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 7983a3d2e0)
2015-03-04 01:51:36 +00:00
Matt Turner
7e723c98ce glsl: Rewrite and fix min/max to saturate optimization.
There were some bugs, and the code was really difficult to follow. We
would optimize

   min(max(x, b), 1.0) into max(sat(x), b)

but not pay attention to the order of min/max and also do

   max(min(x, b), 1.0) into max(sat(x), b)

Corrects four shaders from Champions of Regnum that do

   min(max(x, 1), 10)

and corrects rendering of Mass Effect under VMware Workstation.
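
Why the order of min and max matters can be checked numerically. The helpers below are an illustrative sketch with hypothetical names; they are not code from the GLSL optimizer.

```c
/* Small float helpers so the sketch needs no math library. */
static float minf(float a, float b) { return a < b ? a : b; }
static float maxf(float a, float b) { return a > b ? a : b; }
static float satf(float x)          { return minf(maxf(x, 0.0f), 1.0f); }

/* min(max(x, b), 1.0): clamps x to [b, 1]. */
static float min_of_max(float x, float b) { return minf(maxf(x, b), 1.0f); }

/* max(min(x, b), 1.0): a different expression with the order swapped. */
static float max_of_min(float x, float b) { return maxf(minf(x, b), 1.0f); }

/* The rewritten form max(sat(x), b). */
static float sat_form(float x, float b)   { return maxf(satf(x), b); }
```

For x = 0.5 and b = 0.25, `min_of_max` and `sat_form` both give 0.5, whereas `max_of_min` gives 1.0, so applying the same rewrite to both orderings changes the result. The first pair agrees in general only when b lies in [0, 1].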

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89180
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit cb25087c7b)
2015-03-04 01:51:36 +00:00
Andreas Boll
0a51529a28 glx: Fix returned values of GLX_RENDERER_PREFERRED_PROFILE_MESA
If the renderer supports the core profile, the query incorrectly returned
0x8 as the value, because it was using (1U << __DRI_API_OPENGL_CORE) for
the returned value.

The same happened with the compatibility profile. It returned 0x1
(1U << __DRI_API_OPENGL) instead of 0x2.

Internal DRI defines:
   dri_interface.h: #define __DRI_API_OPENGL       0
   dri_interface.h: #define __DRI_API_OPENGL_CORE  3

Those two bits are intended for internal use only and should be
translated to GLX_CONTEXT_CORE_PROFILE_BIT_ARB (0x1) for a preferred
core context profile and GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB (0x2)
for a preferred compatibility context profile.

This patch implements the above translation in the glx module.
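
The translation can be sketched as below. The numeric values are the ones quoted in the message; the `SKETCH_` macro names and `translate_preferred_profile()` are hypothetical, and returning 0 for an unrecognised value is an assumption of this sketch, not documented behaviour.

```c
/* Internal DRI API indices, values as quoted above (names are local). */
#define SKETCH_DRI_API_OPENGL       0
#define SKETCH_DRI_API_OPENGL_CORE  3

/* GLX profile bits, values as quoted above. */
#define GLX_CONTEXT_CORE_PROFILE_BIT_ARB           0x1
#define GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB  0x2

/* Map the internal (1U << api) value to the public GLX profile bit. */
static unsigned translate_preferred_profile(unsigned dri_value)
{
    if (dri_value & (1U << SKETCH_DRI_API_OPENGL_CORE))
        return GLX_CONTEXT_CORE_PROFILE_BIT_ARB;
    if (dri_value & (1U << SKETCH_DRI_API_OPENGL))
        return GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB;
    return 0;  /* assumption: unknown values map to no profile */
}
```

Without the translation, a core-capable renderer reports 0x8 and a compatibility-only one reports 0x1, which an application would misread as the core profile bit.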

v2: Fix the incorrect behavior in the glx module

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 6d164f65c5)
2015-03-04 01:51:36 +00:00
Leo Liu
2a9e9b5aeb st/omx/dec/h264: fix picture out-of-order with poc type 0 v2
poc counter should be reset with IDR frame,
otherwise there would be a re-order issue with
frames before and after IDR

v2: add commit message

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9c7b343bc0)
2015-03-04 01:51:36 +00:00
Emil Velikov
120792fa04 install-lib-links: remove the .install-lib-links file
With the earlier commit (install-lib-links: don't depend on .libs directory)
we moved the location of the file from .libs/ to the current dir.
However, we did not notice that in the former case autotools was
doing us a favour and removing the file. Explicitly remove the file at
clean-local time, otherwise we'll end up with dangling files.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit fece147be5)
2015-03-04 01:51:35 +00:00
Eduardo Lima Mitev
39ae85732d mesa: Fix error validating args for TexSubImage3D
The zoffset and depth values were not being considered when calling
error_check_subtexture_dimensions().

Fixes 2 dEQP tests:
* dEQP-GLES3.functional.negative_api.texture.texsubimage3d_neg_offset
* dEQP-GLES3.functional.negative_api.texture.texsubimage3d_invalid_offset

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedestkop.org>
(cherry picked from commit 2aa71e9485)
[Emil Velikov: Resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/mesa/main/teximage.c
2015-03-04 01:51:35 +00:00
Marek Olšák
61c1aabb9f radeonsi: fix point sprites
Broken by a27b74819a.

This fix is critical and should be ported to stable ASAP.

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7820a11e3d)

Squashed with commit

radeonsi: fix a warning caused by previous commit

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 050bf75c8b)

[Emil Velikov: The file was renamed si_state_{shaders,draw}.c]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_state_shader.c
2015-03-04 01:51:16 +00:00
Marek Olšák
6da4e66d4e vbo: fix an unitialized-variable warning
It looks like a bug to me.

Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 0feb0b7373)
2015-03-04 00:39:01 +00:00
Brian Paul
7e57411b9a st/mesa: fix sampler view reference counting bug in glDraw/CopyPixels
Use pipe_sampler_view_reference() instead of ordinary assignment.
Also add a new sanity check assertion.

Fixes piglit gl-1.0-drawpixels-color-index test crash.  But note
that the test still fails.

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 62a8883f32)
2015-03-04 00:38:31 +00:00
Brian Paul
1e6735ead1 swrast: fix multiple color buffer writing
If a fragment program wrote to more than one color buffer, the
first fragment color got replicated to all dest buffers.  This
fixes 5 piglit FBO tests, including fbo-drawbuffers-arbfp.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45348
Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 89c96afe3c)
2015-03-04 00:38:23 +00:00
Lucas Stach
deea686c71 install-lib-links: don't depend on .libs directory
This snippet can be included in Makefiles that may, depending on the
project configuration, not actually build any installable libraries.

In that case we don't have anything to depend on and this part of
the makefile may be executed before the .libs directory is created,
so do not depend on it being there.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
(cherry picked from commit 5c1aac17ad)
2015-03-04 00:38:11 +00:00
Emil Velikov
41bdeda102 docs: Add sha256 sums for the 10.4.5 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-21 12:31:51 +00:00
Emil Velikov
a5c608e951 Add release notes for the 10.4.5 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-21 12:22:08 +00:00
Emil Velikov
e0276bc297 Update version to 10.4.5
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-21 12:17:35 +00:00
Michel Dänzer
dc16fb1969 Revert "radeon/llvm: enable unsafe math for graphics shaders"
This reverts commit 0e9cdedd2e.

It caused the grass to disappear in The Talos Principle.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89069
Cc: "10.5 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 4db985a5fa)
2015-02-18 12:17:44 +00:00
Kenneth Graunke
aaa823569b glsl: Reduce memory consumption of copy propagation passes.
opt_copy_propagation and opt_copy_propagation_elements create new ACP
and Kill sets each time they enter a new control flow block.  For if
blocks, they also copy the entire existing ACP set contents into the
new set.

When we exit the control flow block, we discard the new sets.  However,
we weren't freeing them - so they lived on until the pass finished.
This can waste a lot of memory (57MB on one pessimal shader).

This patch makes the pass allocate ACP entries using this->acp as the
memory context, and Kill entries out of this->kill.  It also steals
kill entries when moving them from the inner kill list to the parent.

It then frees the lists, including their contents.

v2: Move ralloc_free(this->acp) just before this->acp = orig_acp
    (suggested by Eric Anholt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.5 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 76960a55e6)
2015-02-18 12:17:44 +00:00
Laura Ekstrand
f57b41758d main: Fixed _mesa_GetCompressedTexImage_sw to copy slices correctly.
Previously array textures were not working with GetCompressedTextureImage,
leading to failures in the test
arb_direct_state_access/getcompressedtextureimage.c.

Tested-by: Laura Ekstrand <laura@jlekstrand.net>
Reviewed-by: Brian Paul <brianp@vmware.com>

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 92163482bd)
2015-02-18 12:17:44 +00:00
Marek Olšák
67ac6a3951 radeonsi: fix a crash if a stencil ref state is set before a DSA state
+ minor indentation fixes

Discovered by Axel Davy.

This can't be reproduced with any app, because all state trackers set a DSA
state first.

Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 2ead74888a)
2015-02-18 12:17:44 +00:00
Marek Olšák
5d04b9eeed mesa: fix AtomicBuffer typo in _mesa_DeleteBuffers
Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit e8625a29fe)
2015-02-18 12:17:43 +00:00
Marek Olšák
53041aecef radeonsi: small fix in SPI state
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

(cherry picked from commit a27b74819a)
[Emil Velikov: The file was renamed si_state_{shaders,draw}.c]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
        src/gallium/drivers/radeonsi/si_state_shader.c
2015-02-18 12:14:04 +00:00
Ilia Mirkin
f76bcbb4cd nvc0: allow holes in xfb target lists
Tested with a modified xfb-streams test which outputs to streams 0, 2,
and 3.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 854eb06bee)
2015-02-18 12:09:55 +00:00
Ilia Mirkin
89289934fc st/mesa: treat resource-less xfb buffers as if they weren't there
If a transform feedback buffer's size is 0, st_bufferobj_data doesn't
end up creating a buffer for it. There's no point in trying to write to
such a buffer, so just pretend as if it's not really there.

This fixes arb_gpu_shader5-xfb-streams-without-invocations on nvc0.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 80d373ed5b)
2015-02-18 12:09:54 +00:00
Ilia Mirkin
dbf82d753b nvc0: bail out of 2d blits with non-A8_UNORM alpha formats
This fixes the teximage-colors uploads with GL_ALPHA format and
non-GL_UNSIGNED_BYTE type.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 68e4f3f572)
2015-02-18 12:09:54 +00:00
Emil Velikov
b786e6332b get-pick-list.sh: Require explicit "10.4" for nominating stable patches
A nomination unadorned with a specific version is now interpreted as
being aimed at the 10.5 branch, which was recently opened.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-18 12:09:54 +00:00
Carl Worth
c0ce908a90 Revert use of Mesa IR optimizer for ARB_fragment_programs
Commit f82f2fb3dc added use of the Mesa
IR optimizer for both ARB_fragment_program and ARB_vertex_program, but
only justified the vertex-program portions with measured performance
improvements.

Meanwhile, the optimizer was seen to generate hundreds of unused
immediates without discarding them, causing failures.

Discard the use of the optimizer for now to fix the regression. (In
the future, we anticipate things moving from Mesa IR to NIR for better
optimization anyway.)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82477

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

CC: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 55a57834bf)
2015-02-18 12:09:54 +00:00
Kenneth Graunke
c83c5f4b69 i965: Fix integer border color on Haswell.
+82 Piglits - 100% of border color tests now pass on Haswell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 08a06b6b89)
2015-02-18 12:09:54 +00:00
Kenneth Graunke
f2663112f6 i965: Use a gl_color_union for sampler border color.
This should have no effect, but will make it easier to implement other
bug fixes.

v2: Eliminate "unsigned one" local; just use the value where necessary.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit e1e73443c5)
2015-02-18 12:09:54 +00:00
Kenneth Graunke
2ad93851ff i965: Override swizzles for integer luminance formats.
The hardware's integer luminance formats are completely unusable;
currently we fall back to RGBA.  This means we need to override
the texture swizzle to obtain the XXX1 values expected for luminance
formats.

Fixes spec/EXT_texture_integer/texwrap formats bordercolor [swizzled]
on Broadwell - 100% of border color tests now pass on Broadwell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8cb18760cc)
2015-02-18 12:09:54 +00:00
Michel Dänzer
e35e6773c2 st/mesa: Don't use PIPE_USAGE_STREAM for GL_PIXEL_UNPACK_BUFFER_ARB
The latter currently implies CPU read access, so only PIPE_USAGE_STAGING
can be expected to be fast.

Mesa demos src/tests/streaming_rect on Kaveri (radeonsi):

Unpatched:  42 frames in  1.023 seconds = 41.056 FPS
Patched:   615 frames in  1.000 seconds = 615.000 FPS

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88658
Cc: "10.3 10.4" <mesa-stable@lists.freedestkop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a338dc0186)
2015-02-18 12:09:54 +00:00
Marek Olšák
51bdd19c97 radeonsi: fix instanced arrays with non-zero start instance
Fixes piglit ARB_base_instance/arb_base_instance-drawarrays.

Cc: 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 50908a8918)
2015-02-18 12:09:54 +00:00
Marek Olšák
5c623ff071 r600g,radeonsi: don't append to streamout buffers that haven't been used yet
The FILLED_SIZE counter is uninitialized at the beginning, so we can't use it.
Instead, use offset = 0, which is what we always do when not appending.

This unexpectedly fixes spec/ARB_texture_multisample/sample-position/*.
Yes, the test does use transform feedback.

Cc: 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 658f1d4cfe)
2015-02-18 12:09:53 +00:00
Jeremy Huddleston Sequoia
654f197f19 darwin: build fix
xfont.c:237:14: error: implicit declaration of function 'GetGLXDRIDrawable' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
   glxdraw = GetGLXDRIDrawable(CC->currentDpy, CC->currentDrawable);
             ^
Fixes regression from 291be28476

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
(cherry picked from commit e68b67b53f)
2015-02-11 00:24:04 -08:00
Jeremy Huddleston Sequoia
162cee83ba darwin: build fix
../../../src/mesa/main/compiler.h:47:10: fatal error: 'util/macros.h' file not found

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
(cherry picked from commit 1c67a5687a)
2015-02-10 20:35:33 -08:00
Emil Velikov
54da987bae docs: Add sha256 sums for the 10.4.4 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-07 00:47:18 +00:00
Emil Velikov
62eb27ac8b Add release notes for the 10.4.4 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-07 00:17:09 +00:00
Emil Velikov
a824179af5 Update version to 10.4.4
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-02-07 00:12:04 +00:00
Park, Jeongmin
fecedb6c43 st/osmesa: Fix osbuffer->textures indexing
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88930
Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 6fd4a61ad6)
2015-02-04 01:37:33 +00:00
Matt Turner
9d1d1f46c7 gallium/util: Don't use __builtin_clrsb in util_last_bit().
Unclear circumstances lead to undefined symbols on x86.

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=536916
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 32e98e8ef0)
2015-02-04 01:37:20 +00:00
José Fonseca
b51d369690 egl: Pass the correct X visual depth to xcb_put_image().
The dri2_x11_add_configs_for_visuals() function happily matches a 32
bits EGLconfig with a 24 bits X visual.  However it was passing 32bits
depth to xcb_put_image(), making X server unhappy:

  https://github.com/apitrace/apitrace/issues/313#issuecomment-70571911

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 11a955aef4)
2015-02-02 00:12:04 +00:00
Niels Ole Salscheider
eab8dc28ed configure: Link against all LLVM targets when building clover
Since 8e7df519bd, we initialise all targets in
clover. This fixes bug 85380.

v2: Mention correct bug in commit message

Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4b94c3fc31)
2015-02-02 00:12:04 +00:00
Ville Syrjälä
cc580045a8 i965: Fix max_wm_threads for CHV
Change max_wm_threads to match the spec on CHV. The max number of
threads in 3DSTATE_PS is always programmed to 64 and the hardware
internally scales that depending on the GT SKU. So this doesn't
change the max number of threads actually used, but it does affect
the scratch space calculation.

On CHV the old value was too small, so the amount of scratch space
allocated wasn't sufficient to satisfy the actual max number of
threads used.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit 99754446ab)
2015-02-02 00:12:04 +00:00
Mario Kleiner
0d721fa1d6 glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)
Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>

v2: Add Frank Binns' Signed-off-by for his original, earlier
patch from April 2014, which is identical to this one, and
Chris Wilson's Reviewed-by tag from May 2014 for that patch, ergo
also for this one.

v3: Incorporate comment about triple buffering as suggested
by Axel Davy, and reference to relevant spec provided by
Eric Anholt.

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 455d3036fa)
2015-02-02 00:12:04 +00:00
Brian Paul
c96ed76b3d mesa: fix display list 8-byte alignment issue
The _mesa_dlist_alloc() function is only guaranteed to return a pointer
with 4-byte alignment.  On 64-bit systems which don't support unaligned
loads (e.g. SPARC or MIPS) this could lead to a bus error in the VBO code.

The solution is to add a new  _mesa_dlist_alloc_aligned() function which
will return a pointer to an 8-byte aligned address on 64-bit systems.
This is accomplished by inserting a 4-byte NOP instruction in the display
list when needed.

The only place this actually matters is the VBO code where we need to
allocate a 'struct vbo_save_vertex_list' which needs to be 8-byte
aligned (just as if it were malloc'd).

The gears demo and others hit this bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88662
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 53b01938ed)
2015-01-30 08:51:51 -07:00
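The padding idea described in c96ed76b3d can be sketched roughly as follows. This is a simplified illustration, not the Mesa implementation: the `sketch_*` names, the fixed-size word array, and the `SKETCH_NOP` value are all hypothetical, and real display-list nodes carry an opcode header that this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NOP 0u        /* hypothetical opcode for a one-word padding node */
#define SKETCH_CAPACITY 64   /* hypothetical fixed capacity, in 4-byte words */

struct sketch_list {
    /* Storage is 8-byte aligned, so an even word index is an 8-byte boundary. */
    _Alignas(8) uint32_t words[SKETCH_CAPACITY];
    size_t used;             /* number of 4-byte words already consumed */
};

/* Reserve n_words; the returned pointer is only guaranteed 4-byte alignment. */
static void *sketch_alloc(struct sketch_list *l, size_t n_words)
{
    assert(l->used + n_words <= SKETCH_CAPACITY);
    void *p = &l->words[l->used];
    l->used += n_words;
    return p;
}

/* Same, but first emit a one-word NOP when that is needed to reach
 * an 8-byte boundary. */
static void *sketch_alloc_aligned(struct sketch_list *l, size_t n_words)
{
    if (l->used % 2 != 0) {
        uint32_t *nop = sketch_alloc(l, 1);
        *nop = SKETCH_NOP;
    }
    return sketch_alloc(l, n_words);
}
```

The cost of the aligned variant is at most one extra 4-byte word per allocation, which is presumably why the commit reserves it for the one structure that needs it.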
Emil Velikov
49a5bce780 docs: Add sha256 sums for the 10.4.3 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-24 12:54:33 +00:00
Emil Velikov
e92bfa3f95 Add release notes for the 10.4.3 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-24 12:49:17 +00:00
Emil Velikov
f70e4d4afd Update version to 10.4.3
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-24 12:44:46 +00:00
Axel Davy
42806f12a9 st/nine: Allocate vs constbuf buffer for indirect addressing once.
When the shader does indirect addressing on the constants,
we allocate a temporary constant buffer to which we copy
the constants from the app given user constants and
the constants filled in the shader.

This patch makes this buffer be allocated once.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f8a74410f1)
2015-01-23 00:47:26 +00:00
Axel Davy
4c9b64fc44 st/nine: Allocate the correct size for the user constant buffer
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e0f75044c8)
2015-01-23 00:47:26 +00:00
Axel Davy
69c7cf70e7 st/nine: Add variables containing the size of the constant buffers
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b9cbea9dbc)
2015-01-23 00:47:26 +00:00
Axel Davy
4d04fd0871 st/nine: Fix sm3 relative addressing for non-debug build
Relative addressing needs the constant buffer to get all
the correct constants, even those defined by the shader.

The code to copy the shader constants to the constant buffer
was enabled only for debug build. Enable it always.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit a721987077)
2015-01-23 00:47:25 +00:00
Axel Davy
0727ab961c st/nine: Remove unused code for ps
Since constant indirect addressing is not allowed for ps,
we can remove our code to handle that.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4b7a9cfddb)
2015-01-23 00:47:25 +00:00
Axel Davy
7280ddea9d st/nine: Correct rules for relative addressing and constants.
Relative addressing for constants is possible only for vs float
constants.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9690bf33d7)
2015-01-23 00:47:25 +00:00
Axel Davy
425bc89720 st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGB
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bce94ce831)
2015-01-23 00:47:25 +00:00
Axel Davy
0b3f8c72f7 st/nine: Implement TEXDP3TEX
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9e23b64c15)
2015-01-23 00:47:25 +00:00
Axel Davy
63e668eb18 st/nine: Implement TEXDP3
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 09eb1e901f)
2015-01-23 00:47:24 +00:00
Axel Davy
2b4c577730 st/nine: Implement TEXDEPTH
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f19e699368)
2015-01-23 00:47:24 +00:00
Axel Davy
e3a393b4c3 st/nine: Implement TEXM3x3SPEC
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3676ab02fb)
2015-01-23 00:47:24 +00:00
Axel Davy
7ecd0f9528 st/nine: Implement TEXM3x2TEX
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2b9f079ae3)
2015-01-23 00:47:24 +00:00
Axel Davy
336887bca1 st/nine: implement TEXM3x2DEPTH
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fdff111dc8)
2015-01-23 00:47:24 +00:00
Axel Davy
8e08ba6f96 st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC
The fix is that this line:
"src[s] = tx->regs.vT[s];" is wrong if s doesn't start from 0.
Instead access tx->regs.vT directly when needed.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7865210670)

Conflicts:
	src/gallium/state_trackers/nine/nine_shader.c
2015-01-23 00:47:09 +00:00
Axel Davy
77e1136f44 st/nine: Fill missing dst and src number for some instructions.
Not filling them correctly results in bad padding and later crash.

Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b1259544e3)
2015-01-23 00:44:42 +00:00
Axel Davy
22c75f9f5a st/nine: Implement TEXCOORD special behaviours
texcoord for ps < 1_4 should clamp the values between 0 and 1.

texcrd (texcoord ps 1_4) does not clamp and can be used with
two modifiers, _dw and _dz, which mean that the channels are divided
by w or z.
Implement those in shared code, since the same modifiers can be used
for texld ps 1_4.

v2: replace DIV by RCP + MUL
v3: Remove a useless MOV

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5399119fb1)

Conflicts:
	src/gallium/state_trackers/nine/nine_shader.c
2015-01-23 00:43:57 +00:00
Axel Davy
4b65be8860 st/nine: Fix some fixed function pipeline operation
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6378d74937)
2015-01-22 23:43:28 +00:00
Axel Davy
9ea8e7f0df st/nine: Clamp ps 1.X constants
This is wine (and windows) behaviour.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 018407b5d8)
2015-01-22 23:43:28 +00:00
Axel Davy
d0d09a4eee st/nine: Fix CND implementation
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3ca67f8810)
2015-01-22 23:43:27 +00:00
Axel Davy
75f39e45f0 st/nine: Rewrite LOOP implementation, and a0 aL handling
Previous implementation didn't work well with nested loops.

Instead of using several address registers, put a0 and aL
into normal registers, and copy them to one address register when
we need to use them.

Wine tests loop_index_test() and nested_loop_test() now pass correctly.

Fixes r600g crash while loading Bioshock -
bug https://bugs.freedesktop.org/show_bug.cgi?id=85696

Tested-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6a8e5e48be)
2015-01-22 23:43:27 +00:00
Axel Davy
553089093f st/nine: Correct LOG on negative values
We should take the absolute value of the input.

Also return -FLT_MAX instead of -Inf for an input of 0.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c9aa9a0add)
2015-01-22 23:43:27 +00:00
Axel Davy
add30f01ef st/nine: Handle NRM with input of null norm
When the input's xyz are 0.0, the output
should be 0.0. This is due to the fact that
Inf * 0 = 0 for dx9. To handle this case,
cap the result of RSQ to FLT_MAX. We have
FLT_MAX * 0 = 0.

Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f5e8e3fb80)
2015-01-22 23:43:27 +00:00
Axel Davy
0dfb9c9e86 st/nine: Handle RSQ special cases
We should use the absolute value of the input as input to ureg_RSQ.

Moreover, an input of 0.0 should return FLT_MAX.

Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2487f73574)
2015-01-22 23:43:27 +00:00
Axel Davy
7e26cf83ba st/nine: Fix POW implementation
POW doesn't directly match TGSI, since we should
take the absolute value of src0.

Fixes black textures in some games

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c12f8c2088)
2015-01-22 23:43:27 +00:00
Axel Davy
00d22ce0fa st/nine: Fix typo for M4x4
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit e0dd9ca985)
2015-01-22 23:43:26 +00:00
Axel Davy
7f700cc35b st/nine: Correctly declare NineTranslateInstruction_Mkxn inputs
Let's say we have c1 and c2 declared in the shader and c0 given by the app.

Then here we would have read c0, c1 and c2 as given by the app, instead
of the correct values: c0 from the app and c1, c2 from the shader.

This correction fixes several issues in some games.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 53dc992f20)
2015-01-22 23:43:26 +00:00
Axel Davy
e6167e749c st/nine: Saturate oFog and oPts vs outputs
According to docs and Wine, these two vs outputs have
to be saturated.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9fb58a74a0)
2015-01-22 23:43:26 +00:00
Axel Davy
bce0058333 st/nine: Remove some shader unused code
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a214838181)
2015-01-22 23:43:26 +00:00
Axel Davy
9a0647ba7f st/nine: Convert integer constants to floats before storing them when cards don't support integers
The shader code already behaves as if they are floats when the card doesn't support integers.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d08c7b0b88)
2015-01-22 23:43:26 +00:00
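The conversion described in 9a0647ba7f amounts to storing each integer constant as the float of the same value. A minimal sketch, assuming a hypothetical helper name and flat arrays rather than the actual nine data structures:

```c
#include <assert.h>
#include <stdint.h>

/* Store integer shader constants as floats of the same value, for
 * hardware without native integer support. The name and signature are
 * illustrative only. */
static void int_consts_to_float(const int32_t *src, float *dst, unsigned count)
{
    assert(src && dst);
    for (unsigned i = 0; i < count; i++)
        dst[i] = (float)src[i];
}
```

Doing this once at upload time keeps the stored data consistent with shader code that already treats these constants as floats.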
Axel Davy
669c5d6d44 st/nine: Rework of boolean constants
Convert them to shader booleans at an earlier stage.
The previous code is fine, but a later patch will convert
integers at an earlier stage, so do
the same for booleans.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d9d18fe39f)
2015-01-22 23:43:26 +00:00
Axel Davy
87ac37074f st/nine: Add ATI1 and ATI2 support
Adds ATI1 and ATI2 support to nine.

They map to PIPE_FORMAT_RGTC1_UNORM and PIPE_FORMAT_RGTC2_UNORM,
but need special handling.

Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 77f0ecf9ce)
2015-01-22 23:43:25 +00:00
Axel Davy
e1bcca4f13 st/nine: Check if srgb format is supported before trying to use it.
According to msdn, we must act as if user didn't ask srgb if we don't
support it.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b0b5430322)
2015-01-22 23:43:25 +00:00
Stanislaw Halik
50ea1c1f5f st/nine: Hack to generate resource if it doesn't exist when getting view
Buffers in the MANAGED pool are supposed to have their content in a RAM buffer,
with a copy in VRAM if there is enough memory (the driver manages memory and decides when
to delete the buffer in VRAM).

This is not implemented properly in nine, and a VRAM copy is going to be created
when the RAM memory is filled, and the VRAM copy will get synced with the RAM
memory updates.

Due to some issues (in the implementation or in app logic), it can happen
we try to create a sampler view of the resource while we haven't created the
VRAM resource. This hack creates the resource when we hit this case, which prevents
crashing, but doesn't help with the resource content.

This fixes several games crashing at launch.

Acked-by: Axel Davy <axel.davy@ens.fr>
Acked-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Stanislaw Halik <sthalik@misaki.pl>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 82810d3b66)
2015-01-22 23:43:25 +00:00
Axel Davy
3ca8b93476 st/nine: NineBaseTexture9: update sampler view creation
While the previous code had the correct behaviour in general,
this new code is more readable (without checking all gallium formats
manually) and has a more defined behaviour for depth stencil resources.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 47280d777d)
2015-01-22 23:43:25 +00:00
Axel Davy
d06b403377 st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of bad format
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 0abfb80dac)
2015-01-22 23:43:12 +00:00
Axel Davy
481af42f28 st/nine: Fix crash when deleting non-implicit swapchain
The implicit swapchains are destroyed when the device instance is
destroyed. However for non-implicit swapchains, it is not the case,
and the application can have kept a reference on the swapchain
buffers to reuse them.

Fixes problems with battle.net launcher.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit 0d2c22e648)
2015-01-22 23:41:09 +00:00
Axel Davy
393fffd07d st/nine: CubeTexture: fix GetLevelDesc
This->surfaces contains the surfaces associated to the levels
and faces. This->surfaces[6*Level] is what we want here,
since it gives us a face descriptor for the level 'Level'.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9232161178)
2015-01-22 23:41:08 +00:00
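The indexing that 393fffd07d relies on can be sketched as below, assuming the six faces of each mip level are stored contiguously (which is what `This->surfaces[6*Level]` implies). The struct and function names are hypothetical and are not taken from the nine sources.

```c
#include <assert.h>
#include <stddef.h>

#define CUBE_FACES 6

/* Stand-in for a per-surface descriptor; illustrative only. */
struct sketch_surface { unsigned width, height; };

/* Index of the surface for (level, face) when the six faces of a level
 * are stored next to each other. */
static size_t cube_surface_index(unsigned level, unsigned face)
{
    assert(face < CUBE_FACES);
    return (size_t)level * CUBE_FACES + face;
}

/* Every face of a level has the same description, so face 0 is used:
 * this is surfaces[6 * level]. */
static const struct sketch_surface *
cube_level_desc(const struct sketch_surface *surfaces, unsigned level)
{
    return &surfaces[cube_surface_index(level, 0)];
}
```

Indexing with `surfaces[Level]` alone would instead return a face of a lower mip level for any level above 0, which gives a wrong description.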
Axel Davy
c159b4095c st/nine: NineBaseTexture9: fix setting of last_layer
Use settings similar to u_sampler_view_default_template

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 18c7e70226)
2015-01-22 23:41:08 +00:00
Axel Davy
b80b5b35a3 st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS
The cap means D3DFVF_XYZRHW vertices will see clipping.
This is not the case when
PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION is supported, since
it'll disable clipping.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 05e20e1045)
2015-01-22 23:41:08 +00:00
Xavier Bouchoux
41ca03a7b4 st/nine: Fix D3DRS_POINTSPRITE support
It's done by testing the existence of the point sprite output register *after* parsing the vertex shader.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dc88989189)
2015-01-22 23:41:08 +00:00
Axel Davy
18ac34825b st/nine: Add new texture format strings
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d2f2a550cf)
2015-01-22 23:41:07 +00:00
Xavier Bouchoux
15ef84ccfb st/nine: Add missing c++ declaration for IDirect3DVolumeTexture9
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 072e2ba8e1)
2015-01-22 23:41:07 +00:00
Xavier Bouchoux
44ee59d300 st/nine: Additional defines to d3dtypes.h
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8bb550b958)
2015-01-22 23:41:07 +00:00
Jose Fonseca
1e0ab5b826 nine: Drop use of TGSI_OPCODE_CND.
This was the only state tracker emitting it, and hardware was just having
to lower it anyway (or failing to lower it at all).

v2: Extracted from a larger patch by Jose (which also dropped DP2A), fixed
    to actually not reference TGSI_OPCODE_CND.  Change by anholt.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
(cherry picked from commit 925cb75f89)
2015-01-22 23:40:09 +00:00
Jonathan Gray
a3381286d8 glsl: Link glsl_test with pthreads library.
Otherwise pthread_mutex_lock will be an undefined reference
on OpenBSD.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88219
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c5be9c126d)
2015-01-22 22:27:12 +00:00
Kenneth Graunke
882f702441 i965: Work around mysterious Gen4 GPU hangs with minimal state changes.
Gen4 hardware appears to GPU hang frequently when using Chromium, and
also when running 'glmark2 -b ideas'.  Most of the error states contain
3DPRIMITIVE commands in quick succession, with very few state packets
between them - usually VERTEX_BUFFERS/ELEMENTS and CONSTANT_BUFFER.

I trimmed an apitrace of the glmark2 hang down to two draw calls with a
glUniformMatrix4fv call between the two.  Either draw by itself works
fine, but together, they hang the GPU.  Removing the glUniform call
makes the hangs disappear.  In the hardware state, this translates to
removing the CONSTANT_BUFFER packet between the two 3DPRIMITIVE packets.

Flushing before emitting CONSTANT_BUFFER packets also appears to make
the hangs disappear.  I observed a slowdown in glxgears by doing it all
the time, so I've chosen to only do it when BRW_NEW_BATCH and
BRW_NEW_PSP are unset (i.e. we haven't done a CS_URB_STATE change or
already flushed the whole pipeline).

I'd much rather understand the problem, but at this point, I don't see
how we'd ever be able to track it down further.  We have no real tools,
and the hardware people moved on years ago.  I've analyzed 20+ error
states and read every scrap of documentation I could find.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80568
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85367
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c4fd0c9052)
2015-01-22 16:11:03 +00:00
Jason Ekstrand
a25e26f67f mesa: Fix clamping to -1.0 in snorm_to_float
This patch fixes the return of a wrong value when x is lower than
-MAX_INT(src_bits) as the result would not be between [-1.0 1.0].

v2 by Samuel Iglesias <siglesias@igalia.com>:
    - Modify snorm_to_float() to avoid doing the division when
      x == -MAX_INT(src_bits)

Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 7d1b08ac44)
2015-01-17 14:59:56 +00:00
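The clamping described in a25e26f67f can be illustrated with the sketch below. It assumes the conventional signed-normalized mapping, where the largest positive integer maps to 1.0; the function names are hypothetical and the code is not copied from Mesa.

```c
#include <assert.h>
#include <stdint.h>

/* Largest positive value representable in a signed field of `bits` bits. */
static int32_t max_int(unsigned bits)
{
    assert(bits >= 2 && bits <= 32);
    return (int32_t)((UINT32_C(1) << (bits - 1)) - 1);
}

/* Convert a signed-normalized integer to a float in [-1.0, 1.0].
 * Values at or below -max_int(src_bits) return -1.0 directly, without
 * performing the division. */
static float snorm_to_float_sketch(int32_t x, unsigned src_bits)
{
    const int32_t max = max_int(src_bits);
    if (x <= -max)
        return -1.0f;
    return (float)x / (float)max;
}
```

For an 8-bit source, both -128 and -127 map to -1.0 here, whereas an unclamped division would yield about -1.008 for -128.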
Kenneth Graunke
021d71b848 i965: Respect the no_8 flag on Gen6, not just Gen7+.
When doing repclears, we only want to use the SIMD16 program, not the
SIMD8 one.  Kristian added this to the Gen7+ code, but apparently we
missed it in the Gen6 code.  This patch copies that code over.

Approximately doubles the performance in a clear microbenchmark from
mesa-demos (clearspd -width 500 -height 500 +color) on Sandybridge.

Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
References: https://code.google.com/p/chrome-os-partner/issues/detail?id=34681
(cherry picked from commit f95733ddb7)

Conflicts:
	src/mesa/drivers/dri/i965/gen6_wm_state.c
2015-01-17 14:59:08 +00:00
Emil Velikov
14f1659b43 docs: Add sha256 sums for the 10.4.2 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-12 10:37:09 +00:00
101 changed files with 2014 additions and 592 deletions

View File

@@ -1 +1 @@
-10.4.2
+10.4.7

View File

@@ -1,2 +1,18 @@
# No whitespace commits in stable.
-a10bf5c10caf27232d4df8da74d5c35c23eb883d
+a10bf5c10caf27232d4df8da74d5c35c23eb883d
+# The following patches address code which is missing in 10.4
+# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078515.html
+06084652fefe49c3d6bf1b476ff74ff602fdc22a common: Correct texture init for meta pbo uploads and downloads.
+# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078547.html
+ccc5ce6f72c1ec86be4dfcef96c0b51fba0faa6d common: Correct PBO 2D_ARRAY handling.
+# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078549.html
+546aba143d13ba3f993ead4cc30b2404abfc0202 common: Fix PBOs for 1D_ARRAY.
+# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078501.html
+2b2fa1865248c6e3b7baec81c4f92774759b201f mesa: Indent break statements and add a missing one.
+# http://lists.freedesktop.org/archives/mesa-dev/2015-March/078502.html
+87109acbed9c9b52f33d58ca06d9048d0ac7a215 mesa: Free memory allocated for luminance in readpixels.

View File

@@ -14,7 +14,7 @@ git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
-git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)' HEAD..origin/master |\
+git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*10\.4.*mesa-stable\)' HEAD..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.
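The effect of the stricter nomination pattern in this hunk can be illustrated with POSIX `regcomp`/`regexec`. This is a rough sketch: only the `CC:` alternative of the grep expression is reproduced (the `\|` alternation is left out to stay within portable basic regular expressions), and `line_matches`, `OLD_TAG_PATTERN` and `NEW_TAG_PATTERN` are names made up for the example.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Only the "CC:" alternative of the patterns shown in the hunk. */
#define OLD_TAG_PATTERN "^CC:.*mesa-stable"
#define NEW_TAG_PATTERN "^CC:.*10\\.4.*mesa-stable"

/* Return 1 if `line` matches the basic regular expression `pattern`,
 * ignoring case (like the -i flag in the script), 0 otherwise. */
static int
line_matches(const char *pattern, const char *line)
{
   regex_t re;
   int rc;

   /* No REG_EXTENDED: the script relies on basic regular expressions. */
   rc = regcomp(&re, pattern, REG_ICASE | REG_NOSUB);
   assert(rc == 0);

   rc = regexec(&re, line, 0, NULL, 0);
   regfree(&re);
   return rc == 0;
}
```

With the new pattern, a tag line such as `Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>` still matches, while a line naming only 10.5, or no branch at all, no longer does.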

View File

@@ -1716,7 +1716,7 @@ if test "x$enable_gallium_llvm" = xyes; then
fi
if test "x$enable_opencl" = xyes; then
-LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
+LLVM_COMPONENTS="${LLVM_COMPONENTS} all-targets ipo linker instrumentation"
# LLVM 3.3 >= 177971 requires IRReader
if $LLVM_CONFIG --components | grep -qw 'irreader'; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} irreader"

View File

@@ -30,7 +30,9 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
-TBD
+e303e77dd774df0d051b2870b165f98c97084a55980f884731df89c1b56a6146 MesaLib-10.4.2.tar.gz
+08a119937d9f2aa2f66dd5de97baffc2a6e675f549e40e699a31f5485d15327f MesaLib-10.4.2.tar.bz2
+c2c2921a80a3395824f02bee4572a6a17d6a12a928a3e497618eeea04fb06490 MesaLib-10.4.2.zip
</pre>
<h2>New features</h2>

docs/relnotes/10.4.3.html Normal file

@@ -0,0 +1,145 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.3 Release Notes / January 24, 2015</h1>
<p>
Mesa 10.4.3 is a bug fix release which fixes bugs found since the 10.4.2 release.
</p>
<p>
Mesa 10.4.3 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
c53eaafc83d9c6315f63e0904d9954d929b841b0b2be7a328eeb6e14f1376129 MesaLib-10.4.3.tar.gz
ef6ecc9c2f36c9f78d1662382a69ae961f38f03af3a0c3268e53f351aa1978ad MesaLib-10.4.3.tar.bz2
179325fc8ec66529d3b0d0c43ef61a33a44d91daa126c3bbdd1efdfd25a7db1d MesaLib-10.4.3.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80568">Bug 80568</a> - [gen4] GPU Crash During Google Chrome Operation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85367">Bug 85367</a> - [gen4] GPU hang in glmark-es2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85696">Bug 85696</a> - r600g+nine: Bioshock shader failure after 7b1c0cbc90d456384b0950ad21faa3c61a6b43ff</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88219">Bug 88219</a> - include/c11/threads_posix.h:197: undefined reference to `pthread_mutex_lock'</li>
</ul>
<h2>Changes</h2>
<p>Axel Davy (39):</p>
<ul>
<li>st/nine: Add new texture format strings</li>
<li>st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS</li>
<li>st/nine: NineBaseTexture9: fix setting of last_layer</li>
<li>st/nine: CubeTexture: fix GetLevelDesc</li>
<li>st/nine: Fix crash when deleting non-implicit swapchain</li>
<li>st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of bad format</li>
<li>st/nine: NineBaseTexture9: update sampler view creation</li>
<li>st/nine: Check if srgb format is supported before trying to use it.</li>
<li>st/nine: Add ATI1 and ATI2 support</li>
<li>st/nine: Rework of boolean constants</li>
<li>st/nine: Convert integer constants to floats before storing them when cards don't support integers</li>
<li>st/nine: Remove some shader unused code</li>
<li>st/nine: Saturate oFog and oPts vs outputs</li>
<li>st/nine: Correctly declare NineTranslateInstruction_Mkxn inputs</li>
<li>st/nine: Fix typo for M4x4</li>
<li>st/nine: Fix POW implementation</li>
<li>st/nine: Handle RSQ special cases</li>
<li>st/nine: Handle NRM with input of null norm</li>
<li>st/nine: Correct LOG on negative values</li>
<li>st/nine: Rewrite LOOP implementation, and a0 aL handling</li>
<li>st/nine: Fix CND implementation</li>
<li>st/nine: Clamp ps 1.X constants</li>
<li>st/nine: Fix some fixed function pipeline operation</li>
<li>st/nine: Implement TEXCOORD special behaviours</li>
<li>st/nine: Fill missing dst and src number for some instructions.</li>
<li>st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC</li>
<li>st/nine: implement TEXM3x2DEPTH</li>
<li>st/nine: Implement TEXM3x2TEX</li>
<li>st/nine: Implement TEXM3x3SPEC</li>
<li>st/nine: Implement TEXDEPTH</li>
<li>st/nine: Implement TEXDP3</li>
<li>st/nine: Implement TEXDP3TEX</li>
<li>st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGB</li>
<li>st/nine: Correct rules for relative adressing and constants.</li>
<li>st/nine: Remove unused code for ps</li>
<li>st/nine: Fix sm3 relative addressing for non-debug build</li>
<li>st/nine: Add variables containing the size of the constant buffers</li>
<li>st/nine: Allocate the correct size for the user constant buffer</li>
<li>st/nine: Allocate vs constbuf buffer for indirect addressing once.</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.2 release</li>
<li>Update version to 10.4.3</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>mesa: Fix clamping to -1.0 in snorm_to_float</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>glsl: Link glsl_test with pthreads library.</li>
</ul>
<p>Jose Fonseca (1):</p>
<ul>
<li>nine: Drop use of TGSI_OPCODE_CND.</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Respect the no_8 flag on Gen6, not just Gen7+.</li>
<li>i965: Work around mysterious Gen4 GPU hangs with minimal state changes.</li>
</ul>
<p>Stanislaw Halik (1):</p>
<ul>
<li>st/nine: Hack to generate resource if it doesn't exist when getting view</li>
</ul>
<p>Xavier Bouchoux (3):</p>
<ul>
<li>st/nine: Additional defines to d3dtypes.h</li>
<li>st/nine: Add missing c++ declaration for IDirect3DVolumeTexture9</li>
<li>st/nine: Fix D3DRS_POINTSPRITE support</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.4.4.html Normal file

@@ -0,0 +1,100 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.4 Release Notes / February 06, 2015</h1>
<p>
Mesa 10.4.4 is a bug fix release which fixes bugs found since the 10.4.3 release.
</p>
<p>
Mesa 10.4.4 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
5cb427eaf980cb8555953e9928f5797979ed783e277745d5f8cbae8bc5364086 MesaLib-10.4.4.tar.gz
f18a967e9c4d80e054b2fdff8c130ce6e6d1f8eecfc42c9f354f8628d8b4df1c MesaLib-10.4.4.tar.bz2
86baad73b77920c80fe58402a905e7dd17e3ea10ead6ea7d3afdc0a56c860bd7 MesaLib-10.4.4.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88662">Bug 88662</a> - unaligned access to gl_dlist_node</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88930">Bug 88930</a> - [osmesa] osbuffer-&gt;textures should be indexed by attachment type</li>
</ul>
<h2>Changes</h2>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: fix display list 8-byte alignment issue</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.3 release</li>
<li>Update version to 10.4.4</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>egl: Pass the correct X visual depth to xcb_put_image().</li>
</ul>
<p>Mario Kleiner (1):</p>
<ul>
<li>glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>gallium/util: Don't use __builtin_clrsb in util_last_bit().</li>
</ul>
<p>Niels Ole Salscheider (1):</p>
<ul>
<li>configure: Link against all LLVM targets when building clover</li>
</ul>
<p>Park, Jeongmin (1):</p>
<ul>
<li>st/osmesa: Fix osbuffer-&gt;textures indexing</li>
</ul>
<p>Ville Syrjälä (1):</p>
<ul>
<li>i965: Fix max_wm_threads for CHV</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.4.5.html Normal file

@@ -0,0 +1,114 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.5 Release Notes / February 21, 2015</h1>
<p>
Mesa 10.4.5 is a bug fix release which fixes bugs found since the 10.4.4 release.
</p>
<p>
Mesa 10.4.5 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
e12bbdaee9a758617e8ebd0bb0e987f72addd11db2e4da25ba695e386cd63843 MesaLib-10.4.5.tar.gz
bf60000700a9d58e3aca2bfeee7e781053b0d839e61a95b1883e05a2dee247a0 MesaLib-10.4.5.tar.bz2
3b926de8eee500bb67cf85332c51292f826cc539b8636382aadbb8e70c76527a MesaLib-10.4.5.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82477">Bug 82477</a> - [softpipe] piglit fp-long-alu regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88658">Bug 88658</a> - (bisected) Slow video playback on Kabini</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89069">Bug 89069</a> - Lack of grass in The Talos Principle on radeonsi (native\wine\nine)</li>
</ul>
<h2>Changes</h2>
<p>Carl Worth (1):</p>
<ul>
<li>Revert use of Mesa IR optimizer for ARB_fragment_programs</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.4 release</li>
<li>get-pick-list.sh: Require explicit "10.4" for nominating stable patches</li>
<li>Update version to 10.4.5</li>
</ul>
<p>Ilia Mirkin (3):</p>
<ul>
<li>nvc0: bail out of 2d blits with non-A8_UNORM alpha formats</li>
<li>st/mesa: treat resource-less xfb buffers as if they weren't there</li>
<li>nvc0: allow holes in xfb target lists</li>
</ul>
<p>Jeremy Huddleston Sequoia (2):</p>
<ul>
<li>darwin: build fix</li>
<li>darwin: build fix</li>
</ul>
<p>Kenneth Graunke (4):</p>
<ul>
<li>i965: Override swizzles for integer luminance formats.</li>
<li>i965: Use a gl_color_union for sampler border color.</li>
<li>i965: Fix integer border color on Haswell.</li>
<li>glsl: Reduce memory consumption of copy propagation passes.</li>
</ul>
<p>Laura Ekstrand (1):</p>
<ul>
<li>main: Fixed _mesa_GetCompressedTexImage_sw to copy slices correctly.</li>
</ul>
<p>Marek Olšák (5):</p>
<ul>
<li>r600g,radeonsi: don't append to streamout buffers that haven't been used yet</li>
<li>radeonsi: fix instanced arrays with non-zero start instance</li>
<li>radeonsi: small fix in SPI state</li>
<li>mesa: fix AtomicBuffer typo in _mesa_DeleteBuffers</li>
<li>radeonsi: fix a crash if a stencil ref state is set before a DSA state</li>
</ul>
<p>Michel Dänzer (2):</p>
<ul>
<li>st/mesa: Don't use PIPE_USAGE_STREAM for GL_PIXEL_UNPACK_BUFFER_ARB</li>
<li>Revert "radeon/llvm: enable unsafe math for graphics shaders"</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.4.6.html Normal file

@@ -0,0 +1,143 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.6 Release Notes / March 06, 2015</h1>
<p>
Mesa 10.4.6 is a bug fix release which fixes bugs found since the 10.4.5 release.
</p>
<p>
Mesa 10.4.6 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
46c9082142e811c01e49a2c332a9ac0a1eb98f2908985fb9df216539d7eaeaf4 MesaLib-10.4.6.tar.gz
d8baedd20e79ccd98a5a7b05e23d59a30892e68de1fcc057ca6873dafca02735 MesaLib-10.4.6.tar.bz2
6aded6eac7f0d4d55117b8b581d8424710bbb4c768fc90f7b881f29311a751aa MesaLib-10.4.6.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=45348">Bug 45348</a> - [swrast] piglit fbo-drawbuffers-arbfp regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84613">Bug 84613</a> - [G965, bisected] piglit regressions : glslparsertest.glsl2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87516">Bug 87516</a> - glProgramBinary violates spec</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88885">Bug 88885</a> - Transform feedback uses incorrect interleaving if a previous draw did not write gl_Position</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89180">Bug 89180</a> - [IVB regression] Rendering issues in Mass Effect through VMware Workstation</li>
</ul>
<h2>Changes</h2>
<p>Abdiel Janulgue (2):</p>
<ul>
<li>glsl: Don't optimize min/max into saturate when EmitNoSat is set</li>
<li>st/mesa: For vertex shaders, don't emit saturate when SM 3.0 is unsupported</li>
</ul>
<p>Andreas Boll (1):</p>
<ul>
<li>glx: Fix returned values of GLX_RENDERER_PREFERRED_PROFILE_MESA</li>
</ul>
<p>Brian Paul (2):</p>
<ul>
<li>swrast: fix multiple color buffer writing</li>
<li>st/mesa: fix sampler view reference counting bug in glDraw/CopyPixels</li>
</ul>
<p>Chris Forbes (1):</p>
<ul>
<li>i965/gs: Check newly-generated GS-out VUE map against correct stage</li>
</ul>
<p>Eduardo Lima Mitev (1):</p>
<ul>
<li>mesa: Fix error validating args for TexSubImage3D</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.5 release</li>
<li>install-lib-links: remove the .install-lib-links file</li>
<li>Revert "mesa: Correct backwards NULL check."</li>
<li>mesa: cherry-pick the second half of commit 2aa71e9485a</li>
<li>Revert "gallivm: Update for RTDyldMemoryManager becoming an unique_ptr."</li>
<li>Update version to 10.4.6</li>
</ul>
<p>Ian Romanick (3):</p>
<ul>
<li>mesa: Add missing error checks in _mesa_ProgramBinary</li>
<li>mesa: Ensure that length is set to zero in _mesa_GetProgramBinary</li>
<li>mesa: Always generate GL_INVALID_OPERATION in _mesa_GetProgramBinary</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>auxilary/os: correct sysctl use in os_get_total_physical_memory()</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>gallivm: Update for RTDyldMemoryManager becoming an unique_ptr.</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>st/omx/dec/h264: fix picture out-of-order with poc type 0 v2</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>install-lib-links: don't depend on .libs directory</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>vbo: fix an unitialized-variable warning</li>
<li>radeonsi: fix point sprites</li>
</ul>
<p>Matt Turner (4):</p>
<ul>
<li>glsl: Rewrite and fix min/max to saturate optimization.</li>
<li>mesa: Correct backwards NULL check.</li>
<li>i965/fs: Don't use backend_visitor::instructions after creating the CFG.</li>
<li>mesa: Correct backwards NULL check.</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.4.7.html Normal file

@@ -0,0 +1,134 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.4.7 Release Notes / March 20, 2015</h1>
<p>
Mesa 10.4.7 is a bug fix release which fixes bugs found since the 10.4.6 release.
</p>
<p>
Mesa 10.4.7 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
9e7b59267199658808f8b33e0410b86fbafbdcd52378658b9df65fac9d24947f MesaLib-10.4.7.tar.gz
2c351c98671f9a7ab3fd9c601bb7a255801b1580f5dd0992639f99152801b0d2 MesaLib-10.4.7.tar.bz2
d14ac578b5ce16560757b53fbd1cb4d6b34652f8e110e4b10a019adc82e67ffd MesaLib-10.4.7.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79202">Bug 79202</a> - valgrind errors in glsl-fs-uniform-array-loop-unroll.shader_test; random code generation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89156">Bug 89156</a> - r300g: GL_COMPRESSED_RED_RGTC1 / ATI1N support broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89224">Bug 89224</a> - Incorrect rendering of Unigine Valley running in VM on VMware Workstation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89530">Bug 89530</a> - FTBFS in loader: missing fstat</li>
</ul>
<h2>Changes</h2>
<p>Andrey Sudnik (1):</p>
<ul>
<li>i965/vec4: Don't lose the saturate modifier in copy propagation.</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>egl: Take alpha bits into account when selecting GBM formats</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>docs: Add sha256 sums for the 10.4.6 release</li>
<li>cherry-ignore: add not applicable/rejected commits</li>
<li>mesa: rename format_info.c to format_info.h</li>
<li>loader: include &lt;sys/stat.h&gt; for non-sysfs builds</li>
<li>auxiliary/os: fix the android build - s/drm_munmap/os_munmap/</li>
<li>Update version to 10.4.7</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>i965: Fix out-of-bounds accesses into pull_constant_loc array</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>freedreno: move fb state copy after checking for size change</li>
<li>freedreno/ir3: fix array count returned by TXQ</li>
<li>freedreno/ir3: get the # of miplevels from getinfo</li>
<li>freedreno: fix slice pitch calculations</li>
</ul>
<p>Marc-Andre Lureau (1):</p>
<ul>
<li>gallium/auxiliary/indices: fix start param</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>r300g: fix RGTC1 and LATC1 SNORM formats</li>
<li>r300g: fix a crash when resolving into an sRGB texture</li>
<li>r300g: fix sRGB-&gt;sRGB blits</li>
<li>radeonsi: increase coords array size for radeon_llvm_emit_prepare_cube_coords</li>
</ul>
<p>Mario Kleiner (1):</p>
<ul>
<li>glx: Handle out-of-sequence swap completion events correctly. (v2)</li>
</ul>
<p>Matt Turner (2):</p>
<ul>
<li>r300g: Use PATH_MAX instead of limiting ourselves to 100 chars.</li>
<li>r300g: Check return value of snprintf().</li>
</ul>
<p>Rob Clark (2):</p>
<ul>
<li>freedreno/ir3: fix silly typo for binning pass shaders</li>
<li>freedreno: update generated headers</li>
</ul>
<p>Samuel Iglesias Gonsalvez (1):</p>
<ul>
<li>glsl: optimize (0 cmp x + y) into (-x cmp y).</li>
</ul>
<p>Stefan Dösinger (1):</p>
<ul>
<li>r300g: Fix the ATI1N swizzle (RGTC1 and LATC1)</li>
</ul>
</div>
</body>
</html>

View File

@@ -399,6 +399,16 @@ struct IDirect3DVolume9 : public IUnknown
virtual HRESULT WINAPI UnlockBox() = 0;
};
+struct IDirect3DVolumeTexture9 : public IDirect3DBaseTexture9
+{
+virtual HRESULT WINAPI GetLevelDesc(UINT Level, D3DVOLUME_DESC *pDesc) = 0;
+virtual HRESULT WINAPI GetVolumeLevel(UINT Level, IDirect3DVolume9 **ppVolumeLevel) = 0;
+virtual HRESULT WINAPI LockBox(UINT Level, D3DLOCKED_BOX *pLockedVolume, const D3DBOX *pBox, DWORD Flags) = 0;
+virtual HRESULT WINAPI UnlockBox(UINT Level) = 0;
+virtual HRESULT WINAPI AddDirtyBox(const D3DBOX *pDirtyBox) = 0;
+};
#else /* __cplusplus */
extern const GUID IID_IDirect3D9;

View File

@@ -224,6 +224,8 @@ typedef struct _RGNDATA {
#define D3DERR_INVALIDDEVICE MAKE_D3DHRESULT(2155)
#define D3DERR_INVALIDCALL MAKE_D3DHRESULT(2156)
#define D3DERR_DRIVERINVALIDCALL MAKE_D3DHRESULT(2157)
+#define D3DERR_DEVICEREMOVED MAKE_D3DHRESULT(2160)
+#define D3DERR_DEVICEHUNG MAKE_D3DHRESULT(2164)
/********************************************************
* Bitmasks *
@@ -331,6 +333,7 @@ typedef struct _RGNDATA {
#define D3DPRESENT_DONOTWAIT 0x00000001
#define D3DPRESENT_LINEAR_CONTENT 0x00000002
#define D3DPRESENT_RATE_DEFAULT 0
#define D3DCREATE_FPU_PRESERVE 0x00000002
#define D3DCREATE_MULTITHREADED 0x00000004
@@ -344,6 +347,13 @@ typedef struct _RGNDATA {
#define D3DSTREAMSOURCE_INDEXEDDATA (1 << 30)
#define D3DSTREAMSOURCE_INSTANCEDATA (2 << 30)
+/* D3DRS_COLORWRITEENABLE */
+#define D3DCOLORWRITEENABLE_RED (1L << 0)
+#define D3DCOLORWRITEENABLE_GREEN (1L << 1)
+#define D3DCOLORWRITEENABLE_BLUE (1L << 2)
+#define D3DCOLORWRITEENABLE_ALPHA (1L << 3)
/********************************************************
* Function macros *
*******************************************************/
@@ -639,10 +649,13 @@ typedef enum _D3DFORMAT {
D3DFMT_A1 = 118,
D3DFMT_A2B10G10R10_XR_BIAS = 119,
D3DFMT_BINARYBUFFER = 199,
D3DFMT_ATI1 = MAKEFOURCC('A', 'T', 'I', '1'),
D3DFMT_ATI2 = MAKEFOURCC('A', 'T', 'I', '2'),
D3DFMT_DF16 = MAKEFOURCC('D', 'F', '1', '6'),
D3DFMT_DF24 = MAKEFOURCC('D', 'F', '2', '4'),
D3DFMT_INTZ = MAKEFOURCC('I', 'N', 'T', 'Z'),
D3DFMT_NULL = MAKEFOURCC('N', 'U', 'L', 'L'),
D3DFMT_NVDB = MAKEFOURCC('N', 'V', 'D', 'B'),
D3DFMT_NV11 = MAKEFOURCC('N', 'V', '1', '1'),
D3DFMT_NV12 = MAKEFOURCC('N', 'V', '1', '2'),
D3DFMT_Y210 = MAKEFOURCC('Y', '2', '1', '0'),
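The format codes in this hunk are four-character tags packed into a 32-bit value. The sketch below shows the conventional FOURCC packing, with the first character in the lowest byte. `SKETCH_FOURCC` and `fourcc_char` are hypothetical names for illustration. The header's own MAKEFOURCC definition is not shown in this diff and is assumed to follow the same convention.

```c
#include <assert.h>
#include <stdint.h>

/* Conventional FOURCC packing: first character in the lowest byte. */
#define SKETCH_FOURCC(a, b, c, d)                                     \
   ((uint32_t)(uint8_t)(a)         | ((uint32_t)(uint8_t)(b) << 8) |  \
    ((uint32_t)(uint8_t)(c) << 16) | ((uint32_t)(uint8_t)(d) << 24))

/* Return character `index` (0..3) of a packed FOURCC tag. */
static char
fourcc_char(uint32_t fourcc, unsigned index)
{
   assert(index < 4);
   return (char)((fourcc >> (8 * index)) & 0xffu);
}
```

Under this packing, the tag 'A','T','I','1' becomes 0x31495441, which is why such values never collide with the small sequential codes earlier in the enum.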

View File

@@ -3,9 +3,9 @@
if BUILD_SHARED
if HAVE_COMPAT_SYMLINKS
-all-local : .libs/install-mesa-links
+all-local : .install-mesa-links
-.libs/install-mesa-links : $(lib_LTLIBRARIES)
+.install-mesa-links : $(lib_LTLIBRARIES)
$(AM_V_GEN)$(MKDIR_P) $(top_builddir)/$(LIB_DIR); \
for f in $(join $(addsuffix .libs/,$(dir $(lib_LTLIBRARIES))),$(notdir $(lib_LTLIBRARIES:%.la=%.$(LIB_EXT)*))); do \
if test -h .libs/$$f; then \
@@ -14,5 +14,9 @@ all-local : .libs/install-mesa-links
ln -f $$f $(top_builddir)/$(LIB_DIR); \
fi; \
done && touch $@
+clean-local:
+$(RM) .install-mesa-links
endif
endif

View File

@@ -668,15 +668,21 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp)
for (i = 0; dri2_dpy->driver_configs[i]; i++) {
EGLint format, attr_list[3];
-unsigned int mask;
+unsigned int red, alpha;
dri2_dpy->core->getConfigAttrib(dri2_dpy->driver_configs[i],
-__DRI_ATTRIB_RED_MASK, &mask);
-if (mask == 0x3ff00000)
+__DRI_ATTRIB_RED_MASK, &red);
+dri2_dpy->core->getConfigAttrib(dri2_dpy->driver_configs[i],
+__DRI_ATTRIB_ALPHA_MASK, &alpha);
+if (red == 0x3ff00000 && alpha == 0x00000000)
format = GBM_FORMAT_XRGB2101010;
-else if (mask == 0x00ff0000)
+else if (red == 0x3ff00000 && alpha == 0xc0000000)
+format = GBM_FORMAT_ARGB2101010;
+else if (red == 0x00ff0000 && alpha == 0x00000000)
format = GBM_FORMAT_XRGB8888;
-else if (mask == 0xf800)
+else if (red == 0x00ff0000 && alpha == 0xff000000)
+format = GBM_FORMAT_ARGB8888;
+else if (red == 0xf800)
format = GBM_FORMAT_RGB565;
else
continue;
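The selection logic in this hunk keys on two channel masks instead of one, so that configs with and without alpha no longer map to the same format. A self-contained sketch of that decision table follows. The `sketch_format` enum is a hypothetical stand-in for the GBM_FORMAT_* codes, so the example does not depend on gbm.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the GBM_FORMAT_* codes named in the hunk. */
enum sketch_format {
   SKETCH_FORMAT_NONE,
   SKETCH_FORMAT_XRGB2101010,
   SKETCH_FORMAT_ARGB2101010,
   SKETCH_FORMAT_XRGB8888,
   SKETCH_FORMAT_ARGB8888,
   SKETCH_FORMAT_RGB565
};

/* Pick a format from the red and alpha channel masks of a config. */
static enum sketch_format
pick_format(uint32_t red, uint32_t alpha)
{
   /* Channel masks of a single pixel layout must not overlap. */
   assert((red & alpha) == 0);

   if (red == 0x3ff00000 && alpha == 0x00000000)
      return SKETCH_FORMAT_XRGB2101010;
   else if (red == 0x3ff00000 && alpha == 0xc0000000)
      return SKETCH_FORMAT_ARGB2101010;
   else if (red == 0x00ff0000 && alpha == 0x00000000)
      return SKETCH_FORMAT_XRGB8888;
   else if (red == 0x00ff0000 && alpha == 0xff000000)
      return SKETCH_FORMAT_ARGB8888;
   else if (red == 0xf800)
      return SKETCH_FORMAT_RGB565;

   /* The loop in the hunk skips such configs with `continue`. */
   return SKETCH_FORMAT_NONE;
}
```

With only the red mask, as in the removed lines, a config with red mask 0x00ff0000 is always treated as XRGB8888, regardless of whether it carries an alpha channel.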

View File

@@ -49,8 +49,7 @@ dri2_x11_swap_interval(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf,
static void
swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
-struct dri2_egl_surface * dri2_surf,
-int depth)
+struct dri2_egl_surface * dri2_surf)
{
uint32_t mask;
const uint32_t function = GXcopy;
@@ -66,8 +65,7 @@ swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
valgc[0] = function;
valgc[1] = False;
xcb_create_gc(dri2_dpy->conn, dri2_surf->swapgc, dri2_surf->drawable, mask, valgc);
-dri2_surf->depth = depth;
-switch (depth) {
+switch (dri2_surf->depth) {
case 32:
case 24:
dri2_surf->bytes_per_pixel = 4;
@@ -82,7 +80,7 @@ swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
dri2_surf->bytes_per_pixel = 0;
break;
default:
-_eglLog(_EGL_WARNING, "unsupported depth %d", depth);
+_eglLog(_EGL_WARNING, "unsupported depth %d", dri2_surf->depth);
}
}
@@ -257,12 +255,6 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
_eglError(EGL_BAD_ALLOC, "dri2->createNewDrawable");
goto cleanup_pixmap;
}
-if (dri2_dpy->dri2) {
-xcb_dri2_create_drawable (dri2_dpy->conn, dri2_surf->drawable);
-} else {
-swrastCreateDrawable(dri2_dpy, dri2_surf, _eglGetConfigKey(conf, EGL_BUFFER_SIZE));
-}
if (type != EGL_PBUFFER_BIT) {
cookie = xcb_get_geometry (dri2_dpy->conn, dri2_surf->drawable);
@@ -275,9 +267,19 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
dri2_surf->base.Width = reply->width;
dri2_surf->base.Height = reply->height;
+dri2_surf->depth = reply->depth;
free(reply);
}
+if (dri2_dpy->dri2) {
+xcb_dri2_create_drawable (dri2_dpy->conn, dri2_surf->drawable);
+} else {
+if (type == EGL_PBUFFER_BIT) {
+dri2_surf->depth = _eglGetConfigKey(conf, EGL_BUFFER_SIZE);
+}
+swrastCreateDrawable(dri2_dpy, dri2_surf);
+}
}
/* we always copy the back buffer to front */
dri2_surf->base.PostSubBufferSupportedNV = EGL_TRUE;

View File

@@ -193,7 +193,7 @@ def lineloop(intype, outtype, inpv, outpv):
print ' for (i = start, j = 0; j < nr - 2; j+=2, i++) { '
do_line( intype, outtype, 'out+j', 'i', 'i+1', inpv, outpv );
print ' }'
-do_line( intype, outtype, 'out+j', 'i', '0', inpv, outpv );
+do_line( intype, outtype, 'out+j', 'i', 'start', inpv, outpv );
postamble()
def tris(intype, outtype, inpv, outpv):
@@ -218,7 +218,7 @@ def tristrip(intype, outtype, inpv, outpv):
def trifan(intype, outtype, inpv, outpv):
preamble(intype, outtype, inpv, outpv, prim='trifan')
print ' for (i = start, j = 0; j < nr; j+=3, i++) { '
-do_tri( intype, outtype, 'out+j', '0', 'i+1', 'i+2', inpv, outpv );
+do_tri( intype, outtype, 'out+j', 'start', 'i+1', 'i+2', inpv, outpv );
print ' }'
postamble()
@@ -228,9 +228,9 @@ def polygon(intype, outtype, inpv, outpv):
preamble(intype, outtype, inpv, outpv, prim='polygon')
print ' for (i = start, j = 0; j < nr; j+=3, i++) { '
if inpv == FIRST:
-do_tri( intype, outtype, 'out+j', '0', 'i+1', 'i+2', inpv, outpv );
+do_tri( intype, outtype, 'out+j', 'start', 'i+1', 'i+2', inpv, outpv );
else:
-do_tri( intype, outtype, 'out+j', 'i+1', 'i+2', '0', inpv, outpv );
+do_tri( intype, outtype, 'out+j', 'i+1', 'i+2', 'start', inpv, outpv );
print ' }'
postamble()
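The generator changes above replace a hard-coded vertex 0 with `start` as the shared vertex. The C sketch below mirrors the loop shape printed by the generator for a triangle fan, under the assumption that `nr` counts output indices. `trifan_to_tris` is a hypothetical name, not the generated function.

```c
#include <assert.h>
#include <stddef.h>

/* Expand a triangle fan that begins at vertex `start` into a triangle
 * list. `nr` is the number of output indices (three per triangle). */
static void
trifan_to_tris(unsigned start, unsigned nr, unsigned *out)
{
   unsigned i, j;

   assert(nr % 3 == 0);
   assert(out != NULL || nr == 0);

   for (i = start, j = 0; j < nr; j += 3, i++) {
      /* The fan's hub is the first vertex of the draw, which is `start`
       * rather than 0 whenever the draw has a non-zero start. */
      out[j + 0] = start;
      out[j + 1] = i + 1;
      out[j + 2] = i + 2;
   }
}
```

For `start = 4` and two triangles, the output is 4,5,6 and 4,6,7. With the hub hard-coded to 0, both triangles would instead reference a vertex outside the range being drawn.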

View File

@@ -118,7 +118,7 @@ os_get_total_physical_memory(uint64_t *size)
*size = phys_pages * page_size;
return (phys_pages > 0 && page_size > 0);
#elif defined(PIPE_OS_APPLE) || defined(PIPE_OS_BSD)
-size_t len = sizeof(size);
+size_t len = sizeof(*size);
int mib[2];
mib[0] = CTL_HW;
@@ -134,7 +134,7 @@ os_get_total_physical_memory(uint64_t *size)
#error Unsupported *BSD
#endif
-return (sysctl(mib, 2, &size, &len, NULL, 0) == 0);
+return (sysctl(mib, 2, size, &len, NULL, 0) == 0);
#elif defined(PIPE_OS_HAIKU)
system_info info;
status_t ret;
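The two changes in this hunk fix a pointer-versus-pointee mix-up: `sizeof(size)` is the size of the pointer, and `&size` is the address of the local pointer variable rather than the caller's `uint64_t`. The sketch below shows the corrected calling pattern against a stand-in query function. `fake_query_memsize` is hypothetical and only imitates the out-buffer contract of a sysctl-style call, so the example stays portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sysctl-style query: copies a 64-bit value into the
 * caller's buffer only if *len says the buffer is large enough. */
static int
fake_query_memsize(void *buf, size_t *len)
{
   const uint64_t value = UINT64_C(8589934592); /* pretend 8 GiB */

   if (*len < sizeof(value))
      return -1;

   memcpy(buf, &value, sizeof(value));
   *len = sizeof(value);
   return 0;
}

/* Corrected pattern: describe and pass the caller's buffer itself. */
static int
get_total_physical_memory_sketch(uint64_t *size)
{
   size_t len;

   assert(size != NULL);
   len = sizeof(*size); /* bytes in the pointee, not in the pointer */

   return fake_query_memsize(size, &len) == 0;
}
```

Passing `&size` with `sizeof(size)` would make the query write into the local pointer variable, leaving the caller's value untouched even when the call reports success.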

View File

@@ -70,8 +70,8 @@ static INLINE void *os_mmap(void *addr, size_t length, int prot, int flags,
return __mmap2(addr, length, prot, flags, fd, (size_t) (offset >> 12));
}
-# define drm_munmap(addr, length) \
-munmap(addr, length)
+# define os_munmap(addr, length) \
+munmap(addr, length)
#else
/* assume large file support exists */

View File

@@ -561,14 +561,10 @@ util_last_bit(unsigned u)
static INLINE unsigned
util_last_bit_signed(int i)
{
-#if defined(__GNUC__) && ((__GNUC__ * 100 + __GNUC_MINOR__) >= 407) && !defined(__INTEL_COMPILER)
-return 31 - __builtin_clrsb(i);
-#else
if (i >= 0)
return util_last_bit(i);
else
return util_last_bit(~(unsigned)i);
-#endif
}
/* Destructively loop over all of the bits in a mask as in:
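After this hunk, only the portable branch of `util_last_bit_signed()` remains. A standalone sketch of that branch is shown below. `last_bit_sketch` is a simple loop written for this example, assumed to match the usual contract of `util_last_bit()`: one-based position of the highest set bit, 0 when no bit is set.

```c
#include <assert.h>

/* One-based position of the highest set bit, or 0 if `u` is zero.
 * Assumed contract of util_last_bit(). The loop is a portable stand-in. */
static unsigned
last_bit_sketch(unsigned u)
{
   unsigned n = 0;

   while (u) {
      n++;
      u >>= 1;
   }
   return n;
}

/* Position of the highest bit that differs from the sign bit. */
static unsigned
last_bit_signed_sketch(int i)
{
   if (i >= 0)
      return last_bit_sketch((unsigned)i);
   else
      return last_bit_sketch(~(unsigned)i);
}
```

For negative inputs the value is inverted first, so the leading run of ones is treated the same way as the leading run of zeros in a non-negative input.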

View File

@@ -13,7 +13,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10347 bytes, from 2014-10-01 18:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14960 bytes, from 2014-07-27 17:22:13)
-- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 60533 bytes, from 2014-10-15 18:32:43)
+- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 64771 bytes, from 2015-03-15 21:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 41068 bytes, from 2014-08-01 12:22:48)
Copyright (C) 2013-2014 by the following authors:

View File

@@ -13,7 +13,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10347 bytes, from 2014-10-01 18:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14960 bytes, from 2014-07-27 17:22:13)
-- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 60533 bytes, from 2014-10-15 18:32:43)
+- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 64771 bytes, from 2015-03-15 21:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 41068 bytes, from 2014-08-01 12:22:48)
Copyright (C) 2013-2014 by the following authors:
@@ -2572,7 +2572,7 @@ static inline uint32_t A3XX_TEX_CONST_2_SWAP(enum a3xx_color_swap val)
}
#define REG_A3XX_TEX_CONST_3 0x00000003
#define A3XX_TEX_CONST_3_LAYERSZ1__MASK 0x0000000f
#define A3XX_TEX_CONST_3_LAYERSZ1__MASK 0x00001fff
#define A3XX_TEX_CONST_3_LAYERSZ1__SHIFT 0
static inline uint32_t A3XX_TEX_CONST_3_LAYERSZ1(uint32_t val)
{


@@ -13,7 +13,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10347 bytes, from 2014-10-01 18:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14960 bytes, from 2014-07-27 17:22:13)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 60533 bytes, from 2014-10-15 18:32:43)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 64771 bytes, from 2015-03-15 21:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 41068 bytes, from 2014-08-01 12:22:48)
Copyright (C) 2013-2014 by the following authors:


@@ -13,7 +13,7 @@ The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10347 bytes, from 2014-10-01 18:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14960 bytes, from 2014-07-27 17:22:13)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 60533 bytes, from 2014-10-15 18:32:43)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 64771 bytes, from 2015-03-15 21:55:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 41068 bytes, from 2014-08-01 12:22:48)
Copyright (C) 2013-2014 by the following authors:


@@ -199,7 +199,7 @@ setup_slices(struct fd_resource *rsc)
for (level = 0; level <= prsc->last_level; level++) {
struct fd_resource_slice *slice = fd_resource_slice(rsc, level);
slice->pitch = align(width, 32);
slice->pitch = width = align(width, 32);
slice->offset = size;
slice->size0 = slice->pitch * height * rsc->cpp;


@@ -123,12 +123,12 @@ fd_set_framebuffer_state(struct pipe_context *pctx,
fd_context_render(pctx);
util_copy_framebuffer_state(cso, framebuffer);
if ((cso->width != framebuffer->width) ||
(cso->height != framebuffer->height))
ctx->needs_rb_fbd = true;
util_copy_framebuffer_state(cso, framebuffer);
ctx->dirty |= FD_DIRTY_FRAMEBUFFER;
ctx->disabled_scissor.minx = 0;


@@ -1421,6 +1421,7 @@ trans_txq(const struct instr_translater *t,
struct tgsi_dst_register *dst = &inst->Dst[0].Register;
struct tgsi_src_register *level = &inst->Src[0].Register;
struct tgsi_src_register *samp = &inst->Src[1].Register;
const struct target_info *tgt = &tex_targets[inst->Texture.Texture];
struct tex_info tinf;
memset(&tinf, 0, sizeof(tinf));
@@ -1434,8 +1435,67 @@ trans_txq(const struct instr_translater *t,
instr->cat5.tex = samp->Index;
instr->flags |= tinf.flags;
add_dst_reg_wrmask(ctx, instr, dst, 0, dst->WriteMask);
add_src_reg_wrmask(ctx, instr, level, level->SwizzleX, 0x1);
if (tgt->array && (dst->WriteMask & (1 << tgt->dims))) {
/* Array size actually ends up in .w rather than .z. This doesn't
* matter for miplevel 0, but for higher mips the value in z is
* minified whereas w stays. Also, the value in TEX_CONST_3_DEPTH is
* returned, which means that we have to add 1 to it for arrays.
*/
struct tgsi_dst_register tmp_dst;
struct tgsi_src_register *tmp_src;
type_t type_mov = get_utype(ctx);
tmp_src = get_internal_temp(ctx, &tmp_dst);
add_dst_reg_wrmask(ctx, instr, &tmp_dst, 0,
dst->WriteMask | TGSI_WRITEMASK_W);
add_src_reg_wrmask(ctx, instr, level, level->SwizzleX, 0x1);
if (dst->WriteMask & TGSI_WRITEMASK_X) {
instr = instr_create(ctx, 1, 0);
instr->cat1.src_type = type_mov;
instr->cat1.dst_type = type_mov;
add_dst_reg(ctx, instr, dst, 0);
add_src_reg(ctx, instr, tmp_src, src_swiz(tmp_src, 0));
}
if (tgt->dims == 2) {
if (dst->WriteMask & TGSI_WRITEMASK_Y) {
instr = instr_create(ctx, 1, 0);
instr->cat1.src_type = type_mov;
instr->cat1.dst_type = type_mov;
add_dst_reg(ctx, instr, dst, 1);
add_src_reg(ctx, instr, tmp_src, src_swiz(tmp_src, 1));
}
}
instr = instr_create(ctx, 2, OPC_ADD_U);
add_dst_reg(ctx, instr, dst, tgt->dims);
add_src_reg(ctx, instr, tmp_src, src_swiz(tmp_src, 3));
ir3_reg_create(instr, 0, IR3_REG_IMMED)->iim_val = 1;
} else {
add_dst_reg_wrmask(ctx, instr, dst, 0, dst->WriteMask);
add_src_reg_wrmask(ctx, instr, level, level->SwizzleX, 0x1);
}
if (dst->WriteMask & TGSI_WRITEMASK_W) {
/* The # of levels comes from getinfo.z. We need to add 1 to it, since
* the value in TEX_CONST_0 is zero-based.
*/
struct tgsi_dst_register tmp_dst;
struct tgsi_src_register *tmp_src;
tmp_src = get_internal_temp(ctx, &tmp_dst);
instr = instr_create(ctx, 5, OPC_GETINFO);
instr->cat5.type = get_utype(ctx);
instr->cat5.samp = samp->Index;
instr->cat5.tex = samp->Index;
add_dst_reg_wrmask(ctx, instr, &tmp_dst, 0, TGSI_WRITEMASK_Z);
instr = instr_create(ctx, 2, OPC_ADD_U);
add_dst_reg(ctx, instr, dst, 3);
add_src_reg(ctx, instr, tmp_src, src_swiz(tmp_src, 2));
ir3_reg_create(instr, 0, IR3_REG_IMMED)->iim_val = 1;
}
}
/* DDX/DDY */
@@ -3094,7 +3154,7 @@ ir3_compile_shader(struct ir3_shader_variant *so,
if (key.binning_pass) {
for (i = 0, j = 0; i < so->outputs_count; i++) {
unsigned name = sem2name(so->outputs[i].semantic);
unsigned idx = sem2name(so->outputs[i].semantic);
unsigned idx = sem2idx(so->outputs[i].semantic);
/* throw away everything but first position/psize */
if ((idx == 0) && ((name == TGSI_SEMANTIC_POSITION) ||


@@ -252,7 +252,12 @@ nvc0_tfb_validate(struct nvc0_context *nvc0)
for (b = 0; b < nvc0->num_tfbbufs; ++b) {
struct nvc0_so_target *targ = nvc0_so_target(nvc0->tfbbuf[b]);
struct nv04_resource *buf = nv04_resource(targ->pipe.buffer);
struct nv04_resource *buf;
if (!targ) {
IMMED_NVC0(push, NVC0_3D(TFB_BUFFER_ENABLE(b)), 0);
continue;
}
if (tfb)
targ->stride = tfb->stride[b];
@@ -260,6 +265,8 @@ nvc0_tfb_validate(struct nvc0_context *nvc0)
if (!(nvc0->tfbbuf_dirty & (1 << b)))
continue;
buf = nv04_resource(targ->pipe.buffer);
if (!targ->clean)
nvc0_query_fifo_wait(push, targ->pq);
BEGIN_NVC0(push, NVC0_3D(TFB_BUFFER_ENABLE(b)), 5);


@@ -1089,9 +1089,11 @@ nvc0_set_transform_feedback_targets(struct pipe_context *pipe,
pipe_so_target_reference(&nvc0->tfbbuf[i], targets[i]);
}
for (; i < nvc0->num_tfbbufs; ++i) {
nvc0->tfbbuf_dirty |= 1 << i;
nvc0_so_target_save_offset(pipe, nvc0->tfbbuf[i], i, &serialize);
pipe_so_target_reference(&nvc0->tfbbuf[i], NULL);
if (nvc0->tfbbuf[i]) {
nvc0->tfbbuf_dirty |= 1 << i;
nvc0_so_target_save_offset(pipe, nvc0->tfbbuf[i], i, &serialize);
pipe_so_target_reference(&nvc0->tfbbuf[i], NULL);
}
}
nvc0->num_tfbbufs = num_targets;


@@ -1401,11 +1401,14 @@ nvc0_blit(struct pipe_context *pipe, const struct pipe_blit_info *info)
} else
if (!nv50_2d_src_format_faithful(info->src.format)) {
if (!util_format_is_luminance(info->src.format)) {
if (!nv50_2d_dst_format_ops_supported(info->dst.format))
eng3d = TRUE;
else
if (util_format_is_intensity(info->src.format))
eng3d = info->src.format != PIPE_FORMAT_I8_UNORM;
else
if (!nv50_2d_dst_format_ops_supported(info->dst.format))
eng3d = TRUE;
if (util_format_is_alpha(info->src.format))
eng3d = info->src.format != PIPE_FORMAT_A8_UNORM;
else
eng3d = !nv50_2d_format_supported(info->src.format);
}


@@ -28,6 +28,7 @@
*/
#include <errno.h>
#include <limits.h>
#include <regex.h>
#include <stdlib.h>
#include <stdio.h>
@@ -528,7 +529,6 @@ void init_compiler(
}
#define MAX_LINE_LENGTH 100
#define MAX_PATH_LENGTH 100
unsigned load_program(
struct radeon_compiler *c,
@@ -536,14 +536,19 @@ unsigned load_program(
const char *filename)
{
char line[MAX_LINE_LENGTH];
char path[MAX_PATH_LENGTH];
char path[PATH_MAX];
FILE *file;
unsigned *count;
char **string_store;
unsigned i = 0;
int n;
memset(line, 0, sizeof(line));
snprintf(path, MAX_PATH_LENGTH, TEST_PATH "/%s", filename);
n = snprintf(path, PATH_MAX, TEST_PATH "/%s", filename);
if (n < 0 || n >= PATH_MAX) {
return 0;
}
file = fopen(path, "r");
if (!file) {
return 0;
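Note on the hunk above: the new code checks the `snprintf` return value so that a truncated path is rejected instead of being passed to `fopen`. A minimal sketch of the same check in isolation follows. The helper name `build_path` and its parameters are hypothetical and only illustrate the pattern.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "<dir>/<name>" into out.
 * Returns 1 on success, 0 if formatting failed or the result
 * did not fit (snprintf returns the length it wanted to write,
 * excluding the terminating NUL). */
static int build_path(char *out, size_t out_size,
                      const char *dir, const char *name)
{
    int n = snprintf(out, out_size, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= out_size)
        return 0;
    return 1;
}
```

A result whose length equals the buffer size is also rejected, because there would be no room left for the terminator.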


@@ -803,6 +803,15 @@ static void r300_blit(struct pipe_context *pipe,
(struct pipe_framebuffer_state*)r300->fb_state.state;
struct pipe_blit_info info = *blit;
/* The driver supports sRGB textures but not framebuffers. Blitting
* from sRGB to sRGB should be the same as blitting from linear
* to linear, so use that, This avoids incorrect linearization.
*/
if (util_format_is_srgb(info.src.format)) {
info.src.format = util_format_linear(info.src.format);
info.dst.format = util_format_linear(info.dst.format);
}
/* MSAA resolve. */
if (info.src.resource->nr_samples > 1 &&
!util_format_is_depth_or_stencil(info.src.resource->format)) {


@@ -170,24 +170,10 @@ static void get_external_state(
}
state->unit[i].non_normalized_coords = !s->state.normalized_coords;
state->unit[i].convert_unorm_to_snorm =
v->base.format == PIPE_FORMAT_RGTC1_SNORM ||
v->base.format == PIPE_FORMAT_LATC1_SNORM;
state->unit[i].convert_unorm_to_snorm = 0;
/* Pass texture swizzling to the compiler, some lowering passes need it. */
if (v->base.format == PIPE_FORMAT_RGTC1_SNORM ||
v->base.format == PIPE_FORMAT_LATC1_SNORM) {
unsigned char swizzle[4];
util_format_compose_swizzles(
util_format_description(v->base.format)->swizzle,
v->swizzle,
swizzle);
state->unit[i].texture_swizzle =
RC_MAKE_SWIZZLE(swizzle[0], swizzle[1],
swizzle[2], swizzle[3]);
} else if (state->unit[i].compare_mode_enabled) {
if (state->unit[i].compare_mode_enabled) {
state->unit[i].texture_swizzle =
RC_MAKE_SWIZZLE(v->swizzle[0], v->swizzle[1],
v->swizzle[2], v->swizzle[3]);


@@ -169,20 +169,21 @@ uint32_t r300_translate_texformat(enum pipe_format format,
/* Add swizzling. */
/* The RGTC1_SNORM and LATC1_SNORM swizzle is done in the shader. */
if (format != PIPE_FORMAT_RGTC1_SNORM &&
if (util_format_is_compressed(format) &&
dxtc_swizzle &&
format != PIPE_FORMAT_RGTC2_UNORM &&
format != PIPE_FORMAT_RGTC2_SNORM &&
format != PIPE_FORMAT_LATC2_UNORM &&
format != PIPE_FORMAT_LATC2_SNORM &&
format != PIPE_FORMAT_RGTC1_UNORM &&
format != PIPE_FORMAT_RGTC1_SNORM &&
format != PIPE_FORMAT_LATC1_UNORM &&
format != PIPE_FORMAT_LATC1_SNORM) {
if (util_format_is_compressed(format) &&
dxtc_swizzle &&
format != PIPE_FORMAT_RGTC2_UNORM &&
format != PIPE_FORMAT_RGTC2_SNORM &&
format != PIPE_FORMAT_LATC2_UNORM &&
format != PIPE_FORMAT_LATC2_SNORM) {
result |= r300_get_swizzle_combined(desc->swizzle, swizzle_view,
TRUE);
} else {
result |= r300_get_swizzle_combined(desc->swizzle, swizzle_view,
FALSE);
}
result |= r300_get_swizzle_combined(desc->swizzle, swizzle_view,
TRUE);
} else {
result |= r300_get_swizzle_combined(desc->swizzle, swizzle_view,
FALSE);
}
/* S3TC formats. */
@@ -213,6 +214,7 @@ uint32_t r300_translate_texformat(enum pipe_format format,
switch (format) {
case PIPE_FORMAT_RGTC1_SNORM:
case PIPE_FORMAT_LATC1_SNORM:
result |= sign_bit[0];
case PIPE_FORMAT_LATC1_UNORM:
case PIPE_FORMAT_RGTC1_UNORM:
return R500_TX_FORMAT_ATI1N | result;
@@ -936,14 +938,16 @@ static void r300_texture_setup_fb_state(struct r300_surface *surf)
surf->pitch_zmask = tex->tex.zmask_stride_in_pixels[level];
surf->pitch_hiz = tex->tex.hiz_stride_in_pixels[level];
} else {
enum pipe_format format = util_format_linear(surf->base.format);
surf->pitch =
stride |
r300_translate_colorformat(surf->base.format) |
r300_translate_colorformat(format) |
R300_COLOR_TILE(tex->tex.macrotile[level]) |
R300_COLOR_MICROTILE(tex->tex.microtile);
surf->format = r300_translate_out_fmt(surf->base.format);
surf->format = r300_translate_out_fmt(format);
surf->colormask_swizzle =
r300_translate_colormask_swizzle(surf->base.format);
r300_translate_colormask_swizzle(format);
surf->pitch_cmask = tex->tex.cmask_stride_in_pixels;
}
}


@@ -294,6 +294,7 @@ struct r600_so_target {
/* The buffer where BUFFER_FILLED_SIZE is stored. */
struct r600_resource *buf_filled_size;
unsigned buf_filled_size_offset;
bool buf_filled_size_valid;
unsigned stride_in_dw;
};


@@ -237,7 +237,7 @@ static void r600_emit_streamout_begin(struct r600_common_context *rctx, struct r
}
}
if (rctx->streamout.append_bitmask & (1 << i)) {
if (rctx->streamout.append_bitmask & (1 << i) && t[i]->buf_filled_size_valid) {
uint64_t va = t[i]->buf_filled_size->gpu_address +
t[i]->buf_filled_size_offset;
@@ -302,6 +302,8 @@ void r600_emit_streamout_end(struct r600_common_context *rctx)
* buffer bound. This ensures that the primitives-emitted query
* won't increment. */
r600_write_context_reg(cs, R_028AD0_VGT_STRMOUT_BUFFER_SIZE_0 + 16*i, 0);
t[i]->buf_filled_size_valid = true;
}
rctx->streamout.begin_emitted = false;


@@ -80,10 +80,6 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned type)
sprintf(Str, "%1d", llvm_type);
LLVMAddTargetDependentFunctionAttr(F, "ShaderType", Str);
if (type != TGSI_PROCESSOR_COMPUTE) {
LLVMAddTargetDependentFunctionAttr(F, "unsafe-fp-math", "true");
}
}
static void init_r600_target()


@@ -748,7 +748,7 @@ static void txp_fetch_args(
const struct tgsi_full_instruction * inst = emit_data->inst;
LLVMValueRef src_w;
unsigned chan;
LLVMValueRef coords[4];
LLVMValueRef coords[5];
emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4);
src_w = lp_build_emit_fetch(bld_base, emit_data->inst, 0, TGSI_CHAN_W);


@@ -228,14 +228,14 @@ static LLVMValueRef get_instance_index_for_fetch(
LLVMValueRef result = LLVMGetParam(radeon_bld->main_fn,
si_shader_ctx->param_instance_id);
result = LLVMBuildAdd(gallivm->builder, result, LLVMGetParam(
radeon_bld->main_fn, SI_PARAM_START_INSTANCE), "");
/* The division must be done before START_INSTANCE is added. */
if (divisor > 1)
result = LLVMBuildUDiv(gallivm->builder, result,
lp_build_const_int32(gallivm, divisor), "");
return result;
return LLVMBuildAdd(gallivm->builder, result, LLVMGetParam(
radeon_bld->main_fn, SI_PARAM_START_INSTANCE), "");
}
static void declare_input_vs(
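Note on the `get_instance_index_for_fetch` hunk above: as its comment says, the division by the divisor has to happen before START_INSTANCE is added. The arithmetic can be sketched in plain C as follows. The function name `fetch_index` and the scalar parameters are made up for this note and stand in for the LLVM values built in the real code.

```c
#include <stdint.h>

/* Hypothetical scalar model of the instance index computation:
 * divide the instance id first, then offset by the start instance. */
static uint32_t fetch_index(uint32_t instance_id, uint32_t divisor,
                            uint32_t start_instance)
{
    uint32_t result = instance_id;

    /* The division must be done before start_instance is added. */
    if (divisor > 1)
        result /= divisor;

    return result + start_instance;
}
```

With `instance_id = 5`, `divisor = 2` and `start_instance = 3`, this order gives 2 + 3 = 5, whereas adding first would give (5 + 3) / 2 = 4.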
@@ -1505,7 +1505,7 @@ static void tex_fetch_args(
const struct tgsi_full_instruction * inst = emit_data->inst;
unsigned opcode = inst->Instruction.Opcode;
unsigned target = inst->Texture.Texture;
LLVMValueRef coords[4];
LLVMValueRef coords[5];
LLVMValueRef address[16];
int ref_pos;
unsigned num_coords = tgsi_util_get_texture_coord_dim(target, &ref_pos);


@@ -697,12 +697,16 @@ static void si_delete_rs_state(struct pipe_context *ctx, void *state)
*/
static void si_update_dsa_stencil_ref(struct si_context *sctx)
{
struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
struct si_pm4_state *pm4;
struct pipe_stencil_ref *ref = &sctx->stencil_ref;
struct si_state_dsa *dsa = sctx->queued.named.dsa;
struct si_state_dsa *dsa = sctx->queued.named.dsa;
if (pm4 == NULL)
return;
if (!dsa)
return;
pm4 = CALLOC_STRUCT(si_pm4_state);
if (pm4 == NULL)
return;
si_pm4_set_reg(pm4, R_028430_DB_STENCILREFMASK,
S_028430_STENCILTESTVAL(ref->ref_value[0]) |


@@ -544,9 +544,11 @@ bcolor:
}
}
if (j == vsinfo->num_outputs) {
/* No corresponding output found, load defaults into input */
tmp |= S_028644_OFFSET(0x20);
if (j == vsinfo->num_outputs && !G_028644_PT_SPRITE_TEX(tmp)) {
/* No corresponding output found, load defaults into input.
* Don't set any other bits.
* (FLAT_SHADE=1 completely changes behavior) */
tmp = S_028644_OFFSET(0x20);
}
si_pm4_set_reg(pm4,


@@ -302,6 +302,9 @@ NineAdapter9_CheckDeviceFormat( struct NineAdapter9 *This,
return D3DERR_NOTAVAILABLE;
}
/* we support ATI1 and ATI2 hack only for 2D textures */
if (RType != D3DRTYPE_TEXTURE && (CheckFormat == D3DFMT_ATI1 || CheckFormat == D3DFMT_ATI2))
return D3DERR_NOTAVAILABLE;
/* if (Usage & D3DUSAGE_NONSECURE) { don't know the implications of this } */
/* if (Usage & D3DUSAGE_SOFTWAREPROCESSING) { we can always support this } */
@@ -549,7 +552,7 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
D3DPMISCCAPS_CULLCCW |
D3DPMISCCAPS_COLORWRITEENABLE |
D3DPMISCCAPS_CLIPPLANESCALEDPOINTS |
D3DPMISCCAPS_CLIPTLVERTS |
/*D3DPMISCCAPS_CLIPTLVERTS |*/
D3DPMISCCAPS_TSSARGTEMP |
D3DPMISCCAPS_BLENDOP |
D3DPIPECAP(INDEP_BLEND_ENABLE, D3DPMISCCAPS_INDEPENDENTWRITEMASKS) |
@@ -560,6 +563,8 @@ NineAdapter9_GetDeviceCaps( struct NineAdapter9 *This,
D3DPIPECAP(MIXED_COLORBUFFER_FORMATS, D3DPMISCCAPS_MRTINDEPENDENTBITDEPTHS) |
D3DPMISCCAPS_MRTPOSTPIXELSHADERBLENDING |
/*D3DPMISCCAPS_FOGVERTEXCLAMPED*/0;
if (!screen->get_param(screen, PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION))
pCaps->PrimitiveMiscCaps |= D3DPMISCCAPS_CLIPTLVERTS;
pCaps->RasterCaps =
D3DPIPECAP(ANISOTROPIC_FILTER, D3DPRASTERCAPS_ANISOTROPY) |


@@ -436,14 +436,21 @@ NineBaseTexture9_CreatePipeResource( struct NineBaseTexture9 *This,
return D3D_OK;
}
#define SWIZZLE_TO_REPLACE(s) (s == UTIL_FORMAT_SWIZZLE_0 || \
s == UTIL_FORMAT_SWIZZLE_1 || \
s == UTIL_FORMAT_SWIZZLE_NONE)
HRESULT
NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 *This,
const int sRGB )
{
const struct util_format_description *desc;
struct pipe_context *pipe = This->pipe;
struct pipe_screen *screen = pipe->screen;
struct pipe_resource *resource = This->base.resource;
struct pipe_sampler_view templ;
enum pipe_format srgb_format;
unsigned i;
uint8_t swizzle[4];
DBG("This=%p sRGB=%d\n", This, sRGB);
@@ -452,6 +459,9 @@ NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 *This,
if (unlikely(This->format == D3DFMT_NULL))
return D3D_OK;
NineBaseTexture9_Dump(This);
/* hack due to incorrect POOL_MANAGED handling */
NineBaseTexture9_GenerateMipSubLevels(This);
resource = This->base.resource;
}
assert(resource);
@@ -463,25 +473,49 @@ NineBaseTexture9_UpdateSamplerView( struct NineBaseTexture9 *This,
swizzle[3] = PIPE_SWIZZLE_ALPHA;
desc = util_format_description(resource->format);
if (desc->colorspace == UTIL_FORMAT_COLORSPACE_ZS) {
/* ZZZ1 -> 0Z01 (see end of docs/source/tgsi.rst)
* XXX: but it's wrong
swizzle[0] = PIPE_SWIZZLE_ZERO;
swizzle[2] = PIPE_SWIZZLE_ZERO; */
} else
if (desc->swizzle[0] == UTIL_FORMAT_SWIZZLE_X &&
desc->swizzle[3] == UTIL_FORMAT_SWIZZLE_1) {
/* R001/RG01 -> R111/RG11 */
if (desc->swizzle[1] == UTIL_FORMAT_SWIZZLE_0)
swizzle[1] = PIPE_SWIZZLE_ONE;
if (desc->swizzle[2] == UTIL_FORMAT_SWIZZLE_0)
swizzle[2] = PIPE_SWIZZLE_ONE;
/* msdn doc is incomplete here and wrong.
* The only formats that can be read directly here
* are DF16, DF24 and INTZ.
* Tested on win the swizzle is
* R = depth, G = B = 0, A = 1 for DF16 and DF24
* R = G = B = A = depth for INTZ
* For the other ZS formats that can't be read directly
* but can be used as shadow map, the result is duplicated on
* all channel */
if (This->format == D3DFMT_DF16 ||
This->format == D3DFMT_DF24) {
swizzle[1] = PIPE_SWIZZLE_ZERO;
swizzle[2] = PIPE_SWIZZLE_ZERO;
swizzle[3] = PIPE_SWIZZLE_ONE;
} else {
swizzle[1] = PIPE_SWIZZLE_RED;
swizzle[2] = PIPE_SWIZZLE_RED;
swizzle[3] = PIPE_SWIZZLE_RED;
}
} else if (resource->format != PIPE_FORMAT_A8_UNORM &&
resource->format != PIPE_FORMAT_RGTC1_UNORM) {
/* exceptions:
* A8 should have 0.0 as default values for RGB.
* ATI1/RGTC1 should be r 0 0 1 (tested on windows).
* It is already what gallium does. All the other ones
* should have 1.0 for non-defined values */
for (i = 0; i < 4; i++) {
if (SWIZZLE_TO_REPLACE(desc->swizzle[i]))
swizzle[i] = PIPE_SWIZZLE_ONE;
}
}
/* but 000A remains unchanged */
templ.format = sRGB ? util_format_srgb(resource->format) : resource->format;
/* if requested and supported, convert to the sRGB format */
srgb_format = util_format_srgb(resource->format);
if (sRGB && srgb_format != PIPE_FORMAT_NONE &&
screen->is_format_supported(screen, srgb_format,
resource->target, 0, resource->bind))
templ.format = srgb_format;
else
templ.format = resource->format;
templ.u.tex.first_layer = 0;
templ.u.tex.last_layer = (resource->target == PIPE_TEXTURE_CUBE) ?
5 : (This->base.info.depth0 - 1);
templ.u.tex.last_layer = resource->target == PIPE_TEXTURE_3D ?
resource->depth0 - 1 : resource->array_size - 1;
templ.u.tex.first_level = 0;
templ.u.tex.last_level = resource->last_level;
templ.swizzle_r = swizzle[0];


@@ -38,6 +38,8 @@ NineCubeTexture9_ctor( struct NineCubeTexture9 *This,
HANDLE *pSharedHandle )
{
struct pipe_resource *info = &This->base.base.info;
struct pipe_screen *screen = pParams->device->screen;
enum pipe_format pf;
unsigned i;
D3DSURFACE_DESC sfdesc;
HRESULT hr;
@@ -55,9 +57,19 @@ NineCubeTexture9_ctor( struct NineCubeTexture9 *This,
if (Usage & D3DUSAGE_AUTOGENMIPMAP)
Levels = 0;
pf = d3d9_to_pipe_format(Format);
if (pf == PIPE_FORMAT_NONE ||
!screen->is_format_supported(screen, pf, PIPE_TEXTURE_CUBE, 0, PIPE_BIND_SAMPLER_VIEW)) {
return D3DERR_INVALIDCALL;
}
/* We support ATI1 and ATI2 hacks only for 2D textures */
if (Format == D3DFMT_ATI1 || Format == D3DFMT_ATI2)
return D3DERR_INVALIDCALL;
info->screen = pParams->device->screen;
info->target = PIPE_TEXTURE_CUBE;
info->format = d3d9_to_pipe_format(Format);
info->format = pf;
info->width0 = EdgeLength;
info->height0 = EdgeLength;
info->depth0 = 1;
@@ -146,7 +158,7 @@ NineCubeTexture9_GetLevelDesc( struct NineCubeTexture9 *This,
user_assert(Level == 0 || !(This->base.base.usage & D3DUSAGE_AUTOGENMIPMAP),
D3DERR_INVALIDCALL);
*pDesc = This->surfaces[Level]->desc;
*pDesc = This->surfaces[Level * 6]->desc;
return D3D_OK;
}
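Note on the `NineCubeTexture9_GetLevelDesc` hunk above: indexing with `Level * 6` suggests that the `surfaces` array holds six faces per mip level, stored level by level. That layout is an assumption inferred from this hunk alone. Under it, the index of a given face could be computed as in the sketch below, where `cube_surface_index` is a hypothetical helper.

```c
enum { CUBE_FACES = 6 };

/* Hypothetical index helper, assuming a level-major layout:
 * all six faces of level 0, then all six faces of level 1, ... */
static unsigned cube_surface_index(unsigned level, unsigned face)
{
    return level * CUBE_FACES + face;
}
```

Under that assumption, `cube_surface_index(Level, 0)` matches the `Level * 6` used in the fix, while the old `surfaces[Level]` would only have been correct for level 0.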


@@ -62,7 +62,7 @@ NineDevice9_SetDefaultState( struct NineDevice9 *This, boolean is_reset )
assert(!This->is_recording);
nine_state_set_defaults(&This->state, &This->caps, is_reset);
nine_state_set_defaults(This, &This->caps, is_reset);
This->state.viewport.X = 0;
This->state.viewport.Y = 0;
@@ -109,7 +109,7 @@ NineDevice9_RestoreNonCSOState( struct NineDevice9 *This, unsigned mask )
cb.buffer = This->constbuf_vs;
cb.user_buffer = NULL;
}
cb.buffer_size = This->constbuf_vs->width0;
cb.buffer_size = This->vs_const_size;
pipe->set_constant_buffer(pipe, PIPE_SHADER_VERTEX, 0, &cb);
if (This->prefer_user_constbuf) {
@@ -117,7 +117,7 @@ NineDevice9_RestoreNonCSOState( struct NineDevice9 *This, unsigned mask )
} else {
cb.buffer = This->constbuf_ps;
}
cb.buffer_size = This->constbuf_ps->width0;
cb.buffer_size = This->ps_const_size;
pipe->set_constant_buffer(pipe, PIPE_SHADER_FRAGMENT, 0, &cb);
}
@@ -262,10 +262,14 @@ NineDevice9_ctor( struct NineDevice9 *This,
This->max_ps_const_f = max_const_ps -
(NINE_MAX_CONST_I + NINE_MAX_CONST_B / 4);
This->vs_const_size = max_const_vs * sizeof(float[4]);
This->ps_const_size = max_const_ps * sizeof(float[4]);
/* Include space for I,B constants for user constbuf. */
This->state.vs_const_f = CALLOC(NINE_MAX_CONST_ALL, sizeof(float[4]));
This->state.ps_const_f = CALLOC(NINE_MAX_CONST_ALL, sizeof(float[4]));
if (!This->state.vs_const_f || !This->state.ps_const_f)
This->state.vs_const_f = CALLOC(This->vs_const_size, 1);
This->state.ps_const_f = CALLOC(This->ps_const_size, 1);
This->state.vs_lconstf_temp = CALLOC(This->vs_const_size,1);
if (!This->state.vs_const_f || !This->state.ps_const_f ||
!This->state.vs_lconstf_temp)
return E_OUTOFMEMORY;
if (strstr(pScreen->get_name(pScreen), "AMD") ||
@@ -283,23 +287,16 @@ NineDevice9_ctor( struct NineDevice9 *This,
tmpl.bind = PIPE_BIND_CONSTANT_BUFFER;
tmpl.flags = 0;
tmpl.width0 = max_const_vs * sizeof(float[4]);
tmpl.width0 = This->vs_const_size;
This->constbuf_vs = pScreen->resource_create(pScreen, &tmpl);
tmpl.width0 = max_const_ps * sizeof(float[4]);
tmpl.width0 = This->ps_const_size;
This->constbuf_ps = pScreen->resource_create(pScreen, &tmpl);
if (!This->constbuf_vs || !This->constbuf_ps)
return E_OUTOFMEMORY;
}
This->vs_bool_true = pScreen->get_shader_param(pScreen,
PIPE_SHADER_VERTEX,
PIPE_SHADER_CAP_INTEGERS) ? 0xFFFFFFFF : fui(1.0f);
This->ps_bool_true = pScreen->get_shader_param(pScreen,
PIPE_SHADER_FRAGMENT,
PIPE_SHADER_CAP_INTEGERS) ? 0xFFFFFFFF : fui(1.0f);
/* Allocate upload helper for drivers that suck (from st pov ;). */
{
unsigned bind = 0;
@@ -314,6 +311,8 @@ NineDevice9_ctor( struct NineDevice9 *This,
}
This->driver_caps.window_space_position_support = GET_PCAP(TGSI_VS_WINDOW_SPACE_POSITION);
This->driver_caps.vs_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_VERTEX, PIPE_SHADER_CAP_INTEGERS);
This->driver_caps.ps_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_FRAGMENT, PIPE_SHADER_CAP_INTEGERS);
nine_ff_init(This); /* initialize fixed function code */
@@ -350,6 +349,7 @@ NineDevice9_dtor( struct NineDevice9 *This )
pipe_resource_reference(&This->constbuf_ps, NULL);
FREE(This->state.vs_const_f);
FREE(This->state.ps_const_f);
FREE(This->state.vs_lconstf_temp);
if (This->swapchains) {
for (i = 0; i < This->nswapchains; ++i)
@@ -2938,6 +2938,7 @@ NineDevice9_SetVertexShaderConstantI( struct NineDevice9 *This,
UINT Vector4iCount )
{
struct nine_state *state = This->update;
int i;
DBG("This=%p StartRegister=%u pConstantData=%p Vector4iCount=%u\n",
This, StartRegister, pConstantData, Vector4iCount);
@@ -2946,9 +2947,18 @@ NineDevice9_SetVertexShaderConstantI( struct NineDevice9 *This,
user_assert(StartRegister + Vector4iCount <= NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(&state->vs_const_i[StartRegister][0],
pConstantData,
Vector4iCount * sizeof(state->vs_const_i[0]));
if (This->driver_caps.vs_integer) {
memcpy(&state->vs_const_i[StartRegister][0],
pConstantData,
Vector4iCount * sizeof(state->vs_const_i[0]));
} else {
for (i = 0; i < Vector4iCount; i++) {
state->vs_const_i[StartRegister+i][0] = fui((float)(pConstantData[4*i]));
state->vs_const_i[StartRegister+i][1] = fui((float)(pConstantData[4*i+1]));
state->vs_const_i[StartRegister+i][2] = fui((float)(pConstantData[4*i+2]));
state->vs_const_i[StartRegister+i][3] = fui((float)(pConstantData[4*i+3]));
}
}
state->changed.vs_const_i |= ((1 << Vector4iCount) - 1) << StartRegister;
state->changed.group |= NINE_STATE_VS_CONST;
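Note on the `changed.vs_const_i` update above: the expression builds a mask with `Vector4iCount` consecutive bits set, starting at bit `StartRegister`. A small sketch of that computation follows. The helper name `dirty_range_mask` is hypothetical, and the sketch assumes `count` stays below 32, since shifting a 32-bit value by 32 is undefined in C.

```c
#include <stdint.h>

/* Hypothetical helper: 'count' consecutive bits set, starting at
 * bit 'start'. Assumes count < 32 and start + count <= 32. */
static uint32_t dirty_range_mask(unsigned start, unsigned count)
{
    return ((1u << count) - 1u) << start;
}
```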
@@ -2963,14 +2973,24 @@ NineDevice9_GetVertexShaderConstantI( struct NineDevice9 *This,
UINT Vector4iCount )
{
const struct nine_state *state = &This->state;
int i;
user_assert(StartRegister < NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(StartRegister + Vector4iCount <= NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(pConstantData,
&state->vs_const_i[StartRegister][0],
Vector4iCount * sizeof(state->vs_const_i[0]));
if (This->driver_caps.vs_integer) {
memcpy(pConstantData,
&state->vs_const_i[StartRegister][0],
Vector4iCount * sizeof(state->vs_const_i[0]));
} else {
for (i = 0; i < Vector4iCount; i++) {
pConstantData[4*i] = (int32_t) uif(state->vs_const_i[StartRegister+i][0]);
pConstantData[4*i+1] = (int32_t) uif(state->vs_const_i[StartRegister+i][1]);
pConstantData[4*i+2] = (int32_t) uif(state->vs_const_i[StartRegister+i][2]);
pConstantData[4*i+3] = (int32_t) uif(state->vs_const_i[StartRegister+i][3]);
}
}
return D3D_OK;
}
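Note on the two hunks above: when the driver reports no integer support, the integer constants are stored as the bit pattern of the equivalent float and converted back on read. A self-contained sketch of that round trip follows. `float_bits` and `bits_float` are stand-ins written for this note (assumed to match what `fui` and `uif` do), and `store_int4` / `load_int4` are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for fui()/uif(): reinterpret a float as its bits and back. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    return u;
}

static float bits_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof(f));
    return f;
}

/* Store one 4-component integer constant. Without integer support
 * the value is converted to float and the float's bits are stored. */
static void store_int4(uint32_t dst[4], const int32_t src[4], int has_integers)
{
    for (int c = 0; c < 4; c++)
        dst[c] = has_integers ? (uint32_t)src[c] : float_bits((float)src[c]);
}

/* Read the constant back, undoing the float conversion if needed. */
static void load_int4(int32_t dst[4], const uint32_t src[4], int has_integers)
{
    for (int c = 0; c < 4; c++)
        dst[c] = has_integers ? (int32_t)src[c] : (int32_t)bits_float(src[c]);
}
```

The float path is exact only for values a 32-bit float can represent, so integers with magnitude above 2^24 may not survive the round trip unchanged.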
@@ -2982,6 +3002,8 @@ NineDevice9_SetVertexShaderConstantB( struct NineDevice9 *This,
UINT BoolCount )
{
struct nine_state *state = This->update;
int i;
uint32_t bool_true = This->driver_caps.vs_integer ? 0xFFFFFFFF : fui(1.0f);
DBG("This=%p StartRegister=%u pConstantData=%p BoolCount=%u\n",
This, StartRegister, pConstantData, BoolCount);
@@ -2990,9 +3012,8 @@ NineDevice9_SetVertexShaderConstantB( struct NineDevice9 *This,
user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(&state->vs_const_b[StartRegister],
pConstantData,
BoolCount * sizeof(state->vs_const_b[0]));
for (i = 0; i < BoolCount; i++)
state->vs_const_b[StartRegister + i] = pConstantData[i] ? bool_true : 0;
state->changed.vs_const_b |= ((1 << BoolCount) - 1) << StartRegister;
state->changed.group |= NINE_STATE_VS_CONST;
@@ -3007,14 +3028,14 @@ NineDevice9_GetVertexShaderConstantB( struct NineDevice9 *This,
UINT BoolCount )
{
const struct nine_state *state = &This->state;
int i;
user_assert(StartRegister < NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(pConstantData,
&state->vs_const_b[StartRegister],
BoolCount * sizeof(state->vs_const_b[0]));
for (i = 0; i < BoolCount; i++)
pConstantData[i] = state->vs_const_b[StartRegister + i] != 0 ? TRUE : FALSE;
return D3D_OK;
}
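Note on the boolean constant hunks above: each input is normalized to either 0 or a "true" value that depends on integer support (`0xFFFFFFFF` with integers, the bits of `1.0f` without), and any non-zero stored value reads back as TRUE. A minimal sketch follows, with hypothetical names `encode_bool` and `decode_bool`, and `float_bits` standing in for `fui`.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for fui(): reinterpret a float as its 32-bit pattern. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    return u;
}

/* Any non-zero input becomes the driver-specific "true" value. */
static uint32_t encode_bool(int value, int has_integers)
{
    uint32_t bool_true = has_integers ? 0xFFFFFFFFu : float_bits(1.0f);
    return value ? bool_true : 0;
}

/* Either "true" encoding reads back as 1, zero reads back as 0. */
static int decode_bool(uint32_t stored)
{
    return stored != 0 ? 1 : 0;
}
```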
@@ -3243,6 +3264,7 @@ NineDevice9_SetPixelShaderConstantI( struct NineDevice9 *This,
UINT Vector4iCount )
{
struct nine_state *state = This->update;
int i;
DBG("This=%p StartRegister=%u pConstantData=%p Vector4iCount=%u\n",
This, StartRegister, pConstantData, Vector4iCount);
@@ -3251,10 +3273,18 @@ NineDevice9_SetPixelShaderConstantI( struct NineDevice9 *This,
user_assert(StartRegister + Vector4iCount <= NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(&state->ps_const_i[StartRegister][0],
pConstantData,
Vector4iCount * sizeof(state->ps_const_i[0]));
if (This->driver_caps.ps_integer) {
memcpy(&state->ps_const_i[StartRegister][0],
pConstantData,
Vector4iCount * sizeof(state->ps_const_i[0]));
} else {
for (i = 0; i < Vector4iCount; i++) {
state->ps_const_i[StartRegister+i][0] = fui((float)(pConstantData[4*i]));
state->ps_const_i[StartRegister+i][1] = fui((float)(pConstantData[4*i+1]));
state->ps_const_i[StartRegister+i][2] = fui((float)(pConstantData[4*i+2]));
state->ps_const_i[StartRegister+i][3] = fui((float)(pConstantData[4*i+3]));
}
}
state->changed.ps_const_i |= ((1 << Vector4iCount) - 1) << StartRegister;
state->changed.group |= NINE_STATE_PS_CONST;
@@ -3268,14 +3298,24 @@ NineDevice9_GetPixelShaderConstantI( struct NineDevice9 *This,
UINT Vector4iCount )
{
const struct nine_state *state = &This->state;
int i;
user_assert(StartRegister < NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(StartRegister + Vector4iCount <= NINE_MAX_CONST_I, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(pConstantData,
&state->ps_const_i[StartRegister][0],
Vector4iCount * sizeof(state->ps_const_i[0]));
if (This->driver_caps.ps_integer) {
memcpy(pConstantData,
&state->ps_const_i[StartRegister][0],
Vector4iCount * sizeof(state->ps_const_i[0]));
} else {
for (i = 0; i < Vector4iCount; i++) {
pConstantData[4*i] = (int32_t) uif(state->ps_const_i[StartRegister+i][0]);
pConstantData[4*i+1] = (int32_t) uif(state->ps_const_i[StartRegister+i][1]);
pConstantData[4*i+2] = (int32_t) uif(state->ps_const_i[StartRegister+i][2]);
pConstantData[4*i+3] = (int32_t) uif(state->ps_const_i[StartRegister+i][3]);
}
}
return D3D_OK;
}
@@ -3287,6 +3327,8 @@ NineDevice9_SetPixelShaderConstantB( struct NineDevice9 *This,
UINT BoolCount )
{
struct nine_state *state = This->update;
int i;
uint32_t bool_true = This->driver_caps.ps_integer ? 0xFFFFFFFF : fui(1.0f);
DBG("This=%p StartRegister=%u pConstantData=%p BoolCount=%u\n",
This, StartRegister, pConstantData, BoolCount);
@@ -3295,9 +3337,8 @@ NineDevice9_SetPixelShaderConstantB( struct NineDevice9 *This,
user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(&state->ps_const_b[StartRegister],
pConstantData,
BoolCount * sizeof(state->ps_const_b[0]));
for (i = 0; i < BoolCount; i++)
state->ps_const_b[StartRegister + i] = pConstantData[i] ? bool_true : 0;
state->changed.ps_const_b |= ((1 << BoolCount) - 1) << StartRegister;
state->changed.group |= NINE_STATE_PS_CONST;
@@ -3312,14 +3353,14 @@ NineDevice9_GetPixelShaderConstantB( struct NineDevice9 *This,
UINT BoolCount )
{
const struct nine_state *state = &This->state;
int i;
user_assert(StartRegister < NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(StartRegister + BoolCount <= NINE_MAX_CONST_B, D3DERR_INVALIDCALL);
user_assert(pConstantData, D3DERR_INVALIDCALL);
memcpy(pConstantData,
&state->ps_const_b[StartRegister],
BoolCount * sizeof(state->ps_const_b[0]));
for (i = 0; i < BoolCount; i++)
pConstantData[i] = state->ps_const_b[StartRegister + i] ? TRUE : FALSE;
return D3D_OK;
}
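The boolean setter and getter above normalise every value instead of copying raw memory, and the setter marks the touched registers in a dirty bitmask. A minimal sketch of that pattern, assuming a 16-entry register file; `MAX_CONST_B`, `struct bool_consts`, `set_bools` and `get_bool` are hypothetical names used only for this illustration:

```c
#include <stdint.h>

#define MAX_CONST_B 16 /* hypothetical stand-in for NINE_MAX_CONST_B */

struct bool_consts {
    uint32_t value[MAX_CONST_B];
    uint32_t changed; /* one dirty bit per register */
};

/* Returns 0 on success, -1 if the arguments are invalid. */
static int set_bools(struct bool_consts *c, unsigned start,
                     const int *data, unsigned count, uint32_t bool_true)
{
    unsigned i;

    if (!data || start >= MAX_CONST_B || start + count > MAX_CONST_B)
        return -1;

    /* Any non-zero input becomes the driver's "true" pattern. */
    for (i = 0; i < count; i++)
        c->value[start + i] = data[i] ? bool_true : 0;

    /* count <= 16 here, so the shift cannot overflow 32 bits. */
    c->changed |= ((1u << count) - 1u) << start;
    return 0;
}

/* Any non-zero stored pattern reads back as 1. */
static int get_bool(const struct bool_consts *c, unsigned idx)
{
    return c->value[idx] ? 1 : 0;
}
```

Passing `0xFFFFFFFF` or the bit pattern of `1.0f` as `bool_true` corresponds to the two cases selected by `ps_integer` in the hunk above.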


@@ -77,10 +77,10 @@ struct NineDevice9
struct pipe_resource *constbuf_vs;
struct pipe_resource *constbuf_ps;
uint16_t vs_const_size;
uint16_t ps_const_size;
uint16_t max_vs_const_f;
uint16_t max_ps_const_f;
uint32_t vs_bool_true;
uint32_t ps_bool_true;
struct gen_mipmap_state *gen_mipmap;
@@ -111,6 +111,8 @@ struct NineDevice9
boolean user_vbufs;
boolean user_ibufs;
boolean window_space_position_support;
boolean vs_integer;
boolean ps_integer;
} driver_caps;
struct u_upload_mgr *upload;


@@ -1151,10 +1151,10 @@ ps_do_ts_op(struct ps_build_ctx *ps, unsigned top, struct ureg_dst dst, struct u
ureg_MUL(ureg, ureg_saturate(dst), ureg_src(tmp), ureg_imm4f(ureg,4.0,4.0,4.0,4.0));
break;
case D3DTOP_MULTIPLYADD:
ureg_MAD(ureg, dst, arg[2], arg[0], arg[1]);
ureg_MAD(ureg, dst, arg[1], arg[2], arg[0]);
break;
case D3DTOP_LERP:
ureg_LRP(ureg, dst, arg[1], arg[2], arg[0]);
ureg_LRP(ureg, dst, arg[0], arg[1], arg[2]);
break;
case D3DTOP_DISABLE:
/* no-op ? */
@@ -1278,6 +1278,8 @@ nine_ff_build_ps(struct NineDevice9 *device, struct nine_ff_ps_key *key)
(key->ts[0].resultarg != 0 /* not current */ ||
key->ts[0].colorop == D3DTOP_DISABLE ||
key->ts[0].alphaop == D3DTOP_DISABLE ||
key->ts[0].colorop == D3DTOP_BLENDCURRENTALPHA ||
key->ts[0].alphaop == D3DTOP_BLENDCURRENTALPHA ||
key->ts[0].colorarg0 == D3DTA_CURRENT ||
key->ts[0].colorarg1 == D3DTA_CURRENT ||
key->ts[0].colorarg2 == D3DTA_CURRENT ||


@@ -185,6 +185,8 @@ d3d9_to_pipe_format(D3DFORMAT format)
case D3DFMT_DXT3: return PIPE_FORMAT_DXT3_RGBA;
case D3DFMT_DXT4: return PIPE_FORMAT_DXT5_RGBA; /* XXX */
case D3DFMT_DXT5: return PIPE_FORMAT_DXT5_RGBA;
case D3DFMT_ATI1: return PIPE_FORMAT_RGTC1_UNORM;
case D3DFMT_ATI2: return PIPE_FORMAT_RGTC2_UNORM;
case D3DFMT_UYVY: return PIPE_FORMAT_UYVY;
case D3DFMT_YUY2: return PIPE_FORMAT_YUYV; /* XXX check */
case D3DFMT_NV12: return PIPE_FORMAT_NV12;
@@ -249,6 +251,8 @@ d3dformat_to_string(D3DFORMAT fmt)
case D3DFMT_DXT3: return "D3DFMT_DXT3";
case D3DFMT_DXT4: return "D3DFMT_DXT4";
case D3DFMT_DXT5: return "D3DFMT_DXT5";
case D3DFMT_ATI1: return "D3DFMT_ATI1";
case D3DFMT_ATI2: return "D3DFMT_ATI2";
case D3DFMT_D16_LOCKABLE: return "D3DFMT_D16_LOCKABLE";
case D3DFMT_D32: return "D3DFMT_D32";
case D3DFMT_D15S1: return "D3DFMT_D15S1";
@@ -279,6 +283,7 @@ d3dformat_to_string(D3DFORMAT fmt)
case D3DFMT_DF16: return "D3DFMT_DF16";
case D3DFMT_DF24: return "D3DFMT_DF24";
case D3DFMT_INTZ: return "D3DFMT_INTZ";
case D3DFMT_NVDB: return "D3DFMT_NVDB";
case D3DFMT_NULL: return "D3DFMT_NULL";
default:
break;


@@ -35,11 +35,6 @@
#define DBG_CHANNEL DBG_SHADER
#if 1
#define NINE_TGSI_LAZY_DEVS /* don't use TGSI_OPCODE_BREAKC */
#endif
#define NINE_TGSI_LAZY_R600 /* don't use TGSI_OPCODE_DP2A */
#define DUMP(args...) _nine_debug_printf(DBG_CHANNEL, NULL, args)
@@ -471,14 +466,14 @@ struct shader_translator
struct ureg_src vFace;
struct ureg_src s;
struct ureg_dst p;
struct ureg_dst a;
struct ureg_dst address;
struct ureg_dst a0;
struct ureg_dst tS[8]; /* texture stage registers */
struct ureg_dst tdst; /* scratch dst if we need extra modifiers */
struct ureg_dst t[5]; /* scratch TEMPs */
struct ureg_src vC[2]; /* PS color in */
struct ureg_src vT[8]; /* PS texcoord in */
struct ureg_dst rL[NINE_MAX_LOOP_DEPTH]; /* loop ctr */
struct ureg_dst aL[NINE_MAX_LOOP_DEPTH]; /* loop ctr ADDR register */
} regs;
unsigned num_temp; /* Elements(regs.r) */
unsigned num_scratch;
@@ -487,6 +482,7 @@ struct shader_translator
unsigned cond_depth;
unsigned loop_labels[NINE_MAX_LOOP_DEPTH];
unsigned cond_labels[NINE_MAX_COND_DEPTH];
boolean loop_or_rep[NINE_MAX_LOOP_DEPTH]; /* true: loop, false: rep */
unsigned *inst_labels; /* LABEL op */
unsigned num_inst_labels;
@@ -664,8 +660,10 @@ static INLINE void
tx_addr_alloc(struct shader_translator *tx, INT idx)
{
assert(idx == 0);
if (ureg_dst_is_undef(tx->regs.a))
tx->regs.a = ureg_DECL_address(tx->ureg);
if (ureg_dst_is_undef(tx->regs.address))
tx->regs.address = ureg_DECL_address(tx->ureg);
if (ureg_dst_is_undef(tx->regs.a0))
tx->regs.a0 = ureg_DECL_temporary(tx->ureg);
}
static INLINE void
@@ -707,7 +705,7 @@ tx_endloop(struct shader_translator *tx)
}
static struct ureg_dst
tx_get_loopctr(struct shader_translator *tx)
tx_get_loopctr(struct shader_translator *tx, boolean loop_or_rep)
{
const unsigned l = tx->loop_depth - 1;
@@ -717,26 +715,32 @@ tx_get_loopctr(struct shader_translator *tx)
return ureg_dst_undef();
}
if (ureg_dst_is_undef(tx->regs.aL[l]))
{
struct ureg_dst rreg = ureg_DECL_local_temporary(tx->ureg);
struct ureg_dst areg = ureg_DECL_address(tx->ureg);
unsigned c;
assert(l % 4 == 0);
for (c = l; c < (l + 4) && c < Elements(tx->regs.aL); ++c) {
tx->regs.rL[c] = ureg_writemask(rreg, 1 << (c & 3));
tx->regs.aL[c] = ureg_writemask(areg, 1 << (c & 3));
}
if (ureg_dst_is_undef(tx->regs.rL[l])) {
/* loop or rep ctr creation */
tx->regs.rL[l] = ureg_DECL_local_temporary(tx->ureg);
tx->loop_or_rep[l] = loop_or_rep;
}
/* loop - rep - endloop - endrep not allowed */
assert(tx->loop_or_rep[l] == loop_or_rep);
return tx->regs.rL[l];
}
static struct ureg_dst
tx_get_aL(struct shader_translator *tx)
static struct ureg_src
tx_get_loopal(struct shader_translator *tx)
{
if (!ureg_dst_is_undef(tx_get_loopctr(tx)))
return tx->regs.aL[tx->loop_depth - 1];
return ureg_dst_undef();
int loop_level = tx->loop_depth - 1;
while (loop_level >= 0) {
/* handle loop - rep - endrep - endloop case */
if (tx->loop_or_rep[loop_level])
/* the value is in the loop counter y component (nine implementation) */
return ureg_scalar(ureg_src(tx->regs.rL[loop_level]), TGSI_SWIZZLE_Y);
loop_level--;
}
DBG("aL counter requested outside of loop\n");
return ureg_src_undef();
}
static INLINE unsigned *
@@ -787,8 +791,12 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
case D3DSPR_ADDR:
assert(!param->rel);
if (IS_VS) {
tx_addr_alloc(tx, param->idx);
src = ureg_src(tx->regs.a);
assert(param->idx == 0);
/* the address register (vs only) must be
* assigned before use */
assert(!ureg_dst_is_undef(tx->regs.a0));
ureg_ARR(ureg, tx->regs.address, ureg_src(tx->regs.a0));
src = ureg_src(tx->regs.address);
} else {
if (tx->version.major < 2 && tx->version.minor < 4) {
/* no subroutines, so should be defined */
@@ -827,6 +835,7 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
src = ureg_src_register(TGSI_FILE_SAMPLER, param->idx);
break;
case D3DSPR_CONST:
assert(!param->rel || IS_VS);
if (param->rel)
tx->indirect_const_access = TRUE;
if (param->rel || !tx_lconstf(tx, &src, param->idx)) {
@@ -834,6 +843,13 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
nine_info_mark_const_f_used(tx->info, param->idx);
src = ureg_src_register(TGSI_FILE_CONSTANT, param->idx);
}
if (!IS_VS && tx->version.major < 2) {
/* ps 1.X clamps constants */
tmp = tx_scratch(tx);
ureg_MIN(ureg, tmp, src, ureg_imm1f(ureg, 1.0f));
ureg_MAX(ureg, tmp, ureg_src(tmp), ureg_imm1f(ureg, -1.0f));
src = ureg_src(tmp);
}
break;
case D3DSPR_CONST2:
case D3DSPR_CONST3:
@@ -843,26 +859,33 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
src = ureg_imm1f(ureg, 0.0f);
break;
case D3DSPR_CONSTINT:
if (param->rel || !tx_lconsti(tx, &src, param->idx)) {
if (!param->rel)
nine_info_mark_const_i_used(tx->info, param->idx);
/* relative addressing is only possible for float constants in vs */
assert(!param->rel);
if (!tx_lconsti(tx, &src, param->idx)) {
nine_info_mark_const_i_used(tx->info, param->idx);
src = ureg_src_register(TGSI_FILE_CONSTANT,
tx->info->const_i_base + param->idx);
}
break;
case D3DSPR_CONSTBOOL:
if (param->rel || !tx_lconstb(tx, &src, param->idx)) {
assert(!param->rel);
if (!tx_lconstb(tx, &src, param->idx)) {
char r = param->idx / 4;
char s = param->idx & 3;
if (!param->rel)
nine_info_mark_const_b_used(tx->info, param->idx);
nine_info_mark_const_b_used(tx->info, param->idx);
src = ureg_src_register(TGSI_FILE_CONSTANT,
tx->info->const_b_base + r);
src = ureg_swizzle(src, s, s, s, s);
}
break;
case D3DSPR_LOOP:
src = tx_src_scalar(tx_get_aL(tx));
if (ureg_dst_is_undef(tx->regs.address))
tx->regs.address = ureg_DECL_address(ureg);
if (!tx->native_integers)
ureg_ARR(ureg, tx->regs.address, tx_get_loopal(tx));
else
ureg_UARL(ureg, tx->regs.address, tx_get_loopal(tx));
src = ureg_src(tx->regs.address);
break;
case D3DSPR_MISCTYPE:
switch (param->idx) {
@@ -904,6 +927,25 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
if (param->rel)
src = ureg_src_indirect(src, tx_src_param(tx, param->rel));
switch (param->mod) {
case NINED3DSPSM_DW:
tmp = tx_scratch(tx);
/* NOTE: app is not allowed to read w with this modifier */
ureg_RCP(ureg, ureg_writemask(tmp, NINED3DSP_WRITEMASK_3), src);
ureg_MUL(ureg, tmp, src, ureg_swizzle(ureg_src(tmp), NINE_SWIZZLE4(W,W,W,W)));
src = ureg_src(tmp);
break;
case NINED3DSPSM_DZ:
tmp = tx_scratch(tx);
/* NOTE: app is not allowed to read z with this modifier */
ureg_RCP(ureg, ureg_writemask(tmp, NINED3DSP_WRITEMASK_2), src);
ureg_MUL(ureg, tmp, src, ureg_swizzle(ureg_src(tmp), NINE_SWIZZLE4(Z,Z,Z,Z)));
src = ureg_src(tmp);
break;
default:
break;
}
if (param->swizzle != NINED3DSP_NOSWIZZLE)
src = ureg_swizzle(src,
(param->swizzle >> 0) & 0x3,
@@ -946,7 +988,7 @@ tx_src_param(struct shader_translator *tx, const struct sm1_src_param *param)
break;
case NINED3DSPSM_DZ:
case NINED3DSPSM_DW:
/* handled in instruction */
/* Already handled */
break;
case NINED3DSPSM_SIGN:
tmp = tx_scratch(tx);
@@ -1001,7 +1043,7 @@ _tx_dst_param(struct shader_translator *tx, const struct sm1_dst_param *param)
dst = ureg_dst(tx->regs.vT[param->idx]);
} else {
tx_addr_alloc(tx, param->idx);
dst = tx->regs.a;
dst = tx->regs.a0;
}
break;
case D3DSPR_RASTOUT:
@@ -1016,13 +1058,13 @@ _tx_dst_param(struct shader_translator *tx, const struct sm1_dst_param *param)
case 1:
if (ureg_dst_is_undef(tx->regs.oFog))
tx->regs.oFog =
ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_FOG, 0);
ureg_saturate(ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_FOG, 0));
dst = tx->regs.oFog;
break;
case 2:
if (ureg_dst_is_undef(tx->regs.oPts))
tx->regs.oPts =
ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_PSIZE, 0);
ureg_saturate(ureg_DECL_output(tx->ureg, TGSI_SEMANTIC_PSIZE, 0));
dst = tx->regs.oPts;
break;
default:
@@ -1163,16 +1205,19 @@ NineTranslateInstruction_Mkxn(struct shader_translator *tx, const unsigned k, co
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst;
struct ureg_src src[2];
struct sm1_src_param *src_mat = &tx->insn.src[1];
unsigned i;
dst = tx_dst_param(tx, &tx->insn.dst[0]);
src[0] = tx_src_param(tx, &tx->insn.src[0]);
src[1] = tx_src_param(tx, &tx->insn.src[1]);
for (i = 0; i < n; i++, src[1].Index++)
for (i = 0; i < n; i++)
{
const unsigned m = (1 << i);
src[1] = tx_src_param(tx, src_mat);
src_mat->idx++;
if (!(dst.WriteMask & m))
continue;
@@ -1329,7 +1374,7 @@ NineTranslateInstruction_Generic(struct shader_translator *);
DECL_SPECIAL(M4x4)
{
return NineTranslateInstruction_Mkxn(tx, 4, 3);
return NineTranslateInstruction_Mkxn(tx, 4, 4);
}
DECL_SPECIAL(M4x3)
@@ -1367,33 +1412,29 @@ DECL_SPECIAL(CND)
struct ureg_dst cgt;
struct ureg_src cnd;
if (tx->insn.coissue && tx->version.major == 1 && tx->version.minor < 4) {
/* The coissue flag was a hint advising that two operations
 * be executed at the same time, in cases where the two
 * operations wrote different channels of the same dst.
 * It has no effect on current hw. However, it seems CND
 * is affected. The handling of this very specific case
 * below mimics wine's behaviour. */
if (tx->insn.coissue && tx->version.major == 1 && tx->version.minor < 4 && tx->insn.dst[0].mask != NINED3DSP_WRITEMASK_3) {
ureg_MOV(tx->ureg,
dst, tx_src_param(tx, &tx->insn.src[1]));
return D3D_OK;
}
cnd = tx_src_param(tx, &tx->insn.src[0]);
#ifdef NINE_TGSI_LAZY_R600
cgt = tx_scratch(tx);
if (tx->version.major == 1 && tx->version.minor < 4) {
cgt.WriteMask = TGSI_WRITEMASK_W;
ureg_SGT(tx->ureg, cgt, cnd, ureg_imm1f(tx->ureg, 0.5f));
cnd = ureg_scalar(cnd, TGSI_SWIZZLE_W);
} else {
ureg_SGT(tx->ureg, cgt, cnd, ureg_imm1f(tx->ureg, 0.5f));
}
ureg_CMP(tx->ureg, dst,
tx_src_param(tx, &tx->insn.src[1]),
tx_src_param(tx, &tx->insn.src[2]), ureg_negate(cnd));
#else
if (tx->version.major == 1 && tx->version.minor < 4)
cnd = ureg_scalar(cnd, TGSI_SWIZZLE_W);
ureg_CND(tx->ureg, dst,
ureg_SGT(tx->ureg, cgt, cnd, ureg_imm1f(tx->ureg, 0.5f));
ureg_CMP(tx->ureg, dst, ureg_negate(ureg_src(cgt)),
tx_src_param(tx, &tx->insn.src[1]),
tx_src_param(tx, &tx->insn.src[2]), cnd);
#endif
tx_src_param(tx, &tx->insn.src[2]));
return D3D_OK;
}
@@ -1427,9 +1468,17 @@ DECL_SPECIAL(CALLNZ)
DECL_SPECIAL(MOV_vs1x)
{
if (tx->insn.dst[0].file == D3DSPR_ADDR) {
ureg_ARL(tx->ureg,
/* Implementation note: We don't write directly
* to the addr register, but to an intermediate
* float register.
* Contrary to the doc, when writing to ADDR here,
* the rounding is not to nearest, but to lowest
* (wine test).
* Since we use ARR next, subtract 0.5. */
ureg_SUB(tx->ureg,
tx_dst_param(tx, &tx->insn.dst[0]),
tx_src_param(tx, &tx->insn.src[0]));
tx_src_param(tx, &tx->insn.src[0]),
ureg_imm1f(tx->ureg, 0.5f));
return D3D_OK;
}
return NineTranslateInstruction_Generic(tx);
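The comment in the hunk above relies on the identity that rounding `x - 0.5` to nearest gives the same result as rounding `x` down. A minimal sketch of that arithmetic, assuming ARR breaks ties upward (the tie-breaking rule is an assumption here, not something the patch states); `round_nearest`, `mov_to_a0` and `read_a0` are hypothetical names:

```c
/* Floor without libm, valid for values well inside the int range. */
static int floor_to_int(float v)
{
    int i = (int)v;              /* truncates toward zero */
    return (v < (float)i) ? i - 1 : i;
}

/* Assumed model of ARR: round to nearest, ties upward. */
static int round_nearest(float v)
{
    return floor_to_int(v + 0.5f);
}

/* What the patched MOV stores in the intermediate float register. */
static float mov_to_a0(float src)
{
    return src - 0.5f;
}

/* What a later read of the address register yields after the ARR. */
static int read_a0(float a0)
{
    return round_nearest(a0);
}
```

Under a different tie-breaking rule, or for inputs small enough that `src - 0.5f` loses precision, exact integers may land on the other side; the sketch does not cover those cases.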
@@ -1440,46 +1489,36 @@ DECL_SPECIAL(LOOP)
struct ureg_program *ureg = tx->ureg;
unsigned *label;
struct ureg_src src = tx_src_param(tx, &tx->insn.src[1]);
struct ureg_src iter = ureg_scalar(src, TGSI_SWIZZLE_X);
struct ureg_src init = ureg_scalar(src, TGSI_SWIZZLE_Y);
struct ureg_src step = ureg_scalar(src, TGSI_SWIZZLE_Z);
struct ureg_dst ctr;
struct ureg_dst tmp = tx_scratch_scalar(tx);
struct ureg_dst tmp;
struct ureg_src ctrx;
label = tx_bgnloop(tx);
ctr = tx_get_loopctr(tx);
ctr = tx_get_loopctr(tx, TRUE);
ctrx = ureg_scalar(ureg_src(ctr), TGSI_SWIZZLE_X);
ureg_MOV(tx->ureg, ctr, init);
/* src: num_iterations - start_value of al - step for al - 0 */
ureg_MOV(ureg, ctr, src);
ureg_BGNLOOP(tx->ureg, label);
if (tx->native_integers) {
/* we'll let the backend pull up that MAD ... */
ureg_UMAD(ureg, tmp, iter, step, init);
ureg_USEQ(ureg, tmp, ureg_src(ctr), tx_src_scalar(tmp));
#ifdef NINE_TGSI_LAZY_DEVS
ureg_UIF(ureg, tx_src_scalar(tmp), tx_cond(tx));
#endif
} else {
/* can't simply use SGE for precision because step might be negative */
ureg_MAD(ureg, tmp, iter, step, init);
ureg_SEQ(ureg, tmp, ureg_src(ctr), tx_src_scalar(tmp));
#ifdef NINE_TGSI_LAZY_DEVS
tmp = tx_scratch_scalar(tx);
/* Initially ctr.x contains the number of iterations.
* ctr.y will contain the updated value of al.
* We decrease ctr.x at the end of every iteration,
* and stop when it reaches 0. */
if (!tx->native_integers) {
/* case src and ctr contain floats */
/* to avoid precision issues, we stop when ctr <= 0.5 */
ureg_SGE(ureg, tmp, ureg_imm1f(ureg, 0.5f), ctrx);
ureg_IF(ureg, tx_src_scalar(tmp), tx_cond(tx));
#endif
} else {
/* case src and ctr contain integers */
ureg_ISGE(ureg, tmp, ureg_imm1i(ureg, 0), ctrx);
ureg_UIF(ureg, tx_src_scalar(tmp), tx_cond(tx));
}
#ifdef NINE_TGSI_LAZY_DEVS
ureg_BRK(ureg);
tx_endcond(tx);
ureg_ENDIF(ureg);
#else
ureg_BREAKC(ureg, tx_src_scalar(tmp));
#endif
if (tx->native_integers) {
ureg_UARL(ureg, tx_get_aL(tx), tx_src_scalar(ctr));
ureg_UADD(ureg, ctr, tx_src_scalar(ctr), step);
} else {
ureg_ARL(ureg, tx_get_aL(tx), tx_src_scalar(ctr));
ureg_ADD(ureg, ctr, tx_src_scalar(ctr), step);
}
return D3D_OK;
}
@@ -1491,6 +1530,25 @@ DECL_SPECIAL(RET)
DECL_SPECIAL(ENDLOOP)
{
struct ureg_program *ureg = tx->ureg;
struct ureg_dst ctr = tx_get_loopctr(tx, TRUE);
struct ureg_dst dst_ctrx, dst_al;
struct ureg_src src_ctr, al_counter;
dst_ctrx = ureg_writemask(ctr, NINED3DSP_WRITEMASK_0);
dst_al = ureg_writemask(ctr, NINED3DSP_WRITEMASK_1);
src_ctr = ureg_src(ctr);
al_counter = ureg_scalar(src_ctr, TGSI_SWIZZLE_Z);
/* ctr.x -= 1
* ctr.y (aL) += step */
if (!tx->native_integers) {
ureg_ADD(ureg, dst_ctrx, src_ctr, ureg_imm1f(ureg, -1.0f));
ureg_ADD(ureg, dst_al, src_ctr, al_counter);
} else {
ureg_UADD(ureg, dst_ctrx, src_ctr, ureg_imm1i(ureg, -1));
ureg_UADD(ureg, dst_al, src_ctr, al_counter);
}
ureg_ENDLOOP(tx->ureg, tx_endloop(tx));
return D3D_OK;
}
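Taken together, the LOOP and ENDLOOP hunks above keep the remaining iteration count in `ctr.x`, the current aL value in `ctr.y` and the step in `ctr.z`. A minimal sketch of the resulting control flow for the integer case, written as plain C rather than TGSI; `struct loop_ctr` and `run_loop` are hypothetical names for illustration:

```c
/* Hypothetical mirror of the loop counter register layout. */
struct loop_ctr {
    int x; /* iterations left */
    int y; /* current aL value */
    int z; /* aL step */
};

/* Runs the loop and records the aL value seen by each iteration.
 * Returns the number of iterations executed; only the first
 * max_out aL values are stored in al_out. */
static int run_loop(int iterations, int start, int step,
                    int *al_out, int max_out)
{
    struct loop_ctr ctr = { iterations, start, step };
    int n = 0;

    for (;;) {
        if (0 >= ctr.x)        /* LOOP: break once the count is used up */
            break;
        if (n < max_out)
            al_out[n] = ctr.y; /* the body reads aL from ctr.y */
        n++;
        ctr.x -= 1;            /* ENDLOOP: ctr.x -= 1 */
        ctr.y += ctr.z;        /* ENDLOOP: ctr.y (aL) += step */
    }
    return n;
}
```

Counting down to zero avoids the old equality test against `iter * step + init`, so a negative or zero step no longer affects termination.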
@@ -1540,7 +1598,7 @@ DECL_SPECIAL(REP)
tx->native_integers ? ureg_imm1u(ureg, 0) : ureg_imm1f(ureg, 0.0f);
label = tx_bgnloop(tx);
ctr = tx_get_loopctr(tx);
ctr = tx_get_loopctr(tx, FALSE);
/* NOTE: rep must be constant, so we don't have to save the count */
assert(rep.File == TGSI_FILE_CONSTANT || rep.File == TGSI_FILE_IMMEDIATE);
@@ -1550,24 +1608,16 @@ DECL_SPECIAL(REP)
if (tx->native_integers)
{
ureg_USGE(ureg, tmp, tx_src_scalar(ctr), rep);
#ifdef NINE_TGSI_LAZY_DEVS
ureg_UIF(ureg, tx_src_scalar(tmp), tx_cond(tx));
#endif
}
else
{
ureg_SGE(ureg, tmp, tx_src_scalar(ctr), rep);
#ifdef NINE_TGSI_LAZY_DEVS
ureg_IF(ureg, tx_src_scalar(tmp), tx_cond(tx));
#endif
}
#ifdef NINE_TGSI_LAZY_DEVS
ureg_BRK(ureg);
tx_endcond(tx);
ureg_ENDIF(ureg);
#else
ureg_BREAKC(ureg, tx_src_scalar(tmp));
#endif
if (tx->native_integers) {
ureg_UADD(ureg, ctr, tx_src_scalar(ctr), ureg_imm1u(ureg, 1));
@@ -1645,14 +1695,10 @@ DECL_SPECIAL(BREAKC)
src[0] = tx_src_param(tx, &tx->insn.src[0]);
src[1] = tx_src_param(tx, &tx->insn.src[1]);
ureg_insn(tx->ureg, cmp_op, &tmp, 1, src, 2);
#ifdef NINE_TGSI_LAZY_DEVS
ureg_IF(tx->ureg, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), tx_cond(tx));
ureg_BRK(tx->ureg);
tx_endcond(tx);
ureg_ENDIF(tx->ureg);
#else
ureg_BREAKC(tx->ureg, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
#endif
return D3D_OK;
}
@@ -1958,21 +2004,55 @@ DECL_SPECIAL(DEFI)
return D3D_OK;
}
DECL_SPECIAL(POW)
{
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src src[2] = {
tx_src_param(tx, &tx->insn.src[0]),
tx_src_param(tx, &tx->insn.src[1])
};
ureg_POW(tx->ureg, dst, ureg_abs(src[0]), src[1]);
return D3D_OK;
}
DECL_SPECIAL(RSQ)
{
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
struct ureg_dst tmp = tx_scratch(tx);
ureg_RSQ(ureg, tmp, ureg_abs(src));
ureg_MIN(ureg, dst, ureg_imm1f(ureg, FLT_MAX), ureg_src(tmp));
return D3D_OK;
}
DECL_SPECIAL(LOG)
{
struct ureg_program *ureg = tx->ureg;
struct ureg_dst tmp = tx_scratch_scalar(tx);
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
ureg_LG2(ureg, tmp, ureg_abs(src));
ureg_MAX(ureg, dst, ureg_imm1f(ureg, -FLT_MAX), tx_src_scalar(tmp));
return D3D_OK;
}
DECL_SPECIAL(NRM)
{
struct ureg_program *ureg = tx->ureg;
struct ureg_dst tmp = tx_scratch_scalar(tx);
struct ureg_src nrm = tx_src_scalar(tmp);
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
ureg_DP3(ureg, tmp, src, src);
ureg_RSQ(ureg, tmp, nrm);
ureg_MUL(ureg, tx_dst_param(tx, &tx->insn.dst[0]), src, nrm);
ureg_MIN(ureg, tmp, ureg_imm1f(ureg, FLT_MAX), nrm);
ureg_MUL(ureg, dst, src, nrm);
return D3D_OK;
}
DECL_SPECIAL(DP2ADD)
{
#ifdef NINE_TGSI_LAZY_R600
struct ureg_dst tmp = tx_scratch_scalar(tx);
struct ureg_src dp2 = tx_src_scalar(tmp);
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
@@ -1986,9 +2066,6 @@ DECL_SPECIAL(DP2ADD)
ureg_ADD(tx->ureg, dst, src[2], dp2);
return D3D_OK;
#else
return NineTranslateInstruction_Generic(tx);
#endif
}
DECL_SPECIAL(TEXCOORD)
@@ -1997,9 +2074,9 @@ DECL_SPECIAL(TEXCOORD)
const unsigned s = tx->insn.dst[0].idx;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
if (ureg_src_is_undef(tx->regs.vT[s]))
tx->regs.vT[s] = ureg_DECL_fs_input(ureg, tx->texcoord_sn, s, TGSI_INTERPOLATE_PERSPECTIVE);
ureg_MOV(ureg, dst, tx->regs.vT[s]); /* XXX is this sufficient ? */
tx_texcoord_alloc(tx, s);
ureg_MOV(ureg, ureg_writemask(ureg_saturate(dst), TGSI_WRITEMASK_XYZ), tx->regs.vT[s]);
ureg_MOV(ureg, ureg_writemask(dst, TGSI_WRITEMASK_W), ureg_imm1f(tx->ureg, 1.0f));
return D3D_OK;
}
@@ -2007,12 +2084,12 @@ DECL_SPECIAL(TEXCOORD)
DECL_SPECIAL(TEXCOORD_ps14)
{
struct ureg_program *ureg = tx->ureg;
const unsigned s = tx->insn.src[0].idx;
struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
if (ureg_src_is_undef(tx->regs.vT[s]))
tx->regs.vT[s] = ureg_DECL_fs_input(ureg, tx->texcoord_sn, s, TGSI_INTERPOLATE_PERSPECTIVE);
ureg_MOV(ureg, dst, tx->regs.vT[s]); /* XXX is this sufficient ? */
assert(tx->insn.src[0].file == D3DSPR_TEXTURE);
ureg_MOV(ureg, dst, src);
return D3D_OK;
}
@@ -2046,22 +2123,62 @@ DECL_SPECIAL(TEXBEML)
DECL_SPECIAL(TEXREG2AR)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src sample;
const int m = tx->insn.dst[0].idx;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
sample = ureg_DECL_sampler(ureg, m);
tx->info->sampler_mask |= 1 << m;
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m), ureg_swizzle(ureg_src(tx->regs.tS[n]), NINE_SWIZZLE4(W,X,X,X)), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXREG2GB)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src sample;
const int m = tx->insn.dst[0].idx;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
sample = ureg_DECL_sampler(ureg, m);
tx->info->sampler_mask |= 1 << m;
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m), ureg_swizzle(ureg_src(tx->regs.tS[n]), NINE_SWIZZLE4(Y,Z,Z,Z)), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXM3x2PAD)
{
STUB(D3DERR_INVALIDCALL);
return D3D_OK; /* this is just padding */
}
DECL_SPECIAL(TEXM3x2TEX)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src sample;
const int m = tx->insn.dst[0].idx - 1;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
tx_texcoord_alloc(tx, m);
tx_texcoord_alloc(tx, m+1);
/* performs the matrix multiplication */
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Y), tx->regs.vT[m+1], ureg_src(tx->regs.tS[n]));
sample = ureg_DECL_sampler(ureg, m + 1);
tx->info->sampler_mask |= 1 << (m + 1);
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 1), ureg_src(dst), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXM3x3PAD)
@@ -2071,61 +2188,180 @@ DECL_SPECIAL(TEXM3x3PAD)
DECL_SPECIAL(TEXM3x3SPEC)
{
STUB(D3DERR_INVALIDCALL);
}
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src E = tx_src_param(tx, &tx->insn.src[1]);
struct ureg_src sample;
struct ureg_dst tmp;
const int m = tx->insn.dst[0].idx - 2;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
DECL_SPECIAL(TEXM3x3VSPEC)
{
STUB(D3DERR_INVALIDCALL);
tx_texcoord_alloc(tx, m);
tx_texcoord_alloc(tx, m+1);
tx_texcoord_alloc(tx, m+2);
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Y), tx->regs.vT[m+1], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Z), tx->regs.vT[m+2], ureg_src(tx->regs.tS[n]));
sample = ureg_DECL_sampler(ureg, m + 2);
tx->info->sampler_mask |= 1 << (m + 2);
tmp = ureg_writemask(tx_scratch(tx), TGSI_WRITEMASK_XYZ);
/* At this step, dst = N = (u', w', z').
* We want dst to be the texture sampled at (u'', w'', z''), with
* (u'', w'', z'') = 2 * (N.E / N.N) * N - E */
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_src(dst), ureg_src(dst));
ureg_RCP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
/* at this step tmp.x = 1/N.N */
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Y), ureg_src(dst), E);
/* at this step tmp.y = N.E */
ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Y));
/* at this step tmp.x = N.E/N.N */
ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_imm1f(ureg, 2.0f));
ureg_MUL(ureg, tmp, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_src(dst));
/* at this step tmp.xyz = 2 * (N.E / N.N) * N */
ureg_SUB(ureg, tmp, ureg_src(tmp), E);
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(tmp), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXREG2RGB)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src sample;
const int m = tx->insn.dst[0].idx;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
sample = ureg_DECL_sampler(ureg, m);
tx->info->sampler_mask |= 1 << m;
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m), ureg_src(tx->regs.tS[n]), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXDP3TEX)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_dst tmp;
struct ureg_src sample;
const int m = tx->insn.dst[0].idx;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
tx_texcoord_alloc(tx, m);
tmp = tx_scratch(tx);
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
ureg_MOV(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_YZ), ureg_imm1f(ureg, 0.0f));
sample = ureg_DECL_sampler(ureg, m);
tx->info->sampler_mask |= 1 << m;
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m), ureg_src(tmp), sample);
return D3D_OK;
}
DECL_SPECIAL(TEXM3x2DEPTH)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst tmp;
const int m = tx->insn.dst[0].idx - 1;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
tx_texcoord_alloc(tx, m);
tx_texcoord_alloc(tx, m+1);
tmp = tx_scratch(tx);
/* performs the matrix multiplication */
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Y), tx->regs.vT[m+1], ureg_src(tx->regs.tS[n]));
ureg_RCP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Z), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Y));
/* tmp.x = 'z', tmp.y = 'w', tmp.z = 1/'w'. */
ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Z));
/* res = 'w' == 0 ? 1.0 : z/w */
ureg_CMP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_negate(ureg_abs(ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Y))),
ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_imm1f(ureg, 1.0f));
/* replace the depth for depth testing with the result */
tx->regs.oDepth = ureg_DECL_output_masked(ureg, TGSI_SEMANTIC_POSITION, 0, TGSI_WRITEMASK_Z);
ureg_MOV(ureg, tx->regs.oDepth, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
/* note that we write nothing to the destination, since it's disallowed to use it afterward */
return D3D_OK;
}
DECL_SPECIAL(TEXDP3)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
const int m = tx->insn.dst[0].idx;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
tx_texcoord_alloc(tx, m);
ureg_DP3(ureg, dst, tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
return D3D_OK;
}
DECL_SPECIAL(TEXM3x3)
{
struct ureg_program *ureg = tx->ureg;
struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
struct ureg_src src[4];
int s;
struct ureg_src sample;
struct ureg_dst E, tmp;
const int m = tx->insn.dst[0].idx - 2;
const int n = tx->insn.src[0].idx;
assert(m >= 0 && m > n);
for (s = m; s <= (m + 2); ++s) {
if (ureg_src_is_undef(tx->regs.vT[s]))
tx->regs.vT[s] = ureg_DECL_fs_input(ureg, tx->texcoord_sn, s, TGSI_INTERPOLATE_PERSPECTIVE);
src[s] = tx->regs.vT[s];
}
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), src[0], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Y), src[1], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Z), src[2], ureg_src(tx->regs.tS[n]));
tx_texcoord_alloc(tx, m);
tx_texcoord_alloc(tx, m+1);
tx_texcoord_alloc(tx, m+2);
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_X), tx->regs.vT[m], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Y), tx->regs.vT[m+1], ureg_src(tx->regs.tS[n]));
ureg_DP3(ureg, ureg_writemask(dst, TGSI_WRITEMASK_Z), tx->regs.vT[m+2], ureg_src(tx->regs.tS[n]));
switch (tx->insn.opcode) {
case D3DSIO_TEXM3x3:
ureg_MOV(ureg, ureg_writemask(dst, TGSI_WRITEMASK_W), ureg_imm1f(ureg, 1.0f));
break;
case D3DSIO_TEXM3x3TEX:
src[3] = ureg_DECL_sampler(ureg, m + 2);
sample = ureg_DECL_sampler(ureg, m + 2);
tx->info->sampler_mask |= 1 << (m + 2);
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(dst), src[3]);
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(dst), sample);
break;
case D3DSIO_TEXM3x3VSPEC:
sample = ureg_DECL_sampler(ureg, m + 2);
tx->info->sampler_mask |= 1 << (m + 2);
E = tx_scratch(tx);
tmp = ureg_writemask(tx_scratch(tx), TGSI_WRITEMASK_XYZ);
ureg_MOV(ureg, ureg_writemask(E, TGSI_WRITEMASK_X), ureg_scalar(tx->regs.vT[m], TGSI_SWIZZLE_W));
ureg_MOV(ureg, ureg_writemask(E, TGSI_WRITEMASK_Y), ureg_scalar(tx->regs.vT[m+1], TGSI_SWIZZLE_W));
ureg_MOV(ureg, ureg_writemask(E, TGSI_WRITEMASK_Z), ureg_scalar(tx->regs.vT[m+2], TGSI_SWIZZLE_W));
/* At this step, dst = N = (u', w', z').
* We want dst to be the texture sampled at (u'', w'', z''), with
* (u'', w'', z'') = 2 * (N.E / N.N) * N - E */
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_src(dst), ureg_src(dst));
ureg_RCP(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X));
/* at this step tmp.x = 1/N.N */
ureg_DP3(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_Y), ureg_src(dst), ureg_src(E));
/* at this step tmp.y = N.E */
ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_Y));
/* at this step tmp.x = N.E/N.N */
ureg_MUL(ureg, ureg_writemask(tmp, TGSI_WRITEMASK_X), ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_imm1f(ureg, 2.0f));
ureg_MUL(ureg, tmp, ureg_scalar(ureg_src(tmp), TGSI_SWIZZLE_X), ureg_src(dst));
/* at this step tmp.xyz = 2 * (N.E / N.N) * N */
ureg_SUB(ureg, tmp, ureg_src(tmp), ureg_src(E));
ureg_TEX(ureg, dst, ps1x_sampler_type(tx->info, m + 2), ureg_src(tmp), sample);
break;
default:
return D3DERR_INVALIDCALL;
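The TEXM3x3SPEC and TEXM3x3VSPEC paths above both compute the lookup vector `2 * (N.E / N.N) * N - E` before sampling. A minimal scalar sketch of that formula, following the same order of operations as the TGSI sequence (DP3, RCP, DP3, MUL, MUL, MUL, SUB); `struct vec3`, `dot3` and `reflect_eye` are hypothetical names, and N is assumed to be non-zero:

```c
struct vec3 { float x, y, z; };

static float dot3(struct vec3 a, struct vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Returns 2 * (N.E / N.N) * N - E. N must be non-zero. */
static struct vec3 reflect_eye(struct vec3 n, struct vec3 e)
{
    float inv_nn = 1.0f / dot3(n, n);  /* tmp.x = 1 / N.N */
    float ne = dot3(n, e);             /* tmp.y = N.E */
    float k = (inv_nn * ne) * 2.0f;    /* tmp.x = 2 * N.E / N.N */
    struct vec3 r;

    r.x = k * n.x - e.x;
    r.y = k * n.y - e.y;
    r.z = k * n.z - e.z;
    return r;
}
```

Because the formula divides by `N.N`, N does not have to be normalised first.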
@@ -2135,7 +2371,28 @@ DECL_SPECIAL(TEXM3x3)
DECL_SPECIAL(TEXDEPTH)
{
STUB(D3DERR_INVALIDCALL);
struct ureg_program *ureg = tx->ureg;
struct ureg_dst r5;
struct ureg_src r5r, r5g;
assert(tx->insn.dst[0].idx == 5); /* instruction must get r5 here */
/* we must replace the depth by r5.g == 0 ? 1.0f : r5.r/r5.g.
* r5 won't be used afterward, thus we can use r5.ba */
r5 = tx->regs.r[5];
r5r = ureg_scalar(ureg_src(r5), TGSI_SWIZZLE_X);
r5g = ureg_scalar(ureg_src(r5), TGSI_SWIZZLE_Y);
ureg_RCP(ureg, ureg_writemask(r5, TGSI_WRITEMASK_Z), r5g);
ureg_MUL(ureg, ureg_writemask(r5, TGSI_WRITEMASK_X), r5r, ureg_scalar(ureg_src(r5), TGSI_SWIZZLE_Z));
/* r5.r = r/g */
ureg_CMP(ureg, ureg_writemask(r5, TGSI_WRITEMASK_X), ureg_negate(ureg_abs(r5g)),
r5r, ureg_imm1f(ureg, 1.0f));
/* replace the depth for depth testing with the result */
tx->regs.oDepth = ureg_DECL_output_masked(ureg, TGSI_SEMANTIC_POSITION, 0, TGSI_WRITEMASK_Z);
ureg_MOV(ureg, tx->regs.oDepth, r5r);
return D3D_OK;
}
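TEXDEPTH above, like TEXM3x2DEPTH earlier, replaces the fragment depth with `g == 0 ? 1.0 : r / g` and selects between the two values with a CMP on `-|g|`. A minimal sketch of that selection in plain C; `replaced_depth` is a hypothetical name, and unlike the TGSI sequence it only divides when the divisor is non-zero:

```c
/* CMP picks its second operand when the first is negative,
 * otherwise its third. -|g| is negative exactly when g != 0. */
static float replaced_depth(float r, float g)
{
    float neg_abs_g = (g < 0.0f) ? g : -g;

    return (neg_abs_g < 0.0f) ? r / g : 1.0f;
}
```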
DECL_SPECIAL(BEM)
@@ -2275,7 +2532,7 @@ struct sm1_op_info inst_table[] =
_OPI(MAD, MAD, V(0,0), V(3,0), V(0,0), V(3,0), 1, 3, NULL), /* 4 */
_OPI(MUL, MUL, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 5 */
_OPI(RCP, RCP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 6 */
_OPI(RSQ, RSQ, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 7 */
_OPI(RSQ, RSQ, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(RSQ)), /* 7 */
_OPI(DP3, DP3, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 8 */
_OPI(DP4, DP4, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 9 */
_OPI(MIN, MIN, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 10 */
@@ -2283,7 +2540,7 @@ struct sm1_op_info inst_table[] =
_OPI(SLT, SLT, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 12 */
_OPI(SGE, SGE, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 13 */
_OPI(EXP, EX2, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 14 */
_OPI(LOG, LG2, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 15 */
_OPI(LOG, LG2, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(LOG)), /* 15 */
_OPI(LIT, LIT, V(0,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL), /* 16 */
_OPI(DST, DST, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 17 */
_OPI(LRP, LRP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 3, NULL), /* 18 */
@@ -2295,16 +2552,16 @@ struct sm1_op_info inst_table[] =
_OPI(M3x3, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(M3x3)),
_OPI(M3x2, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(M3x2)),
_OPI(CALL, CAL, V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(CALL)),
_OPI(CALLNZ, CAL, V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(CALLNZ)),
_OPI(CALL, CAL, V(2,0), V(3,0), V(2,1), V(3,0), 0, 1, SPECIAL(CALL)),
_OPI(CALLNZ, CAL, V(2,0), V(3,0), V(2,1), V(3,0), 0, 2, SPECIAL(CALLNZ)),
_OPI(LOOP, BGNLOOP, V(2,0), V(3,0), V(3,0), V(3,0), 0, 2, SPECIAL(LOOP)),
_OPI(RET, RET, V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(RET)),
_OPI(ENDLOOP, ENDLOOP, V(2,0), V(3,0), V(3,0), V(3,0), 0, 0, SPECIAL(ENDLOOP)),
_OPI(LABEL, NOP, V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(LABEL)),
_OPI(LABEL, NOP, V(2,0), V(3,0), V(2,1), V(3,0), 0, 1, SPECIAL(LABEL)),
_OPI(DCL, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 0, 0, SPECIAL(DCL)),
_OPI(POW, POW, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL),
_OPI(POW, POW, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(POW)),
_OPI(CRS, XPD, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* XXX: .w */
_OPI(SGN, SSG, V(2,0), V(3,0), V(0,0), V(0,0), 1, 3, SPECIAL(SGN)), /* ignore src1,2 */
_OPI(ABS, ABS, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL),
@@ -2322,8 +2579,9 @@ struct sm1_op_info inst_table[] =
_OPI(ENDIF, ENDIF, V(2,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(ENDIF)),
_OPI(BREAK, BRK, V(2,1), V(3,0), V(2,1), V(3,0), 0, 0, NULL),
_OPI(BREAKC, BREAKC, V(2,1), V(3,0), V(2,1), V(3,0), 0, 2, SPECIAL(BREAKC)),
_OPI(MOVA, ARR, V(2,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
/* we don't write to the address register, but a normal register (copied
* when needed to the address register), thus we don't use ARR */
_OPI(MOVA, MOV, V(2,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
_OPI(DEFB, NOP, V(0,0), V(3,0) , V(0,0), V(3,0) , 1, 0, SPECIAL(DEFB)),
_OPI(DEFI, NOP, V(0,0), V(3,0) , V(0,0), V(3,0) , 1, 0, SPECIAL(DEFI)),
@@ -2334,42 +2592,42 @@ struct sm1_op_info inst_table[] =
_OPI(TEX, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 0, SPECIAL(TEX)),
_OPI(TEX, TEX, V(0,0), V(0,0), V(1,4), V(1,4), 1, 1, SPECIAL(TEXLD_14)),
_OPI(TEX, TEX, V(0,0), V(0,0), V(2,0), V(3,0), 1, 2, SPECIAL(TEXLD)),
_OPI(TEXBEM, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXBEM)),
_OPI(TEXBEML, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXBEML)),
_OPI(TEXREG2AR, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXREG2AR)),
_OPI(TEXREG2GB, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXREG2GB)),
_OPI(TEXM3x2PAD, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x2PAD)),
_OPI(TEXM3x2TEX, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x2TEX)),
_OPI(TEXM3x3PAD, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x3PAD)),
_OPI(TEXM3x3TEX, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x3)),
_OPI(TEXM3x3SPEC, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x3SPEC)),
_OPI(TEXM3x3VSPEC, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 0, 0, SPECIAL(TEXM3x3VSPEC)),
_OPI(TEXBEM, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXBEM)),
_OPI(TEXBEML, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXBEML)),
_OPI(TEXREG2AR, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXREG2AR)),
_OPI(TEXREG2GB, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXREG2GB)),
_OPI(TEXM3x2PAD, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x2PAD)),
_OPI(TEXM3x2TEX, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x2TEX)),
_OPI(TEXM3x3PAD, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x3PAD)),
_OPI(TEXM3x3TEX, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x3)),
_OPI(TEXM3x3SPEC, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 2, SPECIAL(TEXM3x3SPEC)),
_OPI(TEXM3x3VSPEC, TEX, V(0,0), V(0,0), V(0,0), V(1,3), 1, 1, SPECIAL(TEXM3x3)),
_OPI(EXPP, EXP, V(0,0), V(1,1), V(0,0), V(0,0), 1, 1, NULL),
_OPI(EXPP, EX2, V(2,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
_OPI(LOGP, LG2, V(0,0), V(3,0), V(0,0), V(0,0), 1, 1, NULL),
_OPI(CND, CND, V(0,0), V(0,0), V(0,0), V(1,4), 1, 3, SPECIAL(CND)),
_OPI(LOGP, LG2, V(0,0), V(3,0), V(0,0), V(0,0), 1, 1, SPECIAL(LOG)),
_OPI(CND, NOP, V(0,0), V(0,0), V(0,0), V(1,4), 1, 3, SPECIAL(CND)),
_OPI(DEF, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 0, SPECIAL(DEF)),
/* More tex stuff */
_OPI(TEXREG2RGB, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 0, 0, SPECIAL(TEXREG2RGB)),
_OPI(TEXDP3TEX, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 0, 0, SPECIAL(TEXDP3TEX)),
_OPI(TEXM3x2DEPTH, TEX, V(0,0), V(0,0), V(1,3), V(1,3), 0, 0, SPECIAL(TEXM3x2DEPTH)),
_OPI(TEXDP3, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 0, 0, SPECIAL(TEXDP3)),
_OPI(TEXM3x3, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 0, 0, SPECIAL(TEXM3x3)),
_OPI(TEXDEPTH, TEX, V(0,0), V(0,0), V(1,4), V(1,4), 0, 0, SPECIAL(TEXDEPTH)),
_OPI(TEXREG2RGB, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 1, 1, SPECIAL(TEXREG2RGB)),
_OPI(TEXDP3TEX, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 1, 1, SPECIAL(TEXDP3TEX)),
_OPI(TEXM3x2DEPTH, TEX, V(0,0), V(0,0), V(1,3), V(1,3), 1, 1, SPECIAL(TEXM3x2DEPTH)),
_OPI(TEXDP3, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 1, 1, SPECIAL(TEXDP3)),
_OPI(TEXM3x3, TEX, V(0,0), V(0,0), V(1,2), V(1,3), 1, 1, SPECIAL(TEXM3x3)),
_OPI(TEXDEPTH, TEX, V(0,0), V(0,0), V(1,4), V(1,4), 1, 0, SPECIAL(TEXDEPTH)),
/* Misc */
_OPI(CMP, CMP, V(0,0), V(0,0), V(1,2), V(3,0), 1, 3, SPECIAL(CMP)), /* reversed */
_OPI(BEM, NOP, V(0,0), V(0,0), V(1,4), V(1,4), 0, 0, SPECIAL(BEM)),
_OPI(DP2ADD, DP2A, V(0,0), V(0,0), V(2,0), V(3,0), 1, 3, SPECIAL(DP2ADD)), /* for radeons */
_OPI(BEM, NOP, V(0,0), V(0,0), V(1,4), V(1,4), 1, 2, SPECIAL(BEM)),
_OPI(DP2ADD, NOP, V(0,0), V(0,0), V(2,0), V(3,0), 1, 3, SPECIAL(DP2ADD)),
_OPI(DSX, DDX, V(0,0), V(0,0), V(2,1), V(3,0), 1, 1, NULL),
_OPI(DSY, DDY, V(0,0), V(0,0), V(2,1), V(3,0), 1, 1, NULL),
_OPI(TEXLDD, TXD, V(0,0), V(0,0), V(2,1), V(3,0), 1, 4, SPECIAL(TEXLDD)),
_OPI(SETP, NOP, V(0,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(SETP)),
_OPI(SETP, NOP, V(0,0), V(3,0), V(2,1), V(3,0), 1, 2, SPECIAL(SETP)),
_OPI(TEXLDL, TXL, V(3,0), V(3,0), V(3,0), V(3,0), 1, 2, SPECIAL(TEXLDL)),
_OPI(BREAKP, BRK, V(0,0), V(3,0), V(2,1), V(3,0), 0, 0, SPECIAL(BREAKP))
_OPI(BREAKP, BRK, V(0,0), V(3,0), V(2,1), V(3,0), 0, 1, SPECIAL(BREAKP))
};
struct sm1_op_info inst_phase =
@@ -2740,11 +2998,11 @@ tx_ctor(struct shader_translator *tx, struct nine_shader_info *info)
info->lconstf.data = NULL;
info->lconstf.ranges = NULL;
for (i = 0; i < Elements(tx->regs.aL); ++i) {
tx->regs.aL[i] = ureg_dst_undef();
for (i = 0; i < Elements(tx->regs.rL); ++i) {
tx->regs.rL[i] = ureg_dst_undef();
}
tx->regs.a = ureg_dst_undef();
tx->regs.address = ureg_dst_undef();
tx->regs.a0 = ureg_dst_undef();
tx->regs.p = ureg_dst_undef();
tx->regs.oDepth = ureg_dst_undef();
tx->regs.vPos = ureg_src_undef();
@@ -2852,9 +3110,6 @@ nine_translate_shader(struct NineDevice9 *device, struct nine_shader_info *info)
ureg_property_fs_coord_pixel_center(tx->ureg, TGSI_FS_COORD_PIXEL_CENTER_INTEGER);
}
if (!ureg_dst_is_undef(tx->regs.oPts))
info->point_size = TRUE;
while (!sm1_parse_eof(tx))
sm1_parse_instruction(tx);
tx->parse++; /* for byte_size */
@@ -2870,6 +3125,9 @@ nine_translate_shader(struct NineDevice9 *device, struct nine_shader_info *info)
ureg_END(tx->ureg);
if (IS_VS && !ureg_dst_is_undef(tx->regs.oPts))
info->point_size = TRUE;
if (debug_get_bool_option("NINE_TGSI_DUMP", FALSE)) {
unsigned count;
const struct tgsi_token *toks = ureg_get_tokens(tx->ureg, &count);

@@ -347,14 +347,13 @@ update_constants(struct NineDevice9 *device, unsigned shader_type)
const int *const_i;
const BOOL *const_b;
uint32_t data_b[NINE_MAX_CONST_B];
uint32_t b_true;
uint16_t dirty_i;
uint16_t dirty_b;
const unsigned usage = PIPE_TRANSFER_WRITE | PIPE_TRANSFER_DISCARD_RANGE;
unsigned x = 0; /* silence warning */
unsigned i, c, n;
const struct nine_lconstf *lconstf;
struct nine_range *r, *p;
unsigned i, c;
struct nine_range *r, *p, *lconstf_ranges;
float *lconstf_data;
box.y = 0;
box.z = 0;
@@ -381,9 +380,10 @@ update_constants(struct NineDevice9 *device, unsigned shader_type)
dirty_b = device->state.changed.vs_const_b;
device->state.changed.vs_const_b = 0;
const_b = device->state.vs_const_b;
b_true = device->vs_bool_true;
lconstf = &device->state.vs->lconstf;
lconstf_ranges = device->state.vs->lconstf.ranges;
lconstf_data = device->state.vs->lconstf.data;
device->state.ff.clobber.vs_const = TRUE;
device->state.changed.group &= ~NINE_STATE_VS_CONST;
} else {
@@ -406,9 +406,10 @@ update_constants(struct NineDevice9 *device, unsigned shader_type)
dirty_b = device->state.changed.ps_const_b;
device->state.changed.ps_const_b = 0;
const_b = device->state.ps_const_b;
b_true = device->ps_bool_true;
lconstf = &device->state.ps->lconstf;
lconstf_ranges = NULL;
lconstf_data = NULL;
device->state.ff.clobber.ps_const = TRUE;
device->state.changed.group &= ~NINE_STATE_PS_CONST;
}
@@ -420,11 +421,10 @@ update_constants(struct NineDevice9 *device, unsigned shader_type)
i = ffs(dirty_b) - 1;
x = buf->width0 - (NINE_MAX_CONST_B - i) * 4;
c -= i;
for (n = 0; n < c; ++n, ++i)
data_b[n] = const_b[i] ? b_true : 0;
memcpy(data_b, &(const_b[i]), c * sizeof(uint32_t));
box.x = x;
box.width = n * 4;
DBG("upload ConstantB [%u .. %u]\n", x, x + n - 1);
box.width = c * 4;
DBG("upload ConstantB [%u .. %u]\n", x, x + c - 1);
pipe->transfer_inline_write(pipe, buf, 0, usage, &box, data_b, 0, 0);
}
@@ -455,14 +455,14 @@ update_constants(struct NineDevice9 *device, unsigned shader_type)
}
/* TODO: only upload these when shader itself changes */
if (lconstf->ranges) {
if (lconstf_ranges) {
unsigned n = 0;
struct nine_range *r = lconstf->ranges;
struct nine_range *r = lconstf_ranges;
while (r) {
box.x = r->bgn * 4 * sizeof(float);
n += r->end - r->bgn;
box.width = (r->end - r->bgn) * 4 * sizeof(float);
data = &lconstf->data[4 * n];
data = &lconstf_data[4 * n];
pipe->transfer_inline_write(pipe, buf, 0, usage, &box, data, 0, 0);
r = r->next;
}
@@ -491,19 +491,16 @@ update_vs_constants_userbuf(struct NineDevice9 *device)
if (state->changed.vs_const_b) {
int *idst = (int *)&state->vs_const_f[4 * device->max_vs_const_f];
uint32_t *bdst = (uint32_t *)&idst[4 * NINE_MAX_CONST_I];
int i;
for (i = 0; i < NINE_MAX_CONST_B; ++i)
bdst[i] = state->vs_const_b[i] ? device->vs_bool_true : 0;
memcpy(bdst, state->vs_const_b, sizeof(state->vs_const_b));
state->changed.vs_const_b = 0;
}
#ifdef DEBUG
if (device->state.vs->lconstf.ranges) {
/* TODO: Can we make it so that we don't have to copy everything ? */
const struct nine_lconstf *lconstf = &device->state.vs->lconstf;
const struct nine_range *r = lconstf->ranges;
unsigned n = 0;
float *dst = (float *)MALLOC(cb.buffer_size);
float *dst = device->state.vs_lconstf_temp;
float *src = (float *)cb.user_buffer;
memcpy(dst, src, cb.buffer_size);
while (r) {
@@ -515,15 +512,9 @@ update_vs_constants_userbuf(struct NineDevice9 *device)
}
cb.user_buffer = dst;
}
#endif
pipe->set_constant_buffer(pipe, PIPE_SHADER_VERTEX, 0, &cb);
#ifdef DEBUG
if (device->state.vs->lconstf.ranges)
FREE((void *)cb.user_buffer);
#endif
if (device->state.changed.vs_const_f) {
struct nine_range *r = device->state.changed.vs_const_f;
struct nine_range *p = r;
@@ -557,39 +548,12 @@ update_ps_constants_userbuf(struct NineDevice9 *device)
if (state->changed.ps_const_b) {
int *idst = (int *)&state->ps_const_f[4 * device->max_ps_const_f];
uint32_t *bdst = (uint32_t *)&idst[4 * NINE_MAX_CONST_I];
int i;
for (i = 0; i < NINE_MAX_CONST_B; ++i)
bdst[i] = state->ps_const_b[i] ? device->ps_bool_true : 0;
memcpy(bdst, state->ps_const_b, sizeof(state->ps_const_b));
state->changed.ps_const_b = 0;
}
#ifdef DEBUG
if (device->state.ps->lconstf.ranges) {
/* TODO: Can we make it so that we don't have to copy everything ? */
const struct nine_lconstf *lconstf = &device->state.ps->lconstf;
const struct nine_range *r = lconstf->ranges;
unsigned n = 0;
float *dst = (float *)MALLOC(cb.buffer_size);
float *src = (float *)cb.user_buffer;
memcpy(dst, src, cb.buffer_size);
while (r) {
unsigned p = r->bgn;
unsigned c = r->end - r->bgn;
memcpy(&dst[p * 4], &lconstf->data[n * 4], c * 4 * sizeof(float));
n += c;
r = r->next;
}
cb.user_buffer = dst;
}
#endif
pipe->set_constant_buffer(pipe, PIPE_SHADER_FRAGMENT, 0, &cb);
#ifdef DEBUG
if (device->state.ps->lconstf.ranges)
FREE((void *)cb.user_buffer);
#endif
if (device->state.changed.ps_const_f) {
struct nine_range *r = device->state.changed.ps_const_f;
struct nine_range *p = r;
@@ -1030,9 +994,10 @@ static const DWORD nine_samp_state_defaults[NINED3DSAMP_LAST + 1] =
[NINED3DSAMP_SHADOW] = 0
};
void
nine_state_set_defaults(struct nine_state *state, const D3DCAPS9 *caps,
nine_state_set_defaults(struct NineDevice9 *device, const D3DCAPS9 *caps,
boolean is_reset)
{
struct nine_state *state = &device->state;
unsigned s;
/* Initialize defaults.
@@ -1053,9 +1018,9 @@ nine_state_set_defaults(struct nine_state *state, const D3DCAPS9 *caps,
}
if (state->vs_const_f)
memset(state->vs_const_f, 0, NINE_MAX_CONST_F * 4 * sizeof(float));
memset(state->vs_const_f, 0, device->vs_const_size);
if (state->ps_const_f)
memset(state->ps_const_f, 0, NINE_MAX_CONST_F * 4 * sizeof(float));
memset(state->ps_const_f, 0, device->ps_const_size);
/* Cap dependent initial state:
*/

@@ -144,6 +144,7 @@ struct nine_state
float *vs_const_f;
int vs_const_i[NINE_MAX_CONST_I][4];
BOOL vs_const_b[NINE_MAX_CONST_B];
float *vs_lconstf_temp;
uint32_t vs_key;
struct NinePixelShader9 *ps;
@@ -218,7 +219,7 @@ struct NineDevice9;
boolean nine_update_state(struct NineDevice9 *, uint32_t group_mask);
void nine_state_set_defaults(struct nine_state *, const D3DCAPS9 *,
void nine_state_set_defaults(struct NineDevice9 *, const D3DCAPS9 *,
boolean is_reset);
void nine_state_clear(struct nine_state *, const boolean device);

@@ -72,9 +72,10 @@ NinePixelShader9_ctor( struct NinePixelShader9 *This,
This->sampler_mask = info.sampler_mask;
This->rt_mask = info.rt_mask;
This->const_used_size = info.const_used_size;
if (info.const_used_size == ~0)
This->const_used_size = NINE_CONSTBUF_SIZE(device->max_ps_const_f);
This->lconstf = info.lconstf;
/* no constant relative addressing for ps */
assert(info.const_used_size != ~0);
assert(info.lconstf.data == NULL);
assert(info.lconstf.ranges == NULL);
return D3D_OK;
}
@@ -101,9 +102,6 @@ NinePixelShader9_dtor( struct NinePixelShader9 *This )
if (This->byte_code.tokens)
FREE((void *)This->byte_code.tokens); /* const_cast */
FREE(This->lconstf.data);
FREE(This->lconstf.ranges);
NineUnknown_dtor(&This->base);
}

@@ -41,8 +41,6 @@ struct NinePixelShader9
unsigned const_used_size; /* in bytes */
struct nine_lconstf lconstf;
uint16_t sampler_mask;
uint16_t sampler_mask_shadow;
uint8_t rt_mask;

@@ -43,8 +43,8 @@ NineStateBlock9_ctor( struct NineStateBlock9 *This,
This->type = type;
This->state.vs_const_f = MALLOC(pParams->device->constbuf_vs->width0);
This->state.ps_const_f = MALLOC(pParams->device->constbuf_ps->width0);
This->state.vs_const_f = MALLOC(This->base.device->vs_const_size);
This->state.ps_const_f = MALLOC(This->base.device->ps_const_size);
if (!This->state.vs_const_f || !This->state.ps_const_f)
return E_OUTOFMEMORY;

@@ -38,6 +38,8 @@
#define DBG_CHANNEL DBG_SURFACE
#define is_ATI1_ATI2(format) (format == PIPE_FORMAT_RGTC1_UNORM || format == PIPE_FORMAT_RGTC2_UNORM)
HRESULT
NineSurface9_ctor( struct NineSurface9 *This,
struct NineUnknownParams *pParams,
@@ -150,14 +152,22 @@ struct pipe_surface *
NineSurface9_CreatePipeSurface( struct NineSurface9 *This, const int sRGB )
{
struct pipe_context *pipe = This->pipe;
struct pipe_screen *screen = pipe->screen;
struct pipe_resource *resource = This->base.resource;
struct pipe_surface templ;
enum pipe_format srgb_format;
assert(This->desc.Pool == D3DPOOL_DEFAULT ||
This->desc.Pool == D3DPOOL_MANAGED);
assert(resource);
templ.format = sRGB ? util_format_srgb(resource->format) : resource->format;
srgb_format = util_format_srgb(resource->format);
if (sRGB && srgb_format != PIPE_FORMAT_NONE &&
screen->is_format_supported(screen, srgb_format,
resource->target, 0, resource->bind))
templ.format = srgb_format;
else
templ.format = resource->format;
templ.u.tex.level = This->level;
templ.u.tex.first_layer = This->layer;
templ.u.tex.last_layer = This->layer;
@@ -374,10 +384,19 @@ NineSurface9_LockRect( struct NineSurface9 *This,
if (This->data) {
DBG("returning system memory\n");
pLockedRect->Pitch = This->stride;
pLockedRect->pBits = NineSurface9_GetSystemMemPointer(This,
box.x, box.y);
/* ATI1 and ATI2 need special handling because of a d3d9 bug.
* We must present the surface to the application as if it were
* uncompressed with 8 bpp; the app has a workaround to cope with the
* fact that it is actually compressed. */
if (is_ATI1_ATI2(This->base.info.format)) {
pLockedRect->Pitch = This->desc.Height;
pLockedRect->pBits = This->data + box.y * This->desc.Height + box.x;
} else {
pLockedRect->Pitch = This->stride;
pLockedRect->pBits = NineSurface9_GetSystemMemPointer(This,
box.x,
box.y);
}
} else {
DBG("mapping pipe_resource %p (level=%u usage=%x)\n",
resource, This->level, usage);

@@ -467,7 +467,7 @@ NineSwapChain9_dtor( struct NineSwapChain9 *This )
if (This->buffers) {
for (i = 0; i < This->params.BackBufferCount; i++) {
NineUnknown_Destroy(NineUnknown(This->buffers[i]));
NineUnknown_Release(NineUnknown(This->buffers[i]));
ID3DPresent_DestroyD3DWindowBuffer(This->present, This->present_handles[i]);
if (This->present_buffers)
pipe_resource_reference(&(This->present_buffers[i]), NULL);

@@ -47,6 +47,7 @@ NineTexture9_ctor( struct NineTexture9 *This,
struct pipe_screen *screen = pParams->device->screen;
struct pipe_resource *info = &This->base.base.info;
struct pipe_resource *resource;
enum pipe_format pf;
unsigned l;
D3DSURFACE_DESC sfdesc;
HRESULT hr;
@@ -92,9 +93,15 @@ NineTexture9_ctor( struct NineTexture9 *This,
if (Usage & D3DUSAGE_AUTOGENMIPMAP)
Levels = 0;
pf = d3d9_to_pipe_format(Format);
if (Format != D3DFMT_NULL && (pf == PIPE_FORMAT_NONE ||
!screen->is_format_supported(screen, pf, PIPE_TEXTURE_2D, 0, PIPE_BIND_SAMPLER_VIEW))) {
return D3DERR_INVALIDCALL;
}
info->screen = screen;
info->target = PIPE_TEXTURE_2D;
info->format = d3d9_to_pipe_format(Format);
info->format = pf;
info->width0 = Width;
info->height0 = Height;
info->depth0 = 1;

@@ -37,6 +37,8 @@ NineVolumeTexture9_ctor( struct NineVolumeTexture9 *This,
HANDLE *pSharedHandle )
{
struct pipe_resource *info = &This->base.base.info;
struct pipe_screen *screen = pParams->device->screen;
enum pipe_format pf;
unsigned l;
D3DVOLUME_DESC voldesc;
HRESULT hr;
@@ -57,9 +59,19 @@ NineVolumeTexture9_ctor( struct NineVolumeTexture9 *This,
if (Usage & D3DUSAGE_AUTOGENMIPMAP)
Levels = 0;
pf = d3d9_to_pipe_format(Format);
if (pf == PIPE_FORMAT_NONE ||
!screen->is_format_supported(screen, pf, PIPE_TEXTURE_3D, 0, PIPE_BIND_SAMPLER_VIEW)) {
return D3DERR_INVALIDCALL;
}
/* We support ATI1 and ATI2 hacks only for 2D textures */
if (Format == D3DFMT_ATI1 || Format == D3DFMT_ATI2)
return D3DERR_INVALIDCALL;
info->screen = pParams->device->screen;
info->target = PIPE_TEXTURE_3D;
info->format = d3d9_to_pipe_format(Format);
info->format = pf;
info->width0 = Width;
info->height0 = Height;
info->depth0 = Depth;

@@ -706,6 +706,11 @@ static void slice_header(vid_dec_PrivateType *priv, struct vl_rbsp *rbsp,
if (pic_order_cnt_lsb != priv->codec_data.h264.pic_order_cnt_lsb)
vid_dec_h264_EndFrame(priv);
if (IdrPicFlag) {
priv->codec_data.h264.pic_order_cnt_msb = 0;
priv->codec_data.h264.pic_order_cnt_lsb = 0;
}
if ((pic_order_cnt_lsb < priv->codec_data.h264.pic_order_cnt_lsb) &&
(priv->codec_data.h264.pic_order_cnt_lsb - pic_order_cnt_lsb) >= (max_pic_order_cnt_lsb / 2))
pic_order_cnt_msb = priv->codec_data.h264.pic_order_cnt_msb + max_pic_order_cnt_lsb;
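The hunk above resets the stored picture order count on IDR pictures before the most significant part is derived. A minimal sketch of that derivation as a standalone function follows. Only the first branch is visible in this hunk. The backward-wrap and default branches are filled in from the H.264 specification's picture order count type 0 rules and are an assumption about the surrounding code. `poc_msb` and its parameters are hypothetical names, not part of the decoder.

```c
#include <stdint.h>

/* Hypothetical helper: derive pic_order_cnt_msb from the previous msb/lsb
 * pair and the lsb parsed from the current slice header. */
static int32_t poc_msb(int idr, int32_t prev_msb, int32_t prev_lsb,
                       int32_t lsb, int32_t max_lsb)
{
    /* IDR pictures restart the count, as in the added hunk. */
    if (idr) {
        prev_msb = 0;
        prev_lsb = 0;
    }

    /* lsb wrapped forward past max_lsb (the branch shown in the hunk). */
    if (lsb < prev_lsb && (prev_lsb - lsb) >= max_lsb / 2)
        return prev_msb + max_lsb;

    /* lsb wrapped backward (assumed, taken from the H.264 spec). */
    if (lsb > prev_lsb && (lsb - prev_lsb) > max_lsb / 2)
        return prev_msb - max_lsb;

    return prev_msb;
}
```

Without the reset, a stale `prev_lsb` left over from the previous sequence can trigger the forward-wrap branch on the first picture after an IDR and shift every following picture order count by `max_lsb`.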

@@ -431,7 +431,7 @@ osmesa_st_framebuffer_validate(struct st_context_iface *stctx,
templat.format = format;
templat.bind = bind;
out[i] = osbuffer->textures[i] =
out[i] = osbuffer->textures[statts[i]] =
screen->resource_create(screen, &templat);
}

@@ -137,7 +137,9 @@ glsl_test_SOURCES = \
test.cpp \
test_optpass.cpp
glsl_test_LDADD = libglsl.la
glsl_test_LDADD = \
libglsl.la \
$(PTHREAD_LIBS)
# We write our own rules for yacc and lex below. We'd rather use automake,
# but automake makes it especially difficult for a number of reasons:

@@ -578,9 +578,18 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
if (!is_vec_zero(zero))
continue;
return new(mem_ctx) ir_expression(ir->operation,
add->operands[0],
neg(add->operands[1]));
/* Depending on the position of the zero operand, we want to optimize
* (0 cmp x+y) into (-x cmp y) or (x+y cmp 0) into (x cmp -y)
*/
if (add_pos == 1) {
return new(mem_ctx) ir_expression(ir->operation,
neg(add->operands[0]),
add->operands[1]);
} else {
return new(mem_ctx) ir_expression(ir->operation,
add->operands[0],
neg(add->operands[1]));
}
}
break;
@@ -679,55 +688,72 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
case ir_binop_min:
case ir_binop_max:
if (ir->type->base_type != GLSL_TYPE_FLOAT)
if (ir->type->base_type != GLSL_TYPE_FLOAT || options->EmitNoSat)
break;
/* Replace min(max) operations and its commutative combinations with
* a saturate operation
*/
for (int op = 0; op < 2; op++) {
ir_expression *minmax = op_expr[op];
ir_expression *inner_expr = op_expr[op];
ir_constant *outer_const = op_const[1 - op];
ir_expression_operation op_cond = (ir->operation == ir_binop_max) ?
ir_binop_min : ir_binop_max;
if (!minmax || !outer_const || (minmax->operation != op_cond))
if (!inner_expr || !outer_const || (inner_expr->operation != op_cond))
continue;
/* One of these has to be a constant */
if (!inner_expr->operands[0]->as_constant() &&
!inner_expr->operands[1]->as_constant())
break;
/* Found a min(max) combination. Now try to see if its operands
* meet our conditions that we can do just a single saturate operation
*/
for (int minmax_op = 0; minmax_op < 2; minmax_op++) {
ir_rvalue *inner_val_a = minmax->operands[minmax_op];
ir_rvalue *inner_val_b = minmax->operands[1 - minmax_op];
ir_rvalue *x = inner_expr->operands[minmax_op];
ir_rvalue *y = inner_expr->operands[1 - minmax_op];
if (!inner_val_a || !inner_val_b)
ir_constant *inner_const = y->as_constant();
if (!inner_const)
continue;
/* Found a {min|max} ({max|min} (x, 0.0), 1.0) operation and its variations */
if ((outer_const->is_one() && inner_val_a->is_zero()) ||
(inner_val_a->is_one() && outer_const->is_zero()))
return saturate(inner_val_b);
/* min(max(x, 0.0), 1.0) is sat(x) */
if (ir->operation == ir_binop_min &&
inner_const->is_zero() &&
outer_const->is_one())
return saturate(x);
/* Found a {min|max} ({max|min} (x, 0.0), b) where b < 1.0
* and its variations
*/
if (is_less_than_one(outer_const) && inner_val_b->is_zero())
return expr(ir_binop_min, saturate(inner_val_a), outer_const);
/* max(min(x, 1.0), 0.0) is sat(x) */
if (ir->operation == ir_binop_max &&
inner_const->is_one() &&
outer_const->is_zero())
return saturate(x);
if (!inner_val_b->as_constant())
continue;
/* min(max(x, 0.0), b) where b < 1.0 is sat(min(x, b)) */
if (ir->operation == ir_binop_min &&
inner_const->is_zero() &&
is_less_than_one(outer_const))
return saturate(expr(ir_binop_min, x, outer_const));
if (is_less_than_one(inner_val_b->as_constant()) && outer_const->is_zero())
return expr(ir_binop_min, saturate(inner_val_a), inner_val_b);
/* max(min(x, b), 0.0) where b < 1.0 is sat(min(x, b)) */
if (ir->operation == ir_binop_max &&
is_less_than_one(inner_const) &&
outer_const->is_zero())
return saturate(expr(ir_binop_min, x, inner_const));
/* Found a {min|max} ({max|min} (x, b), 1.0), where b > 0.0
* and its variations
*/
if (outer_const->is_one() && is_greater_than_zero(inner_val_b->as_constant()))
return expr(ir_binop_max, saturate(inner_val_a), inner_val_b);
if (inner_val_b->as_constant()->is_one() && is_greater_than_zero(outer_const))
return expr(ir_binop_max, saturate(inner_val_a), outer_const);
/* max(min(x, 1.0), b) where b > 0.0 is sat(max(x, b)) */
if (ir->operation == ir_binop_max &&
inner_const->is_one() &&
is_greater_than_zero(outer_const))
return saturate(expr(ir_binop_max, x, outer_const));
/* min(max(x, b), 1.0) where b > 0.0 is sat(max(x, b)) */
if (ir->operation == ir_binop_min &&
is_greater_than_zero(inner_const) &&
outer_const->is_one())
return saturate(expr(ir_binop_max, x, inner_const));
}
}
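The four rewrites listed in the comments above can be checked numerically with a small sketch in plain C. The helpers `minf`, `maxf`, `sat` and `saturate_rewrites_agree` are hypothetical names and do not exist in the GLSL compiler. The sketch restricts the constant `b` to the range [0, 1], which is where all four identities hold for ordinary float arithmetic. For a `b` outside that range the two sides can differ.

```c
static float minf(float a, float b) { return a < b ? a : b; }
static float maxf(float a, float b) { return a > b ? a : b; }

/* Clamp to [0, 1], which is what a saturate operation computes. */
static float sat(float x) { return minf(maxf(x, 0.0f), 1.0f); }

/* Returns 1 when each original min/max nesting and its saturate-based
 * replacement give the same value for input x and constant b. */
static int saturate_rewrites_agree(float x, float b)
{
    return minf(maxf(x, 0.0f), b) == sat(minf(x, b)) &&
           maxf(minf(x, b), 0.0f) == sat(minf(x, b)) &&
           maxf(minf(x, 1.0f), b) == sat(maxf(x, b)) &&
           minf(maxf(x, b), 1.0f) == sat(maxf(x, b));
}
```

For example, with `b = 0.5` the check passes for inputs below, inside and above the [0, 1] interval, while `saturate_rewrites_agree(0.0f, 2.0f)` returns 0 because `max(min(x, 1.0), 2.0)` is 2 but the saturated form is clamped to 1.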

@@ -128,6 +128,9 @@ ir_copy_propagation_visitor::visit_enter(ir_function_signature *ir)
visit_list_elements(this, &ir->body);
ralloc_free(this->acp);
ralloc_free(this->kills);
this->kills = orig_kills;
this->acp = orig_acp;
this->killed_all = orig_killed_all;
@@ -215,7 +218,7 @@ ir_copy_propagation_visitor::handle_if_block(exec_list *instructions)
/* Populate the initial acp with a copy of the original */
foreach_in_list(acp_entry, a, orig_acp) {
this->acp->push_tail(new(this->mem_ctx) acp_entry(a->lhs, a->rhs));
this->acp->push_tail(new(this->acp) acp_entry(a->lhs, a->rhs));
}
visit_list_elements(this, instructions);
@@ -226,12 +229,15 @@ ir_copy_propagation_visitor::handle_if_block(exec_list *instructions)
exec_list *new_kills = this->kills;
this->kills = orig_kills;
ralloc_free(this->acp);
this->acp = orig_acp;
this->killed_all = this->killed_all || orig_killed_all;
foreach_in_list(kill_entry, k, new_kills) {
kill(k->var);
}
ralloc_free(new_kills);
}
ir_visitor_status
@@ -269,6 +275,7 @@ ir_copy_propagation_visitor::visit_enter(ir_loop *ir)
exec_list *new_kills = this->kills;
this->kills = orig_kills;
ralloc_free(this->acp);
this->acp = orig_acp;
this->killed_all = this->killed_all || orig_killed_all;
@@ -276,6 +283,8 @@ ir_copy_propagation_visitor::visit_enter(ir_loop *ir)
kill(k->var);
}
ralloc_free(new_kills);
/* already descended into the children. */
return visit_continue_with_parent;
}
@@ -294,7 +303,7 @@ ir_copy_propagation_visitor::kill(ir_variable *var)
/* Add the LHS variable to the list of killed variables in this block.
*/
this->kills->push_tail(new(this->mem_ctx) kill_entry(var));
this->kills->push_tail(new(this->kills) kill_entry(var));
}
/**
@@ -322,7 +331,7 @@ ir_copy_propagation_visitor::add_copy(ir_assignment *ir)
ir->condition = new(ralloc_parent(ir)) ir_constant(false);
this->progress = true;
} else {
entry = new(this->mem_ctx) acp_entry(lhs_var, rhs_var);
entry = new(this->acp) acp_entry(lhs_var, rhs_var);
this->acp->push_tail(entry);
}
}

@@ -156,6 +156,9 @@ ir_copy_propagation_elements_visitor::visit_enter(ir_function_signature *ir)
visit_list_elements(this, &ir->body);
ralloc_free(this->acp);
ralloc_free(this->kills);
this->kills = orig_kills;
this->acp = orig_acp;
this->killed_all = orig_killed_all;
@@ -173,9 +176,9 @@ ir_copy_propagation_elements_visitor::visit_leave(ir_assignment *ir)
kill_entry *k;
if (lhs)
k = new(mem_ctx) kill_entry(var, ir->write_mask);
k = new(this->kills) kill_entry(var, ir->write_mask);
else
k = new(mem_ctx) kill_entry(var, ~0);
k = new(this->kills) kill_entry(var, ~0);
kill(k);
}
@@ -334,7 +337,7 @@ ir_copy_propagation_elements_visitor::handle_if_block(exec_list *instructions)
/* Populate the initial acp with a copy of the original */
foreach_in_list(acp_entry, a, orig_acp) {
this->acp->push_tail(new(this->mem_ctx) acp_entry(a));
this->acp->push_tail(new(this->acp) acp_entry(a));
}
visit_list_elements(this, instructions);
@@ -345,6 +348,7 @@ ir_copy_propagation_elements_visitor::handle_if_block(exec_list *instructions)
exec_list *new_kills = this->kills;
this->kills = orig_kills;
ralloc_free(this->acp);
this->acp = orig_acp;
this->killed_all = this->killed_all || orig_killed_all;
@@ -354,6 +358,8 @@ ir_copy_propagation_elements_visitor::handle_if_block(exec_list *instructions)
foreach_in_list_safe(kill_entry, k, new_kills) {
kill(k);
}
ralloc_free(new_kills);
}
ir_visitor_status
@@ -391,6 +397,7 @@ ir_copy_propagation_elements_visitor::visit_enter(ir_loop *ir)
exec_list *new_kills = this->kills;
this->kills = orig_kills;
ralloc_free(this->acp);
this->acp = orig_acp;
this->killed_all = this->killed_all || orig_killed_all;
@@ -398,6 +405,8 @@ ir_copy_propagation_elements_visitor::visit_enter(ir_loop *ir)
kill(k);
}
ralloc_free(new_kills);
/* already descended into the children. */
return visit_continue_with_parent;
}
@@ -423,6 +432,7 @@ ir_copy_propagation_elements_visitor::kill(kill_entry *k)
if (k->next)
k->remove();
ralloc_steal(this->kills, k);
this->kills->push_tail(k);
}

@@ -1,6 +1,7 @@
noinst_LTLIBRARIES = libappleglx.la
AM_CFLAGS = \
-I$(top_srcdir)/src \
-I$(top_srcdir)/include \
-I$(top_srcdir)/src/glx \
-I$(top_srcdir)/src/mesa \

@@ -1526,6 +1526,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor,
xcb_connection_t *c = XGetXCBConnection(dpy);
struct dri3_buffer *back;
int64_t ret = 0;
uint32_t options = XCB_PRESENT_OPTION_NONE;
unsigned flags = __DRI2_FLUSH_DRAWABLE;
if (flush)
@@ -1578,6 +1579,17 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor,
remainder = 0;
}
/* From the GLX_EXT_swap_control spec:
*
* "If <interval> is set to a value of 0, buffer swaps are not
* synchronized to a video frame."
*
* Implementation note: It is possible to enable triple buffering behaviour
* by not using XCB_PRESENT_OPTION_ASYNC, but this should not be the default.
*/
if (priv->swap_interval == 0)
options |= XCB_PRESENT_OPTION_ASYNC;
back->busy = 1;
back->last_swap = priv->send_sbc;
xcb_present_pixmap(c,
@@ -1591,7 +1603,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor,
None, /* target_crtc */
None,
back->sync_fence,
XCB_PRESENT_OPTION_NONE,
options,
target_msc,
divisor,
remainder, 0, NULL);

@@ -65,10 +65,23 @@ dri2_convert_glx_query_renderer_attribs(int attribute)
return -1;
}
/* Convert internal dri context profile bits into GLX context profile bits */
static inline void
dri_convert_context_profile_bits(int attribute, unsigned int *value)
{
if (attribute == GLX_RENDERER_PREFERRED_PROFILE_MESA) {
if (value[0] == (1U << __DRI_API_OPENGL_CORE))
value[0] = GLX_CONTEXT_CORE_PROFILE_BIT_ARB;
else if (value[0] == (1U << __DRI_API_OPENGL))
value[0] = GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB;
}
}
_X_HIDDEN int
dri2_query_renderer_integer(struct glx_screen *base, int attribute,
unsigned int *value)
{
int ret;
struct dri2_screen *const psc = (struct dri2_screen *) base;
/* Even though there are invalid values (and
@@ -81,8 +94,11 @@ dri2_query_renderer_integer(struct glx_screen *base, int attribute,
if (psc->rendererQuery == NULL)
return -1;
return psc->rendererQuery->queryInteger(psc->driScreen, dri_attribute,
value);
ret = psc->rendererQuery->queryInteger(psc->driScreen, dri_attribute,
value);
dri_convert_context_profile_bits(attribute, value);
return ret;
}
_X_HIDDEN int
@@ -108,6 +124,7 @@ _X_HIDDEN int
dri3_query_renderer_integer(struct glx_screen *base, int attribute,
unsigned int *value)
{
int ret;
struct dri3_screen *const psc = (struct dri3_screen *) base;
/* Even though there are invalid values (and
@@ -120,8 +137,11 @@ dri3_query_renderer_integer(struct glx_screen *base, int attribute,
if (psc->rendererQuery == NULL)
return -1;
return psc->rendererQuery->queryInteger(psc->driScreen, dri_attribute,
value);
ret = psc->rendererQuery->queryInteger(psc->driScreen, dri_attribute,
value);
dri_convert_context_profile_bits(attribute, value);
return ret;
}
_X_HIDDEN int
@@ -147,6 +167,7 @@ _X_HIDDEN int
drisw_query_renderer_integer(struct glx_screen *base, int attribute,
unsigned int *value)
{
int ret;
struct drisw_screen *const psc = (struct drisw_screen *) base;
/* Even though there are invalid values (and
@@ -159,8 +180,11 @@ drisw_query_renderer_integer(struct glx_screen *base, int attribute,
if (psc->rendererQuery == NULL)
return -1;
return psc->rendererQuery->queryInteger(psc->driScreen, dri_attribute,
value);
ret = psc->rendererQuery->queryInteger(psc->driScreen, dri_attribute,
value);
dri_convert_context_profile_bits(attribute, value);
return ret;
}
_X_HIDDEN int
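The profile-bit translation above can be sketched in isolation. This is a minimal illustration, not the Mesa code: the `SKETCH_*` constants and `sketch_convert_profile` are hypothetical names, and the numeric values are assumptions about what the DRI and GLX headers define.

```c
#include <assert.h>

/* Assumed values; the real definitions live in the DRI interface and GLX
 * extension headers and should be checked there. */
enum {
    SKETCH_DRI_API_OPENGL         = 0,   /* assumed __DRI_API_OPENGL */
    SKETCH_DRI_API_OPENGL_CORE    = 3,   /* assumed __DRI_API_OPENGL_CORE */
    SKETCH_GLX_CORE_PROFILE_BIT   = 0x1, /* assumed GLX_CONTEXT_CORE_PROFILE_BIT_ARB */
    SKETCH_GLX_COMPAT_PROFILE_BIT = 0x2  /* assumed GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB */
};

/* Map a driver-internal "1 << API" profile value to the GLX profile bit.
 * Values that match neither known API are passed through unchanged. */
static unsigned
sketch_convert_profile(unsigned dri_value)
{
    if (dri_value == (1u << SKETCH_DRI_API_OPENGL_CORE))
        return SKETCH_GLX_CORE_PROFILE_BIT;
    if (dri_value == (1u << SKETCH_DRI_API_OPENGL))
        return SKETCH_GLX_COMPAT_PROFILE_BIT;
    return dri_value;
}
```

Under these assumed values the untranslated compatibility value (`1 << 0`) would be numerically equal to the GLX core-profile bit, which is the kind of mix-up the conversion avoids.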

@@ -143,8 +143,13 @@ __glXWireToEvent(Display *dpy, XEvent *event, xEvent *wire)
aevent->ust = ((CARD64)awire->ust_hi << 32) | awire->ust_lo;
aevent->msc = ((CARD64)awire->msc_hi << 32) | awire->msc_lo;
if (awire->sbc < glxDraw->lastEventSbc)
glxDraw->eventSbcWrap += 0x100000000;
/* Handle 32-Bit wire sbc wraparound in both directions to cope with out
* of sequence 64-Bit sbc's
*/
if ((int64_t) awire->sbc < ((int64_t) glxDraw->lastEventSbc - 0x40000000))
glxDraw->eventSbcWrap += 0x100000000;
if ((int64_t) awire->sbc > ((int64_t) glxDraw->lastEventSbc + 0x40000000))
glxDraw->eventSbcWrap -= 0x100000000;
glxDraw->lastEventSbc = awire->sbc;
aevent->sbc = awire->sbc + glxDraw->eventSbcWrap;
return True;
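The two-sided wraparound check in this hunk can be tried out on its own. The sketch below is an assumption-laden stand-in: `struct sbc_tracker` and `sbc_extend` are hypothetical names that only mirror the `lastEventSbc` and `eventSbcWrap` fields used above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-drawable state kept by the GLX client. */
struct sbc_tracker {
    uint32_t last_event_sbc;  /* low 32 bits of the most recent event's sbc */
    int64_t  event_sbc_wrap;  /* accumulated multiples of 2^32 */
};

/* Extend a 32-bit wire sbc to 64 bits while tolerating out-of-order events. */
static int64_t
sbc_extend(struct sbc_tracker *t, uint32_t wire_sbc)
{
    /* Far below the last value: the counter wrapped forward past 2^32. */
    if ((int64_t) wire_sbc < (int64_t) t->last_event_sbc - 0x40000000)
        t->event_sbc_wrap += 0x100000000LL;
    /* Far above the last value: an older event from before the wrap. */
    if ((int64_t) wire_sbc > (int64_t) t->last_event_sbc + 0x40000000)
        t->event_sbc_wrap -= 0x100000000LL;

    t->last_event_sbc = wire_sbc;
    return (int64_t) wire_sbc + t->event_sbc_wrap;
}
```

Feeding it `0xFFFFFFFF`, then `1`, then a late `0xFFFFFFFF` again moves the wrap offset up by 2^32 and back down, so the late event still maps to its original 64-bit value.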

@@ -221,7 +221,10 @@ DRI_glXUseXFont(struct glx_context *CC, Font font, int first, int count, int lis
XGCValues values;
unsigned long valuemask;
XFontStruct *fs;
#if !defined(GLX_USE_APPLEGL)
__GLXDRIdrawable *glxdraw;
#endif
GLint swapbytes, lsbfirst, rowlength;
GLint skiprows, skippixels, alignment;
@@ -234,9 +237,11 @@ DRI_glXUseXFont(struct glx_context *CC, Font font, int first, int count, int lis
dpy = CC->currentDpy;
win = CC->currentDrawable;
#if !defined(GLX_USE_APPLEGL)
glxdraw = GetGLXDRIDrawable(CC->currentDpy, CC->currentDrawable);
if (glxdraw)
win = glxdraw->xDrawable;
#endif
fs = XQueryFont(dpy, font);
if (!fs) {

@@ -64,6 +64,7 @@
* Rob Clark <robclark@freedesktop.org>
*/
#include <sys/stat.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
@@ -80,7 +81,6 @@
#endif
#endif
#ifdef HAVE_SYSFS
#include <sys/stat.h>
#include <sys/types.h>
#endif
#include "loader.h"

@@ -122,5 +122,5 @@ format_info_deps := \
$(LOCAL_PATH)/main/format_parser.py \
$(FORMAT_INFO)
$(intermediates)/main/format_info.c: $(format_info_deps)
$(intermediates)/main/format_info.h: $(format_info_deps)
@$(MESA_PYTHON2) $(FORMAT_INFO) $< > $@

@@ -64,7 +64,7 @@ include Makefile.sources
BUILT_SOURCES = \
main/get_hash.h \
main/format_info.c \
main/format_info.h \
$(BUILDDIR)main/git_sha1.h \
$(BUILDDIR)program/program_parse.tab.c \
$(BUILDDIR)program/lex.yy.c
@@ -82,14 +82,14 @@ main/get_hash.h: $(GLAPI)/gl_and_es_API.xml main/get_hash_params.py \
-f $< > $@.tmp; \
mv $@.tmp $@;
main/format_info.c: main/formats.csv \
main/format_info.h: main/formats.csv \
main/format_parser.py main/format_info.py
$(AM_V_GEN)set -e; \
$(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/main/format_info.py \
$< > $@.tmp; \
mv $@.tmp $@;
main/formats.c: main/format_info.c
main/formats.c: main/format_info.h
noinst_LTLIBRARIES = $(ARCH_LIBS)
if NEED_LIBMESA

@@ -60,7 +60,7 @@ get_hash_header = env.CodeGenerate(
)
format_info = env.CodeGenerate(
target = 'main/format_info.c',
target = 'main/format_info.h',
script = 'main/format_info.py',
source = 'main/formats.csv',
command = python_cmd + ' $SCRIPT ' + ' $SOURCE > $TARGET'

@@ -280,6 +280,19 @@ brw_upload_constant_buffer(struct brw_context *brw)
*/
emit:
/* Work around mysterious 965 hangs that appear to happen if you do
* two 3DPRIMITIVEs with only a CONSTANT_BUFFER in between. If we
* haven't already flushed for some other reason, explicitly do so.
*
* We've found no documented reason why this should be necessary.
*/
if (brw->gen == 4 && !brw->is_g4x &&
(brw->state.dirty.brw & (BRW_NEW_BATCH | BRW_NEW_PSP)) == 0) {
BEGIN_BATCH(1);
OUT_BATCH(MI_FLUSH);
ADVANCE_BATCH();
}
/* BRW_NEW_URB_FENCE: From the gen4 PRM, volume 1, section 3.9.8
* (CONSTANT_BUFFER (CURBE Load)):
*

@@ -551,6 +551,7 @@
#define BRW_SURFACE_PITCH_MASK INTEL_MASK(19, 3)
#define BRW_SURFACE_TILED (1 << 1)
#define BRW_SURFACE_TILED_Y (1 << 0)
#define HSW_SURFACE_IS_INTEGER_FORMAT (1 << 18)
/* Surface state DW4 */
#define BRW_SURFACE_MIN_LOD_SHIFT 28

@@ -239,7 +239,7 @@ static const struct brw_device_info brw_device_info_chv = {
.has_llc = false,
.max_vs_threads = 80,
.max_gs_threads = 80,
.max_wm_threads = 102,
.max_wm_threads = 128,
.urb = {
.size = 128,
.min_vs_entries = 64,

@@ -2179,8 +2179,13 @@ fs_visitor::demote_pull_constants()
if (inst->src[i].file != UNIFORM)
continue;
int pull_index = pull_constant_loc[inst->src[i].reg +
inst->src[i].reg_offset];
int pull_index;
unsigned location = inst->src[i].reg + inst->src[i].reg_offset;
if (location >= uniforms) /* Out of bounds access */
pull_index = -1;
else
pull_index = pull_constant_loc[location];
if (pull_index == -1)
continue;
@@ -2842,16 +2847,6 @@ fs_visitor::insert_gen4_post_send_dependency_workarounds(bblock_t *block, fs_ins
if (i == write_len)
return;
}
/* If we hit the end of the program, resolve all remaining dependencies out
* of paranoia.
*/
fs_inst *last_inst = (fs_inst *)this->instructions.get_tail();
assert(last_inst->eot);
for (int i = 0; i < write_len; i++) {
if (needs_dep[i])
last_inst->insert_before(block, DEP_RESOLVE_MOV(first_write_grf + i));
}
}
void

@@ -354,7 +354,7 @@ brw_upload_gs_prog(struct brw_context *brw)
}
brw->gs.base.prog_data = &brw->gs.prog_data->base.base;
if (memcmp(&brw->vs.prog_data->base.vue_map, &brw->vue_map_geom_out,
if (memcmp(&brw->gs.prog_data->base.vue_map, &brw->vue_map_geom_out,
sizeof(brw->vue_map_geom_out)) != 0) {
brw->vue_map_geom_out = brw->gs.prog_data->base.vue_map;
brw->state.dirty.brw |= BRW_NEW_VUE_MAP_GEOM_OUT;

@@ -208,7 +208,7 @@ upload_default_color(struct brw_context *brw,
struct gl_texture_unit *texUnit = &ctx->Texture.Unit[unit];
struct gl_texture_object *texObj = texUnit->_Current;
struct gl_texture_image *firstImage = texObj->Image[0][texObj->BaseLevel];
float color[4];
union gl_color_union color;
switch (firstImage->_BaseFormat) {
case GL_DEPTH_COMPONENT:
@@ -216,40 +216,40 @@ upload_default_color(struct brw_context *brw,
* R channel, while the hardware uses A. Spam R into all the
* channels for safety.
*/
color[0] = sampler->BorderColor.f[0];
color[1] = sampler->BorderColor.f[0];
color[2] = sampler->BorderColor.f[0];
color[3] = sampler->BorderColor.f[0];
color.ui[0] = sampler->BorderColor.ui[0];
color.ui[1] = sampler->BorderColor.ui[0];
color.ui[2] = sampler->BorderColor.ui[0];
color.ui[3] = sampler->BorderColor.ui[0];
break;
case GL_ALPHA:
color[0] = 0.0;
color[1] = 0.0;
color[2] = 0.0;
color[3] = sampler->BorderColor.f[3];
color.ui[0] = 0u;
color.ui[1] = 0u;
color.ui[2] = 0u;
color.ui[3] = sampler->BorderColor.ui[3];
break;
case GL_INTENSITY:
color[0] = sampler->BorderColor.f[0];
color[1] = sampler->BorderColor.f[0];
color[2] = sampler->BorderColor.f[0];
color[3] = sampler->BorderColor.f[0];
color.ui[0] = sampler->BorderColor.ui[0];
color.ui[1] = sampler->BorderColor.ui[0];
color.ui[2] = sampler->BorderColor.ui[0];
color.ui[3] = sampler->BorderColor.ui[0];
break;
case GL_LUMINANCE:
color[0] = sampler->BorderColor.f[0];
color[1] = sampler->BorderColor.f[0];
color[2] = sampler->BorderColor.f[0];
color[3] = 1.0;
color.ui[0] = sampler->BorderColor.ui[0];
color.ui[1] = sampler->BorderColor.ui[0];
color.ui[2] = sampler->BorderColor.ui[0];
color.ui[3] = float_as_int(1.0);
break;
case GL_LUMINANCE_ALPHA:
color[0] = sampler->BorderColor.f[0];
color[1] = sampler->BorderColor.f[0];
color[2] = sampler->BorderColor.f[0];
color[3] = sampler->BorderColor.f[3];
color.ui[0] = sampler->BorderColor.ui[0];
color.ui[1] = sampler->BorderColor.ui[0];
color.ui[2] = sampler->BorderColor.ui[0];
color.ui[3] = sampler->BorderColor.ui[3];
break;
default:
color[0] = sampler->BorderColor.f[0];
color[1] = sampler->BorderColor.f[1];
color[2] = sampler->BorderColor.f[2];
color[3] = sampler->BorderColor.f[3];
color.ui[0] = sampler->BorderColor.ui[0];
color.ui[1] = sampler->BorderColor.ui[1];
color.ui[2] = sampler->BorderColor.ui[2];
color.ui[3] = sampler->BorderColor.ui[3];
break;
}
@@ -258,18 +258,79 @@ upload_default_color(struct brw_context *brw,
* the border color alpha to 1.0 in that case.
*/
if (firstImage->_BaseFormat == GL_RGB)
color[3] = 1.0;
color.ui[3] = float_as_int(1.0);
if (brw->gen >= 8) {
/* On Broadwell, the border color is represented as four 32-bit floats,
* integers, or unsigned values, interpreted according to the surface
* format. This matches the sampler->BorderColor union exactly. Since
* we use floats both here and in the above reswizzling code, we preserve
* the original bit pattern. So we actually handle all three formats.
* format. This matches the sampler->BorderColor union exactly; just
* memcpy the values.
*/
float *sdc = brw_state_batch(brw, AUB_TRACE_SAMPLER_DEFAULT_COLOR,
4 * 4, 64, sdc_offset);
COPY_4FV(sdc, color);
uint32_t *sdc = brw_state_batch(brw, AUB_TRACE_SAMPLER_DEFAULT_COLOR,
4 * 4, 64, sdc_offset);
memcpy(sdc, color.ui, 4 * 4);
} else if (brw->is_haswell && texObj->_IsIntegerFormat) {
/* Haswell's integer border color support is completely insane:
* SAMPLER_BORDER_COLOR_STATE is 20 DWords. The first four are
* for float colors. The next 12 DWords are MBZ and only exist to
* pad it out to a 64 byte cacheline boundary. DWords 16-19 then
* contain integer colors; these are only used if SURFACE_STATE
* has the "Integer Surface Format" bit set. Even then, the
* arrangement of the RGBA data devolves into madness.
*/
uint32_t *sdc = brw_state_batch(brw, AUB_TRACE_SAMPLER_DEFAULT_COLOR,
20 * 4, 512, sdc_offset);
memset(sdc, 0, 20 * 4);
sdc = &sdc[16];
mesa_format format = firstImage->TexFormat;
int bits_per_channel = _mesa_get_format_bits(format, GL_RED_BITS);
/* From the Haswell PRM, "Command Reference: Structures", Page 36:
* "If any color channel is missing from the surface format,
* corresponding border color should be programmed as zero and if
* alpha channel is missing, corresponding Alpha border color should
* be programmed as 1."
*/
unsigned c[4] = { 0, 0, 0, 1 };
for (int i = 0; i < 4; i++) {
if (_mesa_format_has_color_component(format, i))
c[i] = color.ui[i];
}
switch (bits_per_channel) {
case 8:
/* Copy RGBA in order. */
for (int i = 0; i < 4; i++)
((uint8_t *) sdc)[i] = c[i];
break;
case 10:
/* R10G10B10A2_UINT is treated like a 16-bit format. */
case 16:
((uint16_t *) sdc)[0] = c[0]; /* R -> DWord 0, bits 15:0 */
((uint16_t *) sdc)[1] = c[1]; /* G -> DWord 0, bits 31:16 */
/* DWord 1 is Reserved/MBZ! */
((uint16_t *) sdc)[4] = c[2]; /* B -> DWord 2, bits 15:0 */
((uint16_t *) sdc)[5] = c[3]; /* A -> DWord 2, bits 31:16 */
break;
case 32:
if (firstImage->_BaseFormat == GL_RG) {
/* Careful inspection of the tables reveals that for RG32 formats,
* the green channel needs to go where blue normally belongs.
*/
sdc[0] = c[0];
sdc[2] = c[1];
sdc[3] = 1;
} else {
/* Copy RGBA in order. */
for (int i = 0; i < 4; i++)
sdc[i] = c[i];
}
break;
default:
assert(!"Invalid number of bits per channel in integer format.");
break;
}
} else if (brw->gen == 5 || brw->gen == 6) {
struct gen5_sampler_default_color *sdc;
@@ -278,39 +339,39 @@ upload_default_color(struct brw_context *brw,
memset(sdc, 0, sizeof(*sdc));
UNCLAMPED_FLOAT_TO_UBYTE(sdc->ub[0], color[0]);
UNCLAMPED_FLOAT_TO_UBYTE(sdc->ub[1], color[1]);
UNCLAMPED_FLOAT_TO_UBYTE(sdc->ub[2], color[2]);
UNCLAMPED_FLOAT_TO_UBYTE(sdc->ub[3], color[3]);
UNCLAMPED_FLOAT_TO_UBYTE(sdc->ub[0], color.f[0]);
UNCLAMPED_FLOAT_TO_UBYTE(sdc->ub[1], color.f[1]);
UNCLAMPED_FLOAT_TO_UBYTE(sdc->ub[2], color.f[2]);
UNCLAMPED_FLOAT_TO_UBYTE(sdc->ub[3], color.f[3]);
UNCLAMPED_FLOAT_TO_USHORT(sdc->us[0], color[0]);
UNCLAMPED_FLOAT_TO_USHORT(sdc->us[1], color[1]);
UNCLAMPED_FLOAT_TO_USHORT(sdc->us[2], color[2]);
UNCLAMPED_FLOAT_TO_USHORT(sdc->us[3], color[3]);
UNCLAMPED_FLOAT_TO_USHORT(sdc->us[0], color.f[0]);
UNCLAMPED_FLOAT_TO_USHORT(sdc->us[1], color.f[1]);
UNCLAMPED_FLOAT_TO_USHORT(sdc->us[2], color.f[2]);
UNCLAMPED_FLOAT_TO_USHORT(sdc->us[3], color.f[3]);
UNCLAMPED_FLOAT_TO_SHORT(sdc->s[0], color[0]);
UNCLAMPED_FLOAT_TO_SHORT(sdc->s[1], color[1]);
UNCLAMPED_FLOAT_TO_SHORT(sdc->s[2], color[2]);
UNCLAMPED_FLOAT_TO_SHORT(sdc->s[3], color[3]);
UNCLAMPED_FLOAT_TO_SHORT(sdc->s[0], color.f[0]);
UNCLAMPED_FLOAT_TO_SHORT(sdc->s[1], color.f[1]);
UNCLAMPED_FLOAT_TO_SHORT(sdc->s[2], color.f[2]);
UNCLAMPED_FLOAT_TO_SHORT(sdc->s[3], color.f[3]);
sdc->hf[0] = _mesa_float_to_half(color[0]);
sdc->hf[1] = _mesa_float_to_half(color[1]);
sdc->hf[2] = _mesa_float_to_half(color[2]);
sdc->hf[3] = _mesa_float_to_half(color[3]);
sdc->hf[0] = _mesa_float_to_half(color.f[0]);
sdc->hf[1] = _mesa_float_to_half(color.f[1]);
sdc->hf[2] = _mesa_float_to_half(color.f[2]);
sdc->hf[3] = _mesa_float_to_half(color.f[3]);
sdc->b[0] = sdc->s[0] >> 8;
sdc->b[1] = sdc->s[1] >> 8;
sdc->b[2] = sdc->s[2] >> 8;
sdc->b[3] = sdc->s[3] >> 8;
sdc->f[0] = color[0];
sdc->f[1] = color[1];
sdc->f[2] = color[2];
sdc->f[3] = color[3];
sdc->f[0] = color.f[0];
sdc->f[1] = color.f[1];
sdc->f[2] = color.f[2];
sdc->f[3] = color.f[3];
} else {
float *sdc = brw_state_batch(brw, AUB_TRACE_SAMPLER_DEFAULT_COLOR,
4 * 4, 32, sdc_offset);
memcpy(sdc, color, 4 * 4);
memcpy(sdc, color.f, 4 * 4);
}
}
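The 16-bit integer border color layout written by the `case 16:` branch above can be restated with shifts instead of pointer casts. This is only a sketch under assumptions: `sketch_pack_border_16` is a hypothetical helper, and it reproduces what the `uint16_t` stores in the hunk do on a little-endian CPU (red and green share DWord 0, DWord 1 stays zero, blue and alpha share DWord 2).

```c
#include <assert.h>
#include <stdint.h>

/* Pack four channel values into the four integer-color DWords using the
 * layout the uint16_t stores above produce on a little-endian host. */
static void
sketch_pack_border_16(uint32_t dw[4], const uint32_t c[4])
{
    dw[0] = (c[0] & 0xffffu) | ((c[1] & 0xffffu) << 16); /* R low, G high */
    dw[1] = 0;                                           /* reserved, must be zero */
    dw[2] = (c[2] & 0xffffu) | ((c[3] & 0xffffu) << 16); /* B low, A high */
    dw[3] = 0;                                           /* untouched by the 16-bit case */
}
```

Using shifts keeps the sketch independent of host byte order and avoids writing through a differently typed pointer; the driver code can rely on casts because it only targets little-endian x86.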

@@ -415,7 +415,7 @@ vec4_visitor::opt_copy_propagation()
entries[reg].saturatemask = 0x0;
for (int i = 0; i < 4; i++) {
if (inst->dst.writemask & (1 << i)) {
entries[reg].value[i] = direct_copy ? &inst->src[0] : NULL;
entries[reg].value[i] = (!inst->saturate && direct_copy) ? &inst->src[0] : NULL;
entries[reg].saturatemask |= (((inst->saturate && direct_copy) ? 1 : 0) << i);
}
}

@@ -199,6 +199,14 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
swizzles[1] = SWIZZLE_ZERO;
swizzles[2] = SWIZZLE_ZERO;
break;
case GL_LUMINANCE:
if (t->_IsIntegerFormat) {
swizzles[0] = SWIZZLE_X;
swizzles[1] = SWIZZLE_X;
swizzles[2] = SWIZZLE_X;
swizzles[3] = SWIZZLE_ONE;
}
break;
case GL_RED:
case GL_RG:
case GL_RGB:

@@ -142,10 +142,10 @@ upload_wm_state(struct brw_context *brw)
_mesa_get_min_invocations_per_fragment(ctx, brw->fragment_program, false);
assert(min_inv_per_frag >= 1);
if (brw->wm.prog_data->prog_offset_16) {
if (brw->wm.prog_data->prog_offset_16 || brw->wm.prog_data->no_8) {
dw5 |= GEN6_WM_16_DISPATCH_ENABLE;
if (min_inv_per_frag == 1) {
if (!brw->wm.prog_data->no_8 && min_inv_per_frag == 1) {
dw5 |= GEN6_WM_8_DISPATCH_ENABLE;
dw4 |= (brw->wm.prog_data->base.dispatch_grf_start_reg <<
GEN6_WM_DISPATCH_START_GRF_SHIFT_0);

@@ -326,6 +326,9 @@ gen7_update_texture_surface(struct gl_context *ctx,
surf[3] = SET_FIELD(effective_depth - 1, BRW_SURFACE_DEPTH) |
(mt->pitch - 1);
if (brw->is_haswell && tObj->_IsIntegerFormat)
surf[3] |= HSW_SURFACE_IS_INTEGER_FORMAT;
surf[4] = gen7_surface_msaa_bits(mt->num_samples, mt->msaa_layout) |
SET_FIELD(tObj->MinLayer, GEN7_SURFACE_MIN_ARRAY_ELEMENT) |
SET_FIELD((effective_depth - 1),

@@ -8,4 +8,4 @@ git_sha1.h.tmp
remap_helper.h
get_hash.h
get_hash.h.tmp
format_info.c
format_info.h

@@ -1226,7 +1226,7 @@ _mesa_DeleteBuffers(GLsizei n, const GLuint *ids)
}
}
if (ctx->UniformBuffer == bufObj) {
if (ctx->AtomicBuffer == bufObj) {
_mesa_BindBuffer( GL_ATOMIC_COUNTER_BUFFER, 0 );
}

@@ -487,6 +487,7 @@ typedef enum
/* The following three are meta instructions */
OPCODE_ERROR, /* raise compiled-in error */
OPCODE_CONTINUE,
OPCODE_NOP, /* No-op (used for 8-byte alignment) */
OPCODE_END_OF_LIST,
OPCODE_EXT_0
} OpCode;
@@ -1018,13 +1019,16 @@ memdup(const void *src, GLsizei bytes)
* Allocate space for a display list instruction (opcode + payload space).
* \param opcode the instruction opcode (OPCODE_* value)
* \param bytes instruction payload size (not counting opcode)
* \return pointer to allocated memory (the opcode space)
* \param align8 does the payload need to be 8-byte aligned?
* This is only relevant in 64-bit environments.
* \return pointer to allocated memory (the payload will be at pointer+1)
*/
static Node *
dlist_alloc(struct gl_context *ctx, OpCode opcode, GLuint bytes)
dlist_alloc(struct gl_context *ctx, OpCode opcode, GLuint bytes, bool align8)
{
const GLuint numNodes = 1 + (bytes + sizeof(Node) - 1) / sizeof(Node);
const GLuint contNodes = 1 + POINTER_DWORDS; /* size of continue info */
GLuint nopNode;
Node *n;
if (opcode < (GLuint) OPCODE_EXT_0) {
@@ -1038,7 +1042,20 @@ dlist_alloc(struct gl_context *ctx, OpCode opcode, GLuint bytes)
}
}
if (ctx->ListState.CurrentPos + numNodes + contNodes > BLOCK_SIZE) {
if (sizeof(void *) > sizeof(Node) && align8
&& ctx->ListState.CurrentPos % 2 == 0) {
/* The opcode would get placed at node[0] and the payload would start
* at node[1]. But the payload needs to be at an even offset (8-byte
* multiple).
*/
nopNode = 1;
}
else {
nopNode = 0;
}
if (ctx->ListState.CurrentPos + nopNode + numNodes + contNodes
> BLOCK_SIZE) {
/* This block is full. Allocate a new block and chain to it */
Node *newblock;
n = ctx->ListState.CurrentBlock + ctx->ListState.CurrentPos;
@@ -1048,13 +1065,34 @@ dlist_alloc(struct gl_context *ctx, OpCode opcode, GLuint bytes)
_mesa_error(ctx, GL_OUT_OF_MEMORY, "Building display list");
return NULL;
}
/* a fresh block should be 8-byte aligned on 64-bit systems */
assert(((GLintptr) newblock) % sizeof(void *) == 0);
save_pointer(&n[1], newblock);
ctx->ListState.CurrentBlock = newblock;
ctx->ListState.CurrentPos = 0;
/* Display list nodes are always 4 bytes. If we need 8-byte alignment
* we have to insert a NOP so that the payload of the real opcode lands
* on an even location:
* node[0] = OPCODE_NOP
* node[1] = OPCODE_x;
* node[2] = start of payload
*/
nopNode = sizeof(void *) > sizeof(Node) && align8;
}
n = ctx->ListState.CurrentBlock + ctx->ListState.CurrentPos;
ctx->ListState.CurrentPos += numNodes;
if (nopNode) {
assert(ctx->ListState.CurrentPos % 2 == 0); /* even value */
n[0].opcode = OPCODE_NOP;
n++;
/* The "real" opcode will now be at an odd location and the payload
* will be at an even location.
*/
}
ctx->ListState.CurrentPos += nopNode + numNodes;
n[0].opcode = opcode;
@@ -1075,7 +1113,22 @@ dlist_alloc(struct gl_context *ctx, OpCode opcode, GLuint bytes)
void *
_mesa_dlist_alloc(struct gl_context *ctx, GLuint opcode, GLuint bytes)
{
Node *n = dlist_alloc(ctx, (OpCode) opcode, bytes);
Node *n = dlist_alloc(ctx, (OpCode) opcode, bytes, false);
if (n)
return n + 1; /* return pointer to payload area, after opcode */
else
return NULL;
}
/**
* Same as _mesa_dlist_alloc(), but return a pointer which is 8-byte
* aligned in 64-bit environments, 4-byte aligned otherwise.
*/
void *
_mesa_dlist_alloc_aligned(struct gl_context *ctx, GLuint opcode, GLuint bytes)
{
Node *n = dlist_alloc(ctx, (OpCode) opcode, bytes, true);
if (n)
return n + 1; /* return pointer to payload area, after opcode */
else
@@ -1125,7 +1178,7 @@ _mesa_dlist_alloc_opcode(struct gl_context *ctx,
static inline Node *
alloc_instruction(struct gl_context *ctx, OpCode opcode, GLuint nparams)
{
return dlist_alloc(ctx, opcode, nparams * sizeof(Node));
return dlist_alloc(ctx, opcode, nparams * sizeof(Node), false);
}
@@ -8903,6 +8956,9 @@ execute_list(struct gl_context *ctx, GLuint list)
case OPCODE_CONTINUE:
n = (Node *) get_pointer(&n[1]);
break;
case OPCODE_NOP:
/* no-op */
break;
case OPCODE_END_OF_LIST:
done = GL_TRUE;
break;
@@ -9942,6 +9998,9 @@ print_list(struct gl_context *ctx, GLuint list)
printf("DISPLAY-LIST-CONTINUE\n");
n = (Node *) get_pointer(&n[1]);
break;
case OPCODE_NOP:
printf("NOP\n");
break;
case OPCODE_END_OF_LIST:
printf("END-LIST %u\n", list);
done = GL_TRUE;
@@ -10088,6 +10147,8 @@ _mesa_init_display_list(struct gl_context *ctx)
ctx->List.ListBase = 0;
save_vtxfmt_init(&ctx->ListState.ListVtxfmt);
InstSize[OPCODE_NOP] = 1;
}
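The NOP-padding rule used by `dlist_alloc()` above can be checked with plain index arithmetic. The helpers below are hypothetical and assume, as the hunk's comments state, that a block starts 8-byte aligned and that positions are counted in 4-byte nodes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return 1 if a padding NOP node must precede the opcode so that the payload
 * (the node right after the opcode) starts on an even node index, else 0. */
static unsigned
sketch_nop_nodes(size_t pos, size_t node_size, size_t ptr_size, bool align8)
{
    return (ptr_size > node_size && align8 && pos % 2 == 0) ? 1u : 0u;
}

/* Node index at which the payload starts: after the optional NOP and the opcode. */
static size_t
sketch_payload_index(size_t pos, unsigned nop)
{
    return pos + nop + 1;
}
```

With 4-byte nodes and 8-byte pointers, an even payload index corresponds to an 8-byte-aligned payload address, so an opcode at an even position needs the NOP while one at an odd position does not.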

@@ -60,6 +60,9 @@ extern void _mesa_compile_error( struct gl_context *ctx, GLenum error, const cha
extern void *_mesa_dlist_alloc(struct gl_context *ctx, GLuint opcode, GLuint sz);
extern void *
_mesa_dlist_alloc_aligned(struct gl_context *ctx, GLuint opcode, GLuint bytes);
extern GLint _mesa_dlist_alloc_opcode( struct gl_context *ctx, GLuint sz,
void (*execute)( struct gl_context *, void * ),
void (*destroy)( struct gl_context *, void * ),

@@ -152,7 +152,7 @@ unorm_to_float(unsigned x, unsigned src_bits)
static inline float
snorm_to_float(int x, unsigned src_bits)
{
if (x == -MAX_INT(src_bits))
if (x <= -MAX_INT(src_bits))
return -1.0f;
else
return x * (1.0f / (float)MAX_INT(src_bits));
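The effect of changing `==` to `<=` shows up at the most negative input. The sketch below assumes a hypothetical `sketch_snorm_max()` in place of Mesa's `MAX_INT()` macro and is only meant for widths from 2 to 32 bits.

```c
#include <assert.h>

/* Largest positive value of a signed normalized field with the given width,
 * e.g. 127 for 8 bits. Hypothetical stand-in for MAX_INT(). */
static int
sketch_snorm_max(unsigned bits)
{
    return (int) ((1u << (bits - 1)) - 1u);
}

/* Convert a signed normalized integer to float, clamping to -1.0. */
static float
sketch_snorm_to_float(int x, unsigned src_bits)
{
    /* Both -max and -max - 1 (e.g. -127 and -128) must map to -1.0. */
    if (x <= -sketch_snorm_max(src_bits))
        return -1.0f;
    return x * (1.0f / (float) sketch_snorm_max(src_bits));
}
```

With the `==` comparison, an 8-bit input of -128 would fall through to the multiply and yield roughly -1.008, outside the [-1, 1] range that signed normalized formats are meant to cover.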

@@ -73,7 +73,7 @@ struct gl_format_info
uint8_t Swizzle[4];
};
#include "format_info.c"
#include "format_info.h"
static const struct gl_format_info *
_mesa_get_format_info(mesa_format format)

@@ -2990,6 +2990,7 @@ struct gl_shader_compiler_options
GLboolean EmitNoMainReturn; /**< Emit CONT/RET opcodes? */
GLboolean EmitNoNoise; /**< Emit NOISE opcodes? */
GLboolean EmitNoPow; /**< Emit POW opcodes? */
GLboolean EmitNoSat; /**< Emit SAT opcodes? */
GLboolean LowerClipDistance; /**< Lower gl_ClipDistance from float[8] to vec4[2]? */
/**

@@ -1682,16 +1682,35 @@ _mesa_GetProgramBinary(GLuint program, GLsizei bufSize, GLsizei *length,
GLenum *binaryFormat, GLvoid *binary)
{
struct gl_shader_program *shProg;
GLsizei length_dummy;
GET_CURRENT_CONTEXT(ctx);
shProg = _mesa_lookup_shader_program_err(ctx, program, "glGetProgramBinary");
if (!shProg)
return;
/* The ARB_get_program_binary spec says:
*
* "If <length> is NULL, then no length is returned."
*
* Ensure that length always points to valid storage to avoid multiple NULL
* pointer checks below.
*/
if (length == NULL)
length = &length_dummy;
/* The ARB_get_program_binary spec says:
*
* "When a program object's LINK_STATUS is FALSE, its program binary
* length is zero, and a call to GetProgramBinary will generate an
* INVALID_OPERATION error."
*/
if (!shProg->LinkStatus) {
_mesa_error(ctx, GL_INVALID_OPERATION,
"glGetProgramBinary(program %u not linked)",
shProg->Name);
*length = 0;
return;
}
@@ -1700,12 +1719,9 @@ _mesa_GetProgramBinary(GLuint program, GLsizei bufSize, GLsizei *length,
return;
}
/* The ARB_get_program_binary spec says:
*
* "If <length> is NULL, then no length is returned."
*/
if (length != NULL)
*length = 0;
*length = 0;
_mesa_error(ctx, GL_INVALID_OPERATION,
"glGetProgramBinary(driver supports zero binary formats)");
(void) binaryFormat;
(void) binary;
@@ -1724,8 +1740,31 @@ _mesa_ProgramBinary(GLuint program, GLenum binaryFormat,
(void) binaryFormat;
(void) binary;
(void) length;
_mesa_error(ctx, GL_INVALID_OPERATION, __FUNCTION__);
/* Section 2.3.1 (Errors) of the OpenGL 4.5 spec says:
*
* "If a negative number is provided where an argument of type sizei or
* sizeiptr is specified, an INVALID_VALUE error is generated."
*/
if (length < 0) {
_mesa_error(ctx, GL_INVALID_VALUE, "glProgramBinary(length < 0)");
return;
}
/* The ARB_get_program_binary spec says:
*
* "<binaryFormat> and <binary> must be those returned by a previous
* call to GetProgramBinary, and <length> must be the length of the
* program binary as returned by GetProgramBinary or GetProgramiv with
* <pname> PROGRAM_BINARY_LENGTH. Loading the program binary will fail,
* setting the LINK_STATUS of <program> to FALSE, if these conditions
* are not met."
*
* Since any value of binaryFormat passed "is not one of those specified as
* allowable for [this] command, an INVALID_ENUM error is generated."
*/
shProg->LinkStatus = GL_FALSE;
_mesa_error(ctx, GL_INVALID_ENUM, "glProgramBinary");
}
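The `length_dummy` idiom in `_mesa_GetProgramBinary()` above, pointing an optional out-parameter at a local so later code can write through it unconditionally, can be sketched as follows. `sketch_query` and the length of 16 are hypothetical and only illustrate the pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Report a length through an optional out-pointer.
 * Returns 0 on success and -1 if the object is not linked. */
static int
sketch_query(int linked, int *length)
{
    int length_dummy;

    /* Redirect a NULL out-pointer to local storage so that every later
     * write can skip the NULL check. */
    if (length == NULL)
        length = &length_dummy;

    if (!linked) {
        *length = 0;
        return -1;
    }

    *length = 16; /* made-up payload size for this sketch */
    return 0;
}
```

The redirect has to happen before the first early return that writes the length, which is why the hunk moves the NULL handling to the top of the function.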

@@ -729,7 +729,7 @@ _mesa_get_compressed_teximage(struct gl_context *ctx,
GLubyte *src;
/* map src texture buffer */
ctx->Driver.MapTextureImage(ctx, texImage, 0,
ctx->Driver.MapTextureImage(ctx, texImage, slice,
0, 0, texImage->Width, texImage->Height,
GL_MAP_READ_BIT, &src, &srcRowStride);
@@ -741,7 +741,7 @@ _mesa_get_compressed_teximage(struct gl_context *ctx,
src += srcRowStride;
}
ctx->Driver.UnmapTextureImage(ctx, texImage, 0);
ctx->Driver.UnmapTextureImage(ctx, texImage, slice);
/* Advance to next slice */
dest += store.TotalBytesPerRow * (store.TotalRowsPerSlice - store.CopyRowsPerSlice);

@@ -2497,8 +2497,8 @@ texsubimage_error_check(struct gl_context *ctx, GLuint dimensions,
}
if (error_check_subtexture_dimensions(ctx, "glTexSubImage", dimensions,
texImage, xoffset, yoffset, 0,
width, height, 1)) {
texImage, xoffset, yoffset, zoffset,
width, height, depth)) {
return GL_TRUE;
}

@@ -85,9 +85,6 @@ _mesa_parse_arb_fragment_program(struct gl_context* ctx, GLenum target,
return;
}
if ((ctx->_Shader->Flags & GLSL_NO_OPT) == 0)
_mesa_optimize_program(ctx, &prog);
free(program->Base.String);
/* Copy the relevant contents of the arb_program struct into the

@@ -256,8 +256,15 @@ st_bufferobj_data(struct gl_context *ctx,
break;
case GL_STREAM_DRAW:
case GL_STREAM_COPY:
pipe_usage = PIPE_USAGE_STREAM;
break;
/* XXX: Remove this test and fall-through when we have PBO unpacking
* acceleration. Right now, PBO unpacking is done by the CPU, so we
* have to make sure CPU reads are fast.
*/
if (target != GL_PIXEL_UNPACK_BUFFER_ARB) {
pipe_usage = PIPE_USAGE_STREAM;
break;
}
/* fall through */
case GL_STATIC_READ:
case GL_DYNAMIC_READ:
case GL_STREAM_READ:

@@ -1102,7 +1102,7 @@ st_DrawPixels(struct gl_context *ctx, GLint x, GLint y,
const GLfloat *color;
struct pipe_context *pipe = st->pipe;
GLboolean write_stencil = GL_FALSE, write_depth = GL_FALSE;
struct pipe_sampler_view *sv[2];
struct pipe_sampler_view *sv[2] = { NULL };
int num_sampler_view = 1;
struct st_fp_variant *fpv;
struct gl_pixelstore_attrib clippedUnpack;
@@ -1156,8 +1156,9 @@ st_DrawPixels(struct gl_context *ctx, GLint x, GLint y,
color = NULL;
if (st->pixel_xfer.pixelmap_enabled) {
sv[1] = st->pixel_xfer.pixelmap_sampler_view;
num_sampler_view++;
pipe_sampler_view_reference(&sv[1],
st->pixel_xfer.pixelmap_sampler_view);
num_sampler_view++;
}
}
@@ -1178,7 +1179,8 @@ st_DrawPixels(struct gl_context *ctx, GLint x, GLint y,
if (write_stencil) {
enum pipe_format stencil_format =
util_format_stencil_only(pt->format);
/* we should not be doing pixel map/transfer (see above) */
assert(num_sampler_view == 1);
sv[1] = st_create_texture_sampler_view_format(st->pipe, pt,
stencil_format);
num_sampler_view++;
@@ -1469,7 +1471,7 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx, GLint srcy,
struct st_renderbuffer *rbRead;
void *driver_vp, *driver_fp;
struct pipe_resource *pt;
struct pipe_sampler_view *sv[2];
struct pipe_sampler_view *sv[2] = { NULL };
int num_sampler_view = 1;
GLfloat *color;
enum pipe_format srcFormat;
@@ -1518,7 +1520,8 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx, GLint srcy,
driver_vp = make_passthrough_vertex_shader(st, GL_FALSE);
if (st->pixel_xfer.pixelmap_enabled) {
sv[1] = st->pixel_xfer.pixelmap_sampler_view;
pipe_sampler_view_reference(&sv[1],
st->pixel_xfer.pixelmap_sampler_view);
num_sampler_view++;
}
}

@@ -122,7 +122,7 @@ st_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
for (i = 0; i < max_num_targets; i++) {
struct st_buffer_object *bo = st_buffer_object(sobj->base.Buffers[i]);
if (bo) {
if (bo && bo->buffer) {
/* Check whether we need to recreate the target. */
if (!sobj->targets[i] ||
sobj->targets[i] == sobj->draw_count ||

@@ -271,6 +271,8 @@ st_create_context_priv( struct gl_context *ctx, struct pipe_context *pipe,
*/
st->ctx->Point.MaxSize = MAX2(ctx->Const.MaxPointSize,
ctx->Const.MaxPointSizeAA);
/* For vertex shaders, make sure not to emit saturate when SM 3.0 is not supported */
ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoSat = !st->has_shader_model3;
_mesa_compute_version(ctx);

@@ -5388,9 +5388,6 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
if (!pscreen->get_param(pscreen, PIPE_CAP_TEXTURE_GATHER_OFFSETS))
lower_offset_arrays(ir);
do_mat_op_to_vec(ir);
/* Emit saturates in the vertex shader only if SM 3.0 is supported. */
bool vs_sm3 = (_mesa_shader_stage_to_program(prog->_LinkedShaders[i]->Stage) ==
GL_VERTEX_PROGRAM_ARB) && st_context(ctx)->has_shader_model3;
lower_instructions(ir,
MOD_TO_FRACT |
DIV_TO_MUL_RCP |
@@ -5401,7 +5398,7 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
BORROW_TO_ARITH |
(options->EmitNoPow ? POW_TO_EXP2 : 0) |
(!ctx->Const.NativeIntegers ? INT_DIV_TO_MUL_RCP : 0) |
(vs_sm3 ? SAT_TO_CLAMP : 0));
(options->EmitNoSat ? SAT_TO_CLAMP : 0));
lower_ubo_reference(prog->_LinkedShaders[i], ir);
do_vec_index_to_cond_assign(ir);

@@ -932,19 +932,19 @@ clamp_colors(SWspan *span)
* \param output which fragment program color output is being processed
*/
static inline void
convert_color_type(SWspan *span, GLenum newType, GLuint output)
convert_color_type(SWspan *span, GLenum srcType, GLenum newType, GLuint output)
{
GLvoid *src, *dst;
if (output > 0 || span->array->ChanType == GL_FLOAT) {
if (output > 0 || srcType == GL_FLOAT) {
src = span->array->attribs[VARYING_SLOT_COL0 + output];
span->array->ChanType = GL_FLOAT;
}
else if (span->array->ChanType == GL_UNSIGNED_BYTE) {
else if (srcType == GL_UNSIGNED_BYTE) {
src = span->array->rgba8;
}
else {
ASSERT(span->array->ChanType == GL_UNSIGNED_SHORT);
ASSERT(srcType == GL_UNSIGNED_SHORT);
src = span->array->rgba16;
}
@@ -978,7 +978,7 @@ shade_texture_span(struct gl_context *ctx, SWspan *span)
ctx->ATIFragmentShader._Enabled) {
/* programmable shading */
if (span->primitive == GL_BITMAP && span->array->ChanType != GL_FLOAT) {
convert_color_type(span, GL_FLOAT, 0);
convert_color_type(span, span->array->ChanType, GL_FLOAT, 0);
}
else {
span->array->rgba = (void *) span->array->attribs[VARYING_SLOT_COL0];
@@ -1313,6 +1313,8 @@ _swrast_write_rgba_span( struct gl_context *ctx, SWspan *span)
const GLboolean multiFragOutputs =
_swrast_use_fragment_program(ctx)
&& fp->Base.OutputsWritten >= (1 << FRAG_RESULT_DATA0);
/* Save srcColorType because convert_color_type() can change it */
const GLenum srcColorType = span->array->ChanType;
GLuint buf;
for (buf = 0; buf < numBuffers; buf++) {
@@ -1324,17 +1326,18 @@ _swrast_write_rgba_span( struct gl_context *ctx, SWspan *span)
/* re-use one of the attribute array buffers for rgbaSave */
GLchan (*rgbaSave)[4] = (GLchan (*)[4]) span->array->attribs[0];
struct swrast_renderbuffer *srb = swrast_renderbuffer(rb);
GLenum colorType = srb->ColorType;
const GLenum dstColorType = srb->ColorType;
assert(colorType == GL_UNSIGNED_BYTE ||
colorType == GL_FLOAT);
assert(dstColorType == GL_UNSIGNED_BYTE ||
dstColorType == GL_FLOAT);
/* set span->array->rgba to colors for renderbuffer's datatype */
if (span->array->ChanType != colorType) {
convert_color_type(span, colorType, 0);
if (srcColorType != dstColorType) {
convert_color_type(span, srcColorType, dstColorType,
multiFragOutputs ? buf : 0);
}
else {
if (span->array->ChanType == GL_UNSIGNED_BYTE) {
if (srcColorType == GL_UNSIGNED_BYTE) {
span->array->rgba = span->array->rgba8;
}
else {

@@ -210,6 +210,7 @@ static inline float conv_i2_to_norm_float(const struct gl_context *ctx, int i2)
} \
} else if ((type) == GL_UNSIGNED_INT_10F_11F_11F_REV) { \
float res[4]; \
res[3] = 1; \
r11g11b10f_to_float3((arg), res); \
ATTR##val##FV((attr), res); \
} else \

Some files were not shown because too many files have changed in this diff