Compare commits

128 Commits

Author SHA1 Message Date
Carl Worth
efe8cb1e53 Add release notes for 10.2.4
Just prior to the release.
2014-07-18 12:45:19 -07:00
Carl Worth
54733e5cb8 Update VERSION to 10.2.4
In preparation for the 10.2.4 release, of course.
2014-07-18 12:37:31 -07:00
Kenneth Graunke
6388ad51ff i965: Enable compressed multisample support (CMS) on Broadwell.
Everything is in place and appears to be working.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 8cf289c3ef)
2014-07-17 15:59:01 -07:00
Kenneth Graunke
ab0ad8f7e9 i965: Add 2x MSAA support to the MCS allocation function.
2x MSAA also uses 8 bits, just like 4x.  More bits are unused.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit db184d43b0)
2014-07-17 15:59:01 -07:00
Kenneth Graunke
1c386d5c35 i965: Hook up the MCS buffers in SURFACE_STATE on Broadwell.
MCS buffers are never allocated on Broadwell, so this does nothing for
now, but puts the infrastructure in place for when they do exist.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit a248b2a4eb)
2014-07-17 15:59:01 -07:00
Kenneth Graunke
e3c0c23873 i965: Drop SINT workaround for CMS layout on Broadwell.
According to the documentation, we don't need this SINT workaround on
Broadwell.  (Or at least, it doesn't mention that we need it.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit e10311be9f)
2014-07-17 15:59:01 -07:00
Kenneth Graunke
2a90fbfce4 i965: Add plumbing for Broadwell's auxiliary surface support.
Broadwell generalizes the MCS fields to allow for multiple kinds of
auxiliary surfaces.  This patch adds the plumbing to set those values,
but doesn't yet hook any up.

v2: (by Jordan Justen) Use mt for qpitch; pitch is tiles - 1.
v3: Don't forget to subtract 1 from aux_mt->pitch.
v4: Drop unnecessary aux_mt->offset (caught by Jordan Justen).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit fd77187689)
2014-07-17 15:59:01 -07:00
Jordan Justen
d374cfe0bc i965: Add auxiliary surface field #defines for Broadwell.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit a46cb6a971)
2014-07-17 15:59:01 -07:00
Matt Turner
b56908d7db i965/fs: Set correct number of regs_written for MCS fetches.
regs_written is in units of virtual GRFs.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit dfd117b857)
2014-07-17 15:59:01 -07:00
Eric Anholt
c6a6acb6b4 i965: Generalize the pixel_x/y workaround for all UW types.
This is the only case where a fs_reg in brw_fs_visitor is used during
optimization/code generation, and it meant that optimizations had to be
careful to not move pixel_x/y's register number without updating it.

Additionally, it turns out we had a couple of other UW values that weren't
getting this treatment (like gl_SampleID), so this more general fix is
probably a good idea (though I wasn't able to replicate problems with
either pixel_[xy]'s values or gl_SampleID, even when telling the register
allocator to reuse registers immediately)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 66f5c8df06)
2014-07-17 15:59:00 -07:00
Kenneth Graunke
64ff84abae i965/fs: Use WE_all for gl_SampleID header register munging.
This code should execute without regard to the currently executing
channels.  Asking for gl_SampleID inside control flow might break in
strange ways.  It appears to break even at the top of the program in
SIMD16 mode occasionally as well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 6dc9e4e22a19108057162d9d8f8c7d559545f8de)
2014-07-17 15:59:00 -07:00
Matt Turner
8f4e03c397 i965/fs: Don't use brw_imm_* unnecessarily.
Using brw_imm_* creates a source with file=HW_REG, and the scheduler
inserts barrier dependencies when it sees HW_REG. None of these are
hardware-registers in the sense that they're special and scheduling
shouldn't touch them. A few of the modified cases already have HW_REGs
for other sources, so it won't allow extra flexibility in some cases.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c938be8ad2)
(This patch was cherry-picked to make the next commit apply cleanly.)
2014-07-17 15:59:00 -07:00
Kenneth Graunke
258f35441a i965/fs: Set force_uncompressed and force_sechalf on samplepos setup.
gen8_fs_generator uses these to decide whether to set the execution size
to 8 or 16, so we incorrectly made both of these MOVs the full width in
SIMD16 shaders.  (It happened to work out on Gen4-7.)

Setting them should also help inform optimization passes what's really
going on, which could help avoid bugs.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2014-07-17 15:59:00 -07:00
Kenneth Graunke
cb04294b42 i965: Set execution size to 8 for instructions with force_sechalf set.
Both inst->force_uncompressed and inst->force_sechalf mean that the
generated instruction should be uncompressed and have an execution size
of 8.  We don't require the visitor to set both flags - setting
inst->force_sechalf by itself is supposed to be enough.

On Gen4-7, guess_execution_size() demoted instructions to 8-wide based
on the default compression state.  On Gen8+, we instead set a default
execution size, which worked great...except that we forgot to check
inst->force_sechalf when deciding whether to use 8 or 16.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 1c62126612752f6eedb66f705cc3ff1e11beea5d)
2014-07-17 15:59:00 -07:00
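
The rule described in cb04294b42 can be modelled in a few lines. This is a Python sketch, not the driver's C++; the function name and its arguments are illustrative assumptions. It only shows that either flag on its own has to select an execution size of 8.

```python
def execution_size(dispatch_width, force_uncompressed=False, force_sechalf=False):
    """Pick the execution size for one instruction (hypothetical helper)."""
    # Either flag alone means "uncompressed, 8-wide"; checking only
    # force_uncompressed is the omission the commit message describes.
    if force_uncompressed or force_sechalf:
        return 8
    # Otherwise follow the shader's dispatch width (8 for SIMD8, 16 for SIMD16).
    return dispatch_width
```

With this rule, a SIMD16 shader still gets 8-wide instructions whenever only force_sechalf is set.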
Matt Turner
d389a863f2 i965/vec4: Constant propagate into 2-src math instructions on Gen8.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 7192207de1)
2014-07-17 15:59:00 -07:00
Matt Turner
7fcfdfb17b i965/fs: Constant propagate into 2-src math instructions on Gen8.
total instructions in shared programs: 1878133 -> 1876986 (-0.06%)
instructions in affected programs:     153007 -> 151860 (-0.75%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 038eb649b3)
2014-07-17 15:59:00 -07:00
Matt Turner
8612a12a62 i965/fs: Make try_constant_propagate() static.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit aca4a951ea)
2014-07-17 15:59:00 -07:00
Matt Turner
8f787d3ca2 i965/fs: Don't fix_math_operand() on Gen >= 8.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 48f1143c64)
2014-07-17 15:59:00 -07:00
Matt Turner
b323fa8957 i965/vec4: Don't fix_math_operand() on Gen >= 8.
The emit_math?_gen? functions serve to implement workarounds for the
math instruction, none of which exist on Gen8+.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit b24e1cc604)
2014-07-17 15:59:00 -07:00
Matt Turner
d5d94598cb i965/vec4: Don't return void from a void function.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0e800dfe75)
2014-07-17 15:59:00 -07:00
Kenneth Graunke
2efd0a3479 i965: Don't copy propagate abs into Broadwell logic instructions.
It's not clear what abs on logical instructions means on Broadwell, and
it doesn't appear to do anything sensible.

Fixes 270 Piglit tests (the bitand/bitor/bitxor tests with abs).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81157
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a2de656278)
2014-07-17 15:59:00 -07:00
Abdiel Janulgue
1a832e5846 i965/vec4: skip copy-propagate for logical instructions with negated src entries
The negation source modifier on src registers has changed meaning in Broadwell when
used with logical operations. Don't copy propagate when negate src modifier is set
and when the destination instruction is a logical op.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit c17db7537f)
2014-07-17 15:59:00 -07:00
Abdiel Janulgue
d55a897929 i965/fs: skip copy-propagate for logical instructions with negated src entries
The negation source modifier on src registers has changed meaning in Broadwell when
used with logical operations. Don't copy propagate when negate src modifier is set
and when the destination instruction is a logical op.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit 609d00e13e)
2014-07-17 15:59:00 -07:00
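
The guard added by 1a832e5846 and d55a897929 amounts to a small predicate. The Python sketch below is a model only: the opcode names, the function name, and the use of "gen >= 8" to stand for Broadwell are assumptions, not the driver's actual code.

```python
# Illustrative stand-ins for the logical opcodes.
LOGICAL_OPS = {"AND", "OR", "XOR", "NOT"}

def can_copy_propagate(opcode, src_negate, gen):
    """Return False when propagation would carry a negate source modifier
    into a logical instruction on Gen8 or later (hypothetical helper)."""
    if gen >= 8 and opcode in LOGICAL_OPS and src_negate:
        return False
    return True
```

Everything else, including negated sources on arithmetic instructions, is still eligible for propagation in this model.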
Abdiel Janulgue
276c6bb369 i965/fs: Refactor check for potential copy propagated instructions.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit a66660d2b7)
2014-07-17 15:59:00 -07:00
Marek Olšák
0273f22a10 radeonsi: add support for TXB2
This is needed by latest fixes for samplerCubeShadow with bias.
Otherwise, a crash occurs.
2014-07-17 14:29:10 -07:00
Marek Olšák
e731031372 radeonsi: fix samplerCubeShadow with bias
Pack the depth value before overwriting it with cube coordinates.

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit b279f0143f)

Conflicts:
	src/gallium/drivers/radeonsi/si_shader.c
2014-07-14 11:46:43 -07:00
Marek Olšák
906727dccb st/mesa: fix samplerCubeShadow with bias
It has 5 coordinates: (x,y,z,depth,lodbias)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit a11fff329e)
2014-07-14 11:43:22 -07:00
Brian Paul
d69c9114df gallium/u_blitter: fix some shader memory leaks
The _msaa shaders weren't getting freed.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 378fa34c7b)
2014-07-10 12:59:50 -07:00
Brian Paul
9b062d2020 st/mesa: fix geometry shader memory leak
Spotted by Charmaine Lee.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit d10204930f)
2014-07-10 12:59:34 -07:00
Brian Paul
37005cafa4 mesa: fix geometry shader memory leaks
Spotted by Charmaine Lee.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 176b64b811)
2014-07-10 12:59:19 -07:00
Marek Olšák
abe859d56e gallium: fix u_default_transfer_inline_write for textures
This doesn't fix any known issue. In fact, radeon drivers ignore all
the discard flags for textures and implicitly do "discard range"
for any write transfer.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit fe6be9926f)
2014-07-10 12:58:56 -07:00
Ilia Mirkin
1e6620997f nvc0/ir: use manual TXD when offsets are involved
Something about how we're implementing offsets for TXD is wrong, just
flip to the generic quadop-based implementation in that case.

This is the minimal fix appropriate for backporting.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 114d46829d)
2014-07-10 12:58:28 -07:00
Ilia Mirkin
9fd133747b nvc0/ir: do quadops on the right texture coordinates for TXD
handleTEX moves the layer as the first argument. This makes sure that
the quadops deal with the texture coordinates.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit afea9bae67)
2014-07-10 12:58:05 -07:00
Ilia Mirkin
5e1bfed1ca nv50/ir: ignore bias for samplerCubeShadow on nv50
Unfortunately there's no good way to do this on the nv50 shader isa.
Dropping the bias seems preferable to doing the compare post-filtering.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1065aa92f4)
2014-07-10 12:57:43 -07:00
Ilia Mirkin
0618881c82 nv50/ir: retrieve shadow compare from first arg
This can only happen with texture(samplerCubeShadow, bias), where the
compare will be in the first argument.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 30d91e0eec)
2014-07-10 12:57:17 -07:00
Carl Worth
d00d73d1e1 docs: Add sha256 checksums for the 10.2.3 release
This was not possible until the previous commit was complete, used for
building archives, and then tagged.
2014-07-07 16:17:21 -07:00
Carl Worth
33cb9f9503 docs: Add release notes for the 10.2.3 release.
Which is imminent.
2014-07-07 16:12:42 -07:00
Rob Clark
0186858227 freedreno/a3xx: vtx formats
Add support for more vertex buffer formats.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 06e9536e5f)

Squashed with:

freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit ba6a490bbc)
2014-07-07 16:09:36 -07:00
Rob Clark
b20c82f74c freedreno: fix for null textures
Some apps seem to give us a null sampler/view for texture slots which
come before the last used texture slot.  In particular 0ad triggers
this.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 6aeeb706d2)
2014-07-07 16:09:36 -07:00
Rob Clark
8f77fbb6af freedreno/a3xx: texture fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit aa78c4586d)

Squashed with:

freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 2456be63e9)
2014-07-07 16:09:21 -07:00
Rob Clark
afcb63802f freedreno: few caps fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>

(cherry picked from commit 286863939f)
2014-07-07 16:09:21 -07:00
Rob Clark
8b2d1068b5 freedreno/a3xx: fix blend opcode
Seems the opcodes are slightly different from a2xx.  Resync headers and
move blend_func() helper into hw generation specific code.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit a4d229b099)
2014-07-07 16:09:20 -07:00
Rob Clark
f96e3e5351 freedreno/a3xx: fix depth/stencil gmem restore
We already multiply by bytes per pixel for this, so f3ba7611 broke
mem2gmem for depth/stencil.  Drop the now-redundant multiply by cpp.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit b81de5352d)
2014-07-07 16:09:20 -07:00
Rob Clark
55b6821a9f freedreno/a3xx: fix depth/stencil GMEM positioning
In cases where there was no color buf bound, there were inconsistencies
in register settings related to position of depth/stencil inside GMEM.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit f3ba761129)

Squashed with:

freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 4da8267c36)
2014-07-07 16:09:00 -07:00
Rob Clark
a1b7c7d88e freedreno: use OUT_RELOCW when buffer is written
These aren't buffers we ever read back from CPU, so using incorrect
reloc fxn wasn't really harming anything.  But might as well be correct.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 0d54904c04)
2014-07-03 21:26:09 -07:00
Rob Clark
ff9cea8776 xa: fix segfault
Fixes:

  Program received signal SIGSEGV, Segmentation fault.
  bind_samplers (comp=0x21b054, comp=0x21b054, ctx=0x211430)
      at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:445
  445						mask_pic->srf->tex->format);
  (gdb) bt
  #0  bind_samplers (comp=0x21b054, comp=0x21b054, ctx=0x211430)
      at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:445
  #1  xa_composite_prepare (ctx=0x211430, comp=comp@entry=0x21b054)
      at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:488
  #2  0xb6f454b4 in XAPrepareComposite (op=<optimized out>, pSrcPicture=<optimized out>,
      pMaskPicture=<optimized out>, pDstPicture=<optimized out>, pSrc=0x5b3ad8, pMask=0x0,
      pDst=0x5923b8) at msm-exa-xa.c:533

We can't yet handle solid fill mask, so explicitly reject that, rather
than segfaulting.  Otherwise DDX would need to check XA version to see
if solid fill mask were supported.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit b7e7ae9f60)
2014-07-03 21:25:50 -07:00
Ilia Mirkin
95ff8c6f18 nvc0: add a memory barrier when there are persistent UBOs
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9a37eb8adb)
2014-07-03 20:04:56 -07:00
Ilia Mirkin
e11b3f8fbc nv50: do an explicit flush on draw when there are persistent buffers
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5d4f5218bb)
2014-07-03 20:04:12 -07:00
Ilia Mirkin
da80e6a1c4 nv50: disable dedicated ubo upload method
The hardware allows multiple simultaneous renders with the same
memory-backed constbufs but with each invocation having different
values. However in order for that to work, the data has to be streamed
in via the right constbuf slot. We weren't doing that for UBOs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b2b7c65122)
2014-07-03 20:03:06 -07:00
Aaron Watry
5ba1cf1893 radeon/llvm: Allocate space for kernel metadata operands
Previously, we were assuming that kernel metadata nodes only had 1 operand.

Kernels which have attributes can have more than 1, e.g.:
!0 = metadata !{void (i32 addrspace(1)*)* @testKernel, metadata !1}
!1 = metadata !{metadata !"work_group_size_hint", i32 4, i32 1, i32 1}

Attempting to get the kernel without the correct number of attributes led
to memory corruption and luxrays crashing out.

Fixes the cl/program/execute/attributes.cl piglit test.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76223
CC: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 824197efd5)
2014-07-03 20:02:17 -07:00
Thomas Hellstrom
ff02e7995c st/xa: Don't close the drm fd on failure v2
If XA fails to initialize with pipe_loader enabled, the pipe_loader's
cleanup function will close the drm file descriptor. That's pretty bad
because the file descriptor will probably be the X server driver's only
connection to drm. Temporarily solve this by dup()'ing the file descriptor
before handing it over to the pipe loader.

This fixes freedesktop.org bugzilla bug #80645.

v2: Fix CC addresses.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
(cherry picked from commit 35cf3831d7)

Conflicts:
	src/gallium/state_trackers/xa/xa_tracker.c
2014-07-03 18:42:46 -07:00
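
The dup() workaround in ff02e7995c can be illustrated with ordinary file descriptors. The Python sketch below is a model under stated assumptions: failing_loader is a hypothetical stand-in for a loader whose cleanup path closes the descriptor it was handed, and none of the names correspond to the XA or pipe_loader API.

```python
import os

def failing_loader(fd):
    # Hypothetical loader: its cleanup closes the descriptor it was given,
    # then initialization reports failure.
    os.close(fd)
    raise RuntimeError("initialization failed")

def init_with_dup(fd, loader):
    """Hand the loader a duplicate so the caller's descriptor survives failure."""
    dup_fd = os.dup(fd)
    try:
        return loader(dup_fd)
    except RuntimeError:
        return None

def fd_is_open(fd):
    """Report whether fd still refers to an open file description."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False
```

Because only the duplicate is closed on failure, the caller's original descriptor (the X server driver's connection to drm, in the commit's scenario) stays usable.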
Michel Dänzer
ee4274c393 radeon/llvm: Use the llvm.rsq.clamped intrinsic for RSQ
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>

https://bugs.freedesktop.org/show_bug.cgi?id=80015

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b9f501bc6b)

Squashed together with the earlier:

radeon/llvm: Adapt to AMDGPU.rsq intrinsic change in LLVM 3.5

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 93b6b1fa83)
2014-07-03 18:39:43 -07:00
Carl Worth
7b21ee08db cherry-ignore: Add a patch that's been rejected
It may be that the patch is just fine, but at the very least it needs a better
commit message.
2014-07-03 18:36:06 -07:00
Kenneth Graunke
f9718e4b93 i965/disasm: Fix INTEL_DEBUG=fs on Broadwell for ARB_fp applications.
Apparently INTEL_DEBUG=fs has crashed on Broadwell for anything using
ARB_fragment_program since commit 9cee3ff5.  We need to NULL-check the
right field.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c60a4ba7e3)

Conflicts:
	src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
2014-07-03 18:35:54 -07:00
Jasper St. Pierre
3ca2119593 glxext: Send the Drawable's ID in the GLX_BufferSwapComplete event
While the official INTEL_swap_event specification says that the drawable
field should contain the GLXDrawable, not the Drawable, the existing
DRI2 code in dri2.c that translates from DRI2_BufferSwapComplete sends out
GLX_BufferSwapComplete with the Drawable's ID, so existing codebases
like Clutter/Cogl rely on getting the Drawable.

Match DRI2's error here and stuff the event with the X Drawable, not
the GLX drawable.

This fixes apps seeing wrong drawables through an indirect GLX context
or with DRI3, which uses the GLX_BufferSwapComplete event directly on
the wire instead of translating Present events in mesa.

At the same time, also modify the structure for the event to make sure
that clients don't make the same mistake. This is not an API or ABI
break, as GLXDrawable and Drawable are both typedefs for XID.

Signed-off-by: Jasper St. Pierre <jstpierre@mecheye.net>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b4dcf87f34)
2014-07-03 18:03:11 -07:00
Kenneth Graunke
ad79d7e987 i965: Include marketing names for Broadwell GPUs.
Intel would like us to include the marketing names.  Developers
additionally want "Broadwell GT1/2/3" because it makes it easier
to identify what hardware users have when they request assistance
or report issues.

Including both makes it easy for everyone to map between the names.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 05126b9bb5)
2014-07-03 18:02:00 -07:00
Takashi Iwai
9bd6dc9371 llvmpipe: Fix zero-division in llvmpipe_texture_layout()
Fix the crash of "gnome-control-center info" invocation on QEMU where
zero height is passed at init.

(sroland: simplify logic by eliminating the div altogether, using 64bit mul.)

Fixes: https://bugzilla.novell.com/show_bug.cgi?id=879462

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6b8b17153a)
2014-07-03 18:01:14 -07:00
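
The note about replacing the division with a 64-bit multiply in 9bd6dc9371 can be sketched as follows. This is a Python model, not llvmpipe's code: the limit value, the function names, and the choice of height as the divisor are assumptions, since the commit message does not say which quantity was divided by.

```python
MAX_TEXTURE_SIZE = 1 << 30      # hypothetical size limit in bytes
U64_MASK = (1 << 64) - 1        # emulate a 64-bit unsigned product

def layout_fits_div(row_stride, height):
    # Division-based check: raises ZeroDivisionError when height is 0.
    return row_stride <= MAX_TEXTURE_SIZE // height

def layout_fits_mul(row_stride, height):
    # Multiply in 64 bits instead; a zero height is simply a zero-sized image.
    total = (row_stride * height) & U64_MASK
    return total <= MAX_TEXTURE_SIZE
```

For positive inputs the two checks agree, so the multiply form only changes behaviour in the zero case that used to crash.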
Ilia Mirkin
89e3b89796 nouveau: dup fd before passing it to device
nouveau screens are reused for the same device node. However in the
scenario where we create screen 1, screen 2, and then delete screen 1,
the surrounding code might also close the original device node. To
protect against this, dup the fd and use the dup'd fd in the
nouveau_device. Also tell the nouveau_device that it is the owner of the
fd so that it will be closed on destruction.

Also make sure to free the nouveau_device in case of any failure.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79823
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@ubuntu.com>
(cherry picked from commit a59f2bb17b)
2014-07-03 18:00:37 -07:00
Tobias Klausmann
bcff69f18f nv50/ir: allow gl_ViewportIndex to work on non-provoking vertices
Previously, if we had something like:

  gl_ViewportIndex = idx;
  for(int i = 0; i < gl_in.length(); i++) {
     gl_Position = gl_in[i].gl_Position;
     EmitVertex();
  }
  EndPrimitive();

The right viewport index would not be set on the primitive because the
last vertex is the provoking one. However blob drivers appear to move
the gl_ViewportIndex write into the for loop, allowing the application
to be ignorant of this detail.

While the application is technically wrong here, because the blob does
it and other drivers appear to implicitly work this way as well, we add
a buffer register that viewport index writes go into, which is then
exported before every EmitVertex() call.

This fixes the remaining piglit tests in ARB_viewport_array for nv50/nvc0.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 98a86f61a8)
2014-07-03 17:46:01 -07:00
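
The buffering scheme in bcff69f18f can be modelled with a toy geometry shader. The Python sketch below is an assumption-laden illustration: the class, its fields, and the choice to reset the per-vertex output to 0 after each emit are hypothetical and only stand in for "outputs do not persist across EmitVertex()".

```python
class GeometryShaderModel:
    """Toy model of viewport index handling around EmitVertex()."""

    def __init__(self, buffered):
        self.buffered = buffered   # True models the behaviour after the fix
        self.buffer_reg = 0        # the "buffer register" from the commit message
        self.current = 0           # per-vertex output value
        self.emitted = []          # viewport index exported with each vertex

    def write_viewport_index(self, idx):
        if self.buffered:
            self.buffer_reg = idx
        else:
            self.current = idx

    def emit_vertex(self):
        if self.buffered:
            # Export the buffered value before every EmitVertex().
            self.current = self.buffer_reg
        self.emitted.append(self.current)
        self.current = 0           # assumption: output does not persist

def provoking_viewport_index(model, vertex_count, idx):
    """Write the index once, emit the vertices, return the last vertex's value."""
    model.write_viewport_index(idx)
    for _ in range(vertex_count):
        model.emit_vertex()
    return model.emitted[-1]       # the last vertex is the provoking one
```

In this model the unbuffered variant loses the index by the time the provoking vertex is emitted, while the buffered variant carries it to every vertex.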
Roland Scheidegger
6ae4aff303 draw: (trivial) fix clamping of viewport index
The old logic would let all negative values go through unclamped, with
potentially disastrous results (probably trying to fetch viewport values
from random memory locations). GL has undefined rendering for vp indices
outside valid range but that's a bit too undefined...
(The logic is now the same as in llvmpipe.)

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 604e54de78)
2014-07-03 17:44:50 -07:00
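
The difference between the old and new clamping in 6ae4aff303 is easy to see in a sketch. The Python below is illustrative: the limit of 16 and the fallback to index 0 are assumptions, and the "old" function is only one plausible form of a check that lets negative values through.

```python
MAX_VIEWPORTS = 16  # hypothetical limit

def clamp_viewport_index_old(idx):
    # Only the upper bound is checked, so negative values pass unclamped.
    return idx if idx < MAX_VIEWPORTS else 0

def clamp_viewport_index(idx):
    # Check both bounds; anything out of range falls back to viewport 0.
    return idx if 0 <= idx < MAX_VIEWPORTS else 0
```

A negative index returned by the old form would be used to address the viewport array directly, which is the out-of-bounds fetch the commit message warns about.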
Kenneth Graunke
500849f9cf i965: Don't emit SURFACE_STATEs for gather workarounds on Broadwell.
As far as I can tell, Broadwell doesn't need any of the SURFACE_STATE
workarounds for textureGather() bugs, so there's no need to emit
a second set of identical copies.

To keep things simple, just point the gather surface index base to the
same place as the texture surface index base.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f6a99d1167)
2014-07-03 17:44:43 -07:00
Carl Worth
05add05438 docs: Add sha256 sums for the 10.2.2 release
Which, of course, we couldn't do until making the release files and tagging
the release.
2014-06-24 21:45:41 -07:00
Carl Worth
623e68fb1b docs: Add release notes for 10.2.2 release
Which is ready to go.
2014-06-24 21:30:02 -07:00
Carl Worth
a9750ff7b5 Update VERSION to 10.2.2
In preparation for the 10.2.2 release.
2014-06-24 21:26:25 -07:00
Ville Syrjälä
274be620a8 i915: Fix gen2 texblend setup
Fix an off by one in the texture unit walk during texblend
setup on gen2. This caused the last enabled texunit to be
skipped resulting in totally messed up texturing.

This is a regression introduced here:
 commit 1ad443ecdd
 Author: Eric Anholt <eric@anholt.net>
 Date:   Wed Apr 23 15:35:27 2014 -0700

    i915: Redo texture unit walking on i830.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit ca55a1aaa7)
2014-06-23 15:04:35 -07:00
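
An off-by-one of the kind 274be620a8 describes can be shown generically. The Python sketch below is not the i915 code; the commit message does not give the exact loop, so the helpers and the loop bounds are assumptions chosen to reproduce the symptom (the last enabled unit being skipped).

```python
def last_enabled_unit(enabled):
    """Index of the highest enabled unit, or -1 if none (hypothetical helper)."""
    last = -1
    for i, on in enumerate(enabled):
        if on:
            last = i
    return last

def walk_units_buggy(enabled):
    # Stops one short: range(last) excludes the last enabled unit itself.
    return [i for i in range(last_enabled_unit(enabled)) if enabled[i]]

def walk_units_fixed(enabled):
    # Inclusive upper bound, so the last enabled unit is visited too.
    return [i for i in range(last_enabled_unit(enabled) + 1) if enabled[i]]
```

With units 0 and 2 enabled, the buggy walk visits only unit 0 and the fixed walk visits both.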
Kenneth Graunke
5751b661ad i965: Save meta stencil blit programs in the context.
When the last context in a share group is destroyed, the hash table
containing all of the shader programs (ctx->Shared->ShaderObjects) is
destroyed, throwing away all of the shader programs.

Using a static variable to store program IDs ends up holding on to them
after this, so we think we still have a compiled program, when it
actually got destroyed.  _mesa_UseProgram then hits GL errors, since no
program by that ID exists.

Instead, store the program IDs in the context, so we know to recompile
if our context gets destroyed and the application creates another one.

Fixes es3conform tests when run without -minfmt (where it creates
separate contexts for testing each visual).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77865
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a20994d616)

Conflicts:
	src/mesa/drivers/dri/i965/brw_meta_stencil_blit.c
2014-06-23 15:04:11 -07:00
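
The stale-ID problem fixed by 5751b661ad can be reproduced in miniature. The Python sketch below is a model, not Mesa's code: SharedState, Context, and both lookup functions are hypothetical names standing in for the share group's program table and the two caching strategies.

```python
class SharedState:
    """Stand-in for the share group's table of shader programs."""

    def __init__(self):
        self.programs = {}
        self.next_id = 1

    def compile(self, source):
        prog_id = self.next_id
        self.next_id += 1
        self.programs[prog_id] = source
        return prog_id

class Context:
    def __init__(self, shared):
        self.shared = shared
        self.blit_program = None   # cached per context

_static_blit_program = None        # problematic pattern: outlives every context

def get_blit_program_static(ctx):
    """Cache the program ID in a module-level variable."""
    global _static_blit_program
    if _static_blit_program is None:
        _static_blit_program = ctx.shared.compile("blit")
    return _static_blit_program

def get_blit_program(ctx):
    """Cache the program ID in the context, so a new context recompiles."""
    if ctx.blit_program is None:
        ctx.blit_program = ctx.shared.compile("blit")
    return ctx.blit_program
```

Once the first share group is gone, the static variant keeps returning an ID that the new program table has never seen, while the per-context variant compiles again.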
Daniel Manjarres
282ca8ba98 glx: Don't crash on swap event for a Window (non-GLXWindow)
Prior to GLX 1.3 there was the glxMakeCurrent() function that took a
single drawable handle. The Drawable could be either a bare XID for a
Window or an XID for a glxpixmap.

GLX 1.3 added glxMakeContextCurrent that takes 2 handles: one for
reading, one for writing. Nowadays the old glxMakeCurrent call is
implemented as a call to glxMakeContextCurrent with the single handle
duplicated.

Because of this it is allowed to use a plain-old Window ID as an
argument to glxMakeContextCurrent, although nobody really documents this
sort of thing. The manpage for the NEW call specifies the arguments as
GLXPixmaps, but the actual code accepts Window XIDs too, and handles
them correctly.

Similarly, the glxSelectEvents function can also take a bare Window XID.

The "piglit" tests all use GLXWindows and/or GLXPixmaps. You never
tested swap events with a bare Window XID. That is what my app was
doing.

The swap_events code worked with Window XIDs in mesa 7.x.y. The new code
added in versions 8, 9, and 10 assumes that all buffer swap events have
a GLXPixmap associated with them. Because of the historical quirks
above, this is not true. Swap events for bare Window XIDs do NOT have a
glxpixmap resulting in a segfault.

Any app that uses the old school glxMakeCurrent call with a Window XID
while trying to use swap_events will crash when the libs try to look up
the nonexistent GLXPixmap associated with the incoming swap event.

I believe that the people who wrote the spec overlooked this, because
the "sbc" field comes from the OML_sync extension that is defined in
terms of glxpixmaps only.

v2 (idr): Formatting changes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54372
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 86bd2196b4)
2014-06-23 15:01:16 -07:00
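
The crash described in 282ca8ba98 comes down to a lookup that can legitimately fail. The Python sketch below is a toy model: the table layout, the event fields, and the decision to skip the bookkeeping when no record exists are assumptions, since the commit message does not spell out what the fixed code does in that case.

```python
def handle_swap_event(glx_drawables, event):
    """Update swap bookkeeping for a swap-complete event (hypothetical helper)."""
    record = glx_drawables.get(event["drawable"])
    if record is None:
        # A bare Window XID has no GLX drawable record; skip the update
        # instead of dereferencing a missing entry.
        return False
    record["last_sbc"] = event["sbc"]
    return True
```

Events for known GLX drawables are processed as before; events for plain Window XIDs are tolerated rather than fatal.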
Iago Toral Quiroga
c50fa76c7e mesa: Copy Geom.UsesEndPrimitive when cloning a geometry program.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 96a95f48ea)
2014-06-23 15:01:06 -07:00
Tom Stellard
ad9264366a clover: Don't use llvm's global context
An LLVMContext should only be accessed by a single thread, and using the global
context was causing crashes in multi-threaded environments.  Now we use
a separate context for each compile.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4aa128a123)
2014-06-23 15:00:49 -07:00
Tom Stellard
855adad132 clover: Prevent Clang from printing number of errors and warnings to stderr.
https://bugs.freedesktop.org/show_bug.cgi?id=78581

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0cc391f013)
2014-06-23 15:00:37 -07:00
Ilia Mirkin
3568cf8128 nv30: hack to avoid errors on unexpected color/zeta combinations
This is just a hack, it should be possible to create a temporary zeta
surface and render to that instead. However that's more complicated and
this avoids the render being entirely broken and errors being reported
by the card.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 25182e249e)
2014-06-23 15:00:14 -07:00
Ilia Mirkin
aca2d98c35 nv30: avoid dangling references to deleted contexts
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c092c46b27)
2014-06-23 15:00:00 -07:00
Ilia Mirkin
08317fa9c4 nv30: plug some memory leaks on screen destroy and shader compile
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5af80f6268)
2014-06-23 14:59:47 -07:00
Ian Romanick
4d0c445af6 meta: Respect the driver's maximum number of draw buffers
Commit c1c1cf5f9 added infrastructure for saving and restoring draw
buffer state.  However, it universally used MAX_DRAW_BUFFERS, but many
drivers support far fewer than that limit.  For example, the radeon
and i915 drivers only support 1.  Using MAX_DRAW_BUFFERS causes meta to
generate GL errors.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80115
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org> [on Broadwell]
Tested-by: jpsinthemix@verizon.net
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cc219d1d65)
2014-06-23 14:59:28 -07:00
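The clamping described in the 4d0c445af6 message can be sketched in C roughly as follows. This is an illustration only, not the meta code: `struct driver_limits`, `draw_buffers_to_touch`, and the value 8 for `MAX_DRAW_BUFFERS` are assumptions made for this sketch.

```c
#include <assert.h>

/* Assumed compile-time upper bound, used only for this sketch. */
#define MAX_DRAW_BUFFERS 8

/* Hypothetical stand-in for the per-driver limits a context would carry. */
struct driver_limits {
    unsigned max_draw_buffers; /* e.g. 1 on the drivers named in the message */
};

/* Number of draw buffer slots that save/restore code should touch:
 * never more than the driver supports, never more than the global bound. */
static unsigned draw_buffers_to_touch(const struct driver_limits *limits)
{
    assert(limits);
    return limits->max_draw_buffers < MAX_DRAW_BUFFERS
               ? limits->max_draw_buffers
               : MAX_DRAW_BUFFERS;
}
```

Looping up to `draw_buffers_to_touch()` instead of `MAX_DRAW_BUFFERS` keeps a driver that supports a single draw buffer from ever being asked about buffer index 1 and above.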
Kristian Høgsberg
9ad103d664 mesa: Remove glClear optimization based on drawable size
A drawable size of 0x0 means that we don't have buffers for a drawable yet,
not that we have a zero-sized buffer.  Core mesa shouldn't be optimizing out
drawing based on buffer size, since the draw call could be what triggers
the driver to go and get buffers.  As discussed in the referenced bug report,
the optimization was added as part of a scatter-shot attempt to fix a
different problem.  There's no other example in mesa core of using the
buffer size in this way.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74005
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7928b946ad)
2014-06-23 14:59:06 -07:00
Grigori Goronzy
12fcbcde47 radeon/uvd: disable VC-1 simple/main on UVD 2.x
It's about as broken as on later UVD revisions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66452
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 6cd30f5d73)
2014-06-23 14:58:46 -07:00
Ilia Mirkin
d8e3158a43 nv50: make sure to mark first scissor dirty after blit
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit af05270ccf)
2014-06-23 14:58:33 -07:00
Kenneth Graunke
ef5f998b76 i965: Use 8x4 aligned rectangles for HiZ operations on Broadwell.
Like on Haswell, we need to use 8x4 aligned rectangle primitives for
hierarchical depth buffer resolves and depth clears.  See the comments
in brw_blorp.cpp's brw_hiz_op_params() constructor.  (The Broadwell
documentation confirms that this is still necessary.)

This patch makes the Broadwell code follow the same behavior as Chad and
Jordan's Gen7 BLORP code.  Based on a patch by Topi Pohjolainen.

This fixes es3conform's framebuffer_blit_functionality_scissor_blit
test, with no Piglit regressions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 49659ad90c)
2014-06-23 14:58:22 -07:00
Kenneth Graunke
31dd2a6f18 i965/vec4: Use the sampler for pull constant loads on Broadwell.
We've used the LD sampler message for pull constant loads on earlier
hardware for some time, and also were already using it for the FS on
Broadwell.  This patch makes us use it for Broadwell VS/GS as well.

I believe that when I wrote this code in 2012, we still used the data
port in some cases, and I somehow neglected to convert it while
rebasing.

Improves performance in GLBenchmark 2.7 Egypt by 416.978% +/- 2.25821%
(n = 17).  Many other applications should benefit similarly: this speeds
up uniform array access in the VS, which is commonly used for skinning
shaders, among other things.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Ben Widawsky <ben@bwidawsk.net>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5d8e246ac8)
2014-06-23 14:57:56 -07:00
Kenneth Graunke
c07485eab1 i965: Add missing newlines to a few perf_debug messages.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 847abaccc0)
2014-06-23 14:57:27 -07:00
Kenneth Graunke
3b941857ee i965: Drop Broadwell perf_debugs about missing MOCS that aren't missing.
I actually added MOCS support for these things, but forgot to delete the
corresponding perf_debug() warnings.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d053a05ef3)
2014-06-23 14:56:50 -07:00
Kenneth Graunke
6b753df1f4 i965: Add missing MOCS setup for 3DSTATE_INDEX_BUFFER on Broadwell.
Somehow I missed this when adding all of the other MOCS values.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7f256c1c70)
2014-06-23 14:56:31 -07:00
Kenneth Graunke
01a79ac679 i965/vec4: Fix dead code elimination for VGRFs of size > 1.
When faced with code such as:

    mov vgrf31.0:UD, 960D
    mov vgrf31.1:UD, vgrf30.xxxx:UD

The dead code eliminator didn't consider reg_offsets, so it decided that
the second instruction was writing to the same register as
the first one, and eliminated the first one.  But they're actually
different registers.

This fixes INTEL_DEBUG=shader_time for vertex shaders.  In the above
code, vgrf31.0 represents the offset into the shader_time buffer where
the data should be written, and vgrf31.1 represents the actual time
data.  With a completely undefined offset, results were...unexpected.

I think this is probably one of the few cases (maybe only case) where we
generate multiple MOVs to a large VGRF.  Normally, we just use them as
texturing results; the other SEND-from-GRF uses a size 1 VGRF.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79029
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d0575d98fc)
2014-06-23 14:56:11 -07:00
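The reg_offset problem described in the 01a79ac679 message can be illustrated with a small C sketch. It is not the i965 vec4 code: `struct dst_reg` and both helper functions are hypothetical names, and the sketch assumes a destination is identified by a VGRF number plus a register offset inside that VGRF.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical destination descriptor: vgrf31.1 -> nr = 31, reg_offset = 1. */
struct dst_reg {
    int nr;         /* virtual GRF number */
    int reg_offset; /* register index inside a VGRF of size > 1 */
};

/* Comparison that ignores reg_offset: vgrf31.0 and vgrf31.1 look identical,
 * so an earlier write to vgrf31.0 would wrongly be treated as overwritten. */
static bool writes_same_dst_buggy(const struct dst_reg *a,
                                  const struct dst_reg *b)
{
    assert(a && b);
    return a->nr == b->nr;
}

/* Comparison that also checks reg_offset: only a write to the very same
 * register of the VGRF counts as overwriting the earlier one. */
static bool writes_same_dst(const struct dst_reg *a, const struct dst_reg *b)
{
    assert(a && b);
    return a->nr == b->nr && a->reg_offset == b->reg_offset;
}
```

With the two MOVs from the message, the first helper reports a match and the second does not, which is the difference between eliminating the first MOV and keeping it.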
Jason Ekstrand
83be6a5517 meta_blit: properly compute texture width for the CopyTexSubImage fallback
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit ffe609cc69)
2014-06-23 14:55:48 -07:00
Emil Velikov
348125e7f7 configure: correctly autodetect xvmc/vdpau/omx
Commit e62b7d38a1 (configure: autodetect video state-trackers
when non swrast driver is present) added a check that caused
the autodetection to be omitted when we have the swrast gallium
driver. Whereas it should have skipped the VL targets when only
swrast was selected.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79907
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 816d392b58)
2014-06-23 11:47:24 -07:00
Neil Roberts
126600c918 i965: Set the fast clear color value for texture surfaces
When a multisampled texture is used for sampling the fast clear color value
needs to be programmed into the surface state. This was being left as all
zeroes so if the surface was cleared to a value other than black then it
wouldn't work properly. This doesn't matter for single-sample textures because
in that case the MCS buffer is resolved before it is used as a texture source.

https://bugs.freedesktop.org/show_bug.cgi?id=79729

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 765efeef88)
2014-06-23 11:47:04 -07:00
Kenneth Graunke
ee2035a95f i965: Invalidate live intervals when inserting Gen4 SEND workarounds.
We need to invalidate the live intervals when inserting new
instructions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 237aac39b1)
2014-06-23 11:46:43 -07:00
Kenneth Graunke
07a6f8bcab i965: Don't use the head sentinel as an fs_inst in Gen4 workaround code.
When walking backwards, we want to stop at the head sentinel, which is
where scan_inst->prev->prev == NULL, not scan_inst->prev == NULL.

Fixes random crashes, as well as valgrind errors.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ecc78eab11)
2014-06-23 11:46:17 -07:00
Michel Dänzer
1d46c58b83 configure: Only check for OpenCL without LLVM when the latter is certain
LLVM is enabled by default for some architectures, but the test was failing
before that.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2d399bb183)
2014-06-23 11:45:48 -07:00
Adrian Negreanu
f7fd6e52ec android, dricore: undefined reference to _mesa_streaming_load_memcpy
_mesa_streaming_load_memcpy is defined in main/streaming-load-memcpy.c
I'm adding it to the dricore lib

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit 357a8b6f33)
2014-06-23 11:43:11 -07:00
Adrian Negreanu
a46fa0f9de android, mesa_gen_matypes: pull in timespec POSIX definition
This fixes:
  include/c11/threads_posix.h: In function 'cnd_timedwait':
  include/c11/threads_posix.h:140:21: error: storage size of 'abs_time' isn't known

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit 6eb3888c86)
2014-06-23 11:43:02 -07:00
Adrian Negreanu
f4a19c1e2c android, egl: typo dri2_fallback_pixmap_surface -> dri2_fallback_create_pixmap_surface
I used commit bc8b07a6 as reference, and only the droid_display_vtbl had this issue.

This fixes:
src/egl/drivers/dri2/platform_android.c:641:29:
  error: 'dri2_fallback_pixmap_surface' undeclared here (not in a function)

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit 6980cae6ae)
2014-06-23 11:42:53 -07:00
Adrian Negreanu
bed18b082a android, egl: add correct drm include for libmesa_egl_dri2
Fixes:
  src/egl/drivers/dri2/platform_android.c:38:
  include/GL/internal/dri_interface.h:51:17:
    fatal error: drm.h: No such file or directory

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit 4dc5545eff)
2014-06-23 11:42:42 -07:00
Adrian Negreanu
43752c3c37 android: add src/gallium/auxiliary as include path for libmesa_dricore
This fixes:
In file included from
/home/adrian/workspace/mesa/mesa-master.git/src/mesa/vbo/vbo_exec_api.c:445:0:
/home/adrian/workspace/mesa/mesa-master.git/src/mesa/vbo/vbo_attrib_tmp.h:28:38:
fatal error: util/u_format_r11g11b10f.h: No such file or directory

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit 0048483f73)
2014-06-23 11:40:25 -07:00
Adrian Negreanu
aa03f78fc8 android: add libloader to libGLES_mesa and libmesa_egl_dri2
This fixes
  src/egl/drivers/dri2/platform_android.c:664: error: undefined reference to 'loader_set_logger'
  src/egl/drivers/dri2/platform_android.c:678: error: undefined reference to 'loader_get_driver_for_fd'

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit a49ebfab1d)
2014-06-23 11:40:14 -07:00
Adrian Negreanu
6194593661 android: adapt to the megadriver mechanism
Fixes linker error:
  ld:
  .../libmesa_dri_common_intermediates/libmesa_dri_common.a(dri_util.o):
    in function globalDriverAPI:dri_util.c(.data.rel+0x0): error:
    undefined reference to 'driDriverAPI'

As an example, you can see that mesa_dri_drivers
also uses common/libmegadriver_stub (src/mesa/drivers/dri/Makefile.am)

The _stub part might be confusing, but
it actually provides the dri-driver shared lib constructor,
megadriver_stub_init, which will later on load the real
platform dependent part and call
__driDriverGetExtensions_<platform>

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit aba0f152be)
2014-06-23 11:39:59 -07:00
Adrian Negreanu
7654120e86 add megadriver_stub_FILES
So that android part can also use $(megadriver_stub_FILES)

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
(cherry picked from commit eb3f80dbba)
2014-06-23 11:39:43 -07:00
Emil Velikov
d6d80b44c4 configure: error out when building opencl without LLVM
Cc: Tom Stellard <thomas.stellard@amd.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit 93257a56b5)
2014-06-23 11:39:28 -07:00
José Fonseca
a5d00e243c mesa/main: Prevent segfault on glGetIntegerv(GL_ATOMIC_COUNTER_BUFFER_BINDING).
A recent ApiTrace change that tries to dump more buffer state
causes Mesa from my distro (10.1.4) to segfault here.

I haven't actually confirmed this fixes it (I can't repro on master),
but it seems a good idea to be defensive here anyway.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit eb58aa9cf0)
2014-06-23 11:39:15 -07:00
Ilia Mirkin
bfff355cef gk110/ir: fix bfind emission
There is a short-immediate version as well, but it should never end up
getting used since it would have gotten folded earlier.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bd7dd3ed06)
2014-06-23 11:38:58 -07:00
Ilia Mirkin
1e1bdee5ec gk110/ir: fix emitting constbuf file index
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7a67318794)
2014-06-23 11:38:38 -07:00
Ilia Mirkin
9e50fc3812 gk110/ir: emit saturate flag on fadd when needed
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4a3a71a183)
2014-06-23 11:38:10 -07:00
Emil Velikov
8c319b3f98 targets/xa: limit the amount of exported symbols
In the presence of LLVM the final library exports every symbol from
the llvm namespace. Resolve this by using a version script (w/o the
version/name tag).

Considering that there are only ~35 symbols, explicitly list them
to minimize the chances of rogue symbols sneaking in.

v2: Conditionally include the version-script.

Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> (v1)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit a75baba2f1)
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2014-06-16 15:32:16 +02:00
Ian Romanick
70ce1031e7 docs: Add MD5 checksum, etc. for 10.2.1 release
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-06 23:28:53 -07:00
Ian Romanick
8c4845d29b docs: Add initial 10.2.1 release notes
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-06 23:20:00 -07:00
Ian Romanick
1b69ea1c6d Bump version to 10.2.1
2014-06-06 23:20:00 -07:00
Ian Romanick
c2fc9fb907 radeonsi: Fix build error introduced in 5ab9a9c
While resolving conflicts in cherry picking commit d226191, I
accidentally introduced some garbage.  Because radeonsi isn't built by
default, the problem went unnoticed by me.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reported-by: Laurent Carlier <lordheavym@gmail.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>
2014-06-06 23:19:53 -07:00
Ian Romanick
28d41e409d docs: Add MD5 checksum, etc. for 10.1 release
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-06 21:17:02 -07:00
Ian Romanick
f836ef63fd Bump version to 10.2 (final)
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-06 20:40:00 -07:00
Ilia Mirkin
99b9a0973a gk110/ir: fix slct emission
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9fef8b3d81)
2014-06-06 20:40:00 -07:00
Ilia Mirkin
d36d53b564 gk110/ir: fix interp mode emission
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d588a4919b)
2014-06-06 18:40:58 -07:00
Ilia Mirkin
283cd12933 nvc0: don't bother trying to set up compute for gk110+
The nouveau fw currently prints a bunch of errors. No point in seeing
those all the time, esp since compute doesn't really work in the first
place.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Conflicts:
	src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
(cherry picked from commit ca65fc418f)
2014-06-06 18:40:21 -07:00
Ilia Mirkin
aa8ea648f4 gk110: add in forgotten code for gk110 isa
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Conflicts:
	src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
(cherry picked from commit b9ec766bd0)
2014-06-06 18:37:07 -07:00
Ilia Mirkin
e901f40764 gk110/ir: fix ISAD emission with register args
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ed1b9e5721)
2014-06-06 18:19:45 -07:00
Ilia Mirkin
d5e47ee66b gk110/ir: fix quadon opcode emission
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6e046508a1)
2014-06-06 18:19:10 -07:00
Ilia Mirkin
932a5dadda gk110/ir: emit texbar the same way that the blob does
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 73eec47ef8)
2014-06-06 18:14:50 -07:00
Tobias Klausmann
203bc289a0 nv50/ir: clear subop when folding constant expressions
Some operations (e.g. OP_MUL/OP_MAD/OP_EXTBF) might have a subop set.
After folding, make sure that it is cleared.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3164bfc734)
2014-06-06 18:14:22 -07:00
Kenneth Graunke
11b3011805 i965: Support GL_CLAMP natively on Broadwell.
The new hardware actually supports this OpenGL 1.x feature natively,
so we can finally drop our shader workarounds.

Not many applications use GL_CLAMP, and most use it unintentionally, but
it's trivial to do right, so we should.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 221169693b)
2014-06-06 18:13:03 -07:00
Kenneth Graunke
c62bc58cce i965: Pass brw to translate_wrap_mode().
This lets us do generation checks.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7f3d64a77b)
2014-06-06 18:12:20 -07:00
Kenneth Graunke
304e80e356 i965: Fix copy and pasted values in Broadwell code.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7913b4b97b)
2014-06-06 18:11:54 -07:00
Sinclair Yeh
f4aca6868a egl: Check for NULL native_window in eglCreateWindowSurface
We have customers using NULL as a way to test the robustness of the API.
Without this check, EGL will segfault trying to dereference
dri2_surf->wl_win->private because wl_win is NULL.

This fix adds a check and sets EGL_BAD_NATIVE_WINDOW

v2: Incorporated feedback from idr - moved the check to a higher level
function.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 91ff0d4c65)
2014-06-06 18:11:30 -07:00
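The check described in the f4aca6868a message amounts to validating the native window before any platform code dereferences it. A minimal C sketch, assuming a hypothetical helper named `check_native_window` (the real check lives in Mesa's EGL entry points and is not reproduced here); the two error codes use the values from the EGL headers:

```c
#include <assert.h>
#include <stddef.h>

/* Error codes with the values defined by the EGL headers. */
#define EGL_SUCCESS           0x3000
#define EGL_BAD_NATIVE_WINDOW 0x300B

/* Hypothetical validation helper: reject a NULL native window up front,
 * so lower-level code never dereferences it. */
static int check_native_window(const void *native_window)
{
    if (native_window == NULL)
        return EGL_BAD_NATIVE_WINDOW;
    return EGL_SUCCESS;
}
```

Doing the check in the higher-level function, as the v2 note says, means every platform backend is covered by one test rather than each backend needing its own.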
Marek Olšák
5ab9a9c0cc r600g,radeonsi: don't use hardware MSAA resolve if dst is fast-cleared
It doesn't work and our docs say so too.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit d226191820)
2014-06-06 18:08:23 -07:00
Marek Olšák
ae16f443c2 r600g,radeonsi: disable fast clear if render condition is on
For some reason, CP DMA doesn't follow the predicate bit if I enable it,
so this is the only option.

This fixes piglit: spec/NV_conditional_render/clear

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit bf701a84eb)
2014-06-06 18:03:10 -07:00
José Fonseca
b8241bb3f2 mesa: Make glGetIntegerv(GL_*_ARRAY_SIZE) return GL_BGRA.
Same as b026b6bbfe, but for
COLOR_ARRAY_SIZE/SECONDARY_COLOR_ARRAY_SIZE.

Ideally we wouldn't munge the incoming state, so that we wouldn't need
to unmunge it back on glGet*.  But the array size state is copied and
referred to in many places, many of which couldn't take a GLenum like
GL_BGRA instead of a plain integer.  So just hack around on glGet*,
to ensure there is no risk of introducing regressions elsewhere.

This bug causes problems for Apitrace, resulting in wrong traces.  See
https://github.com/apitrace/apitrace/issues/261 for details.

Tested with piglit arb_vertex_array_bgra-get, which was created for this
purpose.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e3e13d6b85)
2014-06-06 17:54:32 -07:00
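The "unmunge on glGet*" approach from the b8241bb3f2 message can be sketched in C as follows. This is only an illustration under stated assumptions: `struct color_array`, its `is_bgra` flag, and `query_array_size` are hypothetical and do not mirror Mesa's actual array state; `GL_BGRA` uses the value from the OpenGL headers.

```c
#include <assert.h>

#define GL_BGRA 0x80E1 /* value from the OpenGL headers */

/* Hypothetical client-array record: internally the size stays a plain
 * component count, and a separate flag remembers that the application
 * originally passed GL_BGRA as the size. */
struct color_array {
    int size;    /* plain integer, 4 when the app specified GL_BGRA */
    int is_bgra; /* nonzero if the app specified GL_BGRA */
};

/* What a glGet*-style query reports: translate back to GL_BGRA at the
 * query boundary and leave the internal integer untouched everywhere else. */
static int query_array_size(const struct color_array *array)
{
    assert(array);
    return array->is_bgra ? GL_BGRA : array->size;
}
```

Keeping the translation at the query boundary is what lets the rest of the code keep treating the size as an ordinary integer.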
José Fonseca
224c193237 mesa/main: Make get_hash.c values constant.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 53468dee03)
2014-06-06 17:35:45 -07:00
Beren Minor
494f916125 egl/main: Fix eglMakeCurrent when releasing context from current thread.
EGL 1.4 Specification says that
eglMakeCurrent(display, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT)
can be used to release the current thread's ownership on the surfaces
and context.

Mesa's EGL implementation was only accepting the parameters when the
KHR_surfaceless_context extension is supported.

[chadv] Add quote from the EGL 1.4 spec.
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 0ca0d5743f)
2014-06-06 17:15:51 -07:00
Marek Olšák
767bc05309 Revert "glx: load dri driver with RTLD_LOCAL so dlclose never fails to unload"
This reverts commit e3cc0d90e1.

It breaks too many apps and completely breaks my desktop too.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79469

We'll probably need to re-release all stable versions after this is committed.

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0d5ec2c615)
2014-06-06 17:13:03 -07:00
Roland Scheidegger
3aaae6056e llvmpipe: fix crash when not all attachments are populated in a fb
Framebuffers can have NULL attachments for a while now. llvmpipe handled
that properly for lp_rast_shade_quads_mask but it seems the change didn't
make it to lp_rast_shade_tile.
This fixes piglit fbo-drawbuffers-none test (though I need to increase
the FB_SIZE from 32 to 256 so the tris cover some tiles fully).
https://bugs.freedesktop.org/show_bug.cgi?id=79421

Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 576868140b)
2014-06-06 17:06:55 -07:00
119 changed files with 1493 additions and 379 deletions

@@ -1 +1 @@
10.2.0-rc5
10.2.4

@@ -1,3 +1,7 @@
# The first is the change, and the second is the revert of that change.
e6967270c75a5b669152127bb7a746d55f4407a6 i965: Fix depth (array slices) computation for 1D_ARRAY render targets.
155f98d49fdc2f46c760f8214327b3804ee60079 Revert "i965: Fix depth (array slices) computation for 1D_ARRAY render targets."
# This patch didn't have enough in the commit message to convince me it
# is a bug fix, (email sent to author asking for more information).
41d759d076737f94976f5294b734dbc437a12bae

@@ -1324,7 +1324,7 @@ AM_CONDITIONAL(HAVE_OPENVG, test "x$enable_openvg" = xyes)
dnl
dnl Gallium G3DVL configuration
dnl
if test -n "$with_gallium_drivers" && ! echo "$with_gallium_drivers" | grep -q 'swrast'; then
if test -n "$with_gallium_drivers" -a "x$with_gallium_drivers" != xswrast; then
if test "x$enable_xvmc" = xauto; then
PKG_CHECK_EXISTS([xvmc], [enable_xvmc=yes], [enable_xvmc=no])
fi
@@ -1673,6 +1673,10 @@ if test "x$enable_gallium_llvm" = xyes; then
else
MESA_LLVM=0
LLVM_VERSION_INT=0
if test "x$enable_opencl" = xyes; then
AC_MSG_ERROR([cannot enable OpenCL without LLVM])
fi
fi
dnl Directory for XVMC libs

@@ -16,6 +16,20 @@
<h1>News</h1>
<h2>June 6, 2014</h2>
<p>
<a href="relnotes/10.2.1.html">Mesa 10.2.1</a> is released. This release
only fixes a build error in the radeonsi driver that was introduced between
10.2-rc5 and the 10.2 final release.
</p>
<h2>June 6, 2014</h2>
<p>
<a href="relnotes/10.2.html">Mesa 10.2</a> is released. This is a new
development release. See the release notes for more information about
the release.
</p>
<h2>April 18, 2014</h2>
<p>
<a href="relnotes/10.1.1.html">Mesa 10.1.1</a> is released.

@@ -21,6 +21,8 @@ The release notes summarize what's new or changed in each Mesa release.
</p>
<ul>
<li><a href="relnotes/10.2.1.html">10.2.1 release notes</a>
<li><a href="relnotes/10.2.html">10.2 release notes</a>
<li><a href="relnotes/10.1.1.html">10.1.1 release notes</a>
<li><a href="relnotes/10.1.html">10.1 release notes</a>
<li><a href="relnotes/10.0.5.html">10.0.5 release notes</a>

docs/relnotes/10.2.1.html Normal file
@@ -0,0 +1,61 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.2.1 Release Notes / June 6, 2014</h1>
<p>
Mesa 10.2.1 is a bug fix release which fixes bugs found since the 10.2 release.
</p>
<p>
Mesa 10.2.1 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>MD5 checksums</h2>
<pre>
96f892dae2d0bb14ac9c2113f586c909 MesaLib-10.2.1.tar.gz
093f9b5d077e5f6061dcd7b01b7aa51a MesaLib-10.2.1.tar.bz2
6ab76c1608e5deed1eb8b54c62d7a48a MesaLib-10.2.1.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>
Mesa 10.2 had a build problem in the radeonsi driver due to an error resolving
conflicts in a patch cherry-pick from master. The build error is fixed.
</p>
<h2>Changes</h2>
<p>Ian Romanick (3):</p>
<ul>
<li>docs: Add MD5 checksum, etc. for 10.1 release</li>
<li>radeonsi: Fix build error introduced in 5ab9a9c</li>
<li>Bump version to 10.2.1</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.2.2.html Normal file
@@ -0,0 +1,181 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.2.2 Release Notes / June 24, 2014</h1>
<p>
Mesa 10.2.2 is a bug fix release which fixes bugs found since the 10.2.1 release.
</p>
<p>
Mesa 10.2.2 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
38c4a40364000f89cddaa1694f6f3cfb444981d1110238ce603093585477399c MesaLib-10.2.2.tar.bz2
2af2ec8b4db624c352e961eefbcce6c8d1f86d44c5542f6f378c50e1b958d453 MesaLib-10.2.2.tar.gz
d4c0372da59367a344d62ebcdf5cf61039c9cae6925f40f2dab8f8d95cf22da9 MesaLib-10.2.2.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54372">Bug 54372</a> - GLX_INTEL_swap_event crashes driver when swapping window buffers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66452">Bug 66452</a> - JUNIPER UVD accelerated playback of WMV3 streams does not work</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74005">Bug 74005</a> - [i965 Bisected]Piglit/glx_glx-make-glxdrawable-current fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77865">Bug 77865</a> - [BDW] Many Ogles3conform framebuffer_blit cases fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78581">Bug 78581</a> - OpenCL: clBuildProgram prints error messages directly rather than storing them</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79029">Bug 79029</a> - INTEL_DEBUG=shader_time is full of lies</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79729">Bug 79729</a> - [i965] glClear on a multisample texture doesn't work</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79907">Bug 79907</a> - Mesa 10.2.1 --enable-vdpau default=auto broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80115">Bug 80115</a> - MESA_META_DRAW_BUFFERS induced GL_INVALID_VALUE errors</li>
</ul>
<h2>Changes</h2>
<p>Adrian Negreanu (8):</p>
<ul>
<li>add megadriver_stub_FILES</li>
<li>android: adapt to the megadriver mechanism</li>
<li>android: add libloader to libGLES_mesa and libmesa_egl_dri2</li>
<li>android: add src/gallium/auxiliary as include path for libmesa_dricore</li>
<li>android, egl: add correct drm include for libmesa_egl_dri2</li>
<li>android, egl: typo dri2_fallback_pixmap_surface -&gt; dri2_fallback_create_pixmap_surface</li>
<li>android, mesa_gen_matypes: pull in timespec POSIX definition</li>
<li>android, dricore: undefined reference to _mesa_streaming_load_memcpy</li>
</ul>
<p>Carl Worth (1):</p>
<ul>
<li>Update VERSION to 10.2.2</li>
</ul>
<p>Daniel Manjarres (1):</p>
<ul>
<li>glx: Don't crash on swap event for a Window (non-GLXWindow)</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>targets/xa: limit the amount of exported symbols</li>
<li>configure: error out when building opencl without LLVM</li>
<li>configure: correctly autodetect xvmc/vdpau/omx</li>
</ul>
<p>Grigori Goronzy (1):</p>
<ul>
<li>radeon/uvd: disable VC-1 simple/main on UVD 2.x</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>mesa: Copy Geom.UsesEndPrimitive when cloning a geometry program.</li>
</ul>
<p>Ian Romanick (3):</p>
<ul>
<li>docs: Add initial 10.2.1 release notes</li>
<li>docs: Add MD5 checksum, etc. for 10.2.1 release</li>
<li>meta: Respect the driver's maximum number of draw buffers</li>
</ul>
<p>Ilia Mirkin (7):</p>
<ul>
<li>gk110/ir: emit saturate flag on fadd when needed</li>
<li>gk110/ir: fix emitting constbuf file index</li>
<li>gk110/ir: fix bfind emission</li>
<li>nv50: make sure to mark first scissor dirty after blit</li>
<li>nv30: plug some memory leaks on screen destroy and shader compile</li>
<li>nv30: avoid dangling references to deleted contexts</li>
<li>nv30: hack to avoid errors on unexpected color/zeta combinations</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>meta_blit: properly compute texture width for the CopyTexSubImage fallback</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>mesa/main: Prevent sefgault on glGetIntegerv(GL_ATOMIC_COUNTER_BUFFER_BINDING).</li>
</ul>
<p>Kenneth Graunke (9):</p>
<ul>
<li>i965: Don't use the head sentinel as an fs_inst in Gen4 workaround code.</li>
<li>i965: Invalidate live intervals when inserting Gen4 SEND workarounds.</li>
<li>i965/vec4: Fix dead code elimination for VGRFs of size &gt; 1.</li>
<li>i965: Add missing MOCS setup for 3DSTATE_INDEX_BUFFER on Broadwell.</li>
<li>i965: Drop Broadwell perf_debugs about missing MOCS that aren't missing.</li>
<li>i965: Add missing newlines to a few perf_debug messages.</li>
<li>i965/vec4: Use the sampler for pull constant loads on Broadwell.</li>
<li>i965: Use 8x4 aligned rectangles for HiZ operations on Broadwell.</li>
<li>i965: Save meta stencil blit programs in the context.</li>
</ul>
<p>Kristian Høgsberg (1):</p>
<ul>
<li>mesa: Remove glClear optimization based on drawable size</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>configure: Only check for OpenCL without LLVM when the latter is certain</li>
</ul>
<p>Neil Roberts (1):</p>
<ul>
<li>i965: Set the fast clear color value for texture surfaces</li>
</ul>
<p>Tom Stellard (2):</p>
<ul>
<li>clover: Prevent Clang from printing number of errors and warnings to stderr.</li>
<li>clover: Don't use llvm's global context</li>
</ul>
<p>Ville Syrjälä (1):</p>
<ul>
<li>i915: Fix gen2 texblend setup</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.2.3.html

@@ -0,0 +1,130 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.2.3 Release Notes / July 7, 2014</h1>
<p>
Mesa 10.2.3 is a bug fix release which fixes bugs found since the 10.2.2 release.
</p>
<p>
Mesa 10.2.3 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
e482a96170c98b17d6aba0d6e4dda4b9a2e61c39587bb64ac38cadfa4aba4aeb MesaLib-10.2.3.tar.bz2
96cffacaa1c52ae659b3b0f91be2eebf5528b748934256751261fb79ea3d6636 MesaLib-10.2.3.tar.gz
82cab6ff14c8038ee39842dbdea0d447a78d119efd8d702d1497bc7c246434e9 MesaLib-10.2.3.zip
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76223">Bug 76223</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79823">Bug 79823</a> - </li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80015">Bug 80015</a> - </li>
</ul>
<h2>Changes</h2>
<p>Aaron Watry (1):</p>
<ul>
<li>radeon/llvm: Allocate space for kernel metadata operands</li>
</ul>
<p>Carl Worth (2):</p>
<ul>
<li>docs: Add sha256 sums for the 10.2.2 release</li>
<li>cherry-ignore: Add a patch that's been rejected</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>nouveau: dup fd before passing it to device</li>
<li>nv50: disable dedicated ubo upload method</li>
<li>nv50: do an explicit flush on draw when there are persistent buffers</li>
<li>nvc0: add a memory barrier when there are persistent UBOs</li>
</ul>
<p>Jasper St. Pierre (1):</p>
<ul>
<li>glxext: Send the Drawable's ID in the GLX_BufferSwapComplete event</li>
</ul>
<p>Kenneth Graunke (3):</p>
<ul>
<li>i965: Don't emit SURFACE_STATEs for gather workarounds on Broadwell.</li>
<li>i965: Include marketing names for Broadwell GPUs.</li>
<li>i965/disasm: Fix INTEL_DEBUG=fs on Broadwell for ARB_fp applications.</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>radeon/llvm: Use the llvm.rsq.clamped intrinsic for RSQ</li>
</ul>
<p>Rob Clark (9):</p>
<ul>
<li>xa: fix segfault</li>
<li>freedreno: use OUT_RELOCW when buffer is written</li>
<li>freedreno/a3xx: fix depth/stencil GMEM positioning</li>
<li>freedreno/a3xx: fix depth/stencil gmem restore</li>
<li>freedreno/a3xx: fix blend opcode</li>
<li>freedreno: few caps fixes</li>
<li>freedreno/a3xx: texture fixes</li>
<li>freedreno: fix for null textures</li>
<li>freedreno/a3xx: vtx formats</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>draw: (trivial) fix clamping of viewport index</li>
</ul>
<p>Takashi Iwai (1):</p>
<ul>
<li>llvmpipe: Fix zero-division in llvmpipe_texture_layout()</li>
</ul>
<p>Thomas Hellstrom (1):</p>
<ul>
<li>st/xa: Don't close the drm fd on failure v2</li>
</ul>
<p>Tobias Klausmann (1):</p>
<ul>
<li>nv50/ir: allow gl_ViewportIndex to work on non-provoking vertices</li>
</ul>
</div>
</body>
</html>

docs/relnotes/10.2.4.html

@@ -0,0 +1,125 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 10.2.4 Release Notes / July 18, 2014</h1>
<p>
Mesa 10.2.4 is a bug fix release which fixes bugs found since the 10.2.3 release.
</p>
<p>
Mesa 10.2.4 implements the OpenGL 3.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 3.3. OpenGL
3.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81157">Bug 81157</a> - [BDW]Piglit some spec_glsl-1.50_execution_built-in-functions* cases fail</li>
</ul>
<h2>Changes</h2>
<p>Abdiel Janulgue (3):</p>
<ul>
<li>i965/fs: Refactor check for potential copy propagated instructions.</li>
<li>i965/fs: skip copy-propate for logical instructions with negated src entries</li>
<li>i965/vec4: skip copy-propate for logical instructions with negated src entries</li>
</ul>
<p>Brian Paul (3):</p>
<ul>
<li>mesa: fix geometry shader memory leaks</li>
<li>st/mesa: fix geometry shader memory leak</li>
<li>gallium/u_blitter: fix some shader memory leaks</li>
</ul>
<p>Carl Worth (2):</p>
<ul>
<li>docs: Add sha256 checksums for the 10.2.3 release</li>
<li>Update VERSION to 10.2.4</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>i965: Generalize the pixel_x/y workaround for all UW types.</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>nv50/ir: retrieve shadow compare from first arg</li>
<li>nv50/ir: ignore bias for samplerCubeShadow on nv50</li>
<li>nvc0/ir: do quadops on the right texture coordinates for TXD</li>
<li>nvc0/ir: use manual TXD when offsets are involved</li>
</ul>
<p>Jordan Justen (1):</p>
<ul>
<li>i965: Add auxiliary surface field #defines for Broadwell.</li>
</ul>
<p>Kenneth Graunke (9):</p>
<ul>
<li>i965: Don't copy propagate abs into Broadwell logic instructions.</li>
<li>i965: Set execution size to 8 for instructions with force_sechalf set.</li>
<li>i965/fs: Set force_uncompressed and force_sechalf on samplepos setup.</li>
<li>i965/fs: Use WE_all for gl_SampleID header register munging.</li>
<li>i965: Add plumbing for Broadwell's auxiliary surface support.</li>
<li>i965: Drop SINT workaround for CMS layout on Broadwell.</li>
<li>i965: Hook up the MCS buffers in SURFACE_STATE on Broadwell.</li>
<li>i965: Add 2x MSAA support to the MCS allocation function.</li>
<li>i965: Enable compressed multisample support (CMS) on Broadwell.</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>gallium: fix u_default_transfer_inline_write for textures</li>
<li>st/mesa: fix samplerCubeShadow with bias</li>
<li>radeonsi: fix samplerCubeShadow with bias</li>
<li>radeonsi: add support for TXB2</li>
</ul>
<p>Matt Turner (8):</p>
<ul>
<li>i965/vec4: Don't return void from a void function.</li>
<li>i965/vec4: Don't fix_math_operand() on Gen &gt;= 8.</li>
<li>i965/fs: Don't fix_math_operand() on Gen &gt;= 8.</li>
<li>i965/fs: Make try_constant_propagate() static.</li>
<li>i965/fs: Constant propagate into 2-src math instructions on Gen8.</li>
<li>i965/vec4: Constant propagate into 2-src math instructions on Gen8.</li>
<li>i965/fs: Don't use brw_imm_* unnecessarily.</li>
<li>i965/fs: Set correct number of regs_written for MCS fetches.</li>
</ul>
</div>
</body>
</html>

View File

@@ -14,7 +14,7 @@
<iframe src="../contents.html"></iframe>
<div class="content">
-<h1>Mesa 10.2 Release Notes / TBD</h1>
+<h1>Mesa 10.2 Release Notes / June 6, 2014</h1>
<p>
Mesa 10.2 is a new development release.
@@ -33,7 +33,9 @@ because compatibility contexts are not supported.
<h2>MD5 checksums</h2>
<pre>
-TBD.
+c87bfb6dd5cbcf1fdef42e5ccd972581 MesaLib-10.2.0.tar.gz
+7aaba90bd7169a94ae2fe83febdec963 MesaLib-10.2.0.tar.bz2
+58b203aca15dadc25ab4d1126db1052b MesaLib-10.2.0.zip
</pre>

View File

@@ -518,7 +518,7 @@ typedef struct {
unsigned long serial; /* # of last request processed by server */
Bool send_event; /* true if this came from a SendEvent request */
Display *display; /* Display the event was read from */
-GLXDrawable drawable; /* drawable on which event was requested in event mask */
+Drawable drawable; /* drawable on which event was requested in event mask */
int event_type;
int64_t ust;
int64_t msc;

View File

@@ -91,24 +91,24 @@ CHIPSET(0x0F32, byt, "Intel(R) Bay Trail")
CHIPSET(0x0F33, byt, "Intel(R) Bay Trail")
CHIPSET(0x0157, byt, "Intel(R) Bay Trail")
CHIPSET(0x0155, byt, "Intel(R) Bay Trail")
-CHIPSET(0x1602, bdw_gt1, "Intel(R) Broadwell")
-CHIPSET(0x1606, bdw_gt1, "Intel(R) Broadwell")
-CHIPSET(0x160A, bdw_gt1, "Intel(R) Broadwell")
-CHIPSET(0x160B, bdw_gt1, "Intel(R) Broadwell")
-CHIPSET(0x160D, bdw_gt1, "Intel(R) Broadwell")
-CHIPSET(0x160E, bdw_gt1, "Intel(R) Broadwell")
-CHIPSET(0x1612, bdw_gt2, "Intel(R) Broadwell")
-CHIPSET(0x1616, bdw_gt2, "Intel(R) Broadwell")
-CHIPSET(0x161A, bdw_gt2, "Intel(R) Broadwell")
-CHIPSET(0x161B, bdw_gt2, "Intel(R) Broadwell")
-CHIPSET(0x161D, bdw_gt2, "Intel(R) Broadwell")
-CHIPSET(0x161E, bdw_gt2, "Intel(R) Broadwell")
-CHIPSET(0x1622, bdw_gt3, "Intel(R) Broadwell")
-CHIPSET(0x1626, bdw_gt3, "Intel(R) Broadwell")
-CHIPSET(0x162A, bdw_gt3, "Intel(R) Broadwell")
-CHIPSET(0x162B, bdw_gt3, "Intel(R) Broadwell")
-CHIPSET(0x162D, bdw_gt3, "Intel(R) Broadwell")
-CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell")
+CHIPSET(0x1602, bdw_gt1, "Intel(R) Broadwell GT1")
+CHIPSET(0x1606, bdw_gt1, "Intel(R) Broadwell GT1")
+CHIPSET(0x160A, bdw_gt1, "Intel(R) Broadwell GT1")
+CHIPSET(0x160B, bdw_gt1, "Intel(R) Broadwell GT1")
+CHIPSET(0x160D, bdw_gt1, "Intel(R) Broadwell GT1")
+CHIPSET(0x160E, bdw_gt1, "Intel(R) Broadwell GT1")
+CHIPSET(0x1612, bdw_gt2, "Intel(R) HD Graphics 5600 (Broadwell GT2)")
+CHIPSET(0x1616, bdw_gt2, "Intel(R) HD Graphics 5500 (Broadwell GT2)")
+CHIPSET(0x161A, bdw_gt2, "Intel(R) Broadwell GT2")
+CHIPSET(0x161B, bdw_gt2, "Intel(R) Broadwell GT2")
+CHIPSET(0x161D, bdw_gt2, "Intel(R) Broadwell GT2")
+CHIPSET(0x161E, bdw_gt2, "Intel(R) HD Graphics 5300 (Broadwell GT2)")
+CHIPSET(0x1622, bdw_gt3, "Intel(R) Iris Pro 6200 (Broadwell GT3e)")
+CHIPSET(0x1626, bdw_gt3, "Intel(R) HD Graphics 6000 (Broadwell GT3)")
+CHIPSET(0x162A, bdw_gt3, "Intel(R) Iris Pro P6300 (Broadwell GT3e)")
+CHIPSET(0x162B, bdw_gt3, "Intel(R) Iris 6100 (Broadwell GT3)")
+CHIPSET(0x162D, bdw_gt3, "Intel(R) Broadwell GT3")
+CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell GT3")
CHIPSET(0x22B0, chv, "Intel(R) Cherryview")
CHIPSET(0x22B1, chv, "Intel(R) Cherryview")
CHIPSET(0x22B2, chv, "Intel(R) Cherryview")

View File

@@ -40,8 +40,12 @@ LOCAL_C_INCLUDES := \
$(MESA_TOP)/src/mapi \
$(MESA_TOP)/src/egl/main \
$(MESA_TOP)/src/loader \
+$(DRM_TOP)/include/drm \
$(DRM_GRALLOC_TOP)
+LOCAL_STATIC_LIBRARIES := \
+libloader
LOCAL_MODULE := libmesa_egl_dri2
include $(MESA_COMMON_MK)

View File

@@ -638,7 +638,7 @@ droid_log(EGLint level, const char *msg)
static struct dri2_egl_display_vtbl droid_display_vtbl = {
.authenticate = NULL,
.create_window_surface = droid_create_window_surface,
-.create_pixmap_surface = dri2_fallback_pixmap_surface,
+.create_pixmap_surface = dri2_fallback_create_pixmap_surface,
.create_pbuffer_surface = droid_create_pbuffer_surface,
.destroy_surface = droid_destroy_surface,
.create_image = droid_create_image_khr,

View File

@@ -154,11 +154,14 @@ LOCAL_STATIC_LIBRARIES := \
libmesa_glsl \
libmesa_glsl_utils \
libmesa_gallium \
-libloader \
$(LOCAL_STATIC_LIBRARIES)
endif # MESA_BUILD_GALLIUM
+LOCAL_STATIC_LIBRARIES := \
+$(LOCAL_STATIC_LIBRARIES) \
+libloader
LOCAL_MODULE := libGLES_mesa
LOCAL_MODULE_PATH := $(TARGET_OUT_SHARED_LIBRARIES)/egl

View File

@@ -524,8 +524,12 @@ eglMakeCurrent(EGLDisplay dpy, EGLSurface draw, EGLSurface read,
if (!context && ctx != EGL_NO_CONTEXT)
RETURN_EGL_ERROR(disp, EGL_BAD_CONTEXT, EGL_FALSE);
if (!draw_surf || !read_surf) {
-/* surfaces may be NULL if surfaceless */
-if (!disp->Extensions.KHR_surfaceless_context)
+/* From the EGL 1.4 (20130211) spec:
+*
+* To release the current context without assigning a new one, set ctx
+* to EGL_NO_CONTEXT and set draw and read to EGL_NO_SURFACE.
+*/
+if (!disp->Extensions.KHR_surfaceless_context && ctx != EGL_NO_CONTEXT)
RETURN_EGL_ERROR(disp, EGL_BAD_SURFACE, EGL_FALSE);
if ((!draw_surf && draw != EGL_NO_SURFACE) ||
@@ -567,6 +571,10 @@ _eglCreateWindowSurfaceCommon(_EGLDisplay *disp, EGLConfig config,
EGLSurface ret;
_EGL_CHECK_CONFIG(disp, conf, EGL_NO_SURFACE, drv);
+if (native_window == NULL)
+RETURN_EGL_ERROR(disp, EGL_BAD_NATIVE_WINDOW, EGL_NO_SURFACE);
surf = drv->API.CreateWindowSurface(drv, disp, conf, native_window,
attrib_list);
ret = (surf) ? _eglLinkSurface(surf) : EGL_NO_SURFACE;

View File

@@ -493,7 +493,7 @@ draw_stats_clipper_primitives(struct draw_context *draw,
static INLINE unsigned
draw_clamp_viewport_idx(int idx)
{
-return ((PIPE_MAX_VIEWPORTS > idx || idx < 0) ? idx : 0);
+return ((PIPE_MAX_VIEWPORTS > idx && idx >= 0) ? idx : 0);
}
/**
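The `draw_clamp_viewport_idx` hunk above swaps `||` for `&&`. With the old predicate every negative index passed through unchanged, because `PIPE_MAX_VIEWPORTS > idx` is already true for negative values. A minimal standalone sketch of both predicates follows; `MAX_VIEWPORTS`, its value of 16, and all function names are assumptions made for illustration, not Mesa definitions.

```c
#include <assert.h>

/* Assumption: stand-in for Gallium's PIPE_MAX_VIEWPORTS; 16 is illustrative. */
#define MAX_VIEWPORTS 16

/* Old predicate: true for every negative idx, so negatives pass through. */
unsigned clamp_viewport_idx_old(int idx)
{
    return (unsigned)((MAX_VIEWPORTS > idx || idx < 0) ? idx : 0);
}

/* Fixed predicate: idx must satisfy both bounds, otherwise fall back to 0. */
unsigned clamp_viewport_idx_new(int idx)
{
    return (unsigned)((MAX_VIEWPORTS > idx && idx >= 0) ? idx : 0);
}

/* Hypothetical self-check contrasting the two versions. */
void check_clamp_viewport_idx(void)
{
    assert(clamp_viewport_idx_new(3) == 3u);
    assert(clamp_viewport_idx_new(MAX_VIEWPORTS) == 0u);
    assert(clamp_viewport_idx_new(-1) == 0u);
    /* The bug: -1 is returned as-is and wraps to a huge unsigned index. */
    assert(clamp_viewport_idx_old(-1) != 0u);
}
```

Both versions already reject indices at or above the maximum; only the handling of negative values differs.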

View File

@@ -383,6 +383,15 @@ void util_blitter_destroy(struct blitter_context *blitter)
if (ctx->fs_texfetch_stencil[i])
ctx->delete_fs_state(pipe, ctx->fs_texfetch_stencil[i]);
+if (ctx->fs_texfetch_col_msaa[i])
+ctx->delete_fs_state(pipe, ctx->fs_texfetch_col_msaa[i]);
+if (ctx->fs_texfetch_depth_msaa[i])
+ctx->delete_fs_state(pipe, ctx->fs_texfetch_depth_msaa[i]);
+if (ctx->fs_texfetch_depthstencil_msaa[i])
+ctx->delete_fs_state(pipe, ctx->fs_texfetch_depthstencil_msaa[i]);
+if (ctx->fs_texfetch_stencil_msaa[i])
+ctx->delete_fs_state(pipe, ctx->fs_texfetch_stencil_msaa[i]);
for (j = 0; j< Elements(ctx->fs_resolve[i]); j++)
for (f = 0; f < 2; f++)
if (ctx->fs_resolve[i][j][f])

View File

@@ -25,8 +25,8 @@ void u_default_transfer_inline_write( struct pipe_context *pipe,
usage |= PIPE_TRANSFER_WRITE;
/* transfer_inline_write implicitly discards the rewritten buffer range */
-/* XXX this looks very broken for non-buffer resources having more than one dim. */
-if (box->x == 0 && box->width == resource->width0) {
+if (resource->target == PIPE_BUFFER &&
+box->x == 0 && box->width == resource->width0) {
usage |= PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE;
} else {
usage |= PIPE_TRANSFER_DISCARD_RANGE;
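The hunk above restricts the whole-resource discard to buffers: for a texture, a box that spans the full width still says nothing about the other dimensions. A minimal sketch of the resulting decision follows; the flag values and the `inline_write_usage` helper are hypothetical stand-ins, not Gallium's `PIPE_TRANSFER_*` definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: illustrative flag bits, not Gallium's PIPE_TRANSFER_* values. */
enum {
    XFER_WRITE                  = 1 << 0,
    XFER_DISCARD_RANGE          = 1 << 1,
    XFER_DISCARD_WHOLE_RESOURCE = 1 << 2
};

/* Pick the discard flag the way the patched function does: only a buffer
 * written from offset 0 across its full width discards the whole resource. */
unsigned inline_write_usage(bool is_buffer, int box_x, int box_width, int width0)
{
    unsigned usage = XFER_WRITE;

    if (is_buffer && box_x == 0 && box_width == width0)
        usage |= XFER_DISCARD_WHOLE_RESOURCE;
    else
        usage |= XFER_DISCARD_RANGE;
    return usage;
}

/* Hypothetical self-check covering the buffer and texture cases. */
void check_inline_write_usage(void)
{
    assert(inline_write_usage(true, 0, 64, 64) ==
           (XFER_WRITE | XFER_DISCARD_WHOLE_RESOURCE));
    assert(inline_write_usage(false, 0, 64, 64) ==
           (XFER_WRITE | XFER_DISCARD_RANGE));
    assert(inline_write_usage(true, 8, 56, 64) ==
           (XFER_WRITE | XFER_DISCARD_RANGE));
}
```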

View File

@@ -10,11 +10,11 @@ git clone https://github.com/freedreno/envytools.git
The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno.xml ( 364 bytes, from 2013-11-30 14:47:15)
- /home/robclark/src/freedreno/envytools/rnndb/freedreno_copyright.xml ( 1453 bytes, from 2013-03-31 16:51:27)
-- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32580 bytes, from 2014-05-16 11:51:57)
-- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10186 bytes, from 2014-05-16 11:51:57)
+- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
+- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 9859 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14477 bytes, from 2014-05-16 11:51:57)
-- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 57831 bytes, from 2014-05-19 21:02:34)
-- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 26293 bytes, from 2014-05-16 11:51:57)
+- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 58020 bytes, from 2014-06-13 17:29:47)
+- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 26602 bytes, from 2014-06-13 17:28:10)
Copyright (C) 2013-2014 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)
@@ -203,6 +203,15 @@ enum a2xx_rb_copy_sample_select {
SAMPLE_0123 = 6,
};
+enum a2xx_rb_blend_opcode {
+BLEND_DST_PLUS_SRC = 0,
+BLEND_SRC_MINUS_DST = 1,
+BLEND_MIN_DST_SRC = 2,
+BLEND_MAX_DST_SRC = 3,
+BLEND_DST_MINUS_SRC = 4,
+BLEND_DST_PLUS_SRC_BIAS = 5,
+};
enum adreno_mmu_clnt_beh {
BEH_NEVR = 0,
BEH_TRAN_RNG = 1,
@@ -996,7 +1005,7 @@ static inline uint32_t A2XX_RB_BLEND_CONTROL_COLOR_SRCBLEND(enum adreno_rb_blend
}
#define A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN__MASK 0x000000e0
#define A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN__SHIFT 5
-static inline uint32_t A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN(enum adreno_rb_blend_opcode val)
+static inline uint32_t A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN(enum a2xx_rb_blend_opcode val)
{
return ((val) << A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN__SHIFT) & A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN__MASK;
}
@@ -1014,7 +1023,7 @@ static inline uint32_t A2XX_RB_BLEND_CONTROL_ALPHA_SRCBLEND(enum adreno_rb_blend
}
#define A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN__MASK 0x00e00000
#define A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN__SHIFT 21
-static inline uint32_t A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN(enum adreno_rb_blend_opcode val)
+static inline uint32_t A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN(enum a2xx_rb_blend_opcode val)
{
return ((val) << A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN__SHIFT) & A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN__MASK;
}

View File

@@ -34,6 +34,27 @@
#include "fd2_context.h"
#include "fd2_util.h"
+static enum a2xx_rb_blend_opcode
+blend_func(unsigned func)
+{
+switch (func) {
+case PIPE_BLEND_ADD:
+return BLEND_DST_PLUS_SRC;
+case PIPE_BLEND_MIN:
+return BLEND_MIN_DST_SRC;
+case PIPE_BLEND_MAX:
+return BLEND_MAX_DST_SRC;
+case PIPE_BLEND_SUBTRACT:
+return BLEND_SRC_MINUS_DST;
+case PIPE_BLEND_REVERSE_SUBTRACT:
+return BLEND_DST_MINUS_SRC;
+default:
+DBG("invalid blend func: %x", func);
+return 0;
+}
+}
void *
fd2_blend_state_create(struct pipe_context *pctx,
const struct pipe_blend_state *cso)
@@ -61,10 +82,10 @@ fd2_blend_state_create(struct pipe_context *pctx,
so->rb_blendcontrol =
A2XX_RB_BLEND_CONTROL_COLOR_SRCBLEND(fd_blend_factor(rt->rgb_src_factor)) |
-A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN(fd_blend_func(rt->rgb_func)) |
+A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN(blend_func(rt->rgb_func)) |
A2XX_RB_BLEND_CONTROL_COLOR_DESTBLEND(fd_blend_factor(rt->rgb_dst_factor)) |
A2XX_RB_BLEND_CONTROL_ALPHA_SRCBLEND(fd_blend_factor(rt->alpha_src_factor)) |
-A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN(fd_blend_func(rt->alpha_func)) |
+A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN(blend_func(rt->alpha_func)) |
A2XX_RB_BLEND_CONTROL_ALPHA_DESTBLEND(fd_blend_factor(rt->alpha_dst_factor));
if (rt->colormask & PIPE_MASK_R)

View File

@@ -10,11 +10,11 @@ git clone https://github.com/freedreno/envytools.git
The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno.xml ( 364 bytes, from 2013-11-30 14:47:15)
- /home/robclark/src/freedreno/envytools/rnndb/freedreno_copyright.xml ( 1453 bytes, from 2013-03-31 16:51:27)
-- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32580 bytes, from 2014-05-16 11:51:57)
-- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10186 bytes, from 2014-05-16 11:51:57)
+- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
+- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 9859 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14477 bytes, from 2014-05-16 11:51:57)
-- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 57831 bytes, from 2014-05-19 21:02:34)
-- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 26293 bytes, from 2014-05-16 11:51:57)
+- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 58020 bytes, from 2014-06-13 17:29:47)
+- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 26602 bytes, from 2014-06-13 17:28:10)
Copyright (C) 2013-2014 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)
@@ -186,16 +186,26 @@ enum a3xx_rop_code {
ROP_SET = 15,
};
+enum a3xx_rb_blend_opcode {
+BLEND_DST_PLUS_SRC = 0,
+BLEND_SRC_MINUS_DST = 1,
+BLEND_DST_MINUS_SRC = 2,
+BLEND_MIN_DST_SRC = 3,
+BLEND_MAX_DST_SRC = 4,
+};
enum a3xx_tex_filter {
A3XX_TEX_NEAREST = 0,
A3XX_TEX_LINEAR = 1,
+A3XX_TEX_ANISO = 2,
};
enum a3xx_tex_clamp {
A3XX_TEX_REPEAT = 0,
A3XX_TEX_CLAMP_TO_EDGE = 1,
A3XX_TEX_MIRROR_REPEAT = 2,
-A3XX_TEX_CLAMP_NONE = 3,
+A3XX_TEX_CLAMP_TO_BORDER = 3,
+A3XX_TEX_MIRROR_CLAMP = 4,
};
enum a3xx_tex_swiz {
@@ -877,7 +887,7 @@ static inline uint32_t A3XX_RB_MRT_BLEND_CONTROL_RGB_SRC_FACTOR(enum adreno_rb_b
}
#define A3XX_RB_MRT_BLEND_CONTROL_RGB_BLEND_OPCODE__MASK 0x000000e0
#define A3XX_RB_MRT_BLEND_CONTROL_RGB_BLEND_OPCODE__SHIFT 5
-static inline uint32_t A3XX_RB_MRT_BLEND_CONTROL_RGB_BLEND_OPCODE(enum adreno_rb_blend_opcode val)
+static inline uint32_t A3XX_RB_MRT_BLEND_CONTROL_RGB_BLEND_OPCODE(enum a3xx_rb_blend_opcode val)
{
return ((val) << A3XX_RB_MRT_BLEND_CONTROL_RGB_BLEND_OPCODE__SHIFT) & A3XX_RB_MRT_BLEND_CONTROL_RGB_BLEND_OPCODE__MASK;
}
@@ -895,7 +905,7 @@ static inline uint32_t A3XX_RB_MRT_BLEND_CONTROL_ALPHA_SRC_FACTOR(enum adreno_rb
}
#define A3XX_RB_MRT_BLEND_CONTROL_ALPHA_BLEND_OPCODE__MASK 0x00e00000
#define A3XX_RB_MRT_BLEND_CONTROL_ALPHA_BLEND_OPCODE__SHIFT 21
-static inline uint32_t A3XX_RB_MRT_BLEND_CONTROL_ALPHA_BLEND_OPCODE(enum adreno_rb_blend_opcode val)
+static inline uint32_t A3XX_RB_MRT_BLEND_CONTROL_ALPHA_BLEND_OPCODE(enum a3xx_rb_blend_opcode val)
{
return ((val) << A3XX_RB_MRT_BLEND_CONTROL_ALPHA_BLEND_OPCODE__SHIFT) & A3XX_RB_MRT_BLEND_CONTROL_ALPHA_BLEND_OPCODE__MASK;
}
@@ -978,6 +988,7 @@ static inline uint32_t A3XX_RB_COPY_CONTROL_MSAA_RESOLVE(enum a3xx_msaa_samples
{
return ((val) << A3XX_RB_COPY_CONTROL_MSAA_RESOLVE__SHIFT) & A3XX_RB_COPY_CONTROL_MSAA_RESOLVE__MASK;
}
+#define A3XX_RB_COPY_CONTROL_DEPTHCLEAR 0x00000008
#define A3XX_RB_COPY_CONTROL_MODE__MASK 0x00000070
#define A3XX_RB_COPY_CONTROL_MODE__SHIFT 4
static inline uint32_t A3XX_RB_COPY_CONTROL_MODE(enum adreno_rb_copy_control_mode val)
@@ -1078,7 +1089,7 @@ static inline uint32_t A3XX_RB_DEPTH_INFO_DEPTH_FORMAT(enum adreno_rb_depth_form
#define A3XX_RB_DEPTH_INFO_DEPTH_BASE__SHIFT 11
static inline uint32_t A3XX_RB_DEPTH_INFO_DEPTH_BASE(uint32_t val)
{
-return ((val >> 10) << A3XX_RB_DEPTH_INFO_DEPTH_BASE__SHIFT) & A3XX_RB_DEPTH_INFO_DEPTH_BASE__MASK;
+return ((val >> 12) << A3XX_RB_DEPTH_INFO_DEPTH_BASE__SHIFT) & A3XX_RB_DEPTH_INFO_DEPTH_BASE__MASK;
}
#define REG_A3XX_RB_DEPTH_PITCH 0x00002103
@@ -1526,6 +1537,12 @@ static inline uint32_t A3XX_VFD_DECODE_INSTR_REGID(uint32_t val)
{
return ((val) << A3XX_VFD_DECODE_INSTR_REGID__SHIFT) & A3XX_VFD_DECODE_INSTR_REGID__MASK;
}
+#define A3XX_VFD_DECODE_INSTR_SWAP__MASK 0x00c00000
+#define A3XX_VFD_DECODE_INSTR_SWAP__SHIFT 22
+static inline uint32_t A3XX_VFD_DECODE_INSTR_SWAP(enum a3xx_color_swap val)
+{
+return ((val) << A3XX_VFD_DECODE_INSTR_SWAP__SHIFT) & A3XX_VFD_DECODE_INSTR_SWAP__MASK;
+}
#define A3XX_VFD_DECODE_INSTR_SHIFTCNT__MASK 0x1f000000
#define A3XX_VFD_DECODE_INSTR_SHIFTCNT__SHIFT 24
static inline uint32_t A3XX_VFD_DECODE_INSTR_SHIFTCNT(uint32_t val)

View File

@@ -34,6 +34,27 @@
#include "fd3_context.h"
#include "fd3_util.h"
+static enum a3xx_rb_blend_opcode
+blend_func(unsigned func)
+{
+switch (func) {
+case PIPE_BLEND_ADD:
+return BLEND_DST_PLUS_SRC;
+case PIPE_BLEND_MIN:
+return BLEND_MIN_DST_SRC;
+case PIPE_BLEND_MAX:
+return BLEND_MAX_DST_SRC;
+case PIPE_BLEND_SUBTRACT:
+return BLEND_SRC_MINUS_DST;
+case PIPE_BLEND_REVERSE_SUBTRACT:
+return BLEND_DST_MINUS_SRC;
+default:
+DBG("invalid blend func: %x", func);
+return 0;
+}
+}
void *
fd3_blend_state_create(struct pipe_context *pctx,
const struct pipe_blend_state *cso)
@@ -80,10 +101,10 @@ fd3_blend_state_create(struct pipe_context *pctx,
so->rb_mrt[i].blend_control =
A3XX_RB_MRT_BLEND_CONTROL_RGB_SRC_FACTOR(fd_blend_factor(rt->rgb_src_factor)) |
-A3XX_RB_MRT_BLEND_CONTROL_RGB_BLEND_OPCODE(fd_blend_func(rt->rgb_func)) |
+A3XX_RB_MRT_BLEND_CONTROL_RGB_BLEND_OPCODE(blend_func(rt->rgb_func)) |
A3XX_RB_MRT_BLEND_CONTROL_RGB_DEST_FACTOR(fd_blend_factor(rt->rgb_dst_factor)) |
A3XX_RB_MRT_BLEND_CONTROL_ALPHA_SRC_FACTOR(fd_blend_factor(rt->alpha_src_factor)) |
-A3XX_RB_MRT_BLEND_CONTROL_ALPHA_BLEND_OPCODE(fd_blend_func(rt->alpha_func)) |
+A3XX_RB_MRT_BLEND_CONTROL_ALPHA_BLEND_OPCODE(blend_func(rt->alpha_func)) |
A3XX_RB_MRT_BLEND_CONTROL_ALPHA_DEST_FACTOR(fd_blend_factor(rt->alpha_dst_factor)) |
A3XX_RB_MRT_BLEND_CONTROL_CLAMP_ENABLE;
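The two `blend_func()` helpers in this diff look identical in source, but they resolve to different numbers because the a2xx and a3xx opcode enums order their members differently. That difference is why one shared mapping could not be right for both generations. The sketch below copies the encodings from the two enums shown in the diff; the `A2_`/`A3_` prefixes, the `blend_eq` enum, and the function names are hypothetical, introduced only so both tables fit in one file.

```c
#include <assert.h>

/* Assumption: generic blend equations standing in for PIPE_BLEND_*. */
enum blend_eq { EQ_ADD, EQ_SUBTRACT, EQ_REVERSE_SUBTRACT, EQ_MIN, EQ_MAX };

/* Encodings copied from enum a2xx_rb_blend_opcode in the diff. */
enum a2_opcode {
    A2_DST_PLUS_SRC = 0, A2_SRC_MINUS_DST = 1, A2_MIN_DST_SRC = 2,
    A2_MAX_DST_SRC = 3, A2_DST_MINUS_SRC = 4
};

/* Encodings copied from enum a3xx_rb_blend_opcode in the diff. */
enum a3_opcode {
    A3_DST_PLUS_SRC = 0, A3_SRC_MINUS_DST = 1, A3_DST_MINUS_SRC = 2,
    A3_MIN_DST_SRC = 3, A3_MAX_DST_SRC = 4
};

unsigned a2_blend_opcode(enum blend_eq eq)
{
    switch (eq) {
    case EQ_ADD:              return A2_DST_PLUS_SRC;
    case EQ_SUBTRACT:         return A2_SRC_MINUS_DST;
    case EQ_REVERSE_SUBTRACT: return A2_DST_MINUS_SRC;
    case EQ_MIN:              return A2_MIN_DST_SRC;
    case EQ_MAX:              return A2_MAX_DST_SRC;
    }
    return A2_DST_PLUS_SRC; /* same value as the diff's "return 0" fallback */
}

unsigned a3_blend_opcode(enum blend_eq eq)
{
    switch (eq) {
    case EQ_ADD:              return A3_DST_PLUS_SRC;
    case EQ_SUBTRACT:         return A3_SRC_MINUS_DST;
    case EQ_REVERSE_SUBTRACT: return A3_DST_MINUS_SRC;
    case EQ_MIN:              return A3_MIN_DST_SRC;
    case EQ_MAX:              return A3_MAX_DST_SRC;
    }
    return A3_DST_PLUS_SRC;
}

/* Hypothetical self-check: same equation, different hardware encoding. */
void check_blend_opcodes(void)
{
    assert(a2_blend_opcode(EQ_MIN) == 2 && a3_blend_opcode(EQ_MIN) == 3);
    assert(a2_blend_opcode(EQ_REVERSE_SUBTRACT) == 4);
    assert(a3_blend_opcode(EQ_REVERSE_SUBTRACT) == 2);
}
```

Add and subtract share encodings 0 and 1 on both generations; min, max, and reverse subtract are the ones that move.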

View File

@@ -195,8 +195,10 @@ emit_textures(struct fd_ringbuffer *ring,
OUT_RING(ring, CP_LOAD_STATE_1_STATE_TYPE(ST_CONSTANTS) |
CP_LOAD_STATE_1_EXT_SRC_ADDR(0));
for (i = 0; i < tex->num_textures; i++) {
-struct fd3_pipe_sampler_view *view =
-fd3_pipe_sampler_view(tex->textures[i]);
+static const struct fd3_pipe_sampler_view dummy_view = {};
+const struct fd3_pipe_sampler_view *view = tex->textures[i] ?
+fd3_pipe_sampler_view(tex->textures[i]) :
+&dummy_view;
OUT_RING(ring, view->texconst0);
OUT_RING(ring, view->texconst1);
OUT_RING(ring, view->texconst2 |
@@ -213,8 +215,10 @@ emit_textures(struct fd_ringbuffer *ring,
OUT_RING(ring, CP_LOAD_STATE_1_STATE_TYPE(ST_CONSTANTS) |
CP_LOAD_STATE_1_EXT_SRC_ADDR(0));
for (i = 0; i < tex->num_textures; i++) {
-struct fd3_pipe_sampler_view *view =
-fd3_pipe_sampler_view(tex->textures[i]);
+static const struct fd3_pipe_sampler_view dummy_view = {};
+const struct fd3_pipe_sampler_view *view = tex->textures[i] ?
+fd3_pipe_sampler_view(tex->textures[i]) :
+&dummy_view;
struct fd_resource *rsc = view->tex_resource;
for (j = 0; j < view->mipaddrs; j++) {
@@ -323,9 +327,12 @@ fd3_emit_vertex_bufs(struct fd_ringbuffer *ring,
if (vp->inputs[i].compmask) {
struct pipe_resource *prsc = vbufs[i].prsc;
struct fd_resource *rsc = fd_resource(prsc);
-enum a3xx_vtx_fmt fmt = fd3_pipe2vtx(vbufs[i].format);
+enum pipe_format pfmt = vbufs[i].format;
+enum a3xx_vtx_fmt fmt = fd3_pipe2vtx(pfmt);
bool switchnext = (i != last);
-uint32_t fs = util_format_get_blocksize(vbufs[i].format);
+uint32_t fs = util_format_get_blocksize(pfmt);
+debug_assert(fmt != ~0);
OUT_PKT0(ring, REG_A3XX_VFD_FETCH(j), 2);
OUT_RING(ring, A3XX_VFD_FETCH_INSTR_0_FETCHSIZE(fs - 1) |
@@ -339,6 +346,7 @@ fd3_emit_vertex_bufs(struct fd_ringbuffer *ring,
OUT_RING(ring, A3XX_VFD_DECODE_INSTR_CONSTFILL |
A3XX_VFD_DECODE_INSTR_WRITEMASK(vp->inputs[i].compmask) |
A3XX_VFD_DECODE_INSTR_FORMAT(fmt) |
+A3XX_VFD_DECODE_INSTR_SWAP(fd3_pipe2swap(pfmt)) |
A3XX_VFD_DECODE_INSTR_REGID(vp->inputs[i].regid) |
A3XX_VFD_DECODE_INSTR_SHIFTCNT(fs) |
A3XX_VFD_DECODE_INSTR_LASTCOMPVALID |

View File

@@ -82,7 +82,7 @@ emit_mrt(struct fd_ringbuffer *ring, unsigned nr_bufs,
stride = bin_w * rsc->cpp;
if (bases) {
-base = bases[i] * rsc->cpp;
+base = bases[i];
}
} else {
stride = slice->pitch * rsc->cpp;
@@ -106,9 +106,17 @@ emit_mrt(struct fd_ringbuffer *ring, unsigned nr_bufs,
}
static uint32_t
-depth_base(struct fd_gmem_stateobj *gmem)
+depth_base(struct fd_context *ctx)
{
-return align(gmem->bin_w * gmem->bin_h, 0x4000);
+struct fd_gmem_stateobj *gmem = &ctx->gmem;
+struct pipe_framebuffer_state *pfb = &ctx->framebuffer;
+uint32_t cpp = 4;
+if (pfb->cbufs[0]) {
+struct fd_resource *rsc =
+fd_resource(pfb->cbufs[0]->texture);
+cpp = rsc->cpp;
+}
+return align(gmem->bin_w * gmem->bin_h * cpp, 0x4000);
}
static bool
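The `depth_base()` hunk moves the bytes-per-pixel scaling inside the helper. Before the change the helper aligned a pixel count and the callers in this diff disagreed on scaling: `fd3_emit_tile_gmem2mem` multiplied the result by `rsc->cpp`, while `fd3_emit_tile_mem2gmem` passed it through unscaled. The sketch below compares the two computations for one hypothetical bin; `align_pot`, the function names, and the 96x96 bin at 4 bytes per pixel are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to a multiple of a; a must be a power of two. */
uint32_t align_pot(uint32_t v, uint32_t a)
{
    return (v + a - 1) & ~(a - 1);
}

/* Old helper: aligns a pixel count and leaves cpp scaling to each caller. */
uint32_t depth_base_old(uint32_t bin_w, uint32_t bin_h)
{
    return align_pot(bin_w * bin_h, 0x4000);
}

/* New helper: aligns the byte size of the colour bin, so every caller
 * sees the same offset. */
uint32_t depth_base_new(uint32_t bin_w, uint32_t bin_h, uint32_t cpp)
{
    return align_pot(bin_w * bin_h * cpp, 0x4000);
}

/* Hypothetical self-check for a 96x96 bin at 4 bytes per pixel. */
void check_depth_base(void)
{
    /* Old, as used unscaled by the mem2gmem caller. */
    assert(depth_base_old(96, 96) == 0x4000);
    /* Old, as scaled by cpp in the gmem2mem caller. */
    assert(depth_base_old(96, 96) * 4 == 0x10000);
    /* New: one value, aligned after scaling to bytes. */
    assert(depth_base_new(96, 96, 4) == 0xC000);
}
```

For this bin the three computations give three different offsets, which is the inconsistency the single byte-based helper removes.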
@@ -156,7 +164,7 @@ emit_binning_workaround(struct fd_context *ctx)
OUT_RING(ring, A3XX_RB_COPY_CONTROL_MSAA_RESOLVE(MSAA_ONE) |
A3XX_RB_COPY_CONTROL_MODE(0) |
A3XX_RB_COPY_CONTROL_GMEM_BASE(0));
-OUT_RELOC(ring, fd_resource(fd3_ctx->solid_vbuf)->bo, 0x20, 0, -1); /* RB_COPY_DEST_BASE */
+OUT_RELOCW(ring, fd_resource(fd3_ctx->solid_vbuf)->bo, 0x20, 0, -1); /* RB_COPY_DEST_BASE */
OUT_RING(ring, A3XX_RB_COPY_DEST_PITCH_PITCH(128));
OUT_RING(ring, A3XX_RB_COPY_DEST_INFO_TILE(LINEAR) |
A3XX_RB_COPY_DEST_INFO_FORMAT(RB_R8G8B8A8_UNORM) |
@@ -399,12 +407,7 @@ fd3_emit_tile_gmem2mem(struct fd_context *ctx, struct fd_tile *tile)
}}, 1);
if (ctx->resolve & (FD_BUFFER_DEPTH | FD_BUFFER_STENCIL)) {
-uint32_t base = 0;
-if (pfb->cbufs[0]) {
-struct fd_resource *rsc =
-fd_resource(pfb->cbufs[0]->texture);
-base = depth_base(&ctx->gmem) * rsc->cpp;
-}
+uint32_t base = depth_base(ctx);
emit_gmem2mem_surf(ctx, RB_COPY_DEPTH_STENCIL, base, pfb->zsbuf);
}
@@ -458,7 +461,7 @@ fd3_emit_tile_mem2gmem(struct fd_context *ctx, struct fd_tile *tile)
y1 = ((float)tile->yoff + bin_h) / ((float)pfb->height);
OUT_PKT3(ring, CP_MEM_WRITE, 5);
-OUT_RELOC(ring, fd_resource(fd3_ctx->blit_texcoord_vbuf)->bo, 0, 0, 0);
+OUT_RELOCW(ring, fd_resource(fd3_ctx->blit_texcoord_vbuf)->bo, 0, 0, 0);
OUT_RING(ring, fui(x0));
OUT_RING(ring, fui(y0));
OUT_RING(ring, fui(x1));
@@ -558,7 +561,7 @@ fd3_emit_tile_mem2gmem(struct fd_context *ctx, struct fd_tile *tile)
bin_h = gmem->bin_h;
if (ctx->restore & (FD_BUFFER_DEPTH | FD_BUFFER_STENCIL))
emit_mem2gmem_surf(ctx, depth_base(gmem), pfb->zsbuf, bin_w);
emit_mem2gmem_surf(ctx, depth_base(ctx), pfb->zsbuf, bin_w);
if (ctx->restore & FD_BUFFER_COLOR)
emit_mem2gmem_surf(ctx, 0, pfb->cbufs[0], bin_w);
@@ -639,7 +642,7 @@ update_vsc_pipe(struct fd_context *ctx)
int i;
OUT_PKT0(ring, REG_A3XX_VSC_SIZE_ADDRESS, 1);
OUT_RELOC(ring, fd3_ctx->vsc_size_mem, 0, 0, 0); /* VSC_SIZE_ADDRESS */
OUT_RELOCW(ring, fd3_ctx->vsc_size_mem, 0, 0, 0); /* VSC_SIZE_ADDRESS */
for (i = 0; i < 8; i++) {
struct fd_vsc_pipe *pipe = &ctx->pipe[i];
@@ -654,7 +657,7 @@ update_vsc_pipe(struct fd_context *ctx)
A3XX_VSC_PIPE_CONFIG_Y(pipe->y) |
A3XX_VSC_PIPE_CONFIG_W(pipe->w) |
A3XX_VSC_PIPE_CONFIG_H(pipe->h));
OUT_RELOC(ring, pipe->bo, 0, 0, 0); /* VSC_PIPE[i].DATA_ADDRESS */
OUT_RELOCW(ring, pipe->bo, 0, 0, 0); /* VSC_PIPE[i].DATA_ADDRESS */
OUT_RING(ring, fd_bo_size(pipe->bo) - 32); /* VSC_PIPE[i].DATA_LENGTH */
}
}
@@ -789,6 +792,7 @@ fd3_emit_tile_init(struct fd_context *ctx)
{
struct fd_ringbuffer *ring = ctx->ring;
struct fd_gmem_stateobj *gmem = &ctx->gmem;
uint32_t rb_render_control;
fd3_emit_restore(ctx);
@@ -813,8 +817,10 @@ fd3_emit_tile_init(struct fd_context *ctx)
patch_draws(ctx, IGNORE_VISIBILITY);
}
patch_rbrc(ctx, A3XX_RB_RENDER_CONTROL_ENABLE_GMEM |
A3XX_RB_RENDER_CONTROL_BIN_WIDTH(gmem->bin_w));
rb_render_control = A3XX_RB_RENDER_CONTROL_ENABLE_GMEM |
A3XX_RB_RENDER_CONTROL_BIN_WIDTH(gmem->bin_w);
patch_rbrc(ctx, rb_render_control);
}
/* before mem2gmem */
@@ -827,7 +833,7 @@ fd3_emit_tile_prep(struct fd_context *ctx, struct fd_tile *tile)
uint32_t reg;
OUT_PKT0(ring, REG_A3XX_RB_DEPTH_INFO, 2);
reg = A3XX_RB_DEPTH_INFO_DEPTH_BASE(depth_base(gmem));
reg = A3XX_RB_DEPTH_INFO_DEPTH_BASE(depth_base(ctx));
if (pfb->zsbuf) {
reg |= A3XX_RB_DEPTH_INFO_DEPTH_FORMAT(fd_pipe2depth(pfb->zsbuf->format));
}
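
Note on the depth_base() change in the hunks above: the old helper aligned only the bin area, and the bytes-per-pixel factor was applied at a single call site (fd3_emit_tile_gmem2mem), while the other callers used the unscaled value. The new helper scales by the color buffer's cpp before aligning, so every caller gets the same offset. A minimal sketch of that arithmetic, assuming a power-of-two alignment; `align_pot` and `depth_base_sketch` are hypothetical names for illustration, not driver functions:

```c
#include <stdint.h>

/* Round v up to the next multiple of a; a must be a power of two. */
static uint32_t align_pot(uint32_t v, uint32_t a)
{
    return (v + a - 1) & ~(a - 1);
}

/*
 * Hypothetical stand-in for the reworked depth_base(): the depth data is
 * placed after a color bin of bin_w * bin_h pixels at color_cpp bytes
 * each, rounded up to a 0x4000-byte boundary.
 */
static uint32_t depth_base_sketch(uint32_t bin_w, uint32_t bin_h,
                                  uint32_t color_cpp)
{
    return align_pot(bin_w * bin_h * color_cpp, 0x4000);
}
```

For a 96x64 bin at 4 bytes per pixel this gives 0x8000, whereas aligning first and multiplying afterwards (as the old gmem2mem path did) would give 0x10000.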


@@ -48,12 +48,14 @@ tex_clamp(unsigned wrap)
case PIPE_TEX_WRAP_REPEAT:
return A3XX_TEX_REPEAT;
case PIPE_TEX_WRAP_CLAMP:
case PIPE_TEX_WRAP_CLAMP_TO_BORDER:
case PIPE_TEX_WRAP_CLAMP_TO_EDGE:
return A3XX_TEX_CLAMP_TO_EDGE;
case PIPE_TEX_WRAP_CLAMP_TO_BORDER:
return A3XX_TEX_CLAMP_TO_BORDER;
case PIPE_TEX_WRAP_MIRROR_CLAMP:
case PIPE_TEX_WRAP_MIRROR_CLAMP_TO_BORDER:
case PIPE_TEX_WRAP_MIRROR_CLAMP_TO_EDGE:
return A3XX_TEX_MIRROR_CLAMP;
case PIPE_TEX_WRAP_MIRROR_REPEAT:
return A3XX_TEX_MIRROR_REPEAT;
default:


@@ -37,70 +37,44 @@ fd3_pipe2vtx(enum pipe_format format)
{
switch (format) {
/* 8-bit buffers. */
case PIPE_FORMAT_A8_UNORM:
case PIPE_FORMAT_I8_UNORM:
case PIPE_FORMAT_L8_UNORM:
case PIPE_FORMAT_R8_UNORM:
case PIPE_FORMAT_L8_SRGB:
return VFMT_NORM_UBYTE_8;
case PIPE_FORMAT_A8_SNORM:
case PIPE_FORMAT_I8_SNORM:
case PIPE_FORMAT_L8_SNORM:
case PIPE_FORMAT_R8_SNORM:
return VFMT_NORM_BYTE_8;
case PIPE_FORMAT_A8_UINT:
case PIPE_FORMAT_I8_UINT:
case PIPE_FORMAT_L8_UINT:
case PIPE_FORMAT_R8_UINT:
return VFMT_UBYTE_8;
case PIPE_FORMAT_A8_SINT:
case PIPE_FORMAT_I8_SINT:
case PIPE_FORMAT_L8_SINT:
case PIPE_FORMAT_R8_SINT:
return VFMT_BYTE_8;
/* 16-bit buffers. */
case PIPE_FORMAT_R16_UNORM:
case PIPE_FORMAT_A16_UNORM:
case PIPE_FORMAT_L16_UNORM:
case PIPE_FORMAT_I16_UNORM:
case PIPE_FORMAT_Z16_UNORM:
return VFMT_NORM_USHORT_16;
case PIPE_FORMAT_R16_SNORM:
case PIPE_FORMAT_A16_SNORM:
case PIPE_FORMAT_L16_SNORM:
case PIPE_FORMAT_I16_SNORM:
return VFMT_NORM_SHORT_16;
case PIPE_FORMAT_R16_UINT:
case PIPE_FORMAT_A16_UINT:
case PIPE_FORMAT_L16_UINT:
case PIPE_FORMAT_I16_UINT:
return VFMT_USHORT_16;
case PIPE_FORMAT_R16_SINT:
case PIPE_FORMAT_A16_SINT:
case PIPE_FORMAT_L16_SINT:
case PIPE_FORMAT_I16_SINT:
return VFMT_SHORT_16;
case PIPE_FORMAT_L8A8_UNORM:
case PIPE_FORMAT_R16_FLOAT:
return VFMT_FLOAT_16;
case PIPE_FORMAT_R8G8_UNORM:
return VFMT_NORM_UBYTE_8_8;
case PIPE_FORMAT_L8A8_SNORM:
case PIPE_FORMAT_R8G8_SNORM:
return VFMT_NORM_BYTE_8_8;
case PIPE_FORMAT_L8A8_UINT:
case PIPE_FORMAT_R8G8_UINT:
return VFMT_UBYTE_8_8;
case PIPE_FORMAT_L8A8_SINT:
case PIPE_FORMAT_R8G8_SINT:
return VFMT_BYTE_8_8;
@@ -121,42 +95,62 @@ fd3_pipe2vtx(enum pipe_format format)
case PIPE_FORMAT_A8B8G8R8_UNORM:
case PIPE_FORMAT_A8R8G8B8_UNORM:
case PIPE_FORMAT_B8G8R8A8_UNORM:
case PIPE_FORMAT_B8G8R8X8_UNORM:
case PIPE_FORMAT_R8G8B8A8_UNORM:
case PIPE_FORMAT_R8G8B8X8_UNORM:
case PIPE_FORMAT_X8B8G8R8_UNORM:
case PIPE_FORMAT_X8R8G8B8_UNORM:
case PIPE_FORMAT_A8B8G8R8_SRGB:
case PIPE_FORMAT_B8G8R8A8_SRGB:
return VFMT_NORM_UBYTE_8_8_8_8;
case PIPE_FORMAT_R8G8B8A8_SNORM:
case PIPE_FORMAT_R8G8B8X8_SNORM:
return VFMT_NORM_BYTE_8_8_8_8;
case PIPE_FORMAT_R8G8B8A8_UINT:
case PIPE_FORMAT_R8G8B8X8_UINT:
return VFMT_UBYTE_8_8_8_8;
case PIPE_FORMAT_R8G8B8A8_SINT:
case PIPE_FORMAT_R8G8B8X8_SINT:
return VFMT_BYTE_8_8_8_8;
/* TODO probably need gles3 blob drivers to find the 32bit int formats:
case PIPE_FORMAT_R32_UINT:
case PIPE_FORMAT_R32_SINT:
case PIPE_FORMAT_A32_UINT:
case PIPE_FORMAT_A32_SINT:
case PIPE_FORMAT_L32_UINT:
case PIPE_FORMAT_L32_SINT:
case PIPE_FORMAT_I32_UINT:
case PIPE_FORMAT_I32_SINT:
*/
case PIPE_FORMAT_R16G16_SSCALED:
return VFMT_SHORT_16_16;
case PIPE_FORMAT_R16G16_FLOAT:
return VFMT_FLOAT_16_16;
case PIPE_FORMAT_R16G16_UINT:
return VFMT_USHORT_16_16;
case PIPE_FORMAT_R16G16_UNORM:
return VFMT_NORM_USHORT_16_16;
case PIPE_FORMAT_R16G16_SNORM:
return VFMT_NORM_SHORT_16_16;
case PIPE_FORMAT_R10G10B10A2_UNORM:
return VFMT_NORM_UINT_10_10_10_2;
case PIPE_FORMAT_R10G10B10A2_SNORM:
return VFMT_NORM_INT_10_10_10_2;
case PIPE_FORMAT_R10G10B10A2_USCALED:
return VFMT_UINT_10_10_10_2;
case PIPE_FORMAT_R10G10B10A2_SSCALED:
return VFMT_INT_10_10_10_2;
/* 48-bit buffers. */
case PIPE_FORMAT_R16G16B16_FLOAT:
return VFMT_FLOAT_16_16_16;
case PIPE_FORMAT_R16G16B16_SSCALED:
return VFMT_SHORT_16_16_16;
case PIPE_FORMAT_R16G16B16_UINT:
return VFMT_USHORT_16_16_16;
case PIPE_FORMAT_R16G16B16_SNORM:
return VFMT_NORM_SHORT_16_16_16;
case PIPE_FORMAT_R16G16B16_UNORM:
return VFMT_NORM_USHORT_16_16_16;
case PIPE_FORMAT_R32_FLOAT:
case PIPE_FORMAT_A32_FLOAT:
case PIPE_FORMAT_L32_FLOAT:
case PIPE_FORMAT_I32_FLOAT:
case PIPE_FORMAT_Z32_FLOAT:
return VFMT_FLOAT_32;
@@ -177,23 +171,14 @@ fd3_pipe2vtx(enum pipe_format format)
return VFMT_SHORT_16_16_16_16;
case PIPE_FORMAT_R32G32_FLOAT:
case PIPE_FORMAT_L32A32_FLOAT:
return VFMT_FLOAT_32_32;
case PIPE_FORMAT_R32G32_FIXED:
return VFMT_FIXED_32_32;
case PIPE_FORMAT_R16G16B16A16_FLOAT:
case PIPE_FORMAT_R16G16B16X16_FLOAT:
return VFMT_FLOAT_16_16_16_16;
/* TODO probably need gles3 blob drivers to find the 32bit int formats:
case PIPE_FORMAT_R32G32_SINT:
case PIPE_FORMAT_R32G32_UINT:
case PIPE_FORMAT_L32A32_UINT:
case PIPE_FORMAT_L32A32_SINT:
*/
/* 96-bit buffers. */
case PIPE_FORMAT_R32G32B32_FLOAT:
return VFMT_FLOAT_32_32_32;
@@ -203,7 +188,6 @@ fd3_pipe2vtx(enum pipe_format format)
/* 128-bit buffers. */
case PIPE_FORMAT_R32G32B32A32_FLOAT:
case PIPE_FORMAT_R32G32B32X32_FLOAT:
return VFMT_FLOAT_32_32_32_32;
case PIPE_FORMAT_R32G32B32A32_FIXED:
@@ -214,6 +198,20 @@ fd3_pipe2vtx(enum pipe_format format)
case PIPE_FORMAT_R32G32B32A32_UNORM:
case PIPE_FORMAT_R32G32B32A32_SINT:
case PIPE_FORMAT_R32G32B32A32_UINT:
case PIPE_FORMAT_R32_UINT:
case PIPE_FORMAT_R32_SINT:
case PIPE_FORMAT_A32_UINT:
case PIPE_FORMAT_A32_SINT:
case PIPE_FORMAT_L32_UINT:
case PIPE_FORMAT_L32_SINT:
case PIPE_FORMAT_I32_UINT:
case PIPE_FORMAT_I32_SINT:
case PIPE_FORMAT_R32G32_SINT:
case PIPE_FORMAT_R32G32_UINT:
case PIPE_FORMAT_L32A32_UINT:
case PIPE_FORMAT_L32A32_SINT:
*/
default:
@@ -358,8 +356,22 @@ fd3_pipe2swap(enum pipe_format format)
switch (format) {
case PIPE_FORMAT_B8G8R8A8_UNORM:
case PIPE_FORMAT_B8G8R8X8_UNORM:
case PIPE_FORMAT_B8G8R8A8_SRGB:
case PIPE_FORMAT_B8G8R8X8_SRGB:
return WXYZ;
case PIPE_FORMAT_A8R8G8B8_UNORM:
case PIPE_FORMAT_X8R8G8B8_UNORM:
case PIPE_FORMAT_A8R8G8B8_SRGB:
case PIPE_FORMAT_X8R8G8B8_SRGB:
return ZYXW;
case PIPE_FORMAT_A8B8G8R8_UNORM:
case PIPE_FORMAT_X8B8G8R8_UNORM:
case PIPE_FORMAT_A8B8G8R8_SRGB:
case PIPE_FORMAT_X8B8G8R8_SRGB:
return XYZW;
case PIPE_FORMAT_R8G8B8A8_UNORM:
case PIPE_FORMAT_R8G8B8X8_UNORM:
case PIPE_FORMAT_Z24X8_UNORM:


@@ -10,11 +10,11 @@ git clone https://github.com/freedreno/envytools.git
The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno.xml ( 364 bytes, from 2013-11-30 14:47:15)
- /home/robclark/src/freedreno/envytools/rnndb/freedreno_copyright.xml ( 1453 bytes, from 2013-03-31 16:51:27)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32580 bytes, from 2014-05-16 11:51:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10186 bytes, from 2014-05-16 11:51:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 9859 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14477 bytes, from 2014-05-16 11:51:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 57831 bytes, from 2014-05-19 21:02:34)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 26293 bytes, from 2014-05-16 11:51:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 58020 bytes, from 2014-06-13 17:29:47)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 26602 bytes, from 2014-06-13 17:28:10)
Copyright (C) 2013-2014 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)
@@ -87,15 +87,6 @@ enum adreno_rb_blend_factor {
FACTOR_SRC_ALPHA_SATURATE = 16,
};
enum adreno_rb_blend_opcode {
BLEND_DST_PLUS_SRC = 0,
BLEND_SRC_MINUS_DST = 1,
BLEND_MIN_DST_SRC = 2,
BLEND_MAX_DST_SRC = 3,
BLEND_DST_MINUS_SRC = 4,
BLEND_DST_PLUS_SRC_BIAS = 5,
};
enum adreno_rb_surface_endian {
ENDIAN_NONE = 0,
ENDIAN_8IN16 = 1,


@@ -10,11 +10,11 @@ git clone https://github.com/freedreno/envytools.git
The rules-ng-ng source files this header was generated from are:
- /home/robclark/src/freedreno/envytools/rnndb/adreno.xml ( 364 bytes, from 2013-11-30 14:47:15)
- /home/robclark/src/freedreno/envytools/rnndb/freedreno_copyright.xml ( 1453 bytes, from 2013-03-31 16:51:27)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32580 bytes, from 2014-05-16 11:51:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 10186 bytes, from 2014-05-16 11:51:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a2xx.xml ( 32901 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_common.xml ( 9859 bytes, from 2014-06-02 15:21:30)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/adreno_pm4.xml ( 14477 bytes, from 2014-05-16 11:51:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 57831 bytes, from 2014-05-19 21:02:34)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 26293 bytes, from 2014-05-16 11:51:57)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a3xx.xml ( 58020 bytes, from 2014-06-13 17:29:47)
- /home/robclark/src/freedreno/envytools/rnndb/adreno/a4xx.xml ( 26602 bytes, from 2014-06-13 17:28:10)
Copyright (C) 2013-2014 by the following authors:
- Rob Clark <robdclark@gmail.com> (robclark)


@@ -48,6 +48,10 @@ realloc_bo(struct fd_resource *rsc, uint32_t size)
uint32_t flags = DRM_FREEDRENO_GEM_CACHE_WCOMBINE |
DRM_FREEDRENO_GEM_TYPE_KMEM; /* TODO */
/* if we start using things other than write-combine,
* be sure to check for PIPE_RESOURCE_FLAG_MAP_COHERENT
*/
if (rsc->bo)
fd_bo_del(rsc->bo);


@@ -161,9 +161,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MIXED_COLORBUFFER_FORMATS:
case PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT:
case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER:
case PIPE_CAP_SM3:
case PIPE_CAP_SEAMLESS_CUBE_MAP:
case PIPE_CAP_TEXTURE_BARRIER:
case PIPE_CAP_VERTEX_COLOR_UNCLAMPED:
case PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION:
case PIPE_CAP_TGSI_INSTANCEID:
@@ -173,8 +171,8 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_COMPUTE:
case PIPE_CAP_START_INSTANCE:
case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS:
case PIPE_CAP_TEXTURE_MULTISAMPLE:
case PIPE_CAP_USER_CONSTANT_BUFFERS:
case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT:
return 1;
case PIPE_CAP_SHADER_STENCIL_EXPORT:
@@ -182,6 +180,9 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER:
case PIPE_CAP_CONDITIONAL_RENDER:
case PIPE_CAP_PRIMITIVE_RESTART:
case PIPE_CAP_TEXTURE_MULTISAMPLE:
case PIPE_CAP_TEXTURE_BARRIER:
case PIPE_CAP_SM3:
return 0;
case PIPE_CAP_CONSTANT_BUFFER_OFFSET_ALIGNMENT:
@@ -207,7 +208,6 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_TGSI_VS_LAYER:
case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
case PIPE_CAP_TEXTURE_GATHER_SM5:
case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT:
case PIPE_CAP_FAKE_SW_MSAA:
case PIPE_CAP_TEXTURE_QUERY_LOD:
case PIPE_CAP_SAMPLE_SHADING:


@@ -57,7 +57,7 @@ static void bind_sampler_states(struct fd_texture_stateobj *prog,
for (i = 0; i < nr; i++) {
if (hwcso[i])
new_nr++;
new_nr = i + 1;
prog->samplers[i] = hwcso[i];
prog->dirty_samplers |= (1 << i);
}
@@ -78,7 +78,7 @@ static void set_sampler_views(struct fd_texture_stateobj *prog,
for (i = 0; i < nr; i++) {
if (views[i])
new_nr++;
new_nr = i + 1;
pipe_sampler_view_reference(&prog->textures[i], views[i]);
prog->dirty_samplers |= (1 << i);
}
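
Note on the two `new_nr` changes above: the old loops counted the non-NULL entries, while the new ones record the highest bound index plus one. The two differ as soon as a slot in the middle is left unbound. A small sketch of the difference, using hypothetical helper names and plain `void *` slots in place of the driver's sampler and view objects:

```c
#include <stddef.h>

/* Old behaviour: number of non-NULL slots. */
static unsigned count_bound(void *const *slots, unsigned nr)
{
    unsigned i, n = 0;

    for (i = 0; i < nr; i++)
        if (slots[i])
            n++;
    return n;
}

/* New behaviour: index of the last non-NULL slot, plus one. */
static unsigned highest_bound(void *const *slots, unsigned nr)
{
    unsigned i, n = 0;

    for (i = 0; i < nr; i++)
        if (slots[i])
            n = i + 1;
    return n;
}
```

With slots { NULL, A, NULL, B } the first helper reports 2 and the second reports 4. Presumably the larger value is what keeps slot 3 within the range the state emission iterates over, though the diff itself does not say so.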


@@ -111,26 +111,6 @@ fd_blend_factor(unsigned factor)
}
}
enum adreno_rb_blend_opcode
fd_blend_func(unsigned func)
{
switch (func) {
case PIPE_BLEND_ADD:
return BLEND_DST_PLUS_SRC;
case PIPE_BLEND_MIN:
return BLEND_MIN_DST_SRC;
case PIPE_BLEND_MAX:
return BLEND_MAX_DST_SRC;
case PIPE_BLEND_SUBTRACT:
return BLEND_SRC_MINUS_DST;
case PIPE_BLEND_REVERSE_SUBTRACT:
return BLEND_DST_MINUS_SRC;
default:
DBG("invalid blend func: %x", func);
return 0;
}
}
enum adreno_pa_su_sc_draw
fd_polygon_mode(unsigned mode)
{


@@ -45,7 +45,6 @@
enum adreno_rb_depth_format fd_pipe2depth(enum pipe_format format);
enum pc_di_index_size fd_pipe2index(enum pipe_format format);
enum adreno_rb_blend_factor fd_blend_factor(unsigned factor);
enum adreno_rb_blend_opcode fd_blend_func(unsigned func);
enum adreno_pa_su_sc_draw fd_polygon_mode(unsigned mode);
enum adreno_stencil_op fd_stencil_op(unsigned op);


@@ -312,9 +312,15 @@ lp_rast_shade_tile(struct lp_rasterizer_task *task,
/* color buffer */
for (i = 0; i < scene->fb.nr_cbufs; i++){
stride[i] = scene->cbufs[i].stride;
color[i] = lp_rast_get_unswizzled_color_block_pointer(task, i, tile_x + x,
tile_y + y, inputs->layer);
if (scene->fb.cbufs[i]) {
stride[i] = scene->cbufs[i].stride;
color[i] = lp_rast_get_unswizzled_color_block_pointer(task, i, tile_x + x,
tile_y + y, inputs->layer);
}
else {
stride[i] = 0;
color[i] = NULL;
}
}
/* depth buffer */


@@ -115,7 +115,7 @@ llvmpipe_texture_layout(struct llvmpipe_screen *screen,
lpr->row_stride[level] = align(nblocksx * block_size, util_cpu_caps.cacheline);
/* if row_stride * height > LP_MAX_TEXTURE_SIZE */
if (lpr->row_stride[level] > LP_MAX_TEXTURE_SIZE / nblocksy) {
if ((uint64_t)lpr->row_stride[level] * nblocksy > LP_MAX_TEXTURE_SIZE) {
/* image too large */
goto fail;
}
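
Note on the size check above: the new condition multiplies in 64 bits and compares the product against the limit, instead of dividing the limit by `nblocksy`. The sketch below only illustrates why the cast matters for a product-style check; the limit is a made-up value, not the real LP_MAX_TEXTURE_SIZE, and the helper names are hypothetical. The diff does not state the motivation, but one property of the product form is that it has no division, so a zero `nblocksy` is harmless.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit for illustration only. */
#define SKETCH_MAX_TEXTURE_SIZE (1u << 30)

/* Widen before multiplying so the product cannot wrap. */
static bool image_too_large(uint32_t row_stride, uint32_t nblocksy)
{
    return (uint64_t)row_stride * nblocksy > SKETCH_MAX_TEXTURE_SIZE;
}

/* Same comparison without the cast: the 32-bit product can wrap to a
 * small value and slip under the limit. */
static bool image_too_large_wrapping(uint32_t row_stride, uint32_t nblocksy)
{
    return row_stride * nblocksy > SKETCH_MAX_TEXTURE_SIZE;
}
```

With a row stride of 0x10000 and 0x10000 block rows, the widened check rejects the image while the unwidened one accepts it, because the 32-bit product wraps to zero.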


@@ -177,6 +177,7 @@ struct nv50_ir_prog_info
uint8_t vertexId; /* system value index of VertexID */
uint8_t edgeFlagIn;
uint8_t edgeFlagOut;
int8_t viewportId; /* output index of ViewportIndex */
uint8_t fragDepth; /* output index of FragDepth */
uint8_t sampleMask; /* output index of SampleMask */
boolean sampleInterp; /* perform sample interp on all fp inputs */


@@ -287,10 +287,12 @@ CodeEmitterGK110::emitPredicate(const Instruction *i)
void
CodeEmitterGK110::setCAddress14(const ValueRef& src)
{
const int32_t addr = src.get()->asSym()->reg.data.offset / 4;
const Storage& res = src.get()->asSym()->reg;
const int32_t addr = res.data.offset / 4;
code[0] |= (addr & 0x01ff) << 23;
code[1] |= (addr & 0x3e00) >> 9;
code[1] |= res.fileIndex << 5;
}
void
@@ -413,7 +415,6 @@ CodeEmitterGK110::emitForm_21(const Instruction *i, uint32_t opc2,
case FILE_MEMORY_CONST:
code[1] &= (s == 2) ? ~(0x4 << 28) : ~(0x8 << 28);
setCAddress14(i->src(s));
code[1] |= i->getSrc(s)->reg.fileIndex << 5;
break;
case FILE_IMMEDIATE:
setShortImmediate(i, s);
@@ -555,6 +556,7 @@ CodeEmitterGK110::emitFADD(const Instruction *i)
RND_(2a, F);
ABS_(31, 0);
NEG_(33, 0);
SAT_(35);
if (code[0] & 0x1) {
modNegAbsF32_3b(i, 1);
@@ -633,7 +635,7 @@ CodeEmitterGK110::emitISAD(const Instruction *i)
{
assert(i->dType == TYPE_S32 || i->dType == TYPE_U32);
emitForm_21(i, 0x1fc, 0xb74);
emitForm_21(i, 0x1f4, 0xb74);
if (i->dType == TYPE_S32)
code[1] |= 1 << 19;
@@ -711,7 +713,7 @@ CodeEmitterGK110::emitEXTBF(const Instruction *i)
void
CodeEmitterGK110::emitBFIND(const Instruction *i)
{
emitForm_21(i, 0x618, 0xc18);
emitForm_C(i, 0x218, 0x2);
if (i->dType == TYPE_S32)
code[1] |= 0x80000;
@@ -952,7 +954,7 @@ CodeEmitterGK110::emitSLCT(const CmpInstruction *i)
FTZ_(32);
emitCondCode(cc, 0x33, 0xf);
} else {
emitForm_21(i, 0x1a4, 0xb20);
emitForm_21(i, 0x1a0, 0xb20);
emitCondCode(cc, 0x34, 0x7);
}
}
@@ -967,7 +969,7 @@ void CodeEmitterGK110::emitSELP(const Instruction *i)
void CodeEmitterGK110::emitTEXBAR(const Instruction *i)
{
code[0] = 0x00000002 | (i->subOp << 23);
code[0] = 0x0000003e | (i->subOp << 23);
code[1] = 0x77000000;
emitPredicate(i);
@@ -1204,7 +1206,7 @@ CodeEmitterGK110::emitFlow(const Instruction *i)
case OP_PRECONT: code[1] = 0x15800000; mask = 2; break;
case OP_PRERET: code[1] = 0x13800000; mask = 2; break;
case OP_QUADON: code[1] = 0x1b000000; mask = 0; break;
case OP_QUADON: code[1] = 0x1b800000; mask = 0; break;
case OP_QUADPOP: code[1] = 0x1c000000; mask = 0; break;
case OP_BRKPT: code[1] = 0x00000000; mask = 0; break;
default:
@@ -1326,7 +1328,8 @@ CodeEmitterGK110::emitOUT(const Instruction *i)
void
CodeEmitterGK110::emitInterpMode(const Instruction *i)
{
code[1] |= i->ipa << 21; // TODO: INTERP_SAMPLEID
code[1] |= (i->ipa & 0x3) << 21; // TODO: INTERP_SAMPLEID
code[1] |= (i->ipa & 0xc) << (19 - 2);
}
void
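
Note on the emitInterpMode() change above: the old code shifted the whole `ipa` value to bit 21, while the new code splits it, sending the low two bits to bit 21 and the next two bits to bit 19. A sketch of just the bit arithmetic, written in C with hypothetical function names; what the two fields mean to the hardware is not stated in the diff and is not assumed here:

```c
#include <stdint.h>

/* Old placement: all of ipa shifted up to bit 21. */
static uint32_t interp_bits_old(uint32_t ipa)
{
    return ipa << 21;
}

/* New placement: the 4-bit value is split into two 2-bit fields. */
static uint32_t interp_bits_new(uint32_t ipa)
{
    uint32_t bits = 0;

    bits |= (ipa & 0x3) << 21;        /* ipa bits 0-1 -> word bits 21-22 */
    bits |= (ipa & 0xc) << (19 - 2);  /* ipa bits 2-3 -> word bits 19-20 */
    return bits;
}
```

For `ipa = 0x4` the old placement sets bit 23, the new one sets bit 19; values below 4 are encoded identically by both.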


@@ -790,6 +790,8 @@ bool Source::scanSource()
info->prop.gp.instanceCount = 1; // default value
}
info->io.viewportId = -1;
info->immd.data = (uint32_t *)MALLOC(scan.immediate_count * 16);
info->immd.type = (ubyte *)MALLOC(scan.immediate_count * sizeof(ubyte));
@@ -982,6 +984,9 @@ bool Source::scanDeclaration(const struct tgsi_full_declaration *decl)
case TGSI_SEMANTIC_SAMPLEMASK:
info->io.sampleMask = i;
break;
case TGSI_SEMANTIC_VIEWPORT_INDEX:
info->io.viewportId = i;
break;
default:
break;
}
@@ -1258,6 +1263,8 @@ private:
Stack joinBBs; // fork BB, for inserting join ops on ENDIF
Stack loopBBs; // loop headers
Stack breakBBs; // end of / after loop
Value *viewport;
};
Symbol *
@@ -1555,8 +1562,16 @@ Converter::storeDst(const tgsi::Instruction::DstRegister dst, int c,
mkOp2(OP_WRSV, TYPE_U32, NULL, dstToSym(dst, c), val);
} else
if (f == TGSI_FILE_OUTPUT && prog->getType() != Program::TYPE_FRAGMENT) {
if (ptr || (info->out[idx].mask & (1 << c)))
mkStore(OP_EXPORT, TYPE_U32, dstToSym(dst, c), ptr, val);
if (ptr || (info->out[idx].mask & (1 << c))) {
/* Save the viewport index into a scratch register so that it can be
exported at EMIT time */
if (info->out[idx].sn == TGSI_SEMANTIC_VIEWPORT_INDEX &&
viewport != NULL)
mkOp1(OP_MOV, TYPE_U32, viewport, val);
else
mkStore(OP_EXPORT, TYPE_U32, dstToSym(dst, c), ptr, val);
}
} else
if (f == TGSI_FILE_TEMPORARY ||
f == TGSI_FILE_PREDICATE ||
@@ -2489,7 +2504,7 @@ Converter::handleInstruction(const struct tgsi_full_instruction *insn)
break;
case TGSI_OPCODE_TXB2:
case TGSI_OPCODE_TXL2:
handleTEX(dst0, 2, 2, 0x10, 0x11, 0x00, 0x00);
handleTEX(dst0, 2, 2, 0x10, 0x0f, 0x00, 0x00);
break;
case TGSI_OPCODE_SAMPLE:
case TGSI_OPCODE_SAMPLE_B:
@@ -2523,6 +2538,13 @@ Converter::handleInstruction(const struct tgsi_full_instruction *insn)
mkCvt(OP_CVT, dstTy, dst0[c], srcTy, fetchSrc(0, c));
break;
case TGSI_OPCODE_EMIT:
/* export the saved viewport index */
if (viewport != NULL) {
Symbol *vpSym = mkSymbol(FILE_SHADER_OUTPUT, 0, TYPE_U32,
info->out[info->io.viewportId].slot[0] * 4);
mkStore(OP_EXPORT, TYPE_U32, vpSym, NULL, viewport);
}
/* fallthrough */
case TGSI_OPCODE_ENDPRIM:
// get vertex stream if specified (must be immediate)
src0 = tgsi.srcCount() ?
@@ -2952,6 +2974,11 @@ Converter::run()
mkOp1(OP_RCP, TYPE_F32, fragCoord[3], fragCoord[3]);
}
if (info->io.viewportId >= 0)
viewport = getScratch();
else
viewport = NULL;
for (ip = 0; ip < code->scan.num_instructions; ++ip) {
if (!handleInstruction(&code->insns[ip]))
return false;


@@ -797,6 +797,16 @@ NV50LoweringPreSSA::handleTXB(TexInstruction *i)
const CondCode cc[4] = { CC_EQU, CC_S, CC_C, CC_O };
int l, d;
// We can't actually apply bias *and* do a compare for a cube
// texture. Since the compare has to be done before the filtering, just
// drop the bias on the floor.
if (i->tex.target == TEX_TARGET_CUBE_SHADOW) {
i->op = OP_TEX;
i->setSrc(3, i->getSrc(4));
i->setSrc(4, NULL);
return handleTEX(i);
}
handleTEX(i);
Value *bias = i->getSrc(i->tex.target.getArgCount());
if (bias->isUniform())


@@ -814,6 +814,7 @@ NVC0LoweringPass::handleManualTXD(TexInstruction *i)
Value *zero = bld.loadImm(bld.getSSA(), 0);
int l, c;
const int dim = i->tex.target.getDim();
const int array = i->tex.target.isArray();
i->op = OP_TEX; // no need to clone dPdx/dPdy later
@@ -824,7 +825,7 @@ NVC0LoweringPass::handleManualTXD(TexInstruction *i)
for (l = 0; l < 4; ++l) {
// mov coordinates from lane l to all lanes
for (c = 0; c < dim; ++c)
bld.mkQuadop(0x00, crd[c], l, i->getSrc(c), zero);
bld.mkQuadop(0x00, crd[c], l, i->getSrc(c + array), zero);
// add dPdx from lane l to lanes dx
for (c = 0; c < dim; ++c)
bld.mkQuadop(qOps[l][0], crd[c], l, i->dPdx[c].get(), crd[c]);
@@ -834,7 +835,7 @@ NVC0LoweringPass::handleManualTXD(TexInstruction *i)
// texture
bld.insert(tex = cloneForward(func, i));
for (c = 0; c < dim; ++c)
tex->setSrc(c, crd[c]);
tex->setSrc(c + array, crd[c]);
// save results
for (c = 0; i->defExists(c); ++c) {
Instruction *mov;
@@ -870,7 +871,8 @@ NVC0LoweringPass::handleTXD(TexInstruction *txd)
if (dim > 2 ||
txd->tex.target.isCube() ||
arg > 4 ||
txd->tex.target.isShadow())
txd->tex.target.isShadow() ||
txd->tex.useOffsets)
return handleManualTXD(txd);
for (int c = 0; c < dim; ++c) {


@@ -563,6 +563,7 @@ ConstantFolding::expr(Instruction *i,
} else {
i->op = i->saturate ? OP_SAT : OP_MOV; /* SAT handled by unary() */
}
i->subOp = 0;
}
void


@@ -165,6 +165,9 @@ nv30_context_destroy(struct pipe_context *pipe)
if (nv30->draw)
draw_destroy(nv30->draw);
if (nv30->screen->base.pushbuf->user_priv == &nv30->bufctx)
nv30->screen->base.pushbuf->user_priv = NULL;
nouveau_bufctx_del(&nv30->bufctx);
if (nv30->screen->cur_ctx == nv30)


@@ -325,6 +325,12 @@ nv30_screen_destroy(struct pipe_screen *pscreen)
nouveau_fence_ref(NULL, &screen->base.fence.current);
}
nouveau_bo_ref(NULL, &screen->notify);
nouveau_heap_destroy(&screen->query_heap);
nouveau_heap_destroy(&screen->vp_exec_heap);
nouveau_heap_destroy(&screen->vp_data_heap);
nouveau_object_del(&screen->query);
nouveau_object_del(&screen->fence);
nouveau_object_del(&screen->ntfy);


@@ -23,6 +23,7 @@
*
*/
#include "util/u_format.h"
#include "util/u_helpers.h"
#include "util/u_inlines.h"
@@ -360,6 +361,22 @@ nv30_set_framebuffer_state(struct pipe_context *pipe,
nv30->framebuffer = *fb;
nv30->dirty |= NV30_NEW_FRAMEBUFFER;
/* Hardware can't handle different swizzled-ness or different blocksizes
* for zs and cbufs. If both are supplied and something doesn't match,
* blank out the zs for now so that at least *some* rendering can occur.
*/
if (fb->nr_cbufs > 0 && fb->zsbuf) {
struct nv30_miptree *color_mt = nv30_miptree(fb->cbufs[0]->texture);
struct nv30_miptree *zeta_mt = nv30_miptree(fb->zsbuf->texture);
if (color_mt->swizzled != zeta_mt->swizzled ||
(util_format_get_blocksize(fb->zsbuf->format) > 2) !=
(util_format_get_blocksize(fb->cbufs[0]->format) > 2)) {
nv30->framebuffer.zsbuf = NULL;
debug_printf("Mismatched color and zeta formats, ignoring zeta.\n");
}
}
}
static void
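
Note on the nv30_set_framebuffer_state() check above: the zeta buffer is dropped when its swizzled-ness differs from the first color buffer's, or when the two formats fall on different sides of the "block size greater than 2 bytes" line. A self-contained sketch of that predicate; the struct and function names are hypothetical and stand in for the miptree and format queries used by the driver:

```c
#include <stdbool.h>

/* Hypothetical summary of the two properties the check looks at. */
struct fb_surface_sketch {
    bool swizzled;
    unsigned blocksize; /* bytes per block, as util_format_get_blocksize() would report */
};

/* True when color and zeta may be bound together under the rule above. */
static bool zeta_compatible(const struct fb_surface_sketch *color,
                            const struct fb_surface_sketch *zeta)
{
    if (color->swizzled != zeta->swizzled)
        return false;
    /* Both must be in the same size class: at most 2 bytes, or more. */
    return (zeta->blocksize > 2) == (color->blocksize > 2);
}
```

Under this rule a 4-byte color format with a 2-byte zeta format is rejected, while 1-byte color with 2-byte zeta is accepted because both are in the small class.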


@@ -1225,6 +1225,7 @@ out:
if(fpc)
{
FREE(fpc->r_temp);
FREE(fpc->r_imm);
util_dynarray_fini(&fpc->if_stack);
util_dynarray_fini(&fpc->label_relocs);
util_dynarray_fini(&fpc->imm_data);


@@ -61,7 +61,7 @@ static void
nv50_memory_barrier(struct pipe_context *pipe, unsigned flags)
{
struct nv50_context *nv50 = nv50_context(pipe);
int i;
int i, s;
if (flags & PIPE_BARRIER_MAPPED_BUFFER) {
for (i = 0; i < nv50->num_vtxbufs; ++i) {
@@ -74,6 +74,26 @@ nv50_memory_barrier(struct pipe_context *pipe, unsigned flags)
if (nv50->idxbuf.buffer &&
nv50->idxbuf.buffer->flags & PIPE_RESOURCE_FLAG_MAP_PERSISTENT)
nv50->base.vbo_dirty = TRUE;
for (s = 0; s < 3 && !nv50->cb_dirty; ++s) {
uint32_t valid = nv50->constbuf_valid[s];
while (valid && !nv50->cb_dirty) {
const unsigned i = ffs(valid) - 1;
struct pipe_resource *res;
valid &= ~(1 << i);
if (nv50->constbuf[s][i].user)
continue;
res = nv50->constbuf[s][i].u.buf;
if (!res)
continue;
if (res->flags & PIPE_RESOURCE_FLAG_MAP_PERSISTENT)
nv50->cb_dirty = TRUE;
}
}
}
}
@@ -253,7 +273,14 @@ nv50_create(struct pipe_screen *pscreen, void *priv)
nv50->base.screen = &screen->base;
nv50->base.copy_data = nv50_m2mf_copy_linear;
nv50->base.push_data = nv50_sifc_linear_u8;
/* FIXME: Make it possible to use this again. The problem is that there is
* some clever logic in the card that allows for multiple renders to happen
* when there are only constbuf changes. However that relies on the
* constbuf updates happening to the right constbuf slots. Currently
* implementation just makes it go through a separate slot which doesn't
* properly update the right constbuf data.
nv50->base.push_cb = nv50_cb_push;
*/
nv50->screen = screen;
pipe->screen = pscreen;
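
Note on the constbuf scan added to nv50_memory_barrier() above: it walks only the set bits of `constbuf_valid[s]` by repeatedly taking the lowest set bit with ffs() and clearing it, and stops at the first buffer carrying the flag of interest. A reduced sketch of that idiom, assuming a POSIX `ffs()` from `<strings.h>`; the flag value and the flat `flags` array are hypothetical simplifications of the driver's per-slot resources:

```c
#include <stdbool.h>
#include <stdint.h>
#include <strings.h> /* ffs() (POSIX) */

/* Hypothetical flag bit for illustration. */
#define SKETCH_FLAG_PERSISTENT 0x1u

/*
 * Return true if any slot whose bit is set in `valid` has `wanted`
 * set in its flags. Slots whose bit is clear are never examined.
 */
static bool any_flagged(uint32_t valid, const uint32_t *flags, uint32_t wanted)
{
    while (valid) {
        const unsigned i = (unsigned)ffs((int)valid) - 1; /* lowest set bit */

        valid &= ~(1u << i); /* mark this slot as visited */
        if (flags[i] & wanted)
            return true;
    }
    return false;
}
```

The loop runs once per set bit rather than once per possible slot, which is why the driver keeps a separate validity mask alongside the buffer array.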


@@ -106,6 +106,7 @@ struct nv50_context {
struct nouveau_bufctx *bufctx;
uint32_t dirty;
boolean cb_dirty;
struct {
uint32_t instance_elts; /* bitmask of per-instance elements */


@@ -1106,6 +1106,7 @@ nv50_blitctx_post_blit(struct nv50_blitctx *blit)
NV50_NEW_RASTERIZER | NV50_NEW_ZSA | NV50_NEW_BLEND |
NV50_NEW_TEXTURES | NV50_NEW_SAMPLERS |
NV50_NEW_VERTPROG | NV50_NEW_GMTYPROG | NV50_NEW_FRAGPROG);
nv50->scissors_dirty |= 1;
nv50->base.pipe.set_min_samples(&nv50->base.pipe, blit->saved.min_samples);
}


@@ -747,7 +747,7 @@ nv50_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
{
struct nv50_context *nv50 = nv50_context(pipe);
struct nouveau_pushbuf *push = nv50->base.pushbuf;
int i;
int i, s;
/* NOTE: caller must ensure that (min_index + index_bias) is >= 0 */
nv50->vb_elt_first = info->min_index + info->index_bias;
@@ -776,6 +776,33 @@ nv50_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
push->kick_notify = nv50_draw_vbo_kick_notify;
for (s = 0; s < 3 && !nv50->cb_dirty; ++s) {
uint32_t valid = nv50->constbuf_valid[s];
while (valid && !nv50->cb_dirty) {
const unsigned i = ffs(valid) - 1;
struct pipe_resource *res;
valid &= ~(1 << i);
if (nv50->constbuf[s][i].user)
continue;
res = nv50->constbuf[s][i].u.buf;
if (!res)
continue;
if (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT)
nv50->cb_dirty = TRUE;
}
}
/* If there are any coherent constbufs, flush the cache */
if (nv50->cb_dirty) {
BEGIN_NV04(push, NV50_3D(CODE_CB_FLUSH), 1);
PUSH_DATA (push, 0);
nv50->cb_dirty = FALSE;
}
if (nv50->vbo_fifo) {
nv50_push_vbo(nv50, info);
push->kick_notify = nv50_default_kick_notify;


@@ -60,7 +60,7 @@ static void
nvc0_memory_barrier(struct pipe_context *pipe, unsigned flags)
{
struct nvc0_context *nvc0 = nvc0_context(pipe);
int i;
int i, s;
if (flags & PIPE_BARRIER_MAPPED_BUFFER) {
for (i = 0; i < nvc0->num_vtxbufs; ++i) {
@@ -73,6 +73,26 @@ nvc0_memory_barrier(struct pipe_context *pipe, unsigned flags)
if (nvc0->idxbuf.buffer &&
nvc0->idxbuf.buffer->flags & PIPE_RESOURCE_FLAG_MAP_PERSISTENT)
nvc0->base.vbo_dirty = TRUE;
for (s = 0; s < 5 && !nvc0->cb_dirty; ++s) {
uint32_t valid = nvc0->constbuf_valid[s];
while (valid && !nvc0->cb_dirty) {
const unsigned i = ffs(valid) - 1;
struct pipe_resource *res;
valid &= ~(1 << i);
if (nvc0->constbuf[s][i].user)
continue;
res = nvc0->constbuf[s][i].u.buf;
if (!res)
continue;
if (res->flags & PIPE_RESOURCE_FLAG_MAP_PERSISTENT)
nvc0->cb_dirty = TRUE;
}
}
}
}


@@ -154,6 +154,8 @@ struct nvc0_context {
struct nvc0_constbuf constbuf[6][NVC0_MAX_PIPE_CONSTBUFS];
uint16_t constbuf_dirty[6];
uint16_t constbuf_valid[6];
boolean cb_dirty;
struct pipe_vertex_buffer vtxbuf[PIPE_MAX_ATTRIBS];
unsigned num_vtxbufs;


@@ -171,7 +171,7 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
return 0;
case PIPE_CAP_COMPUTE:
return (class_3d >= NVE4_3D_CLASS) ? 1 : 0;
return (class_3d == NVE4_3D_CLASS) ? 1 : 0;
case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER:
return 1;
case PIPE_CAP_TEXTURE_BORDER_COLOR_QUIRK:
@@ -211,7 +211,7 @@ nvc0_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
case PIPE_SHADER_FRAGMENT:
break;
case PIPE_SHADER_COMPUTE:
if (class_3d < NVE4_3D_CLASS)
if (class_3d != NVE4_3D_CLASS)
return 0;
break;
default:
@@ -514,9 +514,10 @@ nvc0_screen_init_compute(struct nvc0_screen *screen)
return nvc0_screen_compute_setup(screen, screen->base.pushbuf);
return 0;
case 0xe0:
return nve4_screen_compute_setup(screen, screen->base.pushbuf);
case 0xf0:
case 0x100:
return nve4_screen_compute_setup(screen, screen->base.pushbuf);
return 0;
default:
return -1;
}


@@ -808,10 +808,15 @@ nvc0_set_constant_buffer(struct pipe_context *pipe, uint shader, uint index,
if (nvc0->constbuf[s][i].user) {
nvc0->constbuf[s][i].u.data = cb->user_buffer;
nvc0->constbuf[s][i].size = cb->buffer_size;
nvc0->constbuf_valid[s] |= 1 << i;
} else
if (cb) {
nvc0->constbuf[s][i].offset = cb->buffer_offset;
nvc0->constbuf[s][i].size = align(cb->buffer_size, 0x100);
nvc0->constbuf_valid[s] |= 1 << i;
}
else {
nvc0->constbuf_valid[s] &= ~(1 << i);
}
}


@@ -543,9 +543,22 @@ nvc0_blitter_make_vp(struct nvc0_blitter *blit)
0x03f01c46, 0x0a7e0080, /* export b96 o[0x80] $r0:$r1:$r2 */
0x00001de7, 0x80000000, /* exit */
};
static const uint32_t code_gk110[] =
{
0x00000000, 0x08000000, /* sched */
0x401ffc12, 0x7ec7fc00, /* ld b64 $r4d a[0x80] 0x0 0x0 */
0x481ffc02, 0x7ecbfc00, /* ld b96 $r0t a[0x90] 0x0 0x0 */
0x381ffc12, 0x7f07fc00, /* st b64 a[0x70] $r4d 0x0 0x0 */
0x401ffc02, 0x7f0bfc00, /* st b96 a[0x80] $r0t 0x0 0x0 */
0x001c003c, 0x18000000, /* exit */
};
blit->vp.type = PIPE_SHADER_VERTEX;
blit->vp.translated = TRUE;
if (blit->screen->base.class_3d >= NVF0_3D_CLASS) {
blit->vp.code = (uint32_t *)code_gk110; /* const_cast */
blit->vp.code_size = sizeof(code_gk110);
} else
if (blit->screen->base.class_3d >= NVE4_3D_CLASS) {
blit->vp.code = (uint32_t *)code_nve4; /* const_cast */
blit->vp.code_size = sizeof(code_nve4);


@@ -797,7 +797,7 @@ nvc0_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
{
struct nvc0_context *nvc0 = nvc0_context(pipe);
struct nouveau_pushbuf *push = nvc0->base.pushbuf;
int i;
int i, s;
/* NOTE: caller must ensure that (min_index + index_bias) is >= 0 */
nvc0->vb_elt_first = info->min_index + info->index_bias;
@@ -830,6 +830,31 @@ nvc0_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
push->kick_notify = nvc0_draw_vbo_kick_notify;
for (s = 0; s < 5 && !nvc0->cb_dirty; ++s) {
uint32_t valid = nvc0->constbuf_valid[s];
while (valid && !nvc0->cb_dirty) {
const unsigned i = ffs(valid) - 1;
struct pipe_resource *res;
valid &= ~(1 << i);
if (nvc0->constbuf[s][i].user)
continue;
res = nvc0->constbuf[s][i].u.buf;
if (!res)
continue;
if (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT)
nvc0->cb_dirty = TRUE;
}
}
if (nvc0->cb_dirty) {
IMMED_NVC0(push, NVC0_3D(MEM_BARRIER), 0x1011);
nvc0->cb_dirty = FALSE;
}
if (nvc0->state.vbo_mode) {
nvc0_push_vbo(nvc0, info);
push->kick_notify = nvc0_default_kick_notify;
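
Note on the hunk above: the new loop visits each set bit of `constbuf_valid[s]`, lowest first, by taking `ffs(valid) - 1` and then clearing that bit. A minimal standalone sketch of the same iteration pattern follows; `visit_set_bits` and its `out` array are hypothetical names for illustration and are not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <strings.h> /* ffs() (POSIX) */

/* Hypothetical helper: records the index of every set bit in 'valid',
 * lowest bit first, and returns how many bits were visited. */
static int visit_set_bits(uint32_t valid, unsigned out[32])
{
    int n = 0;

    assert(out != NULL);
    while (valid) {
        /* ffs() returns a 1-based position, hence the "- 1". */
        const unsigned i = (unsigned)ffs((int)valid) - 1;

        valid &= ~(1u << i); /* clear the bit so the loop terminates */
        out[n++] = i;
    }
    return n;
}
```

The driver loop additionally stops early once `cb_dirty` is set, since one coherent buffer is enough to require the barrier.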

View File

@@ -80,8 +80,9 @@ NVC0_FIFO_PKHDR_NI(int subc, int mthd, unsigned size)
}
static INLINE uint32_t
NVC0_FIFO_PKHDR_IL(int subc, int mthd, uint8_t data)
NVC0_FIFO_PKHDR_IL(int subc, int mthd, uint16_t data)
{
assert(data < 0x2000);
return 0x80000000 | (data << 16) | (subc << 13) | (mthd >> 2);
}
@@ -133,7 +134,7 @@ BEGIN_1IC0(struct nouveau_pushbuf *push, int subc, int mthd, unsigned size)
}
static INLINE void
IMMED_NVC0(struct nouveau_pushbuf *push, int subc, int mthd, uint8_t data)
IMMED_NVC0(struct nouveau_pushbuf *push, int subc, int mthd, uint16_t data)
{
#ifndef NVC0_PUSH_EXPLICIT_SPACE_CHECKING
PUSH_SPACE(push, 1);
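
Note on the two hunks above: `data` is widened from `uint8_t` to `uint16_t` and bounded by `assert(data < 0x2000)`, which matches the 13 bits available above bit 16 in the header. The sketch below, assuming only the bit layout shown in the hunk, illustrates what the narrow type did to a value such as the `0x1011` passed to `IMMED_NVC0` in `nvc0_draw_vbo`. `pkhdr_il` and `pkhdr_il_narrow` are hypothetical stand-ins, not the real macros.

```c
#include <assert.h>
#include <stdint.h>

/* Same layout as NVC0_FIFO_PKHDR_IL in the hunk: immediate data in
 * bits 16..28, subchannel at bit 13, method (in dwords) in the low bits. */
static uint32_t pkhdr_il(int subc, int mthd, uint16_t data)
{
    assert(data < 0x2000); /* only 13 bits fit in the header */
    return 0x80000000u | ((uint32_t)data << 16) |
           ((uint32_t)subc << 13) | ((uint32_t)mthd >> 2);
}

/* Hypothetical model of the old signature: the uint8_t parameter
 * silently drops everything above the low 8 bits of the immediate. */
static uint32_t pkhdr_il_narrow(int subc, int mthd, uint8_t data)
{
    return pkhdr_il(subc, mthd, data);
}
```

With the narrow parameter, `0x1011` arrives as `0x11`, so the upper bits of the immediate never reach the hardware.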

View File

@@ -789,7 +789,8 @@ static bool do_hardware_msaa_resolve(struct pipe_context *ctx,
info->src.box.width == dst_width &&
info->src.box.height == dst_height &&
info->src.box.depth == 1 &&
dst->surface.level[info->dst.level].mode >= RADEON_SURF_MODE_1D) {
dst->surface.level[info->dst.level].mode >= RADEON_SURF_MODE_1D &&
(!dst->cmask.size || !dst->dirty_level_mask) /* dst cannot be fast-cleared */) {
r600_blitter_begin(ctx, R600_COLOR_RESOLVE);
util_blitter_custom_resolve_color(rctx->blitter,
info->dst.resource, info->dst.level,

View File

@@ -1235,6 +1235,9 @@ void evergreen_do_fast_color_clear(struct r600_common_context *rctx,
{
int i;
if (rctx->current_render_cond)
return;
for (i = 0; i < fb->nr_cbufs; i++) {
struct r600_texture *tex;
unsigned clear_bit = PIPE_CLEAR_COLOR0 << i;

View File

@@ -100,13 +100,17 @@ LLVMModuleRef radeon_llvm_get_kernel_module(LLVMContextRef ctx, unsigned index,
kernel_metadata = MALLOC(num_kernels * sizeof(LLVMValueRef));
LLVMGetNamedMetadataOperands(mod, "opencl.kernels", kernel_metadata);
for (i = 0; i < num_kernels; i++) {
LLVMValueRef kernel_signature, kernel_function;
LLVMValueRef kernel_signature, *kernel_function;
unsigned num_kernel_md_operands;
if (i == index) {
continue;
}
kernel_signature = kernel_metadata[i];
LLVMGetMDNodeOperands(kernel_signature, &kernel_function);
LLVMDeleteFunction(kernel_function);
num_kernel_md_operands = LLVMGetMDNodeNumOperands(kernel_signature);
kernel_function = MALLOC(num_kernel_md_operands * sizeof (LLVMValueRef));
LLVMGetMDNodeOperands(kernel_signature, kernel_function);
LLVMDeleteFunction(*kernel_function);
FREE(kernel_function);
}
FREE(kernel_metadata);
radeon_llvm_optimize(mod);

View File

@@ -1378,7 +1378,11 @@ void radeon_llvm_context_init(struct radeon_llvm_context * ctx)
bld_base->op_actions[TGSI_OPCODE_UCMP].emit = emit_ucmp;
bld_base->rsq_action.emit = build_tgsi_intrinsic_nomem;
#if HAVE_LLVM >= 0x0305
bld_base->rsq_action.intr_name = "llvm.AMDGPU.rsq.clamped.f32";
#else
bld_base->rsq_action.intr_name = "llvm.AMDGPU.rsq";
#endif
}
void radeon_llvm_create_func(struct radeon_llvm_context * ctx,

View File

@@ -242,7 +242,10 @@ int rvid_get_video_param(struct pipe_screen *screen,
switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
/* no support for MPEG4 */
return codec != PIPE_VIDEO_FORMAT_MPEG4;
return codec != PIPE_VIDEO_FORMAT_MPEG4 &&
/* FIXME: VC-1 simple/main profile is broken */
profile != PIPE_VIDEO_PROFILE_VC1_SIMPLE &&
profile != PIPE_VIDEO_PROFILE_VC1_MAIN;
case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
case PIPE_VIDEO_CAP_SUPPORTS_INTERLACED:
/* and MPEG2 only with shaders */

View File

@@ -689,7 +689,8 @@ static bool do_hardware_msaa_resolve(struct pipe_context *ctx,
info->src.box.height == dst_height &&
info->src.box.depth == 1 &&
dst->surface.level[info->dst.level].mode >= RADEON_SURF_MODE_1D &&
!(dst->surface.flags & RADEON_SURF_SCANOUT)) {
!(dst->surface.flags & RADEON_SURF_SCANOUT) &&
(!dst->cmask.size || !dst->dirty_level_mask) /* dst cannot be fast-cleared */) {
si_blitter_begin(ctx, SI_COLOR_RESOLVE);
util_blitter_custom_resolve_color(sctx->blitter,
info->dst.resource, info->dst.level,

View File

@@ -1539,9 +1539,8 @@ static void tex_fetch_args(
/* Pack LOD bias value */
if (opcode == TGSI_OPCODE_TXB)
address[count++] = coords[3];
if (target == TGSI_TEXTURE_CUBE || target == TGSI_TEXTURE_SHADOWCUBE)
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, coords);
if (opcode == TGSI_OPCODE_TXB2)
address[count++] = lp_build_emit_fetch(bld_base, inst, 1, 0);
/* Pack depth comparison value */
switch (target) {
@@ -1558,6 +1557,9 @@ static void tex_fetch_args(
address[count++] = lp_build_emit_fetch(bld_base, inst, 1, 0);
}
if (target == TGSI_TEXTURE_CUBE || target == TGSI_TEXTURE_SHADOWCUBE)
radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, coords);
/* Pack user derivatives */
if (opcode == TGSI_OPCODE_TXD) {
for (chan = 0; chan < 2; chan++) {
@@ -2497,6 +2499,7 @@ int si_pipe_shader_create(
bld_base->op_actions[TGSI_OPCODE_TEX] = tex_action;
bld_base->op_actions[TGSI_OPCODE_TXB] = txb_action;
bld_base->op_actions[TGSI_OPCODE_TXB2] = txb_action;
#if HAVE_LLVM >= 0x0304
bld_base->op_actions[TGSI_OPCODE_TXD] = txd_action;
#endif

View File

@@ -117,12 +117,13 @@ namespace {
#endif
llvm::Module *
compile(const std::string &source, const std::string &name,
const std::string &triple, const std::string &processor,
const std::string &opts, clang::LangAS::Map& address_spaces) {
compile(llvm::LLVMContext &llvm_ctx, const std::string &source,
const std::string &name, const std::string &triple,
const std::string &processor, const std::string &opts,
clang::LangAS::Map& address_spaces) {
clang::CompilerInstance c;
clang::EmitLLVMOnlyAction act(&llvm::getGlobalContext());
clang::EmitLLVMOnlyAction act(&llvm_ctx);
std::string log;
llvm::raw_string_ostream s_log(log);
std::string libclc_path = LIBCLC_LIBEXECDIR + processor + "-"
@@ -187,6 +188,11 @@ namespace {
c.getLangOpts().NoBuiltin = true;
c.getTargetOpts().Triple = triple;
c.getTargetOpts().CPU = processor;
// This is a workaround for a Clang bug which causes the number
// of warnings and errors to be printed to stderr.
// http://www.llvm.org/bugs/show_bug.cgi?id=19735
c.getDiagnosticOpts().ShowCarets = false;
#if HAVE_LLVM <= 0x0301
c.getInvocation().setLangDefaults(clang::IK_OpenCL);
#else
@@ -394,10 +400,12 @@ clover::compile_program_llvm(const compat::string &source,
target.size() - processor_str_len - 1);
clang::LangAS::Map address_spaces;
llvm::LLVMContext llvm_ctx;
// The input file name must have the .cl extension in order for the
// CompilerInvocation class to recognize it as an OpenCL source file.
llvm::Module *mod = compile(source, "input.cl", triple, processor, opts,
address_spaces);
llvm::Module *mod = compile(llvm_ctx, source, "input.cl", triple, processor,
opts, address_spaces);
find_kernels(mod, kernels);

View File

@@ -472,6 +472,9 @@ xa_composite_prepare(struct xa_context *ctx,
struct xa_surface *dst_srf = comp->dst->srf;
int ret;
if (comp->mask && !comp->mask->srf)
return -XA_ERR_INVAL;
ret = xa_ctx_srf_create(ctx, dst_srf);
if (ret != XA_ERR_NONE)
return ret;

View File

@@ -26,6 +26,7 @@
* Thomas Hellstrom <thellstrom-at-vmware-dot-com>
*/
#include <unistd.h>
#include "xa_tracker.h"
#include "xa_priv.h"
#include "pipe/p_state.h"
@@ -140,11 +141,15 @@ xa_tracker_create(int drm_fd)
struct xa_tracker *xa = calloc(1, sizeof(struct xa_tracker));
enum xa_surface_type stype;
unsigned int num_formats;
int loader_fd;
if (!xa)
return NULL;
if (pipe_loader_drm_probe_fd(&xa->dev, drm_fd, false))
loader_fd = dup(drm_fd);
if (loader_fd == -1)
return NULL;
if (pipe_loader_drm_probe_fd(&xa->dev, loader_fd, false))
xa->screen = pipe_loader_create_screen(xa->dev, PIPE_SEARCH_DIR);
if (!xa->screen)
goto out_no_screen;

View File

@@ -66,6 +66,9 @@ libxatracker_la_LDFLAGS = \
$(GC_SECTIONS) \
$(LD_NO_UNDEFINED)
libxatracker_la_LDFLAGS += \
-Wl,--version-script=$(top_srcdir)/src/gallium/targets/xa/xa.sym
if HAVE_MESA_LLVM
libxatracker_la_LIBADD += $(LLVM_LIBS)
libxatracker_la_LDFLAGS += $(LLVM_LDFLAGS)

View File

@@ -0,0 +1,38 @@
{
global:
xa_composite_allocation;
xa_composite_check_accelerated;
xa_composite_done;
xa_composite_prepare;
xa_composite_rect;
xa_context_create;
xa_context_default;
xa_context_destroy;
xa_context_flush;
xa_copy;
xa_copy_done;
xa_copy_prepare;
xa_fence_get;
xa_fence_wait;
xa_fence_destroy;
xa_format_check_supported;
xa_solid;
xa_solid_done;
xa_solid_prepare;
xa_surface_create;
xa_surface_dma;
xa_surface_format;
xa_surface_from_handle;
xa_surface_handle;
xa_surface_map;
xa_surface_redefine;
xa_surface_ref;
xa_surface_unmap;
xa_surface_unref;
xa_tracker_create;
xa_tracker_destroy;
xa_tracker_version;
xa_yuv_planar_blit;
local:
*;
};

View File

@@ -1,4 +1,5 @@
#include <sys/stat.h>
#include <unistd.h>
#include "pipe/p_context.h"
#include "pipe/p_state.h"
#include "util/u_format.h"
@@ -59,7 +60,7 @@ nouveau_drm_screen_create(int fd)
struct nouveau_device *dev = NULL;
struct pipe_screen *(*init)(struct nouveau_device *);
struct nouveau_screen *screen;
int ret;
int ret, dupfd = -1;
pipe_mutex_lock(nouveau_screen_mutex);
if (!fd_tab) {
@@ -75,7 +76,17 @@ nouveau_drm_screen_create(int fd)
return &screen->base;
}
ret = nouveau_device_wrap(fd, 0, &dev);
/* Since the screen re-use is based on the device node and not the fd,
* create a copy of the fd to be owned by the device. Otherwise a
* scenario could occur where two screens are created, and the first
* one is shut down, along with the fd being closed. The second
* (identical) screen would now have a reference to the closed fd. We
* avoid this by duplicating the original fd. Note that
* nouveau_device_wrap does not close the fd in case of a device
* creation error.
*/
dupfd = dup(fd);
ret = nouveau_device_wrap(dupfd, 1, &dev);
if (ret)
goto err;
@@ -114,6 +125,10 @@ nouveau_drm_screen_create(int fd)
return &screen->base;
err:
if (dev)
nouveau_device_del(&dev);
else if (dupfd >= 0)
close(dupfd);
pipe_mutex_unlock(nouveau_screen_mutex);
return NULL;
}
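
Note on the hunks above: the comment explains that the device must own its own copy of the fd, because screens are shared per device node and the caller may close the original. A minimal sketch of that ownership pattern, using only POSIX `dup()` and `close()`; `struct fd_owner` and its two functions are hypothetical and stand in for the nouveau device wrapper.

```c
#include <unistd.h>

/* Hypothetical object that must outlive the caller's descriptor. */
struct fd_owner {
    int fd;
};

/* Takes a private duplicate of caller_fd; the caller keeps ownership
 * of the original and may close it at any time. Returns 0 on success. */
static int fd_owner_init(struct fd_owner *o, int caller_fd)
{
    o->fd = dup(caller_fd);
    return o->fd >= 0 ? 0 : -1;
}

/* Closes only the duplicate, mirroring the error path in the hunk. */
static void fd_owner_fini(struct fd_owner *o)
{
    if (o->fd >= 0)
        close(o->fd);
    o->fd = -1;
}
```

Because the duplicate refers to the same open file description, it stays usable after the original descriptor is closed.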

View File

@@ -131,10 +131,14 @@ DRI2WireToEvent(Display *dpy, XEvent *event, xEvent *wire)
aevent->msc = ((CARD64)awire->msc_hi << 32) | awire->msc_lo;
glxDraw = GetGLXDrawable(dpy, pdraw->drawable);
if (awire->sbc < glxDraw->lastEventSbc)
glxDraw->eventSbcWrap += 0x100000000;
glxDraw->lastEventSbc = awire->sbc;
aevent->sbc = awire->sbc + glxDraw->eventSbcWrap;
if (glxDraw != NULL) {
if (awire->sbc < glxDraw->lastEventSbc)
glxDraw->eventSbcWrap += 0x100000000;
glxDraw->lastEventSbc = awire->sbc;
aevent->sbc = awire->sbc + glxDraw->eventSbcWrap;
} else {
aevent->sbc = awire->sbc;
}
return True;
}
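
Note on the hunk above: the wire event carries only a 32-bit swap-buffer count, and the client extends it to 64 bits by adding `0x100000000` each time the value goes backwards. The new `NULL` check passes the raw value through when the drawable is unknown. The sketch below models that logic; `struct sbc_tracker` and `extend_sbc` are hypothetical names standing in for the `glx_drawable` fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for lastEventSbc / eventSbcWrap. */
struct sbc_tracker {
    uint32_t last_event_sbc;
    uint64_t event_sbc_wrap;
};

/* Extends a 32-bit wire SBC to 64 bits. A value smaller than the
 * previous one is treated as a wrap of the 32-bit counter. */
static uint64_t extend_sbc(struct sbc_tracker *t, uint32_t wire_sbc)
{
    if (t == NULL)
        return wire_sbc; /* unknown drawable: no wrap tracking possible */

    if (wire_sbc < t->last_event_sbc)
        t->event_sbc_wrap += UINT64_C(0x100000000);
    t->last_event_sbc = wire_sbc;
    return wire_sbc + t->event_sbc_wrap;
}
```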

View File

@@ -45,8 +45,8 @@
#ifndef RTLD_NOW
#define RTLD_NOW 0
#endif
#ifndef RTLD_LOCAL
#define RTLD_LOCAL 0
#ifndef RTLD_GLOBAL
#define RTLD_GLOBAL 0
#endif
_X_HIDDEN void
@@ -99,7 +99,7 @@ driOpenDriver(const char *driverName)
int len;
/* Attempt to make sure libGL symbols will be visible to the driver */
glhandle = dlopen("libGL.so.1", RTLD_NOW | RTLD_LOCAL);
glhandle = dlopen("libGL.so.1", RTLD_NOW | RTLD_GLOBAL);
libPaths = NULL;
if (geteuid() == getuid()) {
@@ -127,14 +127,14 @@ driOpenDriver(const char *driverName)
snprintf(realDriverName, sizeof realDriverName,
"%.*s/tls/%s_dri.so", len, p, driverName);
InfoMessageF("OpenDriver: trying %s\n", realDriverName);
handle = dlopen(realDriverName, RTLD_NOW | RTLD_LOCAL);
handle = dlopen(realDriverName, RTLD_NOW | RTLD_GLOBAL);
#endif
if (handle == NULL) {
snprintf(realDriverName, sizeof realDriverName,
"%.*s/%s_dri.so", len, p, driverName);
InfoMessageF("OpenDriver: trying %s\n", realDriverName);
handle = dlopen(realDriverName, RTLD_NOW | RTLD_LOCAL);
handle = dlopen(realDriverName, RTLD_NOW | RTLD_GLOBAL);
}
if (handle != NULL)

View File

@@ -134,14 +134,15 @@ __glXWireToEvent(Display *dpy, XEvent *event, xEvent *wire)
GLXBufferSwapComplete *aevent = (GLXBufferSwapComplete *)event;
xGLXBufferSwapComplete2 *awire = (xGLXBufferSwapComplete2 *)wire;
struct glx_drawable *glxDraw = GetGLXDrawable(dpy, awire->drawable);
aevent->event_type = awire->event_type;
aevent->drawable = awire->drawable;
aevent->ust = ((CARD64)awire->ust_hi << 32) | awire->ust_lo;
aevent->msc = ((CARD64)awire->msc_hi << 32) | awire->msc_lo;
if (!glxDraw)
return False;
aevent->event_type = awire->event_type;
aevent->drawable = glxDraw->xDrawable;
aevent->ust = ((CARD64)awire->ust_hi << 32) | awire->ust_lo;
aevent->msc = ((CARD64)awire->msc_hi << 32) | awire->msc_lo;
if (awire->sbc < glxDraw->lastEventSbc)
glxDraw->eventSbcWrap += 0x100000000;
glxDraw->lastEventSbc = awire->sbc;

View File

@@ -47,10 +47,16 @@ ifeq ($(TARGET_ARCH),x86)
endif # x86
endif # MESA_ENABLE_ASM
ifeq ($(ARCH_X86_HAVE_SSE4_1),true)
LOCAL_SRC_FILES += \
$(SRCDIR)main/streaming-load-memcpy.c
endif
LOCAL_C_INCLUDES := \
$(call intermediates-dir-for STATIC_LIBRARIES,libmesa_program,,) \
$(MESA_TOP)/src/mapi \
$(MESA_TOP)/src/glsl
$(MESA_TOP)/src/glsl \
$(MESA_TOP)/src/gallium/auxiliary
LOCAL_WHOLE_STATIC_LIBRARIES := \
libmesa_program

View File

@@ -56,6 +56,7 @@ include $(CLEAR_VARS)
LOCAL_MODULE := libmesa_glsl_utils
LOCAL_IS_HOST_MODULE := true
LOCAL_CFLAGS := -D_POSIX_C_SOURCE=199309L
LOCAL_C_INCLUDES := \
$(MESA_TOP)/src/glsl \

View File

@@ -33,6 +33,7 @@ include $(CLEAR_VARS)
LOCAL_MODULE := mesa_gen_matypes
LOCAL_IS_HOST_MODULE := true
LOCAL_CFLAGS := -D_POSIX_C_SOURCE=199309L
LOCAL_C_INCLUDES := \
$(MESA_TOP)/src/mapi \

View File

@@ -801,7 +801,7 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state)
int buf, real_color_buffers = 0;
memset(save->ColorDrawBuffers, 0, sizeof(save->ColorDrawBuffers));
for (buf = 0; buf < MAX_DRAW_BUFFERS; buf++) {
for (buf = 0; buf < ctx->Const.MaxDrawBuffers; buf++) {
int buf_index = ctx->DrawBuffer->_ColorDrawBufferIndexes[buf];
if (buf_index == -1)
continue;
@@ -1213,7 +1213,7 @@ _mesa_meta_end(struct gl_context *ctx)
_mesa_BindRenderbuffer(GL_RENDERBUFFER, save->RenderbufferName);
if (state & MESA_META_DRAW_BUFFERS) {
_mesa_DrawBuffers(MAX_DRAW_BUFFERS, save->ColorDrawBuffers);
_mesa_DrawBuffers(ctx->Const.MaxDrawBuffers, save->ColorDrawBuffers);
}
ctx->Meta->SaveStackDepth--;

View File

@@ -405,7 +405,7 @@ blitframebuffer_texture(struct gl_context *ctx,
}
} else {
GLenum tex_base_format;
int srcW = abs(srcY1 - srcY0);
int srcW = abs(srcX1 - srcX0);
int srcH = abs(srcY1 - srcY0);
/* Fall back to doing a CopyTexSubImage to get the destination
* renderbuffer into a texture.

View File

@@ -43,6 +43,7 @@ MESA_DRI_C_INCLUDES := \
MESA_DRI_WHOLE_STATIC_LIBRARIES := \
libmesa_glsl \
libmegadriver_stub \
libmesa_dri_common \
libmesa_dricore

View File

@@ -86,3 +86,20 @@ $(intermediates)/xmlpool/options.h: $$(PRIVATE_SCRIPT) $$(PRIVATE_TEMPLATE_HEADE
include $(MESA_COMMON_MK)
include $(BUILD_STATIC_LIBRARY)
#
# Build libmegadriver_stub
#
include $(CLEAR_VARS)
include $(LOCAL_PATH)/Makefile.sources
LOCAL_MODULE := libmegadriver_stub
LOCAL_MODULE_CLASS := STATIC_LIBRARIES
LOCAL_C_INCLUDES := \
$(MESA_DRI_C_INCLUDES)
LOCAL_SRC_FILES := $(megadriver_stub_FILES)
include $(MESA_COMMON_MK)
include $(BUILD_STATIC_LIBRARY)

View File

@@ -42,7 +42,7 @@ libdricommon_la_SOURCES = $(DRI_COMMON_FILES)
libdri_test_stubs_la_SOURCES = $(test_stubs_FILES)
libdri_test_stubs_la_CFLAGS = $(AM_CFLAGS) -DNO_MAIN
libmegadriver_stub_la_SOURCES = megadriver_stub.c
libmegadriver_stub_la_SOURCES = $(megadriver_stub_FILES)
sysconf_DATA = drirc

View File

@@ -14,3 +14,6 @@ mesa_dri_common_INCLUDES := \
test_stubs_FILES := \
dri_test.c
megadriver_stub_FILES := \
megadriver_stub.c

View File

@@ -445,7 +445,7 @@ i830EmitTextureBlend(struct i830_context *i830)
I830_ACTIVESTATE(i830, I830_UPLOAD_TEXBLEND_ALL, false);
if (ctx->Texture._MaxEnabledTexImageUnit != -1) {
for (unit = 0; unit < ctx->Texture._MaxEnabledTexImageUnit; unit++)
for (unit = 0; unit <= ctx->Texture._MaxEnabledTexImageUnit; unit++)
if (ctx->Texture.Unit[unit]._Current)
emit_texblend(i830, unit, blendunit++,
unit == ctx->Texture._MaxEnabledTexImageUnit);
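
Note on the hunk above: `_MaxEnabledTexImageUnit` is the index of the highest enabled unit, so the loop bound has to be inclusive. With `<`, the highest unit is skipped and the `unit == _MaxEnabledTexImageUnit` "last stage" argument can never be true. A small sketch comparing the two bounds; `walk_units` is a hypothetical helper, not driver code.

```c
#include <stdbool.h>

/* Hypothetical model of the loop: returns how many enabled units were
 * visited and sets *last_seen if the highest enabled unit was reached. */
static int walk_units(const bool current[], int max_enabled_unit,
                      bool inclusive, bool *last_seen)
{
    int emitted = 0;
    int end;

    *last_seen = false;
    if (max_enabled_unit == -1)
        return 0; /* nothing enabled */

    end = inclusive ? max_enabled_unit + 1 : max_enabled_unit;
    for (int unit = 0; unit < end; unit++) {
        if (!current[unit])
            continue;
        emitted++;
        if (unit == max_enabled_unit)
            *last_seen = true;
    }
    return emitted;
}
```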

View File

@@ -1115,6 +1115,9 @@ struct brw_context
struct brw_cache cache;
/** IDs for meta stencil blit shader programs. */
unsigned meta_stencil_blit_programs[2];
/* Whether a meta-operation is in progress. */
bool meta_in_progress;

View File

@@ -580,6 +580,16 @@
#define GEN7_SURFACE_MCS_ENABLE (1 << 0)
#define GEN7_SURFACE_MCS_PITCH_SHIFT 3
#define GEN7_SURFACE_MCS_PITCH_MASK INTEL_MASK(11, 3)
#define GEN8_SURFACE_AUX_QPITCH_SHIFT 16
#define GEN8_SURFACE_AUX_QPITCH_MASK INTEL_MASK(30, 16)
#define GEN8_SURFACE_AUX_PITCH_SHIFT 3
#define GEN8_SURFACE_AUX_PITCH_MASK INTEL_MASK(11, 3)
#define GEN8_SURFACE_AUX_MODE_MASK INTEL_MASK(2, 0)
#define GEN8_SURFACE_AUX_MODE_NONE 0
#define GEN8_SURFACE_AUX_MODE_MCS 1
#define GEN8_SURFACE_AUX_MODE_APPEND 2
#define GEN8_SURFACE_AUX_MODE_HIZ 3
/* Surface state DW7 */
#define GEN7_SURFACE_CLEAR_COLOR_SHIFT 28
@@ -606,6 +616,7 @@
#define BRW_TEXCOORDMODE_CUBE 3
#define BRW_TEXCOORDMODE_CLAMP_BORDER 4
#define BRW_TEXCOORDMODE_MIRROR_ONCE 5
#define GEN8_TEXCOORDMODE_HALF_BORDER 6
#define BRW_THREAD_PRIORITY_NORMAL 0
#define BRW_THREAD_PRIORITY_HIGH 1

View File

@@ -192,33 +192,44 @@ static const struct brw_device_info brw_device_info_hsw_gt3 = {
},
};
/* Thread counts and URB limits are placeholders, and may not be accurate. */
#define GEN8_FEATURES \
.gen = 8, \
.has_hiz_and_separate_stencil = true, \
.must_use_separate_stencil = true, \
.has_llc = true, \
.has_pln = true, \
.max_vs_threads = 280, \
.max_gs_threads = 256, \
.max_wm_threads = 408, \
.urb = { \
.size = 128, \
.min_vs_entries = 64, \
.max_vs_entries = 1664, \
.max_gs_entries = 640, \
}
.max_vs_threads = 504, \
.max_gs_threads = 504, \
.max_wm_threads = 384 \
static const struct brw_device_info brw_device_info_bdw_gt1 = {
GEN8_FEATURES, .gt = 1,
.urb = {
.size = 192,
.min_vs_entries = 64,
.max_vs_entries = 2560,
.max_gs_entries = 960,
}
};
static const struct brw_device_info brw_device_info_bdw_gt2 = {
GEN8_FEATURES, .gt = 2,
.urb = {
.size = 384,
.min_vs_entries = 64,
.max_vs_entries = 2560,
.max_gs_entries = 960,
}
};
static const struct brw_device_info brw_device_info_bdw_gt3 = {
GEN8_FEATURES, .gt = 3,
.urb = {
.size = 384,
.min_vs_entries = 64,
.max_vs_entries = 2560,
.max_gs_entries = 960,
}
};
/* Thread counts and URB limits are placeholders, and may not be accurate.

View File

@@ -1263,19 +1263,21 @@ fs_visitor::emit_samplepos_setup(ir_variable *ir)
stride(retype(brw_vec1_grf(c->sample_pos_reg, 0),
BRW_REGISTER_TYPE_B), 16, 8, 2);
emit(MOV(int_sample_x, fs_reg(sample_pos_reg)));
fs_inst *inst = emit(MOV(int_sample_x, fs_reg(sample_pos_reg)));
if (dispatch_width == 16) {
fs_inst *inst = emit(MOV(half(int_sample_x, 1),
fs_reg(suboffset(sample_pos_reg, 16))));
inst->force_uncompressed = true;
inst = emit(MOV(half(int_sample_x, 1),
fs_reg(suboffset(sample_pos_reg, 16))));
inst->force_sechalf = true;
}
/* Compute gl_SamplePosition.x */
compute_sample_position(pos, int_sample_x);
pos.reg_offset++;
emit(MOV(int_sample_y, fs_reg(suboffset(sample_pos_reg, 1))));
inst = emit(MOV(int_sample_y, fs_reg(suboffset(sample_pos_reg, 1))));
if (dispatch_width == 16) {
fs_inst *inst = emit(MOV(half(int_sample_y, 1),
fs_reg(suboffset(sample_pos_reg, 17))));
inst->force_uncompressed = true;
inst = emit(MOV(half(int_sample_y, 1),
fs_reg(suboffset(sample_pos_reg, 17))));
inst->force_sechalf = true;
}
/* Compute gl_SamplePosition.y */
@@ -1310,12 +1312,16 @@ fs_visitor::emit_sampleid_setup(ir_variable *ir)
* and then reading from it using vstride=1, width=4, hstride=0.
* These computations hold good for 4x multisampling as well.
*/
emit(BRW_OPCODE_AND, t1,
fs_reg(retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_D)),
fs_reg(brw_imm_d(0xc0)));
emit(BRW_OPCODE_SHR, t1, t1, fs_reg(5));
fs_inst *inst;
inst = emit(BRW_OPCODE_AND, t1,
fs_reg(retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_UD)),
fs_reg(0xc0));
inst->force_writemask_all = true;
inst = emit(BRW_OPCODE_SHR, t1, t1, fs_reg(5));
inst->force_writemask_all = true;
/* This works for both SIMD8 and SIMD16 */
emit(MOV(t2, brw_imm_v(0x3210)));
inst = emit(MOV(t2, brw_imm_v(0x3210)));
inst->force_writemask_all = true;
/* This special instruction takes care of setting vstride=1,
* width=4, hstride=0 of t2 during an ADD instruction.
*/
@@ -1393,7 +1399,7 @@ fs_visitor::emit_math(enum opcode opcode, fs_reg dst, fs_reg src)
* Gen 6 hardware ignores source modifiers (negate and abs) on math
* instructions, so we also move to a temp to set those up.
*/
if (brw->gen >= 6)
if (brw->gen == 6 || brw->gen == 7)
src = fix_math_operand(src);
fs_inst *inst = emit(opcode, dst, src);
@@ -1425,7 +1431,9 @@ fs_visitor::emit_math(enum opcode opcode, fs_reg dst, fs_reg src0, fs_reg src1)
return NULL;
}
if (brw->gen >= 6) {
if (brw->gen >= 8) {
inst = emit(opcode, dst, src0, src1);
} else if (brw->gen >= 6) {
src0 = fix_math_operand(src0);
src1 = fix_math_operand(src1);
@@ -2406,7 +2414,7 @@ fs_visitor::insert_gen4_pre_send_dependency_workarounds(fs_inst *inst)
* program.
*/
for (fs_inst *scan_inst = (fs_inst *)inst->prev;
scan_inst != NULL;
!scan_inst->is_head_sentinel();
scan_inst = (fs_inst *)scan_inst->prev) {
/* If we hit control flow, assume that there *are* outstanding
@@ -2533,6 +2541,8 @@ fs_visitor::insert_gen4_send_dependency_workarounds()
if (brw->gen != 4 || brw->is_g4x)
return;
bool progress = false;
/* Note that we're done with register allocation, so GRF fs_regs always
* have a .reg_offset of 0.
*/
@@ -2543,8 +2553,12 @@ fs_visitor::insert_gen4_send_dependency_workarounds()
if (inst->mlen != 0 && inst->dst.file == GRF) {
insert_gen4_pre_send_dependency_workarounds(inst);
insert_gen4_post_send_dependency_workarounds(inst);
progress = true;
}
}
if (progress)
invalidate_live_intervals();
}
/**

View File

@@ -368,7 +368,6 @@ public:
bool opt_cse_local(bblock_t *block, exec_list *aeb);
bool opt_copy_propagate();
bool try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry);
bool try_constant_propagate(fs_inst *inst, acp_entry *entry);
bool opt_copy_propagate_local(void *mem_ctx, bblock_t *block,
exec_list *acp);
void opt_drop_redundant_mov_to_flags();

View File

@@ -273,6 +273,15 @@ fs_copy_prop_dataflow::dump_block_data() const
}
}
static bool
is_logic_op(enum opcode opcode)
{
return (opcode == BRW_OPCODE_AND ||
opcode == BRW_OPCODE_OR ||
opcode == BRW_OPCODE_XOR ||
opcode == BRW_OPCODE_NOT);
}
bool
fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
{
@@ -331,6 +340,11 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
if (has_source_modifiers && entry->dst.type != inst->src[arg].type)
return false;
if (brw->gen >= 8 && (entry->src.negate || entry->src.abs) &&
is_logic_op(inst->opcode)) {
return false;
}
inst->src[arg].file = entry->src.file;
inst->src[arg].reg = entry->src.reg;
inst->src[arg].reg_offset = entry->src.reg_offset;
@@ -346,8 +360,9 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
}
bool
fs_visitor::try_constant_propagate(fs_inst *inst, acp_entry *entry)
static bool
try_constant_propagate(struct brw_context *brw, fs_inst *inst,
acp_entry *entry)
{
bool progress = false;
@@ -375,6 +390,12 @@ fs_visitor::try_constant_propagate(fs_inst *inst, acp_entry *entry)
progress = true;
break;
case SHADER_OPCODE_POW:
case SHADER_OPCODE_INT_QUOTIENT:
case SHADER_OPCODE_INT_REMAINDER:
if (brw->gen < 8)
break;
/* fallthrough */
case BRW_OPCODE_BFI1:
case BRW_OPCODE_ASR:
case BRW_OPCODE_SHL:
@@ -479,6 +500,22 @@ fs_visitor::try_constant_propagate(fs_inst *inst, acp_entry *entry)
return progress;
}
static bool
can_propagate_from(fs_inst *inst)
{
return (inst->opcode == BRW_OPCODE_MOV &&
inst->dst.file == GRF &&
((inst->src[0].file == GRF &&
(inst->src[0].reg != inst->dst.reg ||
inst->src[0].reg_offset != inst->dst.reg_offset)) ||
inst->src[0].file == UNIFORM ||
inst->src[0].file == IMM) &&
inst->src[0].type == inst->dst.type &&
!inst->saturate &&
!inst->is_partial_write());
}
/* Walks a basic block and does copy propagation on it using the acp
* list.
*/
@@ -500,7 +537,7 @@ fs_visitor::opt_copy_propagate_local(void *copy_prop_ctx, bblock_t *block,
foreach_list(entry_node, &acp[inst->src[i].reg % ACP_HASH_SIZE]) {
acp_entry *entry = (acp_entry *)entry_node;
if (try_constant_propagate(inst, entry))
if (try_constant_propagate(brw, inst, entry))
progress = true;
if (try_copy_propagate(inst, i, entry))
@@ -533,16 +570,7 @@ fs_visitor::opt_copy_propagate_local(void *copy_prop_ctx, bblock_t *block,
/* If this instruction's source could potentially be folded into the
* operand of another instruction, add it to the ACP.
*/
if (inst->opcode == BRW_OPCODE_MOV &&
inst->dst.file == GRF &&
((inst->src[0].file == GRF &&
(inst->src[0].reg != inst->dst.reg ||
inst->src[0].reg_offset != inst->dst.reg_offset)) ||
inst->src[0].file == UNIFORM ||
inst->src[0].file == IMM) &&
inst->src[0].type == inst->dst.type &&
!inst->saturate &&
!inst->is_partial_write()) {
if (can_propagate_from(inst)) {
acp_entry *entry = ralloc(copy_prop_ctx, acp_entry);
entry->dst = inst->dst;
entry->src = inst->src[0];

View File

@@ -86,10 +86,10 @@ fs_live_variables::setup_one_read(bblock_t *block, fs_inst *inst,
*/
int end_ip = ip;
if (v->dispatch_width == 16 && (reg.stride == 0 ||
((v->pixel_x.file == GRF &&
v->pixel_x.reg == reg.reg) ||
(v->pixel_y.file == GRF &&
v->pixel_y.reg == reg.reg)))) {
reg.type == BRW_REGISTER_TYPE_UW ||
reg.type == BRW_REGISTER_TYPE_W ||
reg.type == BRW_REGISTER_TYPE_UB ||
reg.type == BRW_REGISTER_TYPE_B)) {
end_ip++;
}

View File

@@ -1533,7 +1533,7 @@ fs_visitor::rescale_texcoord(ir_texture *ir, fs_reg coordinate,
fs_reg chan = coordinate;
chan.reg_offset += i;
inst = emit(BRW_OPCODE_SEL, chan, chan, brw_imm_f(0.0));
inst = emit(BRW_OPCODE_SEL, chan, chan, fs_reg(0.0f));
inst->conditional_mod = BRW_CONDITIONAL_G;
/* Our parameter comes in as 1.0/width or 1.0/height,
@@ -1588,9 +1588,9 @@ fs_visitor::emit_mcs_fetch(ir_texture *ir, fs_reg coordinate, int sampler)
inst->base_mrf = -1;
inst->mlen = next.reg_offset * reg_width;
inst->header_present = false;
inst->regs_written = 4 * reg_width; /* we only care about one reg of response,
* but the sampler always writes 4/8
*/
inst->regs_written = 4; /* we only care about one reg of response,
* but the sampler always writes 4/8
*/
inst->sampler = sampler;
return dest;
@@ -2396,7 +2396,7 @@ fs_visitor::emit_untyped_atomic(unsigned atomic_op, unsigned surf_index,
unsigned mlen = 0;
/* Initialize the sample mask in the message header. */
emit(MOV(brw_uvec_mrf(8, mlen, 0), brw_imm_ud(0)))
emit(MOV(brw_uvec_mrf(8, mlen, 0), fs_reg(0u)))
->force_writemask_all = true;
if (fp->UsesKill) {
@@ -2442,7 +2442,7 @@ fs_visitor::emit_untyped_surface_read(unsigned surf_index, fs_reg dst,
unsigned mlen = 0;
/* Initialize the sample mask in the message header. */
emit(MOV(brw_uvec_mrf(8, mlen, 0), brw_imm_ud(0)))
emit(MOV(brw_uvec_mrf(8, mlen, 0), fs_reg(0u)))
->force_writemask_all = true;
if (fp->UsesKill) {

View File

@@ -272,28 +272,30 @@ setup_coord_transform(GLuint prog, const struct blit_dims *dims)
}
static GLuint
setup_program(struct gl_context *ctx, bool msaa_tex)
setup_program(struct brw_context *brw, bool msaa_tex)
{
struct gl_context *ctx = &brw->ctx;
struct blit_state *blit = &ctx->Meta->Blit;
static GLuint prog_cache[] = { 0, 0 };
const char *fs_source;
const struct sampler_and_fetch *sampler = &samplers[msaa_tex];
_mesa_meta_setup_vertex_objects(&blit->VAO, &blit->VBO, true, 2, 2, 0);
if (prog_cache[msaa_tex]) {
_mesa_UseProgram(prog_cache[msaa_tex]);
return prog_cache[msaa_tex];
GLuint *prog_id = &brw->meta_stencil_blit_programs[msaa_tex];
if (*prog_id) {
_mesa_UseProgram(*prog_id);
return *prog_id;
}
fs_source = ralloc_asprintf(NULL, fs_tmpl, sampler->sampler,
sampler->fetch);
_mesa_meta_compile_and_link_program(ctx, vs_source, fs_source,
"i965 stencil blit",
&prog_cache[msaa_tex]);
prog_id);
ralloc_free(fs_source);
return prog_cache[msaa_tex];
return *prog_id;
}
/**
@@ -427,7 +429,7 @@ brw_meta_stencil_blit(struct brw_context *brw,
_mesa_TexParameteri(target, GL_DEPTH_STENCIL_TEXTURE_MODE,
GL_STENCIL_INDEX);
prog = setup_program(ctx, target != GL_TEXTURE_2D);
prog = setup_program(brw, target != GL_TEXTURE_2D);
setup_bounding_rect(prog, orig_dims);
setup_drawing_rect(prog, &dims);
setup_coord_transform(prog, orig_dims);

View File

@@ -729,8 +729,13 @@ backend_visitor::assign_common_binding_table_offsets(uint32_t next_binding_table
}
if (prog->UsesGather) {
stage_prog_data->binding_table.gather_texture_start = next_binding_table_offset;
next_binding_table_offset += num_textures;
if (brw->gen >= 8) {
stage_prog_data->binding_table.gather_texture_start =
stage_prog_data->binding_table.texture_start;
} else {
stage_prog_data->binding_table.gather_texture_start = next_binding_table_offset;
next_binding_table_offset += num_textures;
}
} else {
stage_prog_data->binding_table.gather_texture_start = 0xd0d0d0d0;
}

View File

@@ -243,7 +243,8 @@ void gen7_upload_3dstate_so_decl_list(struct brw_context *brw,
void gen8_init_vtable_surface_functions(struct brw_context *brw);
/* brw_wm_sampler_state.c */
uint32_t translate_wrap_mode(GLenum wrap, bool using_nearest);
uint32_t translate_wrap_mode(struct brw_context *brw,
GLenum wrap, bool using_nearest);
void upload_default_color(struct brw_context *brw,
struct gl_sampler_object *sampler,
int unit,

View File

@@ -464,7 +464,8 @@ vec4_visitor::dead_code_eliminate()
}
if (inst->dst.file == scan_inst->dst.file &&
inst->dst.reg == scan_inst->dst.reg) {
inst->dst.reg == scan_inst->dst.reg &&
inst->dst.reg_offset == scan_inst->dst.reg_offset) {
int new_writemask = scan_inst->dst.writemask & ~dead_channels;
progress = try_eliminate_instruction(scan_inst, new_writemask, brw) ||

View File

@@ -73,7 +73,8 @@ is_channel_updated(vec4_instruction *inst, src_reg *values[4], int ch)
}
static bool
try_constant_propagation(vec4_instruction *inst, int arg, src_reg *values[4])
try_constant_propagation(struct brw_context *brw, vec4_instruction *inst,
int arg, src_reg *values[4])
{
/* For constant propagation, we only handle the same constant
* across all 4 channels. Some day, we should handle the 8-bit
@@ -110,6 +111,12 @@ try_constant_propagation(vec4_instruction *inst, int arg, src_reg *values[4])
inst->src[arg] = value;
return true;
case SHADER_OPCODE_POW:
case SHADER_OPCODE_INT_QUOTIENT:
case SHADER_OPCODE_INT_REMAINDER:
if (brw->gen < 8)
break;
/* fallthrough */
case BRW_OPCODE_DP2:
case BRW_OPCODE_DP3:
case BRW_OPCODE_DP4:
@@ -195,6 +202,15 @@ try_constant_propagation(vec4_instruction *inst, int arg, src_reg *values[4])
return false;
}
static bool
is_logic_op(enum opcode opcode)
{
return (opcode == BRW_OPCODE_AND ||
opcode == BRW_OPCODE_OR ||
opcode == BRW_OPCODE_XOR ||
opcode == BRW_OPCODE_NOT);
}
bool
vec4_visitor::try_copy_propagation(vec4_instruction *inst, int arg,
src_reg *values[4])
@@ -233,6 +249,11 @@ vec4_visitor::try_copy_propagation(vec4_instruction *inst, int arg,
value.file != ATTR)
return false;
if (brw->gen >= 8 && (value.negate || value.abs) &&
is_logic_op(inst->opcode)) {
return false;
}
if (inst->src[arg].abs) {
value.negate = false;
value.abs = true;
@@ -343,7 +364,7 @@ vec4_visitor::opt_copy_propagation()
if (c != 4)
continue;
if (try_constant_propagation(inst, i, values) ||
if (try_constant_propagation(brw, inst, i, values) ||
try_copy_propagation(inst, i, values))
progress = true;
}

View File

@@ -365,10 +365,12 @@ vec4_visitor::emit_math(opcode opcode, dst_reg dst, src_reg src)
return;
}
if (brw->gen >= 6) {
return emit_math1_gen6(opcode, dst, src);
if (brw->gen >= 8) {
emit(opcode, dst, src);
} else if (brw->gen >= 6) {
emit_math1_gen6(opcode, dst, src);
} else {
return emit_math1_gen4(opcode, dst, src);
emit_math1_gen4(opcode, dst, src);
}
}
@@ -417,10 +419,12 @@ vec4_visitor::emit_math(enum opcode opcode,
return;
}
-if (brw->gen >= 6) {
-return emit_math2_gen6(opcode, dst, src0, src1);
+if (brw->gen >= 8) {
+emit(opcode, dst, src0, src1);
+} else if (brw->gen >= 6) {
+emit_math2_gen6(opcode, dst, src0, src1);
} else {
-return emit_math2_gen4(opcode, dst, src0, src1);
+emit_math2_gen4(opcode, dst, src0, src1);
}
}

@@ -352,7 +352,8 @@ brw_populate_sampler_prog_key_data(struct gl_context *ctx,
if (alpha_depth || (brw->gen < 8 && !brw->is_haswell))
key->swizzles[s] = brw_get_texture_swizzle(ctx, t);
-if (sampler->MinFilter != GL_NEAREST &&
+if (brw->gen < 8 &&
+sampler->MinFilter != GL_NEAREST &&
sampler->MagFilter != GL_NEAREST) {
if (sampler->WrapS == GL_CLAMP)
key->gl_clamp_mask[0] |= 1 << s;

@@ -46,7 +46,7 @@
uint32_t
-translate_wrap_mode(GLenum wrap, bool using_nearest)
+translate_wrap_mode(struct brw_context *brw, GLenum wrap, bool using_nearest)
{
switch( wrap ) {
case GL_REPEAT:
@@ -55,9 +55,16 @@ translate_wrap_mode(GLenum wrap, bool using_nearest)
/* GL_CLAMP is the weird mode where coordinates are clamped to
* [0.0, 1.0], so linear filtering of coordinates outside of
* [0.0, 1.0] give you half edge texel value and half border
-* color. The fragment shader will clamp the coordinates, and
-* we set clamp_border here, which gets the result desired. We
-* just use clamp(_to_edge) for nearest, because for nearest
+* color.
+*
+* Gen8+ supports this natively.
+*/
+if (brw->gen >= 8)
+return GEN8_TEXCOORDMODE_HALF_BORDER;
+/* On Gen4-7.5, we clamp the coordinates in the fragment shader
+* and set clamp_border here, which gets the result desired.
+* We just use clamp(_to_edge) for nearest, because for nearest
* clamping to 1.0 gives border color instead of the desired
* edge texels.
*/
@@ -276,11 +283,11 @@ static void brw_update_sampler_state(struct brw_context *brw,
}
}
-sampler->ss1.r_wrap_mode = translate_wrap_mode(gl_sampler->WrapR,
+sampler->ss1.r_wrap_mode = translate_wrap_mode(brw, gl_sampler->WrapR,
using_nearest);
-sampler->ss1.s_wrap_mode = translate_wrap_mode(gl_sampler->WrapS,
+sampler->ss1.s_wrap_mode = translate_wrap_mode(brw, gl_sampler->WrapS,
using_nearest);
-sampler->ss1.t_wrap_mode = translate_wrap_mode(gl_sampler->WrapT,
+sampler->ss1.t_wrap_mode = translate_wrap_mode(brw, gl_sampler->WrapT,
using_nearest);
if (brw->gen >= 6 &&
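The GL_CLAMP handling described in the translate_wrap_mode comment above can be summarized as a small decision table. This is a sketch under stated assumptions, not the driver's function: `enum wrap`, `enum coord_mode`, and `translate_wrap_mode_sketch` are hypothetical names standing in for `GLenum` and the `BRW_TEXCOORDMODE_*` / `GEN8_TEXCOORDMODE_*` hardware constants, and only the two wrap modes needed for the illustration are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GL wrap enums (only two are modelled). */
enum wrap { WRAP_REPEAT, WRAP_CLAMP };

/* Hypothetical stand-ins for the hardware texture coordinate modes. */
enum coord_mode {
   MODE_WRAP,
   MODE_CLAMP,        /* clamp to edge */
   MODE_CLAMP_BORDER, /* used with shader-side coordinate clamping */
   MODE_HALF_BORDER   /* native GL_CLAMP behaviour, per the Gen8+ comment */
};

static enum coord_mode
translate_wrap_mode_sketch(int gen, enum wrap wrap, bool using_nearest)
{
   assert(gen >= 4); /* assumption: generations below 4 are out of scope */

   switch (wrap) {
   case WRAP_CLAMP:
      /* Gen8+ handles GL_CLAMP natively, as the comment states. */
      if (gen >= 8)
         return MODE_HALF_BORDER;
      /* Gen4-7.5: the shader clamps coordinates and the sampler uses
       * clamp_border, except for nearest filtering, where clamp to edge
       * avoids picking up the border color at 1.0. */
      return using_nearest ? MODE_CLAMP : MODE_CLAMP_BORDER;
   case WRAP_REPEAT:
   default:
      return MODE_WRAP;
   }
}
```

This pairs with the sampler key hunk earlier in the diff, where `gl_clamp_mask` is now only set when `brw->gen < 8`: once the hardware mode is chosen, the shader-side clamp is no longer requested.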

@@ -829,12 +829,14 @@ brw_update_texture_surfaces(struct brw_context *brw)
/* emit alternate set of surface state for gather. this
* allows the surface format to be overriden for only the
* gather4 messages. */
-if (vs && vs->UsesGather)
-update_stage_texture_surfaces(brw, vs, &brw->vs.base, true);
-if (gs && gs->UsesGather)
-update_stage_texture_surfaces(brw, gs, &brw->gs.base, true);
-if (fs && fs->UsesGather)
-update_stage_texture_surfaces(brw, fs, &brw->wm.base, true);
+if (brw->gen < 8) {
+if (vs && vs->UsesGather)
+update_stage_texture_surfaces(brw, vs, &brw->vs.base, true);
+if (gs && gs->UsesGather)
+update_stage_texture_surfaces(brw, gs, &brw->gs.base, true);
+if (fs && fs->UsesGather)
+update_stage_texture_surfaces(brw, fs, &brw->wm.base, true);
+}
brw->state.dirty.brw |= BRW_NEW_SURFACES;
}

Some files were not shown because too many files have changed in this diff